GLaDOS Text-to-speech (TTS) Voice Generator

Neural network based TTS Engine.

Notes about this fork

Forked by ben (@benediktkr) from github:VRCWizard/glados-tts-voice-wizard, which in turn was a fork of github:R2D2FISH/glados-tts.

This fork modernizes and improves the Python code in the project and does a bunch of housekeeping.

[DONE]: Gets rid of the SciPy dependency, replaced with the more modern and lightweight pysoundfile (all SciPy was used for was writing a .wav file to disk)
[DONE]: Support modern stable Python 3 versions, and update dependencies.
[DONE]: Versioned packages with poetry and pyproject.toml
[DONE]: Configuration handling with click.
[DONE]: Better logging with loguru
[DONE]: Python coding style and code quality improvements (proper handling of file objects, improved logging, etc.)
[DONE]: Switch to using ASGI with uvicorn and fastapi instead of Flask and WSGI, and support production-capable deployments by default.
[DONE]: Docker support
[TODO]: Support Home Assistant through the notify integration
[TODO]: See if it's possible to avoid espeak-ng as a system package dependency (Python bindings, building the C library, etc.)

No work on the speech model itself is expected.

Description

The initial, regular Tacotron model was trained first on LJSpeech, and then on a heavily modified version of the Ellen McLain dataset (all non-Portal 2 voice lines removed, punctuation added).

The Forward Tacotron model was only trained on about 600 voice lines.
The HiFiGAN model was generated through transfer learning from the sample.
All models have been optimized and quantized.

Install

First you need to install the espeak-ng system packages.

# for debian/ubuntu:
sudo apt-get install espeak-ng

# for fedora/amazon:
sudo yum install espeak-ng

This can hopefully be improved in the future. There are Python bindings for espeak (at a glance, found py-espeak-ng).
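A quick way to confirm the system package is in place before running `poetry install` is to look for the executable on `PATH`. The snippet below is a minimal sketch, not part of this project: the helper name `find_espeak_ng` is hypothetical, and it assumes the package installs an executable called `espeak-ng`. The TTS engine may load the shared library rather than the binary, so treat this only as a sanity check.

```python
import shutil
from typing import Optional


def find_espeak_ng(binary: str = "espeak-ng") -> Optional[str]:
    """Return the full path of the executable, or None if it is not on PATH.

    `binary` defaults to "espeak-ng", which is assumed (not verified here)
    to be the executable name installed by the system package.
    """
    return shutil.which(binary)


if __name__ == "__main__":
    path = find_espeak_ng()
    if path is None:
        print("espeak-ng not found; install it with your system package manager")
    else:
        print(f"espeak-ng found at {path}")
```

If the helper returns `None`, install the package with one of the commands above and run the check again.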

Then install the project into a poetry-managed virtualenv:

poetry install

Usage

If you just want to play around with the TTS, it works from the shell:

poetry run gladosctl

The TTS engine can also run as a web server:

poetry run gladosctl restapi

A public instance of the HTTP API is running at http://www.sudo.is/api/glados, where you can also read the API documentation.