A GLaDOS TTS, using Forward Tacotron and HiFiGAN. Inference is fast and stable, even on the CPU. A low quality vocoder model is included for mobile use. Rudimentary TTS script included. Works perfectly on Linux, partially on Maybe someone smarter than me can make a GUI.

ben f5992a2730 fixes (#9 ): gitignore and links on docs; hide index for now. Co-authored-by: Ben Kristinsson <ben@sudo.is> Reviewed-on: b/glados-tts#9		2023-05-21 14:34:09 +00:00
audio	start making things more pythonic, use poetry and structure as a module, PEP8 compliance and summing up goals and changes in README (#1 )	2023-05-05 19:09:16 +00:00
glados_tts	fixes (#9 )	2023-05-21 14:34:09 +00:00
old	FastAPI, documentation, etc. (#5 )	2023-05-21 12:32:12 +00:00
tests	FastAPI, documentation, etc. (#5 )	2023-05-21 12:32:12 +00:00
.flake8	FastAPI, documentation, etc. (#5 )	2023-05-21 12:32:12 +00:00
.gitignore	fixes (#9 )	2023-05-21 14:34:09 +00:00
Dockerfile	cleanup of Dockerfile (#6 )	2023-05-21 13:44:55 +00:00
Jenkinsfile	FastAPI, documentation, etc. (#5 )	2023-05-21 12:32:12 +00:00
LICENSE	Add License File	2022-11-08 17:06:54 -10:00
README.md	cleanup of Dockerfile (#6 )	2023-05-21 13:44:55 +00:00
chell.jpg	start making things more pythonic, use poetry and structure as a module, PEP8 compliance and summing up goals and changes in README (#1 )	2023-05-05 19:09:16 +00:00
docker-compose.yml	bugfixes, click meta dict and handle invalid json in config file (#8 )	2023-05-21 14:20:30 +00:00
poetry.lock	FastAPI, documentation, etc. (#5 )	2023-05-21 12:32:12 +00:00
pyproject.toml	bump version to 0.2.0 (#7 )	2023-05-21 13:52:00 +00:00

README.md

GLaDOS Text-to-speech (TTS) Voice Generator

Neural network based TTS Engine.

Notes about this fork

Forked by ben (@benediktkr) from github:VRCWizard/glados-tts-voice-wizard, which in turn was a fork of github:R2D2FISH/glados-tts.

This fork modernizes and improves the Python code in the project and does a bunch of housekeeping.

[DONE]: Gets rid of the SciPy dependency (replaced with the more modern and lightweight pysoundfile, since all it was used for was writing a .wav file to disk)
[DONE]: Support modern stable Python 3 versions, and update dependencies.
[DONE]: Versioned packages with poetry and pyproject.toml
[DONE]: Configuration handling with click.
[DONE]: Better logging with loguru
[DONE]: Python coding style and code quality improvements (proper handling of file objects, improved logging, etc.)
[DONE]: Switch to using ASGI with uvicorn and fastapi instead of Flask and WSGI, and support production-capable deployments as default.
[DONE]: Docker support
[TODO]: Support Home Assistant through the notify integration
[TODO]: See if it's possible to avoid espeak-ng as a system package dependency (Python bindings, building the C library, etc.)
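
As an illustration of the SciPy replacement mentioned in the list above, here is a minimal sketch of writing a .wav file with pysoundfile (imported as `soundfile`). It is not taken from this project's code: the helper names `sine_wave` and `write_wav`, the 22050 Hz sample rate, and the output file name are all assumptions made for the example.

```python
import math


def sine_wave(freq_hz, duration_s, sample_rate):
    """Return a list of float samples in [-0.5, 0.5] for a pure tone."""
    n_samples = int(duration_s * sample_rate)
    return [
        0.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        for i in range(n_samples)
    ]


def write_wav(path, samples, sample_rate):
    """Write mono float samples to a 16-bit PCM .wav file (hypothetical helper)."""
    # Third-party dependency: pip install soundfile
    import soundfile as sf

    sf.write(path, samples, sample_rate, subtype="PCM_16")


if __name__ == "__main__":
    # 22050 Hz is an assumed rate for this demo, not the model's documented rate.
    write_wav("tone.wav", sine_wave(440.0, 1.0, 22050), 22050)
```

In a real TTS pipeline the samples would come from the vocoder output rather than a generated tone.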

No work on the speech model itself is expected.

Description

The initial, regular Tacotron model was trained first on LJSpeech, and then on a heavily modified version of the Ellen McLain dataset (all non-Portal 2 voice lines removed, punctuation added).

The Forward Tacotron model was only trained on about 600 voice lines.
The HiFiGAN model was generated through transfer learning from the sample.
All models have been optimized and quantized.

Install

First you need to install the espeak-ng system packages.

# for debian/ubuntu:
sudo apt-get install espeak-ng

# for fedora/amazon:
sudo yum install espeak-ng

This can hopefully be improved in the future. There are Python bindings for espeak-ng (at a glance, py-espeak-ng looks like one option).
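
Until then, a quick way to confirm the system package is installed before running anything else is to look for the executable on `PATH`. This is only a sketch: the function name `find_espeak_ng` is hypothetical, and it assumes the package installs an `espeak-ng` binary. The project itself may load the shared library rather than call the binary, so a hit here is a hint rather than a guarantee.

```python
import shutil


def find_espeak_ng(candidates=("espeak-ng",)):
    """Return the path of the first candidate executable found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    location = find_espeak_ng()
    if location is None:
        print("espeak-ng not found; install it with your system package manager")
    else:
        print(f"espeak-ng found at {location}")
```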

Then install the poetry-managed virtualenv:

poetry install

Usage

If you just want to play around with the TTS, it works from the shell:

poetry run gladosctl

The TTS engine can also run as a web server:

poetry run gladosctl restapi

A public instance of the HTTP API is running at http://www.sudo.is/api/glados, where you can also read the API documentation.
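
For scripting against the API, a request can be assembled with the standard library alone. The sketch below is built on assumptions: the route name `say` and the `text` query parameter are hypothetical placeholders, not taken from the API documentation, so check the documentation at the URL above for the real endpoint and parameters before using it.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import Request

BASE_URL = "http://www.sudo.is/api/glados/"  # base URL from the README
SAY_ROUTE = "say"  # HYPOTHETICAL route name, replace with the documented one


def build_tts_request(text, base_url=BASE_URL, route=SAY_ROUTE):
    """Build (but do not send) a GET request for the given text."""
    # "text" is an assumed parameter name, used here only for illustration.
    query = urlencode({"text": text})
    return Request(urljoin(base_url, route) + "?" + query, method="GET")


if __name__ == "__main__":
    request = build_tts_request("Hello there")
    print(request.get_method(), request.full_url)
```

Sending the request is left out on purpose. Once the real route is known, `urllib.request.urlopen(request)` would perform the call.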