---
title: "Contributing your voice"
---
You can help us and the rest of the open voice community develop **speech-to-text** and **text-to-speech** models for your language.
## Speech-to-text
When you speak to a computer, it **transcribes** the audio from your voice into text. There are many ways to do this, but they all rely on recordings of people speaking.

For speech-to-text, it is important to have:
* Many different speakers and accents
* A variety of recording devices and quality levels
* Typically 16 kHz audio with 16-bit samples
* Multiple recording environments, including different rooms and noise levels

We recommend that users contribute to [Mozilla's Common Voice](https://commonvoice.mozilla.org) project for speech-to-text. This free and open dataset crowdsources spoken sentences from people around the world. Contributors can also help by validating existing recordings.
## Text-to-speech
When a computer speaks to you, it **synthesizes** audio from text. A text-to-speech dataset has different requirements from a speech-to-text dataset:

* A single speaker, or equal amounts of data for all speakers
* A high-quality recording device
* Typically 48 kHz audio with 32-bit samples
* A quiet, controlled recording environment such as a soundproof booth

We recommend that users contribute to the [LibriVox](https://librivox.org/) project for text-to-speech. This provides not only training data for the open voice community, but also free audiobooks for everyone to enjoy. Importantly, the books that are read must be in the public domain.
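
If you are preparing your own recordings, it can help to confirm their sample rate and bit depth before sharing them. The following is a minimal sketch, not an official tool of either project. It assumes your audio is saved as uncompressed PCM WAV, which is what Python's standard `wave` module reads. The names `TARGETS`, `wav_format`, and `matches_target` are made up for this example, and the target values are simply the "typical" figures from the lists on this page.

```python
import wave

# Typical formats from the lists on this page: (sample rate in Hz, bits per sample).
# These are illustrative defaults, not requirements enforced by any project.
TARGETS = {
    "speech-to-text": (16000, 16),
    "text-to-speech": (48000, 32),
}


def wav_format(path):
    """Return (sample_rate_hz, bits_per_sample, channels) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        # getsampwidth() reports bytes per sample, so convert to bits.
        return wav.getframerate(), wav.getsampwidth() * 8, wav.getnchannels()


def matches_target(path, task):
    """Check whether a WAV file has the typical rate and bit depth for a task."""
    rate, bits, _channels = wav_format(path)
    return (rate, bits) == TARGETS[task]
```

For example, `matches_target("my_recording.wav", "speech-to-text")` returns `True` only for a 16 kHz, 16-bit file (the file name here is a placeholder). A mismatch does not necessarily make a recording unusable, since audio can usually be resampled later.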