Create voice file on a Mac
tm4hman:
Hello,
This is my first post. I really love Rockbox, and have helped many blind and visually impaired individuals install it properly on many iPods. However, a few others and I own Macs.
It seems that the configure script in the tools directory of the development download lets you build a voice file. However, it requires Linux-based speech technology. I have a Mac, and it has many great voices just waiting to show their presence on my iPod.
My question is, would someone be willing to try and work with me to implement "say" into the configure script? I have gathered the pieces of the puzzle, but I can't quite gather the full progression the script takes.
First, I noticed it builds variables such as the speech engine of choice. Then it creates a shell script with these choices. Another script is run, which includes the new script. Finally, the soundfont.c library is used to somehow crunch the mp3s into a single file of unknown format that is readable by Rockbox as a voice.
Say allows for the choosing of a voice, and outputs audio to either standard output or an AIFF file. The script relies on lame, which to my knowledge doesn't take AIFF as a direct input. Also, the wavtrim application obviously doesn't like AIFF files. Is sox or a similar app going to be necessary to convert AIFF to WAV?
Well, I sure hope others have a similar interest. If you're on a Mac, please explore the voices on your system. Just about any of them, even the goofy ones, sound better than the MS voices packaged with the system. Fred is especially clear for a robotic voice, and could be great for the speedy reads. Of course, we don't know if we're allowed to publish the voice clips to the site as voice files. However, I would argue that the voices have been used in the commercial media space for years, and I've never heard of Apple going after people for it.
Let me know what you think. I'm new to the community, but I think I have an idea that may very well catch on here.
Domonoky:
hey, welcome to rockbox.
About the building of voice files on Mac:
The only thing you have to change is how the wav files are generated with the TTS engine you want to use. This is already very configurable in the current script, so it shouldn't be a problem to add some TTS support for Mac.
So you need an executable or a little script for each of the following tasks:
1. set up the TTS engine (if needed)
2. generate a wav file for a given text, with a command like: espeak "foobar" foobar.wav
3. stop the TTS engine.
So you need to research how you can use those TTS engines on a Mac from the command console, and (if needed) write a little script to do these tasks (if they need more than one step on the console).
I don't know how you use "say" on a Mac. But if it can generate a wav file for a given text, it should be possible to use it in the voice file build. About AIFF: if lame can read it (lame is used to convert the wav files to mp3), then you don't have to worry about it.
As for distributing these files, it is questionable whether Rockbox could distribute such voice files (licensing problems), but a user surely could.
So I hope I could help a little :-)
tm4hman:
Hello,
An AIFF file is still PCM, but has a different header from wav. Also, say is very simple, and here is the general syntax and man of say:
SAY(1)                  Speech Synthesis Manager                  SAY(1)

NAME
    say - Convert text to audible speech

SYNOPSIS
    say [-v voice] [-o out.aiff] [-f file | string ...]

DESCRIPTION
    This tool uses the Speech Synthesis manager to convert input text to
    audible speech and either play it through the sound output device
    chosen in System Preferences or save it to an AIFF file.

OPTIONS
    string
        Specify the text to speak on the command line. This can consist of
        multiple arguments, which are considered to be separated by spaces.

    -f file
        Specify a file to be spoken. If file is - or neither this parameter
        nor a message is specified, read from standard input.

    -v voice
        Specify the voice to be used. Default is the voice selected in
        System Preferences.

    -o out.aiff
        Specify an AIFF file to be written.

    If the input is a TTY, text is spoken line by line, and the output
    file, if specified, will only contain audio for the last line of the
    input. Otherwise, text is spoken all at once.

ERRORS
    say returns 0 if the text was spoken successfully, otherwise non-zero.
    Diagnostic messages will be printed to standard error.

1.0                            2003-05-23                            SAY(1)
So, as you can see, it takes either a string in quotes or a text file and outputs a single AIFF file, with a name of your choice, per execution. So say will need to be executed once for every string in the ".lang" file.
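As a rough illustration of the one-call-per-string approach, here is a minimal Python sketch. It uses only the say options listed in the man page above (-v, -o, and a string argument). The function names, the clip-ID-to-text mapping, and the ".aiff" naming scheme are assumptions made for this example, not a description of how the Rockbox build script works.

```python
import subprocess


def build_say_command(text, out_path, voice=None):
    """Return the argument list for one say invocation (one clip per call)."""
    cmd = ["say"]
    if voice:
        # -v selects the voice; without it say uses the system default.
        cmd += ["-v", voice]
    # -o writes the speech to an AIFF file instead of playing it.
    cmd += ["-o", out_path, text]
    return cmd


def speak_strings(strings, voice=None, runner=subprocess.run):
    """Run say once per string.

    `strings` is a hypothetical mapping of clip ID -> text to speak.
    `runner` is injectable so the loop can be tried without a Mac.
    Returns the list of AIFF paths, in the order of the mapping.
    """
    outputs = []
    for clip_id, text in strings.items():
        out_path = f"{clip_id}.aiff"
        runner(build_say_command(text, out_path, voice), check=True)
        outputs.append(out_path)
    return outputs
```

On a Mac, something like speak_strings({"LANG_EXAMPLE": "Example"}, voice="Fred") would then leave one AIFF file per string in the current directory (the ID here is made up). Text that begins with a dash might be read as an option by say, so real input may need extra care.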
So, what we need are two things, both of which I'm not aware of a simple shell scriptable answer for:
1. A tool that simply takes input AIFF and generates WAV.
2. A rewrite of the portion of the script that sends everything at once to the engine. It needs to make one call per string. Hopefully this could be done in multiple threads to speed things up. Otherwise it could take a while.
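For item 1, one possible answer that avoids installing an extra tool is a small converter script. The sketch below is Python and rests on an assumption that is not confirmed in this thread: that say writes plain, uncompressed 16-bit AIFF (not AIFF-C). It walks the AIFF chunks, reads the format from COMM and the samples from SSND, swaps the big-endian samples to little-endian, and writes a WAV with the standard wave module. The function names are made up for the example.

```python
import io
import struct
import wave


def _read_extended(data):
    """Decode the 80-bit extended float AIFF uses for the sample rate."""
    exponent, mantissa = struct.unpack(">HQ", data)
    sign = -1.0 if exponent & 0x8000 else 1.0
    exponent &= 0x7FFF
    if exponent == 0 and mantissa == 0:
        return 0.0
    return sign * mantissa * 2.0 ** (exponent - 16383 - 63)


def aiff_to_wav(aiff_bytes):
    """Convert uncompressed 16-bit AIFF bytes to WAV bytes."""
    if aiff_bytes[:4] != b"FORM" or aiff_bytes[8:12] != b"AIFF":
        raise ValueError("not a plain AIFF file")
    fmt = None
    samples = None
    pos = 12
    while pos + 8 <= len(aiff_bytes):
        chunk_id = aiff_bytes[pos:pos + 4]
        (size,) = struct.unpack(">I", aiff_bytes[pos + 4:pos + 8])
        body = aiff_bytes[pos + 8:pos + 8 + size]
        if chunk_id == b"COMM":
            # channels, frame count, bits per sample, then the sample rate
            channels, _frames, bits = struct.unpack(">hIh", body[:8])
            rate = int(round(_read_extended(body[8:18])))
            fmt = (channels, bits, rate)
        elif chunk_id == b"SSND":
            # SSND starts with an offset and a block size before the data
            (offset,) = struct.unpack(">I", body[:4])
            samples = body[8 + offset:]
        pos += 8 + size + (size & 1)  # chunks are padded to an even length
    if fmt is None or samples is None:
        raise ValueError("missing COMM or SSND chunk")
    channels, bits, rate = fmt
    if bits != 16:
        raise ValueError("this sketch only handles 16-bit samples")
    samples = samples[: len(samples) // 2 * 2]
    # AIFF stores samples big-endian, WAV little-endian: swap each byte pair.
    swapped = bytearray(len(samples))
    swapped[0::2] = samples[1::2]
    swapped[1::2] = samples[0::2]
    out = io.BytesIO()
    with wave.open(out, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(bytes(swapped))
    return out.getvalue()
```

To convert a file, read it in binary mode, pass the bytes to aiff_to_wav, and write the result to a .wav file. If say turns out to produce AIFF-C or a different sample size, the function raises ValueError rather than writing a broken file, and a tool such as sox would be the fallback.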
I've had say generate entire chapters of my college textbooks within 30 seconds, so it is quite fast, but the actual conversion to mp3 may best be done with multiple threads. I just have a hard time working out how I could add a new option to the engine menu, and how to prompt for the voice choice. All other options depend on the com.apple.speech.plist file, which contains the default speech voice and speed. Pitch is not selectable with say.
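The multi-threaded conversion idea could look roughly like this in Python. run_jobs and encode_mp3 are hypothetical names, and the plain "lame input output" call is an assumption that ignores whatever encoder options the build script actually passes. Threads are enough here because each worker mostly waits on an external process.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def encode_mp3(wav_path):
    """Encode one WAV clip with lame and return the MP3 path."""
    mp3_path = wav_path.rsplit(".", 1)[0] + ".mp3"
    # Assumed invocation: lame <input> <output>, with default settings.
    subprocess.run(["lame", wav_path, mp3_path], check=True)
    return mp3_path


def run_jobs(jobs, worker=encode_mp3, max_workers=4):
    """Apply `worker` to every job on a thread pool.

    Results come back in the same order as `jobs`. If any worker
    raises, the exception is re-raised when its result is collected.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, jobs))
```

Calling run_jobs with a list of WAV paths would encode up to four clips at a time. The same helper could wrap the per-string say calls, though whether say behaves well when several copies run at once is something to test first.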
Thanks a lot, this is very encouraging.
Ryan