Dynamic Text to Speech



ElectricGrandpa
December 16th, 2005, 04:06 PM
www.r7.ca

This is a little test page I've been working on. It takes the text you enter and generates an mp3 from it (which could easily be loaded into a Flash file).

As an extra bonus today, it will also read the text you enter aloud *off my computer in my office*. That's right, anything you enter into that text field will go straight to the indusblue office, so be nice ;) Just kidding, say whatever you want.

I'll probably only keep it up for a little while, since it's partly a stress test to see how well it works. We've got some sweet ideas for it in our new website.

-Matt

kritikal
December 16th, 2005, 04:33 PM
lol not bad, would be even cooler if it could play the sound right away without having to download the mp3, kind of like those sitepal things

ElectricGrandpa
December 16th, 2005, 04:45 PM
Yeah, it could, if I used Flash. Right now I'm also using the crappy Microsoft voices, which are also way slower at processing the sounds... Soon we're probably going to buy some cepstral.com voices, as they're pretty sweet.

-Matt

PS: What was your message? Wash the boys?

kritikal
December 16th, 2005, 04:51 PM
lol so I guess the mp3 files stay on your server :P My message was wassup boys

Well can't wait to see the outcome :)

TheCanadian
December 16th, 2005, 04:58 PM
Sweet. I typed in the lyrics of a Doors song (Hello, I Love You). The mp3 played to me in WMP and then the very next song that came on was Hello, I Love You - pretty strange.

ElectricGrandpa
December 16th, 2005, 05:26 PM
lol so I guess the mp3 files stay on your server

Yep, and they are broadcast into my office :D

Krilnon
December 16th, 2005, 06:18 PM
Can other people in your office hear this?

Lolek
December 16th, 2005, 07:11 PM
good

booler
December 16th, 2005, 07:21 PM
that's pretty cool man

ElectricGrandpa
December 16th, 2005, 09:14 PM
Can other people in your office hear this?

Yep :D But there's only 4 of us, so it's all good :D

ElectricGrandpa
December 19th, 2005, 10:13 AM
Haha, I left it on over the weekend, and I came back and there were 419(!) mp3s. Yowza! The longest one was 42(!) minutes long. Maybe I'll put up a page that lists them all for you... :D It seems to have crashed at the moment though.

komarik
December 20th, 2005, 03:10 PM
This is very nice.
Would be nice if you could choose different voices ;)

Jeff Wheeler
December 20th, 2005, 03:33 PM
That is awesome :thumb:

It seems you're forcing the download though (attachment header).

Seb Hughes
December 20th, 2005, 05:01 PM
Haha, I left it on over the weekend, and I came back and there were 419(!) mp3s. Yowza! The longest one was 42(!) minutes long. Maybe I'll put up a page that lists them all for you... :D It seems to have crashed at the moment though.
42 minutes, that's insane.

H4T
December 20th, 2005, 05:18 PM
Goodness, that is sweet. Can you please hint a little bit at the process? Do you have individual sound clips of vowels, letters, etc, or of words and such and then you string the words together into a file? Don't tell me you have a full blown text-processing algorithm that generates waveforms on the fly, lol. :P

I love this stuff, very cool man!

Jeff Wheeler
December 20th, 2005, 05:19 PM
I'm curious too. Is it possible on linux?

ElectricGrandpa
December 20th, 2005, 07:43 PM
Yeah, I'll post how to do it soon, and yep, it's definitely possible on Linux (maybe even easier on Linux, but I haven't tried).

"It seems you're forcing the download though"

Yeah I could just make it a download link, but I was too lazy, so right now it forces the download.
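
For anyone wondering what the "attachment header" is: a server forces a download by sending `Content-Disposition: attachment`, while `inline` lets the browser play the file if it can. Here is a minimal Python sketch of the two header sets. The function name and the filename are made up for illustration and are not taken from the actual page.

```python
def mp3_headers(filename, force_download=True):
    """Build HTTP response headers for serving an mp3 file."""
    # "attachment" asks the browser to save the file; "inline" lets it play it.
    disposition = "attachment" if force_download else "inline"
    return {
        "Content-Type": "audio/mpeg",
        "Content-Disposition": f'{disposition}; filename="{filename}"',
    }


print(mp3_headers("speech.mp3"))
print(mp3_headers("speech.mp3", force_download=False))
```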

Jeff Wheeler
December 20th, 2005, 08:25 PM
Hmm… I'm curious now ;P

tedc
December 20th, 2005, 09:15 PM
WOW! What libraries are you using for the text 2 speech? And for writing the mp3? Very good work, I'd love to see it in action as an accessibility tool...

ElectricGrandpa
December 20th, 2005, 10:00 PM
WOW! What libraries are you using for the text 2 speech? And for writing the mp3? Very good work, I'd love to see it in action as an accessibility tool...

In this example I'm using Microsoft's SAPI 5 and a demo version of an "API wrapper" called Active TTS to save the mp3. (The reason it takes so long for your mp3s to get created is that it's a demo version with a nag screen.) I've got another version with a Flash front-end that I'll put up soon. It uses Cepstral voices (they're much better, www.cepstral.com), Cepstral's "Spark.exe", and BladeEnc to encode the mp3s.

Nokrev, as for doing it on Linux, you should look into "Festival", which is an open source TTS engine (but unfortunately for me there's no Windows build). If you combine that with PHP's shell commands (i.e. exec) you should be good to go... but I'm not really that experienced with Linux servers (or Windows servers, for that matter).

-Matt
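
The Linux route described above (a TTS engine producing audio, then an encoder producing the mp3) could be sketched roughly like this. It is only a sketch under assumptions: it uses Python's `subprocess` instead of PHP's `exec`, Festival's `text2wave` script for synthesis, and the LAME encoder swapped in for BladeEnc, and it assumes both tools are installed and on the PATH. The function names are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path


def build_commands(text_path, wav_path, mp3_path):
    """Return the two command lines: Festival's text2wave, then the LAME encoder."""
    synth = ["text2wave", str(text_path), "-o", str(wav_path)]
    encode = ["lame", str(wav_path), str(mp3_path)]
    return synth, encode


def text_to_mp3(text, mp3_path):
    """Write text to a temp file, synthesize a WAV, and encode it to mp3."""
    with tempfile.TemporaryDirectory() as tmp:
        text_path = Path(tmp) / "input.txt"
        wav_path = Path(tmp) / "speech.wav"
        # The user's text goes into a file, never onto the command line.
        text_path.write_text(text, encoding="utf-8")
        synth, encode = build_commands(text_path, wav_path, mp3_path)
        # Argument lists (no shell) avoid shell parsing of the paths.
        subprocess.run(synth, check=True)
        subprocess.run(encode, check=True)
    return mp3_path
```

Calling `text_to_mp3("wassup boys", "out.mp3")` would then leave the encoded file at `out.mp3`, provided both programs are available.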

ElectricGrandpa
December 21st, 2005, 12:19 PM
Wow. Well, I was checking through the files this morning, and apparently someone generated a 15 hour (yes, that's HOUR) long mp3 file, with just the words "this is fun" over and over and over. It's 108 megs. Nice job :) That's a sweet stress test. Also, someone has been making a whole lot of mp3 files for "t2klive.net" this morning haha :D

-Matt
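
As a quick sanity check on those numbers, 108 megs over 15 hours implies a fairly low average bitrate. The Python sketch below does the arithmetic, assuming 1 meg means 1,000,000 bytes, and adds a hypothetical input-length cap of the kind that would stop a single request from producing hours of audio. The cap value and function names are illustrative and not something the page is known to use.

```python
def average_bitrate_kbps(size_megabytes, duration_hours):
    """Average bitrate in kbit/s, taking 1 megabyte as 1,000,000 bytes."""
    bits = size_megabytes * 1_000_000 * 8
    seconds = duration_hours * 3600
    return bits / seconds / 1000


MAX_CHARS = 500  # hypothetical cap, not a setting from the actual page


def cap_input(text, limit=MAX_CHARS):
    """Truncate submitted text so one request cannot produce hours of audio."""
    return text[:limit]


print(average_bitrate_kbps(108, 15))
print(len(cap_input("this is fun " * 10_000)))
```

Under that assumption the file works out to an average of 16 kbit/s.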

somdow
December 21st, 2005, 12:28 PM
Wow. Well, I was checking through the files this morning, and apparently someone generated a 15 hour (yes, that's HOUR) long mp3 file, with just the words "this is fun" over and over and over. It's 108 megs. Nice job :) That's a sweet stress test. Also, someone has been making a whole lot of mp3 files for "t2klive.net" this morning haha :D

-Matt


ROFL ahahah yeah that was me lol, my bad... I didn't read the part that said it'll be playable in your office LOL. My team says some funny stuff, so I was gonna add it to our team's profiles lol

lmao that's funny lol

Jeff Wheeler
December 21st, 2005, 03:07 PM
Wow, that's cool. Nice :thumb:

boswell255
December 22nd, 2005, 01:47 PM
@ H4T: I think he uses the Microsoft voice on his computer, so it takes the text, processes it through the prebuilt TTS engine, and then sends you the mp3.

This is very nice work. As I said to the TTS, I am impressed :D

Excellent work :thumb:

One thing though: when I typed the message "wow that's incredible", it did not recognize the apostrophe and for some reason put a backslash in there somewhere.

ElectricGrandpa
December 22nd, 2005, 04:17 PM
ROFL ahahah yeah that was me lol, my bad... I didn't read the part that said it'll be playable in your office LOL. My team says some funny stuff, so I was gonna add it to our team's profiles lol

lmao that's funny lol

Hah yeah don't worry about it :), I've got it going through my headphones instead now.


@ H4T: I think he uses the Microsoft voice on his computer, so it takes the text, processes it through the prebuilt TTS engine, and then sends you the mp3.

Yep :D


One thing though: when I typed the message "wow that's incredible", it did not recognize the apostrophe and for some reason put a backslash in there somewhere.

Yeah I'm not doing any escaping... I've got another version that figures it out much better.
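
One plausible cause of the stray backslash, though this is only a guess, is that the server automatically escapes quotes in form input (PHP's magic quotes behaved this way), so "that's" reaches the TTS engine as "that\'s". A minimal Python sketch of undoing that before synthesis follows. The function name is hypothetical and this is not the approach used in the other version mentioned above.

```python
import re


def clean_tts_input(text):
    """Undo backslash-escaping of quotes and collapse whitespace before synthesis."""
    # Turn \' and \" and \\ back into the bare character.
    text = re.sub(r"\\(['\"\\])", r"\1", text)
    # Collapse runs of whitespace into single spaces.
    return " ".join(text.split())


print(clean_tts_input("wow that\\'s incredible"))
```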

▄▄▄
December 23rd, 2005, 03:36 PM
Cool, I like this kind of stuff, where you create new things from Flash (... and PHP, etc.).

I have made something similar but with images, where I take an image from the webcam and save it not in jpg format but in *.nim format... for more, click:
http://www.kirupa.com/forum/showthread.php?p=1737095

bw_111
December 29th, 2005, 01:49 AM
WoW! very nice work!

komarik
December 29th, 2005, 10:35 PM
Unbelievable ;)

asap
December 30th, 2005, 07:30 AM
Do you know ALAN-UNO?
It's a bot system based on artificial intelligence. You need to install an Amiga emulator, and it speaks Spanish, sorry. :)

http://www.3dpoder.com/foro3dpoder/showthread.php?t=32723

dunger99us
March 21st, 2006, 01:19 PM
http://149.99.41.142/xampp/ Somehow the link redirects to the webpage noted...

Sciurus
March 21st, 2006, 10:10 PM
^yeah me too

jwopitz
March 15th, 2007, 04:12 PM
I know this is an old post but I am hoping ElectricGrandpa is still around. I wanted to see this text to mp3 working but the site is down. Is there a possibility you will have it up again or can repost elsewhere?

Spectro
March 17th, 2007, 11:49 AM
My friend is working on a similar project, except his focus is on translation. The way he figures, if speech to text is available in Microsoft Word, then why not have the text translated into another language and then spoken back using a method similar to this one? An on-the-fly speech translator isn't something we've seen on the market quite yet, so I'd love to see progress in this field!

Jeff Wheeler
March 17th, 2007, 11:54 AM
Hmm, except in the gazillion translator devices that the Chinese use...

Seb Hughes
March 17th, 2007, 02:56 PM
My friend is working on a similar project, except his focus is on translation. The way he figures, if speech to text is available in Microsoft Word, then why not have the text translated into another language and then spoken back using a method similar to this one? An on-the-fly speech translator isn't something we've seen on the market quite yet, so I'd love to see progress in this field!

Doing that for German would be pretty much impossible. In fact, I know it will be impossible for German.

joran420
March 17th, 2007, 06:18 PM
I don't think it's that easy. For text to speech it just needs to know the phonetics of the language the text is entered in; for translation it has to comprehend what's being said, not just translate single words (although there are plenty of speech dictionaries out there that will speak a word in a foreign language).

The algorithm would have to understand past tense, present tense, and a million other nuances... I don't think we'll see accurate translators like this for quite a few years yet.

Seb Hughes
March 18th, 2007, 12:06 PM
I don't think it's that easy. For text to speech it just needs to know the phonetics of the language the text is entered in; for translation it has to comprehend what's being said, not just translate single words (although there are plenty of speech dictionaries out there that will speak a word in a foreign language).

The algorithm would have to understand past tense, present tense, and a million other nuances... I don't think we'll see accurate translators like this for quite a few years yet.

Correct. In German it would have to know when to use ö, ä, ü, and some words in German can have like 5 different meanings. It would also have to know when to use den, des, dem. Then there's putting the words into the correct order, and the list just goes on.
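
The word-order point can be seen with a toy example. The Python sketch below translates word by word from a tiny hand-made English-to-German word list (the list and function name are purely illustrative, and the article "den" is hard-coded even though the right form depends on case and gender). The result keeps English word order, which is wrong in German.

```python
# Toy English-to-German word list; real translation needs far more than this.
WORDS = {"i": "ich", "have": "habe", "seen": "gesehen", "the": "den", "dog": "Hund"}


def word_by_word(sentence):
    """Replace each word independently, keeping the English word order."""
    return " ".join(WORDS.get(word, word) for word in sentence.lower().split())


print(word_by_word("I have seen the dog"))
# Correct German would be "Ich habe den Hund gesehen": the participle moves to the end.
```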

Spectro
March 18th, 2007, 02:12 PM
Doing that for German would be pretty much impossible. In fact, I know it will be impossible for German.

My friend was specifically going to have an English to Spanish translation engine for ease, but it was just a lot of low level ideas for a school project, nothing commercial.

hybrid101
March 19th, 2007, 08:45 AM
@seb
Does the German edition of Windows have speech capabilities?
lol, just noticed the year on the first post. This brings back memories :D