NXC: "Speech"-Recognition

HaWe · Post by **HaWe** » 05 Feb 2011, 18:37

Hi folks,
here is a fun little program that serves as an introduction to speech recognition on the NXT.
Actually, it is of course not real speech recognition; it's more like "rhythm detection".

Construction: Sound Sensor at port S2.

http://www.mindstormsforum.de/viewtopic.php?f=70&t=6386

HaWe · Post by **HaWe** » 06 Feb 2011, 15:08

As the pattern of my sound recordings is an oscillation of different noise levels (noise vibration), I got the idea to use a Fast Fourier Transform (FFT) to characterize my SoundRec[400] array.
Unfortunately I have no experience with FFTs at all, and the underlying maths is quite nebulous to me.

IIUC, an FFT approximates a vibration by a sum of sine waves of different frequencies
(f1, f2, f3,..., each frequency (resp. wavelength) twice as long as the previous one),
each term multiplied by a specific coefficient.
FFT(t) = c1*sin(f1(t)) + c2*sin(f2(t)) + c3*sin(f3(t)) +...+ cn*sin(fn(t))
As my RecordLenght consists of 400 samples, I suppose the frequencies (resp. wavelengths) could be
f1=1
f2=2
f3=4
f4=8
f5=16
f6=32
f7=64
f8=128
f9=256
That (at least up to f16) should fit, so I have to handle n=9(-16?) terms with 9(-16?) frequencies and 9(-16?) coefficients for 400 noise level samples.

Can anybody tell me how to implement an FFT algorithm for these conditions?
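
As a rough illustration of the question above, here is a minimal sketch of a plain discrete Fourier transform (DFT) restricted to the first few terms, written in Python for PC-side experiments rather than in NXC. It is not the algorithm used in this thread. Note that a DFT uses evenly spaced frequencies (component k makes k whole cycles over the record, k = 1, 2, 3, ...) rather than doubling ones, and each frequency has both a sine and a cosine part. Since 400 is not a power of two, a radix-2 FFT would not apply directly, but for 10 to 20 terms the direct sum is cheap enough. The function name `dft_magnitudes` and the synthetic test record are assumptions made for this example.

```python
import math

def dft_magnitudes(samples, num_terms):
    """Return the amplitudes of frequency components 1..num_terms.

    Component k corresponds to k full cycles over the whole record.
    """
    n = len(samples)
    mean = sum(samples) / n  # remove the constant loudness offset
    magnitudes = []
    for k in range(1, num_terms + 1):
        re = 0.0
        im = 0.0
        for t, x in enumerate(samples):
            angle = 2.0 * math.pi * k * t / n
            re += (x - mean) * math.cos(angle)
            im -= (x - mean) * math.sin(angle)
        # Scale so that a pure sine of amplitude A yields a magnitude of A.
        magnitudes.append(2.0 * math.hypot(re, im) / n)
    return magnitudes

# Hypothetical test record: 400 samples, 5 cycles, amplitude 30 around level 50.
record = [50 + 30 * math.sin(2 * math.pi * 5 * t / 400) for t in range(400)]
spectrum = dft_magnitudes(record, 16)
```

The 16 resulting magnitudes could then serve as a compact fingerprint of one recording, in place of the 400 raw samples.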

kvols · Post by **kvols** » 07 Feb 2011, 21:34

Hi doc

I wrote one in leJOS some time ago, and it works pretty well under the given circumstances (small processor, very limited amount of space, coarse sampling frequency etc.). There are lots of FFT algorithms out there, but you'll probably need to do some translation.

Google for FFT numerical recipes:
http://www.google.com/search?q=FFT+numerical+recipes

There is some explanation here:
http://en.wikipedia.org/wiki/Fast_Fourier_transform

Best of luck!

Povl

gloomyandy · Post by **gloomyandy** » 08 Feb 2011, 00:29

For the talk we gave about leJOS at JavaOne a year or so ago, Roger created a demo that used "speech recognition". It wasn't as sophisticated as what Doc has planned but it worked pretty well and had a number of people fooled until we told them how it worked. A video of our test (and backup if we had problems on the day) is here:
http://www.youtube.com/watch?v=sjPzcmWSfQs
Some clips from the actual talk are here:
http://www.youtube.com/watch?v=fJD6vGHKLTQ
The voice demo starts about 4:30 into the clip.

Andy

HaWe · Post by **HaWe** » 08 Feb 2011, 08:07

Well, what was your algorithm like?
Mine is based on the sum of the least square deviations of loudness patterns, and it works quite well, as you may have observed. Note that the Lego Sound Sensor doesn't detect frequencies but just loudness oscillations (dBA). Nevertheless the recognition works (in a well-defined sub-population of rhythmically concise spoken words)!
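
For readers who want to see what a comparison by squared deviations can look like, here is a minimal Python sketch. It is only an assumption about the general technique, not HaWe's actual NXC code: the function names, the word labels, and the short toy patterns are all made up for illustration (real recordings would hold 400 samples each).

```python
def squared_deviation(recording, template):
    """Sum of squared differences between two equally long loudness arrays."""
    if len(recording) != len(template):
        raise ValueError("arrays must have the same length")
    return sum((r - t) ** 2 for r, t in zip(recording, template))

def best_match(recording, templates):
    """Return the word whose stored template deviates least from the recording."""
    return min(templates,
               key=lambda word: squared_deviation(recording, templates[word]))

# Hypothetical toy loudness patterns, one stored template per word.
templates = {
    "forward": [10, 80, 20, 70, 10, 5],
    "stop":    [10, 90, 90, 10, 5, 5],
}
heard = [12, 85, 80, 15, 5, 6]
word = best_match(heard, templates)  # "stop" has the smaller deviation here
```

The smaller the sum, the more closely the loudness rhythm of the new recording follows the stored pattern.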

But something like an FT or FFT seems to be even more promising. Any ideas for an FT or FFT with 10 (max. 20) terms (coefficients, frequencies)...?
I'm not a programmer and not a mathematician, and I have already googled a lot but didn't find anything suitable.

gloomyandy · Post by **gloomyandy** » 08 Feb 2011, 08:37

Hi Doc,
Sorry, I'm not sure how the algorithm worked. It was Roger's demo, so I'll drop him some mail to find out....

Andy

HaWe · Post by **HaWe** » 08 Feb 2011, 18:38

new version with oscillograph (revised version) :)

mightor · Post by **mightor** » 08 Feb 2011, 19:31

HaWe wrote: "new version with oscillograph (revised version)"

Are the graphs with or without a German accent?

This is pretty cool stuff.

- Xander

HaWe · Post by **HaWe** » 08 Feb 2011, 19:35

accent?
what is "accent" ?
;)

HaWe · Post by **HaWe** » 09 Feb 2011, 08:48

Hi,
what do you think: what would be the best way to transfer all those sound arrays as a file to the PC,
e.g. 10 samples of each of 6 spoken words = 60 arrays[400],
in order to process the data on an external computer (with Excel or an ANSI C++ program)?

I think a text file with all numbers separated by ";" would be OK.
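
To make the ";"-separated idea concrete, here is a small Python sketch of the PC side. The layout (one recording per line, the spoken word as the first field, then the samples) is an assumption made for this example. The thread only proposes ";" as the separator. Writing the file on the NXT itself would use NXC's own file functions, which are not shown here.

```python
import io

def write_records(stream, records):
    """Write one line per recording: label first, then the samples, joined by ';'."""
    for label, samples in records:
        stream.write(";".join([label] + [str(v) for v in samples]) + "\n")

def read_records(stream):
    """Parse lines produced by write_records back into (label, samples) pairs."""
    records = []
    for line in stream:
        fields = line.strip().split(";")
        if len(fields) < 2:
            continue  # skip empty or label-only lines
        records.append((fields[0], [int(v) for v in fields[1:]]))
    return records

# In-memory round trip with two short hypothetical recordings.
buffer = io.StringIO()
write_records(buffer, [("stop", [10, 90, 90, 10]), ("go", [5, 60, 5, 60])])
buffer.seek(0)
loaded = read_records(buffer)
```

A file in this layout should also be importable into a spreadsheet as delimited text with ";" as the delimiter, giving one row per recording.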
