Unity3D & Karaoke / Speech recognition
Dear Experts,
I have a question regarding Unity3D and an application we are developing. We need to compare speech input from users through a mic, for instance the pronunciation of the letter "B". I thought that a Karaoke system or API would do the job even if it wasn't completely accurate. Is there anything already established for Unity3D and speech recognition / Karaoke?
I appreciate the help.
It's a good thing it's actually regarding Unity3D and an application you are developing. I would have feared a question about God or how to bake a cake.
Answer by Noise crime · Jul 14, 2010 at 01:00 PM
I'd be careful about your level of expectations. I seriously doubt any Karaoke machine does such advanced speech recognition. Most singing games will simply track pitch, as that's relatively easy. Being able to detect specific pronunciations is much harder. That's not to say it can't be done, but I'd imagine that is state of the art (at least for ones with a good recognition percentage) and would have a price tag to match.
If you search online you should find DLL plugins, but I doubt they will be cheap. Then it's a question of getting that to work in Unity via its own plugin architecture.
The alternative is to look into an FFT (Fast Fourier Transform) algorithm (or DLL) and build on top of that to detect pitch range. An FFT (in this case) provides a snapshot of the magnitude of every frequency present in a specific timeslice of the audio input. Ha, that's not quite as simple as I was hoping it was going to sound ;)
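To make the "snapshot of magnitudes" idea concrete, here is a minimal sketch of pitch detection on one timeslice. It is not taken from any Unity API: the function names (`fft`, `dominant_frequency`, `sine_tone`), the 8000 Hz sample rate and the 1024-sample slice are all assumptions chosen for illustration, and Python is used only for brevity. The same structure should port to C# or a C++ plugin.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT. Length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequency(samples, sample_rate):
    """Return the centre frequency (Hz) of the strongest non-DC bin."""
    n = len(samples)
    # A Hann window reduces spectral leakage at the slice edges.
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(samples)
    ]
    spectrum = fft(windowed)
    # Only the first half of the bins is useful for real-valued input.
    magnitudes = [abs(c) for c in spectrum[: n // 2]]
    peak_bin = max(range(1, n // 2), key=lambda k: magnitudes[k])
    return peak_bin * sample_rate / n


def sine_tone(freq_hz, sample_rate, count):
    """Hypothetical stand-in for one timeslice of microphone input."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(count)]


if __name__ == "__main__":
    tone = sine_tone(440.0, 8000, 1024)
    # Should land within one bin width (8000 / 1024, about 7.8 Hz) of 440 Hz.
    print(dominant_frequency(tone, 8000))
```

Picking the single strongest bin only works for clean, steady tones. A real voice has harmonics that can be louder than the fundamental, so this is a starting point rather than a finished pitch tracker.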
Anyway, there is plenty of literature online about FFT and you should easily find code samples. It could feasibly be built in C#, though you might be burning up CPU cycles that could be used elsewhere, in which case a C++ plugin will be needed, or you could even try doing it on the GPU. Granted there is much to learn, stuff like the Nyquist theorem, how to make the FFT efficient etc., but it shouldn't be out of reach.
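As a rough illustration of why the Nyquist theorem matters here: the sample rate caps the highest frequency you can detect (half the sample rate), and the FFT size sets how finely the bins are spaced. The helper names below are hypothetical, and the 44100 Hz rate is just an assumed, commonly used microphone setting.

```python
def nyquist_limit(sample_rate):
    """Highest frequency (Hz) that can be represented at this sample rate."""
    return sample_rate / 2.0


def bin_width(sample_rate, fft_size):
    """Spacing in Hz between neighbouring FFT bins."""
    return sample_rate / fft_size


def apparent_frequency(true_hz, sample_rate):
    """Frequency a pure tone shows up at after sampling.

    Anything above the Nyquist limit folds back (aliases) into 0..limit.
    """
    folded = true_hz % sample_rate
    return min(folded, sample_rate - folded)


if __name__ == "__main__":
    rate = 44100  # assumed sample rate
    print(nyquist_limit(rate))              # upper bound on detectable pitch
    print(bin_width(rate, 1024))            # pitch resolution of a 1024-point FFT
    print(apparent_frequency(30000, rate))  # a 30 kHz tone aliases to a lower one
```

A larger FFT size gives finer pitch resolution but needs a longer timeslice, so it reacts more slowly. That trade-off is a large part of the tuning work.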
Personally, that's where I'd start. I wouldn't even dream of trying to detect actual words or letters, just pitch.
Answer by Mike 3 · Jul 14, 2010 at 11:13 AM
I don't believe so, no. Most likely you'll need to look for a C++ API and hook into that (via Unity Pro).
Answer by makisig.du · Sep 18, 2011 at 10:09 AM
Hi, there!
Your question cannot be answered in just one reply here, unfortunately. Speech processing is a big discipline on its own and it took my team several weeks to research and develop it. Anyway, Noisecrime already gave you something to start with which is where we started ourselves.
My team and I successfully implemented a cross-platform speech recognition plugin for Unity. We can probably help you out - just drop me a line.
Maybe you could consider releasing that as a plugin in the Asset Store?