Thursday, June 12, 2008

ASIMO voice recognition improving

New Scientist reports: [edited]

ASIMO just got a new superpower: it can understand three humans shouting at once. For now, the modified ASIMO's new ability is being used to judge rock-paper-scissors contests. But the number of voices and the complexity of the sentences the software can deal with should grow in future.

Hiroshi Okuno at Kyoto University and Kazuhiro Nakadai at the Honda Research Institute in Saitama designed the new software, which they call HARK.

HARK uses an array of eight microphones to work out where each voice is coming from and isolate it from other sound sources. The software then works out how reliably it has extracted an individual voice, before passing it on to speech-recognition software to decode.
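To give a rough feel for how a microphone array can work out where a voice is coming from, here is a minimal sketch using just two microphones. It is not HARK's actual method, which the article does not describe. It assumes a textbook approach: find the time delay between the two microphones by cross-correlation, then convert that delay into a bearing. The function names, the 16 kHz sample rate, and the 0.2 m microphone spacing are all made up for illustration.

```python
import math

def estimate_delay(mic_a, mic_b, max_lag):
    """Return the lag (in samples) at which mic_b best matches mic_a.

    A positive result means the sound reached mic_b later than mic_a.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation score for this candidate lag.
        score = 0.0
        for i, a in enumerate(mic_a):
            j = i + lag
            if 0 <= j < len(mic_b):
                score += a * mic_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def bearing_degrees(delay_samples, sample_rate, mic_spacing_m, speed_of_sound=343.0):
    """Convert a delay into an angle from broadside, assuming a distant source."""
    ratio = delay_samples / sample_rate * speed_of_sound / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding past +/-1
    return math.degrees(math.asin(ratio))

# Synthetic signal: a decaying tone that reaches mic B three samples after mic A.
burst = [math.sin(2 * math.pi * 0.05 * n) * math.exp(-0.01 * n) for n in range(200)]
mic_a = burst + [0.0] * 10
mic_b = [0.0] * 3 + burst + [0.0] * 7

delay = estimate_delay(mic_a, mic_b, max_lag=8)
# Hypothetical hardware values: 16 kHz sampling, microphones 0.2 m apart.
angle = bearing_degrees(delay, sample_rate=16000, mic_spacing_m=0.2)
print(delay, round(angle, 1))
```

With more than two microphones, the same kind of delay estimate can be made for several pairs at once, which is what lets an array tell several simultaneous sources apart.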

The HARK system actually goes beyond normal human listening capabilities, Okuno told New Scientist. "It can listen to several things at once, and not just focus on a particular single sound source."


Skep said...

A good idea, but I've always thought that the best way of teaching computers to recognise speech and voice is to build what is essentially a baby. A machine with very limited ability, nothing yet functional to the point of helping itself, but with the programming to allow it to learn, to recognise our speech in the same way a child would.
After all, the best designs always come from nature, and so far, teaching it an array of words without letting it know how sentence structure and pronunciation work has proved far less than effective.

brett jordan said...

hi skep

the HARK project is interesting because it is aiming to do stuff that humans can't... however, i agree that if we are to produce robots that mimic humans, then the 'learning' method of programming is significant

