Monday, November 12, 2012
Machine translator speaks Chinese in your voice
New Scientist reports: [edited]
Microsoft Research has demonstrated not only how to convert spoken English into Mandarin with just a few seconds' delay - but also how to output that Mandarin speech in the vocal style of the original speaker.
The technology was demonstrated by Microsoft's research chief Rick Rashid in Tianjin, China, on 25 October.
Rashid spoke just eight English sentences into the lab's new speech-recognition, translation and generation system, yet the company reports the Mandarin output wowed a crowd of 2000 students and academics.
Microsoft's trick is to use a novel neural-network (machine learning) system that reduces word-recognition errors to about one word in seven or eight. That means the translation engine, Bing Translate, has a far better chance of creating intelligible Mandarin text to feed into the speaking engine.
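To put "one in seven or eight" in concrete terms: a common way to score a speech recogniser is word error rate, the number of word substitutions, insertions and deletions needed to turn its output into the correct transcript, divided by the length of that transcript. The report does not say which metric Microsoft used, so the sketch below is only an illustration under that assumption; the function name and the sample sentences are made up for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcript and recogniser output: one wrong word out of eight.
spoken = "machine translation can now speak in your voice"
recognised = "machine translation can now speak in your choice"
print(word_error_rate(spoken, recognised))
```

One wrong word in an eight-word sentence gives a rate of 1/8, or 12.5 per cent, and one in seven would be roughly 14 per cent, which is the range the article describes.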
But the real prize here is the generation of Mandarin speech in a voice like the speaker's own: if you can preserve the speaker's vocal cadence in the translation, their meaning will be more apparent and the conversation will be more effective.
This was done by having Rashid train a machine-learning algorithm for a full hour, rather than through the quick recitation of a stock page of text that software like Dragon NaturallySpeaking asks for.