Speech recognition breakthrough from Microsoft

  sonic0002      2012-11-09 23:53:21

Recently, Microsoft Research published a demo video on YouTube showing English speech being simultaneously translated into Chinese speech using Microsoft's latest research. The result is quite impressive: it is a large improvement over previous speech recognition systems and a big breakthrough for speech recognition.

In this demo, Microsoft Chief Research Officer Rick Rashid demonstrates a speech recognition breakthrough. First, as he speaks, the screen displays what he is saying in English, which is amazing already, isn't it? Then, at the end of the demo, the really impressive thing happens: his spoken English words are translated into computer-generated Chinese speech via machine translation.

According to Rick Rashid, around 60 years ago people used a pattern matching method to recognize a speaker's voice. They took recordings from speakers and matched them against known waveforms of words. The drawback is that each person speaks differently, with different tones and different volumes, and even the same person speaks differently in different moods and environments. The error rate at that time was relatively high.
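To make the pattern matching idea concrete, here is a minimal sketch. It is not the method any historical system actually used: the templates, the five-sample "waveforms", and the function names are all made up for illustration. It picks the stored word whose waveform is closest to the input, and shows how the match gets worse when the same word is spoken more quietly.

```python
import math

# Hypothetical stored templates: one tiny "waveform" per known word.
TEMPLATES = {
    "yes": [0.0, 0.8, 0.3, -0.5, -0.2],
    "no": [0.0, -0.6, -0.4, 0.7, 0.1],
}


def distance(a, b):
    """Euclidean distance between two equal-length waveforms."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(waveform):
    """Return the known word whose template is closest to the input."""
    return min(TEMPLATES, key=lambda word: distance(waveform, TEMPLATES[word]))


# An utterance that closely resembles the stored "yes" template.
close_yes = [0.0, 0.7, 0.35, -0.45, -0.2]
# The same shape spoken at a tenth of the volume.
quiet_yes = [0.1 * sample for sample in TEMPLATES["yes"]]

print(recognize(close_yes))
print(distance(close_yes, TEMPLATES["yes"]))
print(distance(quiet_yes, TEMPLATES["yes"]))
```

The quiet utterance has the same shape as the template, yet its distance to it is much larger. This is the kind of speaker-to-speaker variation that makes plain waveform matching error-prone.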

In the late 1970s a major change in speech recognition came out of work done at Carnegie Mellon University: a statistical approach called the hidden Markov model. Researchers collected a lot of data from different speakers and built a statistical model from it. It was a large improvement in speech recognition, but it still made mistakes. The error rate for arbitrary speech was 20%-25%.

A few years ago, researchers from Microsoft Research and the University of Toronto came together and brought out a new way to do speech recognition. They used a technique called neural networks, which is patterned after the human brain. This reduces the error rate by about 30%. That means that in the past roughly one word in four or five was wrong, but now it is about one word in seven.
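A quick check of the arithmetic, assuming the 30% figure is a relative reduction in word error rate applied to the 20%-25% error rate quoted earlier (the function names are only for illustration):

```python
def reduced_error_rate(baseline, relative_reduction):
    """Error rate left after a relative reduction is applied to a baseline."""
    return baseline * (1.0 - relative_reduction)


def one_in_n(error_rate):
    """Express an error rate as 'one word wrong in every N words'."""
    return 1.0 / error_rate


for baseline in (0.20, 0.25):
    new_rate = reduced_error_rate(baseline, 0.30)
    print(f"{baseline:.0%} -> {new_rate:.1%}, about 1 word in {one_in_n(new_rate):.1f}")
```

Under that assumption the new error rate comes out at roughly 14% to 17.5%, which is in line with the "1 out of 7 words" figure.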

Today, machine translation is used to take this further. The combination of statistics and big data has let us do a much better job, and it helps translate one language into another in a short time. Rick Rashid demonstrates how to translate English into Chinese. It happens in two steps: 1. The English words from the speaker are translated into Chinese words by Microsoft's translation system. 2. The Chinese words are converted into Chinese speech by the text-to-speech system.
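The two steps can be pictured as a small pipeline. The sketch below is only a toy: the dictionary, the function names, and the fake "audio" label are all hypothetical stand-ins, and a real translation system does far more than word-for-word lookup.

```python
# Toy stand-ins for the real systems; every name here is hypothetical.
TOY_DICTIONARY = {"hello": "你好", "world": "世界"}


def translate_to_chinese(english_words):
    """Step 1: map English words to Chinese words (toy word-for-word lookup)."""
    return [TOY_DICTIONARY.get(word.lower(), word) for word in english_words]


def text_to_speech(chinese_words):
    """Step 2: stand-in for speech synthesis; returns a label instead of audio."""
    return "<audio:" + "".join(chinese_words) + ">"


def speech_to_speech(english_words):
    """Chain the two steps: translate first, then synthesize."""
    return text_to_speech(translate_to_chinese(english_words))


print(speech_to_speech(["Hello", "world"]))
```

Each stage only needs the output of the previous one, so the translation system and the text-to-speech system can be improved independently.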

If Microsoft's translation really works as well as demonstrated in the video, then it is truly a breakthrough. It means that in the future we will not need to bring a translator with us when we visit another country, and some conferences will no longer need simultaneous interpreters. Still, just as Rick Rashid said, there is a long way to go. We look forward to seeing more technical details about the new machine translation.

Here is the demo video:



