U wot m8? Microsoft Speech Recognition Beats Humans
Supplicating to our machine overlords will soon be no problem at all, as a team from Microsoft's Speech & Dialog research group have announced their development of speech recognition software that recognises words in a conversation as well as a human does.
In the report, released on Monday, the team described how the software achieved a word error rate (WER) of just 5.9%, about the same as a professional amanuensis (try and transcribe that one - the human resistance lives on ✊) and, Microsoft claims, the lowest ever recorded against the industry standard Switchboard test. “We’ve reached human parity,” said Xuedong Huang, the company’s chief speech scientist. “This is an historic achievement.” Achievement unlocked!
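For the curious, WER is conventionally worked out as the number of word substitutions, deletions and insertions needed to turn the machine's transcript into the reference transcript, divided by the number of words in the reference. Here's a minimal Python sketch of that calculation. It is not Microsoft's scoring code: the function name `word_error_rate` and the example sentences are made up for illustration, and real benchmark scoring normalises text more carefully than a simple lowercase-and-split.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / words in the reference."""
    # Assumption: lowercase-and-split is enough normalisation for this toy example.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match (cost 0) or substitution (cost 1)
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one wrong word out of six.
    wer = word_error_rate("it is nice to meet you", "it is nice to meat you")
    print(f"WER: {wer:.1%}")
```

On that scale, a 5.9% WER works out at roughly 59 slips per 1,000 words spoken.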
This marks the first time that AI transcription software has hit this level of accuracy, a goal the Microsoft team set themselves a year ago, and one they thought couldn't be hit even in five years. Geoffrey Zweig, who manages the Speech & Dialog research group, ascribed the monumental progress made to the improvements in neural network technology and its usage.
Neural language models represent words as continuous vectors in space, where semantically similar words sit close together, which lets the system generalise well when choosing between candidate words. Combined with deep learning algorithms running on specialised GPUs, which dramatically sped up training, this brought the software to the milestone it's at today.
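To make the "close together" idea a little more concrete, here's a rough Python sketch that measures how near two word vectors are using cosine similarity. Everything in it is an assumption for illustration: the three-dimensional vectors in `TOY_VECTORS` are invented by hand (real models learn vectors with hundreds of dimensions), and the helper names are hypothetical rather than anything from Microsoft's system.

```python
import math

# Invented toy vectors, for illustration only. A real neural language
# model learns these values from large amounts of text.
TOY_VECTORS = {
    "hello": [0.9, 0.1, 0.0],
    "hi": [0.8, 0.2, 0.1],
    "peas": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_word(word, vectors=TOY_VECTORS):
    """Return the other word whose vector points most nearly the same way."""
    target = vectors[word]
    others = (w for w in vectors if w != word)
    return max(others, key=lambda w: cosine_similarity(target, vectors[w]))


if __name__ == "__main__":
    print(nearest_word("hello"))
```

With these made-up numbers, "hello" lands next to "hi" and nowhere near "peas", which is the kind of neighbourhood a model can lean on when it has to pick between plausible words.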
The continuing advancement of voice recognition tech holds a variety of implications, from speech-to-text software to digital AI assistants like Cortana. On social media, it could let us control our networks through speech, changing the way we interact with them and each other. Indeed, a while back Facebook purchased speech recognition startup Wit.ai.
I hope that when Microsoft implements this kind of tech, they will have learnt from some of their previous experiences with AI. And maybe this will mean some more accurate translation software is in the pipeline, better than Skype Translator, which was at one point translating "It's nice to meet you!" into "It's f***ing nice to f*** you!" Maybe they skipped the voice recognition part and went straight to mindreading.
Regardless, fast, accurate speech recognition holds a lot of potential in terms of how users can interact with their computers. Microsoft CEO Satya Nadella said that the power of fully interacting with an OS could be a quantum leap forward as large as the introduction of the GUI. At least we'll know the robot masters are lying when they pretend not to understand us. "Please, mercy!"
"WHAT'S THAT? PEAS, MURPHY?"
Squad goals for the team in the future include better tuning the AI to recognise speech in surroundings with a lot of background noise, better assigning individual voices to individual people, and better recognising a greater variety of pronunciations. Long term, the next step will be to move from recognition to understanding: the difference between recognising that the machines will eventually be able to understand speech, and understanding that this will be part of the sentient singularity that marks the downfall of us all. And of course there is still work to be done on the robot's ability to deal with awkward silences, bad jokes, and mumblers.
Speech recognition first took off as a viable technology with DARPA in the 70s, and has since become more and more of a mainstream tech concern, from IBM's Tangora voice-activated typewriter in the 80s to the early 90s Sphinx-II. Xuedong Huang, developer of the Sphinx-II, actually started the speech recognition group at Microsoft and was part of the team that just hit this milestone.
At least this breaks down one more barrier towards having our own personal AI assistants, which would be Cortana in Microsoft's case. Just heed the warnings from Her about falling for your AI - definitely don't do a Master Chief and fall in love with Cortana. Unless they get Scarlett Johansson to do the voice, then feel free to fall away. Zuckerberg already got Robert Downey Jr to voice his AI... Get on it, Bill!
The research group includes: Wayne Xiong, Geoffrey Zweig, Xuedong Huang, Dong Yu, Frank Seide, Mike Seltzer, Jasha Droppo and Andreas Stolcke.
With a master’s in Literature, Sam inhales books and anything readable, spending his working hours reformulating the info he gathers into digestible articles. When not reading or writing, he likes to put his camera to work around the world, snapping street photography from Stockholm to Tokyo. Too much of this time spent in Japan teaching English has nurtured a weakness for sashimi, Japanese whisky, and robot cafés. Follow him @SamF_Songbird
Reviewed by Unknown on Monday, October 24, 2016