IBM today announced that it broke the industry record for speech recognition, with technology that brings the recognition of spoken words ever closer to human parity.
Last year, IBM announced a major improvement in conversational speech recognition: a system that achieved a 6.9 percent word error rate. Since then, IBM researchers have continued to push the boundaries of accuracy, achieving this historic milestone and setting an industry record of 5.5 percent, a 20 percent relative improvement over the rate that was reported six months prior.
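For readers who want to check the arithmetic, the improvement figure can be reproduced as a relative reduction in word error rate. The short sketch below assumes that reading of the figure; the function name `relative_reduction` is illustrative and not part of any IBM tooling.

```python
def relative_reduction(old_rate: float, new_rate: float) -> float:
    """Return the fractional reduction going from old_rate to new_rate."""
    return (old_rate - new_rate) / old_rate


# Word error rates quoted in the announcement, in percent.
previous_wer = 6.9
current_wer = 5.5

reduction = relative_reduction(previous_wer, current_wer)
print(f"Relative reduction in word error rate: {reduction:.1%}")
```

The absolute drop is 1.4 percentage points; dividing by the earlier 6.9 percent rate gives a relative reduction of about 20 percent.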
“These speech developments build on decades of research, and achieving speech recognition comparable to that of humans is a complex task. At IBM, we are dedicated to creating the technology that will one day match the complexity of how the human ear, voice and brain interact,” said Michael Karasick, IBM Vice President, Cognitive Computing. “This progress will have important implications for how man and machine collaborate in the future, making the interactions more natural and productive. We believe it is only a matter of time before we achieve parity on speech recognition with humans.”
The success of speech recognition technology is measured against human parity, an error rate on par with that of two humans speaking. Previously, human parity was considered to be a 5.9 percent word error rate. IBM partnered with Appen, a speech and technology service provider, to reassess the industry benchmark and determined that human parity is lower than what anyone has yet achieved: 5.1 percent.