MLLP researcher Miguel Á. del Agua Teba has been at Interspeech 2016 in San Francisco (8–12 September) to present his paper “ASR Confidence Estimation with Speaker-Adapted Recurrent Neural Networks”, as well as the MLLP’s submission to the CHiME-4 Speech Separation and Recognition Challenge (13 September). Here are his impressions after an intense conference.
The Interspeech conferences (CORE A), organized by the International Speech Communication Association, are attended annually by more than 1,000 participants from all over the world, who present their work on all the scientific and technological aspects of speech. Interspeech 2016 (#IS2016) focused on the theme Understanding Speech Processing in Humans and Machines, and was held in San Francisco, California, on 8–12 September.
The MLLP’s Miguel Á. del Agua Teba was there to present the paper “ASR Confidence Estimation with Speaker-Adapted Recurrent Neural Networks” (you can find the abstract in our previous news post). Miguel was overwhelmed by the interest of Interspeech attendees in our work. Representatives from both industry and research institutions at #IS2016 (Apple, Microsoft, Nuance…) dropped by to ask him about our recent work in this field. Miguel reports that it was a great occasion for interesting exchanges, and we’d like to thank everyone for their interest.
Miguel also made the most of attending the #IS2016 sessions on Neural Networks for Speech Recognition, where interesting work was reported on CNNs, BLSTMs and CTC-based training. In the #IS2016 proceedings you can find the papers that were presented in sessions such as “Neural Networks in Speech Recognition”, “New Trends in Neural Networks for Speech Recognition”, “Feature Extraction and Acoustic Modelling Using Neural Networks for ASR”, “Far-Field Speech Processing”, “Acoustic Model Adaptation”, “Neural Networks for Language Modelling”, and “Speech Synthesis Oral I: Neural Networks”.
At the trade booth exhibition area, some of the largest players in technology and innovation were present (Google, Apple, Microsoft, Amazon, eBay). Miguel noticed a strong presence of providers of resources for ASR system training, as well as of companies working in the field of Text-to-Speech Synthesis, in which the MLLP is also active.
Then, on 13 September, it was time for the satellite event, the CHiME-4 Speech Separation and Recognition Challenge, co-organized by Google, INRIA, MERL and the University of Sheffield’s SPandH group. This new challenge revisited the datasets originally recorded for CHiME-3: Wall Street Journal sentences read by speakers in challenging noisy environments and recorded with a 6-channel tablet-based microphone array. For CHiME-4, the level of difficulty was increased by constraining the number of microphones available for testing. Miguel, who was in charge of the MLLP’s participation in the challenge, was there to present the MLLP’s submission. You can now see it all in detail, as the organizers have already posted the full CHiME-4 Challenge proceedings and results.
Interspeech 2017 will be held in Stockholm. We look forward to it after this year’s fruitful conference!
Update (6 Oct 2016): The text has been edited to include a link to the Interspeech 2016 proceedings.