The MLLP will be at Interspeech 2016 and CHiME 4

The MLLP is back at work after the summer holidays. MLLP researcher Miguel Á. del Agua Teba will be at Interspeech 2016 in San Francisco (8–12 September) to present his article on “ASR Confidence Estimation with Speaker-Adapted Recurrent Neural Networks”, and to present the MLLP’s submission to the CHiME 4 Speech Separation and Recognition Challenge (13 September).

The annual Interspeech conferences, organized by the International Speech Communication Association, cover all scientific and technological aspects of speech. More than 1,000 participants from all over the world attend this CORE A conference each year to present their work in oral and poster sessions. Interspeech 2016 is organized around the theme “Understanding Speech Processing in Humans and Machines” and will be held at the Hyatt Regency San Francisco hotel in San Francisco, California, on 8–12 September.

The MLLP article “ASR Confidence Estimation with Speaker-Adapted Recurrent Neural Networks” was accepted for publication at Interspeech 2016, and its first author, Miguel Á. del Agua Teba, will be there to present it. You can read the paper’s abstract below. The poster presentation is scheduled for 12 September, from 10:00 to 12:00, in the Poster D area of the conference centre’s Pacific Concourse (within the session “Robustness and adaptation”), so don’t forget to drop by!

ASR Confidence Estimation with Speaker-Adapted Recurrent Neural Networks

Miguel Ángel del-Agua, Santiago Piqueras, Adrià Giménez, Alberto Sanchis, Jorge Civera, Alfons Juan

Abstract: Confidence estimation for automatic speech recognition has been very recently improved by using Recurrent Neural Networks (RNNs), and also by speaker adaptation (on the basis of Conditional Random Fields). In this work, we explore how to obtain further improvements by combining RNNs and speaker adaptation. In particular, we explore different speaker-dependent and speaker-independent data representations for Bidirectional Long Short Term Memory RNNs of various topologies. Empirical tests are reported on the LibriSpeech dataset, showing that the best results are achieved by the proposed combination of RNNs and speaker adaptation.

Miguel will also attend the CHiME 4 Speech Separation and Recognition Challenge, a satellite event held on 13 September and co-organized by Google, INRIA, MERL and the University of Sheffield’s SPandH group. This new challenge revisits the datasets originally recorded for CHiME-3, i.e., Wall Street Journal corpus sentences spoken by talkers in challenging noisy environments and recorded using a 6-channel tablet-based microphone array. CHiME-4 increases the level of difficulty by constraining the number of microphones available for testing. Miguel was in charge of the MLLP’s participation in the challenge, and will be there to present the MLLP system and results. You can read the abstract below for more details.

The MLLP system for the 4th CHiME Challenge

Miguel Ángel del-Agua, Adrià Martínez-Villaronga, Adrià Giménez, Alberto Sanchis, Jorge Civera, Alfons Juan

Abstract: The MLLP’s CHiME-4 system is presented in this paper. It has been built using the transLectures-UPV toolkit (TLK), developed by the MLLP research group, which makes use of state-of-the-art speech techniques. Our best system built for the CHiME-4 challenge consists of a combination of different sub-systems in order to deal with the variety of acoustic conditions. Each sub-system, in turn, follows a hybrid approach with different acoustic models, such as Deep Neural Networks and BLSTM Networks.

The MLLP is very glad to participate in this year’s Interspeech and CHiME. We look forward to seeing you there!