Jorge, Javier ; Martínez-Villaronga, Adrià ; Golik, Pavel ; Giménez, Adrià ; Silvestre-Cerdà, Joan Albert ; Doetsch, Patrick ; Císcar, Vicent Andreu ; Ney, Hermann ; Juan, Alfons ; Sanchis, Albert MLLP-UPV and RWTH Aachen Spanish ASR Systems for the IberSpeech-RTVE 2018 Speech-to-Text Transcription Challenge Inproceedings Proc. of IberSPEECH 2018: 10th Jornadas en Tecnologías del Habla and 6th Iberian SLTech Workshop, pp. 257–261, Barcelona (Spain), 2018. Abstract | Links | BibTeX | Tags: Automatic Speech Recognition, Iberspeech-RTVE-Challenge2018, IberSpeech2018, Speech-to-Text @inproceedings{Jorge2018,
title = {MLLP-UPV and RWTH Aachen Spanish ASR Systems for the IberSpeech-RTVE 2018 Speech-to-Text Transcription Challenge},
author = {Jorge, Javier and Martínez-Villaronga, Adrià and Golik, Pavel and Giménez, Adrià and Silvestre-Cerdà, Joan Albert and Doetsch, Patrick and Císcar, Vicent Andreu and Ney, Hermann and Juan, Alfons and Sanchis, Albert},
doi = {10.21437/IberSPEECH.2018-54},
year = {2018},
date = {2018-01-01},
booktitle = {Proc. of IberSPEECH 2018: 10th Jornadas en Tecnologías del Habla and 6th Iberian SLTech Workshop},
pages = {257--261},
address = {Barcelona (Spain)},
abstract = {This paper describes the Automatic Speech Recognition systems built by the MLLP research group of Universitat Politècnica de València and the HLTPR research group of RWTH Aachen for the IberSpeech-RTVE 2018 Speech-to-Text Transcription Challenge. We participated in both the closed and the open training conditions. The best system built for the closed conditions was a hybrid BLSTM-HMM ASR system using one-pass decoding with a combination of an RNN LM and show-adapted n-gram LMs. It was trained on a set of reliable speech data extracted from the train and dev1 sets using the MLLP’s transLectures-UPV toolkit (TLK) and TensorFlow. This system achieved 20.0% WER on the dev2 set. For the open conditions, we used approx. 3800 hours of out-of-domain training data from multiple sources and trained a one-pass hybrid BLSTM-HMM ASR system using the open-source tools RASR and RETURNN developed at RWTH Aachen. This system scored 15.6% WER on the dev2 set. The highlights of these systems include robust speech data filtering for acoustic model training and show-specific language modelling.},
keywords = {Automatic Speech Recognition, Iberspeech-RTVE-Challenge2018, IberSpeech2018, Speech-to-Text},
pubstate = {published},
tppubtype = {inproceedings}
}
This paper describes the Automatic Speech Recognition systems built by the MLLP research group of Universitat Politècnica de València and the HLTPR research group of RWTH Aachen for the IberSpeech-RTVE 2018 Speech-to-Text Transcription Challenge. We participated in both the closed and the open training conditions. The best system built for the closed conditions was a hybrid BLSTM-HMM ASR system using one-pass decoding with a combination of an RNN LM and show-adapted n-gram LMs. It was trained on a set of reliable speech data extracted from the train and dev1 sets using the MLLP’s transLectures-UPV toolkit (TLK) and TensorFlow. This system achieved 20.0% WER on the dev2 set. For the open conditions, we used approx. 3800 hours of out-of-domain training data from multiple sources and trained a one-pass hybrid BLSTM-HMM ASR system using the open-source tools RASR and RETURNN developed at RWTH Aachen. This system scored 15.6% WER on the dev2 set. The highlights of these systems include robust speech data filtering for acoustic model training and show-specific language modelling. |