2015
Brouns, Francis; Serrano Martínez-Santos, Nicolás; Civera, Jorge; Kalz, Marco; Juan, Alfons
Supporting language diversity of European MOOCs with the EMMA platform
Inproceedings: Proc. of the European MOOC Stakeholder Summit EMOOCs 2015, pp. 157–165, Mons (Belgium), 2015.
@inproceedings{Brouns2015,
title = {Supporting language diversity of European MOOCs with the EMMA platform},
author = {Francis Brouns and Serrano Martínez-Santos, Nicolás and Jorge Civera and Marco Kalz and Alfons Juan},
url = {http://www.emoocs2015.eu/node/55},
year = {2015},
date = {2015-01-01},
booktitle = {Proc. of the European MOOC Stakeholder Summit EMOOCs 2015},
pages = {157--165},
address = {Mons (Belgium)},
abstract = {This paper introduces the cross-language support of the EMMA MOOC platform. Based on a discussion of language diversity in Europe, we introduce the development and evaluation of automated translation of texts and subtitling of videos from Dutch into English. The development of an Automatic Speech Recognition (ASR) system and a Statistical Machine Translation (SMT) system is described. The resources employed and evaluation approach is introduced. Initial evaluation results are presented. Finally, we provide an outlook into future research and development.},
keywords = {Automatic Speech Recognition, EMMA, Statistical machine translation},
pubstate = {published},
tppubtype = {inproceedings}
}
2012
Silvestre-Cerdà, Joan Albert; Andrés-Ferrer, Jesús; Civera, Jorge
Explicit length modelling for statistical machine translation
Journal Article: Pattern Recognition, 45 (9), pp. 3183–3192, 2012, ISSN: 0031-3203.
@article{Silvestre-Cerdà2012a,
title = {Explicit length modelling for statistical machine translation},
author = {Joan Albert Silvestre-Cerdà and Jesús Andrés-Ferrer and Jorge Civera},
url = {http://hdl.handle.net/10251/34996},
issn = {0031-3203},
year = {2012},
date = {2012-01-01},
journal = {Pattern Recognition},
volume = {45},
number = {9},
pages = {3183--3192},
abstract = {Explicit length modelling has been previously explored in statistical pattern recognition with successful results. In this paper, two length models along with two parameter estimation methods and two alternative parametrisation for statistical machine translation (SMT) are presented. More precisely, we incorporate explicit bilingual length modelling in a state-of-the-art log-linear SMT system as an additional feature function in order to prove the contribution of length information. Finally, a systematic evaluation on reference SMT tasks considering different language pairs prove the benefits of explicit length modelling.},
keywords = {Length modelling, Log-linear models, Phrase-based models, Statistical machine translation},
pubstate = {published},
tppubtype = {article}
}
Turró, Carlos; Juan, Alfons; Civera, Jorge; Orliĉ, Davor; Jermol, Mitja
transLectures: Transcription and Translation of Video Lectures
Inproceedings: Proc. of Cambridge 2012: Innovation and Impact - Openly Collaborating to Enhance Education, pp. 543–546, Cambridge (UK), 2012.
@inproceedings{Turró2012,
title = {transLectures: Transcription and Translation of Video Lectures},
author = {Carlos Turró and Alfons Juan and Jorge Civera and Davor Orliĉ and Mitja Jermol},
url = {http://oro.open.ac.uk/id/eprint/33640
http://hdl.handle.net/10251/54166},
year = {2012},
date = {2012-01-01},
booktitle = {Proc. of Cambridge 2012: Innovation and Impact - Openly Collaborating to Enhance Education},
pages = {543--546},
address = {Cambridge (UK)},
abstract = {transLectures is a FP7 project aimed at developing innovative, cost-effective solutions to produce accurate transcriptions and translations in large repositories of video lectures. This paper describes user requirements, first integration steps and evaluation plans at transLectures case studies, VideoLectures.NET and poliMedia.},
keywords = {Automatic Speech Recognition, Statistical machine translation},
pubstate = {published},
tppubtype = {inproceedings}
}
2011
Silvestre-Cerdà, Joan Albert; Andrés-Ferrer, Jesús; Civera, Jorge
Explicit Length Modelling for Statistical Machine Translation
Incollection: Vitrià, Jordi; Sanches, João Miguel; Hernández, Mario (Ed.): Pattern Recognition and Image Analysis (IbPRIA 2011), 6669, pp. 273–280, Springer Berlin Heidelberg, 2011, ISBN: 978-3-642-21256-7.
@incollection{Silvestre-Cerdà2011,
title = {Explicit Length Modelling for Statistical Machine Translation},
author = {Joan Albert Silvestre-Cerdà and Jesús Andrés-Ferrer and Jorge Civera},
editor = {Vitrià, Jordi and Sanches, João Miguel and Hernández, Mario},
url = {http://hdl.handle.net/10251/35749
http://dx.doi.org/10.1007/978-3-642-21257-4_34},
isbn = {978-3-642-21256-7},
year = {2011},
date = {2011-01-01},
booktitle = {Pattern Recognition and Image Analysis (IbPRIA 2011)},
volume = {6669},
pages = {273--280},
publisher = {Springer Berlin Heidelberg},
series = {Lecture Notes in Computer Science},
abstract = {Explicit length modelling has been previously explored in statistical pattern recognition with successful results. In this paper, two length models along with two parameter estimation methods for statistical machine translation (SMT) are presented. More precisely, we incorporate explicit length modelling in a state-of-the-art log-linear SMT system as an additional feature function in order to prove the contribution of length information. Finally, promising experimental results are reported on a reference SMT task.},
keywords = {Length modelling, Log-linear models, Phrase-based models, Statistical machine translation},
pubstate = {published},
tppubtype = {incollection}
}
Silvestre-Cerdà, Joan Albert; García-Martínez, Mercedes; Barrón-Cedeño, Alberto; Civera, Jorge; Rosso, Paolo
Extracción de corpus paralelos de la Wikipedia basada en la obtención de alineamientos bilingües a nivel de frase
Inproceedings: Proceedings of the Workshop on Iberian Cross-Language Natural Language Processing Tasks (ICL 2011), pp. 14–21, CEUR-WS, 2011, ISSN: 1613-0073.
@inproceedings{Silvestre-Cerdà2011b,
title = {Extracción de corpus paralelos de la Wikipedia basada en la obtención de alineamientos bilingües a nivel de frase},
author = {Joan Albert Silvestre-Cerdà and Mercedes García-Martínez and Alberto Barrón-Cedeño and Jorge Civera and Paolo Rosso},
url = {http://hdl.handle.net/10251/27930},
issn = {1613-0073},
year = {2011},
date = {2011-01-01},
booktitle = {Proceedings of the Workshop on Iberian Cross-Language Natural Language Processing Tasks (ICL 2011)},
volume = {824},
pages = {14--21},
publisher = {CEUR-WS},
abstract = {This paper presents a proposal for extracting parallel corpora from Wikipedia on the basis of statistical machine translation techniques. We have used word-level alignment models from IBM in order to obtain phrase-level bilingual alignments between documents pairs. We have manually annotated a set of test English-Spanish comparable documents in order to evaluate the model. The obtained results are encouraging.},
keywords = {Comparable Corpora, Parallel Sentences Extraction, Statistical machine translation},
pubstate = {published},
tppubtype = {inproceedings}
}