Current MLLP projects and contracts
- X5gon: Cross Modal, Cross Cultural, Cross Lingual, Cross Domain and Cross Site Global OER Network
- Period: 1/9/2017 – 31/12/2020
- Project supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 761758.
The X5gon proposal stands for innovative technology elements that bring together currently scattered Open Educational Resources (OER) available in various modalities across Europe and the globe. X5gon combines content understanding, user modelling, and quality assurance methods and tools to build a homogeneous network of OER sites and to provide users (teachers, learners) with a common learning experience. X5gon deploys open technologies for recommendation, learning analytics and learning personalisation services that work across various OER sites, independent of languages, modalities, scientific domains, and socio-cultural contexts.
> Visit the X5gon official site
> Find the X5gon project factsheet at the EU CORDIS database
- Multisub: Multilingual subtitling of classrooms and plenary sessions
- Period: 1/1/2019 – 31/12/2021
- Research project supported by the Spanish Ministry of Science under ref. no. RTI2018-094879-B-I00 (MCIU/AEI/FEDER, EU).
Universities worldwide are beginning to see benefits in producing OER as lecture recordings to support teaching and learning activities. These recordings are typically published on online learning platforms and are becoming increasingly popular among students. In the same way, parliaments are increasingly using video as a powerful way to draw society’s attention to their work, making live streams and recordings available and allowing users to search for parts of plenary sessions or speeches. The main aim of this project is to further improve the state of the art in Automatic Speech Recognition (ASR) and Machine Translation (MT) to deal with these kinds of audiovisual collections, addressing: i) channel variability, noise and reverberation; ii) far-field speech recognition; iii) speaker diarization; iv) multispeaker recognition; v) on-line speech recognition; vi) neural machine translation; vii) translation of sentences containing ASR errors. The technology developed will be integrated into the Opencast and Transparency Portal platforms to enable real-life impact.
> Find out more about Multisub
- À Punt – UPV: R&D collaboration agreement between the Corporació Valenciana de Mitjans de Comunicació (À Punt) and the Universitat Politècnica de València (UPV) for real-time computer assisted subtitling of audiovisual contents based on artificial intelligence
- Period: 6/10/2020 – 5/10/2022
- R&D collaboration agreement between the CVMC (À Punt) and UPV (VRAIN-MLLP).
The Corporació Valenciana de Mitjans de Comunicació (À Punt) is the public broadcaster of the region of València, Spain, with its À Punt channels on TV, radio and the web. The goal of this joint project is to research, develop and deploy automatic and computer-assisted subtitling technology (by the MLLP group) for À Punt’s live and recorded broadcast contents, in Valencian/Catalan and Spanish, based on artificial intelligence.
> Read the announcement of the À Punt – UPV R&D collaboration agreement
- VideoLectures.NET – MLLP: Technological support agreement between the Universitat Politècnica de València (UPV) and the Jožef Stefan Institute (IJS) for the provision of online video transcription and translation services
- Period: 1/10/2016 – 27/11/2020
- Technological support agreement between the IJS (VideoLectures.NET) and UPV (MLLP).
The Jožef Stefan Institute (IJS), the leading Slovenian scientific research institute, operates the award-winning VideoLectures.NET, one of the biggest academic online video repositories in the world, which offers free and open access to over 19,000 video lectures. Under this agreement, the MLLP group provides online video transcription and translation services in several languages for the IJS and VideoLectures.NET.
> Read the announcement of the VideoLectures.NET-MLLP technological support agreement
- tL-UC3M: Provision of automatic video transcription and translation services to the Carlos III University of Madrid
- Period: 1/7/2014 – 11/11/2020
- Technology transfer contract between UC3M and UPV (MLLP).
The Carlos III University of Madrid (UC3M), Spain, has been making a strong effort in the production of quality video lecture-based courses. Under this contract, the MLLP group provides the UC3M with technology for the automatic transcription and translation of their video lectures in Spanish and English.
> Read the announcement of the tL-UC3M technology transfer contract
- poliTrans: Official automatic video transcription and translation service by Universitat Politècnica de València
- Period: 1/1/2017 –
- Project funded by Universitat Politècnica de València.
The MLLP group maintains poliTrans, the UPV’s official automatic multilingual transcription and translation service, now available to all interested universities and organizations. poliTrans is also the UPV’s internal automatic transcription and translation service in the university’s official languages, English, Catalan and Spanish, for all videos in the UPV’s institutional video lecture repository, poliMèdia. Users can choose the language of the subtitles while viewing a video lecture, and they can also edit the subtitles if they find any mistakes. The MLLP group is also working on the challenge of extending this service to UPV video lectures recorded through live lecture capture in classrooms within the UPV’s Opencast-based project Videoapunts.
> Read the announcement of poliTrans’ launch
> Try poliTrans, the UPV’s official automatic video transcription and translation service
Past projects and contracts led by MLLP researchers
- EMMA: European Multiple MOOC Aggregator
- Period: 1/2/2014 – 31/7/2016
- Project supported by the European Union’s Competitiveness and Innovation Framework Programme (CIP) under grant agreement no. 621030.
The European Multiple MOOC Aggregator, called EMMA for short, was a 30-month pilot action supported by the European Union. Its aim and result was to showcase excellence in innovative teaching methodologies and learning approaches through the large-scale piloting of MOOCs on different subjects. EMMA provided a system for the delivery of free, open, online courses in multiple languages from different European universities to help preserve Europe’s rich cultural, educational and linguistic heritage and to promote real cross-cultural and multi-lingual learning.
> Find out more about the results of EMMA
- transLectures: Transcription and translation of video lectures
- Period: 1/11/2011 – 31/10/2014
- Project supported by the European Union’s Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 287755.
transLectures was an EU-funded project to develop innovative, cost-effective tools for the automatic transcription and translation of online educational videos. Automatic transcription tools were developed to provide verbatim subtitles of the talks recorded on video, thereby allowing the hard-of-hearing to access this content. Language learners and other non-native speakers can also benefit from these monolingual subtitles. At the same time, machine translation tools were developed to make these subtitles available in languages other than that in which the video was recorded.
> Find out more about the results of transLectures
- AppTek – MLLP: Technological support agreement between the Universitat Politècnica de València (UPV) and AppTek for the provision of consulting services on machine learning and language processing
- Period: 14/12/2017 – 13/12/2019
- Technological support agreement between AppTek and UPV (MLLP).
AppTek (McLean, Virginia, USA) is a leading company providing automatic speech recognition and machine translation technology for the transcription, translation and analysis of telephony, audio and video content to government organizations, media agencies, call centres and leading retailers. Under this agreement, the MLLP group provides consulting services on machine learning and language processing to AppTek.
> Read the announcement of the AppTek-MLLP technological support agreement
- CdT – MLLP: Technological support agreement between the Universitat Politècnica de València (UPV) and the Translation Centre for the Bodies of the European Union (CdT) for the provision of remote video transcription and translation services
- Period: 25/11/2015 – 24/11/2016
- Technological support agreement between CdT and UPV (MLLP).
The Translation Centre for the Bodies of the European Union (CdT) is the EU public agency in charge of providing and coordinating translation services for the EU agencies scattered across Europe, covering the 24 official EU languages. Under this agreement, the MLLP group provided remote video transcription and translation services in several languages to the CdT.
> Read the announcement of the CdT-MLLP technological support agreement
- tL-UPV: Provision of automatic video transcription and translation services to the Universitat Politècnica de València
- Period: 1/11/2014 – 31/10/2016
- Project funded by Universitat Politècnica de València.
The MLLP group has been maintaining an automatic transcription and translation service in the UPV’s official languages, English, Catalan and Spanish, for all videos in the UPV’s institutional video lecture repository, poliMèdia. Users can choose the language of the subtitles while viewing a video lecture, and they can also edit the subtitles if they find any mistakes. The MLLP group has also worked on the challenge of extending this service to UPV video lectures recorded through live lecture capture in classrooms within the UPV’s Opencast-based project Videoapunts.
> Watch any poliMèdia lecture with our automatic subtitles: media.upv.es
> Find out more about the UPV’s lecture capture project Videoapunts: UPV paper presented at EUNIS 2013
- MORE: Multilingual Open Resources for Education
- Period: 1/1/2016 – 31/12/2018
- Research project supported by the Spanish Ministry of Economy and Competitiveness (MINECO) under reference no. TIN2015-68326-R.
This project aims to dramatically foster Open Education by providing multilingual access to OER and by enabling multilingual online communication in MOOC platforms. These two general purposes can be achieved by using a combination of transdisciplinary tools: automatic speech recognition (ASR), speech synthesis (Text-To-Speech, TTS), statistical machine translation (SMT) and dialogue. By using ASR, TTS and SMT, multilingual access to OER will be possible for everyone regardless of their mother tongue or learning abilities. By adapting SMT models to the specific structure of dialogues in discussion forums, learners will be enabled to conduct cross-lingual conversations. Additionally, by adapting the cross-lingual dialogue tools thus developed, an ambitious automatic virtual educator will provide academic assistance to learners, increasing feedback from the system. The tools developed will be integrated into Matterhorn and Open edX, with the aim of evaluating them on real-life data and settings, and ensuring that our innovative solutions will rapidly spread over many OER and MOOC providers worldwide.
> Find out more about MORE
- Active2Trans: Active Interaction for Speech Transcription and Translation
- Period: 1/2/2013 – 31/1/2016
- Research project supported by the Spanish Ministry of Economy and Competitiveness (MINECO) under reference no. TIN2012-31723.
Active2Trans aimed at investigating open research challenges in active interaction. The first challenge was to go beyond the sequential left-to-right order in user supervision and study alternative supervision strategies that reduce user effort and fully exploit user supervision. Thus, novel search algorithms were researched in order to produce predictions based on non-consecutive words. Moreover, new interaction strategies were designed in which the user’s effort can be better exploited. As a result, systems can benefit from different degrees of user supervision: supervised, semi-supervised, or unsupervised. The second challenge was to study adaptation techniques in which system models can be improved by using both user-validated data and high-confidence system predictions. Finally, multimodal interaction by means of speech was explored as a more sophisticated interaction model, enhancing overall system performance and usability.
> Find out more about Active2Trans
- iTrans2: Interactive Transcription and Translation
- Period: 1/1/2010 – 30/6/2013
- Research project supported by the Spanish Ministry of Science and Innovation (MICINN) under the reference TIN2009-14511.
The aim of the ITRANS2 project was to develop an innovative computer-assisted system to facilitate the production of high-quality speech transcriptions and text translations. To do this, ITRANS2 proposed a novel interactive-predictive approach which places a human operator at the centre of the process and embeds a statistical machine translation engine (for translation) or an automatic speech recognizer (for transcription) within an interactive editing environment. The human serves as the guarantor of high quality; the role of the machine translation engine or automatic speech recognizer is to increase the operator’s productivity by predicting extensions to the current target text which the operator may then accept, correct or ignore. Interactivity allows the system to take advantage of the human-validated portion to improve the accuracy of subsequent predictions. Indeed, ITRANS2 offered a unique context in which new adaptive learning techniques were tested, with the ultimate goal of having the system learn from the operator’s corrections in order to dynamically update its underlying models.
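The interactive-predictive loop described above can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions and not the ITRANS2 system: the "engine" is a hypothetical lookup over a fixed candidate list (`predict_suffix`), and the operator is simulated by comparing predictions against a known reference text.

```python
def predict_suffix(prefix, candidates):
    """Toy predictor (hypothetical): return the remainder of the first
    candidate sentence that starts with the human-validated prefix."""
    for cand in candidates:
        if cand[:len(prefix)] == prefix:
            return cand[len(prefix):]
    return []  # no candidate is compatible with the prefix


def interactive_session(reference, candidates):
    """Simulate an operator who accepts predicted words while they are
    correct and types one correction at the first wrong word.
    Returns the final text and the number of corrections typed."""
    prefix, corrections = [], 0
    while len(prefix) < len(reference):
        # The system extends the validated prefix with a prediction.
        for word in predict_suffix(prefix, candidates):
            if len(prefix) < len(reference) and word == reference[len(prefix)]:
                prefix.append(word)  # operator accepts the word
            else:
                break  # first error: stop accepting
        if len(prefix) < len(reference):
            # Operator corrects one word; the longer validated prefix
            # then conditions the next prediction.
            prefix.append(reference[len(prefix)])
            corrections += 1
    return prefix, corrections


candidates = [
    "the cat sat on the mat".split(),
    "the cat sat in the hat".split(),
]
reference = "the cat sat in the mat".split()
final_text, corrections = interactive_session(reference, candidates)
```

In this toy run the operator types two corrections instead of all six words, because each correction lets the predictor switch to a candidate consistent with the validated prefix.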
- PASCAL2 UPV node
- Period: 1/3/2008 – 28/2/2013
- Project supported by the European Union’s Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 216886.
PASCAL2 built on the PASCAL Network of Excellence that created a distributed institute pioneering principled methods of pattern analysis, statistical modelling, and computational learning. While retaining some of the structuring elements and mechanisms of its predecessor, PASCAL2 refocused the institute towards the emerging challenges created by adaptive systems technology and its central role in the development of artificial cognitive systems of different scales. Learning technology is the key to making robots more versatile, effective and autonomous, and to endowing machines with advanced interaction capabilities.
> Visit the PASCAL2 website: www.pascal-network.org
> Visit the PASCAL2 fact sheet at the EU CORDIS portal: Project ref. 216886
- erudito.com
- Period: 1/11/2009 – 30/4/2012
- Project supported by the Spanish Ministry of Industry’s “Plan Nacional de Investigación Científica, Desarrollo e Innovación Tecnológica 2008-2011” under the reference TSI-020110-2009-439.
erudito.com was an Experimental Design project with the main objective of developing a tool to encapsulate, distribute and intelligently use digital content. This service, thanks to organizational processes and innovative preprocessing and representation techniques, allows for the editing, interpretation and reuse of digital content currently shown on thematic TV channels.
- CMIPTRANS: Aplicació de mesures de confiança per a la millora de la interacció-predicció en sistemes interactius-predictius d’assistència a la transcripció de parla i text manuscrit (Application of Confidence Measures to Improve Interactive Systems)
- Period: 1/1/2010 – 31/12/2011
- Research project supported by the Generalitat Valenciana (Spain) under the reference GV/2010/067.
There is growing interest in obtaining transcriptions from multimedia repositories and digital libraries. Transcriptions produced by human transcribers can provide high-quality results. Nevertheless, the overall process is very slow and usually expensive. One way to deal with this drawback is to produce transcriptions with automatic systems. However, automatic transcriptions are still far from the desired quality in some specific scenarios, and considerable human effort is required to supervise them. This supervision effort can be reduced through the use of interactive systems. In interactive systems, a human operator is placed at the centre of the transcription process, and an automatic system is embedded within an interactive editing environment. The automatic system and the human transcriber cooperate tightly to generate the final transcription, thereby combining human accuracy with system efficiency. Confidence estimation (CE) has been widely applied in fully automatic systems to predict output reliability. Nevertheless, its use in interactive systems has not been thoroughly explored. CE could be used to improve the performance of interactive systems in different ways. On the one hand, the supervision effort should be reduced if only low-confidence output parts are supervised by the user. On the other hand, better automatic transcriptions should be produced by improving the underlying system models based on supervised and high-confidence output parts. The main aim of the project is to explore these novel strategies in interactive speech recognition and handwritten text transcription.
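The two uses of confidence estimation mentioned above can be sketched in a few lines. This is an illustrative sketch only, not the project’s method: the `Word` type, the per-word confidence scores and the 0.8 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    confidence: float  # hypothetical per-word confidence in [0, 1]


def split_by_confidence(words, threshold=0.8):
    """Partition word positions into those sent to the human for
    supervision (low confidence) and those trusted as they are (high
    confidence; these could also serve as adaptation data)."""
    to_supervise = [i for i, w in enumerate(words) if w.confidence < threshold]
    trusted = [i for i, w in enumerate(words) if w.confidence >= threshold]
    return to_supervise, trusted


# Made-up recognizer output with made-up confidence scores.
hypothesis = [
    Word("the", 0.98),
    Word("speach", 0.41),
    Word("recognizer", 0.93),
    Word("fales", 0.35),
]
to_supervise, trusted = split_by_confidence(hypothesis)
```

Here the operator would only be asked to review positions 1 and 3; raising the threshold trades more supervision effort for fewer unreviewed errors.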
- iTransDoc: Interactive Transcription and Translation of Old Text Documents
- Period: 1/10/2006 – 30/9/2009
- Research project supported by the Spanish Ministry of Science and Innovation (MICINN) under the reference TIN2006-15694-CO2-01.
iDoc aimed at developing advanced techniques and interfaces for the analysis, transcription and translation of images of old archive documents, following an interactive-predictive approach. It was a coordinated project with two subprojects: iAnaDoc (“Interactive Analysis of Old Archive Documents”) and iTransDoc (“Interactive Transcription and Translation of Old Text Documents”). iTransDoc brought together specialists from off-line Handwritten Text Recognition (HTR) and Machine Translation, working closely with archivists to study the problems involved in the analysis, transcription and translation of old archive document images. Advanced techniques and software tools were developed to increase user productivity by predicting extensions to their current, partial hypothesis on the text transcription or translation. The techniques studied and developed in iTransDoc have been embedded in a software tool developed exclusively for this purpose: GiDoc, a friendly and intelligent tool with state-of-the-art HTR technology that any non-expert can use to analyse, transcribe and translate old archive documents.
- ATRAM: Aplicació de Tècniques de Reconeixement de Formes per a l’Anàlisi Morfològica del Peu i Fabricació del Calçat (Application of pattern recognition techniques for the morphological analysis of the foot and footwear manufacturing)
- Period: 28/12/2001 – 27/12/2004
- Research project supported by the Spanish Ministry of Science and Technology under the reference DPI2001-0880-CO2-02.
The Universitat Politècnica de València’s Departament de Sistemes Informàtics i Computació (DSIC), in collaboration with the Institut de Biomecànica de València (IBV), developed an expert system to assign footwear based on comfort criteria. The study of the shape and dimensions of the foot and their interaction with the footwear during walking enabled us to establish algorithms to predict footwear fitting. The application of this system to aid in the selection of footwear in retail outlets was an innovation in the footwear sector from the point of view of product distribution and sale, as well as a customization approach at a competitive cost. Applying this methodology begins with a characterization of the user, in which the three-dimensional shape of the foot is registered using a laser scanner. This information is used as input to the expert system, which selects, among all the shoe models in the store, those most likely to offer the user a high level of comfort.