Early software by MLLP researchers (2010-2015): AK, GIDOC, jaf_Tools, Bilingual Text Classification.

Updated 4 months ago

This repository contains the code for the paper "Stream-level Latency Evaluation for Simultaneous Machine Translation".

Updated 4 months ago

This repository contains the code for the segmentation system proposed in "Direct Segmentation Models for Streaming Speech Translation".

Updated 4 months ago

A 1300-hour English speech and text corpus of parliamentary debates for (streaming) ASR training and benchmarking, speech data filtering, and speech data verbatimization.

Updated 7 months ago
