@@ -290,7 +290,7 @@ Europarl-ASR (EN) includes:
* 70 million tokens of English-language text data.
-#### Language models
+#### Pretrained language models
* The Europarl-ASR English-language n-gram language model and vocabulary.
@@ -312,6 +312,10 @@ Detailed dates of the EP speech and text data gathered:
* Europarl v10 (selected to avoid overlapping): 1996-04-15 to 1999-07-19.
* DCEP (does not include any EP reports of proceedings): 2001 to 2012.
+For more information on the Europarl-ASR corpus and its creation, including
+off-line and streaming ASR baselines, please refer to the Europarl-ASR
+article (see "[CITATION](#citation)" above).
+
ACKNOWLEDGEMENTS
---------------