
Minor update: added download sizes and SHA-256 checksums

Gonçal V. Garcés Díaz-Munío 2 years ago
commit ac00435468
1 changed file with 12 additions and 9 deletions

README.md

@@ -82,8 +82,9 @@ GET THE DATA
 
 Download the full Europarl-ASR speech and text corpus from:
 
-https://www.mllp.upv.es/europarl-asr/Europarl-ASR_v1.0.tar.gz
-
+https://www.mllp.upv.es/europarl-asr/Europarl-ASR_v1.0.tar.gz  
+Size: 18 GiB  
+SHA-256 checksum: 4d360170ef8f1d1ece55566eda4211274b27328427a3443061f43d80d3346e74
 
 ADDITIONAL Europarl-ASR MATERIALS
 ---------------------------------
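
The SHA-256 checksum published in the hunk above lets a reader confirm that an 18 GiB download arrived intact. A minimal Python sketch of that check follows. It is not part of the README; the function names and the local file path are illustrative assumptions, and only the expected checksum is taken from the diff.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Fixed-size chunks keep memory use flat even for a multi-GiB archive.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with a published checksum (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumption: the archive was saved under its original name in the
    # current directory. The checksum is the one listed for the full corpus.
    archive = Path("Europarl-ASR_v1.0.tar.gz")
    expected = "4d360170ef8f1d1ece55566eda4211274b27328427a3443061f43d80d3346e74"
    if archive.exists():
        print("OK" if matches_checksum(archive, expected) else "MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

The same two functions should work for the language model archive and the guidelines PDF by swapping in their file names and checksums.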
@@ -93,21 +94,23 @@ described in this document, we are making available for download the following
 materials to facilitate the reproducibility of our experiments:
 
 * The pretrained Europarl-ASR English-language n-gram language model, together
-  with its vocabulary file:
-  
-  https://www.mllp.upv.es/europarl-asr/Europarl-ASR_v1.0_ngram_lm_and_vocab.tar.gz
+  with its vocabulary file:    
+  https://www.mllp.upv.es/europarl-asr/Europarl-ASR_v1.0_ngram_lm_and_vocab.tar.gz  
+  Size: 1,1 GiB  
+  SHA-256 checksum: 2be8eb7918086a233545e6e5a0592b7ae83a09ffb5ce479b68e28329d710cd6a
 
 * The Europarl-ASR English-language verbatim transcription guidelines, which
   were applied to produce the manually revised verbatim transcriptions for the
-  dev and test sets:
-  
-  https://www.mllp.upv.es/europarl-asr/Europarl-ASR_transcription_guidelines.pdf
+  dev and test sets:    
+  https://www.mllp.upv.es/europarl-asr/Europarl-ASR_transcription_guidelines.pdf  
+  Size: 309 KiB  
+  SHA-256 checksum: 66dac867b76c984d9e583caab0a8fd7540a664017e88e9ec4190c90ab67ce8e6
 
 
 CORPUS STRUCTURE AND CONTENTS
 -----------------------------
 
-Total size: 20 GB
+Total size: 18 GiB
 
 The data is organized in 3 main directories: "train" (training data), "dev"
 (validation data) and "test" (evaluation data). Each directory contains the