4 years ago · 121d04efe1
--- a/README.md
+++ b/README.md
@@ -182,8 +182,8 @@ whether it is in the train set or in the dev/test sets):
 Finally, in "refs" (only in "dev" and "test") each file contains every speech
 in the corresponding dev or test set, that is, the full reference for that
 set. In each case, we will find 4 files, containing the official non-verbatim
-reference (*.orig.*) and the manually revised verbatim reference (*.rev.*), as
-transcriptions (*.ref) and as segment time marked files (*.stm). In all 4
+reference (`*.orig.*`) and the manually revised verbatim reference (`*.rev.*`), as
+transcriptions (`*.ref`) and as segment time marked files (`*.stm`). In all 4
 cases, the text is presented preprocessed for evaluation (tokenized,
 lowercased, punctuation removed...).
 
@@ -199,10 +199,10 @@ Each "text" directory contains 2 subdirectories: "raw" (except in
 "train/external"), "prepro" (in all sets), or "scripts" (only in
 "train/external").
 
-&nbsp;&nbsp;&nbsp;&nbsp;"raw" contains the raw text data for the corresponding set (*.txt.gz), and
-  its metadata (*.csv). In the cases of "dev" and "test", both the official
+&nbsp;&nbsp;&nbsp;&nbsp;"raw" contains the raw text data for the corresponding set (`*.txt.gz`), and
+  its metadata (`*.csv`). In the cases of "dev" and "test", both the official
   non-verbatim transcriptions (*.orig.*) and the manually revised verbatim
-  transcriptions (*.rev.*) are included.
+  transcriptions (`*.rev.*`) are included.
 
 &nbsp;&nbsp;&nbsp;&nbsp;"prepro" contains the text data for the corresponding set, preprocessed for
   training or evaluation (tokenized, lowercased, punctuation removed...). This