DARPA TIDES Machine Translation 2002 Evaluation Sets
====================================================

This set contains the evaluation sets (source data and human reference
translations), DTDs, scoring software, and evaluation plans from the DARPA
TIDES Machine Translation 2002 Evaluation. Please refer to the evaluation
plan included in this package for details on how the evaluation was run.

A test set consists of two files, a source file and a reference file. The
evaluation year, source language, test set (which, by default, is "evalset"),
version of the data, and source vs. reference file (with the latter indicated
by "-ref") are reflected in the file name. A reference file contains four
independent reference translations of the data set unless noted otherwise
under "Package Contents" below.

DARPA TIDES MT and NIST OpenMT evaluations used SGML-formatted test data
until 2008 and XML-formatted test data thereafter. The test sets in this
package are provided in both formats.

Please contact mt_poc@nist.gov with questions.

Package Contents
----------------

README.txt            This file.
Evaluation plan:
    DARPATIDESMT02EvalPlan_v1-3.pdf

Test sets:
    Arabic-to-English
    Chinese-to-English

Scoring utility:
    mteval-kit-v09.tar.gz

DTD:
    mteval-v1.1.dtd

Data Set Statistics
-------------------

Data genres: nw = newswire

Test set                 Genre  Source/Ref  Documents  Segments  Tokens
MT02_Arabic-to-English   nw     source      100        728       16428
MT02_Arabic-to-English   nw     ahd         100        728       20118
MT02_Arabic-to-English   nw     ahg         100        728       20130
MT02_Arabic-to-English   nw     ahh         100        728       19267
MT02_Arabic-to-English   nw     ahi         100        728       21431
MT02_Chinese-to-English  nw     source      100        878       36185
MT02_Chinese-to-English  nw     E01         100        878       25919
MT02_Chinese-to-English  nw     E02         100        878       24865
MT02_Chinese-to-English  nw     E03         100        878       23512
MT02_Chinese-to-English  nw     E04         100        878       22472

The token counts for the Chinese data are "character" counts, obtained by
counting tokens matching the Unicode-based regular expression "\w". The
token counts for all other languages included here are "word" counts,
obtained by counting tokens matching the Unicode-based regular expression
"\w+". The Python "re" module was used to obtain these counts.
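The counting method described above can be sketched as follows. This is a
hypothetical helper (the actual NIST counting script is not included in this
package); it only reproduces the stated rule: count matches of "\w" per
character for Chinese, and matches of "\w+" per word for other languages.

```python
import re

def count_tokens(text, per_character=False):
    # Per the README: Chinese token counts are character counts ("\w"),
    # all other languages use word counts ("\w+"). re.UNICODE is the
    # default for str patterns in Python 3; it is spelled out here for clarity.
    pattern = r"\w" if per_character else r"\w+"
    return len(re.findall(pattern, text, re.UNICODE))

# Word count for an English reference segment
print(count_tokens("The quick brown fox"))          # 4 tokens
# Character count, as used for the Chinese source data
print(count_tokens("机器翻译", per_character=True))  # 4 tokens
```

Note that punctuation is excluded in both modes, since "\w" matches only
word characters.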