DARPA TIDES Machine Translation 2005 Evaluation Sets
====================================================

This set contains the evaluation sets (source data and human reference
translations), DTDs, scoring software, and evaluation plans from the
DARPA TIDES Machine Translation 2005 Evaluation. Please refer to the
evaluation plan included in this package for details on how the
evaluation was run.

A test set consists of two files: a source file and a reference file.
The file name reflects the evaluation year, source language, test set
(by default, "evalset"), version of the data, and whether the file is
the source or the reference file (the latter indicated by "-ref"). A
reference file contains four independent reference translations unless
noted otherwise under "Package Contents" below.

DARPA TIDES MT and NIST OpenMT evaluations used SGML-formatted test
data until 2008 and XML-formatted test data thereafter. The test sets
in this package are provided in both formats.

Please contact mt_poc@nist.gov with questions.

Package Contents
----------------

README.txt
    This file.

Evaluation plan:
    DARPATIDESMT05EvalPlan_v1-1.pdf

Test sets:
    Arabic-to-English
    Chinese-to-English

Scoring utility:
    mteval-v11b-2008-01-23.tar.gz

DTD:
    mteval-v1.1.dtd

Data Set Statistics
-------------------

Data genres: nw = newswire

Test set                  Genre  Source/Ref  Documents  Segments  Tokens
MT05_Arabic-to-English    nw     source            100      1056   25700
MT05_Arabic-to-English    nw     ahd               100      1056   31973
MT05_Arabic-to-English    nw     ahn               100      1056   31499
MT05_Arabic-to-English    nw     ahp               100      1056   34138
MT05_Arabic-to-English    nw     ahq               100      1056   32430
MT05_Chinese-to-English   nw     source            100      1082   47444
MT05_Chinese-to-English   nw     chc               100      1082   31742
MT05_Chinese-to-English   nw     chf               100      1082   31537
MT05_Chinese-to-English   nw     chg               100      1082   32132
MT05_Chinese-to-English   nw     chh               100      1082   31421

The token counts for the Chinese data are "character" counts, obtained
by counting tokens matching the Unicode-based regular expression "\w".
The token counts for all other languages included here are "word"
counts, obtained by counting tokens matching the Unicode-based regular
expression "\w+". The Python "re" module was used to obtain these
counts.
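
For illustration, the counting method described above can be reproduced
with a short Python script along the following lines. This is a minimal
sketch: the function names and sample strings are illustrative and are
not part of this package.

    import re

    # "Word" counts (all languages except Chinese): one token per
    # match of the Unicode-aware regular expression \w+.
    def count_words(text):
        return len(re.findall(r"\w+", text, re.UNICODE))

    # "Character" counts (Chinese): one token per match of the
    # single-character class \w.
    def count_characters(text):
        return len(re.findall(r"\w", text, re.UNICODE))

    print(count_words("The DARPA TIDES MT 2005 evaluation"))  # 6
    print(count_characters("机器翻译评测"))                    # 6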