DARPA TIDES Machine Translation 2003 Evaluation Sets
====================================================

This set contains the evaluation sets (source data and human reference
translations), DTDs, scoring software, and evaluation plans from the DARPA
TIDES Machine Translation 2003 Evaluation. Please refer to the evaluation
plan included in this package for details on how the evaluation was run.

A test set consists of two files, a source file and a reference file. The
evaluation year, source language, test set (which, by default, is
"evalset"), version of the data, and source vs. reference file (with the
latter indicated by "-ref") are reflected in the file name. A reference
file contains four independent reference translations unless noted
otherwise under "Package Contents" below.

DARPA TIDES MT and NIST OpenMT evaluations used SGML-formatted test data
until 2008 and XML-formatted test data thereafter. The test sets in this
package are provided in both formats.

Please contact mt_poc@nist.gov with questions.

Package Contents
----------------

README.txt          This file.
Evaluation plan:    DARPATIDESMT03EvalPlan_v2.pdf
Test sets:          Arabic-to-English
                    Chinese-to-English
DTD:                mteval-v1.1.dtd
Scoring utility:    mteval-v09c.pl

Data Set Statistics
-------------------

Data genres: nw = newswire

Test set                  Genre  Source/Ref  Documents  Segments  Tokens
MT03_Arabic-to-English    nw     source      100        663       14824
MT03_Arabic-to-English    nw     ahd         100        663       16991
MT03_Arabic-to-English    nw     ahe         100        663       19346
MT03_Arabic-to-English    nw     ahg         100        663       17228
MT03_Arabic-to-English    nw     ahi         100        663       17579
MT03_Chinese-to-English   nw     source      100        919       38391
MT03_Chinese-to-English   nw     E01         100        919       26630
MT03_Chinese-to-English   nw     E02         100        919       27359
MT03_Chinese-to-English   nw     E03         100        919       25131
MT03_Chinese-to-English   nw     E04         100        919       24627

The token counts for Chinese data are "character" counts, which were
obtained by counting tokens matching the Unicode-based regular expression
"\w". The token counts for all other languages included here are "word"
counts, which were obtained by counting tokens matching the Unicode-based
regular expression "\w+". The Python "re" module was used to obtain these
counts.
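The counting procedure described above can be sketched as follows,
assuming Python 3 (where the "re" module matches "\w" against Unicode
word characters by default); the helper name "count_tokens" is
illustrative, not part of this package:

```python
import re

def count_tokens(text: str, per_character: bool = False) -> int:
    """Count tokens as described above: matches of "\\w" give character
    counts (used for the Chinese data); matches of "\\w+" give word
    counts (used for all other languages)."""
    pattern = r"\w" if per_character else r"\w+"
    return len(re.findall(pattern, text))

# Word count for space-delimited text (punctuation is not counted):
print(count_tokens("Hello, world!"))                 # -> 2
# Character count, as used for the Chinese source and references:
print(count_tokens("你好世界", per_character=True))   # -> 4
```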