
NIST 2004 Open Machine Translation (OpenMT) Evaluation

Item Name:	NIST 2004 Open Machine Translation (OpenMT) Evaluation
Author(s):	NIST Multimodal Information Group
LDC Catalog No.:	LDC2010T12
ISBN:	1-58563-550-2
ISLRN:	081-787-574-125-1
DOI:	https://doi.org/10.35111/kj6h-ys32
Release Date:	July 19, 2010
Member Year(s):	2010
DCMI Type(s):	Text
Data Source(s):	government documents, newswire
Project(s):	NIST MT
Application(s):	machine translation
Language(s):	Mandarin Chinese, Standard Arabic, Arabic
Language ID(s):	cmn, arb, ara
License(s):	LDC User Agreement for Non-Members
Online Documentation:	LDC2010T12 Documents
Licensing Instructions:	Subscription & Standard Members, and Non-Members
Citation:	NIST Multimodal Information Group. NIST 2004 Open Machine Translation (OpenMT) Evaluation LDC2010T12. Web Download. Philadelphia: Linguistic Data Consortium, 2010.
Related Works:
isOutcomeOf
LDC2004T08 Hong Kong Parallel Text
LDC2005T14 Chinese Gigaword Second Edition
LDC2006T02 Arabic Gigaword Second Edition
LDC2007T38 Chinese Gigaword Third Edition
isSimilarWith
LDC2010T10 NIST 2002 Open Machine Translation (OpenMT) Evaluation
LDC2010T11 NIST 2003 Open Machine Translation (OpenMT) Evaluation
LDC2010T14 NIST 2005 Open Machine Translation (OpenMT) Evaluation
LDC2010T17 NIST 2006 Open Machine Translation (OpenMT) Evaluation
LDC2010T21 NIST 2008 Open Machine Translation (OpenMT) Evaluation
LDC2010T23 NIST 2009 Open Machine Translation (OpenMT) Evaluation
LDC2013T07 NIST 2008-2012 Open Machine Translation (OpenMT) Progress Test Sets
LDC2013T03 NIST 2012 Open Machine Translation (OpenMT) Evaluation
LDC2014T02 NIST 2012 Open Machine Translation (OpenMT) Progress Test Five Language Source

Introduction

NIST 2004 Open Machine Translation (OpenMT) Evaluation is a package containing the source data, reference translations, and scoring software used in the NIST 2004 OpenMT evaluation. It is designed to help evaluate the effectiveness of machine translation systems. Researchers at NIST compiled the package and developed the scoring software, making use of newswire source data and reference translations collected and developed by LDC.

The objective of the NIST OpenMT evaluation series is to support research in, and help advance the state of the art of, machine translation (MT) technologies -- technologies that translate text between human languages. Input may include all forms of text. The goal is for the output to be an adequate and fluent translation of the original.

The MT evaluation series started in 2001 as part of the DARPA TIDES (Translingual Information Detection, Extraction and Summarization) program. Beginning with the 2006 evaluation, the evaluations have been driven and coordinated by NIST as NIST OpenMT. These evaluations help direct research efforts and calibrate technical capabilities in MT. The OpenMT evaluations are intended to be of interest to all researchers working on the general problem of automatic translation between human languages. To this end, they are designed to be simple, to focus on core technology issues, and to be fully supported. The 2004 task was to evaluate translation from Chinese to English and from Arabic to English.

Additional information about these evaluations may be found at the NIST Open Machine Translation (OpenMT) Evaluation web site.

Scoring Tools

This evaluation kit includes a single Perl script (mteval-v11a.pl) that may be used to produce a translation quality score for one or more MT systems. The script works by comparing the system output translation with a set of expert reference translations of the same source text. The comparison is based on finding sequences of words in the reference translations that match word sequences in the system output translation. More information on the evaluation algorithm may be obtained from the paper that describes it: BLEU: a Method for Automatic Evaluation of Machine Translation (Papineni et al., 2002).
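The word-sequence matching described above can be illustrated with a minimal Python sketch of the clipped n-gram precision and brevity penalty from Papineni et al. (2002). This is not a port of mteval-v11a.pl: the function names are hypothetical, tokenization is assumed to be plain whitespace splitting, the score is computed for a single segment rather than aggregated over a test set, and the brevity penalty uses the reference length closest to the hypothesis length, as in the paper. The NIST script may differ on each of these points, so scores from this sketch should not be compared with official results.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_matches(hyp, refs, n):
    """Count hypothesis n-grams that also occur in some reference.

    Each n-gram is credited at most as many times as it appears in the
    single reference that contains it most often (the "clipping" step).
    Returns (matched n-grams, total hypothesis n-grams).
    """
    hyp_counts = ngrams(hyp, n)
    max_ref = Counter()
    for ref in refs:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    matched = sum(min(count, max_ref[gram]) for gram, count in hyp_counts.items())
    total = sum(hyp_counts.values())
    return matched, total


def bleu(hyp, refs, max_n=4):
    """Single-segment BLEU with uniform weights (illustrative only).

    Returns 0.0 when any n-gram order has no match, since the geometric
    mean is then zero.
    """
    log_precisions = []
    for n in range(1, max_n + 1):
        matched, total = clipped_matches(hyp, refs, n)
        if matched == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(matched / total))
    # Assumption: use the reference length closest to the hypothesis length.
    ref_len = min((abs(len(r) - len(hyp)), len(r)) for r in refs)[1]
    bp = 1.0 if len(hyp) > ref_len else math.exp(1 - ref_len / len(hyp))
    return bp * math.exp(sum(log_precisions) / max_n)
```

For example, with the references "the cat is on the mat" and "there is a cat on the mat", the degenerate hypothesis "the the the the the the the" receives a clipped unigram count of 2 out of 7, because no single reference contains "the" more than twice.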

The included scoring script was released with the original evaluation, intended for use with SGML-formatted data files, and is provided to ensure compatibility of user scoring results with results from the original evaluation. An updated scoring software package (mteval-v13a-20091001.tar.gz), with XML support, additional options and bug fixes, documentation, and example translations, may be downloaded from the NIST Multimodal Information Group Tools website.

Data

This corpus consists of 150 Arabic newswire documents, 150 Chinese newswire documents, and 29 Chinese "prepared speech" documents, together with four independent human expert reference translations of each document. Because LDC lacks permission to publicly distribute some of the source text used in the original evaluation, all 50 Arabic "prepared speech" documents and 21 of the 50 Chinese "prepared speech" documents (and their corresponding reference translations) have been removed from the current release.

The reference translations included in this corpus have not previously been publicly available. Some of the source text in this corpus has been publicly released as part of other LDC publications, including Arabic Gigaword Second Edition, LDC2006T02 (Agence France-Presse (AFP) and Xinhua News Agency (Xinhua)); Chinese Gigaword Second Edition, LDC2005T14 (Xinhua and Zaobao News Agency); Chinese Gigaword Third Edition, LDC2007T38 (AFP); and Hong Kong Parallel Text, LDC2004T08 (Hong Kong Special Administrative Region).

The source text included in this corpus was collected from the following sources:

Arabic

DocID prefix	Source	Date	Document count
AFA	Agence France-Presse	Jan. 2004	50
ALH	Al Hayat	Jan.-Mar. 2004	25
ANN	An Nahar	Feb.-Mar. 2004	25
XIN	Xinhua News Agency	Jan. 2004	50

Chinese

DocID prefix	Source	Date	Document count
AFC	Agence France-Presse	Jan. 2004	50
HKN	Hong Kong Special Administrative Region	Jan.-Mar. 2003	16
PD	People's Daily	Apr. 2003-Mar. 2004	34
XIN	Xinhua News Agency	Oct. 2002-Jan. 2004	53
ZBN	Zao Bao News Agency	Sept. 2003-Mar. 2004	26
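
As a quick consistency check, the per-source counts in the two tables can be summed and compared with the totals stated earlier in this section (150 Arabic documents; 150 Chinese newswire plus 29 Chinese "prepared speech" documents). The sketch below simply copies the counts from the tables; the variable and function names are hypothetical.

```python
# Document counts per DocID prefix, copied from the Arabic and Chinese tables.
ARABIC_COUNTS = {"AFA": 50, "ALH": 25, "ANN": 25, "XIN": 50}
CHINESE_COUNTS = {"AFC": 50, "HKN": 16, "PD": 34, "XIN": 53, "ZBN": 26}


def total_documents(counts):
    """Sum the document counts across all DocID prefixes."""
    return sum(counts.values())


arabic_total = total_documents(ARABIC_COUNTS)
chinese_total = total_documents(CHINESE_COUNTS)

# The Chinese total should equal the newswire and "prepared speech"
# document counts given in the prose combined.
print(arabic_total, chinese_total, chinese_total == 150 + 29)
```

Note that the XIN prefix appears in both languages, so a DocID prefix alone does not identify the source language of a document.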

For each language, the test set consists of two files: a source and a reference file. Each reference file contains four independent translations of the data set. The evaluation year, source language, test set (which, by default, is "evalset"), version of the data, and source vs. reference file (with the latter being indicated by "-ref") are reflected in the file name.

DARPA TIDES MT and NIST OpenMT evaluations used SGML-formatted test data until 2008 and XML-formatted test data thereafter. The files in this package are provided in both formats.
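
For readers working with the XML versions of the files, the following Python sketch shows one way to collect the reference translations for each segment. The element and attribute names used here (mteval, refset, doc, seg, docid, id, refid) are assumptions modeled on the layout commonly seen in NIST MT XML test files, and the sample content and identifiers are invented for illustration; consult the documentation and DTD shipped with the package for the actual structure.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Invented sample in an ASSUMED layout; not taken from the corpus.
SAMPLE = """<mteval>
  <refset setid="example_set" srclang="Arabic" trglang="English" refid="ref0">
    <doc docid="DOC_0001" genre="nw">
      <seg id="1">First reference sentence.</seg>
      <seg id="2">Second reference sentence.</seg>
    </doc>
  </refset>
  <refset setid="example_set" srclang="Arabic" trglang="English" refid="ref1">
    <doc docid="DOC_0001" genre="nw">
      <seg id="1">Another rendering of the first sentence.</seg>
      <seg id="2">Another rendering of the second sentence.</seg>
    </doc>
  </refset>
</mteval>"""


def load_references(xml_text):
    """Map (docid, segment id) to a list of reference translations.

    One list entry is added per refset element, in document order.
    """
    root = ET.fromstring(xml_text)
    refs = defaultdict(list)
    for refset in root.iter("refset"):
        for doc in refset.iter("doc"):
            for seg in doc.iter("seg"):
                text = "".join(seg.itertext()).strip()
                refs[(doc.get("docid"), seg.get("id"))].append(text)
    return dict(refs)


references = load_references(SAMPLE)
for key in sorted(references):
    print(key, references[key])
```

The SGML versions are not guaranteed to be well-formed XML, so a strict XML parser like the one used here may reject them; this sketch is intended only for the XML files.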

Sample

The sample is a text file containing excerpts from several XML files included in this corpus, including the source text and reference translations for a single newswire document. The file is encoded in UTF-8.

Updates

There are no updates available at this time.

Copyright

Portions © 2004 Agence France-Presse, © 2004 Al Hayat, © 2004 An Nahar, © 2003-2004 People's Daily, © 2003-2004 SPH AsiaOne, Ltd., © 2003 The Government of the Hong Kong Special Administrative Region, © 2002-2004 Xinhua News Agency, © 2004, 2005, 2006, 2007, 2010 Trustees of the University of Pennsylvania.