2008/2010 NIST Metrics for Machine Translation (MetricsMaTr) GALE Evaluation Set


Item Name: 2008/2010 NIST Metrics for Machine Translation (MetricsMaTr) GALE Evaluation Set
Authors: NIST Multimodal Information Group and LDC
LDC Catalog No.: LDC2011T05
ISBN: 1-58563-575-8
Release Date: Mar 16, 2011
Data Type: text
Data Source(s): broadcast conversation, broadcast news, newswire, web collection
Project(s): GALE
Application(s): machine learning, machine translation
Language(s): Arabic, Chinese
Language ID(s): ara, cmn
Distribution: Web Download
Member fee: $0 for 2011 members
Non-member Fee: US $250.00
Reduced-License Fee: US $250.00
Extra-Copy Fee: N/A
Non-member License: yes
Online documentation: yes
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: NIST Multimodal Information Group and LDC. 2011. 2008/2010 NIST Metrics for Machine Translation (MetricsMaTr) GALE Evaluation Set. Linguistic Data Consortium, Philadelphia.

Introduction

2008/2010 NIST Metrics for Machine Translation (MetricsMaTr) GALE Evaluation Set, Linguistic Data Consortium (LDC) catalog number LDC2011T05 and ISBN 1-58563-575-8, is a package containing the source data, reference translations, machine translations and associated human judgments used in the NIST 2008 and 2010 MetricsMaTr evaluations. The package was compiled by researchers at NIST from Arabic and Chinese broadcast, newswire and web data and from reference translations that LDC collected and developed for Phase 2 and Phase 2.5 of the DARPA GALE program.

NIST MetricsMaTr is a series of research challenge events for machine translation (MT) metrology, promoting the development of innovative MT metrics that correlate highly with human assessments of MT quality. Participants submit their metrics to NIST (National Institute of Standards and Technology). NIST runs those metrics on held-back test data for which it has human assessments of quality, and then calculates correlations between the automatic metric scores and the human assessments. Specifically, the goals of MetricsMaTr are: to inform other MT technology evaluation campaigns and conferences with regard to improved metrology; to establish an infrastructure that encourages the development of innovative metrics; to build a diverse community that will bring new perspectives to MT metrology research; and to provide a forum for MT metrology discussion and for establishing future directions of MT metrology.
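
The correlation step described above can be illustrated with a minimal sketch. The source does not say which correlation coefficients NIST computed, so the use of Pearson and Spearman here is an assumption, and the function names and sample scores are hypothetical and not taken from the corpus.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    # Hypothetical per-segment scores, for illustration only.
    metric_scores = [0.31, 0.45, 0.28, 0.60, 0.52]  # automatic metric
    human_adequacy = [3, 5, 2, 7, 5]                # 7-point adequacy scale
    print("Pearson: ", pearson(metric_scores, human_adequacy))
    print("Spearman:", spearman(metric_scores, human_adequacy))
```

A rank-based coefficient such as Spearman may suit ordinal judgments like a 7-point scale, since it depends only on the ordering of the scores and not on their spacing.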

The first MetricsMaTr challenge was held in 2008. The development data from the 2008 program is available from LDC as 2008 NIST Metrics for Machine Translation (MetricsMATR08) Development Data (LDC2009T05). The MetricsMaTr10 evaluation plan is included in this release.

Data

This release contains 149 documents with corresponding reference translations (Arabic-to-English and Chinese-to-English), system translations and human assessments. The human assessments include the following: Adequacy7 (a 7-point scale for judging the meaning of a system translation with respect to the reference translation); Adequacy Yes/No (whether the given system segment meant essentially the same as the reference translation); Preference (the judge's preference between two candidate translations when compared to a human reference translation); and HTER (Human Targeted Error Rate, human edits to a system translation to have the same meaning as a reference translation).
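
As a rough illustration of an edit-based score like HTER, the sketch below counts word-level edits between a system translation and a human-edited version of it. It is a simplification under stated assumptions: it counts only insertions, deletions and substitutions (no phrase shifts), it normalizes by the length of the edited translation, and its function names and example sentences are hypothetical. It is not the scoring procedure used to produce the assessments in this release.

```python
def word_edit_distance(hyp, ref):
    """Minimum number of word insertions, deletions and substitutions
    needed to turn the word list hyp into the word list ref."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def edit_rate(system_output, edited_output):
    """Word edits per word of the human-edited translation.

    Assumption: whitespace tokenization and normalization by the length
    of the edited translation.
    """
    hyp = system_output.split()
    ref = edited_output.split()
    if not ref:
        raise ValueError("edited translation must not be empty")
    return word_edit_distance(hyp, ref) / len(ref)


if __name__ == "__main__":
    # Hypothetical sentences, for illustration only.
    system = "the cat sat on mat"
    edited = "the cat sat on the mat"
    print(edit_rate(system, edited))  # one insertion over six words
```

Lower values mean that fewer human edits were needed to make the system translation match the meaning of the reference.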

Updates

Additional information, updates, and bug fixes may be available in the LDC catalog entry for this corpus at LDC2011T05.

Sponsorship

This work is supported in part by the Defense Advanced Research Projects Agency, GALE Program Grant No. HR0011-06-1-003. The content of this publication does not necessarily reflect the position or policy of the Government, and no official endorsement should be inferred.

Content Copyright

Portions © 2006 Abu Dhabi TV, Agence France Presse, Al-Ahram, Al Alam News Channel, Al Arabiya, Al Hayat, Al Iraqiyah, Al Quds-Al Arabi, An Nahar, Asharq al-Awsat, China Central TV, China Military Online, Chinanews.com, Guangming Daily, Kuwait TV, New Tang Dynasty TV, Nile TV, PAC, Ltd, People's Daily Online, Phoenix TV, Syria TV, Xinhua News Agency, © 2006, 2011 Trustees of the University of Pennsylvania

Contact: ldc@ldc.upenn.edu
© 2011 Linguistic Data Consortium, Trustees of the University of Pennsylvania. All Rights Reserved.