Abstract Meaning Representation 2.0 - Four Translations

Item Name: Abstract Meaning Representation 2.0 - Four Translations
Author(s): Marco Damonte, Shay Cohen
LDC Catalog No.: LDC2020T07
ISBN: 1-58563-924-9
ISLRN: 359-968-732-813-3
DOI: https://doi.org/10.35111/fr89-3285
Release Date: April 15, 2020
Member Year(s): 2020
DCMI Type(s): Text
Data Source(s): discussion forum, weblogs, newswire
Project(s): BOLT, DEFT
Application(s): machine translation
Language(s): Italian, Spanish, German, Mandarin Chinese
Language ID(s): ita, spa, deu, cmn
License(s): LDC User Agreement for Non-Members
Online Documentation: LDC2020T07 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Damonte, Marco, and Shay Cohen. Abstract Meaning Representation 2.0 - Four Translations LDC2020T07. Web Download. Philadelphia: Linguistic Data Consortium, 2020.

Introduction

Abstract Meaning Representation 2.0 - Four Translations was developed by researchers at the University of Edinburgh, School of Informatics, and consists of Spanish, German, Italian and Mandarin Chinese translations of a subset of sentences from Abstract Meaning Representation (AMR) Annotation Release 2.0 (LDC2017T10).

AMR Annotation Release 2.0 is a semantic treebank of over 39,000 English natural language sentences from broadcast conversations, newswire and web text. The translated data in this release was designed for use in cross-lingual parsing.

Data

This corpus contains translations of the test split sentences from LDC2017T10: 1,371 sentences in each of the four languages, for a total of 5,484 translated sentences. The source sentences were drawn from material collected by the Linguistic Data Consortium: discussion forum text from the DARPA BOLT and DARPA DEFT programs, transcripts and English translations of Mandarin Chinese broadcast news programming, Wall Street Journal text, translated Xinhua news texts, various newswire texts from NIST OpenMT evaluations, and weblog data from the DARPA GALE program.

All data are presented as UTF-8 encoded plain text.
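As a rough illustration of working with the data, the sketch below reads a UTF-8 plain-text file and checks sentence counts against the figures stated above. The one-sentence-per-line layout and the helper names read_sentences and check_counts are assumptions made for this example; the actual file names and layout are described in the corpus documentation, not here.

```python
from pathlib import Path

# Language IDs as listed in the catalog entry.
LANGUAGE_IDS = ("ita", "spa", "deu", "cmn")
# Counts stated in the Data section.
EXPECTED_PER_LANGUAGE = 1371
EXPECTED_TOTAL = 5484


def read_sentences(path):
    """Read the non-empty lines of a UTF-8 plain-text file.

    Assumes one sentence per line, which is a guess about the layout.
    """
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def check_counts(sentences_by_language):
    """Return True if every language has the expected number of sentences
    and the overall total matches the stated corpus size."""
    per_language_ok = all(
        len(sentences) == EXPECTED_PER_LANGUAGE
        for sentences in sentences_by_language.values()
    )
    total = sum(len(sentences) for sentences in sentences_by_language.values())
    return per_language_ok and total == EXPECTED_TOTAL
```

To use it, one might build a dictionary keyed by language ID, with each value produced by read_sentences on that language's file, and pass it to check_counts as a sanity check after download.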

Samples

Please view this Italian text sample (TXT).

Updates

None at this time.
