Translanguage English Database (TED) Transcripts

Item Name: Translanguage English Database (TED) Transcripts
Authors: A. Kipp, L. Lamel, J. Mariani, F. Schiel, N. Martey, D. A. Miller, K. T. Jones, S. Dunn, and R. Markoff
LDC Catalog No.: LDC2002T03
ISBN: 1-58563-202-3
Data Type: text
Data Source(s): microphone speech
Application(s): speech recognition
Language(s): English
Language ID(s): eng
Distribution: Web Download
Member fee: $0 for 2002 members
Non-member Fee: US $250.00
Reduced-License Fee: US $250.00
Extra-Copy Fee: US $250.00
Non-member License: yes
Online documentation: yes
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: A. Kipp, et al. Translanguage English Database (TED) Transcripts. Linguistic Data Consortium, Philadelphia


The Translanguage English Database (TED) Transcripts project, Linguistic Data Consortium (LDC) catalog number LDC2002T03 and ISBN 1-58563-202-3, is a joint publication of the European Language Resources Association (ELRA) and the LDC. Joint LDC/ELRA distribution of this work was sponsored in part by National Science Foundation Grant No. IIS-9982201.


The 39 transcribed audio files are a subset of the 188 speeches in the corresponding audio publication, LDC2002S04, an LDC release of the ELRA TED corpus of recordings made at Eurospeech '93 in Berlin. In the TED audio recordings, non-native speakers of English present academic papers for approximately 15 minutes each. The TED audio publication also includes the papers, poster sessions, and original transcripts of oral recordings for a subset of the presentations.

The 39 transcripts in this publication are in Universal Transcription Format (UTF) and were prepared by the LDC. All UTF files in the transcript publication were validated against the utf.dtd. Tables containing speaker demographic information and a cross-reference of file names from the TED audio corpus are included. A sample of one of the transcripts is available.

Please note that poster presentations, notes and questionnaires are not available for every author in the corresponding TED audio publication LDC2002S04.


There are no updates at this time.

Content Copyright

Portions © 1993-2002 University of Munich, Germany; LIMSI-CNRS, France; ELRA; and the Trustees of the University of Pennsylvania