CALLHOME Spanish Transcripts

Item Name: CALLHOME Spanish Transcripts
Author(s): Barbara Wheatley
LDC Catalog No.: LDC96T17
ISBN: 1-58563-084-5
ISLRN: 979-631-848-400-3
DOI: https://doi.org/10.35111/zhm8-q076
Member Year(s): 1996, 1997
DCMI Type(s): Text
Data Source(s): telephone conversations
Project(s): Hub5-LVCSR
Application(s): speech recognition
Language(s): Spanish
Language ID(s): spa
License(s): LDC User Agreement for Non-Members
Online Documentation: LDC96T17 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Wheatley, Barbara. CALLHOME Spanish Transcripts LDC96T17. Web Download. Philadelphia: Linguistic Data Consortium, 1996.

Introduction

CALLHOME Spanish Transcripts was developed by the Linguistic Data Consortium (LDC) and contains transcripts corresponding to approximately 38 hours of speech from 120 unscripted telephone conversations between native Spanish speakers.  

The CALLHOME series consists of telephone conversations, transcripts and lexicons developed by LDC and Rutgers, The State University of New Jersey, in support of research in speaker identification, language identification and related technologies. Languages in the series include American English, Egyptian Arabic, German, Japanese, Mandarin Chinese, and Spanish.

Data

Each transcript covers a contiguous five- or ten-minute segment of a call. Transcripts are presented in standard orthography and time-stamped at each speaker turn so they can be aligned with the speech signal.
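
As a rough illustration of working with turn-level timestamps, the sketch below parses transcript lines into structured turns and computes the span of audio they cover. The line layout it expects (`<start> <end> <speaker>: <text>`, with times in seconds) is an assumption made for this example and is not specified on this page; consult the documentation accompanying the release for the actual format. The names `Turn`, `parse_turn`, `parse_transcript`, and `covered_seconds` are hypothetical helpers, not part of any LDC tooling.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class Turn(NamedTuple):
    start: float   # turn start time in seconds
    end: float     # turn end time in seconds
    speaker: str   # speaker label, e.g. "A" or "B"
    text: str      # orthographic transcription of the turn


# ASSUMED layout (not confirmed by the catalog page):
#   "<start> <end> <speaker>: <text>"
TURN_RE = re.compile(r"^(\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)\s+(\S+?):\s*(.*)$")


def parse_turn(line: str) -> Optional[Turn]:
    """Parse one line into a Turn, or return None if it does not match."""
    match = TURN_RE.match(line.strip())
    if match is None:
        return None
    start, end, speaker, text = match.groups()
    return Turn(float(start), float(end), speaker, text)


def parse_transcript(lines: Iterable[str]) -> List[Turn]:
    """Keep only the lines that look like time-stamped turns."""
    return [turn for turn in map(parse_turn, lines) if turn is not None]


def covered_seconds(turns: List[Turn]) -> float:
    """Length of the span from the earliest turn start to the latest turn end."""
    if not turns:
        return 0.0
    return max(t.end for t in turns) - min(t.start for t in turns)
```

With turns in this form, each `(start, end)` pair can be used to cut the matching region from the corresponding audio file, and `covered_seconds` gives a quick check that a transcript spans roughly the expected five or ten minutes.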

Calls were manually audited for language, recording quality, channel characteristics, dialect, and region. The auditing results and other metadata (call information, speaker information, and demographics of call originators) are included in the documentation accompanying this release.

The corresponding conversational telephone speech dataset (LDC96S35) and an associated lexicon (LDC96L16) are available separately.

Updates

There are no updates at this time.
