CALLHOME Spanish Speech

Item Name: CALLHOME Spanish Speech
Author(s): Alexandra Canavan, George Zipperlen
LDC Catalog No.: LDC96S35
ISBN: 1-58563-083-7
ISLRN: 321-477-528-167-2
DOI: https://doi.org/10.35111/2skn-2002
Member Year(s): 1996, 1997
DCMI Type(s): Sound
Sample Type: 2-channel ulaw
Sample Rate: 8000 Hz
Data Source(s): telephone conversations
Project(s): Hub5-LVCSR
Application(s): speech recognition
Language(s): Spanish
Language ID(s): spa
License(s): LDC User Agreement for Non-Members
Online Documentation: LDC96S35 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Canavan, Alexandra, and George Zipperlen. CALLHOME Spanish Speech LDC96S35. Web Download. Philadelphia: Linguistic Data Consortium, 1996.

Introduction

CALLHOME Spanish Speech was developed by the Linguistic Data Consortium (LDC) and contains approximately 38 hours of speech from 120 unscripted telephone conversations between native Spanish speakers.  

The CALLHOME series consists of telephone conversations, transcripts and lexicons developed by LDC and Rutgers, The State University of New Jersey, in support of research in speaker identification, language identification and related technologies. Languages in the series include American English, Egyptian Arabic, German, Japanese, Mandarin Chinese, and Spanish.

Data

The conversational telephone speech in this release represents training and development data and a subset of evaluation data. Calls originated in North America and were placed to locations overseas. Most participants called family members or close friends. Participants spoke on topics of their choice in a single telephone call lasting up to 30 minutes. 

Audio files are presented as two-channel, 8 kHz u-law SPHERE files compressed with SHORTEN.
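
As a rough illustration of the file format, the sketch below reads the plain-text header that begins a NIST SPHERE file. It is a minimal example, not part of the corpus documentation: the function name parse_sphere_header is hypothetical, and it assumes the usual SPHERE layout (a "NIST_1A" line, a header-size line, then "name -type value" fields ending with "end_head"). It does not decompress SHORTEN data or decode u-law samples.

```python
def parse_sphere_header(data: bytes) -> dict:
    """Parse the text header at the start of a NIST SPHERE file.

    `data` must hold at least the first bytes of the file.
    Hypothetical helper for illustration only.
    """
    # Line 1 is the magic string, line 2 is the header size in bytes.
    first = data[:16].split(b"\n")
    if len(first) < 2 or first[0] != b"NIST_1A":
        raise ValueError("missing NIST_1A magic line")
    header_size = int(first[1].decode("ascii").strip())

    text = data[:header_size].decode("ascii", errors="replace")
    fields = {}
    for line in text.split("\n")[2:]:
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)
        # Skip blank, comment, or malformed lines.
        if len(parts) < 3 or line.startswith(";"):
            continue
        name, ftype, value = parts
        if ftype == "-i":        # integer field
            fields[name] = int(value)
        elif ftype == "-r":      # real-valued field
            fields[name] = float(value)
        else:                    # string field, written as -sN
            fields[name] = value
    return fields
```

Given a file opened in binary mode, calling parse_sphere_header(f.read(1024)) would typically be enough to read fields such as sample_rate and channel_count, assuming the header fits in the common 1024-byte size. The sample data that follows the header still has to be decompressed with a SHORTEN-aware tool before it can be decoded.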

Corresponding transcripts (LDC96T17) and an associated lexicon (LDC96L16) are available separately.

Updates

06/12/2018: 16 SPHERE files from the train and devtest directories were corrupted. Corrected versions of these files were included with the corpus as of the date above. 
