CALLHOME American English Speech

Item Name:	CALLHOME American English Speech
Author(s):	Alexandra Canavan, David Graff, George Zipperlen
LDC Catalog No.:	LDC97S42
ISBN:	1-58563-111-6
ISLRN:	952-976-147-406-5
DOI:	https://doi.org/10.35111/exq3-x930
Member Year(s):	1997
DCMI Type(s):	Sound
Sample Type:	2-channel ulaw
Sample Rate:	8000 Hz
Data Source(s):	telephone conversations
Project(s):	EARS, GALE, Hub5-LVCSR
Application(s):	speech recognition
Language(s):	English
Language ID(s):	eng
License(s):	LDC User Agreement for Non-Members
Online Documentation:	LDC97S42 Documents
Licensing Instructions:	Subscription & Standard Members, and Non-Members
Citation:	Canavan, Alexandra, David Graff, and George Zipperlen. CALLHOME American English Speech LDC97S42. Web Download. Philadelphia: Linguistic Data Consortium, 1997.
Related Works:
	hasVersion: LDC2026S08 CALLHOME American English Second Edition
	hasAnnotation: LDC97T14 CALLHOME American English Transcripts
	hasAnnotation: LDC2002T43 2000 HUB5 English Evaluation Transcripts
	hasAnnotation: LDC2003T02 1998 HUB5 English Transcripts
	hasAnnotation: LDC2018S04 Rhythm and Pitch
	hasOutcome: LDC2002S09 2000 HUB5 English Evaluation Speech
	hasOutcome: LDC2002S10 1998 HUB5 English Evaluation
	hasOutcome: LDC2002S23 1997 HUB5 English Evaluation
	isSimilarWith: LDC96S46 CALLFRIEND American English-Non-Southern Dialect
	isSimilarWith: LDC96S47 CALLFRIEND American English-Southern Dialect
	relatesTo: LDC97L20 CALLHOME American English Lexicon (PRONLEX)
	relatesTo: LDC2026L05 CALLHOME American English Lexicon (PRONLEX) Second Edition

Introduction

CALLHOME American English Speech was developed by the Linguistic Data Consortium (LDC) and contains approximately 56 hours of speech from 120 unscripted telephone conversations between native American English speakers.

The CALLHOME series consists of telephone conversations and transcripts developed by LDC and Rutgers, The State University of New Jersey, in support of research in speaker identification, language identification and related technologies. Languages in the series include American English, Egyptian Arabic, German, Japanese, Mandarin Chinese, and Spanish.

Data

The conversational telephone speech in this release represents training, development and a subset of evaluation data. Most participants called family members or close friends. Participants spoke on topics of their choice in a single telephone call lasting up to 30 minutes.

Audio files are presented as 8 kHz u-law SPHERE files compressed with SHORTEN.
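As an illustration of what the 2-channel, 8 kHz u-law format means in practice, the following is a minimal sketch of decoding such audio to linear PCM using the standard G.711 u-law expansion. It is not part of the LDC distribution. It assumes the sample data has already been SHORTEN-decompressed, that the SPHERE header has been removed, and that the two channels are interleaved one byte per sample; the function names are hypothetical.

```python
# Minimal sketch: decode 8-bit G.711 u-law bytes to linear PCM and split an
# interleaved 2-channel stream into its two conversation sides.
# Assumptions: `payload` is raw sample data (already SHORTEN-decompressed,
# SPHERE header removed) with channels interleaved byte by byte.
# All function names here are illustrative, not from any LDC tool.

BIAS = 0x84  # 132, the bias added during u-law encoding


def ulaw_to_linear(byte: int) -> int:
    """Decode one u-law byte (0-255) to a signed linear sample."""
    u = ~byte & 0xFF            # u-law bytes are stored bit-inverted
    sign = u & 0x80             # top bit: sign
    exponent = (u >> 4) & 0x07  # next three bits: segment number
    mantissa = u & 0x0F         # low four bits: step within the segment
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


def split_channels(payload: bytes) -> tuple[list[int], list[int]]:
    """De-interleave 2-channel u-law bytes into two lists of PCM samples."""
    channel_a = [ulaw_to_linear(b) for b in payload[0::2]]
    channel_b = [ulaw_to_linear(b) for b in payload[1::2]]
    return channel_a, channel_b


def duration_seconds(payload: bytes, sample_rate: int = 8000,
                     channels: int = 2) -> float:
    """Duration of the payload, at one byte per sample per channel."""
    return len(payload) / (sample_rate * channels)
```

At one byte per sample, each second of a two-sided conversation occupies 16,000 bytes of uncompressed sample data, so a 30-minute call is roughly 28.8 MB before SHORTEN compression.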

Corresponding transcripts (LDC97T14) and an associated lexicon (LDC97L20) are available separately.
