ATIS3 Training Data
Item Name: | ATIS3 Training Data |
Author(s): | Deborah A. Dahl, Madeleine Bates, Michael Brown, William Fisher, Kate Hunicke-Smith, David Pallett, Christine Pao, Alexander Rudnicky, Elizabeth Shriberg, John S. Garofolo, Jonathan G. Fiscus, Denise Danielson, Enrico Bocchieri, Bruce Buntschuh, Beverly Schwartz, Sandra Peters, Robert Ingria, Robert Weide, Yuzong Chang, Eric Thayer, Lynette Hirschman, Joe Polifroni, Bruce Lund, Goh Kawai, Tom Kuhn, Lew Norton |
LDC Catalog No.: | LDC94S19 |
ISBN: | 1-58563-028-4 |
ISLRN: | 396-239-314-326-3 |
DOI: | https://doi.org/10.35111/hs15-5a36 |
Member Year(s): | 1994 |
DCMI Type(s): | Sound |
Sample Type: | 1-channel pcm compressed |
Sample Rate: | 16000 |
Data Source(s): | microphone speech |
Project(s): | ATIS |
Application(s): | spoken dialogue systems, speech recognition |
Language(s): | English |
Language ID(s): | eng |
License(s): | LDC User Agreement for Non-Members |
Online Documentation: | LDC94S19 Documents |
Licensing Instructions: | Subscription & Standard Members, and Non-Members |
Citation: | Dahl, Deborah A., et al. ATIS3 Training Data LDC94S19. Web Download. Philadelphia: Linguistic Data Consortium, 1994. |
Introduction
ATIS3 Training Data contains 774 scenarios completed by 137 participants in the ATIS (Air Travel Information Services) collection. The ATIS collection was developed to support the research and development of speech understanding systems. Participants were presented with various hypothetical travel planning scenarios and asked to solve them by interacting with partially or completely automated ATIS systems. The resulting utterances were recorded and transcribed.
Data was collected in the early 1990s at five US sites: Raytheon BBN, Carnegie Mellon University, MIT Laboratory for Computer Science, the National Institute of Standards and Technology, and SRI International.
Data
This release contains over 7,300 utterances, all of which were transcribed; a subset of 2,900 utterances was categorized and annotated with canonical reference answers.
The relational database for this dataset included flight information for 46 cities and 52 airports.
Two 1,000-utterance test sets were set aside. The first set was used in December 1993; the second was reserved for future testing.