|Author(s):||John S. Garofolo, Jonathan G. Fiscus, Kate Hunicke-Smith, Denise Danielson, Elizabeth Shriberg, Enrico Bocchieri, Bruce Buntschuh, Beverly Schwartz, Sandra Peters, Robert Ingria, Robert Weide, Yuzong Chang, Eric Thayer, Lynette Hirschman, Joe Polifroni, Bruce Lund, Goh Kawai, Tom Kuhn, Lew Norton, Deborah Dahl, Madeleine Bates, Michael Brown, Alexander Rudnicky, David Pallett|
|LDC Catalog No.:||LDC93S5|
|Sample Type:||1-channel PCM compressed|
|Data Source(s):||microphone speech|
|Application(s):||spoken dialogue systems, speech recognition|
|License(s):||LDC User Agreement for Non-Members|
|Online Documentation:||LDC93S5 Documents|
|Licensing Instructions:||Subscription & Standard Members, and Non-Members|
|Citation:||Garofolo, John S., et al. ATIS2 LDC93S5. DVD. Philadelphia: Linguistic Data Consortium, 1993.|
The ATIS2 corpus contains approximately 15,000 utterances recorded from approximately 450 subjects at five sites: AT&T, BBN, CMU, MIT's Laboratory for Computer Science, and SRI. All utterances have been transcribed, and almost 10,000 of them have been annotated with categorizations and canonical reference answers. Unlike the ATIS0 corpus, much of the data in ATIS2 was collected using partially or fully automated data collection systems. The fully automated systems were, in fact, working ATIS prototypes.
For ATIS2, the ten-city relational database of ATIS0 was revised to accommodate connecting flights and fares, and some table headings were renamed.
In addition to training data, the corpus includes the February and November '92 ATIS Benchmark Tests. Each contains approximately 1,000 utterances from the pool of data collected by the five sites.
This publication has been condensed from four CD-ROMs to a single web download.