Switchboard Corpus

Credit Card Conversations

Wordspotting Training Set

NIST Speech Disc 8-1.2

May, 1992

This CD-ROM contains training data for wordspotting on the Switchboard credit card conversations. Thirty-five conversations are included. They may be used for cross-validation and algorithm parameter determination, as well as for ordinary training. A ten-conversation test set will be released later.

Directory and File Structures

The following directories and files are contained in the top-level directory of this disc:

this file
swb1 - directory containing Switchboard credit card conversations and documentation

The directory, "swb1", contains three subdirectories:

training - training data
keywords - keyword markings
doc - documentation
The data for Switchboard conversations are contained in a set of associated files under the "training" subdirectory. All files for a given conversation share the same basename but contain different extensions indicating file type.

All Switchboard corpus files, other than ref files (see below), are of the form:



CONVERSATION-ID ::= 1000 ... 9999 (base 10)

FILETYPE ::= .wav | .txt | .mrk (see below for descriptions)

The thirty-five included conversations are:
1026 1037 1038 1044 1060
1081 1083 1088 2023 2067
2163 2301 2313 2390 2399
2409 2536 2682 2710 2718
2764 2800 2883 2917 2951
2987 2999 3170 3332 3409
3439 3751 2781 3821 3855

Switchboard Filetypes
.wav - two-channel mu-law encoded audio waveform files with standard NIST SPHERE headers. Each .wav file contains one conversation of not more than ten minutes. Each channel was intended to contain the audio for one speaker in the conversation (although crosstalk between channels is known to exist for some conversations).

For the earlier conversations, those preceding 3170, there was generally an initial time offset between the channels, and the offset varied as the conversation proceeded. This was due to certain peculiarities in the collection process, including some random losses of data. For the later conversations this problem was corrected.

For some of these earlier conversations, namely those with crosstalk significant enough that the offset could be tracked, samples have been deleted from non-speech parts of the data to approximately correct the offsets. Corresponding changes have been made in the marked transcript files. For these conversations, as well as for conversations with little crosstalk, a combined-channel version may be created by summing the two channels. It is assumed, however, that the standard procedure will be to process the channels separately. The following conversations have had their offsets corrected in this manner:

1060 2023 2163 2301 2313 2390 2409 2710 2718 2800 2883 2951 2987

.txt - text files containing interleaved transcriptions of both channels. The .txt files contain headers which describe various parameters of the conversation. See "txt_spec.doc" for more details.
.mrk - time-aligned word transcriptions. See "mrk_spec.doc" for more details.
.ref - text files, one per keyword, containing markings for all instances of that keyword in all included conversations. These are included under the "keywords" subdirectory. See "ref.spec" for more details.
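
As an illustration of the .wav layout described above, the following Python sketch reads a NIST SPHERE header from a file already loaded into memory and separates the two channels. It is a sketch under stated assumptions, not a reference reader: it assumes one byte per sample (as for mu-law data), assumes the channels are stored sample-interleaved, and falls back to two channels if the header lacks a "channel_count" field. The function names are hypothetical.

```python
def parse_sphere_header(data: bytes):
    """Return (header_size, fields) for a NIST SPHERE file held in memory."""
    # The first two 8-byte lines hold the magic string and the header size.
    first_lines = data[:16].decode("ascii").split("\n")
    if first_lines[0] != "NIST_1A":
        raise ValueError("not a NIST SPHERE file")
    header_size = int(first_lines[1].strip())

    fields = {}
    text = data[16:header_size].decode("ascii", errors="replace")
    for line in text.split("\n"):
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)      # name, type flag, value
        if len(parts) != 3:
            continue
        name, ftype, value = parts
        if ftype == "-i":
            fields[name] = int(value)
        elif ftype == "-r":
            fields[name] = float(value)
        else:                            # "-sN": string of N characters
            fields[name] = value
    return header_size, fields


def split_channels(data: bytes):
    """Split the sample data into one bytes object per channel.

    Assumes 1-byte samples stored interleaved (A, B, A, B, ...).
    """
    header_size, fields = parse_sphere_header(data)
    n_channels = fields.get("channel_count", 2)
    payload = data[header_size:]
    return [payload[c::n_channels] for c in range(n_channels)]
```

The channels returned here are still mu-law bytes; they must be expanded to linear PCM before any arithmetic is done on the samples.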


The following files are located in the Switchboard documentation directory, "swb1/doc":

characterization of the contents of the training set of Switchboard credit card conversations including the keywords and variants to be used.
mu-law to PCM conversion table
mrk_spec.doc - .mrk file specifications
txt_spec.doc - .txt file specifications
ref.spec - .ref file specifications
information about the speakers in the conversations
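
The mu-law to PCM conversion, and the channel summing mentioned for the .wav files, can be sketched in Python as follows. The expansion shown is the standard G.711 mu-law formula; the table supplied on the disc is not reproduced here and may differ in layout or scaling. Summing in the linear domain and clipping to the 16-bit range are assumptions of this sketch, as are the function and variable names.

```python
def mulaw_to_pcm(code: int) -> int:
    """Expand one 8-bit mu-law code to a linear sample (G.711 convention)."""
    code = ~code & 0xFF                  # mu-law bytes are stored bit-inverted
    sign = code & 0x80
    exponent = (code >> 4) & 0x07
    mantissa = code & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


# 256-entry lookup table, indexed by the mu-law byte value.
MULAW_TABLE = [mulaw_to_pcm(c) for c in range(256)]


def sum_channels(chan_a: bytes, chan_b: bytes) -> list:
    """Decode two mu-law channels and sum them sample by sample.

    The result is clipped to the signed 16-bit range and truncated to
    the length of the shorter channel.
    """
    return [
        max(-32768, min(32767, MULAW_TABLE[a] + MULAW_TABLE[b]))
        for a, b in zip(chan_a, chan_b)
    ]
```

A lookup table is used because each file contains only 256 distinct code values, so decoding reduces to one index operation per sample.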