This is the release of the CallFriend American English Speech
Corpus Non-Southern Dialect, produced by the Linguistic Data
Consortium.  This release contains speech data files ONLY, along with
documentation describing speaker information (sex, age, education,
callee telephone number) and call information (channel quality, number
of speakers). These files are not compressed.


Summary of contents:
---------------------------

index.html              HTML page that links to everything in
                        the docs folder.

docs/
    README.txt          This file.

    cf_eng_n.txt        Description of the CallFriend
                        telephone speech corpus for American
                        Non-Southern Dialect.

    callinfo.txt        Explanation of the audit information
                        provided in "callinfo.tbl".

    callinfo.tbl        A list of audit information as
                        explained in "callinfo.txt", with
                        information on number and sex of
                        speakers and several sound quality
                        judgements.

    headerinfo.txt      Explanation of the SPH header
                        information provided in "headerinfo.tbl".

    headerinfo.tbl      A table of the data originally stored
                        in the SPH header of each audio file
                        before the files were converted.

    spkrinfo.txt        Explanation of the speaker demographic
                        information provided in "spkrinfo.tbl".

    spkrinfo.tbl        A table of information provided about
                        the speakers involved in each phone
                        call, such as age and hometown.

    file_partitions.txt Lists the original partition (train,
                        devtest, evltest) of each audio file
                        in the corpus.

data/                   The speech data files. These files
                        were originally divided into train, devtest
                        and evltest partitions, which are now
                        described in file_partitions.txt.


Note that the partitioning of speech data into "training",
"development test" and "evaluation test" sets reflects the original
usage of the speech files by participants in the U.S.
Government-sponsored project on Language Identification (LID). In
this release, all 60 files are combined in one data folder.
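
As a quick sanity check after copying the release, the number of files
in the data folder can be compared with the 60 files stated above. The
following Python sketch is only an illustration: the function names and
the default folder path "data" are assumptions, and any extra files in
the folder (for example hidden files created by the operating system)
would be counted too.

```python
from pathlib import Path

EXPECTED_FILES = 60  # file count stated in this README


def count_data_files(data_dir):
    """Return the number of regular files directly inside data_dir."""
    return sum(1 for p in Path(data_dir).iterdir() if p.is_file())


def check_release(data_dir="data"):
    """Return (ok, found): ok is True when found equals EXPECTED_FILES."""
    found = count_data_files(data_dir)
    return found == EXPECTED_FILES, found
```

Calling check_release("data") from the top-level folder of the release
returns a pair such as (True, 60) when the folder is complete.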


METADATA:
_______________

Total Duration:		26:47:05

Duration by language:
    - American English	26:47:05

Calls per caller: 1
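
The total duration above can be turned into seconds and divided by the
number of files to get a rough mean length per file. This Python sketch
assumes that the total duration covers exactly the 60 files in the data
folder; the helper name hms_to_seconds is hypothetical.

```python
def hms_to_seconds(text):
    """Convert an 'HH:MM:SS' string to a whole number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


TOTAL_DURATION = "26:47:05"  # from the METADATA section
NUM_FILES = 60               # assumption: duration covers all 60 files

total_seconds = hms_to_seconds(TOTAL_DURATION)  # 96425
mean_seconds = total_seconds / NUM_FILES        # about 1607 seconds
print(total_seconds, round(mean_seconds / 60, 1))
```

Under that assumption the mean works out to roughly 26.8 minutes per
file.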