                      Portland Cellular Corpus
                            Release 1.3

              Center for Spoken Language Understanding

UPDATED: 23 August 2002

All phonetic transcriptions (files with the extension wrd) were moved to
the labels directory. Small modifications were made to readme.txt and the
file structure.

This document describes the file naming conventions used for this
distribution and gives a brief description of the various file formats
used.

File Naming Convention
----------------------
File names follow this convention:

	CE-150.spelllastname.txt

The first field ("CE") is the prefix indicating the corpus to which 
this data belongs, and the second field ("150") represents a unique ID 
number for the speaker. The third field is an identifier indicating the
prompt for this particular utterance. Please see the protocol section 
of overview.txt for information on the mapping of these identifiers.
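
As an illustration, the three fields can be pulled apart with a few lines
of Python. This is only a sketch: the function and type names are
hypothetical, and it assumes every name has exactly one hyphen and two
periods, as in the example above.

```python
from typing import NamedTuple


class CorpusFileName(NamedTuple):
    corpus: str      # corpus prefix, e.g. "CE"
    speaker_id: int  # unique speaker ID, e.g. 150
    prompt: str      # prompt identifier, e.g. "spelllastname"
    extension: str   # file extension, e.g. "txt"


def parse_file_name(name: str) -> CorpusFileName:
    """Split a name such as 'CE-150.spelllastname.txt' into its fields.

    Assumes the layout <corpus>-<id>.<prompt>.<extension>; any other
    layout raises ValueError.
    """
    stem, prompt, extension = name.split(".")
    corpus, speaker = stem.split("-")
    return CorpusFileName(corpus, int(speaker), prompt, extension)


print(parse_file_name("CE-150.spelllastname.txt"))
```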

These files are subdivided into directories named for the call number
divided by 10, with the remainder discarded. For example, the files for
call 103 are found in the /10 subdirectory.
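
The directory rule amounts to an integer division. A minimal sketch in
Python follows; the function name is hypothetical, and the assumption that
directory names carry no zero padding is taken from the "/10" example
only.

```python
def call_subdirectory(call_number: int) -> str:
    """Return the subdirectory name for a call number.

    Assumes the name is the call number divided by 10 with the
    remainder discarded, and that it is not zero-padded.
    """
    if call_number < 0:
        raise ValueError("call number must be non-negative")
    return str(call_number // 10)


print(call_subdirectory(103))  # call 103 lives in subdirectory "10"
```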

The /trans and /labels directory file structures exactly parallel the structure 
of the /speech directory.

File Formats
------------
The data was captured digitally from the CSLU T1 connection and saved as
8 kHz 8-bit mu-law. These files have been converted to the RIFF standard
file format, with 16-bit linear encoding.
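
One way to check these properties on a given file is to read its header.
The sketch below uses the "wave" module from the Python standard library
and assumes the speech files are RIFF WAVE files with PCM data that this
module can read. The function name is hypothetical.

```python
import wave


def describe_wav(path_or_file):
    """Return basic format information for a RIFF WAVE file.

    Accepts a file name or an open binary file object.
    """
    with wave.open(path_or_file, "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "bits_per_sample": 8 * wav.getsampwidth(),
            "duration_s": wav.getnframes() / rate,
        }
```

For a file from this distribution, the description above suggests a sample
rate of 8000 Hz and 16 bits per sample.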

Transcriptions
--------------

The text transcriptions were performed according to the
non-time-aligned word-level conventions described in the CSLU
Labeling Guide. This document is available at the CSLU web site.

Phonetic transcriptions are plain text files that carry 
time-aligned phonetic labels. The first two lines of the
file are a header which defines the length of a "frame" 
in milliseconds. The rest of the file consists of lines that
each give two numbers defining a frame range, and a label that
applies to that region. For example:

   MillisecondsPerFrame: 1.000000
   END OF HEADER
   2 113 .pau
   113 191 w
   191 267 ^
   267 395 n

So, we can see here that a frame corresponds to 1 millisecond (ms) 
of time, and that from 2 to 113 ms into the file, there is 
a pause (.pau), with the first phoneme (w) starting at 113 ms 
and stretching to 191 ms.
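
A file in this format can be read with a short parser. The following
Python sketch is an illustration only: the names are hypothetical, and it
assumes the header always contains a "MillisecondsPerFrame:" line followed
by "END OF HEADER". It converts each frame range to milliseconds.

```python
from typing import List, Tuple

Segment = Tuple[float, float, str]  # (start in ms, end in ms, label)


def parse_label_text(text: str) -> List[Segment]:
    """Parse the contents of a time-aligned label file."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    ms_per_frame = None
    for index, line in enumerate(lines):
        if line.startswith("MillisecondsPerFrame:"):
            ms_per_frame = float(line.split(":", 1)[1])
        elif line == "END OF HEADER":
            body = lines[index + 1:]
            break
    else:
        raise ValueError("missing END OF HEADER line")
    if ms_per_frame is None:
        raise ValueError("missing MillisecondsPerFrame line")

    segments = []
    for line in body:
        # Split at most twice so a label may itself contain spaces.
        begin, end, label = line.split(None, 2)
        segments.append(
            (int(begin) * ms_per_frame, int(end) * ms_per_frame, label)
        )
    return segments


example = """\
MillisecondsPerFrame: 1.000000
END OF HEADER
2 113 .pau
113 191 w
191 267 ^
267 395 n
"""
for start_ms, end_ms, label in parse_label_text(example):
    print(start_ms, end_ms, label)
```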

The word-level transcription files follow the same format, 
with word labels in place of the phonetic labels. 
The .com files that are found with the .wrd files contain 
information about breathing during the speech. They are in 
a similar time-aligned format.

Labels
------
The lola files are ASCII "location and label" files.  They are similar
to the ".phn" files of the TIMIT database except:

1) the locations are given in a unit of time other than the sample.  
2) there is a short header saying what this unit is

Each file in this distribution has the header:

	MillisecondsPerFrame: 3.0
	END OF HEADER

After that comes a series of lines, one per segment, of the form

	<begin frame> <end frame + 1>  label

For example

	200  237   ah
	237  289   m

The [ah] segment extends from frame 200 to frame 236 inclusive.  The
end label is 237 for historical reasons.
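
The arithmetic for one such line can be sketched in Python as below. The
function name is hypothetical, and the conversion to milliseconds assumes
that frame i covers the interval from i times the frame length up to
(i + 1) times the frame length, which this document does not state.

```python
def lola_segment(begin_frame, end_frame_plus_one, ms_per_frame=3.0):
    """Interpret one '<begin frame> <end frame + 1>' pair.

    Returns (last frame, number of frames, start in ms, end in ms).
    The times assume frame i spans [i, i + 1) * ms_per_frame.
    """
    last_frame = end_frame_plus_one - 1            # inclusive end
    n_frames = end_frame_plus_one - begin_frame
    start_ms = begin_frame * ms_per_frame
    end_ms = end_frame_plus_one * ms_per_frame     # exclusive end
    return last_frame, n_frames, start_ms, end_ms


# The [ah] segment from the example: frames 200 to 236 inclusive.
print(lola_segment(200, 237))
```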

The lola files have the extension ".ptlola"