1. Intro

Phonetic labeling was done for internal purposes. We are happy to
distribute the results of this effort, but users should keep its
internal origin in mind. The phonetic labeling procedure involved a
first pass by a trained labeler; generation of spectrograms with
time-aligned phonetic labels; inspection of the spectrograms by an
expert; and a final pass where boundaries were moved or labels changed
in line with the corrections.

Certain categories which we do not currently use in our classifier
(such as [ao] as in caught) were nevertheless labeled for future use,
but not as consistently as they might have been. For example, the [pv]
or prevoicing label was often used before voiced stops when strong
prevoicing was present, but it was not always used. Similarly, [epi],
or epenthetic closure, was sometimes but not always used. The
categories [ao], [aa] and [aw] will be re-checked for consistency by
R. Cole in the near future. These and other corrections will be
announced when available.

We would appreciate feedback on the labeling. Send email to
vincew@cse.ogi.edu.

2. Symbols

The label set is very similar to that used by TIMIT (see the included
file timit.phonecode.doc from the TIMIT distribution CD). The new
symbols are

  -      A segment labeled "-" indicates that the previous segment was
         incomplete. This should only occur as the last segment in an
         utterance when recording was terminated during speech.
  pv     Prevoicing.
  ao-r   Common in names, it is the sound in Portland and York.
  bn     Background noise.
  ns     Caller sounds which are not speech.
  ls     Lip smack.
  ln     Line noise.
  br     Breath.
  glot   Glottalization.
  unk    Unknown. This label was used when the speech could not be
         described using the label set; typically when a caller with a
         foreign accent produced sounds not used in English.

3. Procedure

Several full-time and part-time employees participated in the
labeling. The labeler could see the waveform and play selected
regions. All labels were scanned by a second person to look for
obvious errors.
Even so, we are aware that inconsistencies remain. Because of the time
required, the redundant labeling used for word transcriptions was not
used for phonetic labeling.

4. Spelling

The phonetic labeling of spelled utterances was done some time prior
to the labeling of names and may not follow the conventions exactly.

5. Resolution

For historical reasons, all labels were produced with a 3 ms
resolution. Since frame sizes used in practice are typically larger,
we feel this is acceptable, although a finer resolution would have
been preferable.
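
As a rough illustration of how labels with 3 ms resolution relate to
larger analysis frames, the Python sketch below assigns one label to
each frame. It is only a sketch under stated assumptions: the label
file format is not described here, so segments are represented as
hypothetical (start_ms, end_ms, label) tuples that are contiguous and
sorted; the 10 ms frame step and the function name frame_labels are
likewise illustrative and not taken from the corpus.

```python
# Hypothetical segment representation: (start_ms, end_ms, label).
# Boundaries are multiples of 3 ms, matching the stated label resolution.
# Segments are assumed to be sorted and contiguous (no gaps or overlaps).

FRAME_STEP_MS = 10  # assumed frame step; not specified by the corpus


def frame_labels(segments, frame_step_ms=FRAME_STEP_MS):
    """Give each frame the label of the segment containing the frame centre."""
    if not segments:
        return []
    end_ms = segments[-1][1]
    n_frames = end_ms // frame_step_ms  # only whole frames are labeled
    labels = []
    seg_idx = 0
    for i in range(n_frames):
        centre = i * frame_step_ms + frame_step_ms / 2
        # Advance to the segment whose end lies beyond the frame centre.
        while seg_idx < len(segments) - 1 and centre >= segments[seg_idx][1]:
            seg_idx += 1
        labels.append(segments[seg_idx][2])
    return labels


# Made-up example: background noise, prevoicing, then a stop.
example = [(0, 21, "bn"), (21, 63, "pv"), (63, 90, "b")]
print(frame_labels(example))
# → ['bn', 'bn', 'pv', 'pv', 'pv', 'pv', 'b', 'b', 'b']
```

With a frame step well above 3 ms, a boundary can shift by at most one
frame when mapped this way, which is consistent with the remark that
the coarse label resolution is acceptable for typical frame sizes.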