# A listing of the non-speech orthographic items used in the JCSD
# Corpus. Note that the last item, {non-speech noise}, was only used in
# the rare case that none of the other existing items applied. This list
# is a subset of that used in other ARPA/LDC corpora.
#
{breath noise}
{sigh}
{mouth noise}
{sneeze}
{throat clear}
{sniff}
{cough}
{whistle}
{paper rustle}
{non-speech noise}
#
# A listing of the orthographic items corresponding to alternate pronunciations
# for data containing digit sequences (isolated and four digit sequences). For
# example, "[hach]" denotes a pronunciation of "hachi" in which the final vowel
# was omitted.
#
[dei]
[ich]
[dok]
[doku]
[rok]
[shich]
[hach]