Communicator Transcription Guidelines

Version 1.2, March 9, 2000

 

This document contains the specification for transcription of speech in the Communicator program.  At the request of the Communicator Evaluation Committee, these guidelines use a proper subset of the final ATIS transcription specifications, with the exception of the rules for spelled letters.

 

The transcription is intended to be an orthographic lexical transcription with a few details included to represent audible acoustic events (speech and non-speech) present in the corresponding waveform files.

 

Transcription may be made in two passes: a first pass in which words are transcribed, and an optional second pass in which additional details (non-speech phenomena) may be added.  Many non-speech phenomena (loud inhalations, coughing, hissing, smacking, TV-in-background, etc.) are easy to miss unless specifically attended to.  For Communicator, annotation of these non-speech artifacts is not required, but it is supported for sites that wish to record such information.  Note, however, that these non-speech annotations will be ignored in scoring.

 

Transcriptions developed using these guidelines will be referred to as CTF (Communicator Transcription Format) transcripts and may be stored in a file with a .ctf extension.

 

Required transcription rules:

 

1.      Case

 

Transcriptions may use normal capitalization.  However, case will be ignored in scoring.

 

2.      Spelling

 

Follow American Heritage Dictionary (AHD) conventions where possible.

 

3.      Number Sequences

 

Transcribe number sequences (flight numbers, times, dates, aircraft types, dollar amounts, etc.) as spoken in word form.  For example,

 

flight six one three

flight six thirteen

seven thirty

august twenty first

seven forty seven

                       

Note: Care should be taken to transcribe the digit "0" as "zero" or "oh", depending on what the speaker said.
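
A site may wish to check this rule automatically.  The following Python sketch flags any token that still contains a digit character and therefore needs to be spelled out as spoken.  The function name find_digit_tokens is hypothetical and is not part of the CTF specification.

```python
import re

def find_digit_tokens(transcript):
    """Return the whitespace-delimited tokens that contain a digit.

    Such tokens violate the rule that number sequences are written
    in word form; a transcriber must replace them by listening again.
    """
    return [tok for tok in transcript.split() if re.search(r"[0-9]", tok)]
```

The check only reports offending tokens.  It does not attempt to rewrite them, because the correct word form ("six one three" versus "six thirteen", "zero" versus "oh") depends on what the speaker actually said.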

 

4.      Letter Sequences

 

Spoken letters occur in acronyms and abbreviations.  Transcribe each spoken letter in lower case, followed by a period and a space.  For example,

 

d. f. w.

five thirty p. m.

washington d. c.

 

Transcribe inflections of letters as if they were inflections of words.  For example,

 

b. s.ing (no space between s. and ing)

i. d.ed him

the t. i.er's last name

i. b. m.'s new machine

the ten c. e. o.s' votes

 

If a speaker pronounces an acronym or abbreviation as a word, transcribe it as a word (e.g., "den" or "bos"), rather than as separate letters (not "d. e. n." or "b. o. s.").
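
The letter-sequence convention can be recognized mechanically.  The sketch below is one possible reading of the rule, assuming that tokens are separated by whitespace and that an inflection is any run of letters and apostrophes attached directly to the period.  The names LETTER_TOKEN, is_spelled_letter, and spelled_letters are illustrative only.

```python
import re

# One lower-case letter, a period, then an optional inflection suffix
# made of letters and apostrophes (for example "s.ing", "m.'s", "o.s'").
LETTER_TOKEN = re.compile(r"[a-z]\.[a-z']*")

def is_spelled_letter(token):
    """True if the whole token has the form of a spelled letter."""
    return LETTER_TOKEN.fullmatch(token) is not None

def spelled_letters(transcript):
    """Return the spelled-letter tokens of a transcript, in order."""
    return [tok for tok in transcript.split() if is_spelled_letter(tok)]
```

For example, spelled_letters("i. b. m.'s new machine") yields the tokens i., b., and m.'s, while an acronym pronounced as a word, such as den, is not matched.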

 

5.      Contractions

 

When a standard orthographic form exists for a contraction and the contraction is actually spoken, transcribe it as spoken (e.g., "can't", "don't", "i'd", "we'd", "y'all").

 

For verbal contractions in which words are phonetically reduced and elided, if a written version of the contraction does not exist in formal American English, use the expanded form of the constituent words.  For example,

 

Spoken      Transcribed

wanna       want to

wanna       want a

gonna       going to

hafta       have to

useta       used to

oughta      ought to

sonova      son of a
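
The table can be held as a lookup so that reduced forms left in a draft transcript are flagged for expansion.  This is a minimal sketch; the names EXPANSIONS and find_reduced_forms are hypothetical, and the table contains only the forms listed in this section.

```python
# Reduced forms from this section mapped to their permitted expansions.
# "wanna" has two expansions, so the transcriber must choose from context.
EXPANSIONS = {
    "wanna": ("want to", "want a"),
    "gonna": ("going to",),
    "hafta": ("have to",),
    "useta": ("used to",),
    "oughta": ("ought to",),
    "sonova": ("son of a",),
}

def find_reduced_forms(transcript):
    """Return (position, token, candidate expansions) for each reduced form."""
    hits = []
    for position, token in enumerate(transcript.lower().split()):
        if token in EXPANSIONS:
            hits.append((position, token, EXPANSIONS[token]))
    return hits
```

The function reports candidates rather than substituting them, since "wanna" cannot be expanded correctly without knowing the following word.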

 

6.      New or Invented Words

 

For new or invented words, if each constituent can stand alone as an independent word, transcribe the constituents as hyphenated words or separate words.  Otherwise, concatenate the constituents without a hyphen.  Bear in mind that hyphens are treated as whitespace in scoring speech recognition accuracy.  For example,

 

Spoken       Transcribed

ecommerce    e-commerce (or e commerce)

cybercafe    cybercafe (not cyber-cafe)

cancelbot    cancel-bot (or cancel bot)

 

In the last example, "bot" is considered to be a legitimate word.
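
The effect of the hyphen rule on scoring can be illustrated with a short Python sketch.  It applies only two rules stated in this document (case is ignored, and hyphens are treated as whitespace) and makes no claim about how the actual scoring software is implemented.  The function name scoring_tokens is hypothetical.

```python
def scoring_tokens(transcript):
    """Lower-case the text, treat hyphens as spaces, and split into tokens."""
    return transcript.lower().replace("-", " ").split()
```

Under this sketch, "e-commerce" and "e commerce" produce the same token sequence, which is why either form is acceptable, whereas "cybercafe" and "cyber-cafe" do not.  Word fragments and bracketed annotations are not handled here.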

 

7.      Compound Words

 

Follow the AHD for hyphenation of common compound words.  If the word is not in the AHD, follow the convention for new or invented words in section 6.  For example,

 

In AHD

nonstop

nonsmoking

time-share

 

Not in AHD

under-floor column (or under floor column)

 

The existence of such a compound word does not preclude the use of its component words independently in different contexts.  For example, the following should be transcribed as shown (with both "after noon" and "afternoon").

 

seven p. m. is after noon but it is in the evening not in the afternoon

 

8.      Punctuation

 

Do not use any English sentence punctuation such as periods, commas, question marks, exclamation marks, etc.

 

Transcribe only the special abbreviations "Mr.", "Ms.", and "Mrs." in their abbreviated form.  All other words should be represented in their full spelled-out form.

 

Use only 7-bit ASCII characters.  Do not use accent marks on foreign imported words (e.g., "fiance" not "fiancé").
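
The punctuation rules lend themselves to an automatic check.  The following sketch is one possible implementation, under two assumptions that go beyond this document: semicolons and colons are counted among the punctuation covered by "etc.", and a period is accepted only in a spelled letter (section 4) or in one of the three permitted abbreviations.  All names in the code are hypothetical.

```python
import re

TITLES = {"mr.", "ms.", "mrs."}
# Spelled letter with optional inflection, e.g. "d.", "s.ing", "m.'s".
LETTER_TOKEN = re.compile(r"[a-z]\.[a-z']*")
# Assumption: ";" and ":" are included alongside the marks named above.
SENTENCE_PUNCT = re.compile(r"[.,?!;:]")

def lint_token(token):
    """Return a list of problems found in a single token."""
    problems = []
    if not token.isascii():
        problems.append("non-ASCII character")
    lowered = token.lower()
    allowed_period = lowered in TITLES or LETTER_TOKEN.fullmatch(lowered) is not None
    if not allowed_period and SENTENCE_PUNCT.search(token):
        problems.append("sentence punctuation")
    return problems

def lint_transcript(transcript):
    """Return (token, problems) pairs for every offending token, in order."""
    report = []
    for token in transcript.split():
        problems = lint_token(token)
        if problems:
            report.append((token, problems))
    return report
```

A transcript such as "washington d. c." passes cleanly, while a token with a trailing comma or an accented character is reported.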

 

9.      Mispronunciations

 

If a speaker mispronounces a word and the mispronunciation cannot be interpreted as another properly pronounced word, transcribe the word as it is spoken, surrounded with asterisks.  For example,

 

show me flights from *atlanty* to dallas

 

10.  Pause Fillers

           

Transcribe pause fillers as words enclosed in square brackets.  Where possible, limit these to the following list: [uh], [um], [er], [ah], [mm].

 

11.  Word Fragments

 

If a speaker does not completely pronounce a word, spell out the fragment of the word that was spoken followed by a hyphen to indicate the missing portion of the word.  For example,

           

show me the fli- flights to boston

 

Optionally, if the identity of the fragmented word is obvious, the missing portion of the fragmented word can be shown in parentheses.  The example above can be transcribed as:

 

show me the fli(ghts)- flights to boston

 

The spoken fragment convention should also be used for false starts/restarts in which words are cut off.  For example,

 

i- i need a flight on Thursday (where the first i was not completely pronounced).

 

Within-word hesitations may be transcribed as:



dal- -las (indicating a silence interrupting a word)

dal- [um] -las (indicating a within-word interruption; rare)

 

Note that fragments will be scored as optionally deletable words.
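
The fragment notation can be parsed as follows.  This is a minimal sketch, assuming that fragments are whitespace-delimited tokens and that "optionally deletable" means a hypothesis is acceptable with or without the fragment; only the deleted alternative is produced here.  The names parse_fragment, is_fragment, and without_fragments are illustrative.

```python
import re

# Spoken part, optional completion in parentheses, then a trailing hyphen:
# "fli-" or "fli(ghts)-".
TRAILING_FRAGMENT = re.compile(r"([a-z']+)(?:\(([a-z']+)\))?-")
# Resumed end of an interrupted word, e.g. "-las".
LEADING_FRAGMENT = re.compile(r"-[a-z']+")

def parse_fragment(token):
    """Return (spoken, intended word or None), or None if not a cut-off start."""
    match = TRAILING_FRAGMENT.fullmatch(token.lower())
    if match is None:
        return None
    spoken, rest = match.group(1), match.group(2)
    return spoken, (spoken + rest if rest else None)

def is_fragment(token):
    """True for cut-off starts ("fli-") and resumed ends ("-las")."""
    if parse_fragment(token) is not None:
        return True
    return LEADING_FRAGMENT.fullmatch(token.lower()) is not None

def without_fragments(transcript):
    """Return the transcript with every fragment token deleted."""
    return " ".join(tok for tok in transcript.split() if not is_fragment(tok))
```

Hyphenated words such as "time-share" or "uh-huh" are not treated as fragments, because the hyphen is neither the first nor the last character of the token.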

 

12.  Yes/No Sounds

 

Transcribe yes/no sounds as spoken.  Do not enclose them in square brackets.  For example,

 

yep         nope

yup         nah

um-hum      hum-um

uh-huh      uh-uh

mm-hmm      hmm-mm

 

13.  Other Acoustic Events

 

Other non-speech acoustic artifacts may be transcribed with a string enclosed by square brackets [description_of_event].  It is suggested that this convention be used sparingly.  A list of some possible non-speech artifacts is given below.

 

[TV]              [baby]              [baby_crying]        [baby_talking]

[barking]         [beep]              [bell]               [bird_squawk]

[breathing]       [buzz]              [buzzer]             [child]

[child_crying]    [child_laughing]    [child_talking]      [child_whining]

[child_yelling]   [children]          [children_talking]   [children_yelling]

[chiming]         [clanging]          [clanking]           [click]

[clicking]        [clink]             [clinking]           [cough]

[dishes]          [door]              [footsteps]          [gasp]

[groan]           [hiss]              [horn]               [hum]

[inhaling]        [laughter]          [meow]               [motorcycle]

[music]           [noise]             [nose_blowing]       [phone_ringing]

[popping]         [pounding]          [printer]            [rattling]

[ringing]         [rustling]          [scratching]         [screeching]

[sigh]            [singing]           [siren]              [smack]

[sneezing]        [sniffing]          [snorting]           [squawking]

[squeak]          [static]            [swallowing]         [talking]

[tapping]         [throat_clearing]   [thumping]           [tone]

[tones]           [trill]             [tsk]                [typewriter]

[ugh]             [wheezing]          [whispering]         [whistling]

[yawning]         [yelling]

 

As noted in the Switchboard Transcription, "effort expended on extremely detailed marking of noise has not proven productive to date."

 

Note that these annotations will be ignored in scoring.
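
Removing these annotations before scoring might be sketched as follows.  The sketch assumes that each annotation is a single whitespace-delimited token with no internal spaces, and that any bracketed token other than the five pause fillers suggested in section 10 is a non-speech annotation.  This document does not state how pause fillers are scored, so they are left in place.  All names in the code are hypothetical.

```python
import re

# Pause fillers suggested in section 10; these are kept.
PAUSE_FILLERS = {"[uh]", "[um]", "[er]", "[ah]", "[mm]"}
# A bracketed token with no whitespace or nested brackets inside.
BRACKETED = re.compile(r"\[[^\[\]\s]+\]")

def is_non_speech(token):
    """True for a bracketed token that is not one of the listed pause fillers."""
    if BRACKETED.fullmatch(token) is None:
        return False
    return token.lower() not in PAUSE_FILLERS

def drop_non_speech(transcript):
    """Return the transcript with non-speech annotations removed."""
    return " ".join(tok for tok in transcript.split() if not is_non_speech(tok))
```

A site that uses pause fillers outside the suggested list would need to extend PAUSE_FILLERS, since this sketch would otherwise discard them as non-speech events.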

 

14.  Nonstandard Grammar

 

Do not attempt to correct nonstandard grammar.  If a phrase such as "i'd like to take them flights" is spoken, transcribe it as spoken.

 

15.  Resolution of Questions

 

Direct all questions about the transcription specifications to your local MADCOW Committee Representative.