Communicator
Transcription Guidelines
Version 1.2, March 9, 2000
This document contains the specification for transcription of speech in the
Communicator program. Per the Communicator Evaluation Committee’s request,
with the exception of spelled letters, a proper subset of the final ATIS
transcription specifications is used.
The transcription is intended to be an orthographic lexical transcription
with a few details included to represent audible acoustic events (speech and
non-speech) present in the corresponding waveform files.
Transcription may be made in two passes: a first pass in which words are
transcribed, and an optional second pass in which additional details
(non-speech phenomena) may be added. Many non-speech phenomena (loud
inhalations, coughing, hissing, smacking, TV-in-background, etc.) are easy to
miss unless specifically attended to. For Communicator, the annotation of
these non-speech artifacts is not required, but it is supported for those
sites that wish to record such information. Note, however, that these
non-speech annotations will be ignored in scoring.
Transcriptions developed using these guidelines will be referred to as CTF
(Communicator Transcription Format) transcripts and may be stored in a file
with a .ctf extension.
Required transcription rules:
1. Case
Transcriptions may use normal capitalization. However, case will be ignored in scoring.
2. Spelling
Follow American Heritage Dictionary (AHD)
conventions where possible.
3. Number Sequences
Transcribe number sequences (flight numbers, times,
dates, aircraft types, dollar amounts, etc.) as spoken in word form. For example,
flight six one three
flight six thirteen
seven thirty
august twenty first
seven forty seven
Note: Care should be taken to
transcribe the digit "0" as "zero" or "oh",
depending on what the speaker said.
4. Letter Sequences
Spoken letters occur in acronyms and abbreviations. Transcribe each spoken
letter in lower case followed by a period and a space. For example,
d. f. w.
five thirty p. m.
washington d. c.
Transcribe inflections of letters as if they were
inflections of words. For example,
b. s.ing (no space between s. and ing)
i. d.ed him
the t. i.er's last name
i. b. m.'s new machine
the ten c. e. o.s' votes
If a speaker pronounces an acronym or abbreviation as a word, transcribe it
as a word (e.g., "den" or "bos"), rather than as separate letters (not
"d. e. n." or "b. o. s.").
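The letter-sequence convention above can be illustrated with a small helper. This is only a sketch: the function name `spell_letters` and its `suffix` parameter are hypothetical and not part of the CTF specification.

```python
def spell_letters(letters, suffix=""):
    """Format spoken letters as lower-case letters, each followed by a period.

    `suffix` is an optional inflection (e.g. "ing" or "'s") that is attached
    to the last letter with no intervening space.
    Hypothetical helper; not defined by the guidelines themselves.
    """
    if not letters.isascii() or not letters.isalpha():
        raise ValueError("expected ASCII letters only")
    return " ".join(ch + "." for ch in letters.lower()) + suffix


print(spell_letters("DFW"))         # d. f. w.
print(spell_letters("BS", "ing"))   # b. s.ing
print(spell_letters("IBM", "'s"))   # i. b. m.'s
```

The helper applies only to letters that were actually spoken one by one; an acronym pronounced as a word is still written as a word.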
5. Contractions
When a standard orthographic form exists for a contraction and the
contraction is actually spoken, transcribe it as spoken (e.g., "can't",
"don't", "i'd", "we'd", "y'all").
For verbal contractions in which words are
phonetically reduced and elided, if a written version of the contraction does
not exist in formal American English, use the expanded form of the constituent
words. For example,
Spoken    Transcribed
wanna     want to
wanna     want a
gonna     going to
hafta     have to
useta     used to
oughta    ought to
sonova    son of a
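Because a reduced form such as "wanna" has more than one expansion, a tool can flag it for the transcriber but should not expand it automatically. The following is a minimal sketch under that assumption; the names `EXPANSIONS` and `flag_reduced_forms` are hypothetical, and the table holds only the forms listed above.

```python
# Reduced forms from the table above, mapped to their possible expansions.
EXPANSIONS = {
    "wanna": ["want to", "want a"],
    "gonna": ["going to"],
    "hafta": ["have to"],
    "useta": ["used to"],
    "oughta": ["ought to"],
    "sonova": ["son of a"],
}


def flag_reduced_forms(transcript):
    """Return (word_index, token, candidate_expansions) for each reduced form."""
    hits = []
    for index, token in enumerate(transcript.lower().split()):
        if token in EXPANSIONS:
            hits.append((index, token, EXPANSIONS[token]))
    return hits


print(flag_reduced_forms("i wanna fly to boston"))
# [(1, 'wanna', ['want to', 'want a'])]
```

The transcriber still chooses between the candidates from context ("want to fly" here).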
6. New or Invented Words
For new or invented words, if each constituent can
stand alone as an independent word, transcribe the constituents as hyphenated
words or separate words. Otherwise,
concatenate the constituents without a hyphen.
Bear in mind that hyphens are treated as whitespace in scoring speech
recognition accuracy. For example,
Spoken       Transcribed
ecommerce    e-commerce (or e commerce)
cybercafe    cybercafe (not cyber-cafe)
cancelbot    cancel-bot (or cancel bot)
In the last example, "bot" is considered
to be a legitimate word.
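The effect of treating hyphens as whitespace can be sketched in a few lines. This is an illustration only: `scoring_key` is a hypothetical name, and the actual scoring software may apply further normalization that is not described here.

```python
def scoring_key(transcript):
    """Return the word list compared in this sketch.

    Assumption: case is ignored and hyphens are treated as whitespace, as the
    guidelines state; everything else is left untouched.
    """
    return transcript.lower().replace("-", " ").split()


# Hyphenated and separated forms compare equal ...
print(scoring_key("e-commerce") == scoring_key("E commerce"))   # True
# ... but a concatenated form does not match a hyphenated one.
print(scoring_key("cybercafe") == scoring_key("cyber-cafe"))    # False
```

This is why the choice between concatenating and hyphenating matters, while the choice between a hyphen and a space does not.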
7. Compound Words
Follow the AHD for hyphenation of common compound
words. If the word is not in the AHD,
follow the convention for new or invented words in section 6. For example,
nonstop
nonsmoking
time-share
under-floor column (or under floor column)
The existence of such a compound word does not
preclude the use of its component words independently in different
contexts. For example, the following
should be transcribed as shown (with both "after noon" and
"afternoon").
seven p. m. is after noon
but it is in the evening not in the afternoon
8. Punctuation
Do not use any English sentence punctuation such as periods, commas,
question marks, exclamation marks, etc.
Transcribe only the special abbreviations
"Mr.", "Ms.", and "Mrs." in their abbreviated
form. All other words should be represented
in their full spelled-out form.
Use only 7-bit ASCII characters. Do not use accent marks on foreign imported
words (e.g., "fiance" not "fiancé").
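A transcript line can be checked mechanically against the punctuation rules above. The sketch below is one possible check, not an official validator: `check_line` and the constants are hypothetical names, and the pattern for letter tokens assumes the letter-sequence convention of section 4 (a single letter, a period, and an optional inflection).

```python
import re

ALLOWED_ABBREVIATIONS = {"mr.", "ms.", "mrs."}
# A spoken letter with an optional inflection, e.g. "d.", "s.ing", "m.'s".
LETTER_TOKEN = re.compile(r"[a-z]\.[a-z']*")
SENTENCE_PUNCTUATION = set(",?!;:")


def check_line(line):
    """Return a list of human-readable problems found in one transcript line."""
    problems = []
    for ch in line:
        if ord(ch) > 127:
            problems.append(f"non-ASCII character {ch!r}")
        elif ch in SENTENCE_PUNCTUATION:
            problems.append(f"sentence punctuation {ch!r}")
    for token in line.lower().split():
        # Periods are accepted only in letter tokens and the three titles.
        if ("." in token
                and token not in ALLOWED_ABBREVIATIONS
                and not LETTER_TOKEN.fullmatch(token)):
            problems.append(f"unexpected period in {token!r}")
    return problems


print(check_line("five thirty p. m."))    # []
print(check_line("flights to boston."))   # ["unexpected period in 'boston.'"]
```

An empty list means that no problem was detected by this sketch; it does not prove that the line follows every rule in this document.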
9. Mispronunciations
If a speaker mispronounces a word and the
mispronunciation cannot be interpreted as another properly pronounced word,
transcribe the word as it is spoken, surrounded with asterisks. For example,
show me flights from *atlanty* to dallas
10. Pause Fillers
Transcribe pause fillers as words enclosed in square
brackets. If possible, try to limit
these to this list: [uh], [um], [er],
[ah], [mm].
11. Word Fragments
If a speaker does not completely pronounce a word,
spell out the fragment of the word that was spoken followed by a hyphen to
indicate the missing portion of the word.
For example,
show me the fli- flights to boston
Optionally, if the identity of the fragmented word
is obvious, the missing portion of the fragmented word can be shown in
parentheses. The example above can be
transcribed as:
show me the fli(ghts)- flights to boston
The spoken fragment convention should also be used for false starts/restarts
in which words are cut off. For example,
i- i need a flight on Thursday (where the first i was not completely
pronounced).
Within-word hesitations may be transcribed as:
dal- -las (indicating a silence interrupting a word)
dal- [um] -las (indicating a within-word interruption; rare)
Note that fragments will be scored as optionally
deletable words.
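The fragment notation is regular enough to parse. The sketch below recognizes only trailing-hyphen fragments such as "fli-" and "fli(ghts)-"; the name `parse_fragment` is hypothetical, and leading-hyphen continuations such as "-las" are deliberately left out.

```python
import re

# Spoken part, optional "(missing part)", then the trailing hyphen.
FRAGMENT = re.compile(r"^([a-z']+)(?:\(([a-z']+)\))?-$")


def parse_fragment(token):
    """Return (spoken, intended_word_or_None) for a fragment token, else None."""
    match = FRAGMENT.match(token.lower())
    if match is None:
        return None
    spoken, missing = match.group(1), match.group(2)
    return spoken, (spoken + missing if missing else None)


print(parse_fragment("fli-"))         # ('fli', None)
print(parse_fragment("fli(ghts)-"))   # ('fli', 'flights')
print(parse_fragment("flights"))      # None
```

Hyphenated words such as "uh-huh" are not treated as fragments, because the hyphen must be the last character of the token.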
12. Yes/No Sounds
Transcribe as spoken. Do not enclose them in square brackets. For example,
yep       nope
yup       nah
um-hum    hum-um
uh-huh    uh-uh
mm-hmm    hmm-mm
13. Other Acoustic Events
Other non-speech acoustic artifacts may be
transcribed with a string enclosed by square brackets
[description_of_event]. It is suggested
that this convention be used sparingly.
A list of some possible non-speech artifacts is given below.
[TV] [baby] [baby_crying] [baby_talking]
[barking] [beep] [bell] [bird_squawk]
[breathing] [buzz] [buzzer] [child]
[child_crying] [child_laughing] [child_talking] [child_whining]
[child_yelling] [children] [children_talking] [children_yelling]
[chiming] [clanging] [clanking] [click]
[clicking] [clink] [clinking] [cough]
[dishes] [door] [footsteps] [gasp]
[groan] [hiss] [horn] [hum]
[inhaling] [laughter] [meow] [motorcycle]
[music] [noise] [nose_blowing] [phone_ringing]
[popping] [pounding] [printer] [rattling]
[ringing] [rustling] [scratching] [screeching]
[sigh] [singing] [siren] [smack]
[sneezing] [sniffing] [snorting] [squawking]
[squeak] [static] [swallowing] [talking]
[tapping] [throat_clearing] [thumping] [tone]
[tones] [trill] [tsk] [typewriter]
[ugh] [wheezing] [whispering] [whistling]
[yawning] [yelling]
As noted in the Switchboard Transcription, "effort expended on extremely detailed marking of noise has not proven productive to date."
Note that these annotations will be ignored in scoring.
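Since bracketed non-speech annotations are ignored in scoring, they can be removed before transcripts are compared. The following is a minimal sketch, not the scoring tool itself: `strip_events` is a hypothetical name, and because this document does not say whether pause fillers are also ignored, that choice is left to the `keep_fillers` parameter.

```python
import re

# Pause fillers from section 10; everything else in brackets is an event.
PAUSE_FILLERS = {"[uh]", "[um]", "[er]", "[ah]", "[mm]"}
BRACKETED = re.compile(r"\[[^\[\]\s]+\]")


def strip_events(transcript, keep_fillers=True):
    """Remove bracketed non-speech annotations from a transcript line."""
    def replace(match):
        tag = match.group(0)
        if keep_fillers and tag.lower() in PAUSE_FILLERS:
            return tag
        return ""
    # Collapse the whitespace left behind by removed annotations.
    return " ".join(BRACKETED.sub(replace, transcript).split())


line = "[cough] show me [um] flights [door] to boston"
print(strip_events(line))                       # show me [um] flights to boston
print(strip_events(line, keep_fillers=False))   # show me flights to boston
```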
14. Nonstandard Grammar
Do not attempt to correct nonstandard grammar. If a phrase such as "i'd like
to take them flights" was spoken, transcribe it as it was spoken.
15. Resolution of Questions
Direct all questions about the transcription
specifications to your local MADCOW Committee Representative.