File: callinfo.doc
-------------------

Explanation of the audit information provided in "callinfo.tbl"

The file "callinfo.tbl" contains the information produced by manual
auditing of the CallHome German conversations.  The auditing process
was set up to accomplish two things: (a) provide information about the
speakers and channel quality for the conversation, and (b) establish
the region of the conversation to be used for transcription.

The table file contains multiple lines for each conversation, to
provide the following information:

 - total number of speakers and total number of females and males
 - channel quality for each side of the conversation (see below)
 - number of speakers per channel and number of males/females per
   channel
 - gender, approximate age, and other features of each speaker

Each line begins with the conversation identifier: the prefix "ge_"
followed by the four-digit conversation number.  Here is a sample set
of lines for one of the conversations:

ge_4073 ntalkers=3 nfemale=1 nmale=2
ge_4073 siA ntalkers=1 nfemale=0 nmale=1
ge_4073 siA difficulty=3 bgnoise=1 chnoise=3 distortion=2 crosstalk=3
ge_4073 siB ntalkers=2 nfemale=1 nmale=1
ge_4073 siB difficulty=3 bgnoise=1 chnoise=3 distortion=2 crosstalk=3
ge_4073 taA sex=male age=adult rate=normal artic=mumbled accent="swabian" acc_pron=1 dial_wds=1
ge_4073 taB sex=male age=adult rate=normal artic=clear accent="swabian" acc_pron=1 dial_wds=1
ge_4073 taB1 sex=female age=adult rate=slow artic=clear accent="swabian" acc_pron=1 dial_wds=1

The first line gives the total number of speakers in the transcribed
portion of the conversation.  The next four lines indicate the channel
quality and the number of speakers on a per-channel basis.  The
remaining lines provide information on each of the speakers.
Typically there are only two speakers in a conversation, one per
channel, so this portion of the table would contain only two lines.

Accent information was provided for each speaker.
In CallHome German the following accents were identified:

 * Standard - commonly referred to as High German; the standardized
   accent of contemporary Germany.  It resulted from attempts
   throughout the centuries (post-Gutenberg) to standardize German
   for the stage (Theodor Siebs) and literature (Martin Luther), and
   was popularized in the modern era by television, radio and
   compulsory education.  It is most similar to the accent native to
   Hanover and the surrounding region.

 * Northern - a Low German accent particular to the northern area of
   Germany (Kiel, Bremen, Lower Saxony).

 * Swabian - an Upper German accent particular to Swabia and the
   surrounding area in Southwest Germany.

 * Hessian - an accent particular to the state of Hesse and the
   surrounding area.

 * Bavarian - an Upper German accent particular to the state of
   Bavaria and the surrounding area in Southern Germany.  Also
   commonly referred to as Southern German.

The information on channel quality should be interpreted as follows:

 - Difficulty refers to the overall quality of the channel in terms
   of number of speakers, background noise, channel noise, speed,
   accent, articulation, etc.

 - Background noise refers to the amount of sound not made by the
   speakers, e.g., baby crying, television, radio, etc.

 - Channel noise refers to static or channel break-up.

 - Distortion refers to echo and other types of recording problems.

 - Crosstalk refers to audibility of the channel A speaker on
   channel B, and vice versa (i.e., due to the echo in the telephone
   circuit).

All attributes are rated on a scale of 1 to 5, where 1 always denotes
optimal channel conditions and 5 denotes the worst conditions that
are still usable.
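
The line layout described above (a conversation identifier, an optional
record tag such as siA or taB1, then key=value pairs) can be read with a
short script.  The following is only a minimal sketch in Python, not a tool
distributed with the corpus.  It assumes that fields are separated by
whitespace and that quoted values follow shell-style quoting; the function
names and the "all" key used for the conversation-wide line are
hypothetical choices made for this illustration.

```python
import shlex


def parse_callinfo_line(line):
    """Split one table line into (conversation, tag, fields).

    tag is None for the conversation-wide line, which has no siX/taX token.
    """
    # shlex.split handles quoted values such as accent="swabian"
    tokens = shlex.split(line)
    conversation, rest = tokens[0], tokens[1:]

    # Assumption: a second token without "=" is a record tag (siA, taB1, ...).
    tag = None
    if rest and "=" not in rest[0]:
        tag, rest = rest[0], rest[1:]

    fields = {}
    for token in rest:
        key, _, value = token.partition("=")
        # Ratings and counts become integers; everything else stays text.
        fields[key] = int(value) if value.isdigit() else value
    return conversation, tag, fields


def load_callinfo(lines):
    """Group parsed lines as {conversation: {tag: {key: value}}}."""
    table = {}
    for line in lines:
        if not line.strip():
            continue
        conversation, tag, fields = parse_callinfo_line(line)
        # "all" is a hypothetical label for the untagged summary line.
        record = table.setdefault(conversation, {}).setdefault(tag or "all", {})
        # The two lines per channel (counts, quality) merge into one record.
        record.update(fields)
    return table


sample = [
    "ge_4073 ntalkers=3 nfemale=1 nmale=2",
    "ge_4073 siA ntalkers=1 nfemale=0 nmale=1",
    "ge_4073 siA difficulty=3 bgnoise=1 chnoise=3 distortion=2 crosstalk=3",
    'ge_4073 taB1 sex=female age=adult rate=slow artic=clear accent="swabian" acc_pron=1 dial_wds=1',
]
info = load_callinfo(sample)
```

With the sample lines above, info["ge_4073"]["siA"] holds both the speaker
counts and the quality ratings for channel A, and the ratings can be
compared directly against the 1-to-5 scale.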