The Design of the HCRC Map Task Corpus Catherine Sotillo Ellen Gurman Bard Jan McAllister Anne H. Anderson Henry S. Thompson Ellen Gurman Bard Formatting Henry S. Thompson Minor formatting, TEI tags UK Economic and Social Research Council &HCRC.dist;

Based on the appendix to the original occasional paper

Jan McAllister, Catherine Sotillo, Ellen Gurman Bard, Anne H. Anderson. Using the map task to investigate variability in speech. Department of Linguistics, University of Edinburgh. Edinburgh Occasional Paper.

Plain ascii text, with spaces and tabs used for formatting

_______________________________________________________________________

              The Design of the HCRC Map Task Corpus
_______________________________________________________________________


I. Introduction

The current version of the Map Task design was intended to provide a common corpus for a vertical study of dialogue, generating material which can be discussed at levels from the acoustic to the sociolinguistic. All the relevant parameters incorporated in the design are described in this document.

The Map Task is a cooperative task involving two participants. The two speakers sit opposite one another and each has a map which the other cannot see. One speaker, the Instruction Giver, has a route marked on his/her map, while the other, the Instruction Follower, has no route. The speakers are told that their goal is to reproduce the Instruction Giver's route on the Instruction Follower's map. The maps are not identical and the speakers are told this explicitly at the beginning of their first session. It is, however, up to them to discover how the two maps differ.

The differences in the maps result from the systematic manipulation of the following design variables:

    1. phonological characteristics of feature names
    2. the extent to which features contrast or are shared between the maps

The assignment of speakers to maps involves two further variables:

    1. familiarity
    2. eye-contact

This document describes these variables and the ways in which they were manipulated.


II. Materials

II.A. Phonological Characteristics

II.A.1 Phonological Modifications

The maps draw upon four phonological modification categories, or reduction types:

    1. t-deletion
    2. glottalisation
    3. d-deletion
    4. nasal assimilation

Opportunities for reductions of these types characterize the names of landmarks. The use made of the landmarks subdivides them into Master features and Other kinds of features.

II.A.1.a Master Features

Each map includes one or both of a potential pair of Master Features (landmarks).
For each reduction type, there is a different pair of master feature names, and each pair of master features appears on an equal number of maps in the corpus. The name pairs are as follows:

    Code   Reduction-type        Master Feature names

    1      t-deletion            east lake / west lake
    2      glottalisation        white mountain / slate mountain
    3      d-deletion            diamond mine / gold mine
    4      nasal assimilation    crane bay / green bay

The description of Contrast/Match given below explains how maps come to have one or both of their master features.

II.A.1.b Other Features

In addition to master features, landmark names offering sites for the four categories of reduction type occur on the maps as Other feature types, which will be described in more detail below. Each map contains at least one example of each reduction type.

II.A.2 Prosodic Structure

Each map contains at least one example of each of two Polysyllabic Categories:

    1. initial Strong-Weak (eg "buffalo")
    2. initial Weak-Strong (eg "baboons")

II.B. Feature Types

II.B.1 Introduction

The maps include labelled drawings of a number of landmarks or features, arranged on an A3 page in a systematic way. The Giver's and Follower's maps were carefully constructed to include features which differed along a number of dimensions:

    1. Contrast
    2. Sharedness
    3. Odd-Man-Out

Additional Incidental features were included for lexical variety.

II.B.2 Contrast and Match

Over the maps in the design, the pairs of master features listed above appear in a balanced set of Contrast conditions. If there is a contrast (+), then both members of the master feature pair are on the Instruction Giver's map, and the Instruction Giver has an opportunity to subject members of the pair to contrastive stress. If there is a match (+), the Instruction Follower's map matches the Instruction Giver's in respect of contrast. Thus either Giver or Follower or both or neither may have the pair of master features.
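The Contrast/Match logic described above can be sketched in a few lines of Python. This is only an illustration, not part of the original design materials: the function and variable names are hypothetical, and the counts refer solely to how many members of the master feature pair appear on each map.

```python
from itertools import product


def master_feature_counts(contrast: bool, match: bool) -> tuple[int, int]:
    """Return (giver_count, follower_count) of master features for a map pair.

    Hypothetical helper: contrast=True means the Giver's map shows both
    members of the pair; match=True means the Follower's map agrees with
    the Giver's map in respect of contrast.
    """
    giver_has_pair = contrast
    # A mismatch (-) flips the Follower's map relative to the Giver's.
    follower_has_pair = giver_has_pair if match else not giver_has_pair
    return (2 if giver_has_pair else 1, 2 if follower_has_pair else 1)


def condition_code(contrast: bool, match: bool) -> str:
    """Two-character code, Contrast first and Match second (eg '+-')."""
    return ("+" if contrast else "-") + ("+" if match else "-")


# All four Contrast/Match conditions with their master feature counts.
CONDITIONS = {
    condition_code(c, m): master_feature_counts(c, m)
    for c, m in product([True, False], repeat=2)
}
```

Under these assumptions the Follower has the full pair exactly when Contrast and Match carry the same sign.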
There are equal numbers of Giver-Follower map pairs of each of the following types:

    ++   Both maps have both members of their master feature pair.
    +-   The Giver's has both, the Follower's only one.
    -+   Both maps have only one member of the pair.
    --   The Giver's has one, the Follower's both.

II.B.3 Sharedness

Other features on each map belong to a number of categories dependent on whether the name, the illustration, or the number of instances of the feature are the same on both maps in a pair. Examples of the following categories of sharedness were included on each map:

    Common Feature     a feature which is common to both the Instruction Giver's and the Instruction Follower's map; ie the same drawing occurring with the same name in the same location on the map.

    Name Change        a feature that is common to both maps (same drawing, same location) but which is named differently on the two maps; eg where the Instruction Giver has "white water" the Instruction Follower might have "rapids".

    Absent/Present     a feature that is present on one speaker's map but not the other's.

    Two-to-One (2:1)   a feature which the Giver has two of, one of which is relevant to the route and one irrelevant. The Follower only has the irrelevant feature.

For each map, each one of these sharedness types occurs with a different one of the four reduction types.

II.B.4 Odd-Man-Out

Feature names were chosen for meaning as well as for the sound of their names. A Scenario was devised for each map and features chosen to fit in with this stereotypical location. For example, there might be a "Wild West" scenario, with "Apache camp", "canoes", "buffalo", "gold mine" and "cavalry" as landmark names. One feature, the Odd-Man-Out, would be alien to this scenario, like a "nuclear test site" in the "Wild West".

II.C. Routes

II.C.1 Description

Four different routes were constructed for the maps.
To help ensure that the routes differed sufficiently, random number tables were used to generate grid co-ordinates which determined the positions of the major features (those involving a potential phonological modification) on the page. A route was then drawn around the features observing the following criteria:

    1. The route starts at a shared, or common, feature;
    2. The route finishes at a common feature;
    3. Intermediate landmarks along the route alternate between common features and those that differ in some way (see above);
    4. There are at least two features which only appear on the Giver's map, and two features which only the Follower has.

II.C.2 Routes and Master Features

The four routes are associated with particular master features. For example, the master feature "east lake" always occurs in the same location on any map in which it appears, and these maps all share the same route. For this reason, routes can be assigned the number given to the phonological reduction type of the master feature. Thus the route associated with the master feature "east lake" is route number 1 as "east lake" is the t-deletion master feature.

II.D. Quartets

Each of the 16 maps constructed in the manner described above has a unique Route(master feature) x Contrast/Match combination. Four different Quartets of maps were created using a Latin square to ensure that each contained one example of each contrast/match condition and of each route, but that no Route(master feature) x Contrast/Match combination occurred more than once.
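The Latin-square constraint on the Quartets can be illustrated with a short Python sketch. The cyclic-shift rule used here is an assumption made for illustration: the design states only that a Latin square was used, not how it was generated. The names `cyclic_quartets` and `is_latin` are hypothetical. With this particular shift the result coincides with the allocation of the 16 maps listed in this section.

```python
CONDITIONS = ["++", "+-", "-+", "--"]  # Contrast/Match codes, in table order


def cyclic_quartets(n: int = 4) -> list[list[str]]:
    """Build n quartets of map codes such as '++1'.

    Assumed rule: in quartet q (0-based), the condition at index c is
    paired with route ((c - q) mod n) + 1, ie each quartet shifts the
    routes one place relative to the previous quartet.
    """
    return [
        [f"{CONDITIONS[c]}{(c - q) % n + 1}" for c in range(n)]
        for q in range(n)
    ]


def is_latin(quartets: list[list[str]]) -> bool:
    """Check each route occurs once per quartet and once per condition."""
    routes = [[int(code[-1]) for code in quartet] for quartet in quartets]
    full = set(range(1, len(quartets) + 1))
    rows_ok = all(set(row) == full for row in routes)
    cols_ok = all(set(col) == full for col in zip(*routes))
    return rows_ok and cols_ok


QUARTETS = cyclic_quartets()

for number, quartet in enumerate(QUARTETS, start=1):
    print(f"Qrt{number}", *quartet)
```

Because every row and every column of the square holds each route once, the 16 generated codes are all distinct, which is the "no combination occurred more than once" property.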
The 16 maps (4 for each quartet) are allocated as follows:

    Quartet     Contrast/Match/Route

    Qrt1        ++1   +-2   -+3   --4
    Qrt2        ++4   +-1   -+2   --3
    Qrt3        ++3   +-4   -+1   --2
    Qrt4        ++2   +-3   -+4   --1

where

    +Contrast = Instruction Giver's map contains contrasting master features (eg "east lake" and "west lake")
    -Contrast = Instruction Giver's map contains only one member of the master feature pair (eg "east lake")
    +Match    = Instruction Follower's map matches Giver's in contrast, so if Giver has both lakes so does Follower, but if Giver has only one then Follower only has one
    -Match    = Instruction Follower's map mismatches Giver's in contrast, so if Giver has two lakes Follower has one, if Giver has one, Follower has two

    Route 1 = associated with t-deletion master feature
    Route 2 = associated with glottalisation master feature
    Route 3 = associated with d-deletion master feature
    Route 4 = associated with nasal assimilation master feature

II.E. Assignment of Feature Names to Feature Types, Maps, and Quartets

===========================================================================
Map                              Type of Sharedness
(master          ----------------------------------------------------------
features)        2:1         Absent/     Name         Common      Odd Man Out
                             Present     Change
===========================================================================
Quartet 1

++1              1           2           3            4           1
east lake        fenced      picket      old mill/    caravan     nuclear
west lake        meadow      fence       mill wheel   park        test site

+-2              4           3           2            1           4
white mountain   site of     round       hot wells/   collapsed   roman baths
slate mountain   plane       rocks       hot springs  shelter
                 crash

-+3              3           4           1            2           3
diamond mine     carved      saloon      fast flowing flat        walled city
gold mine        wooden      bar         river/       rocks
                 pole                    fast running
                                         creek

--4              2           1           4            3           2
crane bay        wheat       forest      cliffs/      old         rocket
green bay        fields      stream      sandstone    lighthouse  warehouse
                                         cliffs
===========================================================================
Quartet 2

++4              3           4           1            2           3
crane bay        farmed      iron        forked       pirate      computer
green bay        land        bridge      stream/      ship        controlled
                                         gurgling                 sub
                                         brook

+-1              2           1           4            3           2
east lake        great       popular     farmer's     ruined      rocket
west lake        viewpoint   tourist     gate/        monastery   launch pad
                             spot        broken gate

-+2              1           2           3            4           1
white mountain   lost        straight    ancient      fallen      soft
slate mountain   steps       river       ruins/       pillars     furnishings
                                         ruined city              store

--3              4           3           2            1           4
diamond mine     stone       manned      white water/ rift        swan pond
gold mine        creek       fort        rapids       valley
===========================================================================
Quartet 3

++3              2           1           4            3           2
diamond mine     great       parched     indian       carved      trout farm
gold mine        rock        river bed   country      stones

+-4              1           2           3            4           1
crane bay        vast        white-      reclaimed    seven       crashed
green bay        meadow      washed      fields       beeches     spaceship
                             cottage

-+1              4           3           2            1           4
east lake        train       privately   granite      site of     lion country
west lake        crossing    owned       quarry       forest
                             fields                   fire

--2              3           4           1            2           3
white mountain   poisoned    lemon       crest falls  remote      cobbled
slate mountain   stream      grove                    village     street
===========================================================================
Quartet 4

++2              4           3           2            1           4
white mountain   golden      submerged   secret       extinct     saxon
slate mountain   beach       rocks       valley       volcano     barn

+-3              3           4           1            2           3
diamond mine     field       overgrown   highest      great       disused
gold mine        station     gully       viewpoint    lake        warehouse

-+4              2           1           4            3           2
crane bay        boat        washed      pine grove   pebbled     coconut
green bay        house       stones      grove        shore       palm

--1              1           2           3            4           1
east lake        parked      flight      disused      telephone   thatched
west lake        van         museum      monastery    box         mud hut
===========================================================================

NB Due to an error there is no change of name on the maps in Quartets 3 and 4.

II.F. Lists and Diagnostics

At the start of each Map Task session subjects were asked to read four accent diagnostic sentences taken from Barry, Hoequist, and Nolan (1989):

    1. After tea father fed the cat.
    2. Father hid that awful cart at the top of the park.
    3. Father cooked two of the puddings in butter.
    4. Father bought a lot of cloth.
For the complete reference and further discussion, see the file diagnost.sgm in this directory.

After completing the Map Task conversations subjects read a wordlist containing all the landmark names on the maps used by those subjects. The list was read twice, each time in a different randomised order.


III. Subject Conditions

III.A. Introduction

The 64 subjects were grouped into sixteen quadruples (referred to hereafter as quads). Two variables classifying the subjects were incorporated in the full design of the Map Task Corpus:

    1. Eye-Contact
    2. Familiarity

III.B. Eye-Contact

The Map Task involves subjects facing one another across a pair of drawing boards arranged back to back so that each subject's map is hidden from the other. Subjects in the With Eye Contact condition could see each other's faces over the drawing boards. Subjects in the No Eye Contact condition performed the task with an additional barrier between them which prevented them from seeing one another's faces. In all conditions subjects were instructed to avoid using hand gestures. No subject served in more than one eye-contact condition -- eight quads were in the eye-contact condition, eight in the no-eye-contact condition.

III.C. Familiarity

Each quad contained two pairs of speakers, Pair A and Pair B. The members of a pair knew each other well, but had never before met the members of the other pair. In each pair, one member was Instruction Giver first and the other was Instruction Follower first.

In quads 1 - 4 in each eye-contact condition (those in Layer One in the design below) all subjects worked with an unfamiliar partner in their first conversation; all subjects in the remaining quads (each eye-contact condition, quads 5 - 8, Layer Two) started with a familiar partner in their first conversation. In both layers, familiarity alternated: for either role (Giver/Follower) each subject participated once with a familiar and once with an unfamiliar partner.
If, for example, a Layer One speaker had a familiar partner for her first session as Giver, she had an unfamiliar partner for her second session in that role. The Layer Two speaker delivering the same map reversed the order of +/- familiar sessions.

III.D. Dialogue and Map Identifying Codes

It follows from the above that a quad number (1 - 8) and an eye-contact condition (eye-contact/no-eye-contact) determine a group of subjects. Eight conversations (1 - 8) were recorded for each group, half between familiar and half between unfamiliar speakers. This gives rise to the form of the identifying codes used for each dialogue in the Map Task Corpus:

    q(1-8) - indicates the Quad number. (This determines which quartet of maps was used.)

    n/e    - indicates whether speakers were in the eye-contact or no-eye-contact condition. Taken together with quad number, determines the subject group. (n=NO-EYE-CONTACT, e=WITH-EYE-CONTACT)

    c(1-8) - indicates the conversation number. (This identifies who was performing the task and the map which was being used.)

    f/u    - indicates the familiarity of the pair in the dialogue. (f=FAMILIAR, u=UNFAMILIAR)

Thus, the code q5ec1f identifies the dialogue between speakers from the fifth quad in the eye-contact condition, who were participating in the first dialogue of that quad and who were familiar with each other.

The maps are defined uniquely using their contrast/match/reduction code (see section II above), for example ++1, +-2, -+3, --4 etc. Each actual map is labelled as either the Giver's (g) or Follower's (f) map, ie ++1g or ++1f.

A shorthand coding is also used in the names of the files which contain the map images on this CD, which numbers the maps from 0 to 15. This coding is derived by adding an offset based on the contrast/match code to the reduction number:

    contrast/match   offset

    --               -1
    -+                3
    +-                7
    ++               11

so that e.g. +-3 is coded as 10 (offset 7 plus reduction number 3).


IV. The Full Design

IV.A. Assignment of Subjects to Conditions, Maps, etc.
Any subject served as a single Member (1, 2) of a single Pair (A, B) in a single Quad (1 - 8) in a single Eye-Contact Condition (eye contact, no eye contact). Each subject participated in 4 dialogues, 2 as Instruction Giver (using the same map each time), 2 as Instruction Follower (for obvious reasons using a different map each time). One dialogue conducted in each role was with a familiar partner and the other with an unfamiliar partner. Each Quad of subjects used a single Quartet of maps and made use of each map in 2 dialogues.

The table in the following section gives the full design for 64 Map Task Dialogues. The design was filled once by the 32 subjects in the Eye Contact condition and again, exactly the same way, by the 32 (different) subjects in the No Eye Contact condition. For example, in Quad 3, Conversation 1 the same map, subject assignment, and therefore familiarity, contrast etc were used for both the with and without eye contact conditions. Thus the maps for the dialogues called Quad 3, No Eye Contact, Conversation 1, Unfamiliar (q3nc1u) and Quad 3, Eye Contact, Conversation 1, Unfamiliar (q3ec1u) are the same.

IV.B. Design Used for Both Eye Contact Conditions

_______________________________________________________________________

                              LAYER ONE
_______________________________________________________________________

Subject Quadruple
 1  2  3  4
Map/Master     Conversation  Familiarity  Giver  Follower  Con-   Match
Reduction      Number                                      trast

 1  4  3  2         1             -        a1       b1      +       +
 2  1  4  3         2             -        b2       a2      +       -
 3  2  1  4         3             +        a2       a1      -       +
 4  3  2  1         4             +        b1       b2      -       -
 3  2  1  4         5             -        a2       b2      -       +
 4  3  2  1         6             -        b1       a1      -       -
 1  4  3  2         7             +        a1       a2      +       +
 2  1  4  3         8             +        b2       b1      +       -
_______________________________________________________________________

                              LAYER TWO
_______________________________________________________________________

Subject Quadruple
 5  6  7  8
Map/Master     Conversation  Familiarity  Giver  Follower  Con-   Match
Reduction      Number                                      trast

 1  4  3  2         1             +        a1       a2      +       +
 2  1  4  3         2             +        b2       b1      +       -
 3  2  1  4         3             -        a2       b2      -       +
 4  3  2  1         4             -        b1       a1      -       -
 3  2  1  4         5             +        a2       a1      -       +
 4  3  2  1         6             +        b1       b2      -       -
 1  4  3  2         7             -        a1       b1      +       +
 2  1  4  3         8             -        b2       a2      +       -
_______________________________________________________________________

Familiarity:   + a familiar pair
               - an unfamiliar pair

Reduction:     1 = possible t-deletion (east/west lake)
               2 = possible glottalisation (white/slate mountain)
               3 = possible d-deletion (diamond/gold mine)
               4 = possible nasal assimilation (green/crane bay)

Contrast:      + both members of pair present on Giver's map
               - only 1 member of pair present on Giver's map

Match:         + Follower's map matches contrast on Giver's
               - Follower's map mismatches contrast on Giver's


Table of Contents
___________________________________________________________________________