Our rationale for what should be included in or excluded from the analysis is based on what the experimental design calls for, but we (NIST) have included everything in the data distribution except for the obviously bad data (see items 4, 5, and 6 below).

1. Extra calls - calls where a subject had completed all the required calls but for whatever reason made additional calls. (6 subjects; 7 extra calls)
==> The extra calls could be excluded from the analysis, but we've included them in the data distribution.

2. Repeated calls - calls where a subject filled out a questionnaire for every retry (of the same itinerary) he/she made. We checked all the hypothetical scenarios and randomly checked some real trips. (7 subjects; 8 retried calls with surveys)
==> We suggest including only the last retry in the analysis, but we have included everything in the data distribution.

3. Out-of-order calls - calls where a subject did not follow the prescribed order. This applies only to long subjects, who have a prescribed order of calls to make: hypothetical scenarios 1-4 first, four real trips next, and hypothetical scenarios 5-6 last. (6 subjects: five subjects finished scenarios 1-5 first and then realized they hadn't followed the required order, so after they finished the four real trips, they redid scenario 5 and then resumed with scenario 6; one subject did scenario 4 first, probably having punched in the wrong scenario ID when he called in to get the scenario.)
==> We suggest including the first calls unless there are consecutive retries, in which case we suggest including the last consecutive retry. However, we've included everything in the data distribution.

4. Calls that are "contaminated" - calls where a subject called a system that he/she wasn't assigned to. (3 subjects; 8 contaminated calls)
==> Include only the calls before the subject was "contaminated".
All calls after contamination are excluded from both the analysis and the data distribution (note that sites could not provide transcripts of the contaminated calls).

5. Duplicated surveys - cases where a subject submitted more than one survey with the same responses for a single call. (3 subjects; 4 calls with duplicated surveys)
==> We currently exclude the duplicate copies of surveys from the analysis and the data distribution. These are true duplicates and were hand-verified.

6. Bad subject calls - calls from a subject who does not fit the frequent flyer profile. (1 subject; 3 bad calls where the subject requested an airline that no longer exists; the subject was "fired" at the request of the eval committee)
==> We've excluded this subject's calls from the analysis and the data distribution.

When logfile filenames do not agree with the call notifications, we (NIST) name the calls by the dates and times given in the call notification.

Please let me know what you think.

Thanks.
-Audrey