The spreadsheet comm01_data_v7.xls is the main distribution of
the 2001 data.  For each call or survey it includes all the associated
	* user questionnaire responses,
	* logfile metrics,
	* participant profile information, and
	* other information about the call.

The spreadsheet was created with Excel 2000 for Windows.
 
We have processed all logfiles received on or before November 12th,
and their results are included here.  Where a logfile name disagreed
with the call ID (typically because of a typo in the logfile name),
we treated the automatically generated call ID as authoritative and
attempted to repair the incorrect names.
 
In addition, we have attached ten Unix text files that describe
the data.  They are:
 
survey_list.txt		a list of the user questionnaire items
                        in the spreadsheet

metric_list.txt		a list of the logfile metrics in the
			spreadsheet

profile_list.txt	a list of profile data in the spreadsheet

call_info.txt		a list of other information about the call

bad_logfiles.txt	a list of call IDs for logfiles
                        that could not be processed due to
                        syntactic or semantic errors.  These
                        typically represent calls made while
                        systems were not functioning as
                        intended.  No metrics are available
                        for these calls.
 
pin_list.txt		explanations of how PINs were generated
                        and of how they can be validated
                        (Note: PINs 10022 through 10979
                               and 60003 through 62529 were
                               assigned to paid participants)

name_list.txt		a list of participant aliases/pseudonyms
 
scenario_list.txt	a list of the scenarios used in the study

code_list.txt		a list of the codes in the spreadsheet

what_was_included.txt	explanations of what was included in the
			distribution

=========================================================
What we aimed for:
=========================================================
Number of short subjects per site = 20
Number of long subjects per site = 16

Number of calls required for each short subject = 4
Number of calls required for each long subject = 10

Total calls from short subjects per site = 80 
Total calls from long subjects per site = 160

Total calls (from long+short subjects) per site = 240

Number of sites = 8
Total calls for all sites = 1920
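
The target figures above follow from simple multiplication.  As a
minimal sketch (the variable names are illustrative and are not part
of the distribution), the arithmetic can be reproduced in Python:

```python
# Design targets as stated above; the names are illustrative only.
SHORT_SUBJECTS_PER_SITE = 20
LONG_SUBJECTS_PER_SITE = 16
CALLS_PER_SHORT_SUBJECT = 4
CALLS_PER_LONG_SUBJECT = 10
NUM_SITES = 8

# Calls expected from each subject group at a single site.
short_calls_per_site = SHORT_SUBJECTS_PER_SITE * CALLS_PER_SHORT_SUBJECT
long_calls_per_site = LONG_SUBJECTS_PER_SITE * CALLS_PER_LONG_SUBJECT

# Per-site and overall targets.
calls_per_site = short_calls_per_site + long_calls_per_site
total_calls = calls_per_site * NUM_SITES

print(short_calls_per_site, long_calls_per_site, calls_per_site, total_calls)
```

The per-site target of 240 calls is the denominator used in the
percentages reported in the tables that follow.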

=========================================================
What we actually got:
=========================================================
Subject distribution by site:
Site    Short   Long
01      21      17
02      21      16
03      21      16
04      20      16
05      21      15*
06      20      16
08      19*     16
09      19*     16
* Some subjects dropped out at the last minute, thus leaving us
little time to replace them and little time for new subjects to
complete the experiment.

Total surveys minus bad surveys (duplicated, contaminated, bad subjects)
= 1351

Total surveys with matching call notifications
= 1302 (1302/1920 = 67.8%)

Distribution of 1302 surveys with matching calls by site:
Site    Short   Long    Total   
01      58      118     176     (176/240 = 73.3%)
02      43      110     153     (153/240 = 63.8%)
03      47      97      144     (144/240 = 60.0%)
04      50      119     169     (169/240 = 70.4%)
05      38      133     171     (171/240 = 71.3%)
06      60      117     177     (177/240 = 73.8%)
08      42      105     147     (147/240 = 61.3%)
09      41      124     165     (165/240 = 68.8%)
These numbers include repeated and extra calls.
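
The totals and percentages in the table above can be rechecked with
a short script.  This is only a sketch: the counts are copied by hand
from the table, the helper name percent_half_up is hypothetical, and
the rounding rule (halves round up, e.g. 171/240 = 71.25% shown as
71.3%) is inferred from the printed figures.  Integer arithmetic is
used because Python's default float formatting rounds halves to even.

```python
TARGET_PER_SITE = 240  # target calls per site, from the design figures

# (short, long) survey counts per site, copied from the table above.
matched = {
    "01": (58, 118),
    "02": (43, 110),
    "03": (47, 97),
    "04": (50, 119),
    "05": (38, 133),
    "06": (60, 117),
    "08": (42, 105),
    "09": (41, 124),
}


def percent_half_up(count, target):
    """Return count/target as a one-decimal percentage string, rounding halves up."""
    tenths = (count * 1000 + target // 2) // target
    return f"{tenths // 10}.{tenths % 10}"


# Per-site total and percentage of the per-site target.
rows = {}
for site, (short, long_) in matched.items():
    total = short + long_
    rows[site] = (total, percent_half_up(total, TARGET_PER_SITE))

grand_total = sum(total for total, _ in rows.values())
overall_target = TARGET_PER_SITE * len(matched)

for site, (total, pct) in rows.items():
    print(f"{site}  {total}  ({total}/{TARGET_PER_SITE} = {pct}%)")
print(f"all  {grand_total}  ({grand_total}/{overall_target} = "
      f"{percent_half_up(grand_total, overall_target)}%)")
```

The per-site totals sum to 1302, matching the overall figure of
1302/1920 = 67.8% given earlier.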


Distribution excluding repeated and extra calls:
Site    Short   Long    Total   
01      54      117     171     (171/240 = 71.3%)
02      39      109     148     (148/240 = 61.7%)
03      45       97     142     (142/240 = 59.2%)
04      50      119     169     (169/240 = 70.4%)
05      37      127     164     (164/240 = 68.3%)
06      56      115     171     (171/240 = 71.3%)
08      42      103     145     (145/240 = 60.4%)
09      40      123     163     (163/240 = 67.9%)
In the latest release of the spreadsheet (v2), we have
updated the code column to indicate whether a call is
repeated, extra, out-of-order, etc.  We made a pass
through all the surveys but, because of time constraints,
did not go over them with a fine-tooth comb.
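
The number of repeated and extra calls per site is not listed
directly, but it can be derived by subtracting the second table's
totals from the first.  A minimal sketch, with the per-site totals
copied by hand from the two tables and with illustrative variable
names:

```python
# Per-site totals including repeated and extra calls (first table).
with_repeats = {
    "01": 176, "02": 153, "03": 144, "04": 169,
    "05": 171, "06": 177, "08": 147, "09": 165,
}

# Per-site totals excluding repeated and extra calls (second table).
without_repeats = {
    "01": 171, "02": 148, "03": 142, "04": 169,
    "05": 164, "06": 171, "08": 145, "09": 163,
}

# Calls dropped at each site when repeated and extra calls are excluded.
removed = {site: with_repeats[site] - without_repeats[site]
           for site in with_repeats}

total_removed = sum(removed.values())
total_remaining = sum(without_repeats.values())

for site, count in removed.items():
    print(f"{site}  {count}")
print(total_removed, total_remaining)
```

With the figures above, 29 surveys are flagged as repeated or extra
across all sites, leaving 1273 of the 1302.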