; File: logfile-experimental-procedures.doc., Updated 11/04/92
; Logfile-dry-run experimental procedures (wizard or experimenter instructions)
; [BBN]

OCT, 1992 ATIS experiments

Instructions to the Experimenter - PRIOR TO EXPERIMENTS
------------------------------------------------------

1) Given the schedule, confirm the time and location (i.e., your office) with the subject by phone or email 1-2 days prior to the appointment.

2) Check the equipment about 15 minutes before the experiment:
   - Make sure the headphones and microphone are operational.
   - Log in to the system as:
       login name: ATISDEV
       login password:
   - Enlarge the screen by clicking on the square button at the top right of the screen.

3) When the subject arrives in your office, lead them to the experiment room.

4) You should have the following material with you:
   - Subject instructions
   - Experimenter instructions to subject
   - Pad of paper and pencil(s)
   - Hardcopy of the practice scenario
   - Hardcopies of the 4 scenarios designated to that subject (based on the Latin Square, as specified in the "Proposal for End-to-End Logfile Test Evaluation" document)
   - An "EXPERIMENTS IN PROGRESS" sign to hang outside the experiment room
   - Hardcopy of the questionnaire
   - Note that the on-line questionnaire is kept in the directory: /d4m/spnl/logile-eval/ILogs/

The rest of the instructions to be followed are stated in the "Experimenter Instructions to Subject" document.

Experimenter Instructions to Subject

MADCOW

Introduction

ATIS (or Air Travel Information System) is a travel planning application that uses speech as input. We are studying how such systems are used by asking people such as yourself to use it for solving a set of problems in travel planning. The database used by the ATIS system is very similar to one that a travel agent might use to help you plan a trip. However, due to current limitations in the technology we have available, the version you will be using today is, in comparison, very limited.
Specifically, the system knows only the following information:

Cities and airports

ATIS will know about flights between the nine airports that are listed on your sheet. You can refer to an airport by city name (for example, PITTSBURGH), by the airport code (for example, ATL for Atlanta) or by the airport's name, if there is one (for example, LOGAN airport in Boston). The cities are as follows:

[read down the column of the table]

    City or Airport Name
    --------------------
    Atlanta
    Baltimore/Washington
    Boston
    Dallas/Fort Worth
    Denver
    Oakland (California)
    Philadelphia
    Pittsburgh
    San Francisco

The system also contains a limited amount of information about ground transportation between an airport and the city (or cities) it serves. For example, it can tell you if a limousine service is available and what it costs.

Airlines

The system knows about the nine airlines listed in the table below. You can refer to them by name, for example DELTA, or by abbreviation, for example AA.

[read down the column]

    Airline
    -------
    American
    Continental
    Delta
    Eastern
    Lufthansa
    Midway
    TWA
    United
    USAir

If you've been following the news, you will notice that some of the airlines we list no longer exist (the database was constructed in February 1990, when they did). For purposes of this session, please assume that they still exist and are still flying.

Information about a flight

The system has a great deal of information about each flight in its database. The information includes the following categories:

    Abbreviations (ATL = Atlanta, AA = American Airlines, etc.)
    Aircraft (names, seating capacity, etc.)
    Airfares (class of fare, restrictions, price, etc.)
    Airports (Logan Airport is in Boston, etc.)
    Classes of Service (one-way, coach, first class, etc.)
    Flight Numbers (USAir 123, etc.)
    Ground Transportation (between airport and downtown)
    Meals (served on the flight, e.g., breakfast, snack, etc.)
    Number of Stops (zero or more)
    Schedules (arrival, departure time, days of week, dates, etc.)
Summary

Remember that ATIS is a limited system and knows only the information noted above. If you ask it questions about other topics (for example, the weather), ATIS will not be able to answer them. In any case, you should limit your questions to ones that bear directly on the solution of your current problem.

Becoming familiar with the system

To acquaint you with the operation of the ATIS system, you will first complete a practice problem, which we will refer to as a "scenario". We will begin with an explanation of how to operate the system.

Speech input

The headphones and the attached microphone are located in front of you on the table. Adjust the headphones around your head so that they are comfortable. The microphone should be in front and to the side of your mouth, at an angle of about 45 degrees from the corner of your mouth. Make sure it is not directly in front of your mouth, or way off to the side. Also, it should not be allowed to drift up towards your nose, or down towards your chin. To set the distance between the microphone and your mouth, put two fingers between your mouth and the microphone.

The Display

The "How to Use the ATIS System" document provides you with a description of the ATIS screen and instructions on how to control the screen. During the practice session I will explain to you the details described in this document.

Now, the first thing you will do is to type your name by clicking on the USER: button. A window will appear in which you will type your name. Use the right/left arrow keys to advance/backspace the cursor, and the backspace key to delete characters. To close the window, click on the DONE bar at the top of that window. You enter your name only once and do not have to repeat that procedure again.

Hints for operating the system

The ATIS system is quite capable but it is nevertheless limited in its abilities.
To use it most effectively, you should remember the following advice:

   - When posing a query, try to speak naturally, neither too fast nor too slow.
   - Avoid using "computerese". The system has been trained on naturally spoken language and will respond best to such speech.
   - Short and to-the-point questions will generally work better than long, complex queries.
   - If the system doesn't appear to be understanding you, try rephrasing your question. If that doesn't work, there is a chance that the particular type of information you're after may lie outside the system's knowledge base; you should consider changing your line of questioning.

Practice scenario

We will now do a practice scenario. The purpose is to better acquaint you with the ATIS system and to give you an idea of the type of problem you will be solving. I will be able to help you with any problems you may have with the equipment or with understanding the display, but I will not be able to answer any questions about the problem itself. If you find you need to be reminded of anything, such as which cities ATIS knows about, please refer to your copy of the Subject Instructions and your crib sheet.

To start this scenario, click on the SCENARIO: button and choose P from the menu of scenarios. When you feel you have found the answer to the problem, stop and tell me. Then, click on the TYPE ANSWER button. A window will appear in which you will type your answer. Use the right/left arrow keys to advance/backspace the cursor, and the backspace key to delete characters. To close the window, click on the DONE bar at the top of that window.

[Place the scenario in the subject's view; do not initiate any interaction with the subject, wait for them to ask questions. Then ask if they understand the problem scenario. Explain the screen layout using the "How to Use the ATIS System" document. Explain the CANCEL and RESET buttons. Tell the subject about the query history display toward the end of the practice session.
Ask again if they have any questions. If not, let them start. If they are doing something clearly wrong (e.g. misusing the equipment), you may interrupt and correct them.]

The scenarios

We are now ready to start the main set of scenarios. Before we do, I would like to remind you of the following:

   - Begin each scenario by choosing your designated scenario from the menu of scenarios, by clicking on the SCENARIO: button.
   - Please work exclusively on finding a solution to the problem at hand. Your goal is to identify what you believe to be the correct solution to each problem that I give you.
   - If you encounter any difficulties in operating the system, please alert me immediately by calling Varda Shaked at x3753 or by notifying me in my office (room 15/150). Note that I will not be able to help you find a solution to the problem! Call only if the system malfunctions or if you feel you don't understand something about the procedure.
   - If you would like to rest or leave the room during the session, let me know when you've just completed a problem. (We will only be able to interrupt the session for at most five minutes at a time and only between problems.)
   - Please try to solve every problem. If after several reasonable attempts you feel you are unable to complete a problem, please click on the TYPE ANSWER button to open the answer window. Type NONE or the reason you were unable to complete the problem.
   - When you have completed a scenario, please mark the answer on the sheet provided. (The answer should be in the form of a specific flight or an itinerary.)

Do you have any last questions about the procedure?

Remember that we are interested in how well the system helps you solve a problem. You should at all times concentrate on problem solution. If you are interested in exploring the system's capabilities, we will be happy to arrange an additional session for this purpose.

Good luck!
[ Wait for the subject to correctly initiate the scenario, then leave the experimental enclosure. Do not return until the subject has terminated the scenario. Remove the just-completed scenario and replace it with the next one. Continue until all required scenarios have been completed. ]

Questionnaire administration

You have completed your assigned problem set. I would now like you to fill out the following questionnaire. Please let me know when you are done.

[Leave the enclosure; return when signalled by the subject.]

Session conclusion

[At this point, formal data collection is at an end and the experimenter should proceed to complete the session as outlined in the Experimenter Instructions: debriefing and site-specific administrative procedures.]

;
; [CMU]

Notes for the End-to-End experimenter
-------------------------------------

This file is ~data/ATISEval/documentation/experimenter_notes

It describes the mechanics of how to set up an experimental session. Before starting, you should have read the document entitled "Experimenter Instructions for End-to-End Evaluation" and understood what's in it.

Set up

The cubicle should be tidy, with no extraneous materials on the main table, where they may interfere with subject activities. There should be a pad of paper and a pencil for the subject to use, at the left of the keyboard. The crib sheet should be on the typing stand.

You should be logged in as 'data'. Open a couple of terminal windows and move them to the auxiliary screen. Do your setup work in these.
Applications to start

Three applications need to be running before you can start a session:

Recognition servers: (leave yourself ~5min for this part)

    telnet alpha1.speech.cs.cmu.edu        [login as speech]
    ps x                                   # if server is running, skip next two lines
    cd /moo/baz/atis/eval/live
    nohup server_male.csh >& male_log &
    exit

    telnet alpha3.speech.cs.cmu.edu        [login as speech]
    ps x                                   # if server is running, skip next two lines
    cd /moo/baz/atis/eval/live
    nohup server_female.csh >& female_log &
    exit

Sybase server:

    telnet moo
    ps axu | grep sybase                   # if it's running, skip next four lines
    su sybase
    cd ; cd scripts
    startserver                            # wait for status messages to end
    exit
    exit

ATISEval:

    cd ATISEval
    ATISEval

Setting up ATISEval for a subject

Use ~data/ATISEval/ATISEval (it should be in the dock). Enter the necessary information into the Subject Information panel (i.e., subject ID, session number ("1") and row id from the design (e.g., "1r1")). This panel always appears at application startup.

The application will not allow you to overwrite an existing dataset. If you really want to do this, you should delete the directory in question from the shell.

When you successfully register a session, the Scenario window will appear. Move it to the auxiliary screen, to the upper left corner. Make sure all of it shows on the screen.

Hide (Command-h) all applications except for ATISEval and the Window Manager. Do not iconize the File Viewer; if you do so, the speech input process will not function correctly (due to an o/s bug).

You are now ready to bring a subject into the cubicle.

Running the subject

Follow the procedure specified in the document "Subject Instructions for End-to-End Evaluation".

After the subject has left

Quit ATISEval. Check that the results directory contains reasonable-looking files (i.e., one .log and several .wav and .sro files per scenario, which are numbered 1-5). There should be a .squ file in the speaker directory. Make note of oddities and unusual occurrences in the experimental log book.
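
The check of the results directory described above could be partly automated. The following is a minimal sketch, not part of the original procedure. It assumes a hypothetical layout in which the speaker directory holds the .squ file plus one subdirectory per scenario named "1" through "5"; the actual ATISEval layout is not specified in these notes and may differ, and the function names are illustrative.

```python
from pathlib import Path


def check_scenario_dir(scenario_dir):
    """Return a list of problems found in one scenario directory."""
    files = [p for p in Path(scenario_dir).iterdir() if p.is_file()]
    counts = {ext: sum(1 for p in files if p.suffix == ext)
              for ext in (".log", ".wav", ".sro")}
    problems = []
    if counts[".log"] != 1:
        problems.append(f"expected exactly one .log file, found {counts['.log']}")
    if counts[".wav"] == 0:
        problems.append("no .wav files")
    if counts[".sro"] == 0:
        problems.append("no .sro files")
    return problems


def check_speaker_dir(speaker_dir, scenarios=range(1, 6)):
    """Map each problem location to its problems; an empty dict means it looks fine.

    ASSUMPTION: scenario data lives in subdirectories named 1..5 (hypothetical).
    """
    speaker_dir = Path(speaker_dir)
    report = {}
    # The notes say a .squ file should be present in the speaker directory.
    if not any(p.suffix == ".squ" for p in speaker_dir.iterdir() if p.is_file()):
        report[str(speaker_dir)] = ["no .squ file"]
    for n in scenarios:
        scenario_dir = speaker_dir / str(n)
        if not scenario_dir.is_dir():
            report[str(scenario_dir)] = ["missing scenario directory"]
            continue
        problems = check_scenario_dir(scenario_dir)
        if problems:
            report[str(scenario_dir)] = problems
    return report
```

A script like this only flags missing files. It does not replace looking at the files themselves, and any oddities should still be noted in the experimental log book.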
Update the paperwork for the just-completed subject.

After the session

Start transcribing the .wav files and edit the dummy .sro files in the session directories. Back up the subject's data to an od.

;
; [MIT]
%%NOTE: Original was in latex format and included an additional diagram.
%%

Instructions for using the ZDC Data Collection System

1 System Overview

First, here is a brief layout of the topology of the system. The boxes in the diagram represent the processes, and the single-letter labels represent the physical machines that they should run on. In theory, one could run the entire system on one machine, or each process on a separate machine. This is the standard MIT data collection configuration. Machine C must have the valley boards in it; machine B must have an Ariel board with a ProPort attached for data capture. Since the remote search and the backend run large programs, it is highly recommended that they be mapped to separate, fast machines that have enough physical memory (at least 32 MB).

2 Starting the System

To start up the system, follow the steps below in order. The % denotes the unix shell prompt. Any text after this should be typed in as specified, with a carriage return at the end of the line.

Note: If you have trouble starting the system as directed, contact one of the following people for help:

    David Goodine (dmg@goldilocks.lcs.mit.edu)
    Joe Polifroni (joe@goldilocks.lcs.mit.edu)

1. Log in to the WIZARD and USER machines as SLSDEMO.

2. Start up X Windows:

    % x11

3. On the WIZARD machine, get an xterm (or your favorite unix shell program).

4. Go to the Wizard directory (symbolically linked under "~slsdemo"):

    % cd wizard

5. Start the Zdc system program:

    % start_zdc

   (This script can be edited to change the physical machines used by each of the pseudo machines above.) The shell you have just used to start the wizard should not be used for anything else (and shouldn't be run in the background), since quite a bit of debugging information will be output here.
   You may want to log the output to a file for debugging purposes.

Once the wizard window appears (possibly as an icon), open the window and continue:

6. Click on the Speaker Database button to select a speaker. For random speakers (i.e. demos, etc.), you should use Joe Random User (initials are "JRU"), which you can find by typing "JRU" on the line under the list of speakers and hitting a carriage return. If the speaker is not yet in the database, click on the New Speaker button to add this speaker. Then, click on the Hide This Window button to go back to the WIZARD window. (Bug Note: If you add a new speaker to the database, you must click on Reload Speaker Info before you can proceed; otherwise the speaker info is not internalized correctly. This is a bug that will be fixed eventually.)

7. Click on each of the start buttons in the middle of the wizard window. This will start up all of the pseudo machines (USER, RECOGNIZER, REMOTE-SEARCH, BACKEND). If a particular start button is not illuminated (greyed out), go to the System Parameters section below.

8. Several small Xterm windows will appear, which control each of the subprograms running on the pseudo machines. Each of the programs illustrated above will be started. In general, it can be difficult to determine whether a particular program is running and whether it is wedged or crashed. This is complicated by several important-sounding but irrelevant error messages that are printed out. In general, if a window has "segmentation fault", "bus error" and/or "core dumped" printed out, you've lost it. How to proceed at this early point in the system's development is somewhat sticky and cannot be documented here.

9. When each of the subprograms is successfully booted, the wizard window will enable selection of various relevant options. (This is how you know you're up.) These options are on/off settings labeled by component (NL, Backend, etc.).
Here are the descriptions of the settings (ON is generally displayed as a darker grey (3D depressed look) than OFF). Canonical configurations for different modes (e.g. DEMO vs. Wizardless Data Collection vs. Wizard Mode) are listed below:

SAVE UTTERANCES
    Whether the waveforms should be saved for each utterance. If this option is not selected, playback by the wizard is disabled. If you just want to give a system demo, you can safely turn off this option.

ORDERED SCENARIOS
    Decides whether the scenario number should be incremented automatically at the end of each scenario. This option is ignored when demo mode is selected.

USE RECOGNIZER
    If this is ON, the utterance will be sent to the recognizer. If the utterance is invalid (too soft, too short, etc.), it will be rejected before this stage by the recording module.

ROBUST PARSER
    Determines whether the robust parser will be used if a parsable sentence is not generated by the system.

USE BACKEND
    Determines whether the backend should be called with the best hypothesis. In general, this should always be ON; the OFF position is only useful for debugging the Nbest recognizer outputs.

TRANSCRIPTION MODE
    If this is ON, the wizard will be required to enter a transcription for each utterance processed by the system. If the Use Recognizer option is selected, this transcription is used simply as a reference (in the .log and .sro files) and does not affect the flow of the system (although the next utterance cannot be recorded until the transcription of the previous utterance has been entered). If the Use Recognizer option is not selected, this transcription will be passed to the backend as the best hypothesis.

DEBUG MODE
    If this is selected, ghastly amounts of debugging information will be printed out by each of the subcomponents of the system. Don't bother unless you're sure you need to use it.

DEMO MODE
    This button puts the system in demo mode, whereby the session is not scenario oriented.
    This should be selected before any of the subprocesses are started, but MUST be chosen before starting the session, if demo mode is desired.

Horizontal Slider in lower right corner
    This controls the playback volume for the wizard playback.

10. Here are the canonical configurations you may want to use ("X" means this option is ignored in this configuration):

    Option               Demo Mode   Recognizer Mode   Wizard Mode
    ------------------   ---------   ---------------   -----------
    SAVE UTTERANCES      MAYBE       ON                ON
    ORDERED SCENARIOS    X           ON                ON
    USE RECOGNIZER       ON          ON                OFF
    ROBUST PARSER        MAYBE       MAYBE             MAYBE
    USE BACKEND          ON          ON                ON
    TRANSCRIPTION MODE   OFF         ON                ON
    DEBUG MODE           MAYBE       MAYBE             MAYBE
    DEMO MODE            ON          OFF               OFF

11. For each of the scenarios (or just once in demo mode), you will have to enable the user by clicking on the Start button under the User Interface options in the wizard window. You can abort the session by clicking on the End Session button.

12. To shut down the system, you should do the following:

    o be sure there is no current active scenario
    o click on the big Exit button, and wait a few seconds for things to die gracefully
    o control-C (kill) the xterm windows
    o control-Z in the Zdcbackend and Zoracle Xterm windows

    The system should then be completely down.
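
    Confirming that nothing is left running on a given machine can be scripted. The following is a minimal sketch that filters the process listing of the local machine only. The program names in it are hypothetical placeholders (the notes above name only the Zdcbackend and Zoracle windows), so substitute whatever your installation actually runs, and repeat the check on each physical machine.

```python
import subprocess

# ASSUMPTION: hypothetical process names; the real ones depend on the local installation.
ZDC_PROGRAMS = ("wizard", "zdcbackend", "zoracle")


def leftover_processes(ps_output, names=ZDC_PROGRAMS):
    """Return the lines of ps output that mention any of the given program names."""
    lines = ps_output.splitlines()[1:]  # skip the header row
    return [line for line in lines
            if any(name.lower() in line.lower() for name in names)]


def check_local_machine():
    """Run ps on this machine and list matching programs that are still alive."""
    result = subprocess.run(["ps", "ax"], capture_output=True, text=True, check=True)
    return leftover_processes(result.stdout)
```

    An empty list from check_local_machine() suggests that none of the named programs are still running on that machine. Note that the match is a plain substring test, so an unrelated process whose command line happens to contain one of the names would also be reported.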
    You can check by using the unix "ps" program to search for the appropriate programs on each of the physical machines.

3 The System Parameters Window

You can use the system parameters window to specify which physical machines each of the pseudo machines above will run on. They are usually supplied on the command line of the wizard program. Call the wizard program (in the "~slsdemo/wizard" directory) with the -h argument to find out the options. You can also specify the N for the Nbest outputs and some other things. To get this window, click on the System Parameters button on the right side of the wizard window.

4 COMMON LOSSAGE MODES

o If the A/D/A device driver does not start up correctly (USER machine), with an error like "ARIO: Cannot open A/D", there is probably a user program still running that has control of the Ariel board. Kill that program and try again.

o If any of the programs (specifically the User, Rec and Remote-Search) get an error number 48 trying to bind an incoming data socket, the socket is already allocated to an older (or other) instance of the program it's trying to connect to. Kill that program and try again.

o Due to a bug in the operating system beyond our control, we have installed a kludge that allows the wizard to kick the system if it gets wedged. In the top right-hand corner of the wizard program, there is a button that will blink when the system is running normally. If this blinking stops for some time (more than five seconds), then the bug has occurred. If the system seems stuck, check this first. If this is the case, you can jump-start the system by pressing the button to the left. This should start the button blinking again. As the occurrence of this bug is directly related to the load on the system, you should make sure there aren't any unnecessary jobs running on the wizard machine.

o Due to other brain-damage in unix, there is a possible bug (which I may have fixed) that could crop up.
  If this happens, please contact Dave Goodine at the above address immediately. The symptom is that the user utterance will not be sent to the recognizer for processing. You can detect this by looking at the Xterm window for the User program (top left little window) and seeing if it recently failed to connect to the recognizer. If this is the case, please do not stop the system, but report the bug immediately (in person if possible).

o Any other peculiarities should be noted. Specifically, in the case of fatal errors, try to preserve the state of the system or record as much information as possible.

;
; [SRI]

Procedure:

When the subject arrived, the experimenter took the subject into the data collection room, seated her or him in the chair at the desk, and explained that the experiment involved talking to the Air Travel Information System. The subject was then asked to read and sign a consent form. The experimenter sat in a chair next to the subject and went through the instructions as described in the document "Subject Instructions for End-to-End Evaluation." The experimenter pointed out the particular site-specific features of the SRI SLS, helped the subject solve the practice scenario, and answered any questions the subject had along the way.

When both the experimenter and the subject were convinced that the subject was ready to start the real scenarios, the experimenter handed the subject the first scenario, in hard-copy form. The subject kept this scenario on the desk in front of the console during the session. The experimenter waited in the room while the subject read through the scenario, to make sure the subject understood what was being asked for. The experimenter answered any questions about using the microphone or system, but did not answer questions about how to speak or how to solve the scenario.
The experimenter left the subject alone in the room to solve the scenario, and went to the adjacent room, where the experimenter could listen in via a cable connected through the wall to the Sennheiser microphone. The experimenter also tape-recorded each session using this cable. The subject was instructed to write the answer to the scenario on the hard-copy form containing the scenario, and then to let the experimenter know s/he was finished by speaking into the Sennheiser microphone (but not querying the system). The experimenter monitored for this statement from the next room, and went into the subject's room to present the subject with the next scenario each time the subject finished a scenario. If more than 15 minutes went by for any particular scenario, the experimenter went in to stop the subject, and then gave the subject the next scenario. Subjects were told they could take breaks between sessions, but none did so.

When all 4 scenarios were completed, the experimenter gave the subject the questionnaire to fill out. When the subject finished filling out the questionnaire, the experimenter thanked the subject profusely for participating and gave the subject $14 worth of gift certificates to a local bookstore. The experimenter debriefed the subject, either at the time of the experiment or in an e-mail message, as follows:

What you just took part in is part of an experiment being run at several sites across the country that have developed Air Travel Information Systems. Each site is using the same scenarios and instructions, but of course different systems and speakers. Your interaction with the system will be used both to evaluate our system and to improve it. We recorded everything you said to the system and everything the system did in response, including some information on timing. These data will be used to train the system in order to improve it. Your data will be combined with that of others during analysis and your contribution will remain anonymous.
Thanks for your efforts! Do you have any questions? ;