NAME OF THIS FILE: actual.txt

SUSAS
Speech Under Simulated and Actual Stress

Robust Speech Processing Laboratory
http://www.ee.duke.edu/Research/Speech/

October 23, 1997 (Release Rev. 1.4)
January 15, 1997 (Release Rev. 1.0)

Author: John H.L. Hansen
        Robust Speech Processing Laboratory
        Duke University, Dept. of Electrical Engineering
        Durham, North Carolina 27708-0291
        email: jhlh@ee.duke.edu

-----------------------------------------------------------------------------
Changes:

Release Rev. 1.4
 [1] includes label files for all "simulated" stress data, and for the
     neutral and computer task stress portions of the "actual" data
     (label files are provided for Scream Machine (/scream) and
     Free Fall (/freefall) speech; however, there may be some phone
     level shifts due to background noise: they are provided "as is")

Release Rev. 1.1
 [1] replaced "simulated" data with original 8kHz speech data.
     Rev1.0 contained "simulated" speech with the silence at the
     beginning and end of each word removed. Since some files were
     cut (stops), we replaced the entire "simulated" portion with the
     original data. This means that if you obtained label files with
     Rev1.0, you must re-run with Rev1.1 (or use the label files
     provided).
 [2] for the "actual" stress portion, tokens for the word "break" were
     sometimes named "brake". All of these actual portion entries have
     been re-named "break".
 [3] for the "actual" stress portion, a number of out-of-vocabulary
     words were produced (examples include pilot, may-day, help-me,
     power-dive, etc). Since some recognition scripts process all
     entries in a directory, out-of-vocabulary words have been moved
     to new directories (for example, sm_oov_all contains the actual
     stress words from /scream which are out-of-vocabulary, as well as
     the large "All" files which contain the entire roller-coaster
     ride speaker run).

Release Rev. 1.0
     Original SUSAS data; includes fixes for missing files from the
     "simulated" portion

-----------------------------------------------------------------------------
susas/actual/ directory contents:

Contents:
   7 speakers in roller coaster and free fall actual stress
   4 speakers in a USA Apache helicopter cockpit

Speakers: (Roller Coaster and Free Fall)
---------
 m1, m2, m4:    Male General USA Accent
 m3:            Male Southern USA Accent
 f1, f2, f3:    Female General USA Accent

Stress Conditions: (Roller Coaster & Free Fall)
------------------
 neutral        Neutral Speech
 medst          low Dual-Tracking task stress
 hist           high Dual-Tracking task stress
 freefall       Free Fall Amusement Park ride stress
 scream         Scream Machine: Roller Coaster stress

 neut_oov_all   each of these directories contains any out-of-vocabulary
 meds_oov_all   words, "Silence" files extracted from the original
 hist_oov_all   digitizing process, or "All" files which contain
 ff_oov_all     a full roller-coaster speaker run.
 sm_oov_all

Speakers: (USA Apache helicopter cockpit)
---------
 rs, dt         Male Southern USA (NC) Accent
                (normal flight conditions)
 dw, rf         Male Southern USA (NC) Accent
                (out of fuel stress conditions)

 directories:
   helicopt/rs ------ contains speaker rs
   helicopt/dt ------ contains speaker dt
   helicopt/fuelout -- contains speakers dw,rf (not 35 word vocabulary)

Stress Conditions: (USA Apache helicopter cockpit)
------------------
 medst          Warm-up: Helicopter on the ground but running
 hist           Flight: Pilots flying the helicopter while speaking

Special Comments:
-----------------
All of these files are in binary short format (16-bit integer)
(same as short in C++).

All of these files have had excessive leading and trailing silence
removed via endpoint detection.

All files are sampled at 8kHz.

Files of the format All* contain the continuous speech version of the
utterances (in other words, the whole thing). They are found for the
/scream and /freefall speech.

Files of the format All* in rs and dt contain the continuous speech
version of the speaker utterances for the warm-up and flight tasks.

Files of the format Silence* contain representative silence sections
for each speaker and stress condition.

The original recordings were mostly made on a battery-powered tape
deck. Under neutral conditions, some recordings were made using AC
power, resulting in the presence of a small 60Hz tone.

For some speakers, silence files are provided that are representative
samples of the background noise between words for that run. This is
the case for speaker m4 in the neutral and medst conditions, where
there are two such silence files. The files are Silence?. The neutral
recording here was made with battery power first, and then with AC
power (hence the hum in Silence2).

For APACHE speech:
 -Speaker rs is a Male in his late 20's with a general American
  speech pattern. The speech of Speaker rs is much cleaner
  (speech waveform is more distinct than for speaker dt).

 -Speaker dt is a Male in his late 20's with a North Carolina
  accent. The speech of Speaker dt is much noisier
  (speech waveform is noisier than for speaker rs).

 -The "medst" directory is in an Apache helicopter on the ground
  during "warm-up" with background ground controller over-talk.

 -The "hist" directory is a moderate stress condition where the
  Apache helicopter is now in "flight" with the pilots performing
  normal maneuvers such as hover, turns, etc. while speaking.

 -The directories:
      hist_oov_all
      meds_oov_all
  each contain the file "All1", which is a version of the binary
  input file that includes all 35 of the SUSAS words (two tokens
  each).
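
Example (not part of the original release): the Special Comments above
describe the data as headerless 16-bit integer samples at 8kHz. A
minimal Python sketch for loading such a file might look like the
following. The function names are hypothetical, and the byte order is
an assumption, since this file does not state it; if the samples look
like noise, try the other byte order.

```python
import array
import sys

SAMPLE_RATE_HZ = 8000  # sampling rate stated in this file


def samples_from_bytes(data, byte_order="little"):
    """Decode raw bytes as signed 16-bit integer samples.

    byte_order ("little" or "big") is an ASSUMPTION: the SUSAS
    documentation in this file does not state the byte order.
    """
    if byte_order not in ("little", "big"):
        raise ValueError("byte_order must be 'little' or 'big'")
    if len(data) % 2 != 0:
        raise ValueError("length is not a multiple of 2 bytes")
    samples = array.array("h")  # "h" = signed short
    if samples.itemsize != 2:
        raise RuntimeError("signed short is not 16 bits on this platform")
    samples.frombytes(data)
    # array decodes in the machine's native order; swap if the
    # requested order differs.
    if byte_order != sys.byteorder:
        samples.byteswap()
    return samples


def read_raw_file(path, byte_order="little"):
    """Read one headerless speech file into an array of samples."""
    with open(path, "rb") as f:
        return samples_from_bytes(f.read(), byte_order)


def duration_seconds(samples):
    """Length of the recording in seconds at 8kHz."""
    return len(samples) / SAMPLE_RATE_HZ
```

For instance, read_raw_file(path) followed by duration_seconds(...)
gives the length of a token or of one of the "All" files, where path
is whatever location the data was unpacked to.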