This README.TXT file contains information about all files of the 200 adult speakers who were recorded for the Arko Urdu project. The recordings were carried out on behalf of the US Army Research Laboratory in 2006.

In each session one speaker was presented with 400 prompts to read: sentences, place names, and person names. Two microphones, set at different distances from the speaker, were used for the recordings. For the two speech files of an item there is one SAM label file with information about the speech files. The label file has the extension URO; the speech files for the two microphones have the extensions UR0 and UR1.

The files of one speaker are located in directories named SESxxx, where xxx represents the number of the recorded session. Ten sessions were put into one block; these blocks are directories named BLOCKxx, where xx starts with zero (00). The blocks are located in the root directory \ADULT1UR.

The speech files are separated from the SAM label files. The volume names are ADULT1UR000-ADULT1UR010 for the speech files and ADULT1URD00 for the documentation files. Both kinds of volume have the same directory structure.

Description of the files of the database:

Extension DOC denotes Microsoft Word for Windows files.
Extension PS denotes PostScript files.
Extension TXT denotes text files.
Extension PDF denotes Adobe Portable Document Format files.

Contents of root directory:

    README.TXT      database description file as plain text (this file)
    DISK.ID         volume name
    COPYRIGH.TXT    copyright statement
    ADULT1UR000     directory
    ADULT1URD00     directory

Contents of directory ADULT1URD00:

    DOC             directory
    TABLE           directory
    INDEX           directory
    BLOCK           directory

Contents of directory ADULT1UR\DOC:

    DESIGN.DOC      free-text information about the database
                    (created in Tahoma font)
    SAMPALEX.PS     table of SAMPA symbols used for the phoneme notation
                    in TABLE\LEXICON.TBL
    SUMMAR0.TXT     description of all recording sessions (channel 0),
                    using the following mnemonics:
                      DIR     full directory path of the session
                      SES     session number
                      CCD2N   2 strings with N corpus codes, where N is the
                              total number of items; the 2 strings are
                              separated by a space
                      RED     recording date of first item
                      RET     recording time of first item

Contents of directory ADULT1UR\TABLE:

    LEXICON.TBL     lexicon file: alphabetically ordered table of the
                    distinct lexical items that occur in the corpus, with
                    the corresponding pronunciation information, the
                    ranking of alternative pronunciations, and the
                    frequencies of occurrence
    REC_COND.TBL    table with information about the conditions of all
                    recording sessions, using the following mnemonics:
                      SES     session number
                      MIP     microphone positions
                      MIT     microphone types
                      SCC     scenario code
    SESSION.TBL     table with information about each recording session
                    (speaker, recording environment, recording time and
                    date), using the following mnemonics:
                      SES     session number
                      SCD     unique speaker code
                      REP     recording place
                      RED     recording date
                      RET     recording time
    SPEAKER.TBL     table with information about each speaker (code,
                    gender, age, accent), using the following mnemonics:
                      SCD     unique speaker code
                      SEX     speaker gender
                      AGE     speaker age
                      ACC     speaker accent

Contents of directory ADULT1UR\INDEX:

    CONTENT0.LST    transcription of each recorded utterance
                    (encoding = UTF8), with information about the speaker
                    and the environment, using the following mnemonics:
                      DIR     directory
                      SRC     speech signal file name
                      CCD     corpus code
                      SCD     speaker code
                      SEX     speaker gender
                      AGE     speaker age
                      ACC     speaker accent
                      SCC     scenario code
                      LBO     speech transcription without the numerical
                              data

Contents of directory ADULT1UR\BLOCKxx (xx is a number from 00 to 20):

    SESxxx          directory for each recording session; the name
                    combines the block number with a session number
                    from 0 to 9

Contents of directory ADULT1UR\BLOCKxx\SESxxx:

    SA.URO          SAM label file of one item, identified by its corpus
                    code
    SA.UR0          signal file of the same item, channel 0
    SA.UR1          signal file of the same item, channel 1
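
Example (illustrative only, not part of the database tools): the directory layout and the three file extensions described above can be sketched in a few lines of Python. The function names session_dir and file_role are hypothetical. The sketch assumes that session numbers start at 0, that the block number is the session number divided by 10, and that the numbers in BLOCKxx and SESxxx are zero-padded to two and three digits. None of these assumptions has been checked against the actual volumes.

```python
from pathlib import PureWindowsPath

# "10 sessions were put into one block" (from this README).
SESSIONS_PER_BLOCK = 10

# Role of each file extension, as listed in this README.
FILE_ROLES = {
    "URO": "SAM label file",
    "UR0": "speech signal, channel 0",
    "UR1": "speech signal, channel 1",
}


def session_dir(session: int, root: str = r"\ADULT1UR") -> PureWindowsPath:
    """Return the assumed directory of a session.

    Assumption: sessions are numbered from 0 and block = session // 10.
    """
    if session < 0:
        raise ValueError("session number must be non-negative")
    block = session // SESSIONS_PER_BLOCK
    return PureWindowsPath(root) / f"BLOCK{block:02d}" / f"SES{session:03d}"


def file_role(name: str) -> str:
    """Classify a file by its extension; unlisted extensions are 'unknown'."""
    ext = PureWindowsPath(name).suffix.lstrip(".").upper()
    return FILE_ROLES.get(ext, "unknown")
```

Under these assumptions, session_dir(37) yields \ADULT1UR\BLOCK03\SES037, and file_role applied to a name ending in .UR1 reports the channel 1 signal file. If the real numbering starts at 1, only the block computation needs to change.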