Levantine Arabic QT Training Data Set 3 Transcripts (LDC2005T02)

This corpus provides the transcripts for the corresponding speech corpus
(LDC2005S07) from LDC. This training release contains 322 conversations,
totaling just over 50 hours of Levantine Arabic speech.

In this directory (docs), we include the following documents:

1) filelist
   A list of conversation IDs, each with the prefix 'fsa_'.

2) wordlist.LA-TD3.utf8.txt
   Wordlist and mapping table.

3) speaker_info.txt
   Speaker information on origin, gender, age group, etc., as judged by
   the annotators who transcribed the conversations.

Unlike the previous training data corpora (Set 1 and Set 2), which consist
almost entirely of Jordanian speakers, this corpus is mostly Lebanese (72%),
with the remainder a mix of other Levantine speakers.

Directory structure

annotation - 322 transcription files in UTF-8 format.
docs       - documentation.

Note: The audio (in SPHERE format) is released in a separate package
(LDC2005S07).

For more information, please contact:

Mohamed Maamouri    maamouri@ldc.upenn.edu
Timbuck Water       timbuck2@ldc.upenn.edu
Hubert Jin          hubertj@ldc.upenn.edu