Levantine Arabic QT Training Data Set 4 (Speech + Transcripts)
|Item Name:||Levantine Arabic QT Training Data Set 4 (Speech + Transcripts)|
|Author(s):||Mohamed Maamouri, Tim Buckwalter, Hubert Jin|
|LDC Catalog No.:||LDC2005S14|
|Release Date:||June 15, 2005|
|Data Source(s):||telephone conversations|
|Language(s):||North Levantine Arabic, South Levantine Arabic|
|Language ID(s):||apc, ajp|
|License(s):||LDC User Agreement for Non-Members|
|Online Documentation:||LDC2005S14 Documents|
|Licensing Instructions:||Subscription & Standard Members, and Non-Members|
|Citation:||Maamouri, Mohamed, Tim Buckwalter, and Hubert Jin. Levantine Arabic QT Training Data Set 4 (Speech + Transcripts) LDC2005S14. Web Download. Philadelphia: Linguistic Data Consortium, 2005.|
This file contains documentation on the Levantine Arabic QT Training Data Set 4 (Speech + Transcripts), Linguistic Data Consortium (LDC) catalog number LDC2005S14 and ISBN 1-58563-342-9.
This release contains 901 calls totaling 133.6 hours of telephone conversation in Levantine Arabic. Both audio and transcription files are included in this package.
The majority of speakers in this corpus are Lebanese. The data is similar to the training data in Set 3 (speech: LDC2005S07; transcripts: LDC2005T03). The dialects are distributed as follows:
- 171 JOR
- 1373 LEB
- 229 PAL
- 29 SYR
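The dialect counts above can be turned into percentage shares with a few lines of Python. This is only an illustrative sketch: the names `dialect_counts`, `CALLS`, and `dialect_shares` are hypothetical and not part of the corpus distribution. The four counts sum to 1802, exactly twice the 901 calls, which suggests (as an assumption, not something the documentation states) that each count refers to a call side, i.e. one speaker per side, rather than to a whole call.

```python
# Dialect counts exactly as listed in this documentation.
dialect_counts = {"JOR": 171, "LEB": 1373, "PAL": 229, "SYR": 29}

# Number of calls stated for this release.
CALLS = 901


def dialect_shares(counts):
    """Return each dialect's share of the total count, in percent."""
    total = sum(counts.values())
    return {code: 100.0 * n / total for code, n in counts.items()}


total = sum(dialect_counts.values())
shares = dialect_shares(dialect_counts)

# Assumption check: the total equals two sides per call.
print(f"total count: {total} (2 x calls = {2 * CALLS})")

# Print dialects from largest to smallest share.
for code, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{code}: {dialect_counts[code]:5d}  {pct:5.1f}%")
```

Under that reading, Lebanese speakers account for roughly three quarters of the call sides, which is consistent with the statement that the majority of speakers are Lebanese.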
For an example of this corpus, please review this audio sample.