Levantine Arabic QT Training Data Set 4 (Speech + Transcripts)

Item Name: Levantine Arabic QT Training Data Set 4 (Speech + Transcripts)
Author(s): Mohamed Maamouri, Tim Buckwalter, Hubert Jin
LDC Catalog No.: LDC2005S14
ISBN: 1-58563-342-9
ISLRN: 546-803-428-857-5
DOI: https://doi.org/10.35111/a75r-qp57
Release Date: June 15, 2005
Member Year(s): 2005
DCMI Type(s): Sound
Sample Rate: 8000
Data Source(s): telephone conversations
Project(s): EARS, GALE
Language(s): North Levantine Arabic, South Levantine Arabic
Language ID(s): apc, ajp
License(s): LDC User Agreement for Non-Members
Online Documentation: LDC2005S14 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Maamouri, Mohamed, Tim Buckwalter, and Hubert Jin. Levantine Arabic QT Training Data Set 4 (Speech + Transcripts) LDC2005S14. Web Download. Philadelphia: Linguistic Data Consortium, 2005.


This file contains documentation on the Levantine Arabic QT Training Data Set 4 (Speech + Transcripts), Linguistic Data Consortium (LDC) catalog number LDC2005S14 and ISBN 1-58563-342-9.

This release contains 901 calls, totaling 133.6 hours of telephone conversation in Levantine Arabic. The package includes both the audio files and their transcripts.

The majority of speakers in this corpus are Lebanese. The data is similar to the training data in Set 3 (speech: LDC2005S07; transcripts: LDC2005T03). The dialects are distributed as follows:

  • 171 JOR
  • 1373 LEB
  • 229 PAL
  • 29 SYR
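
The figures above can be cross-checked with a little arithmetic. The following is a minimal sketch in Python that copies the counts, call total, and hours from this description and derives dialect shares and an average call length. The variable names are illustrative and not part of the corpus. The description does not say what unit the dialect counts use; the reading that each count is one call side (two per call) is an assumption that the totals happen to support.

```python
# Figures copied from the corpus description above.
dialect_counts = {"JOR": 171, "LEB": 1373, "PAL": 229, "SYR": 29}
total_calls = 901
total_hours = 133.6

# Sum of all dialect labels in the list.
total_labels = sum(dialect_counts.values())

# Percentage share of each dialect, rounded to one decimal place.
shares = {
    dialect: round(100 * count / total_labels, 1)
    for dialect, count in dialect_counts.items()
}

# Assumption check: if each call has two labeled sides, this ratio is 2.
labels_per_call = total_labels / total_calls

# Average call length in minutes, from the stated totals.
avg_minutes_per_call = total_hours * 60 / total_calls

for dialect, share in shares.items():
    print(f"{dialect}: {dialect_counts[dialect]} ({share}%)")
print(f"labels per call: {labels_per_call}")
print(f"average call length: {avg_minutes_per_call:.1f} minutes")
```

The labels sum to 1802, exactly twice the 901 calls, which is consistent with one label per call side. Lebanese labels account for roughly three quarters of the total, matching the statement that most speakers are Lebanese.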


