Levantine Arabic QT Training Data Set 4 (Speech + Transcripts)
Item Name: | Levantine Arabic QT Training Data Set 4 (Speech + Transcripts) |
Author(s): | Mohamed Maamouri (project head), Tim Buckwalter, Hubert Jin |
LDC Catalog No.: | LDC2005S14 |
ISBN: | 1-58563-342-9 |
ISLRN: | 546-803-428-857-5 |
DOI: | https://doi.org/10.35111/a75r-qp57 |
Release Date: | June 15, 2005 |
Member Year(s): | 2005 |
DCMI Type(s): | Sound |
Sample Rate: | 8000 Hz |
Data Source(s): | telephone conversations |
Project(s): | GALE, EARS |
Language(s): | North Levantine Arabic, South Levantine Arabic |
Language ID(s): | apc, ajp |
License(s): | LDC User Agreement for Non-Members |
Online Documentation: | LDC2005S14 Documents |
Licensing Instructions: | Subscription & Standard Members, and Non-Members |
Citation: | Mohamed Maamouri (project head), Tim Buckwalter, and Hubert Jin. Levantine Arabic QT Training Data Set 4 (Speech + Transcripts) LDC2005S14. Web Download. Philadelphia: Linguistic Data Consortium, 2005. |
Introduction
This file contains documentation on the Levantine Arabic QT Training Data Set 4 (Speech + Transcripts), Linguistic Data Consortium (LDC) catalog number LDC2005S14 and ISBN 1-58563-342-9.
This release contains 901 calls totaling 133.6 hours of telephone conversation in Levantine Arabic. Both audio and transcription files are included in this package.
The majority of speakers in this corpus are Lebanese. The data is similar to the training data in Set 3 (LDC2005S07, speech; LDC2005T03, transcripts). Speakers are distributed by dialect as follows (the counts sum to 1,802, which corresponds to two call sides for each of the 901 calls):
- 171 JOR
- 1373 LEB
- 229 PAL
- 29 SYR
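As a rough check on the figures above, the following minimal Python sketch computes each dialect's share of the speakers. The counts and the number of calls are copied from this page; the variable names are illustrative, and reading JOR, LEB, PAL and SYR as Jordanian, Lebanese, Palestinian and Syrian, and each count as one call side, is an assumption rather than something the documentation states.

```python
# Dialect counts as listed on this page (assumed to be one entry per call side).
dialect_counts = {"JOR": 171, "LEB": 1373, "PAL": 229, "SYR": 29}

# Number of calls stated in the introduction.
CALLS = 901

# Total number of listed speakers across all dialects.
total_speakers = sum(dialect_counts.values())

# Percentage share of each dialect, rounded to one decimal place.
shares = {
    dialect: round(100 * count / total_speakers, 1)
    for dialect, count in dialect_counts.items()
}

# True if the total matches two speakers per call.
two_sides_per_call = total_speakers == 2 * CALLS

if __name__ == "__main__":
    print(f"total speakers: {total_speakers}")
    print(f"two sides per call: {two_sides_per_call}")
    for dialect, share in shares.items():
        print(f"{dialect}: {share}%")
```

Under these assumptions, Lebanese speakers account for roughly three quarters of the listed speakers, which is consistent with the statement that the majority of speakers are Lebanese.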
Samples
For an example of this corpus, please review this audio sample.