Phonemes of Arabic

Item Name: Phonemes of Arabic
Author(s): Mohamed Alshaari, Hussien ElHarati, Veton Kepuska
LDC Catalog No.: LDC2020S13
ISBN: 1-58563-950-8
ISLRN: 049-846-101-218-5
Release Date: December 15, 2020
Member Year(s): 2020
DCMI Type(s): Sound
Sample Type: pcm
Sample Rate: 48000
Data Source(s): microphone speech
Application(s): speech recognition, language identification
Language(s): Arabic
Language ID(s): ara
License(s): LDC User Agreement for Non-Members
Online Documentation: LDC2020S13 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Alshaari, Mohamed, Hussien ElHarati, and Veton Kepuska. Phonemes of Arabic LDC2020S13. Web Download. Philadelphia: Linguistic Data Consortium, 2020.


Phonemes of Arabic was developed at the Florida Institute of Technology. It consists of approximately one hour of speech from native Arabic speakers that includes all Arabic sounds (consonants and vowels) and 24 words with specific consonant-vowel patterns.


Arabic has three short vowels, three long vowels and 28 consonants. Speakers recorded all sounds and repeated each sound three times. Each speaker also recorded 24 Arabic words with a specified consonant-vowel pattern and repeated each word three times.

Recordings were collected over a three-month period at a cultural center in Florida. The speakers (19 male) were from the following countries: Egypt, Iraq, Lebanon, Libya, Morocco, Saudi Arabia and Syria.
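The counts above imply a rough upper bound on the number of recorded items. The following is a minimal sketch of that arithmetic, assuming every one of the 19 speakers completed every sound and word and that each repetition counts as one utterance. The actual number of files in the release may differ, for example if repetitions are stored together in a single file.

```python
# Counts taken from the corpus description; everything else is an assumption.
SHORT_VOWELS = 3
LONG_VOWELS = 3
CONSONANTS = 28
WORDS = 24
REPETITIONS = 3
SPEAKERS = 19


def expected_utterances_per_speaker() -> int:
    """Utterances per speaker if every sound and word is repeated three times."""
    sounds = SHORT_VOWELS + LONG_VOWELS + CONSONANTS  # 34 isolated sounds
    return (sounds + WORDS) * REPETITIONS


def expected_utterances_total() -> int:
    """Utterances across all speakers, assuming no missing recordings."""
    return expected_utterances_per_speaker() * SPEAKERS


if __name__ == "__main__":
    print(expected_utterances_per_speaker())  # 174
    print(expected_utterances_total())        # 3306
```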

Speech data is presented as single channel, 48 kHz, 32-bit Signed Integer PCM WAV files.
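A downloaded file can be checked against the stated format with Python's standard `wave` module. This is only a sketch: the helper names `matches_stated_format` and `make_silent_wav` are hypothetical, and the in-memory file stands in for a real corpus file, whose names and layout are not given here. The `wave` module reads plain PCM headers, so a file with a different header variant may raise `wave.Error` rather than return a result.

```python
import io
import wave

# Format stated in the corpus description.
EXPECTED_CHANNELS = 1        # single channel
EXPECTED_SAMPLE_RATE = 48000  # 48 kHz
EXPECTED_SAMPLE_WIDTH = 4     # 32 bits = 4 bytes per sample


def matches_stated_format(source) -> bool:
    """Return True if a WAV file (path or binary file object) has the stated format."""
    with wave.open(source, "rb") as wav:
        return (
            wav.getnchannels() == EXPECTED_CHANNELS
            and wav.getframerate() == EXPECTED_SAMPLE_RATE
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH
        )


def make_silent_wav(channels=1, rate=48000, width=4, frames=480) -> io.BytesIO:
    """Build a short silent WAV in memory (a stand-in for a corpus file)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setframerate(rate)
        wav.setsampwidth(width)
        wav.writeframes(b"\x00" * (frames * channels * width))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(matches_stated_format(make_silent_wav()))            # True
    print(matches_stated_format(make_silent_wav(rate=16000)))  # False
```

To check a real file, pass its path to `matches_stated_format` in place of the in-memory buffer.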


