L2-KSU Native and Non-Native Arabic Speech
Item Name: | L2-KSU Native and Non-Native Arabic Speech |
Author(s): | Norah Alrashoudi, Hend AlKhalifa, Yousef Ajami Alotaibi |
LDC Catalog No.: | LDC2024S11 |
ISLRN: | 031-691-303-064-0 |
DOI: | https://doi.org/10.35111/n3d8-t960 |
Release Date: | September 16, 2024 |
Member Year(s): | 2024 |
DCMI Type(s): | Sound, Text |
Sample Type: | pcm |
Sample Rate: | 16000 |
Data Source(s): | microphone speech |
Application(s): | speaker identification, speech recognition |
Language(s): | Standard Arabic, Arabic |
Language ID(s): | arb, ara |
License(s): | L2-KSU Native and Non-Native Arabic Speech Agreement |
Online Documentation: | LDC2024S11 Documents |
Licensing Instructions: | Subscription & Standard Members, and Non-Members |
Citation: | Alrashoudi, Norah, Hend AlKhalifa, and Yousef Alotaibi. L2-KSU Native and Non-Native Arabic Speech LDC2024S11. Web Download. Philadelphia: Linguistic Data Consortium, 2024. |
Introduction
L2-KSU Native and Non-Native Arabic Speech was developed by King Saud University (KSU) and contains approximately six hours of Modern Standard Arabic read speech from 80 subjects, along with transcripts and speaker metadata.
Data
The speech data was collected in 2022 from 40 native and 40 non-native speakers. Native speakers were from Saudi Arabia, Egypt, and Palestine; they provided audio recordings through the crowdsourcing platform Khamsat. Non-native speakers were Central and West African students enrolled in KSU's Arabic Linguistics Institute; they provided speech recordings on site. All subjects read a series of ten sentences, repeating each sentence multiple times.
Audio is presented as 16-bit, 16 kHz WAV files. UTF-8 plain-text transcript files, speaker metadata, and the Arabic sentences with transliteration, English translation, and IPA transcription are also included in the documentation accompanying this release.
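As a minimal sketch, the stated audio format (16-bit samples at 16 kHz) can be checked against a downloaded file with Python's standard `wave` module. The function name `check_wav_format` and the constants below are illustrative, not part of the release; the channel count is reported but not checked, since the catalog entry does not state it.

```python
import wave

# Values taken from the catalog entry; the names are illustrative.
EXPECTED_SAMPLE_RATE = 16000  # Hz
EXPECTED_SAMPLE_WIDTH = 2     # bytes per sample, i.e. 16-bit PCM


def check_wav_format(path):
    """Read a WAV header and compare it with the expected format.

    Returns (ok, info): ok is True when the sample rate and sample
    width match the expected values; info holds the header fields.
    """
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        info = {
            "sample_rate": sample_rate,
            "sample_width_bytes": wav.getsampwidth(),
            "channels": wav.getnchannels(),
            "duration_seconds": wav.getnframes() / sample_rate,
        }
    ok = (
        info["sample_rate"] == EXPECTED_SAMPLE_RATE
        and info["sample_width_bytes"] == EXPECTED_SAMPLE_WIDTH
    )
    return ok, info
```

Calling `check_wav_format("example.wav")` on a file from the corpus (the path here is a placeholder) would return the match flag together with the header details, which also makes it straightforward to sum durations across files.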
Updates
None at this time.