
Ravnursson Faroese Speech and Transcripts

Item Name:	Ravnursson Faroese Speech and Transcripts
Author(s):	Carlos Daniel Hernández Mena, Annika Simonsen, Jon Gudnason
LDC Catalog No.:	LDC2024S09
ISLRN:	558-066-910-837-0
DOI:	https://doi.org/10.35111/d60c-5x79
Release Date:	August 15, 2024
Member Year(s):	2024
DCMI Type(s):	Sound, Text
Sample Type:	pcm
Sample Rate:	16000
Data Source(s):	microphone speech
Application(s):	speech recognition
Language(s):	Faroese
Language ID(s):	fao
License(s):	Ravnursson Faroese Speech and Transcripts (For-Profit); Ravnursson Faroese Speech and Transcripts (Non-Member); Ravnursson Faroese Speech and Transcripts (Not-For-Profit)
Online Documentation:	LDC2024S09 Documents
Licensing Instructions:	Subscription & Standard Members, and Non-Members
Citation:	Hernández Mena, Carlos Daniel, Annika Simonsen, and Jon Gudnason. Ravnursson Faroese Speech and Transcripts LDC2024S09. Web Download. Philadelphia: Linguistic Data Consortium, 2024.

Introduction

Ravnursson Faroese Speech and Transcripts contains 109 hours of Faroese prompted speech from 433 speakers (249 female, 184 male), along with corresponding transcripts and speaker metadata. It is an extract from the Basic Language Resource Kit 1.0 (BLARK 1.0) developed by the Faroe Islands' Ravnur Project.

Data

Speech data was collected in 2022. Speakers from all major dialect areas in the Faroe Islands in three age groups -- 15-35, 36-60, and 61+ years -- read texts that included a word list, a phrase list, closed vocabulary readings, and short texts. Recordings also contain spontaneous speech.

TASCAM DR-40 Linear PCM audio recorders captured the speech at 48 kHz; the audio was downsampled to 16 kHz for this corpus. The audio data is divided into train, development, and test sets and is presented as FLAC-compressed, single-channel, 16 kHz, 16-bit linear PCM.
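
As a rough illustration of what these format figures imply, the sketch below computes the integer downsampling factor from 48 kHz to 16 kHz and an estimate of the corpus size once decoded to uncompressed PCM. It uses only the numbers stated on this page (109 hours, 16 kHz, 16-bit, single channel); the function names are hypothetical, and the actual FLAC download will be smaller than the decoded estimate.

```python
# Back-of-the-envelope figures for the audio format described above.
# All constants come from the catalog text; function names are illustrative.
SOURCE_RATE_HZ = 48_000   # recorder sample rate
TARGET_RATE_HZ = 16_000   # corpus sample rate
BITS_PER_SAMPLE = 16
CHANNELS = 1
CORPUS_HOURS = 109


def decimation_factor(source_hz: int, target_hz: int) -> int:
    """Return the integer ratio between the source and target sample rates."""
    if source_hz % target_hz != 0:
        raise ValueError("rates are not related by an integer factor")
    return source_hz // target_hz


def pcm_bytes(hours: float, rate_hz: int, bits: int, channels: int) -> int:
    """Return the size in bytes of uncompressed linear PCM audio."""
    seconds = hours * 3600
    return int(seconds * rate_hz * (bits // 8) * channels)


factor = decimation_factor(SOURCE_RATE_HZ, TARGET_RATE_HZ)
total_bytes = pcm_bytes(CORPUS_HOURS, TARGET_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)

print(factor)                       # 3
print(round(total_bytes / 1e9, 2))  # 12.56 (decimal gigabytes, decoded PCM)
```

The estimate covers sample data only and ignores file headers, so it is best read as an approximate upper bound for planning disk space after decoding.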

Recordings were orthographically transcribed and time-stamped. Transcripts and speaker metadata are included in a tab-separated file.
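
A tab-separated file of this kind can be read with Python's standard `csv` module. The following is a minimal sketch, not the corpus's documented layout: the column names (`audio_id`, `speaker_id`, `gender`, `age_group`, `start`, `end`, `transcript`) and the sample rows are assumptions made for illustration, so check the LDC2024S09 documentation for the actual header before adapting it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the real file's columns and values may differ.
SAMPLE_TSV = (
    "audio_id\tspeaker_id\tgender\tage_group\tstart\tend\ttranscript\n"
    "utt0001\tspk01\tfemale\t15-35\t0.00\t2.50\tplaceholder text one\n"
    "utt0002\tspk01\tfemale\t15-35\t2.50\t6.00\tplaceholder text two\n"
    "utt0003\tspk02\tmale\t61+\t0.00\t4.25\tplaceholder text three\n"
)


def seconds_per_speaker(tsv_file):
    """Sum the time-stamped segment durations (in seconds) for each speaker."""
    # QUOTE_NONE keeps any quotation marks inside transcripts as literal text.
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    totals = defaultdict(float)
    for row in reader:
        totals[row["speaker_id"]] += float(row["end"]) - float(row["start"])
    return dict(totals)


totals = seconds_per_speaker(io.StringIO(SAMPLE_TSV))
print(totals)  # {'spk01': 6.0, 'spk02': 4.25}
```

To run this against a real file, replace the `io.StringIO` object with `open(path, encoding="utf-8", newline="")` and adjust the column names to match the file's header row.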
