Corpus of Conversational Persian Transcripts
Item Name: | Corpus of Conversational Persian Transcripts |
Author(s): | Ariana Negar Mohammadi |
LDC Catalog No.: | LDC2019T11 |
ISBN: | 1-58563-897-8 |
ISLRN: | 187-041-892-174-7 |
DOI: | https://doi.org/10.35111/qxep-jt47 |
Release Date: | August 15, 2019 |
Member Year(s): | 2019 |
DCMI Type(s): | Text |
Data Source(s): | telephone conversations, microphone conversation |
Application(s): | discourse analysis, sociolinguistics, pragmatics |
Language(s): | Persian |
Language ID(s): | fas |
License(s): | LDC User Agreement for Non-Members |
Online Documentation: | LDC2019T11 Documents |
Licensing Instructions: | Subscription & Standard Members, and Non-Members |
Citation: | Mohammadi, Ariana Negar. Corpus of Conversational Persian Transcripts LDC2019T11. Web Download. Philadelphia: Linguistic Data Consortium, 2019. |
Introduction
Corpus of Conversational Persian Transcripts consists of transcripts from approximately 20 hours of naturally occurring informal conversations in the Tehrani dialect of Iranian Persian. The corresponding speech is not included in this release.
Data
The transcripts are drawn from 1,201 minutes of conversation among 22 participants, 12 male and 10 female. The participants recorded their daily phone calls and face-to-face interactions in a variety of informal settings. The conversations vary in interaction type, setting, relationship between speakers, and communicative goal.
The transcripts were annotated for gender, age, and recording method and setting. See the included documentation for more information about the annotations and transcription methodology.
Each conversation is presented as a UTF-8 encoded XML file.
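Because each conversation is a separate UTF-8 XML file, a first pass over the corpus can be made with a standard XML parser. The following Python sketch is illustrative only: it assumes the files end in .xml and sit under one directory, and it makes no assumption about element names, which are described in the included documentation rather than on this page. The function name summarize_transcripts and the directory argument are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def summarize_transcripts(corpus_dir):
    """Return (number of XML files, Counter of element tags) for corpus_dir.

    Hypothetical helper: the actual tag names in the corpus are not
    assumed here; this only reports whatever tags the files contain.
    """
    tag_counts = Counter()
    file_count = 0
    # Assumption: transcript files carry a .xml extension.
    for xml_path in sorted(Path(corpus_dir).rglob("*.xml")):
        # ElementTree honors the encoding in the XML declaration and
        # falls back to UTF-8 when no declaration is present.
        tree = ET.parse(xml_path)
        for element in tree.iter():
            tag_counts[element.tag] += 1
        file_count += 1
    return file_count, tag_counts
```

Calling summarize_transcripts on the unpacked corpus directory and printing the most common tags is one way to get a quick inventory of the markup before consulting the documentation for what each element means.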
Samples
Please view this sample.
Updates
None at this time.