Item Name: Corpus of Conversational Persian Transcripts
Author(s): Ariana Negar Mohammadi
LDC Catalog No.: LDC2019T11
ISBN: 1-58563-897-8
ISLRN: 187-041-892-174-7
Release Date: August 15, 2019
Member Year(s): 2019
DCMI Type(s): Text
Data Source(s): telephone conversations, microphone conversation
Application(s): discourse analysis, sociolinguistics, pragmatics
Language(s): Persian
Language ID(s): fas
License(s): LDC User Agreement for Non-Members
Online Documentation: LDC2019T11 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Mohammadi, Ariana Negar. Corpus of Conversational Persian Transcripts LDC2019T11. Web Download. Philadelphia: Linguistic Data Consortium, 2019.
Corpus of Conversational Persian Transcripts consists of transcripts from approximately 20 hours of naturally occurring informal conversations in the Tehrani dialect of Iranian Persian. The corresponding speech is not included in this release.


The transcripts were drawn from 1,201 minutes of conversations among 22 participants (12 male, 10 female). The participants recorded their daily phone calls and face-to-face interactions in a variety of informal settings. The conversations cover a range of interaction types, settings, relationship types, and communicative goals.

The transcripts are annotated for participant gender and age, and for recording method and setting. See the included documentation for more information about the annotations and transcription methodology.

Each conversation is presented as a UTF-8 encoded XML file.
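
Because each conversation is a UTF-8 encoded XML file, a quick first step is to load one file and list which element names it contains. The sketch below is illustrative only: this description does not give the element or attribute names, so the code makes no assumption about them and simply tallies whatever tags it finds. The function name summarize_transcript is hypothetical and not part of the release. Consult the included documentation for the actual schema.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def summarize_transcript(path):
    """Return (root_tag, tag_counts) for one XML transcript file.

    No element names are assumed; the function only reports the tags
    that are actually present in the file.
    """
    # Open in binary mode so the XML parser handles decoding itself
    # (UTF-8 for these transcripts, per the corpus description).
    with open(path, "rb") as handle:
        root = ET.parse(handle).getroot()

    # root.iter() walks the root element and all of its descendants.
    tag_counts = Counter(element.tag for element in root.iter())
    return root.tag, tag_counts
```

Calling summarize_transcript on a transcript path returns the root element's name and a count of every element name in the file, which can then be checked against the annotation fields described in the documentation.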


