Discourse Graphbank

Item Name: Discourse Graphbank
Author(s): Florian Wolf, Edward Gibson, Amy Fisher, Meredith Knight
LDC Catalog No.: LDC2005T08
ISBN: 1-58563-320-8
ISLRN: 983-656-398-539-6
DOI: https://doi.org/10.35111/7snd-y397
Release Date: March 15, 2005
Member Year(s): 2005
DCMI Type(s): Text
Project(s): EARS, GALE
Application(s): discourse analysis, information retrieval, summarization
Language(s): English
Language ID(s): eng
License(s): LDC User Agreement for Non-Members
Online Documentation: LDC2005T08 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Wolf, Florian, et al. Discourse Graphbank LDC2005T08. Web Download. Philadelphia: Linguistic Data Consortium, 2005.


Discourse Graphbank contains 135 newswire texts totalling 70,000 words annotated with coherence relations.

The project was Florian Wolf's PhD thesis. It aimed to define a descriptively adequate data structure for representing discourse coherence structures, to investigate the impact of discourse coherence structures on other linguistic processes and natural language applications (e.g., anaphora resolution, summarization, and information retrieval), and to develop and test discourse parsing algorithms.


The source data consists of Associated Press and Wall Street Journal newswire text from TIPSTER Complete (LDC93T3A), annotated with coherence relations.

Two independent annotators annotated the data, with 88% agreement. They marked 11 types of coherence relations:

Resemblance relations: Parallel
Cause-Effect relations: Explanation
Violated Expectation
Temporal Sequence relation
Attribution relation
Same relation
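
As a rough illustration of what labeled coherence relations and annotator agreement can look like in code, the Python sketch below stores each relation as a labeled link between two discourse segments and computes a simple overlap score between two annotators. Everything in it is an assumption made for illustration: the `Relation` type, the segment numbering, the example annotations, and the `percent_agreement` measure are hypothetical, and they reflect neither the corpus file format nor the method behind the 88% figure reported above.

```python
from __future__ import annotations

from typing import NamedTuple


class Relation(NamedTuple):
    """One labeled coherence relation between two discourse segments.

    Hypothetical structure for illustration; not the corpus file format.
    """
    source: int  # index of the first segment (made-up numbering)
    target: int  # index of the second segment
    label: str   # relation type, e.g. "Parallel" or "Attribution"


def percent_agreement(a: set[Relation], b: set[Relation]) -> float:
    """Relations marked identically by both annotators, divided by all
    relations marked by either one. An assumed measure, not necessarily
    the one used for the corpus."""
    union = a | b
    if not union:
        return 1.0  # nothing annotated by either side counts as agreement
    return len(a & b) / len(union)


# Invented annotations over four segments (0..3).
annotator_1 = {
    Relation(0, 1, "Parallel"),
    Relation(1, 2, "Explanation"),
    Relation(2, 3, "Attribution"),
}
annotator_2 = {
    Relation(0, 1, "Parallel"),
    Relation(1, 2, "Violated Expectation"),
    Relation(2, 3, "Attribution"),
}

print(f"{percent_agreement(annotator_1, annotator_2):.2f}")
```

Storing relations as a set of labeled links, rather than as a tree, leaves room for a segment to take part in several relations at once, which is one way to read the goal of a "descriptively adequate data structure" mentioned above.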


For an example of the data in this corpus, please view this sample (JPG).


