Discourse Graphbank
Item Name: | Discourse Graphbank |
Author(s): | Florian Wolf, Edward Gibson, Amy Fisher, Meredith Knight |
LDC Catalog No.: | LDC2005T08 |
ISBN: | 1-58563-320-8 |
ISLRN: | 983-656-398-539-6 |
DOI: | https://doi.org/10.35111/7snd-y397 |
Release Date: | March 15, 2005 |
Member Year(s): | 2005 |
DCMI Type(s): | Text |
Project(s): | EARS, GALE |
Application(s): | discourse analysis, information retrieval, summarization |
Language(s): | English |
Language ID(s): | eng |
License(s): | LDC User Agreement for Non-Members |
Online Documentation: | LDC2005T08 Documents |
Licensing Instructions: | Subscription & Standard Members, and Non-Members |
Citation: | Wolf, Florian, et al. Discourse Graphbank LDC2005T08. Web Download. Philadelphia: Linguistic Data Consortium, 2005. |
Related Works: | View |
Introduction
Discourse Graphbank contains 135 newswire texts totalling 70,000 words annotated with coherence relations.
The project was Florian Wolf's PhD thesis. It aimed to define a descriptively adequate data structure for representing discourse coherence structures; to investigate the impact of discourse coherence structures on other linguistic processes and natural language applications (e.g., anaphora resolution, summarization, and information retrieval); and to develop and test discourse parsing algorithms.
Data
The source data consists of Associated Press and Wall Street Journal newswire texts from TIPSTER Complete (LDC93T3A), annotated with coherence relations.
The data was annotated by two independent annotators with 88% agreement. The annotators marked 11 types of coherence relations:
Resemblance relations | Parallel, Contrast, Example, Generalization, Elaboration |
Cause-Effect relations | Explanation, Violated Expectation, Condition |
Temporal Sequence relation | Temporal Sequence |
Attribution relation | Attribution |
Same relation | Same |
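For readers who want to work with these labels programmatically, the grouping above can be sketched as a simple lookup table. This is only an illustration of the taxonomy as listed on this page: the names `RELATION_CATEGORIES`, `ALL_RELATIONS`, and `category_of` are hypothetical, and the label strings are copied from the table rather than from the corpus files, whose actual label spellings and file format may differ.

```python
# Illustrative sketch only. The label strings come from the table on this page;
# they are NOT guaranteed to match the tags used inside the corpus files.
# RELATION_CATEGORIES, ALL_RELATIONS and category_of are hypothetical names.
RELATION_CATEGORIES = {
    "Resemblance": ["Parallel", "Contrast", "Example", "Generalization", "Elaboration"],
    "Cause-Effect": ["Explanation", "Violated Expectation", "Condition"],
    "Temporal Sequence": ["Temporal Sequence"],
    "Attribution": ["Attribution"],
    "Same": ["Same"],
}

# Flat list of every relation type, in table order.
ALL_RELATIONS = [
    relation
    for relations in RELATION_CATEGORIES.values()
    for relation in relations
]


def category_of(relation: str) -> str:
    """Return the category that a relation type belongs to."""
    for category, relations in RELATION_CATEGORIES.items():
        if relation in relations:
            return category
    raise KeyError(f"Unknown relation type: {relation!r}")
```

Under these assumptions, `category_of("Contrast")` returns `"Resemblance"`, and `ALL_RELATIONS` holds the 11 relation types named above.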
Samples
For an example of the data in this corpus, please view this sample (JPG).
Updates
None at this time.