The EventStatus Corpus

Item Name: The EventStatus Corpus
Author(s): Ruihong Huang, Daniel Jurafsky, Ellen Riloff
LDC Catalog No.: LDC2017T09
ISBN: 1-58563-800-5
ISLRN: 173-931-115-382-5
Release Date: May 15, 2017
Member Year(s): 2017
DCMI Type(s): Text
Data Source(s): newswire
Project(s): DEFT
Application(s): event detection
Language(s): English, Spanish
Language ID(s): eng, spa
License(s): LDC User Agreement for Non-Members
Online Documentation: LDC2017T09 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Huang, Ruihong, Daniel Jurafsky, and Ellen Riloff. The EventStatus Corpus LDC2017T09. Web Download. Philadelphia: Linguistic Data Consortium, 2017.


The EventStatus Corpus was developed by researchers at Texas A&M University, Stanford University and The University of Utah. It consists of approximately 3,000 English and 1,500 Spanish news articles about civil unrest events annotated with temporal tags.

This corpus was designed to support the study of the temporal and aspectual properties of major events, that is, whether an event has already happened, is currently happening or may happen in the future. Since it focuses on a single domain (civil unrest events), it may be appropriate for tasks such as event extraction and temporal question answering.


The relevant news articles were sourced from English Gigaword Fifth Edition (LDC2011T07) and Spanish Gigaword Third Edition (LDC2011T12). The civil unrest events include protests, demonstrations, marches and strikes. Each event was annotated as PAST, ON-GOING or FUTURE, and FUTURE events were further annotated as PLANNED, ALERT or POSSIBLE.
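
As a rough illustration of the tag set described above, the following Python sketch models a label as a status plus an optional subtype and tallies a few labels. The class name EventLabel, the "STATUS/SUBTYPE" string form and the sample labels are assumptions made for this example; they do not describe the corpus's actual file format.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Tag names as given in the corpus description.
STATUSES = ("PAST", "ON-GOING", "FUTURE")
SUBTYPES = ("PLANNED", "ALERT", "POSSIBLE")


@dataclass(frozen=True)
class EventLabel:
    """Hypothetical container for one annotation (not the corpus format)."""
    status: str
    subtype: Optional[str] = None  # None when no subtype was assigned

    def __post_init__(self):
        # Reject tags that are not part of the described tag set.
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status}")
        if self.subtype is not None and self.subtype not in SUBTYPES:
            raise ValueError(f"unknown subtype: {self.subtype}")

    def __str__(self):
        if self.subtype is None:
            return self.status
        return f"{self.status}/{self.subtype}"


# Made-up labels for illustration only; these are not drawn from the corpus.
labels = [EventLabel("PAST"), EventLabel("FUTURE", "PLANNED"), EventLabel("PAST")]
counts = Counter(str(label) for label in labels)
```

Keeping the subtype optional lets the same structure hold both plain status tags and status tags refined by a subtype.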

In addition to the annotated articles, the release includes the file lists used for tuning and testing in experiments. Ten-fold cross-validation was performed, and the specific ten-fold splits of the test set are included as well. All text is presented as plain text and encoded in UTF-8.
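
For readers unfamiliar with the procedure, the sketch below shows one generic way to partition a file list into ten folds so that each file is held out exactly once. It is an illustration only: the function k_fold_splits, the seed and the file names are hypothetical, and reproducing published experiments would require the split files shipped with the corpus, not freshly generated folds.

```python
import random
from typing import List, Tuple


def k_fold_splits(items: List[str], k: int = 10, seed: int = 0) -> List[Tuple[List[str], List[str]]]:
    """Return k (train, held_out) pairs; each item is held out exactly once."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    # Deal items round-robin into k folds of near-equal size.
    folds = [shuffled[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        held_out = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        splits.append((train, held_out))
    return splits


# Hypothetical file names, not the corpus's actual naming scheme.
files = [f"doc_{n:03d}.txt" for n in range(25)]
splits = k_fold_splits(files)
```

Each element of splits pairs nine folds for training with the remaining fold for evaluation.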


Updates: None at this time.
