TimeBank 1.2 contains 183 news articles that have been annotated with temporal information, adding events, times, and temporal links between events and times. The annotation follows the TimeML 1.2.1 specification, available at www.timeml.org.
TimeML aims to capture and represent temporal information. This is accomplished using four primary tag types: TIMEX3 for temporal expressions, EVENT for temporal events, SIGNAL for temporal signals, and LINK for representing relationships. For a detailed description of TimeML, see the TimeML 1.2.1 Specification and Guidelines. Here, we give a summary of each tag.
TIMEX3. This tag is used to capture dates, times, durations, and sets of dates and times. All TIMEX3 tags include a type and a value, along with other optional attributes. The value is given according to the ISO 8601 standard. The TIMEX3 tag also allows specification of a temporal anchor. This facilitates the use of temporal functions to calculate the value of an underspecified temporal expression. For example, an article might include a document creation time such as "January 3, 2006." Later in the article, the temporal expression "today" may occur. By anchoring the TIMEX3 for "today" to the document creation time, we can determine the exact value of the TIMEX3.
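The anchoring step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fragment is hypothetical rather than taken from the corpus, the attribute names (tid, value, anchorTimeID, temporalFunction, functionInDocument) follow our reading of TimeML 1.2.1 and should be checked against the specification, and the value of "today" is left empty here purely to show how it could be computed from its anchor.

```python
import xml.etree.ElementTree as ET
from datetime import date, timedelta

# Hypothetical fragment, not taken from TimeBank. Attribute names are assumed
# to follow TimeML 1.2.1; the empty value on t1 is for illustration only.
SAMPLE = """<doc>
<TIMEX3 tid="t0" type="DATE" value="2006-01-03" functionInDocument="CREATION_TIME">January 3, 2006</TIMEX3>
<TIMEX3 tid="t1" type="DATE" value="" temporalFunction="true" anchorTimeID="t0">today</TIMEX3>
</doc>"""

# Day offsets relative to the anchor for a few deictic expressions (assumed list).
OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}

def resolve_values(xml_text):
    """Return {tid: ISO 8601 value}, computing missing values from the anchor."""
    root = ET.fromstring(xml_text)
    timexes = {t.get("tid"): t for t in root.iter("TIMEX3")}
    resolved = {}
    for tid, timex in timexes.items():
        value = timex.get("value")
        anchor_id = timex.get("anchorTimeID")
        word = (timex.text or "").strip().lower()
        if not value and anchor_id in timexes and word in OFFSETS:
            # Shift the anchor date by the offset that the expression implies.
            anchor = date.fromisoformat(timexes[anchor_id].get("value"))
            value = (anchor + timedelta(days=OFFSETS[word])).isoformat()
        resolved[tid] = value
    return resolved

print(resolve_values(SAMPLE))  # {'t0': '2006-01-03', 't1': '2006-01-03'}
```

In the released annotation the value attribute is normally already filled in, so code like this would mainly serve to check a value against its anchor.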
EVENT. The EVENT tag is used to annotate those elements in a text that mark the semantic events described by it. Any event that can be temporally anchored or ordered is captured with this tag. An EVENT includes a class attribute with values such as occurrence, state, or reporting. The class of an EVENT may indicate what relationships the event participates in. In addition to the EVENT tag, events are also annotated with one or more MAKEINSTANCE tags that include information about a particular instance of the event. This includes part of speech, tense, aspect, modality, and polarity. When an event participates in a relationship, it is actually the event instance that is referenced. This is to allow for statements such as "John taught on Monday but not on Tuesday." Here, there are actually two instances of the teaching-event: one that has a positive polarity and one that is negative. Further, each instance participates in its own temporal relationship with respect to "Monday" and "Tuesday."
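To make the distinction between an event and its instances concrete, the following sketch encodes the "John taught" sentence with one EVENT and two MAKEINSTANCE tags and groups the instances by event. The fragment is hypothetical, and the attribute names (eid, eiid, eventID, pos, tense, aspect, polarity) are assumed from our reading of TimeML 1.2.1.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: one EVENT for "taught", realized as two instances
# with opposite polarity. Attribute names are assumed from TimeML 1.2.1.
SAMPLE = """<s>John <EVENT eid="e1" class="OCCURRENCE">taught</EVENT> on Monday but not on Tuesday.
<MAKEINSTANCE eiid="ei1" eventID="e1" pos="VERB" tense="PAST" aspect="NONE" polarity="POS"/>
<MAKEINSTANCE eiid="ei2" eventID="e1" pos="VERB" tense="PAST" aspect="NONE" polarity="NEG"/>
</s>"""

def instances_by_event(xml_text):
    """Map each event ID to a list of (instance ID, polarity) pairs."""
    root = ET.fromstring(xml_text)
    grouped = {event.get("eid"): [] for event in root.iter("EVENT")}
    for inst in root.iter("MAKEINSTANCE"):
        pair = (inst.get("eiid"), inst.get("polarity"))
        grouped.setdefault(inst.get("eventID"), []).append(pair)
    return grouped

print(instances_by_event(SAMPLE))  # {'e1': [('ei1', 'POS'), ('ei2', 'NEG')]}
```

Links would then refer to ei1 and ei2 rather than to e1, which is what lets the Monday and Tuesday relations differ.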
SIGNAL. The SIGNAL tag is used to annotate temporal function words such as "after," "during," and "when." These signals are then used in the representation of a temporal relationship.
The following three tags are link tags. They capture temporal, subordination, and aspectual relationships found in the text. These tags do not consume any actual text, but they do relate the three tag types above to each other.
TLINK. Temporal links are represented with a TLINK tag. A TLINK can temporally relate two temporal expressions, two event instances, or a temporal expression and an event instance. Along with an identification marker for each of these two elements, a relation type is given such as before, includes, or ended by. When a signal is present that helps to define the relationship, an ID for the SIGNAL is given as well.
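As a rough illustration of how a TLINK ties the other tags together, the sketch below reads each link as a (source, relation, target, signal) tuple. The fragment is hypothetical, and the attribute names (eventInstanceID, timeID, relatedToEventInstance, relatedToTime, signalID, relType) are assumed from our reading of TimeML 1.2.1.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment relating an event instance to a time via a signal.
# Attribute names are assumed from TimeML 1.2.1.
SAMPLE = """<s>John <EVENT eid="e1" class="OCCURRENCE">taught</EVENT>
<SIGNAL sid="s1">on</SIGNAL> <TIMEX3 tid="t1" type="DATE" value="2006-01-02">Monday</TIMEX3>.
<MAKEINSTANCE eiid="ei1" eventID="e1" pos="VERB" tense="PAST" aspect="NONE" polarity="POS"/>
<TLINK lid="l1" relType="IS_INCLUDED" eventInstanceID="ei1" relatedToTime="t1" signalID="s1"/>
</s>"""

def read_tlinks(xml_text):
    """Return a list of (source ID, relation type, target ID, signal ID or None)."""
    root = ET.fromstring(xml_text)
    links = []
    for link in root.iter("TLINK"):
        # Either end of the link may be an event instance or a temporal expression.
        source = link.get("eventInstanceID") or link.get("timeID")
        target = link.get("relatedToEventInstance") or link.get("relatedToTime")
        links.append((source, link.get("relType"), target, link.get("signalID")))
    return links

print(read_tlinks(SAMPLE))  # [('ei1', 'IS_INCLUDED', 't1', 's1')]
```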
SLINK. This tag is used to capture subordination relationships that involve event modality, evidentiality, and factuality. An SLINK includes an event instance ID for the subordinating event and an event instance ID for the subordinated event. Possible relation types for SLINK include modal, evidential, and factive. An SLINK will typically not include a signal ID unless it has the relation type conditional. Three specific EVENT classes interact with SLINK: reporting, i_state, and i_action.
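The note about signals suggests a simple consistency check: list the SLINKs that carry a signal ID without being conditional. The sketch below assumes the attribute names eventInstanceID, subordinatedEventInstance, signalID, and relType from our reading of TimeML 1.2.1, and it runs on a hypothetical fragment. Because the text says "typically," a hit is something to review, not necessarily an error.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment with three SLINKs; only l3 pairs a signal with a
# non-conditional relation. Attribute names are assumed from TimeML 1.2.1.
SAMPLE = """<s>
<SIGNAL sid="s1">if</SIGNAL>
<SLINK lid="l1" relType="EVIDENTIAL" eventInstanceID="ei1" subordinatedEventInstance="ei2"/>
<SLINK lid="l2" relType="CONDITIONAL" eventInstanceID="ei3" subordinatedEventInstance="ei4" signalID="s1"/>
<SLINK lid="l3" relType="MODAL" eventInstanceID="ei5" subordinatedEventInstance="ei6" signalID="s1"/>
</s>"""

def unexpected_signals(xml_text):
    """Return IDs of SLINKs that have a signal but are not conditional."""
    root = ET.fromstring(xml_text)
    return [
        link.get("lid")
        for link in root.iter("SLINK")
        if link.get("signalID") and link.get("relType") != "CONDITIONAL"
    ]

print(unexpected_signals(SAMPLE))  # ['l3']
```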
ALINK. An aspectual connection between two event instances is represented with ALINK. As with SLINK, this tag includes two event instance IDs, one that introduces the ALINK and one that is the event argument to that event. The introducing event has the class aspectual. Some possible relation types for ALINK are initiates, terminates, and continues.
TimeBank 1.2 contains 183 articles with just over 61,000 non-punctuation tokens. The count for each TimeML tag is listed below:
|EVENT ||7,935 |
|MAKEINSTANCE ||7,940 |
|TIMEX3 ||1,414 |
|SIGNAL ||688 |
|ALINK ||265 |
|SLINK ||2,932 |
|TLINK ||6,418 |
|Total ||27,592 |
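Counts like these could be reproduced by tallying tag names across the annotated files. The sketch below is a minimal version that works on XML strings: the sample fragment is hypothetical, and the function does not deal with locating or reading the corpus files. It also confirms that the published per-tag counts add up to the stated total.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TIMEML_TAGS = ("EVENT", "MAKEINSTANCE", "TIMEX3", "SIGNAL", "ALINK", "SLINK", "TLINK")

# Per-tag counts as published in the table above.
PUBLISHED = {
    "EVENT": 7935, "MAKEINSTANCE": 7940, "TIMEX3": 1414, "SIGNAL": 688,
    "ALINK": 265, "SLINK": 2932, "TLINK": 6418,
}

def count_tags(xml_texts):
    """Tally TimeML tags over an iterable of XML document strings."""
    counts = Counter()
    for text in xml_texts:
        root = ET.fromstring(text)
        counts.update(el.tag for el in root.iter() if el.tag in TIMEML_TAGS)
    return counts

# Hypothetical one-sentence document, used only to exercise the function.
SAMPLE = (
    '<doc>He <EVENT eid="e1" class="REPORTING">said</EVENT> so '
    '<SIGNAL sid="s1">after</SIGNAL> lunch.'
    '<MAKEINSTANCE eiid="ei1" eventID="e1"/></doc>'
)

print(dict(count_tags([SAMPLE])))
print(sum(PUBLISHED.values()))  # 27592
```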
Samples. For an example of the data in this corpus, please view the following samples.
Portions © 1998 American Broadcasting Corporation, © 1998 The Associated Press, © 1998 Cable News Network, LP, LLLP, © 1987-1989 Dow Jones & Company, Inc., © 1998 New York Times, © 1998 Public Radio International, © 2002-2006 Brandeis University, © 2006 Trustees of the University of Pennsylvania
The World is a co-production of Public Radio International and the British Broadcasting Corporation and is produced at WGBH Boston.