KAIROS Phase 1 Evaluation Source Data, Annotation, and Assessment
| Item Name: | KAIROS Phase 1 Evaluation Source Data, Annotation, and Assessment |
| Author(s): | Song Chen, Jennifer Tracey, Justin Mott, Ann Bies, Michael Arrigo, Christopher Caruso, David Graff, Stephanie Strassel |
| LDC Catalog No.: | LDC2026T07 |
| ISLRN: | 558-102-578-740-1 |
| DOI: | https://doi.org/10.35111/rfam-2766 |
| Release Date: | June 15, 2026 |
| Member Year(s): | 2026 |
| DCMI Type(s): | Image, MovingImage, Software, Sound, StillImage, Text |
| Data Source(s): | web collection |
| Project(s): | KAIROS |
| Application(s): | entity extraction, event detection, information extraction, knowledge representation |
| Language(s): | Spanish, English |
| Language ID(s): | spa, eng |
| License(s): | LDC User Agreement for Non-Members |
| Online Documentation: | LDC2026T07 Documents |
| Licensing Instructions: | Subscription & Standard Members, and Non-Members |
| Citation: | Chen, Song, et al. KAIROS Phase 1 Evaluation Source Data, Annotation, and Assessment LDC2026T07. Web Download. Philadelphia: Linguistic Data Consortium, 2026. |
| Related Works: | View |
Introduction
KAIROS Phase 1 Evaluation Source Data, Annotation, and Assessment was developed by the Linguistic Data Consortium (LDC). It contains the English and Spanish source data (text, video and images), manual annotations, reference knowledge graphs, the system output assessed during the evaluation, and human assessment results from the Phase 1 evaluation of the DARPA KAIROS Program.
The Phase 1 evaluation covered nine complex events (CEs) in the improvised explosive bombing scenario and two surprise complex events in the mass shooting scenario:
- ce1005: Sydney Aeroplane Bomb Plot, Australia, 2017
- ce1006: Stockholm Bombings, Sweden, 2010
- ce1007: Manchester Arena Bombing, England, 2017
- ce1008: Taxi Detonation, Canada, 2016
- ce1009: Spokane Bombing Attempt, Washington, 2011
- ce1010: Derry Bombing, Northern Ireland, 2019
- ce1011: Bogotá Police Academy Car Bombing, Colombia, January 2019
- ce1012: Kansas City Hospital Bombing, Missouri, 2020
- ce1013: Attempted Bombing in Moses Lake, Washington, 2018
- ce1020: El Paso Walmart Shooting, Texas, 2019
- ce1021: Orlando Nightclub Shooting, Florida, 2016
The DARPA KAIROS (Knowledge-directed Artificial Intelligence Reasoning Over Schemas) program aimed to build technology capable of understanding and reasoning about complex real-world events in order to provide actionable insights to end users. KAIROS systems utilized formal event representations in the form of schema libraries that specified the steps, preconditions and constraints for an open set of complex events; schemas were then used in combination with event extraction to characterize and make predictions about real-world events in a large multilingual, multimedia corpus. Each KAIROS evaluation focused on a real-world scenario and several real-world complex events within that scenario, along with the possibility of surprise complex events in different but related scenarios.
Data
Source data was collected from the web by LDC. A total of 139 root web pages were collected and processed, yielding 131 text data files, 1176 image files, and 27 video files. The evaluation source data for each complex event was an input data set consisting of 10-15 documents that included multimodal English and Spanish event-relevant and off-topic distractor documents. Manual annotation and assessment of event-relevant documents for 10 complex events are included in this release.
Scenario-relevant events and relations were labeled for each document to develop a structured representation of the temporally ordered events, relations and arguments that made up each complex event. A reference knowledge graph (Graph G) was developed for each complex event; systems were expected to match Graph G with a given schema library. Assessment data includes human assessment judgments and the system output that was manually assessed for the end-to-end evaluation task.
Source data is presented in various formats: .gif, .jpg, .ltf, .mp4, .png, .psm, and .svg. Annotations are presented as tab-separated files (.tab). Graph G data is presented in JSON format and in human-readable Excel (.xlsx) files. System output is presented in JSON format and as tab-separated files. A software tool is also included in this release to recreate original source data from the processed XML material.
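As a rough illustration of working with the two machine-readable formats named above, the following Python sketch reads a tab-separated annotation file and a JSON file using only the standard library. It assumes that each .tab file begins with a header row, which should be checked against the release documentation. The column names (`doc_id`, `event_id`, `type`), the JSON keys (`nodes`, `edges`), and the function names are hypothetical and are not taken from the release.

```python
import csv
import io
import json


def read_tab_annotations(handle):
    """Return the rows of a tab-separated annotation file as dictionaries.

    Assumes the first line is a header row (an assumption, not confirmed
    by the catalog description). Quote characters are kept literally.
    """
    return list(csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE))


def load_graph(handle):
    """Parse a Graph G or system output JSON file into Python objects."""
    return json.load(handle)


# Hypothetical in-memory samples; real files would be opened with
# open(path, encoding="utf-8", newline="") instead.
sample_tab = io.StringIO("doc_id\tevent_id\ttype\nD001\tE1\tAttack\n")
rows = read_tab_annotations(sample_tab)
print(rows[0]["event_id"])

sample_json = io.StringIO('{"nodes": [], "edges": []}')
graph = load_graph(sample_json)
print(sorted(graph))
```

Disabling quote handling with `csv.QUOTE_NONE` is a cautious choice for annotation text drawn from web documents, where stray quotation marks could otherwise cause fields to be merged.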
Samples
Please view these samples:
- Argument Annotations (.tab)
- Event Annotations (.tab)
- Relations Annotations (.tab)
- Temporal Annotations (.tab)
- Graph G (.json)
- Correctness Assessment
Sponsorship
KAIROS was sponsored by the Air Force Research Laboratory (AFRL) and the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR0011-19-S-0014.
Updates
No updates at this time.