CCGbank

Item Name: CCGbank
Author(s): Julia Hockenmaier, Mark Steedman
LDC Catalog No.: LDC2005T13
ISBN: 1-58563-340-2
ISLRN: 181-921-208-336-7
Release Date: May 15, 2005
Member Year(s): 2005
DCMI Type(s): Text
Data Source(s): newswire
Project(s): TIDES, GALE
Application(s): natural language processing, information detection, cross-lingual information retrieval, automatic content extraction
Language(s): English
Language ID(s): eng
License(s): LDC User Agreement for Non-Members
Online Documentation: LDC2005T13 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Hockenmaier, Julia, and Mark Steedman. CCGbank LDC2005T13. Web Download. Philadelphia: Linguistic Data Consortium, 2005.

Introduction

CCGbank is a translation of the Penn Treebank into a corpus of Combinatory Categorial Grammar (CCG) derivations. It pairs syntactic derivations with sets of word-word dependencies that approximate the underlying predicate-argument structure.
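
To make the idea of a derivation paired with word-word dependencies concrete, the following is a minimal toy sketch. It is not the CCGbank file format, lexicon, or tooling. It assumes only the two basic application rules of Combinatory Categorial Grammar, and every name in it (Atom, Functor, Node, leaf, combine) is hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Atom:
    """An atomic category such as S or NP."""
    name: str


@dataclass(frozen=True)
class Functor:
    """A complex category: result/arg seeks its argument to the right,
    result\\arg seeks it to the left."""
    result: "Category"
    slash: str
    arg: "Category"


Category = Union[Atom, Functor]


@dataclass(frozen=True)
class Node:
    """A derivation node: its category, its lexical head, and the
    (head, dependent) pairs collected so far."""
    cat: Category
    head: str
    deps: tuple = ()


def leaf(word: str, cat: Category) -> Node:
    return Node(cat, word)


def combine(left: Node, right: Node) -> Optional[Node]:
    """Apply forward or backward application; return None if neither fits."""
    # Forward application: X/Y  Y  =>  X
    if (isinstance(left.cat, Functor) and left.cat.slash == "/"
            and left.cat.arg == right.cat):
        deps = left.deps + right.deps + ((left.head, right.head),)
        return Node(left.cat.result, left.head, deps)
    # Backward application: Y  X\Y  =>  X
    if (isinstance(right.cat, Functor) and right.cat.slash == "\\"
            and right.cat.arg == left.cat):
        deps = left.deps + right.deps + ((right.head, left.head),)
        return Node(right.cat.result, right.head, deps)
    return None


S, NP = Atom("S"), Atom("NP")
TV = Functor(Functor(S, "\\", NP), "/", NP)  # transitive verb: (S\NP)/NP

vp = combine(leaf("sees", TV), leaf("Mary", NP))   # category S\NP
sentence = combine(leaf("John", NP), vp)           # category S
print(sentence.cat, sentence.deps)
```

In this toy setting, each application step records one dependency between the functor's head word and its argument's head word, which is the sense in which a derivation yields an approximation of predicate-argument structure. The actual corpus uses a much richer category inventory and additional combinatory rules.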

Data

CCGbank covers 99.44% of the sentences in the Penn Treebank. In translating them, it corrects a number of inconsistencies and errors in the original annotation.

Samples

For an example of this corpus, please examine this sample.

Update

The current version, 1.1, is a bug-fix release that supersedes the earlier package. It is available for download.
