CCGbank is a translation of the Penn Treebank into a corpus of Combinatory Categorial Grammar derivations. It pairs syntactic derivations with sets of word-word dependencies which approximate the underlying predicate-argument structure.
CCGbank covers 99.44% of the sentences in the Penn Treebank and corrects a number of inconsistencies and errors in the original annotation.
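To make the idea of a derivation paired with word-word dependencies concrete, the following is a minimal Python sketch of a toy CCG derivation for the sentence "John likes Mary". It is an illustration only, under several assumptions: the class and function names (Atom, Functor, Node, combine) are hypothetical, only forward and backward application are implemented (CCG also has composition and type-raising), and the dependency is simply recorded from the functor word to its argument word, which is a simplification. It does not reproduce the CCGbank file format or its dependency format.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class Atom:
    """An atomic category such as S or NP."""
    name: str

    def __str__(self) -> str:
        return self.name


@dataclass(frozen=True)
class Functor:
    """A complex category: result/arg looks right, result\\arg looks left."""
    result: "Category"
    slash: str  # "/" or "\\"
    arg: "Category"

    def __str__(self) -> str:
        return f"({self.result}{self.slash}{self.arg})"


Category = Union[Atom, Functor]


@dataclass(frozen=True)
class Node:
    """A derivation node: its category plus the word that heads it."""
    cat: Category
    head: str


def combine(left: Node, right: Node,
            deps: List[Tuple[str, str]]) -> Optional[Node]:
    """Try function application; record a (functor word, argument word) pair."""
    # Forward application: X/Y  Y  =>  X
    if (isinstance(left.cat, Functor) and left.cat.slash == "/"
            and left.cat.arg == right.cat):
        deps.append((left.head, right.head))
        return Node(left.cat.result, left.head)
    # Backward application: Y  X\Y  =>  X
    if (isinstance(right.cat, Functor) and right.cat.slash == "\\"
            and right.cat.arg == left.cat):
        deps.append((right.head, left.head))
        return Node(right.cat.result, right.head)
    return None  # the two categories do not combine by application


NP, S = Atom("NP"), Atom("S")
TRANSITIVE = Functor(Functor(S, "\\", NP), "/", NP)  # (S\NP)/NP

john, likes, mary = Node(NP, "John"), Node(TRANSITIVE, "likes"), Node(NP, "Mary")

dependencies: List[Tuple[str, str]] = []
verb_phrase = combine(likes, mary, dependencies)    # likes Mary  => S\NP
sentence = combine(john, verb_phrase, dependencies)  # John + VP   => S

print(sentence.cat, dependencies)
```

In this sketch the derivation ends in category S, and the collected pairs link "likes" to each of its two arguments, which is the kind of predicate-argument information the dependencies are meant to approximate.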
A sample of the corpus is available for inspection.
The current version, 1.1, is a bug-fix release that supersedes the previous package. It is available for download.
Portions © 2005 Julia Hockenmaier and Mark Steedman, © 2005 Trustees of the University of Pennsylvania