BLLIP 1987-89 WSJ Corpus Release 1

Item Name:	BLLIP 1987-89 WSJ Corpus Release 1
Author(s):	Eugene Charniak, Don Blaheta, Niyu Ge, Keith Hall, John Hale, Mark Johnson
LDC Catalog No.:	LDC2000T43
ISBN:	1-58563-165-5
ISLRN:	233-420-716-637-7
DOI:	https://doi.org/10.35111/fwew-da58
Member Year(s):	2000
DCMI Type(s):	Text
Data Source(s):	newswire
Project(s):	TIDES, GALE
Application(s):	tagging, parsing, natural language processing
Language(s):	English
Language ID(s):	eng
License(s):	BLLIP 1987-89 WSJ Corpus Release 1 License Agreement
Online Documentation:	LDC2000T43 Documents
Licensing Instructions:	Subscription & Standard Members, and Non-Members
Citation:	Charniak, Eugene, et al. BLLIP 1987-89 WSJ Corpus Release 1 LDC2000T43. Web Download. Philadelphia: Linguistic Data Consortium, 2000.
Related Works:	isAnnotationOf: LDC93T1 ACL/DCI; isSimilarWith: LDC95T7 Treebank-2; LDC99T42 Treebank-3; LDC2008T13 BLLIP North American News Text, Complete; LDC2008T14 BLLIP North American News Text, General Release

Introduction

Brown Laboratory for Linguistic Information Processing (BLLIP) 1987-89 WSJ Corpus Release 1 contains a complete, Treebank-style part-of-speech (POS) tagged and parsed version of the three-year Wall Street Journal (WSJ) collection from ACL/DCI (LDC93T1), approximately 30 million words. The annotation was performed using statistical methods developed by BLLIP researchers Eugene Charniak, Don Blaheta, Niyu Ge, Keith Hall, John Hale and Mark Johnson.

This corpus both overlaps and supplements the million-word Penn Treebank (PTB) collection of parsed and POS-tagged WSJ texts.

Data

The PTB project selected 2,499 stories from the three-year WSJ collection of 98,732 stories for syntactic annotation. These 2,499 stories are distributed in Treebank-2 (LDC95T7) and Treebank-3 (LDC99T42), both of which include the raw text of each story.

Updates

There are no updates at this time.
