North American News Text Supplement
Item Name: | North American News Text Supplement |
Author(s): | Robert MacIntyre |
LDC Catalog No.: | LDC98T30 |
ISBN: | 1-58563-137-X |
ISLRN: | 686-158-826-526-6 |
DOI: | https://doi.org/10.35111/j208-rp93 |
Member Year(s): | 1998 |
DCMI Type(s): | Text |
Data Source(s): | newswire |
Project(s): | TIDES, Hub4, GALE, EARS |
Application(s): | information retrieval, language modeling |
Language(s): | English |
Language ID(s): | eng |
License(s): | North American News Text Supplement Agreement |
Online Documentation: | LDC98T30 Documents |
Licensing Instructions: | Subscription & Standard Members, and Non-Members |
Citation: | MacIntyre, Robert. North American News Text Supplement LDC98T30. Web Download. Philadelphia: Linguistic Data Consortium, 1998. |
Introduction
This release of North American News Text provides a supplement to the LDC's earlier publication of similar materials (LDC95T21: North American News Text Corpus). The same TIPSTER-style SGML markup is used in formatting the data. The data sources are as follows:
Source                                  Dates Covered   Approx. # Words (Millions)
----------------------------------------------------------------------------------
Los Angeles Times & Washington Post     09/97-04/98      11
New York Times News Syndicate           01/97-04/98     116
Associated Press World Stream English   11/94-04/98     143
----------------------------------------------------------------------------------
The previous North American News release included prior materials from both the LA Times/Washington Post and the New York Times; this supplement provides the continuation of those sources.
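As a rough illustration of working with TIPSTER-style SGML, the sketch below pulls the document ID and body text out of each story and counts words. The tag names (<DOC>, <DOCNO>, <TEXT>, <p>) are assumptions based on common TIPSTER conventions and the sample story is invented; check the corpus documentation for the actual markup before relying on it.

```python
import re
from typing import Iterator, NamedTuple


class Doc(NamedTuple):
    docno: str
    text: str


# Assumed tag names (common TIPSTER conventions); not confirmed for this corpus.
DOC_RE = re.compile(r"<DOC\b[^>]*>(.*?)</DOC>", re.DOTALL | re.IGNORECASE)
DOCNO_RE = re.compile(r"<DOCNO>(.*?)</DOCNO>", re.DOTALL | re.IGNORECASE)
TEXT_RE = re.compile(r"<TEXT>(.*?)</TEXT>", re.DOTALL | re.IGNORECASE)
TAG_RE = re.compile(r"<[^>]+>")


def iter_docs(sgml: str) -> Iterator[Doc]:
    """Yield one Doc per <DOC> element, with inner tags stripped from the text."""
    for match in DOC_RE.finditer(sgml):
        body = match.group(1)
        docno = DOCNO_RE.search(body)
        text = TEXT_RE.search(body)
        if text is None:
            continue  # skip stories without a body
        plain = TAG_RE.sub(" ", text.group(1))  # drop paragraph tags and the like
        yield Doc(
            docno=docno.group(1).strip() if docno else "",
            text=" ".join(plain.split()),  # collapse runs of whitespace
        )


def count_words(sgml: str) -> int:
    """Whitespace-delimited word count over all story bodies."""
    return sum(len(doc.text.split()) for doc in iter_docs(sgml))


# Invented sample; the real files may carry more header fields.
SAMPLE = """<DOC>
<DOCNO> EXAMPLE-0001 </DOCNO>
<TEXT>
<p>
First paragraph of a made-up story.
<p>
Second paragraph.
</TEXT>
</DOC>
"""

if __name__ == "__main__":
    for doc in iter_docs(SAMPLE):
        print(doc.docno, len(doc.text.split()))
```

A regular-expression pass like this tolerates unclosed paragraph tags, which a strict XML parser would reject. Whitespace word counts are only approximate and need not match the figures in the table above.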
Data
The LDC has been collecting the Associated Press Worldstream newswire service in six languages since 1994. This is the first release of the English-language portion of this service. The material in this set is typically NOT North American in origin: the reporters who provide the stories may or may not be American-born, but the locations and topics covered are much more heavily international than those of the North American wire services. Reports from Asia, Africa and Europe that show up only rarely or not at all in North American newspapers are found here, including political, financial and sports stories that are presumably geared to English-speaking readers in those parts of the world.
This release, when combined with the LDC's earlier NA News Text Corpus, constitutes all the English-language newswire text collected by the LDC between January 1994 and April 1998, inclusive.
Updates
There are no updates at this time.
Additional Licensing Instructions
This 'members-only' corpus is available to current members, who can request the data at the listed reduced-license fee. Contact ldc@ldc.upenn.edu for information about becoming a member.