Japanese Business News Text
Item Name: | Japanese Business News Text |
Author(s): | David Graff, Zhibiao Wu |
LDC Catalog No.: | LDC95T8 |
ISBN: | 1-58563-049-7 |
ISLRN: | 177-142-310-728-1 |
DOI: | https://doi.org/10.35111/1psy-zz16 |
Member Year(s): | 1995 |
DCMI Type(s): | Text |
Data Source(s): | newswire |
Project(s): | TIDES, GALE |
Application(s): | language modeling, information retrieval |
Language(s): | Japanese |
Language ID(s): | jpn |
License(s): | Japanese Business News Text Individual; Japanese Business News Text Organization; Nihon Keizai Shimbun Agreement |
Online Documentation: | LDC95T8 Documents |
Licensing Instructions: | Subscription & Standard Members, and Non-Members |
Citation: | Graff, David, and Zhibiao Wu. Japanese Business News Text LDC95T8. Web Download. Philadelphia: Linguistic Data Consortium, 1995. |
The Linguistic Data Consortium announces the availability of a Japanese language text corpus composed of business and financial news from two sources:
- Approximately 30 million words of text come from the morning edition of Nihon Keizai Shimbun, the largest Japanese financial daily newspaper; this year's release covers all text published during 1994.
The data was received at the LDC on nine-track magnetic tape. The original character encoding was EBCDIC; it has been converted to EUC, which the LDC has chosen as its standard for Japanese.
- A smaller part of the corpus comes from the Japanese Language Service marketed by Dow Jones Telerate. This is a financial newswire produced by Kyodo News Service; its recipients are primarily managers of Japanese-owned corporations and Japanese employees working in North American brokerage houses, banks, and similar institutions. The text is received at the LDC via a digital transmission service installed by Telerate; the LDC wrote special software to poll a central database and download articles individually. The character encoding is EUC.
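Since both sources are distributed in EUC, users working with Unicode tools may want to transcode the text first. The following is a minimal sketch, assuming the files use the EUC-JP variant that Python's built-in `euc_jp` codec handles; the function name and sample string are illustrative and are not part of the corpus documentation.

```python
def euc_jp_to_utf8(raw: bytes) -> bytes:
    """Decode EUC-JP bytes and re-encode them as UTF-8.

    Assumption: the input is valid EUC-JP. Invalid byte sequences
    raise UnicodeDecodeError rather than being silently replaced.
    """
    text = raw.decode("euc_jp")
    return text.encode("utf-8")


# Illustrative sample only (not taken from the corpus):
# the newspaper's name, encoded as EUC-JP.
sample = "日本経済新聞".encode("euc_jp")

converted = euc_jp_to_utf8(sample)
print(converted.decode("utf-8"))
```

Raising on invalid bytes is a deliberate choice here: it surfaces any article whose encoding differs from the assumption instead of corrupting it quietly.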
This corpus is available to LDC members only.