Japanese Business News Text
|Item Name:||Japanese Business News Text|
|Author(s):||David Graff, Zhibiao Wu|
|LDC Catalog No.:||LDC95T8|
|Application(s):||language modeling, information retrieval|
Japanese Business News Text Individual
Japanese Business News Text Organization
Nihon Keizai Shimbun Agreement
|Online Documentation:||LDC95T8 Documents|
|Licensing Instructions:||Subscription & Standard Members, and Non-Members|
|Citation:||Graff, David, and Zhibiao Wu. Japanese Business News Text LDC95T8. Web Download. Philadelphia: Linguistic Data Consortium, 1995.|
The Linguistic Data Consortium announces the availability of a Japanese language text corpus composed of business and financial news from two sources:
- Approximately 30 million words of text come from the morning edition of Nihon Keizai Shimbun, the largest Japanese financial daily newspaper; this year's release covers all text published during 1994.
The data was received at the LDC on nine-track magnetic tape; the original character encoding was EBCDIC, and the text was converted to EUC, which the LDC has chosen as its standard encoding for Japanese.
- A smaller part of the corpus comes from Dow Jones Telerate, which markets its Japanese Language Service. This is a financial newswire produced by Kyodo News Service; its recipients are primarily managers of Japanese-owned corporations and Japanese employees working in North American brokerage houses, banks, and similar institutions. The text is received at the LDC via a digital transmission service installed by Telerate; the LDC wrote special software to poll a central database and download articles individually. The character encoding is EUC.
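Because both sources are distributed in EUC, users working with Unicode-based tools will likely need to decode the text first. The following is a minimal sketch, not part of the corpus documentation: it assumes that "EUC" here means EUC-JP as implemented by Python's built-in `euc_jp` codec, and the function names are hypothetical.

```python
def decode_euc_jp(raw: bytes) -> str:
    """Decode EUC-JP bytes to a Python string.

    Assumption: the corpus bytes are EUC-JP. Undecodable bytes are
    replaced with U+FFFD rather than raising an error.
    """
    return raw.decode("euc_jp", errors="replace")


def euc_jp_to_utf8(raw: bytes) -> bytes:
    """Re-encode EUC-JP bytes as UTF-8 for downstream tools."""
    return decode_euc_jp(raw).encode("utf-8")


# Illustrative round trip on a short in-memory sample (not corpus data).
sample = "日本経済新聞".encode("euc_jp")
text = decode_euc_jp(sample)
utf8_bytes = euc_jp_to_utf8(sample)
```

Using `errors="replace"` keeps a long batch conversion from stopping on a stray byte; switching to `errors="strict"` would be the safer choice when verifying that a file is cleanly encoded.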
This corpus is available to LDC members only.