Xinhua News Agency
Availability: CD-ROM
Data type: Text
Text type: Journalistic (newswire)
Domain(s): International News
Language: Mandarin Chinese
General Description:
The xinhua/ directory contains newswire articles from the Xinhua News
Agency, the official newswire service of the government of the People's
Republic of China. The data was collected via telephone/modem at the
Linguistic Data Consortium (LDC).
Publisher and place of publication: Xinhua News Agency, Beijing, China
Collector of Data: Linguistic Data Consortium
Collection time span: 1994-1996
Description of file organization: three files per month.
For example, xh9602_2 contains Xinhua newswire service ("xh") articles from
the second part of February 1996 (Feb. 11-19). Exception: xh96_567
contains all articles from May 6 to July 18, 1996 (96/5/6 to 96/7/18).
There may also be some mixing and overlap among the other files due to
reception problems.
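The naming scheme can be checked mechanically. The following is a minimal
sketch, not part of the distribution: the function name parse_name and the
regular expression are hypothetical, and it assumes file names carry no
extension and that two-digit years fall in the 1900s, as the 1994-1996
collection span suggests. Names that do not fit the regular pattern, such
as xh96_567, are reported as None rather than guessed at.

```python
import re

# Assumed regular pattern: "xh" + two-digit year + two-digit month + "_" + part (1-3).
NAME_RE = re.compile(r"^xh(\d{2})(\d{2})_([123])$")

def parse_name(name):
    """Return (year, month, part) for a regular file name, or None otherwise."""
    m = NAME_RE.match(name)
    if m is None:
        return None  # irregular name, e.g. the exceptional file xh96_567
    yy, mm, part = (int(g) for g in m.groups())
    if not 1 <= mm <= 12:
        return None  # month field out of range
    return 1900 + yy, mm, part  # assumption: all years are 19YY

if __name__ == "__main__":
    for name in ("xh9602_2", "xh96_567"):
        print(name, parse_name(name))
```

Because some mixing and overlap among files is possible, the parsed month
and part should be treated as a label for the file, not as a guarantee
about the date of every article inside it.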
Number of files: 60
Total size: 62 megabytes (about 25 million text characters: 2.5% ASCII,
97.5% GB-encoded 16-bit)
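As an illustration of the character mix, the sketch below decodes a byte
string and counts one-byte ASCII characters separately from two-byte GB
characters. It assumes that "GB-encoded" means GB 2312 in its usual
EUC-CN byte form, which Python exposes as the "gb2312" codec; the document
does not name the exact variant, so this is an assumption. The function
name char_mix is hypothetical.

```python
def char_mix(raw):
    """Return (ascii_count, gb_count) for a byte string of GB-encoded text."""
    # Assumption: the data is GB 2312 (EUC-CN). Undecodable bytes are
    # replaced rather than raising, since reception errors may occur.
    text = raw.decode("gb2312", errors="replace")
    ascii_count = sum(1 for ch in text if ord(ch) < 128)
    return ascii_count, len(text) - ascii_count

if __name__ == "__main__":
    # Three GB characters followed by a space and seven ASCII letters.
    sample = "新华社 Beijing".encode("gb2312")
    print(len(sample), char_mix(sample))
```

Each GB character occupies two bytes and each ASCII character one, so
byte counts and character counts differ; the sample above is 14 bytes but
11 characters.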
Tagging description:
The format uses a labeled bracketing, expressed in the style of SGML
(Standard Generalized Markup Language). Each article is enclosed in
...
and