Hong Kong Hansards Parallel Text

Item Name: Hong Kong Hansards Parallel Text
Author(s): Xiaoyi Ma
LDC Catalog No.: LDC2000T50
ISBN: 1-58563-175-2
ISLRN: 272-276-125-586-5
Member Year(s): 2000
DCMI Type(s): Text
Data Source(s): government documents
Project(s): TIDES, GALE
Application(s): machine translation
Language(s): English, Chinese
Language ID(s): eng, zho
License(s): LDC User Agreement for Non-Members
Online Documentation: LDC2000T50 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Ma, Xiaoyi. Hong Kong Hansards Parallel Text LDC2000T50. Web Download. Philadelphia: Linguistic Data Consortium, 2000.


This publication contains the Hong Kong Special Administrative Region (HKSAR) Hansards Corpus, produced by the Linguistic Data Consortium (LDC), catalog number LDC2000T50, ISBN 1-58563-175-2. The corpus contains excerpts from the Official Record of Proceedings of the Legislative Council of the HKSAR from October 1995 to April 2000.

We wish to thank the Hong Kong Special Administrative Region of the People's Republic of China for granting the LDC permission to distribute this data to the research community.

The Legislative Council normally meets every Wednesday afternoon in the Chamber of the Legislative Council Building. Business includes:

discussion of subsidiary legislation and other papers; reports and addresses; statements; questions; the three readings of bills; and motion debates.

From time to time, the Chief Executive attends a special Council meeting to brief Members on policy issues and to answer questions from Members. All Council meetings are open to the public. The proceedings of the meetings are recorded verbatim in the Official Record of Proceedings of the Legislative Council (Hansard).

The record of proceedings is first produced in the original language used by each speaker (Floor Version). It is then translated into separate English and Chinese versions.


This corpus contains excerpts from the official record of meetings from October 1995 to April 2000. There are 11.9 million English words and 18.15 million Chinese characters.

There are 388 files in the data/ subdirectory of this corpus: half (194 files) are in English, in the data/english/ subdirectory, and half (194 files) are in Chinese, in the data/chinese/ subdirectory. Data file names are in the form YYYYMMDD_[ce].doc, where YYYYMMDD indicates the date of the meeting, c=Chinese and e=English. As an example of the text in this corpus, the Chinese sample is part of the Chinese-language record of the meeting held on May 24, 1997. The parallel English text is in the English sample.
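The naming scheme above is enough to pair each English file with its Chinese counterpart by meeting date. The following is a minimal sketch in Python, not part of the corpus distribution: the function names parse_name and pair_files are hypothetical, and the directory layout is assumed to be exactly data/english/ and data/chinese/ as described. It only inspects file names and makes no assumption about the internal format of the .doc files.

```python
import re
from datetime import datetime
from pathlib import Path

# File names of the form YYYYMMDD_[ce].doc, as described above.
NAME_RE = re.compile(r"(\d{8})_([ce])\.doc")


def parse_name(filename):
    """Return (meeting_date, language) for a corpus file name, or None if it does not fit the scheme."""
    m = NAME_RE.fullmatch(filename)
    if m is None:
        return None
    try:
        meeting = datetime.strptime(m.group(1), "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a valid calendar date.
        return None
    language = "chinese" if m.group(2) == "c" else "english"
    return meeting, language


def pair_files(data_root):
    """Map each meeting date to a dict {'english': Path, 'chinese': Path} of the files found."""
    pairs = {}
    for sub in ("english", "chinese"):
        for path in sorted(Path(data_root, sub).glob("*.doc")):
            parsed = parse_name(path.name)
            if parsed is None:
                continue
            meeting, language = parsed
            pairs.setdefault(meeting, {})[language] = path
    return pairs
```

Calling pair_files("data") on an unpacked copy should, if the description above holds, return one entry per meeting date with both an 'english' and a 'chinese' path; entries with only one of the two would indicate a missing counterpart.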

Copying and Distribution

Permission has been granted to the Linguistic Data Consortium to make and distribute copies of the laws, press releases and news of the Hong Kong Special Administrative Region, provided this copyright notice and permission notice are distributed with all copies.

Permission has been given to reproduce the laws, press releases, and/or news articles from the Hong Kong Special Administrative Region Government website for research, education, and technology development.


There are no updates at this time.


The Reduced Licensing Fee for this corpus is US$150.
