Hong Kong Hansards Parallel Text


Item Name: Hong Kong Hansards Parallel Text
Authors: Xiaoyi Ma
LDC Catalog No.: LDC2000T50
ISBN: 1-58563-175-2
Data Type: text
Data Source(s): government documents
Project(s): GALE, TIDES
Application(s): machine translation
Language(s): Chinese, English
Language ID(s): eng, zho
Distribution: 1 CD
Member fee: $0 for 2000 members
Non-member Fee: N/A (Members Only)
Reduced-License Fee: N/A
Extra-Copy Fee: US $150.00
Online documentation: yes
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Xiaoyi Ma. 2000. Hong Kong Hansards Parallel Text. Linguistic Data Consortium, Philadelphia.

Introduction

This publication contains the Hong Kong Special Administrative Region (HKSAR) Hansards Corpus, produced by the Linguistic Data Consortium (LDC), catalog number LDC2000T50, ISBN 1-58563-175-2. The corpus contains excerpts from the Official Record of Proceedings of the Legislative Council of the HKSAR from October 1995 to April 2000.

We wish to thank the Hong Kong Special Administrative Region of the People's Republic of China for granting the LDC permission to distribute this data to the research community.

The Legislative Council normally meets every Wednesday afternoon in the Chamber of the Legislative Council Building. Business includes:

discussion of subsidiary legislation and other papers
reports and addresses
statements
questions
the three readings of bills
motion debates

From time to time, the Chief Executive attends a special Council meeting to brief Members on policy issues and to answer questions from Members. All Council meetings are open to the public. The proceedings of the meetings are recorded verbatim in the Official Record of Proceedings of the Legislative Council (Hansard).

The record of proceedings is first produced in the original language used by each speaker (Floor Version). It is then translated to produce separate English and Chinese versions.

Data

This corpus contains excerpts from the official record of meetings from October 1995 to April 2000. There are 11.9 million English words and 18.15 million Chinese characters.

There are 388 files in the data/ subdirectory of this corpus: 194 in English in the data/english/ subdirectory and 194 in Chinese in the data/chinese/ subdirectory. Data file names take the form YYYYMMDD_[ce].doc, where YYYYMMDD is the date of the meeting, c indicates Chinese and e indicates English. As an example of the text in this corpus, the Chinese sample is part of the Chinese-language record of the meeting held on May 24, 1997; the parallel English text is in the English sample.
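The naming scheme above makes it possible to pair each Chinese file with its English counterpart by meeting date. The following Python sketch illustrates one way to do so, assuming the directory layout is exactly as described (data/chinese/ and data/english/, each holding YYYYMMDD_[ce].doc files). The function names parse_name and pair_files are hypothetical and are not part of the corpus distribution. The sketch only matches file paths; it does not open the files, and it makes no assumption about their encoding or internal format.

```python
import re
from datetime import date, datetime
from pathlib import Path
from typing import Dict, Optional, Tuple

# Documented naming scheme: YYYYMMDD_[ce].doc (c = Chinese, e = English).
NAME_RE = re.compile(r"^(\d{8})_([ce])\.doc$")


def parse_name(filename: str) -> Optional[Tuple[date, str]]:
    """Return (meeting_date, language) for a corpus file name, or None
    if the name does not follow the documented scheme."""
    match = NAME_RE.match(filename)
    if match is None:
        return None
    try:
        meeting_date = datetime.strptime(match.group(1), "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a valid calendar date.
        return None
    language = "chinese" if match.group(2) == "c" else "english"
    return meeting_date, language


def pair_files(data_dir: str) -> Dict[date, Tuple[Path, Path]]:
    """Map each meeting date to its (chinese_path, english_path) pair.

    Dates that have a file in only one language are skipped.
    """
    root = Path(data_dir)
    found: Dict[date, Dict[str, Path]] = {}
    for language in ("chinese", "english"):
        for path in sorted((root / language).glob("*.doc")):
            parsed = parse_name(path.name)
            # Ignore files whose suffix letter disagrees with their directory.
            if parsed is None or parsed[1] != language:
                continue
            found.setdefault(parsed[0], {})[language] = path
    return {
        day: (paths["chinese"], paths["english"])
        for day, paths in sorted(found.items())
        if len(paths) == 2
    }
```

Calling pair_files("data") from the corpus root would return one entry per meeting date for which both language versions are present; if the file counts above hold and every file has a counterpart, that would be 194 pairs.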

Copying and Distribution

Permission has been granted to the Linguistic Data Consortium to make and distribute copies of the laws, press releases and news of the Hong Kong Special Administrative Region, provided that this copyright notice and permission notice are distributed with all copies.

Permission has been given to reproduce the laws, press releases, and/or news articles from the Hong Kong Special Administrative Region Government website for research, education, and technology development.

Updates

There are no updates at this time.

Copyright

Portions © 1995-2000 The Government of the Hong Kong Special Administrative Region, © 2000 Trustees of the University of Pennsylvania

Pricing

The Reduced Licensing Fee for this corpus is US$150.