Korean Treebank Annotations Version 2.0

Item Name:	Korean Treebank Annotations Version 2.0
Author(s):	Na-Rae Han, Shijong Ryu, Sook-Hee Chae, Seung-yun Yang, Seunghun Lee, Martha Palmer
LDC Catalog No.:	LDC2006T09
ISBN:	1-58563-381-X
ISLRN:	365-025-522-700-1
DOI:	https://doi.org/10.35111/02nk-p662
Release Date:	April 17, 2006
Member Year(s):	2006
DCMI Type(s):	Text
Data Source(s):	newswire
Application(s):	automatic content extraction, discourse analysis, information detection, information extraction, morphology learning, parsing, part of speech tagging, syntactic parsing
Language(s):	Korean
Language ID(s):	kor
License(s):	Korean Treebank Annotations Version 2.0 Agreement
Licensing Instructions:	Subscription & Standard Members, and Non-Members
Citation:	Han, Na-Rae, et al. Korean Treebank Annotations Version 2.0 LDC2006T09. Web Download. Philadelphia: Linguistic Data Consortium, 2006.
Related Works:	isVersionOf LDC2002T26 Korean English Treebank Annotations; isAnnotationOf LDC2000T45 Korean Newswire; hasAnnotation LDC2006T03 Korean Propbank, LDC2023T05 Penn Korean Universal Dependency Treebank; isSimilarWith LDC2004T03 Morphologically Annotated Korean Text

Introduction

The Korean Treebank Annotations Version 2.0 was developed by the Linguistic Data Consortium (LDC) and contains 647 articles of Korean newswire text annotated with morphological and syntactic information. It is an extension of the Korean English Treebank Annotations (LDC2002T26).

Data

The original texts for the Korean Treebank 2.0 were selected from the LDC corpus Korean Newswire (LDC2000T45), which is a collection of Korean Press Agency news articles from June 2, 1994 to March 20, 2000. Korean Treebank 2.0 is based on the March 2000 portion of the corpus. The articles were collected by means of a continuous feed from the news provider over a modem connection.

The annotated corpus supports a range of uses, including training morphological analyzers, part-of-speech taggers, and syntactic parsers.

The text is encoded as KSC-5601 (EUC-KR). Version 1.1 of the treebank is included in this release.
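Because the files are EUC-KR rather than UTF-8, most modern tools need the encoding stated explicitly when reading them. The following is a minimal Python sketch, assuming only that a treebank file is plain text in EUC-KR; the helper names `read_euc_kr` and `convert_to_utf8` and any file paths are hypothetical and do not come from the release itself.

```python
from pathlib import Path


def read_euc_kr(path, errors="strict"):
    """Read an EUC-KR encoded file and return its contents as a str.

    Pass errors="replace" to substitute undecodable bytes instead of
    raising UnicodeDecodeError.
    """
    return Path(path).read_text(encoding="euc_kr", errors=errors)


def convert_to_utf8(src, dst):
    """Write a UTF-8 copy of an EUC-KR encoded file (hypothetical helper)."""
    Path(dst).write_text(read_euc_kr(src), encoding="utf-8")
```

Converting once to UTF-8 and working from the copies is usually simpler than passing the encoding to every downstream tool, but either approach works.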
