
SemTransCNC

Item Name:	SemTransCNC
Author(s):	Shichang Wang, Chu-Ren Huang, Yao Yao, Angel Chan
LDC Catalog No.:	LDC2020T12
ISBN:	1-58563-931-1
ISLRN:	835-247-023-332-5
DOI:	https://doi.org/10.35111/vreb-7n07
Release Date:	June 22, 2020
Member Year(s):	2020
DCMI Type(s):	Text
Data Source(s):	web collection, newswire, essays, journal articles, non-fiction, fiction, microphone speech, journal entries, meeting speech, microphone conversation, correspondence, transcribed speech, dictionaries
Application(s):	semantic role labelling
Language(s):	Mandarin Chinese
Language ID(s):	cmn
License(s):	SemTransCNC Agreement
Online Documentation:	LDC2020T12 Documents
Licensing Instructions:	Subscription & Standard Members, and Non-Members
Citation:	Wang, Shichang, et al. SemTransCNC LDC2020T12. Web Download. Philadelphia: Linguistic Data Consortium, 2020.
Related Works:	isAnnotationOf Sinica Corpus https://ckip.iis.sinica.edu.tw/project/sinicacorpus/

Introduction

SemTransCNC was developed by The Hong Kong Polytechnic University. It is a semantic transparency dataset of Chinese nominal compounds, built through a series of crowd-based experiments.

Nominal compounds were selected from the Sinica Corpus and a modern Chinese lexicon. Crowd workers answered questionnaires that collected demographic information and included questions about the Chinese language. To assess the overall semantic transparency (OST) of the selected compounds, they answered the question: "How is the sum of the meanings of A and B similar to the meaning of AB?" To assess constituent semantic transparency (CST), they were asked to describe the similarity of the meaning of A alone to its meaning in AB, and of the meaning of B alone to its meaning in AB.

Data

SemTransCNC contains OST and CST data for 1,176 dimorphemic Chinese nominal compounds. Each compound is composed of free morphemes and has a mid-range frequency.

The text data is presented as a UTF-8 encoded, comma-separated text file.
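As a rough illustration of working with a UTF-8 encoded, comma-separated file of this kind, the Python sketch below reads rows by header name and averages one numeric column. The column names "compound" and "ost", the placeholder values, and the file name mentioned in the comments are assumptions made for the example. They are not taken from the release, so check the LDC2020T12 documentation for the actual layout.

```python
import csv
import io
from statistics import mean


def read_rows(handle):
    """Read CSV rows into a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle))


def column_mean(rows, column):
    """Average a numeric column, skipping empty cells."""
    values = [float(row[column]) for row in rows if row[column] != ""]
    return mean(values)


# Hypothetical stand-in for the real file: the header names and the
# values here are invented for illustration only.
SAMPLE = "compound,ost\nAB,4.0\nCD,2.0\n"

rows = read_rows(io.StringIO(SAMPLE))
print(len(rows), column_mean(rows, "ost"))

# For a file on disk (the file name is an assumption), open it explicitly
# as UTF-8 and pass newline="" as the csv module recommends:
#
#     with open("semtranscnc.csv", encoding="utf-8", newline="") as f:
#         rows = read_rows(f)
```

Reading by header name rather than by column position keeps the sketch independent of column order, which is useful when the exact layout has not been confirmed.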

Samples

Please view this text sample (CSV).

Updates

None at this time.
