EVALution

Item Name:	EVALution
Author(s):	Enrico Santus, Hongchao Liu, Chu-Ren Huang
LDC Catalog No.:	LDC2020T06
ISBN:	1-58563-921-4
ISLRN:	490-239-801-102-1
DOI:	https://doi.org/10.35111/4h9q-yt20
Release Date:	March 13, 2020
Member Year(s):	2020
DCMI Type(s):	Text
Data Source(s):	web collection
Application(s):	semantic role labelling
Language(s):	Mandarin Chinese, English
Language ID(s):	cmn, eng
License(s):	EVALution Agreement
Online Documentation:	LDC2020T06 Documents
Licensing Instructions:	Subscription & Standard Members, and Non-Members
Citation:	Santus, Enrico, Hongchao Liu, and Chu-Ren Huang. EVALution LDC2020T06. Web Download. Philadelphia: Linguistic Data Consortium, 2020.

Introduction

EVALution was developed by The Hong Kong Polytechnic University. It comprises English and Mandarin Chinese data sets, EVALution 1.0 and EVALution-MAN respectively, which contain semantic relations and metadata for training and evaluating distributional semantic models.

Data

EVALution 1.0 consists of approximately 7,500 English tuples extracted from ConceptNet 5.0 and WordNet and filtered through automatic methods and crowdsourcing. The tuples instantiate several semantic relations between word pairs, including hypernymy, synonymy, antonymy and meronymy. The corpus also includes additional information that can be used to filter the pairs or to analyze results, such as relation domain, word frequency, word part of speech and word semantic field.

EVALution-MAN consists of Chinese word pairs from two sources: Chinese Wordnet and an elicitation task in which human participants supplied the missing words in sentences. The word pairs obtained from the elicitation task were then judged by human raters for reliability.

All text data is presented as UTF-8 encoded, tab-separated plain text.
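As a rough illustration of working with this format, the following Python sketch reads UTF-8, tab-separated text into lists of fields. The function names and the sample rows are hypothetical and made up for this example; the actual file names and column layout are given in the corpus documentation and are not assumed here.

```python
import csv
import io
from typing import Iterable, Iterator, List


def read_tsv_rows(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield each non-empty line of tab-separated text as a list of fields."""
    # QUOTE_NONE: treat quote characters as ordinary text, since the
    # data is described as plain tab-separated text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row:  # csv yields an empty list for blank lines; skip them
            yield row


def read_tsv_file(path: str) -> List[List[str]]:
    """Read a whole UTF-8 tab-separated file (path is supplied by the caller)."""
    # newline="" lets the csv module handle line endings itself.
    with open(path, encoding="utf-8", newline="") as handle:
        return list(read_tsv_rows(handle))


# Made-up rows for illustration only; they are NOT taken from the corpus
# and do not reflect its real column layout.
sample = "dog\tIsA\tanimal\n\n狗\tIsA\t动物\n"
rows = list(read_tsv_rows(io.StringIO(sample)))
print(rows)
```

Opening files with an explicit encoding="utf-8" avoids depending on the platform default, which matters for the Mandarin Chinese data.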

Samples

Please view this EVALution 1.0 sample and EVALution-MAN sample.

Updates

None at this time.
