|Item Name:||Chinese CogBank|
|Author(s):||Bin Li, Siqi Yin, Jie Xu, Li Song, Minxuan Feng|
|LDC Catalog No.:||LDC2020T01|
|Release Date:||February 17, 2020|
|Data Source(s):||web collection|
|Application(s):||semantic role labelling|
|License(s):||LDC User Agreement for Non-Members|
|Online Documentation:||LDC2020T01 Documents|
|Licensing Instructions:||Subscription & Standard Members, and Non-Members|
|Citation:||Li, Bin, et al. Chinese CogBank LDC2020T01. Web Download. Philadelphia: Linguistic Data Consortium, 2020.|
Chinese CogBank is a database of cognitive properties of Chinese words intended for use in metaphor understanding and generation. It consists of 232,497 "word-property" pairs covering 83,104 words and 100,195 properties. Each "word-property" pair also has an associated frequency, which can serve as a functional measure of the property's importance.
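As a rough illustration of how the frequency could serve as an importance measure, the sketch below ranks the properties of one word by their share of that word's total frequency. The function name `rank_properties`, the `(word, property, frequency)` tuple layout, and the placeholder records are all assumptions made for this example; they are not taken from the corpus or its documentation.

```python
from collections import defaultdict


def rank_properties(pairs, word):
    """Return (property, relative_weight) tuples for `word`, most frequent first.

    `pairs` is an iterable of (word, property, frequency) tuples; this layout
    is an assumption for illustration, not the documented corpus format.
    """
    totals = defaultdict(int)
    for w, prop, freq in pairs:
        if w == word:
            totals[prop] += freq

    grand_total = sum(totals.values())
    if grand_total == 0:
        return []  # word not present, or all of its frequencies are zero

    # Sort by descending frequency, breaking ties alphabetically by property.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [(prop, count / grand_total) for prop, count in ranked]


# Made-up placeholder records, NOT taken from Chinese CogBank.
toy_pairs = [
    ("word_a", "prop_x", 6),
    ("word_a", "prop_y", 2),
    ("word_b", "prop_x", 5),
]
ranking = rank_properties(toy_pairs, "word_a")
```

Normalizing within each word is only one possible choice; raw counts or a measure that also discounts properties shared by many words may suit a given metaphor task better.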
The data was collected via the Chinese search engine Baidu.com. The original collection consisted of 1,258,430 types (5,637,500 tokens) of "word-adjective" pairs, which were reduced to the 232,497 "word-property" pairs in Chinese CogBank after a series of manual checks.
The corpus is presented as a single tab-separated values (TSV) file encoded in UTF-8.
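A minimal sketch of reading such a file with Python's standard `csv` module follows. The column order (word, property, frequency) and the absence of a header row are assumptions, since the exact layout is not described here; check the corpus documentation before relying on them. The helper names `read_pairs` and `load_pairs` are hypothetical.

```python
import csv
import io


def read_pairs(handle):
    """Parse tab-separated lines into (word, property, frequency) tuples.

    Assumes three columns in that order; the real column layout should be
    confirmed against the corpus documentation.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    pairs = []
    for row in reader:
        if len(row) < 3:
            continue  # skip blank or incomplete lines
        word, prop, freq = row[0], row[1], row[2]
        try:
            pairs.append((word, prop, int(freq)))
        except ValueError:
            continue  # skip a header row or a non-numeric frequency
    return pairs


def load_pairs(path):
    """Open a UTF-8 encoded TSV file and parse it with read_pairs."""
    with open(path, encoding="utf-8", newline="") as handle:
        return read_pairs(handle)


# In-memory stand-in for the file, using made-up placeholder rows.
sample = io.StringIO("word_a\tprop_x\t6\nword_a\tprop_y\t2\nbad line\n")
pairs = read_pairs(sample)
```

Quoting is disabled with `csv.QUOTE_NONE` so that any quotation marks in the text are kept as literal characters rather than treated as field delimiters.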
Please view this sample.
No updates at this time.