Ancient Chinese WordNet
| Item Name: | Ancient Chinese WordNet |
| Author(s): | Bin Li, Feng Minxuan, Dai Junyang, Xu Huidan, Lu Xin, Tuo Xinyu, Wang Lezhi, Zhang Yuqin |
| LDC Catalog No.: | LDC2026L03 |
| ISLRN: | 662-487-315-741-3 |
| DOI: | https://doi.org/10.35111/m3h4-rm10 |
| Release Date: | March 16, 2026 |
| Member Year(s): | 2026 |
| DCMI Type(s): | Text |
| Data Source(s): | dictionaries |
| Application(s): | cross-lingual information retrieval, language learning, language teaching |
| Language(s): | Literary Chinese, Old Chinese |
| Language ID(s): | lzh, och |
| License(s): | LDC User Agreement for Non-Members |
| Online Documentation: | LDC2026L03 Documents |
| Licensing Instructions: | Subscription & Standard Members, and Non-Members |
| Citation: | Li, Bin, et al. Ancient Chinese WordNet LDC2026L03. Web Download. Philadelphia: Linguistic Data Consortium, 2026. |
Introduction
Ancient Chinese WordNet was developed by Nanjing Normal University and contains lexical and semantic information for Ancient Chinese vocabulary dating back to the Pre-Qin period (before 221 BCE). The WordNet comprises 38,781 word forms and 55,100 senses, each manually linked to a corresponding synset in Princeton WordNet 1.6.
The Ancient Chinese WordNet (ACWN) project began in 2012 with the goal of creating a structured lexical database to support linguistic research and natural language processing applications involving historical Chinese language materials. ACWN organizes vocabulary using WordNet's noun, verb, adjective, and adverb hierarchies and provides WordNet definitions, semantic relations, and categorization for each sense.
Data
Ancient Chinese WordNet contains 55,100 records, each representing a single Ancient Chinese lexical item mapped to one WordNet synset. It follows the WordNet 1.6 organizational structure, including 22 noun categories, 15 verb categories, and additional adjective and adverb categories.
Each entry includes the following fields:
- ID - The serial number of the ACWN entry
- Word - Ancient Chinese word form
- wn_offset - 8-digit WordNet 1.6 synset offset with trailing POS (n/v/a/s/r)
- senseid - Sense number for this word form (ordinal among that word's senses)
- pos - Part of speech: noun (n), verb (v), adjective (a/s), adverb (r)
- wn_category - Numeric code for the WordNet 1.6 lexicographer file (category)
- wn_synset - Synset headword(s) in WordNet 1.6
- wn_definition - WordNet gloss for the synset
- wn_similar to - Synset with similar meaning
- wn_pertainym - Pertainym synset offset(s)
- wn_attribute - Attribute synset offset(s)
- wn_hypernym - Hypernym synset offset(s)
- wn_hyponym - Hyponym synset offset(s)
The data is presented in UTF-8 encoded CSV and XLSX formats.
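As a rough illustration of how the CSV release might be read, the sketch below groups senses by word form and splits wn_offset into its 8-digit offset and trailing part-of-speech letter. It assumes the CSV column headers match the field names listed above; the sample rows, their values, and the helper names split_offset and senses_by_word are invented for illustration and are not taken from the corpus.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows for illustration only: the values are invented,
# and the header names are assumed to match the documented field names.
SAMPLE_CSV = """ID,Word,wn_offset,senseid,pos,wn_category
1,天,00000001n,1,n,17
2,天,00000002n,2,n,28
3,行,00000003v,1,v,38
"""


def split_offset(wn_offset):
    """Split a value such as '00000001n' into (8-digit offset, POS letter)."""
    value = wn_offset.strip()
    offset, pos = value[:-1], value[-1]
    if len(offset) != 8 or not offset.isdigit() or pos not in "nvasr":
        raise ValueError(f"unexpected wn_offset: {wn_offset!r}")
    return offset, pos


def senses_by_word(handle):
    """Group (senseid, offset, pos) tuples under each word form."""
    grouped = defaultdict(list)
    for row in csv.DictReader(handle):
        offset, pos = split_offset(row["wn_offset"])
        grouped[row["Word"]].append((int(row["senseid"]), offset, pos))
    return dict(grouped)


# The in-memory sample stands in for the real file here.
senses = senses_by_word(io.StringIO(SAMPLE_CSV))
print(senses["天"])
```

For the actual release, the same function could be given a file opened with encoding="utf-8" and newline="" in place of the in-memory sample; the file name depends on the distribution and is not assumed here.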
Samples
Updates
No updates at this time.