Wordsim-240
License:
MIT
Overview
The dataset is a conventional word similarity test set for Chinese. It consists of 240 pairs of Chinese words, divided into 12 groups of 20 pairs each.
Wordsim-240 (original name: words-240) was introduced in: Wang Xiang, Jia Yan, Zhou Bin, et al. Computing Semantic Relatedness using Chinese Wikipedia Links and Taxonomy. Journal of Chinese Computer Systems, 2011, 32(11): 2237-2242. (pdf)
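A test set of this kind is usually used by comparing a model's similarity scores with the human-assigned scores for each word pair. The sketch below shows one common way to do that, Spearman rank correlation, in plain Python. It is an illustration under assumptions, not part of the dataset: the card does not specify a file format or an evaluation metric, so the `(word1, word2, score)` tuple layout, the function names, and the choice of Spearman correlation are all hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Assumed in-memory layout (not specified by the dataset card):
# one (word1, word2, human_score) tuple per word pair.
Pair = Tuple[str, str, float]


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def evaluate(pairs: Sequence[Pair],
             model_score: Callable[[str, str], float]) -> float:
    """Correlate human scores with scores from a user-supplied model."""
    gold = [score for _, _, score in pairs]
    pred = [model_score(w1, w2) for w1, w2, _ in pairs]
    return spearman(gold, pred)
```

With the full test set loaded, `len(pairs)` should be 240. `evaluate` can be called on all pairs at once or on each of the 12 groups of 20 pairs separately; `model_score` is whatever similarity function is being tested.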
Citation
Please use the following citation when referencing the dataset:
@article{wang2011computing,
title={Computing semantic relatedness using chinese wikipedia links and taxonomy},
author={Wang, Xiang and Jia, Yan and Zhou, Bin and Ding, Zhao-Yun and Liang, Zheng},
journal={Journal of Chinese Computer Systems},
volume={32},
number={11},
pages={2237--2242},
year={2011},
publisher={Shenyang Institute of Computing Technology, 100 Sanhao Ave. Shenyang 110004~…}
}