ChID-Dataset
Text
Reading Comprehension
License: Apache License 2.0

Overview

ChID: A Large-scale Chinese IDiom Dataset for Cloze Test

Data Format

One example is shown below:

{
    "content": "世锦赛的整体水平远高于亚洲杯,要如同亚洲杯那样“鱼与熊掌兼得”,就需要各方面密切配合、#idiom#。作为主帅的俞觉敏,除了得打破保守思想,敢于破格用人,还得巧于用兵、#idiom#、灵活排阵,指挥得当,力争通过比赛推新人、出佳绩、出新的战斗力。", 
    "realCount": 2,
    "groundTruth": ["通力合作", "有的放矢"], 
    "candidates": [
        ["凭空捏造", "高头大马", "通力合作", "同舟共济", "和衷共济", "蓬头垢面", "紧锣密鼓"], 
        ["叫苦连天", "量体裁衣", "金榜题名", "百战不殆", "知彼知己", "有的放矢", "风流才子"]
    ]
}
  • content: The passage, with each original idiom replaced by the placeholder #idiom#
  • realCount: The number of placeholders (blanks) in the passage
  • groundTruth: The correct idioms, in the order of the blanks
  • candidates: One list of candidate idioms per blank, in the order of the blanks
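
As a rough illustration of how these four fields fit together, the following Python sketch parses one record, checks that the fields are consistent, and fills the blanks with the ground-truth idioms. It is a sketch under stated assumptions, not the authors' tooling: the helper names `check_record` and `fill_blanks` are hypothetical, the passage is an abridged version of the example above, and how records are laid out in the dataset files is not specified here.

```python
import json

PLACEHOLDER = "#idiom#"

# Abridged version of the example record above (the passage is shortened for brevity).
raw = json.dumps({
    "content": "就需要各方面密切配合、#idiom#。还得巧于用兵、#idiom#、灵活排阵。",
    "realCount": 2,
    "groundTruth": ["通力合作", "有的放矢"],
    "candidates": [
        ["凭空捏造", "高头大马", "通力合作", "同舟共济", "和衷共济", "蓬头垢面", "紧锣密鼓"],
        ["叫苦连天", "量体裁衣", "金榜题名", "百战不殆", "知彼知己", "有的放矢", "风流才子"],
    ],
}, ensure_ascii=False)


def check_record(record):
    """Return True if the blank count, answers, and candidate lists agree."""
    n_blanks = record["content"].count(PLACEHOLDER)
    return (
        record["realCount"] == n_blanks
        and len(record["groundTruth"]) == n_blanks
        and len(record["candidates"]) == n_blanks
        # Each correct idiom should appear among the candidates for its blank.
        and all(answer in options
                for answer, options in zip(record["groundTruth"], record["candidates"]))
    )


def fill_blanks(record, answers=None):
    """Replace the placeholders, in order, with the given idioms (ground truth by default)."""
    answers = record["groundTruth"] if answers is None else answers
    parts = record["content"].split(PLACEHOLDER)
    if len(parts) - 1 != len(answers):
        raise ValueError("number of answers does not match number of blanks")
    filled = parts[0]
    for idiom, tail in zip(answers, parts[1:]):
        filled += idiom + tail
    return filled


record = json.loads(raw)
print(check_record(record))
print(fill_blanks(record))
```

Passing a different `answers` list to `fill_blanks` (for instance, a model's predictions chosen from `candidates`) yields the passage as the model would complete it.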

Citation

The ChID dataset accompanies the paper "ChID: A Large-scale Chinese IDiom Dataset for Cloze Test".

If your research is related to or based on the ChID dataset (or the version adapted for the competition), please cite it:

@inproceedings{zheng-etal-2019-chid,
    title = "{C}h{ID}: A Large-scale {C}hinese {ID}iom Dataset for Cloze Test",
    author = "Zheng, Chujie  and
      Huang, Minlie  and
      Sun, Aixin",
    booktitle = "Proceedings of the 57th Conference of the Association for Computational Linguistics",
    month = jul,
    year = "2019",
    address = "Florence, Italy",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/P19-1075",
    pages = "778--787",
}

License

chujiezheng/ChID-Dataset is licensed under the Apache License 2.0.

A permissive license whose main conditions require preservation of copyright and license notices. Contributors provide an express grant of patent rights. Licensed works, modifications, and larger works may be distributed under different terms and without source code.

Data Summary
Data format
text
Data volume
3.848K
File size
195.84MB
Publisher
Chujie Zheng
I am Chujie Zheng (郑楚杰), a first-year Ph.D. student in the THUCoAI group, supervised by Prof. Minlie Huang. My main research interest is open-domain dialog. I received my B.Sc. from the Department of Physics, Tsinghua University.