THCHS-30
Audio
NLP
License: Custom

Overview

Speech data is crucial for speech recognition research. Quite a few speech databases can be purchased at prices that are reasonable for most research institutes. However, for young researchers who are just starting out, or for those who have only recently taken an interest in this direction, the cost of data is still an annoying barrier. We support the "free data" movement in speech recognition: research institutes (particularly those supported by public funds) publish their data freely so that new researchers can obtain sufficient data to kick off their careers. Here, we follow this trend and release a free Chinese speech database, THCHS-30, that can be used to build a full-fledged Chinese speech recognition system.

Citation

Please use the following citation when referencing the dataset:

@article{DBLP:journals/corr/WangZ15e,
  author    = {Dong Wang and
               Xuewei Zhang},
  title     = {{THCHS-30} : {A} Free Chinese Speech Corpus},
  journal   = {CoRR},
  volume    = {abs/1512.01882},
  year      = {2015},
  url       = {http://arxiv.org/abs/1512.01882},
  archivePrefix = {arXiv},
  eprint    = {1512.01882},
  timestamp = {Mon, 13 Aug 2018 16:46:59 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/WangZ15e.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

License

Custom

Data summary

Data format: Audio
Data volume: --
File size: 13.4GB
Publisher: CSLT at Tsinghua University

The Center for Speech and Language Technology (CSLT), Tsinghua University, was established with the goal of conducting cutting-edge research on intelligent human-machine interaction, particularly on speech and language technologies.