MAGICDATA Mandarin Chinese Read Speech Corpus
Audio
License: CC-BY-NC-ND 4.0

Overview

The corpus is a subset of a much larger dataset (a 10566.9-hour Chinese Mandarin speech corpus) recorded in the same environment. It aims to support researchers in speech recognition, machine translation, speaker recognition, and other speech-related fields, and is therefore free for academic use.

Data Format

The contents and characteristics of the corpus are as follows:

  • The corpus contains 755 hours of speech data, most of it recorded on mobile devices.
  • 1080 speakers from different accent regions of China took part in the recording.
  • Sentence transcription accuracy is above 98%.
  • Recordings were made in a quiet indoor environment.
  • The database is divided into training, validation, and test sets in a ratio of 51:1:2.
  • Detailed information, such as speech data encoding and speaker information, is preserved in the metadata file.
  • The recording texts cover diverse domains, including interactive Q&A, music search, SNS messages, and home command and control.
  • Segmented transcripts are also provided.
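
To give a rough sense of scale, the 51:1:2 split can be turned into approximate hours per set. This is only a sketch: it assumes the ratio applies to recording duration, which the description does not state (it could equally refer to utterance or speaker counts), and the function name `split_hours` is purely illustrative.

```python
TOTAL_HOURS = 755  # total speech duration given in the description
SPLIT_RATIO = {"train": 51, "validation": 1, "test": 2}  # the stated 51:1:2 ratio


def split_hours(total_hours, ratio):
    """Divide total_hours proportionally across the named splits.

    Assumption: the ratio is measured in hours of audio.
    """
    parts = sum(ratio.values())
    return {name: total_hours * weight / parts for name, weight in ratio.items()}


hours = split_hours(TOTAL_HOURS, SPLIT_RATIO)
for name, value in hours.items():
    print(f"{name}: about {value:.1f} hours")
```

Under that assumption, the training set would hold roughly 713 hours, the validation set about 14 hours, and the test set about 28 hours.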

License

All datasets on this page are copyrighted by us and published under the CC BY-NC-ND 4.0 license.

Data Summary

Format: audio
Data volume: --
File size: 52.03GB
Publisher: MAGIC DATA Technology
Magic Data Technology is a professional AI data annotation service provider,