MobvoiHotwords
Audio
NLP
License: Apache-2.0

Overview

MobvoiHotwords is a corpus of wake-up words collected from a commercial smart speaker made by Mobvoi. It consists of keyword and non-keyword utterances.
Each keyword utterance contains either 'Hi xiaowen' or 'Nihao Wenwen'. For each keyword, there are about 36k utterances. All keyword data was collected from 788 subjects, aged 3 to 65, at different distances from the smart speaker (1, 3 and 5 meters). Different noises (typical home environment noises such as music and TV) with varying sound pressure levels were played in the background during collection. The keyword data is identical to the keyword data used in the paper below:

License

Apache-2.0

Data Summary
Data format: Audio
Data volume: --
File size: 16.65GB
Publisher: Mobvoi
An AI company focusing on advanced voice interaction and hardware-software integration.