SogouCS
Text
NLP
License: Custom

Overview

News data from 18 Sohu News channels (domestic, international, sports, society, entertainment, and others), covering June to July 2012. Each record provides the article URL and its text.
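As an illustration of working with URL-plus-text records, here is a minimal Python sketch. It rests on two assumptions that this page does not confirm: that each record is wrapped in `<doc>` with `<url>` and `<content>` child tags, and that the channel can be read from the first label of the URL's host name. The names `NewsRecord`, `parse_records`, and `channel_of` are hypothetical helpers introduced only for this example.

```python
import re
from typing import Iterator, NamedTuple
from urllib.parse import urlparse


class NewsRecord(NamedTuple):
    url: str
    text: str


# Assumed record layout (not confirmed by this page):
# <doc> ... <url>...</url> ... <content>...</content> ... </doc>
_DOC_RE = re.compile(r"<doc>(.*?)</doc>", re.DOTALL)
_URL_RE = re.compile(r"<url>(.*?)</url>", re.DOTALL)
_CONTENT_RE = re.compile(r"<content>(.*?)</content>", re.DOTALL)


def parse_records(raw: str) -> Iterator[NewsRecord]:
    """Yield (url, text) pairs, skipping records that lack either field."""
    for doc in _DOC_RE.finditer(raw):
        body = doc.group(1)
        url = _URL_RE.search(body)
        content = _CONTENT_RE.search(body)
        if url is None or content is None:
            continue
        yield NewsRecord(url.group(1).strip(), content.group(1).strip())


def channel_of(url: str) -> str:
    """Guess the channel from the first host label (assumption, see above)."""
    host = urlparse(url).netloc
    return host.split(".")[0] if host else ""
```

A regular expression is used instead of an XML parser because dumps of this kind often lack a single root element and may contain unescaped characters. If the files turn out to be well-formed XML, a proper parser would be the safer choice.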

Data Collection

The data were collected from Sina Weibo. Both the training and test files are UTF-8 encoded. In addition to the training data, we provide the background data from which the training and test data are drawn. The background data is intended to support learning more sophisticated features in an unsupervised way.

Citation

@inproceedings{inproceedings,
author = {Wang, Canhui and Zhang, Min and Ru, Liyun and Ma, Shaoping},
year = {2008},
month = {01},
pages = {1033-1042},
title = {Automatic online news topic ranking using media focus and user attention based on aging theory},
doi = {10.1145/1458082.1458219}
}

License

Custom

Data Summary
Data Format
Text
Data Volume
--
File Size
--
Publisher
Sogou
Sogou, Inc. is a Chinese technology company that specializes mainly in web search.