Ancient Chinese Text (wenyanwen)
Text Detection
License: CC-BY-SA 4.0

Overview

Context

Classical Chinese (文言文) and ancient poetry (古诗词) are probably the most reliable primary historical sources about China.

They contain stories of ancient kings, legends of gods, brave struggles, and uncelebrated love. They record how the stars looked when dynasties toppled, and how people whistled, farmed, entertained themselves, and solved math puzzles with algebra thousands of years ago.

They hold different philosophies: some revere order and courtesy, some excel in the deceptions of war, and others believe in balance and nature.

They gave birth to a language that has branched into thousands of living dialects and is still spoken and written by more than a billion people on this planet.

Content

The data comes from the March 2020 Wikisource data dump.
It is in CSV format with 4 columns:

  • id: The ID from the data dump
  • url: The original Wikisource file
  • title: The title of the article or poem
  • text: The textual data in Chinese
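
As a rough illustration of the four-column layout, the following Python sketch reads rows with the standard `csv` module. It assumes the file has a header row whose column names are exactly `id`, `url`, `title`, and `text`; the listing does not confirm this. The helper `read_rows` and the in-memory sample are hypothetical and exist only for this example.

```python
import csv
import io
from typing import Dict, Iterator, TextIO

# Assumed header names, taken from the column list in this description.
EXPECTED_COLUMNS = ["id", "url", "title", "text"]

# Long documents can exceed the csv module's default field size limit,
# so raise it (10 MiB here is an arbitrary choice for this sketch).
csv.field_size_limit(10 * 1024 * 1024)


def read_rows(stream: TextIO) -> Iterator[Dict[str, str]]:
    """Yield one dict per CSV row, keeping only the four expected columns."""
    reader = csv.DictReader(stream)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    for row in reader:
        yield {name: row[name] for name in EXPECTED_COLUMNS}


# Tiny made-up sample standing in for the real file.
SAMPLE = "id,url,title,text\n1,https://example.org/a,示例,天地玄黃\n"
rows = list(read_rows(io.StringIO(SAMPLE)))
```

To read the actual file, pass a handle such as `open(path, encoding="utf-8", newline="")`, where `path` is wherever the CSV was saved. The UTF-8 encoding is also an assumption.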

Acknowledgements

This dataset was parsed from Wikisource's data dump. Thanks to all the contributors who edited these texts to keep them as faithful to the originals as possible.

Inspiration

  • What are the relationships between words and names?
  • Can a generative model be built for such material?
  • Is there a better way to search these texts for events, figures, and stories?

Data Summary

  • Data format: text
  • Data volume: 1
  • File size: 98.47MB
  • Publisher: Raynard Jon