Stamp Verification (StaVer) Dataset
License: CC-BY-SA 4.0

Overview

Context:

An automatic system for stamp segmentation and subsequent verification is needed, especially in environments such as insurance companies, where a huge volume of documents is processed daily. However, detecting a general stamp is not a trivial task: stamps come in different shapes and colors and can be imprinted with variable quality and rotation. This dataset was collected to help researchers build such a system.

Content:

This dataset contains 400 scanned document images. The documents are automatically generated invoices that were printed, stamped, and scanned at 200 dpi. They include color logos and color text, which makes the evaluation results more realistic. The stamps come in many different shapes and colors, including black, and some overlap with signatures or text. Some documents contain multiple stamps, and some contain none at all. The ground truth consists of binary images with masks of the stamp strokes, which allows accurate pixel-wise evaluation.
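
As an illustration of what pixel-wise evaluation against a binary stamp mask can look like, here is a minimal sketch in plain Python. It is not the authors' evaluation protocol. The function name `pixel_scores` is hypothetical, and the sketch assumes both masks have already been loaded as equally sized lists of rows in which 1 marks a stamp pixel and 0 marks everything else; which pixel value the ground-truth images actually use for stamp strokes is not stated here and should be checked after loading.

```python
def pixel_scores(predicted, truth):
    """Pixel-wise precision, recall and F1 for two binary masks.

    Both masks are lists of rows; a truthy value marks a stamp pixel.
    (Assumption: the masks were binarized this way when loaded.)
    """
    same_shape = len(predicted) == len(truth) and all(
        len(p_row) == len(t_row) for p_row, t_row in zip(predicted, truth)
    )
    if not same_shape:
        raise ValueError("masks must have the same shape")

    tp = fp = fn = 0
    for p_row, t_row in zip(predicted, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1      # stamp pixel correctly detected
            elif p:
                fp += 1      # background labeled as stamp
            elif t:
                fn += 1      # stamp pixel missed

    # Convention chosen for this sketch: undefined ratios are reported as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Tiny made-up masks, only to show the call.
truth = [[0, 1, 1],
         [0, 1, 0]]
predicted = [[0, 1, 0],
             [1, 1, 0]]
scores = pixel_scores(predicted, truth)
```

For pages without any stamp, both masks are empty and all three scores come out as 0.0 under the convention above; depending on the experiment, it may be more appropriate to report such pages separately.
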
This dataset contains the following folders, each with 400 items (one for each image):

  • scans: scans of the stamped genuine documents
  • ground-truth-maps: maps defining the region of the stamp(s)
  • ground-truth-pixel: pixel-level ground truth
  • info: contains text files with the info for each file. Each info file contains the following information:
    • signature [0|1]: signature present [0] or not [1]
    • textOverlap [0|1]: stamps overlap with printed text [1]
    • numStamps [0|...|n]: number of stamps on the page
    • bwStamp[1|...|n]: stamp[1|...|n] is black stamp [1] or colored [0]
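
The per-image info fields could be read into a dictionary along the following lines. This is only a sketch: the exact layout of the info files is not described here, so the code assumes one whitespace-separated `key value` pair per line and assumes the per-stamp fields are named `bwStamp1`, `bwStamp2`, and so on. The function name `parse_info` and the sample text are invented for illustration and should be adapted after inspecting a real file.

```python
def parse_info(text):
    """Parse assumed 'key value' lines into a dict of integers.

    Lines that are blank or do not match the assumed layout are skipped.
    """
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        key, value = parts
        try:
            fields[key] = int(value)
        except ValueError:
            continue
    return fields


# Invented sample content, not copied from the dataset.
sample = "signature 1\ntextOverlap 0\nnumStamps 2\nbwStamp1 0\nbwStamp2 1\n"
info = parse_info(sample)

# Indices of the stamps flagged as black (bwStamp value 1).
black_stamps = [
    i for i in range(1, info.get("numStamps", 0) + 1)
    if info.get(f"bwStamp{i}") == 1
]
```

Keeping the parsed values as integers makes it straightforward to filter pages, for example to select only those where `textOverlap` is 1 or `numStamps` is 0.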

In addition, there is a .pdf file with all the images in one file. The complete dataset (including scans with higher resolution) can be found here.

Acknowledgements:

This dataset was collected by Barbora Micenková and Joost van Beusekom. If you use this dataset in your work, please cite the following paper:

Micenková, B., & van Beusekom, J. (2011, September). Stamp detection in color document images. In Document Analysis and Recognition (ICDAR), 2011 International Conference on (pp. 1125-1129). IEEE.

Inspiration:

  • Can you segment just the stamps from the background text?
  • Can you use OCR techniques to identify the stamped text?

Data summary:

  • Data format: image
  • Number of items: 1.628K
  • File size: 237.82MB
  • Publisher: Rachael Tatman