Dogs vs. Cats
2D Classification
License: Research Only

Overview

Dogs vs. Cats is a Kaggle competition in which participants write an algorithm to classify whether an image contains a dog or a cat. The training archive contains 25,000 images of dogs and cats.
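
As a starting point for working with the training archive, the sketch below derives a class label from each image file name and tallies the classes. It assumes the files are named like `cat.0.jpg` and `dog.0.jpg`; this page does not state the naming scheme, so treat that as an assumption. The helper names `label_from_filename` and `count_labels` are hypothetical.

```python
from collections import Counter
from pathlib import Path


def label_from_filename(path: str) -> str:
    """Return 'cat' or 'dog' from a name such as 'cat.0.jpg'.

    Assumes the label is the text before the first dot in the file name.
    """
    label = Path(path).name.split(".", 1)[0].lower()
    if label not in {"cat", "dog"}:
        raise ValueError(f"unexpected file name: {path}")
    return label


def count_labels(filenames):
    """Count how many file names fall into each class."""
    return Counter(label_from_filename(name) for name in filenames)


if __name__ == "__main__":
    # Illustrative file names only, not taken from the real archive listing.
    sample = ["train/cat.0.jpg", "train/dog.0.jpg", "train/dog.1.jpg"]
    print(count_labels(sample))
```

Running `count_labels` over the extracted archive is a quick way to check that the class balance matches expectations before training.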

The Asirra data set

Web services are often protected with a challenge that's supposed to be easy for people to solve, but difficult for computers. Such a challenge is often called a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) or HIP (Human Interactive Proof). HIPs are used for many purposes, such as to reduce email and blog spam and prevent brute-force attacks on web site passwords.

Asirra (Animal Species Image Recognition for Restricting Access) is a HIP that works by asking users to identify photographs of cats and dogs. This task is difficult for computers, but studies have shown that people can accomplish it quickly and accurately. Many even think it's fun!

Asirra is unique because of its partnership with Petfinder.com, the world's largest site devoted to finding homes for homeless pets. They've provided Microsoft Research with over three million images of cats and dogs, manually classified by people at thousands of animal shelters across the United States. Kaggle is fortunate to offer a subset of this data for fun and research.

Image recognition attacks

While random guessing is the easiest form of attack, various forms of image recognition can allow an attacker to make guesses that are better than random. There is enormous diversity in the photo database (a wide variety of backgrounds, angles, poses, lighting, etc.), making accurate automatic classification difficult. In an informal poll conducted many years ago, computer vision experts posited that a classifier with better than 60% accuracy would be difficult without a major advance in the state of the art. For reference, if every image in a 12-image HIP must be labeled correctly, a classifier with 60% accuracy raises the probability of passing from 1/4096 (0.5^12) to about 1/459 (0.6^12).
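
The 1/4096 and 1/459 figures can be checked with a short calculation. This is a minimal sketch that assumes all 12 images must be classified correctly and that per-image errors are independent. The function name `hip_pass_probability` is hypothetical.

```python
def hip_pass_probability(accuracy: float, num_images: int = 12) -> float:
    """Probability of labeling every image in one challenge correctly.

    Assumes each image is classified independently with the same accuracy.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return accuracy ** num_images


if __name__ == "__main__":
    for accuracy in (0.5, 0.6):
        p = hip_pass_probability(accuracy)
        print(f"accuracy {accuracy:.0%}: about 1 in {round(1 / p)}")
```

With these assumptions, random guessing (50%) gives 1 in 4096, and a 60% classifier gives roughly 1 in 459, matching the figures in the text.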

State of the art

The current literature suggests machine classifiers can score above 80% accuracy on this task [1]. Therefore, Asirra is no longer considered safe from attack. This contest aims to benchmark the latest computer vision and deep learning approaches to this problem.

License

These images have been published by Microsoft Research for the express purpose of furthering academic research. They may be used for non-commercial research purposes, but they may not be re-published without the express permission of Microsoft Research.

Data Summary

Data format: image
Number of images: 25K
File size: 813.56 MB
Publisher: Kaggle

Kaggle is the world's largest data science community with powerful tools and resources to help you achieve your data science goals.
