DOTA
Classification
2D Polygon
Common
License: Custom

Overview

DOTA is a large-scale dataset for object detection in aerial images. It can be used to develop and evaluate object detectors for aerial imagery. We will continue to update DOTA so that it grows in size and scope and reflects evolving real-world conditions. DOTA-v1.0 contains 2806 aerial images from different sensors and platforms. Image sizes range from about 800 × 800 to 4000 × 4000 pixels, and the images contain objects exhibiting a wide variety of scales, orientations, and shapes. The images are annotated by experts in aerial image interpretation using 15 common object categories. The fully annotated DOTA images contain 188,282 instances, each labeled by an arbitrary (8 d.o.f.) quadrilateral.

Data Collection

The images in the DOTA-v1.0 dataset are mainly collected from Google Earth; some are taken by the JL-1 satellite, and the others by the GF-2 satellite of the China Centre for Resources Satellite Data and Application.
Use of the images from Google Earth must respect the corresponding Google Earth terms of use.

Data Annotation

In the dataset, each instance's location is annotated by a quadrilateral bounding box, denoted as "x1, y1, x2, y2, x3, y3, x4, y4", where (xi, yi) denotes the position of the i-th vertex of the oriented bounding box in the image. The vertices are arranged in clockwise order. In the visualization of the adopted annotation method, the yellow point represents the starting point, which refers to: (a) the top left corner of a plane, (b) the top left corner of a large vehicle, (c) the center of a sector-shaped baseball diamond.
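The clockwise vertex order can be checked with the shoelace formula. The following is a minimal sketch, not part of the DOTA toolkit; the function names `signed_area` and `is_clockwise_in_image` are hypothetical. It assumes the usual image coordinate system, with the origin at the top left and y increasing downward, in which a visually clockwise polygon has a positive shoelace sum.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def signed_area(polygon: Sequence[Point]) -> float:
    """Shoelace formula: half the sum of x_i * y_{i+1} - x_{i+1} * y_i."""
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to the first vertex
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_clockwise_in_image(polygon: Sequence[Point]) -> bool:
    """True if the vertices run clockwise as seen in the image.

    Assumption: y grows downward (image coordinates), so a visually
    clockwise polygon yields a positive signed area.
    """
    return signed_area(polygon) > 0


# Axis-aligned example: top-left, top-right, bottom-right, bottom-left.
quad = [(10.0, 20.0), (110.0, 20.0), (110.0, 70.0), (10.0, 70.0)]
area = abs(signed_area(quad))        # 100 x 50 box
clockwise = is_clockwise_in_image(quad)
```

The absolute value of the signed area is the area of the quadrilateral in pixels, which can also serve as a quick filter for degenerate annotations.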
In addition to the location, each instance is assigned a category label, which is one of the 15 selected categories, and a difficult label, which indicates whether the instance is difficult to detect (1 for difficult, 0 for not difficult).
The object categories in DOTA-v1.0 include: plane, ship, storage tank, baseball diamond, tennis court, basketball court, ground track field, harbor, bridge, large vehicle, small vehicle, helicopter, roundabout, soccer ball field and swimming pool.
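For training, the category names are usually mapped to integer class indices. The sketch below is illustrative only: the index order simply follows the list above and is an assumption, not an official DOTA convention, and the names `DOTA_V1_CATEGORIES`, `normalize_category` and `category_index` are hypothetical. Because annotation files may spell multi-word names with hyphens rather than spaces (also an assumption), the lookup normalizes both forms.

```python
# The 15 DOTA-v1.0 categories, in the order listed in the text.
# The resulting indices are an arbitrary choice made for this example.
DOTA_V1_CATEGORIES = (
    "plane", "ship", "storage tank", "baseball diamond", "tennis court",
    "basketball court", "ground track field", "harbor", "bridge",
    "large vehicle", "small vehicle", "helicopter", "roundabout",
    "soccer ball field", "swimming pool",
)

CATEGORY_TO_INDEX = {name: i for i, name in enumerate(DOTA_V1_CATEGORIES)}


def normalize_category(name: str) -> str:
    """Lower-case the name and treat hyphens and repeated spaces as one space."""
    return " ".join(name.strip().lower().replace("-", " ").split())


def category_index(name: str) -> int:
    """Return the class index for a category name; raises KeyError if unknown."""
    return CATEGORY_TO_INDEX[normalize_category(name)]
```

An unknown label raises `KeyError`, which makes typos in annotation files visible early instead of silently mapping them to a default class.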
Annotations for an image are saved in a text file with the same file name. The first line gives 'imagesource' (GoogleEarth, GF-2 or JL-1). The second line gives 'gsd' (ground sample distance, the physical size of one image pixel, in meters); if the 'gsd' is missing, it is recorded as 'null'. From the third line to the last line, each line gives the annotation for one instance. The annotation format is:

'imagesource':imagesource
'gsd':gsd
x1, y1, x2, y2, x3, y3, x4, y4, category, difficult
x1, y1, x2, y2, x3, y3, x4, y4, category, difficult
...
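A file in this format can be read with a few lines of Python. The following is a minimal sketch under stated assumptions, not the official DOTA devkit: the names `Instance` and `parse_annotation` are hypothetical, header keys are accepted with or without quotes, and instance fields are accepted whether they are separated by commas or by whitespace, since the separator used in distributed files may differ from the layout shown above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Instance:
    polygon: List[Tuple[float, float]]  # four (x, y) vertices, clockwise
    category: str
    difficult: int  # 1 = difficult, 0 = not difficult


def parse_annotation(text: str):
    """Parse the text of one annotation file into (imagesource, gsd, instances)."""
    imagesource: Optional[str] = None
    gsd: Optional[float] = None
    instances: List[Instance] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Header lines look like 'imagesource':GoogleEarth or 'gsd':null.
        key, sep, value = line.partition(":")
        key = key.strip().strip("'\"").lower()
        if sep and key == "imagesource":
            imagesource = value.strip()
            continue
        if sep and key == "gsd":
            value = value.strip()
            gsd = None if value.lower() == "null" else float(value)
            continue
        # Instance line: treat commas as whitespace, then split.
        fields = line.replace(",", " ").split()
        if len(fields) < 10:
            raise ValueError(f"expected at least 10 fields, got {len(fields)}: {line!r}")
        coords = [float(v) for v in fields[:8]]
        polygon = list(zip(coords[0::2], coords[1::2]))
        # A category may contain spaces; the difficult flag is always last.
        category = " ".join(fields[8:-1])
        instances.append(Instance(polygon, category, int(fields[-1])))
    return imagesource, gsd, instances


sample = (
    "'imagesource':GoogleEarth\n"
    "'gsd':null\n"
    "10, 20, 110, 20, 110, 70, 10, 70, storage tank, 1\n"
)
source, gsd, instances = parse_annotation(sample)
```

Returning `None` for a 'null' gsd keeps the missing value explicit, so downstream code cannot mistake it for a real ground sample distance.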

Citation

@InProceedings{Xia_2018_CVPR,
author = {Xia, Gui-Song and Bai, Xiang and Ding, Jian and Zhu, Zhen and Belongie, Serge and Luo, Jiebo and Datcu, Mihai and Pelillo, Marcello and Zhang, Liangpei},
title = {DOTA: A Large-Scale Dataset for Object Detection in Aerial Images},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}

@InProceedings{Ding_2019_CVPR,
author = {Ding, Jian and Xue, Nan and Long, Yang and Xia, Gui-Song and Lu, Qikai},
title = {Learning RoI Transformer for Detecting Oriented Objects in Aerial Images},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2019}
}

License

Custom

Data Summary

Data format: Image
Data volume: not specified
File size: 18.83 GB
Publisher: CAPTAIN (Computational and Photogrammetric Vision) team, Wuhan University, China