SemanticKITTI
3D Semantic Segmentation
Autonomous Driving
License: CC BY-NC-SA 4.0

Overview

We present a large-scale dataset based on the KITTI Vision Benchmark, using all sequences provided by the odometry task. We provide dense annotations for each individual scan of sequences 00-10, which enables the use of multiple sequential scans for semantic scene interpretation, such as semantic segmentation and semantic scene completion.

The remaining sequences, i.e., sequences 11-21, are used as a test set covering a large variety of challenging traffic situations and environment types. Labels for the test set are not provided; instead, an evaluation service scores submissions and provides test set results.

Classes

The dataset contains 28 classes, including classes that distinguish non-moving and moving objects. Overall, the classes cover traffic participants as well as functional ground classes such as parking areas and sidewalks.


Folder structure and format

Semantic Segmentation and Panoptic Segmentation


For each scan XXXXXX.bin in the velodyne folder of a sequence folder of the original KITTI Odometry Benchmark, we provide a file XXXXXX.label in the labels folder that contains a label for each point in binary format. Each label is a 32-bit unsigned integer (uint32_t), where the lower 16 bits correspond to the semantic label. The upper 16 bits encode the instance id, which is temporally consistent over the whole sequence, i.e., the same object in two different scans gets the same id. This holds for moving cars, but also for static objects seen again after loop closures.
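The bit layout above can be decoded with a few lines of numpy. This is a minimal sketch (not the official development kit); the function name `read_labels` is illustrative:

```python
import numpy as np

def read_labels(path):
    """Read a SemanticKITTI .label file.

    Each point's label is a uint32: the lower 16 bits hold the
    semantic label, the upper 16 bits the temporally consistent
    instance id.
    """
    raw = np.fromfile(path, dtype=np.uint32)
    semantic = raw & 0xFFFF   # lower 16 bits: semantic class
    instance = raw >> 16      # upper 16 bits: instance id
    return semantic, instance
```

The returned arrays are index-aligned with the points of the corresponding XXXXXX.bin scan, so `semantic[i]` labels the i-th point of the scan.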

We furthermore provide the file poses.txt, which contains the poses we used to annotate the data, estimated by a surfel-based SLAM approach (SuMa).
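A sketch of parsing poses.txt, assuming it follows the KITTI odometry convention of one pose per line as 12 space-separated floats forming a 3x4 matrix [R | t] (the function name `read_poses` is illustrative):

```python
import numpy as np

def read_poses(path):
    """Parse a KITTI-style poses.txt into a list of 4x4 matrices."""
    poses = []
    with open(path) as f:
        for line in f:
            if not line.strip():
                continue
            values = np.array(line.split(), dtype=np.float64)
            pose = np.eye(4)
            pose[:3, :4] = values.reshape(3, 4)  # [R | t] row
            poses.append(pose)
    return poses
```

Padding each 3x4 pose to a homogeneous 4x4 matrix makes it convenient to chain or invert transforms when accumulating multiple scans into a common frame.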

Semantic Scene Completion


For each scan XXXXXX.bin in the velodyne folder of a sequence folder of the original KITTI Odometry Benchmark, we provide in the voxel folder:

  • a file XXXXXX.bin in a packed binary format that contains, for each voxel, whether that voxel is occupied by laser measurements. This is the input to the semantic scene completion task and corresponds to the voxelization of a single LiDAR scan.
  • a file XXXXXX.label that contains a label for each voxel of the completed scene in binary format. The label is a 16-bit unsigned integer (uint16_t) for each voxel.
  • a file XXXXXX.invalid in a packed binary format that contains, for each voxel, a flag indicating whether that voxel is considered invalid, i.e., it was never directly seen from any of the positions used to generate the voxels. These voxels are not considered in the evaluation.
  • a file XXXXXX.occluded in a packed binary format that contains, for each voxel, a flag specifying whether this voxel is either occupied by LiDAR measurements or occluded by another voxel in the line of sight of all poses used to generate the completed scene.

The label, invalid, and occluded files are only given for the training data; the label file must be predicted for the semantic scene completion task.

To allow a higher compression rate, we store the binary flags in a custom packed format: each byte of the file encodes 8 voxels of the unpacked voxel grid as bit flags. Please see the development kit for further information on how to efficiently read these files using numpy.
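The packing described above can be undone with numpy's `unpackbits`. A minimal sketch (the function name is illustrative; bit order is assumed to be most-significant-bit first, so consult the development kit for the authoritative ordering):

```python
import numpy as np

def unpack_flags(path):
    """Unpack a packed bit-flag file (.bin, .invalid, .occluded).

    Each byte of the file encodes 8 voxels; unpackbits expands it
    to one 0/1 entry per voxel (MSB-first by default).
    """
    compressed = np.fromfile(path, dtype=np.uint8)
    return np.unpackbits(compressed)
```

The flat array can then be reshaped to the voxel grid dimensions used by the benchmark.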

See also our development kit for further information on the labels and on reading them using Python. The development kit also provides tools for visualizing the point clouds.

Citation

Please use the following citation when referencing the dataset:

@inproceedings{behley2019iccv,
  author = {J. Behley and M. Garbade and A. Milioto and J. Quenzel and S. Behnke and C. Stachniss and J. Gall},
  title = {{SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences}},
  booktitle = {Proc.~of the IEEE/CVF International Conf.~on Computer Vision (ICCV)},
  year = {2019}
}

Please also cite the original KITTI Vision Benchmark:

@inproceedings{geiger2012cvpr,
  author = {A. Geiger and P. Lenz and R. Urtasun},
  title = {{Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite}},
  booktitle = {Proc.~of the IEEE Conf.~on Computer Vision and Pattern Recognition (CVPR)},
  pages = {3354--3361},
  year = {2012}
}

License

CC BY-NC-SA 4.0

Dataset Summary
Data format
Point Cloud
Data volume
--
File size
833.47 MB
Publisher
University of Bonn
The dataset is the result of a collaboration between the Photogrammetry & Robotics Group, the Computer Vision Group, and the Autonomous Intelligent Systems Group, which are all part of the University of Bonn.