Neolix OD
3D Box
Autonomous Driving
License: CC BY-NC-SA 4.0


We present a diverse dataset for autonomous driving that captures a wide range of interesting scenarios and reflects the domain diversity among different cities, including Beijing, Shenzhen and Xi’an. The dataset is open to the public; anyone interested in training models on it is welcome to use it. We believe that this dataset will be highly useful in many areas of computer vision and autonomous driving.

Data Collection

We collected the data in several cities, including Beijing, Shenzhen and Xi’an. The collection covers multiple regions and various driving conditions, including day, night, dawn, dusk and sunny weather. We sample the raw data at a frequency that depends on the velocity of the collection car. Our annotators then discard frames whose scene duplicates that of another file, and annotate the remaining files.

Data Preview

Label Distribution

Data Annotation

Based on the operational scenes, we define the classes 'Adult', 'Child', 'Car', 'Cyclist', 'Unknown', 'Tricycle', 'Barrier', 'Bicycle', 'Bicycles', 'Bus', 'Truck', 'Motorcycle', 'Motorcyclist' and 'Animal'. So far, only the first eight classes appear in the dataset.

For each object in a point cloud frame, we provide rich annotations: the object type, the location of the box's center, the dimensions of the box and the rotation angle of the object's heading.
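As an illustration of how these four annotation fields define a box, the sketch below computes the eight corners from a center, a size and a heading angle. The conventions are assumptions, not something the dataset description states: a lidar frame with z pointing up, dimensions given as (length, width, height) along the box's own axes, and a heading measured in radians about the z-axis. The function name `box_corners` is hypothetical.

```python
import numpy as np


def box_corners(center, dims, heading):
    """Return the (8, 3) corner coordinates of one 3D box.

    Assumptions (not confirmed by the dataset description):
    - center is (x, y, z) of the box center in the lidar frame, z pointing up;
    - dims is (length, width, height) along the box's own x, y, z axes;
    - heading is a rotation in radians about the z-axis.
    """
    length, width, height = dims
    # Corner offsets in the box frame, before rotation.
    x = np.array([1, 1, -1, -1, 1, 1, -1, -1]) * (length / 2.0)
    y = np.array([1, -1, -1, 1, 1, -1, -1, 1]) * (width / 2.0)
    z = np.array([-1, -1, -1, -1, 1, 1, 1, 1]) * (height / 2.0)
    corners = np.stack([x, y, z], axis=1)

    # Rotate about the z-axis, then translate to the box center.
    c, s = np.cos(heading), np.sin(heading)
    rot_z = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return corners @ rot_z.T + np.asarray(center, dtype=float)
```

If the real labels use a different axis order or angle convention, only the offset arrays and the rotation matrix would need to change.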

Data Format

Our dataset contains two parts: the point cloud files and their annotations.

Each point cloud is composed of the points from three lidars, mounted on the top, left and right of the car respectively. Each label file contains many rows, and each row describes one 3D box.
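A label row can be turned into a small record as sketched below. Only two facts come from this document's own statistics script: field 0 is the class name and fields 8 to 10 are the box size (h, w, l). The positions of the center (fields 11 to 13) and the heading (field 14) are an assumption borrowed from the KITTI label layout and should be checked against real files. `Box3D` and `parse_label_line` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Box3D:
    cls: str
    height: float
    width: float
    length: float
    x: float
    y: float
    z: float
    heading: float


def parse_label_line(line):
    """Parse one label row into a Box3D.

    Field 0 (class) and fields 8-10 (h, w, l) match the statistics script
    in this document. Fields 11-13 (center) and 14 (heading) are an
    ASSUMPTION borrowed from the KITTI label layout.
    """
    fields = line.split()
    if len(fields) < 15:
        raise ValueError("expected at least 15 fields, got %d" % len(fields))
    h, w, l = (float(v) for v in fields[8:11])
    x, y, z = (float(v) for v in fields[11:14])
    return Box3D(fields[0], h, w, l, x, y, z, float(fields[14]))
```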


1. Read point cloud file

import numpy as np

# Each point has four float32 attributes: x, y, z and intensity.
# The intensity of every point is forced to zero.
# pc_path is the path to one point cloud file.
points = np.fromfile(pc_path, dtype=np.float32).reshape(-1, 4)
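The read step can be wrapped in a small helper that checks the file size before reshaping, which catches truncated files early. This is a minimal sketch; `load_point_cloud` and `crop_by_range` are hypothetical helper names, and the range crop is only an example of a common follow-up step, not part of the dataset tooling.

```python
import numpy as np


def load_point_cloud(pc_path):
    """Load a float32 binary file as an (N, 4) array of x, y, z, intensity."""
    raw = np.fromfile(pc_path, dtype=np.float32)
    if raw.size % 4 != 0:
        raise ValueError("%s does not hold a whole number of 4-float points" % pc_path)
    return raw.reshape(-1, 4)


def crop_by_range(points, max_range):
    """Keep points whose horizontal distance from the origin is <= max_range."""
    dist = np.hypot(points[:, 0], points[:, 1])
    return points[dist <= max_range]
```

Since the intensity channel is stated to be zero everywhere, models that expect intensity as an input feature may need to drop or ignore the fourth column.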

2. The statistics of dataset

import os

import matplotlib.pyplot as plt
import numpy as np

def count_labels(l_path, cls):
    """Count the boxes of class `cls` in all label files under `l_path`.

    Fields 8-10 of each label row hold the box size (h, w, l).
    """
    size = []
    for l in os.listdir(l_path):
        with open(os.path.join(l_path, l)) as f:
            for label_line in f:
                label_line = label_line.split()
                if label_line and label_line[0] == cls:
                    size.append([float(label_line[8]), float(label_line[9]), float(label_line[10])])
    np_size = np.array(size)
    return np_size.shape[0]

def visu_class(class_path):
    """Plot the number of bounding boxes per class as a bar chart."""
    plt.figure(figsize=(10, 10), dpi=80)
    class_ls = ['Bus', 'Dontcare', 'Barrier', 'Animal', 'Bicycles', 'Cyclist', 'Car', 'Child', 'Adult', 'Truck', 'Motorcycle', 'Bicycle', 'Motorcyclist', 'Tricycle']
    N = len(class_ls)
    objects_num = []
    for c in class_ls:
        objects_num.append(count_labels(class_path, c))
    values = tuple(objects_num)
    index = np.arange(N)
    width = 0.45
    # Build the legend text: one "class: count" entry per line.
    label_content = ""
    for c, n in zip(class_ls, objects_num):
        label_content += "%s: %d" % (c, n) + "\n"
    print(index, values)
    p2 = plt.bar(index, values, width, label=label_content, color="blue")
    plt.ylabel('Number of bounding boxes')
    plt.title('Bounding box distribution')
    plt.xticks(index, class_ls)
    plt.legend(loc="upper right")
    plt.show()
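The script above rereads every label file once per class. A possible alternative, sketched here with the standard library's `collections.Counter`, tallies all classes in a single pass. `count_all_classes` is a hypothetical name, and the only format assumption is the one the script already makes: the class name is the first whitespace-separated field of each row.

```python
import os
from collections import Counter


def count_all_classes(label_dir):
    """Count boxes per class with a single pass over every label file."""
    counts = Counter()
    for name in sorted(os.listdir(label_dir)):
        path = os.path.join(label_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path) as f:
            for line in f:
                fields = line.split()
                if fields:  # skip blank lines
                    counts[fields[0]] += 1
    return counts
```

The returned counter can feed the bar chart directly, for example `values = [counts[c] for c in class_ls]`, since a `Counter` returns 0 for classes that never occur.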


This dataset is free for academic use. Use it at your own risk. For other purposes, please contact the corresponding author, Lichao Wang (

Author = {Lichao Wang and Lanxin Lei and Hongli Song and Weibao Wang},
Title = {The NEOLIX Open Dataset for Autonomous Driving},
Year = {2020},
Eprint = {arXiv:2011.13528},



Neolix aspires to lead the future smart city way of life with state-of-the-art technologies. Empowered by autonomous driving, 5G communications, the Internet of Vehicles, intelligent hardware and an autonomous vehicle super factory, Neolix has commercially deployed vehicles in hundreds of scenarios globally to build a smart services ecosystem with trusted partners and to provide consumers with a superior experience.
The Baidu AI Data Department belongs to Baidu's Intelligent Cloud business group. As Baidu's data support department, it handles various types of internal AI data collection and annotation needs.