The SCUT FIR Pedestrian Datasets is a large far-infrared (FIR) pedestrian detection dataset. It consists of about 11 hours of image sequences ($\sim 10^6$ frames) recorded at 25 Hz while driving through diverse traffic scenarios at speeds below 80 km/h.
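As a quick sanity check on the frame count quoted above, the arithmetic can be sketched as follows. The variable names are illustrative only, and the 11-hour duration is the approximate figure from the description, so the result is an estimate rather than the exact number of frames in the dataset.

```python
# Approximate recording parameters taken from the description above.
hours_recorded = 11          # "about 11 hours" (approximate)
frame_rate_hz = 25           # frames per second

seconds_recorded = hours_recorded * 3600
total_frames = seconds_recorded * frame_rate_hz

print(f"Estimated frames: {total_frames:,}")   # on the order of 10^6
```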

Data Collection

The image sequences were collected from 11 road sections covering 4 kinds of scenes (downtown, suburbs, expressway and campus) in Guangzhou, China.

Data Annotation

We annotated 211,011 frames with a total of 477,907 bounding boxes around 7,659 unique pedestrians.


  • Seq video format. The data format is compatible with the Caltech Pedestrian Dataset format.
  • datatool. Evaluation and labeling code for our dataset, based on the Caltech Dataset.
  • toolbox. A dependency of datatool, based on Piotr's Matlab Toolbox.
  • pydatatool. To use this dataset with COCO-style annotations in the Detectron framework, use this Python version of datatool.
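For readers unfamiliar with COCO-style annotations, the following is a minimal sketch of what such a record generally looks like for one frame. It does not use pydatatool and does not reflect its actual output; the helper `make_coco_record`, the file name, the image size, the box coordinates, and the category name are all hypothetical placeholders chosen for illustration.

```python
import json


def make_coco_record(image_id, file_name, width, height, boxes):
    """Build a minimal COCO-style dict for a single frame (illustrative helper).

    boxes is a list of (x, y, w, h) tuples in pixels, measured from the
    top-left corner of the image, as in the COCO bbox convention.
    """
    annotations = []
    for ann_id, (x, y, w, h) in enumerate(boxes, start=1):
        annotations.append({
            "id": ann_id,
            "image_id": image_id,
            "category_id": 1,      # single placeholder category
            "bbox": [x, y, w, h],
            "area": w * h,
            "iscrowd": 0,
        })
    return {
        "images": [{
            "id": image_id,
            "file_name": file_name,
            "width": width,
            "height": height,
        }],
        "annotations": annotations,
        # Placeholder category name; the dataset's real labels may differ.
        "categories": [{"id": 1, "name": "person"}],
    }


# Hypothetical frame with one pedestrian box.
record = make_coco_record(1, "frame_000001.jpg", 720, 576, [(100, 200, 30, 70)])
print(json.dumps(record, indent=2))
```

A full COCO annotation file simply concatenates the `images` and `annotations` lists across all frames while keeping every `id` unique.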


