The benchmark comprises a large training set and a large test set. The training set contains 15,560 pedestrian samples (image cut-outs at 48x96 resolution) and 6,744 additional full images not containing pedestrians, from which negative samples can be extracted. The test set contains an independent sequence of more than 21,790 images with 56,492 pedestrian labels (fully visible or partially occluded), captured from a vehicle during a 27-minute drive through urban traffic, at VGA resolution (640x480, uncompressed). As such, the dataset (8.5 GB) is realistic and about one order of magnitude larger than other datasets at the time of publication. It specifies two evaluation settings: a “generic” one (2D bounding box overlap criterion) and one specific to pedestrian detection onboard a vehicle (3D localization criterion, regions of interest derived from a known ground plane and sensor coverage area, and processing constraints).
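
The 2D bounding box overlap criterion of the generic setting can be illustrated with a minimal sketch. It assumes that overlap is measured as intersection-over-union and that a detection counts as a match above a threshold of 0.5; both the measure and the threshold are assumptions made here for illustration, not values stated in the benchmark description. The names `box_overlap` and `is_match` are hypothetical.

```python
from typing import Tuple

# Axis-aligned box as (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def box_overlap(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes (assumed overlap measure)."""
    # Width and height of the intersection rectangle, clamped at zero
    # when the boxes do not overlap along that axis.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Degenerate (zero-area) boxes yield no overlap.
    return inter / union if union > 0 else 0.0


def is_match(detection: Box, ground_truth: Box, threshold: float = 0.5) -> bool:
    """True if the detection overlaps the label by at least `threshold` (assumed value)."""
    return box_overlap(detection, ground_truth) >= threshold
```

For example, `is_match((100, 50, 148, 146), (104, 52, 152, 148))` compares a hypothetical 48x96 detection against a slightly shifted label of the same size.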