INRIA Holidays
No Label
Image Search
License: Custom


The Holidays dataset is a set of images consisting mainly of our personal holiday photos. The remaining ones were taken on purpose to test robustness to various attacks: rotations, viewpoint and illumination changes, blurring, etc. The dataset includes a very large variety of scene types (natural, man-made, water and fire effects, etc.) and the images are in high resolution. The dataset contains 500 image groups, each of which represents a distinct scene or object. The first image of each group is the query image, and the correct retrieval results are the other images of the group.

The dataset can be downloaded from this page; see details below. The material provided includes:

  • the images themselves
  • the set of descriptors extracted from these images (see details below)
  • a set of descriptors produced, with the same extractor and descriptor, for a distinct dataset (Flickr60K).
  • two sets of clusters used to quantize the descriptors. These have been obtained from Flickr60K.
  • some pre-processed feature files for one million images, that we have used in our ECCV paper to perform the evaluation on a large scale.
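The clusters listed above are used to quantize descriptors. A minimal sketch of what such a quantization step could look like is given here. It assumes nearest-centroid assignment under squared Euclidean distance, which is an assumption (this page does not spell out the assignment rule), and the function names are hypothetical.

```python
def nearest_centroid(descriptor, centroids):
    """Return the index of the centroid closest to the descriptor.

    Assumption: closeness is measured with squared Euclidean distance.
    """
    best_index, best_dist = -1, float("inf")
    for index, centroid in enumerate(centroids):
        dist = sum((d - c) ** 2 for d, c in zip(descriptor, centroid))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def quantize(descriptors, centroids):
    """Map every descriptor to the index of its nearest centroid."""
    return [nearest_centroid(d, centroids) for d in descriptors]
```

This brute-force loop is only meant to illustrate the idea; with millions of descriptors a vectorized or approximate nearest-neighbour search would be needed.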

In our paper, we also used some sets of distractor images downloaded from Flickr. Their features are provided below.


  • Dataset size: 1491 images in total (500 queries and 991 corresponding relevant images)
  • Number of queries: 500 (one per group)
  • Number of descriptors produced: 4455091 SIFT descriptors of dimensionality 128
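The query/relevant split can be represented as a simple mapping from each query to its expected results. The sketch below assumes the groups are already available as ordered lists of image names (how they are derived from the file names is not covered here), and `build_ground_truth` is a hypothetical helper.

```python
def build_ground_truth(groups):
    """Map each query (first image of a group) to its set of relevant images."""
    ground_truth = {}
    for group in groups:
        query, *relevant = group  # first image is the query, the rest are relevant
        ground_truth[query] = set(relevant)
    return ground_truth
```

On the full dataset such a mapping would have 500 keys, and the relevant sets would contain 991 images in total.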

Data Format

Two binary file formats are used.

.siftgeo format

Descriptors are stored in raw form, together with the region information provided by the software of Krystian Mikolajczyk. There is no header (use the file length to find the number of descriptors).

A descriptor takes 168 bytes (floats and ints take 4 bytes, and are stored in little endian):

field        field type     description
x            float          horizontal position of the interest point
y            float          vertical position of the interest point
scale        float          scale of the interest region
angle        float          angle of the interest region
mi11         float          affine matrix component
mi12         float          affine matrix component
mi21         float          affine matrix component
mi22         float          affine matrix component
cornerness   float          saliency of the interest point
desdim       int            dimension of the descriptors
component    byte*desdim    the descriptor vector (desdim components)

A Matlab file to read .siftgeo files.
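For readers not using Matlab, the record layout above can also be parsed with Python's standard library. This is a minimal sketch based only on the field table (9 little-endian floats, one int, then desdim bytes); the function names are hypothetical and it is not the reference reader.

```python
import os
import struct

# 9 floats followed by 1 int, little endian: 40 bytes before the descriptor.
HEADER = struct.Struct("<9fi")
FIELDS = ("x", "y", "scale", "angle",
          "mi11", "mi12", "mi21", "mi22", "cornerness")


def read_siftgeo(path):
    """Return a list of (geometry dict, descriptor bytes) pairs."""
    points = []
    with open(path, "rb") as f:
        while True:
            head = f.read(HEADER.size)
            if not head:
                break  # clean end of file
            if len(head) != HEADER.size:
                raise ValueError("truncated record header")
            *geometry, desdim = HEADER.unpack(head)
            descriptor = f.read(desdim)  # one byte per component
            if len(descriptor) != desdim:
                raise ValueError("truncated descriptor")
            points.append((dict(zip(FIELDS, geometry)), descriptor))
    return points


def count_descriptors(path, desdim=128):
    """Derive the number of descriptors from the file length (no header)."""
    record_size = HEADER.size + desdim  # 40 + 128 = 168 bytes for SIFT
    size = os.path.getsize(path)
    if size % record_size:
        raise ValueError("file length is not a multiple of the record size")
    return size // record_size
```

`count_descriptors` assumes every record has the same dimension (128 by default), which matches the 168-byte record size stated above.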

.fvecs format

This format is used to store centroids. As with the .siftgeo format, there is no header, and centroids are stored in raw form. Each centroid takes 516 bytes (a 4-byte int followed by 128 4-byte floats), as shown below.

field        field type      description
desdim       int             descriptor dimension
components   float*desdim    the centroid components

A Matlab file to read .fvecs files.
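The .fvecs layout can be read in a similar way with Python's standard library. The sketch below assumes little-endian storage, as stated for .siftgeo (the byte order of .fvecs is not stated explicitly here), and `read_fvecs` is a hypothetical name.

```python
import struct


def read_fvecs(path):
    """Return a list of float tuples, one per stored centroid."""
    vectors = []
    with open(path, "rb") as f:
        while True:
            head = f.read(4)
            if not head:
                break  # clean end of file
            if len(head) != 4:
                raise ValueError("truncated dimension field")
            (dim,) = struct.unpack("<i", head)  # assumed little endian
            body = f.read(4 * dim)
            if len(body) != 4 * dim:
                raise ValueError("truncated vector")
            vectors.append(struct.unpack("<%df" % dim, body))
    return vectors
```

With 128-dimensional centroids each record is 4 + 4 × 128 = 516 bytes, so the number of centroids is the file length divided by 516.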

Descriptor Extraction

Before computing descriptors, we resized the images to a maximum of 786432 pixels (1024 × 768) and performed a slight intensity normalization.
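To make the pixel budget concrete, the sketch below computes the dimensions an image would have after being reduced to at most 786432 pixels while keeping its aspect ratio. It assumes smaller images are left untouched and rounds down; the exact rounding used by the actual tools may differ, and `target_size` is a hypothetical helper.

```python
import math

MAX_PIXELS = 1024 * 768  # 786432


def target_size(width, height, max_pixels=MAX_PIXELS):
    """Return (width, height) scaled down to at most max_pixels pixels."""
    if width * height <= max_pixels:
        return width, height  # assumption: never scale up
    scale = math.sqrt(max_pixels / (width * height))
    # Rounding down keeps the result within the pixel budget.
    return int(width * scale), int(height * scale)
```

For example, a 2048 × 1536 image has four times the budget, so both sides are halved to 1024 × 768.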

For the descriptor extraction, we used a modified version of the software of Krystian Mikolajczyk (thank you Krystian!).

We have used the Hessian-Affine extractor and the SIFT descriptor. Note however that our version of the code may be different from the one which is currently on the web. If so, this should not noticeably impact the results.

The set of commands used to extract the descriptors was the following. Note that we have used the default values for descriptor generation.

    infile=xxxx.jpg
    tmpfile=${infile/jpg/pgm}
    outfile=${infile/jpg/siftgeo}

    # Rescaling and intensity normalization
    djpeg $infile | ppmtopgm | pnmnorm -bpercent=0.01 -wpercent=0.01 -maxexpand=400 | pamscale -pixels $[1024*768] > $tmpfile

    # Compute descriptors
    compute_descriptors -i $tmpfile -o4 $outfile -hesaff -sift

The output format option -o4 produces a binary .siftgeo file, whose format is described above. The other available formats are described here.

NEW VERSION: the new version of the descriptor (pre-compiled).

It is almost the same as the one above, but it includes dense sampling as well. Also, it no longer depends on ImageMagick, for improved portability. As a result, the JPG input format is no longer supported. For the same set of parameters, there might be some small differences between the output of this version and that of the previous one, but these differences are mainly precision-related, and the outputs of the two programs are intended to be compatible.


Please use the following citation when referencing the dataset:

  title={Hamming Embedding and Weak Geometry Consistency for Large Scale Image Search--Extended
  author={Jegou, Herv{\'e} and Douze, Matthijs and Schmid, Cordelia},


