This dataset is maintained by the Multimedia Computing and Computer Vision Lab, Augsburg University.

FlickrLogos-32

The dataset FlickrLogos-32 contains photos showing brand logos and is meant for the evaluation of logo retrieval and multi-class logo detection/recognition systems on real-world images. We collected photos of 32 different logo brands by downloading them from Flickr. All logos have an approximately planar surface.

Visual summary of all 32 classes

Visual summary of test set of all 32 classes

Partitions / subsets

The retrieved images were inspected manually to ensure that the specific logo is actually shown. The whole dataset is split into three disjoint subsets P1, P2, and P3, each containing images of all 32 classes. The first partition P1 - the training set - consists of 10 hand-picked images per class, chosen such that each consistently shows a single logo under various views with as little background clutter as possible. The other two partitions P2 (validation set) and P3 (test set = query set) contain 30 images per class. Unlike P1, these images contain at least one logo instance, and in several cases multiple instances.

To develop high-precision classifiers, it is important to evaluate their sensitivity to non-logo images. Therefore the partitions P2 and P3 each include a further 3000 images downloaded from Flickr using the queries "building", "nature", "people" and "friends". These negative images complete our dataset. A brief summary of the data subsets is shown in the table below.

Dataset partitions / subsets (P1, P2, and P3 are disjoint; 8240 images in total)

Partition                  Description                                      Images         #Images (total)
P1 (training set)          Hand-picked logo images                          10 per class   320
P2 (validation set)        Images showing at least one logo, various views  30 per class   3960
                           Non-logo images                                  3000
P3 (test set = query set)  Images showing at least one logo, various views  30 per class   3960
                           Non-logo images                                  3000
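As a quick sanity check, the per-class counts above reproduce the stated totals (a short sketch in Python; the numbers are taken directly from the description, nothing is read from disk):

```python
# Partition sizes of FlickrLogos-32, recomputed from the stated counts.
NUM_CLASSES = 32

P1 = NUM_CLASSES * 10          # hand-picked training images: 10 per class
P2 = NUM_CLASSES * 30 + 3000   # validation: 30 logo images per class + non-logo images
P3 = NUM_CLASSES * 30 + 3000   # test/query set: same composition as P2

print(P1, P2, P3, P1 + P2 + P3)  # 320 3960 3960 8240
```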

Pixel-level annotations

This dataset further includes pixel-level annotations, i.e. binary masks plus bounding boxes (see above) that mark the position of each logo in an image.

The binary masks are provided as .png files together with the coordinates of the corresponding bounding box.
Each logo image in <basedir>/classes/jpg/<class>/<file>.jpg has its mask in <basedir>/classes/masks/<class>/<file>.mask.<n>.png, where n is the mask number starting at 0.
Masks are single-channel PNG images of the same size as the original image. Masked areas (where the logo is) have a value != 0; unmasked areas (the background) have value 0.

Separate masks are available for each annotation in an image, plus an additional mask merged from all individual masks:
Each logo image in <basedir>/classes/jpg/<class>/<file>.jpg has its merged mask in <basedir>/classes/masks/<class>/<file>.mask.merged.png.
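The naming scheme above can be turned into small helpers, sketched below. The function names are ours, not part of the official kit, and the bounding-box helper operates on a mask given as a plain list of pixel rows (a real mask would be loaded from the PNG file with an image library):

```python
import glob
import os

def mask_paths(basedir, cls, stem):
    """List the individual mask files for one logo image, following the scheme
    <basedir>/classes/masks/<class>/<file>.mask.<n>.png (merged mask excluded).
    Illustrative helper, not part of the official evaluation kit."""
    pattern = os.path.join(basedir, "classes", "masks", cls, stem + ".mask.*.png")
    return sorted(p for p in glob.glob(pattern) if not p.endswith(".mask.merged.png"))

def bbox_from_mask(mask):
    """Bounding box (x_min, y_min, x_max, y_max) of the non-zero (logo) pixels
    of a single-channel mask, given as a list of pixel rows."""
    xs = [x for row in mask for x, v in enumerate(row) if v != 0]
    ys = [y for y, row in enumerate(mask) if any(v != 0 for v in row)]
    return (min(xs), min(ys), max(xs), max(ys))

# Toy 4x5 mask: the "logo" (value 255) occupies a 2x2 block.
toy = [[0,   0,   0, 0, 0],
       [0, 255, 255, 0, 0],
       [0, 255, 255, 0, 0],
       [0,   0,   0, 0, 0]]
print(bbox_from_mask(toy))  # (1, 1, 2, 2)
```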

Thumbnail images are also included for easy visualization of results.

Evaluation Protocol

Evaluation of classification/recognition methods:
  • Training may use the training and validation sets only.
  • All images in the test set P3 (logos + non-logos = 3960 images) are used to evaluate recognition methods.
  • See fl_eval_classification.py for details of the evaluation procedure.
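In the simplest reading of this protocol, a predicted label counts as correct only if it names the right logo class, with non-logo images forming their own class. A minimal sketch of such precision/recall bookkeeping, assuming per-image ground-truth and predicted labels (the official fl_eval_classification.py remains the authoritative implementation):

```python
def precision_recall(gt, pred, logo_classes):
    """Detection precision/recall over logo classes: a prediction is a true
    positive only if it names the correct logo class; 'no-logo' predictions
    on non-logo images are true negatives and enter neither metric."""
    tp = sum(1 for g, p in zip(gt, pred) if g == p and g in logo_classes)
    fp = sum(1 for g, p in zip(gt, pred) if p in logo_classes and p != g)
    fn = sum(1 for g, p in zip(gt, pred) if g in logo_classes and p != g)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy example: 5 test images, two logo classes plus the non-logo class.
gt   = ["adidas", "nike", "no-logo", "nike",    "no-logo"]
pred = ["adidas", "nike", "nike",    "no-logo", "no-logo"]
p, r = precision_recall(gt, pred, {"adidas", "nike"})
print(round(p, 3), round(r, 3))  # 0.667 0.667
```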
Evaluation of retrieval methods:
  • All images in the training and validation sets are indexed, including non-logo images (4280 images in total).
  • All images in the query set P3 (logos only, 960 images) are used to evaluate retrieval methods.
  • The average precision is computed for every query (see our scripts for the specific implementation of AP) and averaged over all queries, yielding the final mAP.
  • Unless specified otherwise, the masks/bounding boxes are not used as regions of interest when indexing and querying.
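One common definition of AP, used here only as an illustration (the scripts in the evaluation kit define the exact variant), averages precision@k over the ranks at which relevant images appear:

```python
def average_precision(ranked, relevant):
    """AP for one query: mean of precision@k over the ranks k at which a
    relevant image is retrieved, normalized by the number of relevant images."""
    hits, precisions = 0, []
    for k, img in enumerate(ranked, start=1):
        if img in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant) if relevant else 0.0

# Two toy queries over a small index; mAP is the mean AP over all queries.
queries = [
    (["a", "x", "b", "y", "z"], {"a", "b"}),  # relevant at ranks 1 and 3
    (["x", "a", "y", "b", "z"], {"a", "b"}),  # relevant at ranks 2 and 4
]
aps = [average_precision(ranked, rel) for ranked, rel in queries]
print(round(sum(aps) / len(aps), 4))  # 0.6667
```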

Evaluation & Tools

We provide evaluation scripts to test both retrieval and classification systems (Python 2.7+ needed):
  • fl_eval_retrieval.py: Evaluates the retrieval of logo images by computing mAP, AvgTop4 score, response ratio, etc.
  • fl_eval_classification.py: Evaluates the classification of logo images by computing precision, recall and more.
  • fl_plot_classification_results.py: Plots classification results, i.e. true positives per class and the confusion matrix. Contained in full package.
  • Other scripts that simplify management of the dataset (copying, etc.)
Download the evaluation kit including all scripts plus additional sample data. The evaluation kit was last updated on 18th November 2013 (Version 1.0.4). See the changelog for details. If you encounter problems, please contact us.

Download

If you wish to download the dataset, please send an (informal) e-mail to Christian Eggert. Please state your name, your institution, and why you would like access to this dataset (we are curious). We will then send you a download link by e-mail. The file size is 3.1 GB.

Note: This dataset consists of images downloaded from Flickr. Use of these images must respect Flickr's terms of use.

Precomputed Visual Features

We provide precomputed visual features for download as used in our publications [1], [2] and [3]:
  • Hessian-affine SIFT visual words (used in [1]) obtained from Hessian-affine covariant regions with SIFT descriptors quantized using a vocabulary obtained with flat k-means. Available for visual vocabulary sizes ranging from 1,000 to 4,096 words for SIFT and Color SIFT. [Data].
  • RootSIFT visual words & descriptors (used in [2,3]) obtained from Difference-of-Gaussian interest points with RootSIFT descriptors. The features of each image are stored as x, y, size, orientation. Additionally, descriptors and visual word labels of the 3 closest centroids, as well as the distances to these centroids, are available for different visual vocabulary sizes: [8, 16, 24, 32, 64, 96, 100, 128, 192, 200, 256, 500, 1000, 2000, 5000, 10K, 20K, 50K, 100K, 200K, 500K, 1M, 2M, 3M, 4M visual words]. All features have been computed with different patch magnifiers [3 (default), 6, 9]. All vocabularies were learned on the training+validation set only. For larger vocabularies (>= 10K), approximate k-means and randomized kd-trees were used; in the other cases, exact k-means with exhaustive nearest-neighbour search. See [2] and [3] for details and evaluation. [Data].

Paper

If you use this dataset in your work please cite the following paper:

[1] Scalable Logo Recognition in Real-World Images
Stefan Romberg, Lluis Garcia Pueyo, Rainer Lienhart, Roelof van Zwol
ACM International Conference on Multimedia Retrieval 2011 (ICMR11), Trento, April 2011.
Also Technical Report, University of Augsburg, Institute of Computer Science, March 2011
[PDF] [Errata] [Slides] [Bibtex] [ACM catalog entry]

Related Papers

[2] Bundle min-Hashing
Stefan Romberg, Rainer Lienhart
International Journal of Multimedia Information Retrieval, Springer, Volume 2, Issue 4, pp. 243-259, September/November 2013.
[Bibtex] [DOI: 10.1007/s13735-013-0040-x]

[3] Bundle min-Hashing For Logo Recognition
Stefan Romberg, Rainer Lienhart
ACM International Conference on Multimedia Retrieval (ICMR) 2013, Dallas, April 2013.
[PDF] [Slides] [Bibtex] [ACM catalog entry]

[4] Robust Feature Bundling
Stefan Romberg, Moritz August, Christian X. Ries, Rainer Lienhart
Advances in Multimedia Information Processing - PCM 2012, Lecture Notes in Computer Science, Springer, 2012.
[PDF] [Slides] [Bibtex] [Springer catalog entry]

Important Notes

There is a similar dataset called FlickrLogos-27. It contains only 27 classes, and while there is some overlap, it is largely different from our FlickrLogos-32 dataset. Results on these two datasets are not comparable.

Contact

If you have any questions, corrections or other issues please contact Christian Eggert. The former maintainer was Stefan Romberg.