Appendix B. Image Datasets

B.1 Flickr

The immensely popular photo-sharing site Flickr is a gold mine for computer vision researchers and hobbyists. With hundreds of millions of images, many of them tagged by users, it is a great resource for gathering training data or running experiments on real data. Flickr has an API for interfacing with the service that makes it possible to upload, download, and annotate images (and much more). A full description of the API is available on the Flickr site, and there are kits for many programming languages, including Python.

Let’s look at using a library called flickrpy, which is freely available. Download the library file. You will need an API key from Flickr to get this to work. Keys are free for non-commercial use and can be requested for commercial use. Just click the link “Apply for a new API Key” on the Flickr API page and follow the instructions. Once you have an API key, open the downloaded file and replace the empty string on the line

API_KEY = ''

with your key. It should look something like this:

API_KEY = '123fbbb81441231123cgg5b123d92123'
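To make the role of the key more concrete, here is a minimal sketch of how a tag search request to Flickr's REST API can be assembled with the standard library alone. It is written for Python 3, whereas the book's own code targets Python 2. The endpoint, the flickr.photos.search method, and the parameter names come from Flickr's public API documentation; the helper name build_search_url and the placeholder key are hypothetical, and this is not a description of how flickrpy works internally.

```python
from urllib.parse import urlencode

# REST endpoint as given in Flickr's public API documentation.
REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_search_url(api_key, tag, per_page=10):
    """Return the URL for a flickr.photos.search request (hypothetical helper)."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,      # the key obtained from the Flickr API page
        "tags": tag,             # tag to search for
        "per_page": per_page,    # number of results per page
        "format": "json",        # ask for a JSON response
        "nojsoncallback": 1,     # plain JSON, not wrapped in a callback
    }
    # urlencode takes care of escaping spaces and special characters.
    return REST_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder key for illustration only; substitute your own.
    print(build_search_url("YOUR_API_KEY", "sunset"))
```

Every request carries the key as the api_key parameter, which is why the library needs it filled in before any call will succeed.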

Let’s create a simple command-line tool that downloads images tagged with a particular tag. Add the following code to a new file:

import flickr
import urllib, urlparse
import os
import sys

if len(sys.argv)>1:
    tag = sys.argv[1]
else:
    print 'no tag specified'

# downloading image data
f = flickr.photos_search(tags=tag)
urllist = [] #store ...
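The listing is cut off after the URL list is created. As a rough illustration of the remaining step, turning a list of image URLs into files on disk, here is a minimal sketch in Python 3 (the listing above is Python 2). It is not the author's code: the function names local_filename and download_all and the out_dir parameter are hypothetical, and the sketch assumes urllist already holds complete image URLs.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve


def local_filename(url, out_dir="."):
    """Map an image URL to a path inside out_dir, keeping the URL's file name."""
    # urlparse(...).path drops the host and any query string.
    name = os.path.basename(urlparse(url).path)
    return os.path.join(out_dir, name)


def download_all(urllist, out_dir="."):
    """Download every URL in urllist into out_dir and return the saved paths."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for url in urllist:
        path = local_filename(url, out_dir)
        urlretrieve(url, path)  # performs the network request
        saved.append(path)
    return saved
```

Keeping the file name derivation separate from the download makes it possible to check where each image would be saved without touching the network.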
