January 2018
Beginner to intermediate
284 pages
8h 35m
English
A few publicly available datasets pair images with English sentences that describe the image content, and they can be used for evaluation purposes.

The following table lists some basic information:
| Dataset name | #Images | #Text | Object | Link | Description | Reference |
|---|---|---|---|---|---|---|
| Pascal VOC 2008 | 1,000 | 5 | No | http://nlp.cs.illinois.edu/HockenmaierGroup/pascal-sentences/index.html | It consists of 1,000 images randomly selected from the training and validation set of the PASCAL 2008 object recognition challenge. Each image is associated with five different captions that describe the entities and events depicted ... | |
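
Since each image in such a dataset is associated with several captions, a common first step is to group the sentences by image and check that every image has the expected number of them. The following is a minimal sketch of that idea, not the dataset's official loader: the file names, the sentences, and the helper names `group_captions` and `images_with_wrong_caption_count` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical (image_id, caption) pairs. The file names and sentences are
# made up for illustration and are not taken from the real dataset.
pairs = [
    ("img_0001.jpg", "A dog runs across a field."),
    ("img_0001.jpg", "A brown dog is running on grass."),
    ("img_0001.jpg", "The dog is playing outside."),
    ("img_0001.jpg", "A dog in a green meadow."),
    ("img_0001.jpg", "An animal runs through a park."),
    ("img_0002.jpg", "A man rides a bicycle."),
    ("img_0002.jpg", "A cyclist on a city street."),
    ("img_0002.jpg", "Someone is biking down the road."),
    ("img_0002.jpg", "A person on a bike."),
]


def group_captions(pairs):
    """Collect all captions for each image id, preserving input order."""
    grouped = defaultdict(list)
    for image_id, caption in pairs:
        grouped[image_id].append(caption)
    return dict(grouped)


def images_with_wrong_caption_count(grouped, expected=5):
    """Return the sorted image ids whose caption count differs from `expected`."""
    return sorted(
        image_id
        for image_id, captions in grouped.items()
        if len(captions) != expected
    )


grouped = group_captions(pairs)
incomplete = images_with_wrong_caption_count(grouped, expected=5)
print(incomplete)
```

In this toy data the second image has only four captions, so it is the one reported. The `expected` parameter is an assumption that would change for datasets with a different number of sentences per image.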