January 2018
Beginner to intermediate
284 pages
8h 35m
English
With rapid progress in both computer vision and NLP, more and more researchers are looking at potential applications at the intersection of the two fields.
One such application is image captioning, or im2text, which automatically generates a description for a given image. It requires the joint use of techniques from both computer vision and NLP. For a given image, the goal is to analyze its visual content and generate a realistic textual description of the major content or most salient aspect of the image, for example, a person who appears in the picture.
To achieve this goal, the caption generation model needs at least two capabilities: