Chapter 9. Applications of LSTM – Image Caption Generation

In the previous chapter, we saw how to use LSTMs to generate text. In this chapter, we will use an LSTM for a more complex task: generating suitable captions for given images. The task is more complex because it involves several subtasks: training or using a CNN to encode images as vectors, learning word embeddings, and training an LSTM to generate captions. It is therefore not as straightforward as text generation, where we simply feed text in and produce text out sequentially.
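To make the way these subtasks fit together more concrete, here is a minimal sketch of the data flow in plain Python. It is an illustration under stated assumptions, not the implementation developed in this chapter: `encode_image` is a hypothetical mean-pooling stand-in for a CNN, the vocabulary and dimensions are made up, the weights are random and untrained, the LSTM cell omits bias terms, and using the image vector as the initial hidden state is just one possible way to condition the LSTM on the image.

```python
import math
import random

# Hypothetical toy vocabulary; a real model learns from a captioned dataset.
VOCAB = ["<start>", "<end>", "a", "dog", "runs", "on", "grass"]
DIM = 4  # shared size for image vectors, word embeddings, and LSTM state


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode_image(pixels, dim=DIM):
    """Stand-in for a CNN encoder (assumption): mean-pool pixels into `dim` values."""
    chunk = len(pixels) // dim
    return [sum(pixels[k * chunk:(k + 1) * chunk]) / chunk for k in range(dim)]


def init_params(rng, dim=DIM, vocab_size=len(VOCAB)):
    # Four gate matrices act on the concatenation [input, hidden state].
    params = {name: rand_matrix(dim, 2 * dim, rng) for name in ("Wi", "Wf", "Wo", "Wg")}
    params["embed"] = rand_matrix(vocab_size, dim, rng)  # word embeddings
    params["out"] = rand_matrix(vocab_size, dim, rng)    # hidden state -> word scores
    return params


def lstm_step(params, x, h, c):
    """One LSTM cell update (bias terms omitted for brevity)."""
    z = x + h  # list concatenation of input and previous hidden state
    i = [sigmoid(v) for v in matvec(params["Wi"], z)]    # input gate
    f = [sigmoid(v) for v in matvec(params["Wf"], z)]    # forget gate
    o = [sigmoid(v) for v in matvec(params["Wo"], z)]    # output gate
    g = [math.tanh(v) for v in matvec(params["Wg"], z)]  # candidate cell values
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def generate_caption(params, pixels, max_len=10):
    """Greedy decoding: pick the highest-scoring word at each step."""
    h = encode_image(pixels)  # image vector initializes the hidden state
    c = [0.0] * DIM
    token = VOCAB.index("<start>")
    caption = []
    for _ in range(max_len):
        x = params["embed"][token]
        h, c = lstm_step(params, x, h, c)
        scores = matvec(params["out"], h)
        token = max(range(len(VOCAB)), key=scores.__getitem__)
        if VOCAB[token] == "<end>":
            break
        caption.append(VOCAB[token])
    return caption


params = init_params(random.Random(0))
image = [((7 * k) % 16) / 16 for k in range(64)]  # fake 8x8 grayscale image
caption = generate_caption(params, image)
print(caption)  # arbitrary words, because the weights are untrained
```

Because nothing here is trained, the printed caption is meaningless; the point is only the order of operations: encode the image once, then repeatedly embed the previous word, update the LSTM state, and choose the next word until an end token or a length limit is reached.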

Automated image captioning or image annotation has a wide variety of applications. One of the most prominent applications is image retrieval in search engines. Automated ...
