August 2021
Intermediate to advanced
752 pages
21h 59m
English
We have now spent several chapters working with textual data. Before that, we looked at how convolutional networks can be applied to image data. In this chapter, we describe how to combine a convolutional network with a recurrent network to build a network that performs image captioning: given an image as input, the network generates a textual description of that image. We then describe how to extend the network with attention. We conclude the chapter with a programming example that implements such an attention-based image-captioning network.
Given that this programming example is the most extensive example in the book and we describe it after we described the Transformer, it ...
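To make the idea of attention over an image concrete, the following is a minimal sketch in plain Python, not the chapter's own implementation. It assumes that a convolutional network has already turned the image into a small set of region feature vectors and that the recurrent network exposes a state vector of the same size. The names `softmax`, `attend`, `region_features`, and `decoder_state` are hypothetical, and dot-product scoring is an assumption chosen for brevity; other scoring functions are possible.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    largest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, decoder_state):
    """Assumed dot-product attention over image regions.

    region_features: list of feature vectors, one per image region
                     (stand-in for the output of a convolutional network).
    decoder_state:   current state vector of the recurrent network.
    Returns the attention weights and the weighted-sum context vector.
    """
    # Score each region by its similarity to the decoder state.
    scores = [
        sum(f * h for f, h in zip(region, decoder_state))
        for region in region_features
    ]
    weights = softmax(scores)

    # Context vector: weighted sum of the region feature vectors.
    dim = len(decoder_state)
    context = [
        sum(w * region[i] for w, region in zip(weights, region_features))
        for i in range(dim)
    ]
    return weights, context


# Toy input: three image regions with 2-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [2.0, 0.0]
weights, context = attend(regions, state)
```

In this toy input, the first and third regions align equally well with the state vector, so they receive equal weight and the second region receives less. In a captioning network, a context vector like this would be recomputed at every output step so that each generated word can draw on different parts of the image.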