January 2019
Intermediate to advanced
316 pages
8h 16m
English
The generator network is again a deep convolutional neural network. The Stage-I result, a low-resolution image, is passed through several downsampling layers to extract image features. The image features and the text conditioning variables are then concatenated along the channel dimension. Next, the concatenated tensor is fed into a series of residual blocks, which learn multimodal representations across the image and text features. Finally, the output of the residual blocks is fed into a set of upsampling layers, which generate a high-resolution image with dimensions of 256x256x3. The following images show the architecture of the generator network.
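The data flow described above can also be traced as plain shape arithmetic. The following is a minimal sketch, not the book's implementation: it tracks only tensor shapes in height x width x channels order. Apart from the 256x256x3 output, every size here is an assumption made for illustration (a 64x64x3 Stage-I image, a 128-dimensional text conditioning vector, two downsampling steps, four residual blocks, and the channel counts), and all of the function names are hypothetical helpers.

```python
# Shape-only walk-through of the generator's data flow.
# Every size except the final 256x256x3 output is an assumption.


def downsample(shape, out_channels):
    """Stride-2 step: halves height and width, sets the channel count."""
    h, w, _ = shape
    return (h // 2, w // 2, out_channels)


def upsample(shape, out_channels):
    """2x upsampling step: doubles height and width, sets the channel count."""
    h, w, _ = shape
    return (h * 2, w * 2, out_channels)


def concat_channels(a, b):
    """Concatenate two feature maps along the channel (last) dimension."""
    if a[:2] != b[:2]:
        raise ValueError("spatial sizes must match to concatenate on channels")
    return (a[0], a[1], a[2] + b[2])


def residual_block(shape):
    """A residual block adds its input to its output, so the shape is kept."""
    return shape


def to_rgb(shape):
    """Final projection to a 3-channel image at the same resolution."""
    return (shape[0], shape[1], 3)


def stage2_generator_shapes(low_res=(64, 64, 3), text_dim=128, n_residual=4):
    """Return a list of (stage name, shape) pairs for the whole pipeline."""
    trace = [("stage1_image", low_res)]

    # 1. Downsample the low-resolution image into image features
    #    (assumed: two steps, 64 -> 32 -> 16).
    x = low_res
    for channels in (256, 512):
        x = downsample(x, channels)
    trace.append(("image_features", x))

    # 2. Tile the text conditioning vector over the spatial grid so that
    #    it can be concatenated with the image features.
    text = (x[0], x[1], text_dim)
    trace.append(("tiled_text", text))
    x = concat_channels(x, text)
    trace.append(("concatenated", x))

    # 3. Residual blocks operate on the joint image-text features.
    for _ in range(n_residual):
        x = residual_block(x)
    trace.append(("after_residual", x))

    # 4. Upsample back up to the target resolution (16 -> 256 takes four
    #    doublings), then project to three color channels.
    for channels in (256, 128, 64, 32):
        x = upsample(x, channels)
    x = to_rgb(x)
    trace.append(("output", x))
    return trace


if __name__ == "__main__":
    for name, shape in stage2_generator_shapes():
        print(name, shape)
```

With the assumed defaults, the concatenated tensor has shape (16, 16, 640) and the final entry of the trace is (256, 256, 3), which matches the output size stated above. In a real network, each helper would correspond to one or more convolutional layers with normalization and activation; only the shape bookkeeping is modeled here.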