The architecture of the generator network of pix2pix is as follows:
Here we assume that both the input and the output are 3-channel 256x256 images. To illustrate the generator structure of pix2pix, feature maps are drawn as colored blocks and convolution operations as gray and blue arrows: gray arrows are convolution layers that reduce the feature map size, and blue arrows are layers that double it. Identity mappings (including skip connections) are drawn as black arrows.
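To make the size changes concrete, the following is a minimal sketch that traces only the spatial size of the feature maps through such a generator. It is not the actual pix2pix code. It assumes 8 size-reducing layers, a 4x4 kernel with stride 2 and padding 1 for every layer, and that the size-doubling layers are transposed convolutions; none of these values are stated above, and the function names are hypothetical.

```python
def conv_out(size, kernel=4, stride=2, padding=1):
    """Spatial size after a strided convolution (standard conv arithmetic)."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size, kernel=4, stride=2, padding=1):
    """Spatial size after a transposed convolution without output padding."""
    return (size - 1) * stride - 2 * padding + kernel


def trace_sizes(input_size=256, num_down=8):
    """Return the feature map sizes along the encoder and decoder paths.

    input_size and num_down are assumptions chosen for a 256x256 input.
    """
    encoder = [input_size]
    for _ in range(num_down):  # gray arrows: each layer halves the size
        encoder.append(conv_out(encoder[-1]))

    decoder = [encoder[-1]]
    for _ in range(num_down):  # blue arrows: each layer doubles the size
        decoder.append(deconv_out(decoder[-1]))
    return encoder, decoder


encoder_sizes, decoder_sizes = trace_sizes()
print("encoder:", encoder_sizes)
print("decoder:", decoder_sizes)
```

Under these assumptions the decoder sizes mirror the encoder sizes in reverse order, which is what lets a skip connection pair an encoder feature map with a decoder feature map of the same spatial size.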
We can see that the first half of the layers of this network ...