1.1 What is a text-to-image generation model?
1.1.1 Unimodal vs. multimodal models
1.1.2 Practical use cases of text-to-image models
1.2 Transformer-based text-to-image generation
1.2.1 Converting an image into a sequence of integers and then back
1.2.2 Training and using a transformer-based text-to-image model
1.3 Text-to-image generation with diffusion models
1.3.1 Forward and reverse diffusions
1.3.2 Latent diffusion models and Stable Diffusion
1.4 How to build text-to-image models from scratch
1.5 Challenges for text-to-image generation models
1.5.1 Are generative AI models stealing from artists?
1.5.2 The geometric inconsistency problem
1.6 Social, environmental, and ethical concerns