
Building Optimized Prompts for Stable Diffusion

In Stable Diffusion V1.5 (SD v1.5), crafting prompts that generate the ideal image can be challenging. It is not uncommon to see impressive images emerge from complex, unusual word combinations. This is largely due to the text encoder used in SD v1.5: OpenAI's CLIP model. CLIP was trained on captioned images from the internet, and many of those captions are lists of tags rather than structured sentences.

When using SD v1.5, we must not only memorize a plethora of "magical" keywords but also combine these tagging words effectively. SDXL's dual text encoders, CLIP and OpenCLIP, are much more advanced and capable than the single encoder in SD v1.5. However, we still need to follow ...
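The tag-combining approach described above can be sketched as a small helper that assembles a comma-separated, CLIP-caption-style prompt from a subject and keyword lists. The function name and the example tags are illustrative assumptions, not terms from the book:

```python
def build_prompt(subject, quality_tags=(), style_tags=()):
    """Join a subject with comma-separated tag keywords,
    mimicking the tag-style captions CLIP was trained on.
    (Illustrative helper; not part of any library API.)"""
    parts = [subject, *quality_tags, *style_tags]
    # Drop empty entries and normalize whitespace before joining.
    return ", ".join(p.strip() for p in parts if p.strip())

prompt = build_prompt(
    "a portrait of an astronaut",
    quality_tags=["highly detailed", "8k", "sharp focus"],
    style_tags=["digital art"],
)
print(prompt)
# a portrait of an astronaut, highly detailed, 8k, sharp focus, digital art
```

Keeping the subject first and appending quality and style tags afterward is one common convention; since SD v1.5's CLIP encoder responds to keyword presence more than sentence grammar, the comma-separated form often works as well as, or better than, full sentences.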
