10

Overcoming 77-Token Limitations and Enabling Prompt Weighting

From Chapter 5, we know that Stable Diffusion utilizes OpenAI’s CLIP model as its text encoder. The CLIP model’s tokenization implementation, as per the source code [6], has a context length of 77 tokens.

This 77-token limit carries over to Hugging Face Diffusers, which restricts the input prompt to a maximum of 77 tokens. Diffusers also does not let you assign weights to keywords within a prompt out of the box. Overcoming either restriction requires some modifications.
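To make the limit concrete, here is a minimal sketch that simulates what happens to a prompt that is too long. It does not load the real tokenizer. The helper `truncate_to_context` and the made-up token IDs are assumptions for illustration only. The sketch assumes that the start and end markers each occupy one of the 77 positions, leaving 75 for the prompt itself, and that `49406` and `49407` are the IDs of those markers in CLIP's vocabulary.

```python
# Simulate truncation of an over-long prompt to CLIP's context length.
# This is an illustrative sketch, not the Diffusers implementation.

BOS_ID = 49406          # <|startoftext|> in CLIP's vocabulary
EOS_ID = 49407          # <|endoftext|> in CLIP's vocabulary
CONTEXT_LENGTH = 77     # CLIP text encoder context length


def truncate_to_context(token_ids, context_length=CONTEXT_LENGTH):
    """Hypothetical helper: keep only the tokens that fit, then add BOS/EOS.

    Returns the sequence the encoder would see and the tokens that were cut.
    """
    max_content = context_length - 2  # two slots are reserved for BOS and EOS
    kept = token_ids[:max_content]
    dropped = token_ids[max_content:]
    return [BOS_ID] + kept + [EOS_ID], dropped


# 100 made-up token IDs standing in for a long tokenized prompt.
fake_prompt_ids = list(range(1000, 1100))

model_input, dropped = truncate_to_context(fake_prompt_ids)
print(len(model_input))  # 77
print(len(dropped))      # 25
```

In this simulation, the last 25 tokens never reach the text encoder, so whatever they describe cannot influence the generated image.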

For instance, let’s say you give a prompt string that produces more than 77 tokens, like this:

 from diffusers import StableDiffusionPipeline
 import torch
 pipe = StableDiffusionPipeline.from_pretrained(
     "stablediffusionapi/deliberate-v2", ...
