Chapter 12. Concepts of Inference
In the previous chapters of this book, you focused on training models with PyTorch and on building models that handle images (aka computer vision), text content (aka NLP), and sequences. In the rest of this book, you’ll focus on using trained models to make predictions from new data (aka inference), and in particular on using large generative models for text-to-text and text-to-image generation.
But before you jump into that, it’s important for you to understand how data gets into and out of a model. We’ve touched on this a little in the training chapters, but as you go deeper into ML, in either training or inference, you’ll need a solid understanding of the underlying concept: tensors.
Ultimately, no matter what type of data you have, you’ll convert it into tensors to pass it into the model. Similarly, no matter how you want to present the model’s answers to your users, you’ll get those answers back as tensors first!
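To make that round trip concrete, here is a minimal sketch in PyTorch. The toy vocabulary, the `encode` and `decode` helpers, and the "model" (which just reverses its input) are all made up for illustration; real inference would use a trained tokenizer and network instead.

```python
import torch

# Hypothetical toy vocabulary, invented for this sketch.
vocab = {"<unk>": 0, "hello": 1, "world": 2}
id_to_word = {index: word for word, index in vocab.items()}


def encode(text: str) -> torch.Tensor:
    """Turn a string into a 1-D tensor of integer token IDs."""
    ids = [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]
    return torch.tensor(ids, dtype=torch.long)


def decode(ids: torch.Tensor) -> str:
    """Turn a 1-D tensor of token IDs back into a string."""
    return " ".join(id_to_word[i] for i in ids.tolist())


# Data in: text becomes a tensor.
inputs = encode("Hello world")

# Stand-in for a model (an assumption for this sketch): reverse the sequence.
outputs = inputs.flip(0)

# Data out: the output tensor becomes text again.
result = decode(outputs)
print(inputs.tolist(), "->", result)
```

The shape of the workflow is the point here: whatever the model does in the middle, its input and output are both tensors, and it's your code (or a helper library) that translates to and from the format your users see.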
In many cases, you’ll have helpers that handle this conversion for you, such as the transformers that you’ll see in Chapter 15 (which covers LLMs) and the diffusers that you’ll see in Chapter 19 (which covers image generation). While you won’t touch tensors directly when you use these helpers, they’ll still be using tensors under the hood.
Tensors
A tensor is an array that can have any number of dimensions. Tensors are typically used to represent numerical data for deep-learning algorithms; they’re containers that ...
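As a quick illustration of "any number of dimensions," the following sketch creates a few PyTorch tensors and prints how many dimensions each has. The variable names and the specific shapes (for example, treating a 4-D tensor as a batch of small RGB images) are assumptions chosen for this example.

```python
import torch

scalar = torch.tensor(3.14)              # 0 dimensions: a single number
vector = torch.tensor([1.0, 2.0, 3.0])   # 1 dimension: a list of numbers
matrix = torch.zeros(2, 3)               # 2 dimensions: rows and columns
# 4 dimensions: hypothetically, a batch of 4 RGB images, each 8x8 pixels
batch = torch.zeros(4, 3, 8, 8)

for name, tensor in [("scalar", scalar), ("vector", vector),
                     ("matrix", matrix), ("batch", batch)]:
    # ndim is the number of dimensions; shape is the size along each one
    print(f"{name}: ndim={tensor.ndim}, shape={tuple(tensor.shape)}")
```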