November 2024
Intermediate to advanced
202 pages
3h 32m
English
Text
Audio
Images
Video
Each of those formats is a mode. Multimodal AI is the process of using multiple AI models together to generate (or to understand) content where the input is one mode and the output is a different mode.
Take, for example, OpenAI’s Whisper model. If you provide it with audio, it can create a transcription ...
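
As a rough illustration of the audio-to-text case, here is a minimal sketch that assumes the open-source `openai-whisper` Python package (with ffmpeg installed). The `Mode` enum, the `is_cross_modal` helper, the `transcribe` wrapper, and the file name `meeting.mp3` are hypothetical names introduced for this example; they do not come from the book.

```python
from enum import Enum


class Mode(Enum):
    """The four content formats listed above (illustrative names)."""
    TEXT = "text"
    AUDIO = "audio"
    IMAGES = "images"
    VIDEO = "video"


def is_cross_modal(input_mode: Mode, output_mode: Mode) -> bool:
    """Return True when the output mode differs from the input mode."""
    return input_mode is not output_mode


def transcribe(audio_path: str, model_name: str = "base") -> str:
    """Audio in, text out, using the openai-whisper package.

    Assumption: `pip install openai-whisper` has been run and ffmpeg is
    available on the system. The import is done lazily so the rest of
    this sketch works without the package installed.
    """
    import whisper

    model = whisper.load_model(model_name)   # downloads weights on first use
    result = model.transcribe(audio_path)    # returns a dict with a "text" key
    return result["text"]


# Example usage (hypothetical file name):
# print(transcribe("meeting.mp3"))
```

Here the input mode is audio and the output mode is text, so `is_cross_modal(Mode.AUDIO, Mode.TEXT)` is true, which is the sense of "different mode" used in the definition above.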