Overview
Join host George Anadiotis and guest Purvanshi Mehta, cofounder of Lica World, for a discussion about multimodal AI and its applications. Trained on various types of data from text to images to audio and video, multimodal AI models are expanding the possibilities for the kinds of AI applications we can build.
New large AI models such as GPT-4, Gemini, and Claude 3 are all general-purpose multimodal foundation models. More specialized multimodal AI models, such as OpenAI’s yet-to-be-released Sora, which generates video from text, or Suno AI, which generates songs from text, are fueling the imagination with ways we might leverage AI to automate and augment tasks in robotics, entertainment, healthcare, manufacturing, and other industries.
George and Purvanshi discuss where this technology stands and share their thoughts on where the field is headed.
What you’ll learn and how you can apply it
- Learn about state-of-the-art multimodal AI and technologies that you can leverage today
- Understand the specific techniques and skills needed to build multimodal AI systems
- Explore what’s in store for multimodal AI and how to keep up with the latest developments
This live course is for you because…
- You want to stay up-to-date on the latest developments and breakthroughs in the field of AI.
- You’re an AI practitioner who wants to expand your skills beyond one particular field of application.
Recommended follow-up:
- Read “Multimodal Foundation Models” (chapter 10 in Generative AI on AWS)
- Listen to “Now you see me—multimodality” (episode 9 of AI Unveiled)
- Watch Multilingual and Multimodal Prompt Engineering (on-demand course)
- Watch Enhancing Lakehouse Infrastructure for Multimodal AI (video)
Please note that slides or supplemental materials are not available for download from this recording. Resources are only provided at the time of the live event.