AI Engineering Bootcamp

Published by O'Reilly Media, Inc.

Content level: Beginner to intermediate

From theory to production with evaluations, prompting, RAG, and agents

What you’ll learn and how you can apply it

  • Explain the shift from classical machine learning (ML) to AI engineering and describe the three levers for improving foundation model (FM) applications
  • Tune generation settings and prompt structures to control FM behavior
  • Design and implement systematic evaluation pipelines that combine AI-as-a-judge metrics with human review
  • Build prompts and guardrails to mitigate prompt injection and other attacks
  • Construct context through retrieval-augmented generation (RAG) and tool-based agents, understanding planning and failure modes
  • Decide when and how to fine-tune a model versus relying on prompts and retrieval
  • Source, annotate, synthesize, and process data to support adaptation and evaluation
  • Optimize inference latency and cost at the model, hardware, and service layers

Course description

The AI engineering field is moving beyond model-centric ML to a product-centric approach that brings foundation models into production. In this two-day, hands-on bootcamp, AI expert Ammar Mohanna distills the essentials of Chip Huyen’s book AI Engineering: Building Applications with Foundation Models. The course follows Chip’s “production first” philosophy: start simple, optimize instructions and context before touching the model, and always tie model behavior to user experience and product value.

Ammar leads you through basic to advanced techniques, with hands-on exercises at every step. You’ll learn the fundamentals of foundation models and their role in modern AI applications; acquire a set of best practices for prompting, RAG, and agents to make systems more reliable; and understand how to integrate evaluation and feedback loops to ensure continuous improvement and trustworthy performance. In a final exercise, you’ll put it all together to prototype an end-to-end foundation model-powered application.

This live event is for you because...

  • You’re an AI/ML engineer, data scientist, or technical product manager who’s eager to bring generative AI systems into real-world production.
  • You want to become an AI engineer who can design, evaluate, and optimize high-quality generative AI products.
  • You work with Python on a daily basis, apply core machine learning concepts, and rely on experimentation to solve problems.

Prerequisites

  • A computer with Python 3 installed, reliable internet access, and the ability to run Jupyter notebooks
  • Python programming experience
  • Familiarity with basic machine learning concepts
  • Prior exposure to large language models (helpful but not required)

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Day 1: AI Engineering Fundamentals

Introduction to AI engineering (60 minutes)

  • Presentation: Overview of AI engineering and foundation model applications; differences from classical ML; three levers (instructions, context, model)
  • Hands-on exercises: Compare an FM-powered app and a classical ML pipeline; identify levers to improve product quality
  • Q&A
  • Break

Foundation models (60 minutes)

  • Presentation: The anatomy of foundation models (data, architecture/scale, alignment); decoding strategies and generation settings; how sampling settings influence hallucinations
  • Hands-on exercise: Experiment with temperature, top-p, and max tokens on a small language model to see how sampling affects output quality
  • Q&A
  • Break
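
As a rough illustration of the generation settings named above, the sketch below applies temperature scaling and top-p (nucleus) filtering to a toy list of logits. It is a minimal, standard-library example and not the course's exercise code; the function names and the three-token vocabulary are assumptions made for illustration.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    # Renormalize so the kept probabilities sum to 1.
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, temperature=1.0, top_p=1.0, rng=None):
    """Sample one token index (hypothetical helper, not a library API)."""
    rng = rng or random.Random()
    filtered = top_p_filter(softmax_with_temperature(logits, temperature), top_p)
    tokens = list(filtered)
    weights = [filtered[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]
```

With a very small top-p, only the most likely token survives the filter, so sampling becomes effectively greedy. A max-tokens setting would simply cap how many times `sample_token` is called in a generation loop.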

Evaluation methods (60 minutes)

  • Presentation: Metrics for generative models; when to use AI as a judge; the importance of human evaluation and daily manual reviews; annotation guidelines
  • Hands-on exercises: Build an evaluation rubric for a sample text generation task; use an automatic evaluator and compare with human-annotated scores
  • Q&A
  • Break

Evaluation pipeline and observability (60 minutes)

  • Presentation: Designing evaluation pipelines; instrumentation and logging; error categorization; continuous evaluation; observability dashboards
  • Hands-on exercise: Set up a simple evaluation loop for a prototype foundation model app, log outputs and latencies, and inspect results
  • Q&A
  • Break
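
A simple evaluation loop of the kind described above might look like the following sketch. It assumes a hypothetical `fake_model` function standing in for a real foundation model call and uses exact match as the scorer; a real pipeline would likely swap in a model client, richer metrics, and persistent logging.

```python
import time


def fake_model(prompt):
    """Hypothetical stand-in for a foundation model call."""
    return prompt.strip().upper()


def exact_match(output, expected):
    """Score 1.0 when the output equals the reference, else 0.0."""
    return 1.0 if output == expected else 0.0


def run_eval(model, cases, scorer):
    """Run every case through the model, recording output, score, and latency."""
    records = []
    for case in cases:
        start = time.perf_counter()
        output = model(case["input"])
        latency = time.perf_counter() - start
        records.append({
            "input": case["input"],
            "output": output,
            "score": scorer(output, case["expected"]),
            "latency_s": latency,
        })
    return records


def summarize(records):
    """Aggregate the per-case records into a small report."""
    scores = [r["score"] for r in records]
    return {
        "n": len(records),
        "mean_score": sum(scores) / len(scores) if scores else 0.0,
        "max_latency_s": max((r["latency_s"] for r in records), default=0.0),
        "failures": [r for r in records if r["score"] < 1.0],
    }
```

Keeping the failing records in the summary makes it easy to inspect individual errors by hand, which is the starting point for error categorization.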

Prompt engineering (50 minutes)

  • Presentation: Principles of prompt design; few-shot prompting; instructions versus examples; prompt risks (injection, spoofing); guardrails and sanitization
  • Hands-on exercises: Craft prompts for classification and question answering; implement prompt attack scenarios; add input and output guardrails
  • Q&A
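
To make the idea of input and output guardrails concrete, here is a minimal sketch. The regular expressions, the `<user_input>` delimiter, and the function names are all illustrative assumptions; keyword heuristics like these are easy to bypass and would only be one layer in a real defense.

```python
import re

# Illustrative patterns only; real injection attempts are far more varied.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.IGNORECASE),
    re.compile(r"reveal (the |your )?system prompt", re.IGNORECASE),
]
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def check_input(user_text):
    """Input guardrail: return False when the text matches a known attack pattern."""
    return not any(p.search(user_text) for p in INJECTION_PATTERNS)


def build_prompt(system_instructions, user_text):
    """Wrap user text in delimiters so instructions and data stay visibly separate."""
    return f"{system_instructions}\n\n<user_input>\n{user_text}\n</user_input>"


def redact_output(model_text):
    """Output guardrail: mask email addresses before showing text to the user."""
    return EMAIL_PATTERN.sub("[REDACTED EMAIL]", model_text)
```

A request would pass through `check_input` first, then `build_prompt`, and the model's response would pass through `redact_output` before being returned.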

Wrap-up and Q&A (10 minutes)

Day 2: Advanced Techniques and Full Stack Development

Context construction—RAG and agents (60 minutes)

  • Presentation: Why context matters; retrieval-augmented generation (RAG); query rewriting; agent architectures; planning and failure modes; routing and caching
  • Hands-on exercises: Build a simple RAG pipeline using a small document collection; test a tool-using agent on a multistep task and diagnose failures
  • Q&A
  • Break
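
The core of a simple RAG pipeline can be sketched with nothing but the standard library: score documents against the query, keep the top matches, and place them in the prompt. The example below uses bag-of-words cosine similarity as a stand-in for an embedding model, which is an assumption made to keep it self-contained; the function names and prompt wording are also illustrative.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents with nonzero similarity to the query, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_rag_prompt(query, documents, k=2):
    """Assemble a prompt that grounds the question in the retrieved context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
```

The string returned by `build_rag_prompt` would then be sent to the model. Swapping `cosine` over token counts for similarity over embeddings changes the retrieval quality but not the overall shape of the pipeline.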

Fine-tuning (60 minutes)

  • Presentation: When fine-tuning helps; instruction versus reinforcement learning from human feedback (RLHF) versus open instruction tuning; parameter-efficient techniques; safety and alignment considerations
  • Hands-on exercises: Plan a fine-tuning experiment on a small dataset; run a lightweight parameter-efficient fine-tuning job and compare the results with baseline prompts
  • Q&A
  • Break

Data engineering (60 minutes)

  • Presentation: Data acquisition and licensing; annotation and synthesis; data processing pipelines; creating evaluation and training splits; quality control
  • Hands-on exercises: Collect and annotate a small dataset for a domain-specific task; practice using synthetic data generation and validation
  • Q&A
  • Break

Inference optimization (60 minutes)

  • Presentation: Latency and cost trade-offs; model optimization (quantization, distillation); hardware choices (GPU versus CPU); service-level strategies (batching, caching, routing)
  • Hands-on exercises: Apply quantization to a small model; benchmark inference latency; implement request batching and caching to reduce cost
  • Q&A
  • Break
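
The service-level strategies mentioned above (batching and caching) can be illustrated with a small wrapper around a batched model call. This is a sketch under stated assumptions: `CachedBatchClient` and `fake_model_batch` are hypothetical names, the cache is an in-memory dictionary keyed on the exact prompt string, and the model is a placeholder that reverses its input.

```python
def fake_model_batch(prompts):
    """Hypothetical batched model call: one invocation handles many prompts."""
    return [p[::-1] for p in prompts]


def batched(items, batch_size):
    """Yield consecutive chunks of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


class CachedBatchClient:
    """Deduplicate and cache prompts, sending only cache misses to the model in batches."""

    def __init__(self, model_batch, batch_size=4):
        self.model_batch = model_batch
        self.batch_size = batch_size
        self.cache = {}
        self.model_calls = 0  # counts batched invocations, a rough proxy for cost

    def generate(self, prompts):
        # Collect unique prompts that are not cached yet, preserving order.
        missing = list(dict.fromkeys(p for p in prompts if p not in self.cache))
        for chunk in batched(missing, self.batch_size):
            self.model_calls += 1
            for prompt, output in zip(chunk, self.model_batch(chunk)):
                self.cache[prompt] = output
        return [self.cache[p] for p in prompts]
```

Exact-match caching only helps when identical prompts recur; it also assumes that returning the same output for the same prompt is acceptable, which may not hold when sampling with a nonzero temperature.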

End-to-end build and feedback (50 minutes)

  • Presentation: Putting it all together—building a full foundation model product; designing user interfaces; capturing feedback; implementing continuous improvement loops; ethical and product considerations
  • Hands-on exercise: Prototype a small end-to-end foundation model-powered application (e.g., question answering with RAG); instrument it for feedback capture and plan next steps

Wrap-up and Q&A (10 minutes)

Your Instructor

  • Ammar Mohanna

    Ammar Mohanna is a seasoned AI expert, educator, and entrepreneur with extensive experience spanning academia, industry consulting, and technology innovation. He teaches advanced courses in AI and machine learning at the American University of Lebanon, helping shape the next generation of AI professionals. As a consultant, Ammar leads initiatives focused on integrating AI and generative AI into educational technologies, collaborating closely with cloud technologies. Previously, Ammar cofounded and was AI lead at Assentify, a company dedicated to providing specialized AI solutions, training, and consultancy. His professional expertise includes machine learning, MLOps, explainable AI (XAI), Kubernetes, and microservices architecture. He holds a PhD in edge artificial intelligence from the University of Genoa, Italy. Ammar lives in Beirut, Lebanon, and is fluent in Arabic, English, and French, with intermediate proficiency in Italian.

Skill covered

Prompt Engineering