Live Online Training

AI Superstream Series: AI & ML in Production



Topic: Data
Antje Barth

One of the most consistent challenges for ML engineers is how to move from model to production. Join us for a day of sessions dedicated to making the most of AI in your company. You’ll learn about everything from scaling to deployment and from pipelines to model decay—straight from our experts.

About the AI Superstream Series: This four-part series of half-day online events is packed with insights from some of the brightest minds in AI. You’ll get a deeper understanding of the latest tools and technologies that can help keep your organization competitive, and learn to leverage AI to drive real business results.

What you'll learn and how you can apply it

  • Understand how MLOps can help you evolve from manually building models
  • Learn how to use PyTorch to effectively deploy and scale your AI models
  • Explore design patterns that will help you tackle problems that frequently crop up during the ML process

This Superstream is for you because...

  • You want to learn more about moving machine learning from model to production.
  • You want to better understand MLOps.
  • You’re interested in improving your skills in scaling, model monitoring, and deployment.


Recommended preparation

  • Come with your questions
  • Have a pen and paper handy to capture notes, insights, and inspiration

About your host

  • Antje Barth is a senior developer advocate for AI and machine learning at AWS. She’s the coauthor of Data Science on AWS and frequently speaks at AI and machine learning conferences, online events, and meetups around the world. Antje is also passionate about helping developers leverage big data, container, and Kubernetes platforms in the context of AI and machine learning. She’s cofounder of the Düsseldorf chapter of Women in Big Data.


The timeframes are only estimates and may vary according to how the event is progressing.

EVENT 1: AI IN PRODUCTION - MARCH 17, 9:00AM–1:00PM PT | 12:00PM–4:00PM ET | 5:00PM–9:00PM UTC/GMT

Antje Barth: Introduction (5 minutes) - 9:00am PT | 12:00pm ET | 5:00pm UTC/GMT

  • Antje Barth welcomes you to the AI Superstream.

Antje Barth: Get Your AI/ML Pipelines Ready for Prime Time with MLOps 4.0 (15 minutes) - 9:05am PT | 12:05pm ET | 5:05pm UTC/GMT

  • Productionizing AI and machine learning pipelines typically requires close collaboration between the application, data science, and DevOps teams. In her keynote address, Antje Barth explains how to evolve from manually building models to running pipelines that are automatically triggered on model decay with MLOps 4.0.
  • Antje Barth is a senior developer advocate for AI and machine learning at AWS and coauthor of the O'Reilly book Data Science on AWS. She frequently speaks at AI and machine learning conferences, online events, and meetups around the world, including the O’Reilly AI conferences. Beyond ML/AI, Antje is passionate about helping developers leverage big data, container, and Kubernetes platforms in the context of AI and machine learning. She’s also cofounder of the Düsseldorf chapter of Women in Big Data.
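The decay-triggered retraining idea in the keynote can be illustrated with a minimal, hypothetical sketch (not AWS- or MLOps 4.0-specific): compare live accuracy against the accuracy measured at deployment time, and hand off to an automated pipeline once the drop exceeds a tolerance.

```python
# Hypothetical sketch: fire a retraining pipeline on model decay
# rather than on a fixed schedule. Names and thresholds are illustrative.

def should_retrain(baseline_accuracy: float,
                   live_accuracy: float,
                   tolerance: float = 0.05) -> bool:
    """True when live accuracy has decayed past the tolerance
    relative to the accuracy measured at deployment time."""
    return (baseline_accuracy - live_accuracy) > tolerance

def monitor(baseline_accuracy, live_scores, retrain):
    """Check each incoming batch-level accuracy; call `retrain`
    (e.g. a pipeline trigger) on the first sign of decay."""
    for accuracy in live_scores:
        if should_retrain(baseline_accuracy, accuracy):
            retrain(accuracy)
            return True  # hand off to the automated pipeline
    return False
```

In a real deployment the `retrain` callback would kick off the versioned pipeline itself; here it is just a placeholder function.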

Yochay Ettun and Chris Banyai: MLOps to Build Scalable End-to-End AI Pipelines (sponsored session) (30 minutes) - 9:20am PT | 12:20pm ET | 5:20pm UTC/GMT

  • Before MLOps platforms existed, manually setting up and configuring even a small-scale version of an intended end-to-end AI pipeline was often cumbersome, complex, and time-consuming, with little opportunity for reuse. With manual methods, just getting the as-planned production pipeline running is a challenge that may leave few cycles to proactively study scaling limits, fix late-discovered scaling issues, or experiment with other approaches. Central to the productivity gains of MLOps tooling is the ability to quickly and easily construct and deploy versioned end-to-end AI pipelines composed from individual stages or building blocks, with control over where each stage runs (on-premises or cloud). The example covered in this talk uses the cnvrg.io platform to show how an end-to-end pipeline can be rapidly constructed, deployed, modified, and scaled. Giving the data science team the end-to-end pipeline during the research phase exposes more realistic data flows than the locally cached datasets typically used for initial model development and exploration. Finally, focusing specifically on the training and inference stages, MLOps tools let data science teams deploy and evaluate different scale patterns themselves, such as single-node versus distributed execution, or compare different compute types and instances for TCO, latency, or other desired metrics.
  • Yochay Ettun is an experienced tech leader who was named to the 2020 Forbes 30 Under 30 list for his achievements in AI advancement and for building cnvrg.io. Yochay has been writing code since the age of 7. He served in the Israel Defense Forces intelligence unit for four years and studied computer science at the Hebrew University of Jerusalem (HUJI), where he founded the HUJI Innovation Lab. He previously served as CTO of Webbing Labs and consulted for companies on AI and machine learning. After three years of consulting, Yochay and cofounder Leah Kolben decided to create a tool to help data scientists and companies scale their AI and machine learning with cnvrg.io. The company continues to help data science teams from Fortune 500 companies manage, build, and automate machine learning from research to production.
  • Chris Banyai is a senior AI technical specialist and system engineer. In his role, he identifies, drives, and optimizes end-to-end solutions to move AI/ML from concept to deployed production services for Intel partners. Chris studied computer engineering at the University of Michigan, Ann Arbor, and engineering physics at Hope College. His 25+ years of industry experience span data science, core data center PaaS distributed systems, applications at the edge, and safety-critical embedded systems.
  • Break (5 minutes)
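The composable-pipeline idea in this session can be sketched generically. This is an illustration only, not the cnvrg.io API: stages are reusable building blocks, each tagged with where it should run, and a versioned pipeline chains them so each stage's output feeds the next.

```python
# Generic illustration (not the cnvrg.io API): a versioned end-to-end
# pipeline composed from individual stages, each tagged with a target.

from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class Stage:
    name: str
    run: Callable[[Any], Any]
    target: str = "cloud"          # e.g. "cloud" or "on-prem"

@dataclass
class Pipeline:
    version: str
    stages: list = field(default_factory=list)

    def add(self, stage: Stage) -> "Pipeline":
        self.stages.append(stage)
        return self                 # allow fluent chaining

    def execute(self, data: Any) -> Any:
        for stage in self.stages:   # each stage feeds the next
            data = stage.run(data)
        return data

# Toy example: normalize a list of numbers, then "train" on the mean.
pipe = (Pipeline(version="v1")
        .add(Stage("preprocess", lambda xs: [x / max(xs) for x in xs], "on-prem"))
        .add(Stage("train", lambda xs: sum(xs) / len(xs))))
```

A real platform would add scheduling, artifact versioning, and per-stage compute selection on top of this shape; the sketch only shows the composition pattern.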

Geeta Chauhan: Scaling AI in Production with PyTorch (45 minutes) - 9:55am PT | 12:55pm ET | 5:55pm UTC/GMT

  • Deploying AI models in production and scaling your ML services are still big challenges. Jumpstart the journey of taking your PyTorch models from research to production. Geeta Chauhan shows how to effectively deploy your AI models, shares best practices for common deployment scenarios, and discusses techniques for performance optimization and scaling ML services.
  • Geeta Chauhan leads AI partnership engineering at Facebook AI, drawing on her expertise in building resilient, antifragile, large-scale distributed platforms for startups and Fortune 500s. As a core member of the PyTorch team, she leads TorchServe and many partner collaborations to build a strong PyTorch ecosystem and community. Geeta is a thought leader on topics including ethics in AI, deep learning, blockchain, and the IoT; she was recognized as Women in IT’s 2019 CTO of the Year for Silicon Valley and is an ACM Distinguished Speaker. She’s passionate about promoting the use of AI for good.
  • Break (10 minutes)

Sara Robinson: Design Patterns for MLOps (45 minutes) - 10:50am PT | 1:50pm ET | 6:50pm UTC/GMT

  • Design patterns capture best practices and solutions to recurring problems. Join Sara Robinson, one of the authors of Machine Learning Design Patterns, to explore solutions to common challenges in data preparation, model building, and MLOps. You’ll dive into three MLOps-focused patterns to help engineers tackle problems that frequently crop up during the ML process: workflow pipeline, feature store, and model versioning.
  • Sara Robinson is a developer advocate for Google Cloud, focusing on machine learning. She inspires developers and data scientists to integrate ML into their applications through demos, online content, and events. Previously, she was a developer advocate on the Firebase team. When she’s not writing code, she can be found on a spin bike or eating frosting.
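Of the three patterns in this session, model versioning is the most compact to sketch. The following is an illustrative example (not code from the book): a registry that serves a default model while keeping older versions addressable for comparison and rollback.

```python
# Illustrative sketch of the "model versioning" pattern (hypothetical
# code, not from Machine Learning Design Patterns): requests route to
# a default model version unless a caller pins a specific one.

class ModelRegistry:
    def __init__(self):
        self._versions = {}
        self._default = None

    def register(self, version, model, make_default=False):
        """Add a model under a version tag; optionally promote it."""
        self._versions[version] = model
        if make_default or self._default is None:
            self._default = version

    def predict(self, x, version=None):
        """Route to a pinned version if given, else the default."""
        return self._versions[version or self._default](x)

# Toy models stand in for real predictors.
registry = ModelRegistry()
registry.register("v1", lambda x: x * 2)
registry.register("v2", lambda x: x * 3, make_default=True)
```

Keeping "v1" callable after "v2" becomes the default is what enables side-by-side comparison and instant rollback.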

David Talby: Real-World Lessons from Applying Natural Language Processing to Personalized Healthcare (sponsored session) (30 minutes) - 11:35am PT | 2:35pm ET | 7:35pm UTC/GMT

  • Accelerating progress in personalized healthcare requires learning the causal relationships between diseases, genes, treatments, medications, labs, and other clinical information, at scale, over a large population and a long time range. More than half of the clinically relevant data for applications like recommending a course of treatment for a patient, finding actionable genomic biomarkers, matching patients to clinical trials or research results, or curating real-world data is found only in free text. This session describes some of the first real-world projects that power such applications, extracting clinical facts using Spark NLP for Healthcare (the most widely used, most accurate, and most scalable healthcare NLP library today), along with lessons learned and promising future directions.
  • David Talby is chief technology officer at John Snow Labs, helping healthcare and life science companies put AI to good use. David is the creator of Spark NLP, the world’s most widely used natural language processing library in the enterprise. He has extensive experience building and running web-scale software platforms and teams in startups, for Microsoft’s Bing in the US and Europe, and scaling Amazon’s financial systems in Seattle and the UK. David holds a PhD in computer science and master’s degrees in both computer science and business administration.
  • Break (5 minutes)

Brian Amadio: Multiarmed Bandits and the Stitch Fix Experimentation Platform (45 minutes) - 12:10pm PT | 3:10pm ET | 8:10pm UTC/GMT

  • Multiarmed bandits are becoming a popular alternative to traditional A/B testing for online experimentation. Join Brian Amadio as he shares his solution for building a scalable platform for contextual multiarmed bandits at Stitch Fix. The platform allows data scientists to integrate sophisticated reward models and provides efficient, deterministic implementations of both epsilon-greedy and Thompson sampling strategies.
  • Brian Amadio is a data platform engineer at Stitch Fix, building a robust and scalable experimentation platform to derisk innovation and accelerate growth across the entire company. Recently he’s been focused on enabling new kinds of experimentation as well as automated meta-analyses to help understand the impact of past experiments and plan future ones. He's also worked as a data scientist, delivering high-value projects and solving challenging data problems across a range of complex domains. He has a PhD in experimental particle physics from UC Berkeley, where he analyzed huge datasets from the Large Hadron Collider, looking for evidence of supersymmetry and microscopic black holes.
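The two allocation strategies named in this session can be sketched for the simplest (non-contextual, Bernoulli-reward) case. This is a hypothetical illustration, not Stitch Fix's implementation: epsilon-greedy occasionally explores a random arm, while Thompson sampling draws each arm's success rate from its Beta posterior.

```python
# Hypothetical sketch (not Stitch Fix's platform): epsilon-greedy and
# Thompson sampling arm selection for Bernoulli rewards.

import random

def epsilon_greedy(successes, trials, epsilon=0.1, rng=random):
    """Explore a uniformly random arm with probability epsilon,
    otherwise exploit the arm with the best observed success rate."""
    if rng.random() < epsilon:
        return rng.randrange(len(trials))
    rates = [s / t if t else 0.0 for s, t in zip(successes, trials)]
    return max(range(len(rates)), key=rates.__getitem__)

def thompson_sampling(successes, trials, rng=random):
    """Draw each arm's success rate from its Beta(s+1, f+1) posterior
    and play the arm with the highest draw."""
    draws = [rng.betavariate(s + 1, (t - s) + 1)
             for s, t in zip(successes, trials)]
    return max(range(len(draws)), key=draws.__getitem__)
```

A contextual version, as discussed in the talk, would condition the reward estimates on per-request features rather than keeping one success count per arm.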

Antje Barth: Closing Remarks (5 minutes) - 12:55pm PT | 3:55pm ET | 8:55pm UTC/GMT

  • Antje Barth closes out today’s event.

Upcoming AI Superstream events:

  • Responsible AI - June 16, 2021
  • Scaling AI - September 22, 2021
  • Securing AI - December 1, 2021