
AI Superstream: Efficient Machine Learning

Published by O'Reilly Media, Inc.

Content level: Intermediate

Machine learning has grown significantly, and with it the footprint of ML models—which can make training, deploying, and monitoring difficult and expensive. What if you could make your ML models and systems more efficient, whether in the form of cost, compute, storage, latency, or carbon footprint? Join us for this Superstream where experts dive into techniques for using fewer resources and delivering better quality.

About the AI Superstream Series: This three-part series of half-day online events is packed with insights from some of the brightest minds in AI. You’ll get a deeper understanding of the latest tools and technologies that can help keep your organization competitive and learn to leverage AI to drive real business results.

What you’ll learn and how you can apply it

  • Understand hardware and software resources required for deep learning
  • Learn how to optimize ML models and workloads
  • Discover how to build robust and scalable machine learning systems
  • Explore AI efficiencies that combat climate change

This live event is for you because...

  • You’re an ML engineer or data practitioner who wants to use more-efficient algorithms and improve ML model efficiency.
  • You’re a data team leader or CDO who wants to proactively reduce the cost and resource use of ML systems and pipelines.
  • You’re a product stakeholder who wants to learn more about how ML efficiencies align with business goals.

Prerequisites

  • Come with your questions
  • Have a pen and paper handy to capture notes, insights, and inspiration

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Shingai Manjengwa: Introduction (5 minutes) - 8:00am PT | 11:00am ET | 4:00pm UTC/GMT

  • Shingai Manjengwa welcomes you to the AI Superstream.

Jesse Hoey: Keynote—Socially Sustainable Machine Learning (15 minutes) - 8:05am PT | 11:05am ET | 4:05pm UTC/GMT

  • A major challenge for machine learning and artificial intelligence is dealing with surprising and unpredictable events. Climate change will produce such surprises as tipping points are reached and novel situations arise. In this keynote, Jesse Hoey explores efforts to use computational models of emotion to build adaptable machine learning for climate change mitigation strategies focused on human behavior. (An important element of these systems is explainability, including narratives generated around issues of fairness and diversity.)
  • Jesse Hoey is a professor in the Cheriton School of Computer Science at the University of Waterloo, where he leads the Computational Health Informatics Laboratory (CHIL). He’s a faculty affiliate at the Vector Institute and an affiliate scientist at KITE/TRI, both in Toronto. He’s also the founder of BayesACT Inc., a company that builds computational models of emotion for artificial intelligence. Jesse holds a PhD in computer science from the University of British Columbia. His primary research interest is to understand the nature of human emotional intelligence by attempting to build computational models of some of its core functions, and to apply them in domains with social and economic impact. He’s an associate editor for IEEE Transactions on Affective Computing and an area chair for the International Joint Conferences on Artificial Intelligence (IJCAI).

Michael Houston: Energy-Efficient AI and High-Performance Computing Data Centers (30 minutes) - 8:20am PT | 11:20am ET | 4:20pm UTC/GMT

  • Exploding complexity for AI workloads and a surge in adoption across every industry are ramping up energy demand in the modern data center. With AI model compute increasing 300,000x in the last five years, key technological advancements are required to avoid a commensurate increase in energy consumption. Join Michael Houston to see how advances in full stack accelerated computing platforms, from GPUs, interconnects, and networking to software and algorithms, have delivered huge reductions in the energy required for different applications.
  • Michael Houston is a VP and chief architect of AI systems at NVIDIA. His team leads the NVIDIA SuperPOD effort and supports cloud providers and companies as they build and deploy AI systems at scale. Over his career, Mike’s worked on chip design, computer architecture, programming models and languages, autonomous vehicles, AI algorithms, performance optimization, and system design. His PhD work at Stanford University involved GPU computing and parallel programming models.

Meena Arunachalam: Performance-Efficient End-to-End AI Pipelines (Sponsored by Intel) (30 minutes) - 8:50am PT | 11:50am ET | 4:50pm UTC/GMT

  • When building AI pipelines for data ingestion, preprocessing, feature engineering, fine-tuning, and other phases of model inferencing, performance is critical—and there are many things you can do to improve it. Software optimizations with frameworks and libraries such as TensorFlow, PyTorch, and XGBoost can offer better vectorization, parallelization and core scaling, cache reuse, and more. And hardware tuning using approaches such as SigOpt multiobjective knob-tuning can enable you to tune usage models such as real-time or batched inferencing and partition a CPU socket to serve multiple inference streams in parallel and scale efficiently. Together these optimizations can result in a significant performance boost. Join Intel’s Meena Arunachalam to learn how to build several end-to-end pipelines and achieve high efficiency and throughput on CPUs using both software and hardware optimizations, through concrete examples such as document-level sentiment analysis, recommendation systems, and more.
  • Meena Arunachalam is director and principal engineer within the AI and Analytics Group at Intel. She works on software performance optimizations of end-to-end pipelines for vision, video, recommendation, ML, NLP, and other use cases with Intel Xeon CPUs and accelerators. She heads a team that focuses on end-to-end AI performance, hardware-software codesign, and workload performance modeling.
  • This session will be followed by a 30-minute Q&A in a breakout room. Stop by if you have more questions for Meena.
  • Break (10 minutes)

Xin Li: Optimize Deep Learning Workloads by Reducing Compute Resources (30 minutes) - 9:30am PT | 12:30pm ET | 5:30pm UTC/GMT

  • Advances in deep learning (DL) research, frameworks, and hardware have led to a proliferation of larger and more powerful models. Developing these models is computationally intensive, requiring an army of expensive, specialized accelerators such as GPUs and TPUs, leading to staggeringly high carbon footprints and training costs. Join Xin Li to understand what goes on beneath the DL frameworks. You’ll explore the various software and hardware stacks involved in training a DL model, optimization opportunities at each level, and recent techniques and research to better understand and utilize your hardware resources.
  • Xin Li is a graduate student in the EcoSystem research group at the University of Toronto and the Vector Institute, where he works on making training and serving DL models more efficient and accessible. Previously, he was an applied ML specialist on Vector's AI engineering team, where he helped researchers and private sector clients develop DL models and training pipelines on a large GPU cluster. Xin is generally interested in making deep learning systems work well with modern accelerators and software infrastructure.

Maryam Mehri Dehnavi: Build Efficient Frameworks for Machine Learning at Scale (30 minutes) - 10:00am PT | 1:00pm ET | 6:00pm UTC/GMT

  • Machine learning applications increasingly involve large-scale, big data problems. With emerging complex learning methods and ever-evolving parallel and distributed hardware, it’s crucial to build scalable algorithms and systems for machine learning applications. Maryam Mehri Dehnavi shares state-of-the-art research her team used to build robust and scalable machine learning systems.
  • Maryam Mehri Dehnavi is an assistant professor in the Computer Science Department at the University of Toronto and is the Canada Research Chair in parallel and distributed computing. Her research focuses on high-performance computing and domain-specific compiler design for machine learning applications and scientific computing problems. Previously, she was an assistant professor at Rutgers University and a postdoctoral researcher at MIT. She holds a PhD from McGill University. Some of her recognitions include the Ontario Early Researcher Award and the ACM SRC Grand Finals prize.
  • Break (10 minutes)

Moty Fania: Best Practices for Running Efficient AI at Scale (Sponsored by Intel) (30 minutes) - 10:40am PT | 1:40pm ET | 6:40pm UTC/GMT

  • To enable smarter, faster, and more innovative business processes at scale, Intel developed a unique set of reusable capabilities for world-class MLOps that accelerate and automate the development, deployment, and maintenance of machine learning models. This environmentally responsible approach to model productization avoids the typical logistical hurdles that often prevent other companies’ AI projects from reaching production efficiently. Join Intel’s Moty Fania to explore some of the best practices Intel developed to deliver operational excellence in a large-scale AI operation—including continuous delivery of ML models (hundreds of models annually) and systematic measures to minimize the full cost and effort required to keep them sustainable in production. You’ll learn about the methodology, key concepts, and related architectures; how Intel increased efficiency in different business domains; and the role “thinking green” played in deployment scenarios.
  • Moty Fania is a principal engineer, CTO, and head of ML engineering driving operational excellence, scale, and productivity within the AI Group at Intel IT—a large-scale operation that delivers AI and big data solutions across the company. Moty has rich experience in ML engineering, MLOps, advanced analytics, and decision-support solutions. He’s led the architecture work and development of various AI initiatives, including AI platforms, predictive engines, IoT systems, online inference systems, and more.
  • This session will be followed by a 30-minute Q&A in a breakout room. Stop by if you have more questions for Moty.

Alishba Imran: Computationally Efficient Learning Methods for Cognition, Control, and Perception (30 minutes) - 11:10am PT | 2:10pm ET | 7:10pm UTC/GMT

  • Machine intelligence is often embodied in a physical form through robotics—robotic process automation unlocks value for workplaces and many applications. However, one of the key challenges in robot learning is developing systems that generalize to novel tasks while remaining computationally efficient. Alishba Imran shares her research on robot learning using imitation learning, offline and batch RL, and sim2real methods for cognition, control, and perception, leveraging image, video, and natural language data. Join in to explore learning methods that significantly reduce the computation and memory resources required in robotics without degrading performance.
  • Alishba Imran is a machine learning and robotics developer who led ML research at SJSU and the BLINC Lab under Fred Barez, where she developed a closed-loop vision-based system to improve manipulation for amputees and reduce the cost of prosthetics from $10,000 to $700. She also worked on neurosymbolic AI research for Sophia, the world’s most human-like robot, to improve manipulation and study human-computer interaction in the context of autism and depression. Alongside this, she developed RL and imitation learning techniques at Kindred and built ML approaches for soft robotic control using SRT to aid dozens of patients with neuromuscular rehabilitation at the Harvard Biodesign Lab.

Shingai Manjengwa: Closing Remarks (5 minutes) - 11:40am PT | 2:40pm ET | 7:40pm UTC/GMT

  • Shingai Manjengwa closes out today’s event.

Upcoming AI Superstream events:

  • NLP in Production - May 11, 2022
  • MLOps - December 7, 2022

Your Host

  • Shingai Manjengwa

    Shingai Manjengwa is the head of AI education at ChainML, a tech startup that has developed an open source platform for the rapid and responsible development of generative AI tools. ChainML works with clients on AI education, adoption, and implementation from an AI product idea to an affordable and scalable deployment. A data scientist by profession, she led technical education at the Vector Institute for Artificial Intelligence in Toronto, where she translated advanced AI research into educational programming to drive AI adoption and innovation in industry and government. She also founded Fireside Analytics Inc., a data science education company that develops customized programs to teach digital and AI literacy, data science, bias and fairness in machine learning, and computer programming. Shingai’s book, The Computer and the Cancelled Music Lessons, teaches data science to kids ages 5 to 12. She also sits on the Service Advisory Committee for Employment and Social Development Canada and she’s a board member at the Institute on Governance. You can find Shingai on LinkedIn and X (Twitter) as @Tjido.


Skill covered

Artificial Intelligence (AI)

Sponsored by

  • Intel