Practical Weak Supervision

by Wee Hyong Tok, Amit Bahree, Senja Filipi

Released October 2021

Publisher(s): O'Reilly Media, Inc.

ISBN: 9781492077060

Book description

Most data scientists and engineers today rely on quality labeled data to train machine learning models. But building a training set manually is time-consuming and expensive, leaving many companies with unfinished ML projects. There's a more practical approach. In this book, Wee Hyong Tok, Amit Bahree, and Senja Filipi show you how to create products using weakly supervised learning models.

You'll learn how to build natural language processing and computer vision projects using datasets weakly labeled with Snorkel, a spin-off from the Stanford AI Lab. Because so many companies have pursued ML projects that never go beyond their labs, this book also provides a guide on how to ship the deep learning models you build.

Get up to speed on the field of weak supervision, including ways to use it as part of the data science process
Use Snorkel AI for weak supervision and data programming
Get code examples for using Snorkel to label text and image datasets
Use a weakly labeled dataset for text and image classification
Learn practical considerations for using Snorkel with large datasets and for scaling labeling with Spark clusters