Book description
Your training data has as much to do with the success of your data project as the algorithms themselves; most failures in deep learning systems relate to training data. But while training data is the foundation for successful machine learning, there are few comprehensive resources to help you ace the process. This hands-on guide explains how to work with and scale training data. You'll gain a solid understanding of the concepts, tools, and processes needed to:
- Design, deploy, and ship training data for production-grade deep learning applications
- Integrate with a growing ecosystem of tools
- Recognize and correct new training data-based failure modes
- Improve existing system performance and avoid development risks
- Confidently use automation and acceleration approaches to more effectively create training data
- Avoid data loss by structuring metadata around created datasets
- Clearly explain training data concepts to subject matter experts and other stakeholders
- Successfully maintain, operate, and improve your system
Table of contents
- 1. Training Data Introduction
- 2. Getting Up and Running
- 3. Schema
- 4. Data Engineering
- 5. Workflow
- 6. Tools
- Introduction
- Why Training Data Tools
- What do Training Data Tools Do?
- Best practices and levels of competency
- Human Computer Supervision
- Tools Bring Clarity
- Understanding the Importance of Tooling
- Realizing the Need for Dedicated Tooling
- More Usage, More Demands
- Advent of New Standards
- Journey to the Suite
- Open Source Standards
- A paradigm to deliver machine learning software
- Scale
- Scope
- Tooling quickstart
- Training Data Tooling Hidden Assumptions
- Security
- Open Source and Closed Source
- Deployment
- Costs
- Annotation Interfaces
- Integrations
- Ease of Use
- Installation and organization
- Configuration Choices
- Bias in training data
- Metadata
- 7. AI Transformation
- AI Transformation Introduction
- Getting Started
- The Creative Revolution of Data Centric AI
- Appoint a Leader: a Director of Training Data
- Go From a Work Pool to Standard Expectation for All
- Sometimes Proposals and Corrections, Sometimes Replacement
- Upstream Producers and Downstream Consumers
- Reading this Chart
- Spectrum of Training Data Team Engagement
- Dedicated Producers and Other Teams
- Organizing Producers from Other Teams
- Securing your AI Future
- Use Case Discovery
- Rethink AI Annotation Talent - quality over quantity
- Adopt Modern Training Data Tools
- About the Author
Product information
- Title: Training Data for Machine Learning
- Author(s):
- Release date: November 2023
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781492094524