2
Labeling Data for Classification
In this chapter, we are going to learn how to label tabular data by applying business rules programmatically with Python libraries. In real-world use cases, not all of our data will have labels, but we need labeled data to train machine learning models and fine-tune foundation models. Manually labeling large sets of data or documents is cumbersome and expensive, because individual labels are created one by one. In addition, sharing private data with a crowdsourcing team outside the organization can be insecure.
Programmatic labeling is therefore needed to automate the process and label large-scale datasets quickly. In the case of programmatic labeling, ...
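As a rough illustration of what labeling tabular data with a business rule can look like, here is a minimal sketch using pandas. The column names, the sample values, and the rule itself are hypothetical and chosen only for demonstration; they are not taken from the chapter.

```python
import pandas as pd

# Hypothetical unlabeled tabular data; columns and values are illustrative only.
df = pd.DataFrame({
    "age": [25, 47, 52, 33],
    "hours_per_week": [40, 60, 45, 20],
})


def label_row(row):
    """Hypothetical business rule: label 1 if age > 40 and hours_per_week >= 45, else 0."""
    if row["age"] > 40 and row["hours_per_week"] >= 45:
        return 1
    return 0


# Apply the rule to every row to create a label column in one pass.
df["label"] = df.apply(label_row, axis=1)
labels = df["label"].tolist()
print(df)
```

The rule is written once and applied to every row, so the same code labels four rows or four million, and the data never has to leave the organization.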