Chapter 4. Technical Interview: Model Training and Evaluation
In this chapter, I’ll cover the ML model training process and related interview questions. To many practitioners, model training is the most exciting part, and I agree: it’s very satisfying to see the model become more and more accurate throughout the process. However, before you can begin training models, tuning hyperparameters, and running experiments with various algorithms, you’ll need data. Machine learning at its core is letting algorithms find patterns in data and then making predictions and decisions based on those patterns. Useful data is the foundation of ML, and as the industry adage says, “Garbage in, garbage out.” That is, if an ML model is trained on useless data, then the resulting model and its inferences will also be useless.
I’ll start with an overview of data processing and cleaning, which transform raw data into a format that is useful for (and compatible with) ML algorithms. Next, I’ll go through algorithm selection, including the trade-offs between ML algorithms in different scenarios and how to select the best one for a given problem.
After that, I’ll cover model training and the process of optimizing the model’s performance. This can be an ambiguous and challenging process, so you’ll learn some best practices, such as hyperparameter tuning and experiment tracking, that keep the best results from being lost and ensure that they are reproducible. On that note, ...