Chapter 6. Advanced Classification Methods
In Chapter 5, we built the core classification workflow using logistic regression and KNN. Those models are valuable starting points, but they also impose constraints: logistic regression produces a linear decision boundary, and KNN can become unstable and hard to scale as the number of features grows.
This chapter introduces tree-based models as a more flexible extension of that work. Decision trees, random forests, and gradient boosting methods can represent nonlinear relationships, interactions, and threshold effects that arise naturally in soccer data.
That flexibility is especially useful in soccer analytics, where performance often depends on combinations of context: distance and angle together, pressure and location together, possession state and game situation together. Tree-based models can capture those structures without requiring us to write every interaction by hand.
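To make the idea of a combined effect concrete, here is a minimal sketch using only the Python standard library. It is a toy illustration, not a fitted model: the shot grid, the labelling rule, and the thresholds (12 metres, 30 degrees) are all assumptions invented for this example. It shows that a small tree of nested threshold checks can represent a "close and wide angle" interaction exactly, while no rule based on a single threshold on one feature can.

```python
from itertools import product

# Hypothetical toy grid of shots: (distance in metres, angle in degrees).
distances = [4, 8, 16, 24]
angles = [10, 20, 40, 60]
shots = list(product(distances, angles))

# Assumed labelling rule, for illustration only: a shot counts as
# "high quality" when it is both close AND taken from a wide angle.
labels = [int(d < 12 and a > 30) for d, a in shots]


def tree_predict(distance, angle):
    """Hand-written depth-2 tree: split on distance, then on angle."""
    if distance < 12:
        if angle > 30:
            return 1
        return 0
    return 0


def accuracy(preds, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(preds, truth)) / len(truth)


tree_acc = accuracy([tree_predict(d, a) for d, a in shots], labels)

# Candidate rules that use a single threshold on one feature only.
single_rules = [
    [int(d < t) for d, _ in shots] for t in (6, 12, 20)
] + [
    [int(a > t) for _, a in shots] for t in (15, 30, 50)
]
best_single_acc = max(accuracy(p, labels) for p in single_rules)

print("nested-threshold tree accuracy:", tree_acc)
print("best single-threshold accuracy:", best_single_acc)
```

In this sketch the nested rule classifies every shot in the grid correctly, whereas the best single-feature threshold misclassifies a quarter of them. A fitted decision tree would learn splits of this kind from data rather than having them written by hand.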
At the same time, tree-based models preserve an important link to interpretation. ...