Preparing tools and datasets

As introduced in the previous chapters, scikit-learn has the lion's share among Python packages for machine learning. In this chapter, we will also use XGBoost, LightGBM, and CatBoost: you'll find the installation instructions in the relevant sections.

There are multiple reasons for using scikit-learn, which was developed at Inria, the French Institute for Research in Computer Science and Automation (inria.fr/en/). At this point, it is worth mentioning the reasons that matter most for the success of your data science project:

  • A consistent API (fit, predict, transform, and partial_fit) across models that naturally helps to correctly implement data science procedures working on data organized in NumPy arrays ...
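As a minimal sketch of what this consistent API looks like in practice, the following example runs a scaler and a classifier through fit, transform, predict, and partial_fit on NumPy arrays. The synthetic data, the choice of StandardScaler and SGDClassifier, and the batch size of 50 are assumptions made for illustration only; they are not taken from the book.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic data (illustrative assumption): 200 samples, 3 features,
# with a binary target derived from the first two features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Transformers learn their parameters with fit() and apply them with transform().
scaler = StandardScaler()
scaler.fit(X)
X_scaled = scaler.transform(X)

# Predictors learn with fit() and produce outputs with predict().
clf = SGDClassifier(random_state=0)
clf.fit(X_scaled, y)
predictions = clf.predict(X_scaled)

# Estimators that support incremental learning expose partial_fit(),
# which is fed one batch at a time. The full set of class labels has to
# be supplied on the first call, so it is passed on every call here.
online_clf = SGDClassifier(random_state=0)
for start in range(0, len(X_scaled), 50):
    batch = slice(start, start + 50)
    online_clf.partial_fit(X_scaled[batch], y[batch], classes=np.array([0, 1]))
online_predictions = online_clf.predict(X_scaled)

print(X_scaled.shape, predictions.shape, online_predictions.shape)
```

Because every estimator follows the same method names, swapping SGDClassifier for another classifier would usually leave the fit and predict calls unchanged, although not every model implements partial_fit.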

