3 Python Data Science Libraries

Python provides access to a robust ecosystem of third-party libraries that you’ll find useful for data analysis and manipulation. This chapter introduces you to three of the more popular data science libraries: NumPy, pandas, and scikit-learn. As you’ll see, many data analysis applications use these libraries extensively, either explicitly or implicitly.

NumPy

NumPy, short for Numerical Python, is useful for working with arrays, which are data structures that store values of the same data type. Many other Python libraries that perform numerical computations rely on NumPy.
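As a minimal sketch of what "values of the same data type" means in practice, the following snippet builds two small arrays and inspects their types. The variable names (`ints`, `mixed`, `doubled`) are illustrative only and do not come from the chapter.

```python
import numpy as np

# Build an array from a list of integers; NumPy picks one integer dtype
# for every element (the exact width depends on the platform).
ints = np.array([1, 2, 3])

# Mixing integers and floats still yields a single dtype: the integers
# are converted to floating point so all elements share one type.
mixed = np.array([1, 2.5, 3])

# Arithmetic is applied element by element across the whole array.
doubled = ints * 2

print(ints.dtype, mixed.dtype, doubled)
```

Because every element shares one type, NumPy can apply an operation such as `ints * 2` to the whole array at once, with no explicit Python loop.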

The NumPy array, a grid of elements ...
