16 Pandas

Previously we looked at concepts and packages from the standard Python library. In this chapter we turn to a third-party package, and one that is very relevant within the Python ecosystem. Pandas is a package used for data analysis and data manipulation. It is used by a variety of other packages, so understanding it and its concepts is crucial for a Python programmer. In this chapter, we will introduce pandas from the basics up to some more advanced techniques. However, before we get started with pandas, we will briefly cover numpy arrays, which, alongside dictionaries and lists, need to be understood before we can cover pandas.

16.1 Numpy Arrays

Numpy comes as part of the Anaconda distribution and is a key component of the scientific libraries within Python. It is very fast and underpins many other Python packages. Here we concentrate on one specific aspect of it: numpy arrays. However, if you are interested in any of the machine learning libraries within Python, then numpy is certainly worth exploring further.

We can import it as follows.

    import numpy as np

Why np? It's the standard convention used in the documentation. You do not have to follow that convention, but we will. In this chapter, we won't cover everything to do with numpy but instead only introduce a few concepts and the first one ...
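As a minimal sketch of the import convention described above, the following assumes numpy is installed (for example through Anaconda). The names values, doubled and numpy_alias are illustrative only and do not come from the book; the array shown is just one simple way to check that the import works.

```python
import numpy as np  # "np" is only an alias, but it is the conventional one

# Build a one-dimensional array from an ordinary Python list.
values = np.array([1, 2, 3, 4])

# Arithmetic is applied element by element, with no explicit loop.
doubled = values * 2

print(type(values).__name__)  # ndarray
print(doubled.tolist())       # [2, 4, 6, 8]

# Any other alias refers to the same module; "numpy_alias" is a hypothetical name.
import numpy as numpy_alias
print(numpy_alias is np)      # True
```

Sticking with np keeps code consistent with the numpy documentation and with most examples you are likely to come across.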
