Chapter 14. Data Indexing and Selection
In Part II, we looked in detail
at methods and tools to access, set, and modify values in NumPy arrays.
These included indexing (e.g., arr[2, 1]), slicing (e.g.,
arr[:, 1:5]), masking (e.g., arr[arr > 0]), fancy indexing (e.g.,
arr[0, [1, 5]]), and combinations thereof (e.g., arr[:, [1, 5]]).
Here we’ll look at similar means of accessing and modifying
values in Pandas Series and DataFrame objects. If you have used the
NumPy patterns, the corresponding patterns in Pandas will feel very
familiar, though there are a few quirks to be aware of.
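As a quick refresher on the NumPy access patterns listed above, here is a minimal sketch. The array `arr` and its contents are illustrative assumptions chosen to make each result easy to check; they are not taken from Part II.

```python
import numpy as np

# Illustrative 3x6 array holding the integers -6 through 11
arr = np.arange(-6, 12).reshape(3, 6)

element = arr[2, 1]         # indexing: row 2, column 1
block = arr[:, 1:5]         # slicing: all rows, columns 1 through 4
positives = arr[arr > 0]    # masking: only the entries greater than zero
picked = arr[0, [1, 5]]     # fancy indexing: columns 1 and 5 of row 0
columns = arr[:, [1, 5]]    # combination: columns 1 and 5 of every row

print(element, block.shape, positives.size, picked, columns.shape)
```

Each of these patterns has a close counterpart in Pandas, which is why the sections that follow should feel familiar.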
We’ll start with the simple case of the one-dimensional
Series object, and then move on to the more complicated
two-dimensional DataFrame object.
Data Selection in Series
As you saw in the previous chapter, a Series object acts in many ways
like a one-dimensional NumPy array, and in many ways like a standard
Python dictionary. Keeping these two overlapping analogies in mind
will help you understand the patterns of data indexing and selection
in Series objects.
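The two analogies can be seen side by side in a small sketch. The Series `s` and its labels are hypothetical, chosen only for illustration; `iloc`, `to_dict`, and `to_numpy` are standard Pandas accessors.

```python
import pandas as pd

# Hypothetical Series with string labels
s = pd.Series([10, 20, 30], index=['x', 'y', 'z'])

by_label = s['y']          # dictionary-like: look up a value by its key
by_position = s.iloc[1]    # array-like: look up a value by integer position

as_dict = s.to_dict()      # the same data viewed as a plain Python dict
as_array = s.to_numpy()    # the same data viewed as a NumPy array

print(by_label, by_position, as_dict, as_array)
```

Both lookups return the same element here, because the label 'y' sits at position 1.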
Series as Dictionary
Like a dictionary, the Series object provides a mapping from a
collection of keys to a collection of values:
In[1]: import pandas as pd
       data = pd.Series([0.25, 0.5, 0.75, 1.0],
                        index=['a', 'b', 'c', 'd'])
       data
Out[1]: a    0.25
        b    0.50
        c    0.75
        d    1.00
        dtype: float64

In[2]: data['b']
Out[2]: 0.5
We can also use dictionary-like Python expressions and methods to examine the keys/indices and values:
In[3]: 'a' in data
Out[