Chapter 15. Operating on Data in Pandas
One of the strengths of NumPy is that it allows us to perform quick element-wise operations, both with basic arithmetic (addition, subtraction, multiplication, etc.) and with more complicated operations (trigonometric functions, exponential and logarithmic functions, etc.). Pandas inherits much of this functionality from NumPy, and the ufuncs introduced in Chapter 6 are key to this.
Pandas includes a couple of useful twists, however: for unary operations like negation and trigonometric functions, these ufuncs will preserve index and column labels in the output, and for binary operations such as addition and multiplication, Pandas will automatically align indices when passing the objects to the ufunc. This means that keeping the context of data and combining data from different sources (both potentially error-prone tasks with raw NumPy arrays) become essentially foolproof with Pandas. We will additionally see that there are well-defined operations between one-dimensional Series structures and two-dimensional DataFrame structures.
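As a rough preview of these behaviors, the following minimal sketch may help. The variable names, labels, and values here are invented purely for illustration and are not the examples used in the rest of the chapter.

```python
import numpy as np
import pandas as pd

# Hypothetical data: labels and values are illustrative only.
values = pd.Series([0.0, 1.0, 2.0], index=["a", "b", "c"])

# Unary ufunc: the result is a Series carrying the same index labels.
exp_values = np.exp(values)
print(exp_values.index.tolist())

# Binary operation: entries are matched by label, not by position.
# Labels present in only one operand produce NaN in the result.
left = pd.Series([1, 2, 3], index=["a", "b", "c"])
right = pd.Series([10, 20, 30], index=["b", "c", "d"])
total = left + right
print(total)

# Series with DataFrame: the Series index is aligned with the
# DataFrame columns, and the operation is applied to every row.
frame = pd.DataFrame({"x": [1, 2], "y": [3, 4]})
diff = frame - frame.iloc[0]
print(diff)
```

In this sketch, only the labels `b` and `c` appear in both `left` and `right`, so those are the only entries of `total` with numeric values. With raw NumPy arrays the same addition would pair elements by position, silently combining unrelated entries.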
Ufuncs: Index Preservation
Because Pandas is designed to work with NumPy, any NumPy ufunc will work on Pandas Series and DataFrame objects. Let’s start by defining a simple Series and DataFrame on which to demonstrate this:
In [1]: import pandas as pd
        import numpy as np
In [2]: rng = np.random.default_rng(42)
        ser = pd.Series(rng.integers(0, 10, 4))
        ser
Out[2]: 0    0
        1    7
        2    6
        3    4
        dtype: int64
In [3]: df = pd.DataFrame ...