Python for Excel (Chinese Edition), 2nd Edition

by Felix Zumstein
May 2026
Intermediate
418 pages
5h 50m
Chinese
O'Reilly Media, Inc.
Content preview from Python for Excel (Chinese Edition), 2nd Edition

Chapter 5. Data Analysis with pandas

This work has been translated using artificial intelligence. Your feedback and comments are welcome: translation-feedback@oreilly.com

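This chapter introduces pandas, the Python data analysis library or, as I like to put it, the Python-based spreadsheet with superpowers. pandas makes tasks that are particularly painful in Excel easier, faster, and less error-prone. Some of these tasks include getting datasets from external sources and working with statistics, time series, and interactive charts. If you are familiar with Power Query in Excel, pandas covers similar functionality with more flexibility. The most important superpowers of pandas are vectorization and data alignment. As we saw in the previous chapter with NumPy arrays, vectorization lets you write concise, array-based code, while data alignment makes sure that values are not mismatched when you work with multiple datasets.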
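As a minimal sketch of these two ideas, the following snippet uses two small Series whose names and values are made up for illustration (they are not taken from the book's datasets). It assumes only that pandas is installed.

```python
import pandas as pd

# Hypothetical quiz scores, labeled by student name (illustrative data only)
quiz1 = pd.Series([4.5, 6.7, 3.9], index=["Mark", "John", "Tim"])
quiz2 = pd.Series([1.0, 2.0, 3.0], index=["Tim", "Mark", "Jenny"])

# Vectorization: one expression applies to every element, no explicit loop
doubled = quiz1 * 2

# Data alignment: values are matched by index label, not by position,
# so Mark's 4.5 is added to Mark's 2.0 even though the orders differ
total = quiz1 + quiz2
print(total)
```

Labels that appear in only one of the two Series (John and Jenny here) end up as NaN in the result rather than being silently paired with the wrong value.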

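This chapter covers the whole data analysis journey: it starts with cleaning and preparing data, then shows how to make sense of bigger datasets via aggregation, descriptive statistics, and visualization. At the end of the chapter, we'll see how to import and export data with pandas. But first things first: let's begin with the main data structures of pandas, DataFrame and Series!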

DataFrame and Series

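DataFrame and Series are the core data structures in pandas. In this section, I introduce them with a focus on the main components of a DataFrame: index, columns, and data. A DataFrame is similar to a two-dimensional NumPy array, but it comes with column and row labels, and each column can hold a different data type. By extracting a single column or row from a DataFrame, you get a one-dimensional Series. Again, a Series is similar to a one-dimensional NumPy array with labels. When you look at the structure of a DataFrame in Figure 5-1, it isn't hard to see that DataFrames are going to be your Python-based spreadsheets.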

Diagram comparing a pandas Series and DataFrame, highlighting their structure with indices, columns, and data orientation along axis 0 and axis 1.
Figure 5-1. pandas Series and DataFrame
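The components named above can be inspected directly. The following is a minimal sketch that builds a DataFrame by hand from the four student rows printed later in this section; constructing it from a dictionary is a choice made here for illustration and is not necessarily how the book proceeds.

```python
import pandas as pd

# Rows copied from the students table shown in this section
df = pd.DataFrame(
    data={
        "user_id": [1001, 1000, 1002, 1003],
        "name": ["Mark", "John", "Tim", "Jenny"],
        "born": [1966, 1988, 1980, 2009],
        "country": ["Italy", "USA", "USA", "Germany"],
        "score": [4.5, 6.7, 3.9, 9.0],
        "continent": ["Europe", "America", "America", "Europe"],
    }
)

# The three main components of a DataFrame
row_labels = df.index       # default integer index: 0, 1, 2, 3
col_labels = df.columns     # column labels
values = df.to_numpy()      # the underlying data as a NumPy array

# Extracting a single column or a single row yields a Series
name_column = df["name"]    # labeled by the row index
first_row = df.loc[0]       # labeled by the column names

# Each column keeps its own data type
print(df.dtypes)
```

Note that `first_row` is indexed by the column names, so `first_row["name"]` returns the name stored in the first row.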

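To show you how easy it is to go from a spreadsheet to a DataFrame, consider the Excel table in Figure 5-2, which lists the scores of students in an online course. You'll find the corresponding file students.xlsx in the xl folder of the companion repository.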

Excel table showing user IDs, names, birth years, countries, scores, and continents for online course students.
Figure 5-2. students.xlsx

To bring this Excel table into Python, start by importing pandas, then use its read_excel function, which returns a DataFrame:

In [1]: import pandas as pd
In [2]: pd.read_excel("xl/students.xlsx", engine="calamine")
Out[2]:    user_id   name  born  country  score continent
        0     1001   Mark  1966    Italy    4.5    Europe
        1     1000   John  1988      USA    6.7   America
        2     1002    Tim  1980      USA    3.9   America
        3     1003  Jenny  2009  Germany    9.0    Europe
        ...

Publisher Resources

ISBN: 0642572396008