Python入门指南, 3rd Edition

by Bill Lubanovic
September 2025
Intermediate to advanced
660 pages
7h 15m
Chinese
O'Reilly Media, Inc.

Chapter 25. Data Science

This work has been translated using AI. Your feedback and comments are welcome: translation-feedback@oreilly.com

It is a capital mistake to theorize before one has data.

Sherlock Holmes

The world is messy, and so is data. You'll spend a lot of time cleaning, merging, splitting, and pushing data around to get what you need.

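Python has become the most popular computer language, partly because many developers have adopted it to wrangle their nemesis: data. This chapter covers a broad range of data topics, including these:

  • Data processing

  • Format conversion

  • Analysis and statistics

  • Visualization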

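I'll first discuss what you get when you open the standard Python box and shake it. Then I'll dive into the rich set of third-party tools that people have written to tame data.

AI will have to wait until the next chapter. I bet it didn't see that coming.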

Standard Python

First, let's look at some Python features that I haven't mentioned yet.

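I mentioned sort() and sorted() in Chapter 8, but only as applied to lists. The operator module has the handy functions itemgetter() and attrgetter(), which tell sort() and sorted() how to do their sorting.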

The basic itemgetter() function returns one or more items from an object that lets you access items with [], such as a list or tuple:

>>> from operator import itemgetter
>>> l = ['a', 'b', 'c', 'd', 'e']
>>> f = itemgetter(2)
>>> f(l)
'c'

If you give itemgetter() multiple indexes, it returns a tuple:

>>> f = itemgetter(3, 2, 1)
>>> f(l)
('d', 'c', 'b')
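
For intuition, itemgetter() behaves much like a small lambda that indexes its argument. The sketch below compares the two side by side; the names letters, third, third_lambda, picks, and picks_lambda are illustrative and do not come from the text.

```python
from operator import itemgetter

letters = ['a', 'b', 'c', 'd', 'e']

# One index: itemgetter(2) acts like a function that returns obj[2]
third = itemgetter(2)
third_lambda = lambda obj: obj[2]
print(third(letters), third_lambda(letters))

# Several indexes: the result is a tuple, in the order requested
picks = itemgetter(3, 2, 1)
picks_lambda = lambda obj: (obj[3], obj[2], obj[1])
print(picks(letters) == picks_lambda(letters))
```

The itemgetter() form is usually shorter and states the intent directly, which is why it shows up so often as a sort key.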

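A common use is sorting a sequence, such as the list below. The key function looks up index 1 in each sublist, getting the items 'y', 'e', and 'b', and the list of sublists is then sorted by those values: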

>>> l = [ ['x', 'y', 'z'], ['d', 'e', 'f'], ['a', 'b', 'c'] ]
>>> x = sorted(l, key=itemgetter(1))
>>> x
[['a', 'b', 'c'], ['d', 'e', 'f'], ['x', 'y', 'z']]

Try it with a list of strings instead of a list of lists:

>>> l = [ 'xyz', 'def', 'abc' ]
>>> x = sorted(l, key=itemgetter(1))
>>> x
['abc', 'def', 'xyz']
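
The same key also works with the in-place sort() method, and a multi-index itemgetter() can break ties. This is a minimal sketch with a made-up rows list (not from the text) in which two sublists share the same value at index 1.

```python
from operator import itemgetter

rows = [['x', 'y', 'z'], ['d', 'e', 'f'], ['a', 'y', 'c']]

# Sort by index 1, then by index 0 to break the tie between the two 'y' rows.
# itemgetter(1, 0) returns a tuple, and tuples compare element by element.
by_second_then_first = sorted(rows, key=itemgetter(1, 0))
print(by_second_then_first)

# list.sort() accepts the same key argument, but sorts in place and returns None
result = rows.sort(key=itemgetter(1), reverse=True)
print(rows, result)
```

Use sorted() when you want a new list and need the original left alone, and sort() when changing the list itself is fine.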

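This also works with dictionaries, because dictionaries allow access by key instead of by index. Let's sort a list of chemical element dictionaries by the sym (symbol) key: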

>>> from operator import itemgetter
>>> l = [ {'sym': 'C', 'wt': 12},
... {'sym': 'H', 'wt': 1},
... {'sym': 'Be', 'wt': 9}
... ]
>>> f = itemgetter('sym')
>>> x = sorted(l, key=f)
>>> x
[{'sym': 'Be', 'wt': 9}, {'sym': 'C', 'wt': 12}, {'sym': 'H', 'wt': 1}]
>>> f = itemgetter('wt')
>>> x = sorted(l, key=f)
>>> x
[{'sym': 'H', 'wt': 1}, {'sym': 'Be', ...
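
The attrgetter() function mentioned earlier follows the same pattern, but looks up attributes by name instead of items by index or key. The following is a minimal sketch, assuming a hypothetical Element dataclass that is not part of the text.

```python
from dataclasses import dataclass
from operator import attrgetter

# Hypothetical class for illustration: an object with sym and wt attributes
@dataclass
class Element:
    sym: str
    wt: int

elements = [Element('C', 12), Element('H', 1), Element('Be', 9)]

# attrgetter('sym') acts like a function that returns obj.sym
by_sym = sorted(elements, key=attrgetter('sym'))
print([e.sym for e in by_sym])

# Sort the same objects by their wt attribute instead
by_wt = sorted(elements, key=attrgetter('wt'))
print([e.sym for e in by_wt])
```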

Publisher Resources

ISBN: 9798341668898