Skip to Content
《高性能 Python》第二版
book

《高性能 Python》第二版

by Micha Gorelick, Ian Ozsvald
May 2025
Intermediate to advanced
468 pages
6h 20m
Chinese
O'Reilly Media, Inc.
Content preview from 《高性能 Python》第二版

第 4 章 词典和集合 字典和集合

本作品已使用人工智能进行翻译。欢迎您提供反馈和意见:translation-feedback@oreilly.com

当数据没有内在顺序(插入顺序除外),但有一个可用于引用的唯一对象(引用对象通常是字符串,但也可以是任何散列类型)时,集合和字典就是理想的数据结构。 这个引用对象被称为 key,而数据则是 value。字典和集合几乎相同,只是集合实际上不包含值:集合只是唯一 key 的集合。顾名思义,集合在进行集合操作时非常有用。

备注

可散列类型是同时实现了 __hash__ 魔术函数和 __eq____cmp__ 的类型。Python 中的所有本地类型都已经实现了这些,任何用户类都有默认值。更多详情请参见"Hash Functions and Entropy"

我们在上一章中看到,对于没有内在顺序的列表/图元,我们最多只能用O(log n) 查找时间(通过搜索操作),而字典和集合则能让我们根据任意索引进行O(1) 查找。此外,与列表/元组一样,字典和集合的插入时间也是O(1)1正如我们将在"字典和集合如何工作?"中看到的,这种速度是通过使用开放地址哈希表作为底层数据结构实现的。

不过,使用字典和集合也有代价。首先,它们通常会占用较大的内存空间。此外,虽然插入/查找的复杂度为O(1) ,但实际速度在很大程度上取决于所使用的散列函数。如果散列函数的评估速度很慢,那么对字典或集合的任何操作也会同样缓慢。

让我们来看一个例子。假设我们要存储电话簿中每个人的联系信息,我们希望将其存储在一个表单中,这样将来回答 "无名氏的电话号码是多少?我们希望将这些信息存储在一个表单中,这样将来在回答 "无名氏的电话号码是多少?"这个问题时就会非常简单。如例 4-1 所示,使用列表时,我们会按顺序存储电话号码和姓名,然后扫描整个列表来查找我们需要的电话号码。

例 4-1. 使用列表查找电话簿
def find_phonenumber(phonebook, name):
    for n, p in phonebook:
        if n == name:
            return p
    return None

phonebook = [
    ("John Doe", "555-555-5555"),
    ("Albert Einstein", "212-555-5555"),
]
print(f"John Doe's phone number is {find_phonenumber(phonebook, 'John Doe')}")
备注

我们还可以对列表进行排序,并使用bisect 模块(来自例 3-4),以获得O(log n) 的性能。

而使用字典时,我们只需将 "索引 "设置为姓名,将 "值 "设置为电话号码即可,如例 4-2 所示。这样,我们就可以简单地查找所需的值,并获得对它的直接引用,而不必读取数据集中的每一个值。

例 4-2. 使用字典查找电话簿
phonebook = {
    "John Doe": "555-555-5555",
    "Albert Einstein" : "212-555-5555",
}
print(f"John Doe's phone number is {phonebook['John Doe']}")

对于大型电话簿, ...

Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.

Read now

Unlock full access

More than 5,000 organizations count on O’Reilly

AirBnbBlueOriginElectronic ArtsHomeDepotNasdaqRakutenTata Consultancy Services

QuotationMarkO’Reilly covers everything we've got, with content to help us build a world-class technology community, upgrade the capabilities and competencies of our teams, and improve overall team performance as well as their engagement.
Julian F.
Head of Cybersecurity
QuotationMarkI wanted to learn C and C++, but it didn't click for me until I picked up an O'Reilly book. When I went on the O’Reilly platform, I was astonished to find all the books there, plus live events and sandboxes so you could play around with the technology.
Addison B.
Field Engineer
QuotationMarkI’ve been on the O’Reilly platform for more than eight years. I use a couple of learning platforms, but I'm on O'Reilly more than anybody else. When you're there, you start learning. I'm never disappointed.
Amir M.
Data Platform Tech Lead
QuotationMarkI'm always learning. So when I got on to O'Reilly, I was like a kid in a candy store. There are playlists. There are answers. There's on-demand training. It's worth its weight in gold, in terms of what it allows me to do.
Mark W.
Embedded Software Engineer

You might also like

《学习 Python》第 5 版

《学习 Python》第 5 版

Mark Lutz
Tokenizing Text

Tokenizing Text

David Beazley

Publisher Resources

ISBN: 9798341657946