Python数据处理

by Jacqueline Kazil, Katharine Jarmul
July 2017
Intermediate to advanced
398 pages
11h 54m
Chinese
Posts & Telecom Press
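…match the responses that contain cheeseburger, even though the two are exactly the same. Can you accomplish this with another method we have already learned?

The last matching method FuzzyWuzzy provides is the process module. It is useful when you have only a limited set of choices and messy data. Suppose the only possible answers are yes, no, maybe, and decline to comment. Let's see how to match against them: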
from fuzzywuzzy import process
choices = ['Yes', 'No', 'Maybe', 'N/A']
process.extract('ya', choices, limit=2)
process.extractOne('ya', choices)
process.extract('nope', choices, limit=2)
process.extractOne('nope', choices)
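Uses FuzzyWuzzy's extract method to compare the string against each item in the list of possible matches in turn. With limit=2, the function returns the two most likely matches from the choices list.

Uses FuzzyWuzzy's extractOne method to return the single best match for our string from the choices list.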
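To make the ranking idea concrete, here is a minimal sketch that mimics the shape of extract and extractOne using only Python's standard library. It is a stand-in under stated assumptions, not FuzzyWuzzy's implementation: the helper names similarity, rank_choices, and best_choice are hypothetical, and the score comes from difflib.SequenceMatcher rather than FuzzyWuzzy's own scorer, so the numbers will differ from what process returns.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0-100 similarity score, ignoring case (illustrative scorer only)."""
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


def rank_choices(query, choices, limit=2):
    """Hypothetical stand-in for extract: score every choice, keep the top `limit`."""
    scored = [(choice, similarity(query, choice)) for choice in choices]
    # list.sort() is stable, so tied scores keep their original order in `choices`.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


def best_choice(query, choices):
    """Hypothetical stand-in for extractOne: return only the top (choice, score) pair."""
    return rank_choices(query, choices, limit=1)[0]


choices = ['Yes', 'No', 'Maybe', 'N/A']
print(rank_choices('ya', choices))   # list of (choice, score) tuples
print(best_choice('nope', choices))  # a single (choice, score) tuple
```

Both helpers return (choice, score) tuples so that the calling code sees the same kind of result as in the example above: a candidate plus a number saying how close it is.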
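Aha! Given a handful of words that we already know "mean" the same thing, process can find the best guess, and in the examples above it is also the correct guess. extract returns tuples that include the match ratio, so the code can parse the response strings and compare how similar or different they are.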
extractOne ...
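Because each result pairs a choice with a score, downstream code can unpack the pair and decide whether the match is good enough to trust. The sketch below illustrates that pattern with a hypothetical normalize helper and an arbitrary cutoff of 60. The scorer is difflib.SequenceMatcher from the standard library, used as a stand-in, so the scores are not the ones FuzzyWuzzy would report.

```python
from difflib import SequenceMatcher

CHOICES = ['Yes', 'No', 'Maybe', 'N/A']


def score(a, b):
    """Illustrative 0-100 similarity score, ignoring case (not FuzzyWuzzy's scorer)."""
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


def normalize(response, choices=CHOICES, cutoff=60):
    """Map a messy response to its closest choice, or None if nothing is close enough.

    `cutoff` is an arbitrary threshold chosen for illustration.
    """
    # Unpack the best (choice, score) pair, as one would with an extractOne-style result.
    best, best_score = max(
        ((choice, score(response, choice)) for choice in choices),
        key=lambda pair: pair[1],
    )
    return best if best_score >= cutoff else None


responses = ['yes!', 'nope', 'mabye', 'xyz']
cleaned = {response: normalize(response) for response in responses}
print(cleaned)
```

Returning None for weak matches keeps unrecognizable responses visible for manual review instead of silently forcing them into one of the choices.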
ISBN: 9787115459190