Python数据处理 (Data Wrangling with Python)

by Jacqueline Kazil, Katharine Jarmul
July 2017
Intermediate to advanced
398 pages
11h 54m
Chinese
Posts & Telecom Press
Chapter 9. Data Exploration and Analysis
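Now that you have spent some time acquiring and cleaning your data, you are ready to start analyzing it! It is important to approach data exploration without many expectations about the results. Your question may be too broad for a single answer, or it may have no conclusive answer at all. Recall what you learned about hypotheses and conclusions in your first science class. It is best to approach data exploration the same way, with the understanding that your analysis may not yield a clear conclusion.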
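That said, simply exploring the data and finding that it shows no trends, or trends that do not match your expectations, is interesting in itself. If everything turned out as we hoped, data wrangling would be a bit boring. We have learned to expect a little less and explore a little more.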
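As you begin to analyze and explore your data, you may realize that you need more data or different kinds of data. This is common as you define more precisely the questions you want to answer and examine what the data is telling you, and it is something to accept as part of the process.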
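Now is also a good time to revisit the questions you asked when you first found your dataset. What do you want to know? Are there other related questions that could aid your exploration? Those questions may point you in a direction and show you where a story can be found. Even if they do not, they will lead you to other interesting questions. And even if you cannot answer your initial questions, you can gain a deeper understanding of the topic and discover new questions to explore.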
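In this chapter, we will learn about some new Python libraries for data exploration and analysis, and continue to apply what we learned about cleaning data in the previous two chapters. We will look at how to join datasets, explore the data, and draw statistical conclusions about relationships within the datasets.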
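As a rough illustration of the workflow outlined above (join datasets, then look for a statistical relationship), here is a minimal sketch using only the Python standard library. It is not the chapter's own code or library choice. The records, the field names `population_m` and `literacy_pct`, and the helpers `merge_on` and `pearson` are all hypothetical, invented for this example.

```python
from math import sqrt

# Hypothetical toy records, invented purely for illustration.
population = [
    {"country": "A", "population_m": 10.0},
    {"country": "B", "population_m": 20.0},
    {"country": "C", "population_m": 30.0},
    {"country": "D", "population_m": 40.0},
]
literacy = [
    {"country": "A", "literacy_pct": 60.0},
    {"country": "B", "literacy_pct": 70.0},
    {"country": "C", "literacy_pct": 80.0},
    {"country": "E", "literacy_pct": 90.0},
]


def merge_on(left, right, key):
    """Inner-join two lists of dicts on a shared key (hypothetical helper)."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Only countries present in both datasets survive the join (A, B, C).
merged = merge_on(population, literacy, "country")

# Measure how strongly the two columns move together.
r = pearson(
    [row["population_m"] for row in merged],
    [row["literacy_pct"] for row in merged],
)
print(len(merged), round(r, 3))
```

Rows without a match on both sides (D and E here) are dropped by the inner join, which is one reason exploration often reveals that more or different data is needed. A correlation coefficient on its own describes a relationship in the sample and does not establish a cause.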
9.1 Exploring Data
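Having learned to parse and clean data in the previous two chapters, you are probably already comfortable interacting with data in Python. Now ...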

Publisher Resources

ISBN: 9787115459190