Python数据处理 (Data Wrangling with Python)

by Jacqueline Kazil, Katharine Jarmul
July 2017
Intermediate to advanced
398 pages
11h 54m
Chinese
Posts & Telecom Press
Content preview from Python数据处理
Preface
...techniques, you can turn there and take a look.
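Who This Book Is Not For
This book is definitely not for experienced Python programmers who already know which libraries and techniques data wrangling tasks require. (For those readers, we recommend Python for Data Analysis by Wes McKinney.) If you are an experienced Python developer, or have used other languages with data analysis capabilities such as Scala or R, this book may not be for you either. But if you are an experienced developer in a web language such as PHP or JavaScript, which lacks data analysis capabilities of its own, this book can teach you Python through data wrangling.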
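How This Book Is Organized
The structure of this book follows the full life cycle of a typical data analysis project or story. It starts by asking a question, then moves on to acquiring data, cleaning it, exploring it, communicating the findings in the data, scaling to larger datasets, and finally automating the whole process. This approach lets you progress from simple questions to more complex problems and research. We cover basic methods for communicating findings first, and only then cover advanced data-gathering techniques.
If you are already familiar with the material in some chapters, you can use this book as a reference or skip those chapters. We still recommend skimming every chapter to make sure you do not miss new resources and techniques.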
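What Is Data Wrangling
Data wrangling means turning messy or unprocessed data sources into useful information. You first look for raw data sources and judge their value: How good is the data quality of these datasets? Are they relevant to your goal? Can you find a better data source? Once the data has been parsed and cleaned, the dataset becomes usable, and you can apply tools and methods (such as Python scripts) to help you analyze the data and present the results as a report. In this way you can turn data that nobody was looking at into something clear and usable.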
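The parse, clean, analyze, and report steps described above can be illustrated with a minimal sketch. This is not code from the book: the sample data, the column names (`country`, `hours_worked`), and the helper functions `parse`, `clean`, and `report` are all hypothetical, and only the Python standard library is assumed.

```python
import csv
import io

# Hypothetical raw data: stray whitespace, a missing value, a non-numeric entry.
RAW_CSV = """country,hours_worked
 Kenya ,40
Peru,
Chile,n/a
Kenya,44
"""


def parse(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Strip whitespace and drop rows whose hours value is not a number."""
    cleaned = []
    for row in rows:
        country = row["country"].strip()
        hours = (row["hours_worked"] or "").strip()
        try:
            cleaned.append({"country": country, "hours_worked": float(hours)})
        except ValueError:
            continue  # skip missing or unparseable values
    return cleaned


def report(rows):
    """Summarize average hours per country as plain-text lines."""
    totals = {}
    for row in rows:
        totals.setdefault(row["country"], []).append(row["hours_worked"])
    return [f"{c}: {sum(v) / len(v):.1f}" for c, v in sorted(totals.items())]


rows = clean(parse(RAW_CSV))
summary = report(rows)
print("\n".join(summary))
```

Dropping unparseable rows is only one possible cleaning policy; depending on the question being asked, it may be more appropriate to flag such rows or go back to the source for better data.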
What to Do If You Get Stuck
Publisher Resources

ISBN: 9787115459190