Python数据处理 (Chinese edition of Data Wrangling with Python)
by Jacqueline Kazil, Katharine Jarmul
July 2017
Intermediate to advanced
398 pages
11h 54m
Chinese
Posts & Telecom Press
Data Exploration and Analysis
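[...] library has developed the agate-stats library (https://github.com/onyxfish/agate-stats) based on this algorithm. Until then, you can use numpy for correlation analysis. A correlation coefficient (for example, the Pearson correlation coefficient) tells us whether the data are related and whether one factor may affect another.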
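To make the idea of a correlation coefficient concrete, the following is a minimal sketch of the Pearson calculation in plain Python. The function name pearson and the sample lists are hypothetical and chosen only for illustration; this is not the implementation used by numpy or agate-stats.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalized covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: a constant sequence has zero variance and would divide by zero here
    return cov / math.sqrt(var_x * var_y)


# Hypothetical sample data, not taken from the book's dataset
base = [1.0, 2.0, 3.0, 4.0, 5.0]
rising = [2.0, 4.0, 6.0, 8.0, 10.0]
falling = [10.0, 8.0, 6.0, 4.0, 2.0]

r_up = pearson(base, rising)     # perfectly linear, increasing together
r_down = pearson(base, falling)  # perfectly linear, moving in opposite directions
```

Here r_up comes out at 1.0 and r_down at -1.0, the two extremes of the coefficient. Real data such as the child labor and corruption figures falls somewhere in between.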
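If you have not installed numpy yet, you can install it by running pip install numpy. Then use the following lines of code to calculate the correlation between the child labor rate and the government corruption index: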
    import numpy

    # Correlate the child labor rate with the corruption perception score
    numpy.corrcoef(cpi_and_cl.columns['Total (%)'].values(),
                   cpi_and_cl.columns['CPI 2013 Score'].values())[0, 1]
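At first we get an error similar to the CastError exception we saw earlier. Because numpy needs floating-point data rather than decimals, we need to convert the numbers back to floats. We can use a list comprehension to do this conversion: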
    numpy.corrcoef(
        [float(t) for t in cpi_and_cl.columns['Total (%)'].values()],
        [float(s) for s in cpi_and_cl.columns['CPI 2013 Score'].values()])[0, 1]
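Our output shows a weak negative correlation:
    -0.36024907120356736
A negative correlation means that as one variable increases, the other decreases. A positive correlation means the two variables ...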
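Returning to the float conversion used above: the sketch below shows, in plain Python, why decimal values may need converting before floating-point arithmetic. It assumes the column values are decimal.Decimal objects, as the text implies. The list totals is hypothetical stand-in data, not the book's dataset.

```python
from decimal import Decimal

# Hypothetical values standing in for a column of decimal numbers
totals = [Decimal("25.5"), Decimal("10.1"), Decimal("3.0")]

# Python refuses to mix Decimal and float in arithmetic
try:
    totals[0] * 0.5
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Convert each value with a list comprehension, as in the text
as_floats = [float(t) for t in totals]

# Float arithmetic now works on the converted values
halved = [t * 0.5 for t in as_floats]
```

After the conversion every element is an ordinary float, so it can be passed to code that expects floating-point input.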

Publisher Resources

ISBN: 9787115459190