数据科学中的实用统计学(第2版) (Practical Statistics for Data Scientists, 2nd Edition)
by Peter Bruce, Andrew Bruce, Peter Gedeck
October 2021
Intermediate to advanced
289 pages
8h 31m
Chinese
Posts & Telecom Press
r × c denotes the number of rows by the number of columns: a 2 × 3 table has 2 rows and 3 columns.
3.9.1 Chi-Square Test: A Resampling Approach
Suppose you are testing three different headlines, A, B, and C, and you run each of them on 1,000 visitors, with the results shown in Table 3-4.
Table 3-4: Web testing results for three different headlines

            Headline A   Headline B   Headline C
Click               14            8           12
No-click           986          992          988
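The headlines appear to differ markedly: headline A's click rate is nearly twice that of headline B. The actual counts are small, though. A resampling procedure can test whether the click rates differ by more than chance alone would produce. For this test we need the "expected" distribution of clicks. Here, the null hypothesis is that all three headlines share the same click rate, giving an overall click rate of 34/3,000, and the expected distribution is built on that hypothesis. Under the null hypothesis, the contingency table would look like Table 3-5.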
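Table 3-5: Expected distribution if all three headlines have the same click rate (null hypothesis)

            Headline A   Headline B   Headline C
Click            11.33        11.33        11.33
No-click        988.67       988.67       988.67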
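The expected counts in Table 3-5 follow from pooling the three columns. As a minimal sketch (the variable names are illustrative and not from the book), each expected cell is the pooled rate for that outcome multiplied by the number of visitors shown that headline:

```python
# Observed counts from Table 3-4 (columns: headlines A, B, C).
observed = {
    "Click": [14, 8, 12],
    "No-click": [986, 992, 988],
}

# Visitors shown each headline (column totals) and the overall total.
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(col_totals)

# Under the null hypothesis, every headline shares the pooled rate
# for each outcome, e.g. 34/3000 for clicks.
expected = {
    outcome: [sum(counts) / grand_total * n for n in col_totals]
    for outcome, counts in observed.items()
}

for outcome, values in expected.items():
    print(outcome, [round(v, 2) for v in values])
```

Because each headline was shown to the same number of visitors, all three columns receive the same expected counts.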
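The Pearson residual is defined as:

$$R = \frac{\text{Observed} - \text{Expected}}{\sqrt{\text{Expected}}}$$

R measures the extent to which the actual counts differ from the expected counts (see Table 3-6).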
Table 3-6: Pearson residuals

            Headline A   Headline B   Headline C
Click            0.792       -0.990        0.198
No-click        -0.085        0.106       -0.021
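The residuals in Table 3-6 can be reproduced from the observed counts. The sketch below is illustrative only: the function name `pearson_residuals` is hypothetical, and the expected count for each cell is taken as row total × column total / grand total, which matches Table 3-5 for this data.

```python
import math

# Observed counts from Table 3-4: rows are click / no-click,
# columns are headlines A, B, C.
observed = [
    [14, 8, 12],
    [986, 992, 988],
]

def pearson_residuals(table):
    """Return (observed - expected) / sqrt(expected) for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        res_row = []
        for j, obs in enumerate(row):
            # Expected count under independence of row and column.
            exp = row_totals[i] * col_totals[j] / grand_total
            res_row.append((obs - exp) / math.sqrt(exp))
        residuals.append(res_row)
    return residuals

residuals = pearson_residuals(observed)
for label, row in zip(["Click", "No-click"], residuals):
    print(label, [round(r, 3) for r in row])
```

The click row has much larger residuals than the no-click row because the same absolute deviation is divided by a far smaller expected count.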
The chi-square statistic is defined as the sum of the squared Pearson residuals:

$$X = \sum_{i}^{r} \sum_{j}^{c} R^{2}$$
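where r and c are the number of rows and columns, respectively. The chi-square statistic for this example is 1.666. Is that more than could reasonably occur in a chance model? ...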
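One way to explore that question is a permutation-style resampling check. The sketch below is an assumption-laden illustration, not necessarily the procedure the book goes on to describe: it pools all 3,000 outcomes (34 clicks, 2,966 non-clicks), shuffles them, deals them into three groups of 1,000, and records how often the resampled chi-square statistic is at least as large as the observed one. The function names, the resample count, and the seed are all hypothetical choices.

```python
import random

def chi2_statistic(click_counts, group_size):
    """Sum of squared Pearson residuals for a 2 x k click/no-click table.

    Assumes every group has the same size, as in Table 3-4.
    """
    k = len(click_counts)
    exp_click = sum(click_counts) / k
    exp_noclick = group_size - exp_click
    stat = 0.0
    for clicks in click_counts:
        stat += (clicks - exp_click) ** 2 / exp_click
        stat += ((group_size - clicks) - exp_noclick) ** 2 / exp_noclick
    return stat

def permutation_p_value(click_counts, group_size, n_resamples=1000, seed=0):
    """Estimate how often chance alone gives a statistic >= the observed one."""
    rng = random.Random(seed)
    k = len(click_counts)
    total_clicks = sum(click_counts)
    # The "box": 1 = click, 0 = no click, pooled over all headlines.
    box = [1] * total_clicks + [0] * (k * group_size - total_clicks)
    observed_stat = chi2_statistic(click_counts, group_size)
    exceed = 0
    for _ in range(n_resamples):
        rng.shuffle(box)
        resampled = [
            sum(box[i * group_size:(i + 1) * group_size]) for i in range(k)
        ]
        # Small tolerance guards against floating-point ties.
        if chi2_statistic(resampled, group_size) >= observed_stat - 1e-9:
            exceed += 1
    return exceed / n_resamples

clicks = [14, 8, 12]
observed_stat = chi2_statistic(clicks, 1000)
p_value = permutation_p_value(clicks, 1000)
print(round(observed_stat, 3))  # 1.666, matching the value in the text
print(p_value)
```

A large resampled p-value would indicate that a statistic of 1.666 is well within what random assignment alone can produce.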
Publisher Resources

ISBN: 9787115569028