面向数据科学家的实用统计学 (Practical Statistics for Data Scientists)

by Peter Bruce, Andrew Bruce
October 2018
Beginner to intermediate
238 pages
6h 32m
Chinese
Posts & Telecom Press
$$
\text{Trimmed mean} = \bar{x} = \frac{\sum_{i=p+1}^{n-p} x_{(i)}}{n - 2p}
$$
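The trimmed mean eliminates the influence of extreme values on the mean. In international diving competitions, for example, five judges score each dive, and the diver's final score is the mean of the three scores that remain after the highest and lowest are dropped. This makes it hard for any single judge to manipulate a diver's score, since a judge might favor divers from his or her own country. Trimmed means are widely used, and in many cases they are preferable to the ordinary mean. Section 1.3.2 discusses this in more detail.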
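The diving example above can be sketched in a few lines of Python. This is a minimal illustration, not code from the book: the function name `trimmed_mean`, the parameter `p` (the number of values dropped from each end), and the judge scores are all assumptions made up for the example.

```python
def trimmed_mean(values, p):
    """Mean of the values left after dropping the p smallest and p largest."""
    n = len(values)
    if p < 0 or 2 * p >= n:
        raise ValueError("p must satisfy 0 <= 2p < n")
    ordered = sorted(values)      # order the values from smallest to largest
    kept = ordered[p:n - p]       # keep positions p+1 .. n-p (1-based)
    return sum(kept) / len(kept)  # divide by n - 2p


# Hypothetical scores from five judges (illustrative data only).
scores = [9.5, 8.0, 8.5, 8.5, 6.0]

plain_mean = sum(scores) / len(scores)
final_score = trimmed_mean(scores, p=1)  # drop the highest and the lowest score
```

With these made-up scores the low outlier of 6.0 pulls the plain mean down, while the trimmed mean depends only on the three middle scores.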
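Another type of mean is the weighted mean. To compute it, multiply each value $x_i$ by a weight $w_i$ and divide the sum of the weighted values by the sum of the weights. The formula is: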
$$
\text{Weighted mean} = \bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}
$$
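There are two main motivations for using a weighted mean.
Some values are intrinsically more variable than others, so highly variable observations are given a lower weight. For example, if we are averaging data from several sensors and one of them is less accurate, we can give the data from that sensor a lower weight.
The data collected may not represent equally the different groups we want to measure. For example, because of the way an online experiment was run, the resulting data set may not accurately reflect all the groups in the user base. To correct for this, we can give a higher weight to the values from the underrepresented groups.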
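The sensor case can be illustrated with a short Python sketch of the weighted mean. The function name `weighted_mean`, the readings, and the weights are hypothetical values chosen for illustration; how to pick weights in practice is not something this passage specifies.

```python
def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of the weights w_i."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Hypothetical readings from three sensors; the third is assumed less accurate.
readings = [20.0, 20.5, 26.0]
weights = [1.0, 1.0, 0.25]  # illustrative choice: down-weight the noisy sensor

plain_mean = sum(readings) / len(readings)
sensor_mean = weighted_mean(readings, weights)
```

With equal weights the function reduces to the ordinary mean. Lowering the third weight moves the result toward the two sensors that agree with each other.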
1.3.2 Median and robust estimates
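The median is the value in the middle of the sorted data. If there is an even number of values, the median is the mean of the two values in the middle positions. Unlike the mean, which uses every observation, the median depends only on the values at the center of the sorted data. This may look like a disadvantage, but because the mean is much more sensitive to the data, the median is the better measure of location in many practical applications ...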
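The definition of the median, including the even-count case, can be sketched as follows. The function and the data are illustrative assumptions, not taken from the book. Python's standard library also provides `statistics.median`, which returns the same result for numeric data.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: mean of the two


# Hypothetical data with one extreme value (illustrative only).
incomes = [30, 35, 40, 45, 50, 1000]

mean_income = sum(incomes) / len(incomes)
median_income = median(incomes)
```

In this made-up data set the single extreme value dominates the mean, while the median is determined only by the two central values, 40 and 45.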

Publisher Resources

ISBN: 9787115493668