
数据驱动力：企业数据分析实战

Name: 数据驱动力：企业数据分析实战
Author: Carl Anderson
ISBN: 9787115560179


April 2021

Intermediate to advanced

210 pages

6h 3m

Chinese

Posts & Telecom Press


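Cover
Title Page
Copyright
Copyright Notice
About O'Reilly Media, Inc.
Table of Contents
Praise for the Chinese Edition
Preface
Overview
Intended Audience
Chapter Organization
Typographical Conventions
O'Reilly Online Learning
Contact Us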

Content preview from 数据驱动力：企业数据分析实战


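Appendix A

On the Unreasonable Effectiveness of Data: Why Is More Data Better?

This appendix is reproduced from the author's blog post of the same name, with minor changes and corrections.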

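In the article "The Unreasonable Effectiveness of Data," Halevy, Norvig, and Pereira of Google claim that something interesting happens once a corpus reaches web scale:

Simple models built on large amounts of data beat more precise models built on less data.

In that article, and in a more detailed technical talk by Norvig, they show that once a corpus contains hundreds of millions or trillions of training examples or words, even very simple models that rest on basic independence assumptions outperform more complex models built from less data, such as those based on carefully crafted ontologies. However, they say little about why more data is better. This appendix explores the reasons.

I believe several classes of problems, and several reasons, can explain why more data is better.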

A.1 Nearest-Neighbor-Type Problems

The first category is
