精通数据科学算法

by Posts & Telecom Press, David Natingga
May 2024
Intermediate to advanced
181 pages
3h 9m
Chinese
Packt Publishing
Content preview from 精通数据科学算法

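Preface

Data science is an interdisciplinary field spanning machine learning, statistics, and data mining. Its goal is to extract new knowledge from existing data through algorithms and statistical analysis. In this book, you will learn seven of the most important data analysis methods in data science. Each chapter first explains an algorithm or analyzes a concept through a simple example, then builds on and extends it into more specialized analysis methods through further examples and exercises.

Chapter 1, Classification Using k-Nearest Neighbors, classifies a data item based on the k most similar items.

Chapter 2, Naive Bayes, uses Bayes' theorem to compute the probability that a data item belongs to a particular class.

Chapter 3, Decision Trees, organizes decision criteria into the branches of a tree and uses the decision tree to classify a data item into the class of the leaf node it reaches.

Chapter 4, Random Forest, classifies a data item with an ensemble of decision trees, improving the accuracy of the algorithm by reducing the negative impact of bias.

Chapter 5, k-Means Clustering, partitions data into k clusters to find patterns and similarities between data items, and applies these patterns to classify new data.

Chapter 6, Regression Analysis, models data with an equation and uses this simple representation to make predictions for unknown data.

Chapter 7, Time Series Analysis, reveals trends and repeating patterns in time-dependent data in order to predict future stock markets, Bitcoin prices, and other time-based events.

Appendix A, Statistics, provides a summary of the statistical methods and analysis tools that are useful to a data scientist.

Appendix B, R Reference, covers the basic R language constructs.

Appendix C, Python Reference, covers the basic Python language constructs, along with the commands and functions used throughout the book.

Appendix D, Glossary of Algorithms and Methods in Data Science, provides a glossary of some of the most important and useful algorithms and methods in data science and machine learning.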
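
To make the Chapter 1 summary concrete, the following is a minimal Python sketch of k-nearest-neighbors classification: a query point takes the majority label among its k closest training points. It is not the book's own code. The toy data, the labels "cold" and "warm", and the function name `knn_classify` are assumptions made only for this illustration, and Euclidean distance is just one possible similarity measure.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train: list of ((x, y), label) pairs (hypothetical toy format).
    query: an (x, y) point to classify.
    """
    # Sort training items by Euclidean distance to the query and keep the k closest.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Count the labels of those neighbors and pick the most frequent one.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups of points.
train = [
    ((1.0, 1.0), "cold"), ((1.5, 2.0), "cold"), ((2.0, 1.0), "cold"),
    ((6.0, 6.0), "warm"), ((7.0, 5.5), "warm"), ((6.5, 7.0), "warm"),
]

print(knn_classify(train, (2.0, 2.0)))
print(knn_classify(train, (6.0, 5.0)))
```

With two classes, an odd k avoids tied votes, which is why the sketch defaults to k=3.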

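Most importantly, approach the problems with an active, inquisitive attitude: much new knowledge is hidden in the exercises. You will also need to run Python and R programs on a system of your choice. The author ran the programming languages from the command line on the Linux operating system.

This book is intended for readers familiar with Python ...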


Publisher Resources

ISBN: 9781836204596