数据科学实战手册

by Posts & Telecom Press, Tony Ojeda, Sean Patrick Murphy, Benjamin Bengfort
May 2024
Intermediate to advanced
357 pages
5h 3m
Chinese
Packt Publishing

Chapter 3 Simulating American Football Data (R)

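In this chapter, we will cover the following:

  • Acquiring and cleaning American football data
  • Analyzing and understanding American football data
  • Constructing indexes that measure offensive and defensive strength
  • Simulating a single game whose outcome is decided by a program
  • Simulating multiple games whose outcomes are decided by calculation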

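American football is the most popular competitive sport in the United States and the ninth most popular sport in the world. Every year, fans look forward to the new season that kicks off in September: a 17-week regular season, playoffs that begin the following January, and the Super Bowl championship game held in late January or early February.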

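We begin by using a few statistical measures to get an initial understanding of the football data used in this chapter, and we decide the winner of a game by comparing those measures between teams. We can then use the same measures to simulate single games and multiple games. Among the many ways to simulate a game, it would be possible to build a highly detailed single-game simulation from data on every player in every game, but that level of detail is only necessary when building a football video game. In this chapter we take a simpler approach: team-level statistics are already enough to decide effectively which team should win a game.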
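The idea of deciding a game from team-level measures can be sketched in a few lines of base R. This is only an illustrative sketch, not the chapter's actual method: the team names, the `offense` and `defense` scores, the `simulate_game` helper, and the rule of summing the two scores are all assumptions made for this example, and the numbers are invented rather than taken from real data.

```r
# Hypothetical team-level strength indexes (made-up numbers, not real data).
teams <- data.frame(
  team    = c("Team A", "Team B", "Team C"),
  offense = c(62L, 55L, 48L),
  defense = c(51L, 60L, 45L),
  stringsAsFactors = FALSE
)

# Decide one game by comparing combined strength (an assumed rule).
# Returns the winning team's name, or NA when the two teams are tied.
simulate_game <- function(teams, home, away) {
  strength <- setNames(teams$offense + teams$defense, teams$team)
  if (strength[[home]] > strength[[away]]) {
    home
  } else if (strength[[home]] < strength[[away]]) {
    away
  } else {
    NA_character_
  }
}

# Single game: Team A (113) against Team B (115).
winner <- simulate_game(teams, "Team A", "Team B")

# Multiple games: apply the same rule to every row of a hypothetical schedule.
schedule <- data.frame(
  home = c("Team A", "Team B", "Team C"),
  away = c("Team B", "Team C", "Team A"),
  stringsAsFactors = FALSE
)
schedule$winner <- mapply(
  function(h, a) simulate_game(teams, h, a),
  schedule$home, schedule$away,
  USE.NAMES = FALSE
)

# Count wins per team across the schedule, ignoring ties.
wins <- vapply(
  teams$team,
  function(t) sum(schedule$winner == t, na.rm = TRUE),
  integer(1)
)
print(winner)
print(wins)
```

Because the rule here is a deterministic comparison, running the schedule twice gives the same result. A simulation that adds randomness would replace the comparison inside `simulate_game` and leave the rest of the structure unchanged.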

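The goal of this chapter is to show how to carry out a complete data science project: acquiring data from a website, devising indexes, formulas, and calculation methods, and interpreting different real-world scenarios. Finally, we can use what we learn from historical data to simulate future games. To show that R is not only a tool for statistical modeling but also a programming language, we use R for the data acquisition, processing, and presentation in this project.

This chapter still follows the data science project workflow, with appropriate adjustments for the different data types and task types involved.

To complete the data science project in this chapter, you need a computer with internet access that has R installed along with the following R packages.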

install.packages("XML")
install.packages("RSQLite")
install.packages("stringr")
...
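Before installing anything, it can be useful to check which of these packages are already available. The following is a minimal sketch using base R's `requireNamespace()`. It covers only the three package names visible in the calls above, since the list in the source is truncated, and the variable names `required`, `installed`, and `missing_pkgs` are chosen for this example.

```r
# Package names taken from the install.packages() calls above. The list in
# the source is truncated, so further packages may be required.
required <- c("XML", "RSQLite", "stringr")

# requireNamespace() returns TRUE when a package can be loaded,
# without attaching it to the search path.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install (needs internet access)
} else {
  message("All listed packages are available.")
}
```

The install call is left commented out so that the check has no side effects when run.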

ISBN: 9781836206774