数据科学实战手册(R+Python)(第2版) (Practical Data Science Handbook, R + Python, 2nd Edition)

by Posts & Telecom Press, Prabhanjan Narayanachar Tattar, Tony Ojeda, Sean Patrick Murphy, Benjamin Bengfort, Abhijit Dasgupta
May 2024
Intermediate to advanced
326 pages
4h 55m
Chinese
Packt Publishing

Chapter 6 Visualizing Automobile Data (with Python)

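This chapter covers the following main topics.

  • Getting started with IPython.
  • Getting familiar with Jupyter Notebook.
  • Preparing to analyze automobile fuel efficiency.
  • Exploring and describing the automobile fuel efficiency data with Python.
  • Analyzing how automobile fuel efficiency changes over time with Python.
  • Investigating automobile makes and models with Python.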

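In Chapter 2 of this book, which introduced the R language, we presented a project that used R to analyze automobile fuel economy data. The data set from the related website contains fuel efficiency metrics for US car makes and models at different points in time, along with a rich set of other features and attributes. This gives us the opportunity to wrangle and analyze the data to uncover interesting trends and relationships.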

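Unlike the earlier chapter on R, this chapter uses Python for the entire analysis. We still follow the data science analysis pipeline, however, taking the same steps to answer the same questions. By working through this chapter, you will see the similarities and differences between the two languages when they carry out nearly identical analyses.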

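In the previous chapters, we mainly used pure Python code, along with some NumPy and SciPy functionality, and ran our analyses either from the Python command line, also known as the Read-Eval-Print Loop (REPL), or from executable script files. In this chapter, we will see a different way to use Python as a scripting language: an interactive style closer to that of R. We introduce Python's unofficial interactive environment, IPython, together with Jupyter Notebook, and show how to write readable, well-documented analysis scripts in that environment. We will also draw on the data analysis capabilities of the relatively new and powerful pandas library and the extremely useful data frame type it provides. pandas lets us accomplish complex tasks with little code; the drawback is that we must learn the entirely different API that the pandas library itself exposes. In return, it saves us a great deal of repetitive data manipulation work.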
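As a rough illustration of the "little code" claim, here is a minimal sketch of a pandas data frame and a one-line aggregation. The rows and the column names (`make`, `year`, `mpg`) are invented for this example and are not the schema of the book's fuel economy data set.

```python
import pandas as pd

# Hypothetical sample rows; the column names (make, year, mpg) are
# assumptions for illustration, not the book's actual data set.
vehicles = pd.DataFrame(
    {
        "make": ["Ford", "Ford", "Toyota", "Toyota"],
        "year": [1984, 2014, 1984, 2014],
        "mpg": [18.0, 24.0, 22.0, 30.0],
    }
)

# Group the rows by model year and average the fuel efficiency
# within each group; the result is a Series indexed by year.
mpg_by_year = vehicles.groupby("year")["mpg"].mean()
print(mpg_by_year)
```

Doing the same in pure Python would take an explicit loop and a dictionary of running totals. The price, as the text notes, is learning pandas-specific calls such as `groupby`.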

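The goal of this chapter is not to walk you through repeating a project that has already been completed, but to show you how to complete that project in another language. More importantly, we hope it prompts readers to reflect on their own code and analyses: to consider not only how to get something done, but also why a particular implementation is used in a particular language, and how a programming language shapes the way an analysis is approached. ...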

Publisher Resources

ISBN: 9781836202370