Python数据处理
by Jacqueline Kazil, Katharine Jarmul
July 2017
Intermediate to advanced
398 pages
11h 54m
Chinese
Posts & Telecom Press
14 Automation and Scaling
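You have scraped large amounts of data from APIs and websites, cleaned and organized it, run statistical analyses, and produced visual reports. Now it is time to let Python take over and automate your data wrangling. In this chapter, we cover how to automate data analysis, collection, and publication. We will learn how to set up proper logging and alerting so that scripts can run fully unattended while you are still notified of successes, failures, and any problems encountered along the way. We will also learn how to scale automation with Python libraries that help execute multiple tasks and monitor their success and failure, and we will review some libraries and helper tools for scaling your data work in the cloud.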
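As a rough illustration of the logging idea described above, the following minimal sketch wraps a task so that its success or failure is always recorded. It uses only Python's standard `logging` module; the logger name `automation_sketch`, the helper `run_with_logging`, and the toy `clean_rows` step are hypothetical names chosen for this example, not code from the book.

```python
import logging

# Hypothetical logger name; the chapter preview does not name one.
logger = logging.getLogger("automation_sketch")


def run_with_logging(task, *args, **kwargs):
    """Run task, log success or failure, and return (ok, result)."""
    name = getattr(task, "__name__", repr(task))
    logger.info("starting %s", name)
    try:
        result = task(*args, **kwargs)
    except Exception:
        # logger.exception logs at ERROR level and includes the traceback.
        logger.exception("%s failed", name)
        return False, None
    logger.info("%s succeeded", name)
    return True, result


def clean_rows(rows):
    """Toy stand-in for a cleaning step: strip whitespace, drop empty rows."""
    return [r.strip() for r in rows if r.strip()]


if __name__ == "__main__":
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )
    print(run_with_logging(clean_rows, [" a ", "", "b"]))
    print(run_with_logging(clean_rows, None))  # fails: None is not iterable
```

Returning a success flag instead of re-raising lets an unattended script carry on to an alerting step (for example, sending a message) rather than dying silently; whether that is the right choice depends on the task.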
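Python offers many options for automation and scaling. Some simple, straightforward tasks can be automated on nearly any machine without much setup, while larger and more complex approaches to automation also exist. We will cover examples of both and explain how, as a data wrangler, you can scale your data automation.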
14.1 Why Automate
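Automation gives you a way to run scripts easily without executing them on your local machine, and without even being awake! The ability to automate means you can spend your time on other thought-intensive projects. If you have a well-written script that performs your data cleanup, you can focus on exploring the data and writing better reports.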
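Here are some examples of tasks where automation can help.
A new set of analysis results comes out every Tuesday; you compile a report and send it to the interested parties.
Other departments or coworkers need to be able to run the reporting or cleanup tools without your guidance and support.
You need to download, clean, and send data once a week.
Every time a user requests a new report, the reporting script needs to run and notify the user once the report has been generated.
You need to clean erroneous data out of the database once a week and back it up to another location. ...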
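To make the weekly download, clean, and send case concrete, here is a minimal sketch of a script with a single entry point that needs no human input. The functions `download`, `clean`, and `send` and the inline CSV are hypothetical stand-ins invented for this example; a real script would fetch from an API or website and deliver the result by email or upload.

```python
import csv
import io


def download():
    """Hypothetical stand-in for a download step; returns raw CSV text."""
    return "name,amount\n alice ,10\nbob,\n,5\ncarol,7\n"


def clean(raw_csv):
    """Keep only rows that have both a name and an amount."""
    rows = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        name = (row["name"] or "").strip()
        amount = (row["amount"] or "").strip()
        if name and amount:
            rows.append({"name": name, "amount": int(amount)})
    return rows


def send(rows):
    """Hypothetical stand-in for emailing or uploading; returns a summary."""
    total = sum(r["amount"] for r in rows)
    return f"{len(rows)} rows, total {total}"


def main():
    """Single entry point so a scheduler can run the whole job unattended."""
    return send(clean(download()))


if __name__ == "__main__":
    print(main())
```

Because everything runs from `main()`, a scheduler can invoke the file directly. As one assumed setup, a cron entry such as `0 9 * * 2 python /path/to/weekly_report.py` (the path is a placeholder) would run it every Tuesday at 09:00.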
Publisher Resources

ISBN: 9787115459190