Python数据处理

by Jacqueline Kazil, Katharine Jarmul
July 2017
Intermediate to advanced
398 pages
11h 54m
Chinese
Posts & Telecom Press
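... it also provides flexibility.
Besides these fairly simple, direct ways to parse data and pass extra information to your script, you can also use more complex, distributed approaches, such as cloud-based data and data storage. We look at those next.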
14.5.2 Using the Cloud for Data Processing
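The term "cloud" usually refers to a shared pool of resources, such as servers. Many companies offer cloud services; Amazon Web Services (AWS) is one of the best-known providers.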
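The word "cloud" is often overused. If you are running code on a cloud server, it is better to say "I'm running it on a server" than "I'm running it in the cloud."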
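When is the cloud a good fit? If your data is too large to process on your own computer, or your program takes a long time to run, the cloud is a good option. Move most of the tasks you want to automate to the cloud, so you don't have to worry about whether the script keeps running when your computer is switched on or off.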
If you choose AWS, you will see many different service offerings when you first log in. For data wrangling you need only a few of them (see Table 14-1).
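Table 14-1. AWS cloud services

Service                        Purpose in data wrangling
Simple Storage Service (S3)    A simple file storage service, used to back up data files (JSON, XML, etc.)
Elastic Compute Cloud (EC2)    An on-demand server; this is where you run your scripts
Elastic MapReduce (EMR)        Distributed data processing through a managed Hadoop framework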
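These are the basic AWS services you should be familiar with. There are also various competitors, including IBM's Bluemix and Watson Developer Cloud (which gives you access to various large data platforms, including Watson's logic and natural language processing abilities). You can also use DigitalOcean or Rackspace, which offer inexpensive cloud resources. ...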
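
As an illustration of the S3 backup use listed in Table 14-1, here is a minimal sketch in Python. It is not taken from the book: the function name `backup_to_s3`, the `backups/` key prefix, and the bucket name are hypothetical. It assumes the client object exposes `upload_file(Filename, Bucket, Key)`, which is the signature of the boto3 S3 client method of that name.

```python
from pathlib import Path


def backup_to_s3(s3_client, local_path, bucket, prefix="backups"):
    """Upload one local data file to S3 and return the object key used.

    s3_client is assumed to behave like boto3.client("s3"); the function
    name, the prefix default, and the key layout are illustrative only.
    """
    local_path = Path(local_path)
    if not local_path.is_file():
        raise FileNotFoundError(f"No such file: {local_path}")

    # Store the file under "<prefix>/<filename>" (a hypothetical layout).
    key = f"{prefix}/{local_path.name}"

    # boto3's S3 client takes the local filename, bucket name, and key.
    s3_client.upload_file(str(local_path), bucket, key)
    return key
```

With boto3 installed and AWS credentials configured, a call might look like `backup_to_s3(boto3.client("s3"), "data.json", "my-example-bucket")`. Passing the client in as an argument keeps the function easy to try out with a stand-in object before touching a real bucket.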
Publisher Resources

ISBN: 9787115459190