Spark快速大数据分析(第2版)

by Jules S. Damji, Brooke Wenig, Tathagata Das, Denny Lee
November 2021
Intermediate to advanced
340 pages
10h 46m
Chinese
Posts & Telecom Press
Content preview from Spark快速大数据分析(第2版)
2 Downloading Apache Spark and Getting Started
This chapter gets you set up with Spark and walks you through three simple steps to writing your first standalone Spark application.
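In local mode, all processing happens on a single machine. We will use local mode because it is an easier way to learn the framework and gives a quick feedback loop for iteratively running Spark operations. Before writing a complex Spark application, you can use the Spark shell to prototype Spark operations on small data sets. For large data sets that need robust distributed execution, however, local mode is not a good fit; deployment modes such as YARN or Kubernetes are more appropriate.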
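
The difference between local mode and a cluster deployment shows up in the master URL handed to Spark. The sketch below is a minimal illustration, not code from the book: the helper names `master_url` and `build_session` are hypothetical, and `build_session` assumes the `pyspark` package is installed (it is imported lazily so the rest of the block runs without it).

```python
from typing import Optional


def master_url(mode: str, threads: Optional[int] = None,
               k8s_api_server: Optional[str] = None) -> str:
    """Return a Spark master URL for a deployment mode (hypothetical helper)."""
    if mode == "local":
        # local[*] uses as many worker threads as there are logical cores;
        # local[N] pins the number of worker threads to N.
        return "local[*]" if threads is None else f"local[{threads}]"
    if mode == "yarn":
        # The cluster location is read from the Hadoop configuration.
        return "yarn"
    if mode == "kubernetes":
        if not k8s_api_server:
            raise ValueError("kubernetes mode needs the API server address")
        return f"k8s://{k8s_api_server}"
    raise ValueError(f"unsupported mode: {mode!r}")


def build_session(app_name: str, master: str):
    """Create a SparkSession for the given master; assumes pyspark is installed."""
    from pyspark.sql import SparkSession  # lazy import: optional dependency

    return SparkSession.builder.master(master).appName(app_name).getOrCreate()
```

For prototyping on a laptop, something like `build_session("prototype", master_url("local"))` would keep all work on one machine; switching to YARN or Kubernetes changes only the master URL, not the application code.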
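Although the Spark shell supports only Scala, Python, and R, you can write Spark applications and issue Spark SQL queries in any of the supported languages, including Java. We hope you are familiar with at least one of them.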
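
To make the language split concrete, the sketch below maps each language to the launcher script that ships in the `bin` directory of a Spark distribution. The function name `launcher_for` is hypothetical; the script names (`spark-shell`, `pyspark`, `sparkR`, `spark-submit`) are the ones bundled with Spark. Java has no interactive shell, so this sketch assumes a Java program is packaged and run through `spark-submit`.

```python
# Interactive shells bundled with a Spark distribution, keyed by language.
SHELL_COMMANDS = {
    "scala": "spark-shell",
    "python": "pyspark",
    "r": "sparkR",
}


def launcher_for(language: str) -> str:
    """Return the launcher script name for a language (hypothetical helper)."""
    key = language.strip().lower()
    if key in SHELL_COMMANDS:
        return SHELL_COMMANDS[key]
    if key == "java":
        # No Java REPL ships with Spark; applications go through spark-submit.
        return "spark-submit"
    raise ValueError(f"no Spark launcher known for {language!r}")
```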
2.1 Step 1: Downloading Spark
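Go to the Spark download page. Select a Spark release from the drop-down menu in row 1, select "Pre-built for Apache Hadoop 2.7" from the drop-down menu in row 2, and then click the "Download Spark" link in row 3, as shown in Figure 2-1.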
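This downloads the tarball spark-3.0.0-preview2-bin-hadoop2.7.tgz, which contains all the Hadoop-related binaries you need to run Spark in local mode on your laptop. If you want to install Spark on an existing HDFS or Hadoop cluster, you can select the matching Hadoop version from the drop-down menu. This book does not cover how to build Spark from source.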
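
The release and package type chosen on the download page determine the archive name. The sketch below is a hedged illustration using only the Python standard library: it rebuilds the file name quoted above and unpacks a downloaded tarball. The helper names `spark_archive_name` and `extract_archive` are hypothetical, and the code assumes the archive has already been downloaded from a trusted source.

```python
import tarfile
from pathlib import Path


def spark_archive_name(version: str, package: str = "hadoop2.7") -> str:
    """Build the archive name for a pre-built Spark package (hypothetical helper)."""
    return f"spark-{version}-bin-{package}.tgz"


def extract_archive(archive: Path, dest: Path) -> Path:
    """Unpack a .tgz into dest and return its single top-level directory."""
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        # Collect the first path component of every member to find the root.
        top_level = {
            Path(member.name).parts[0]
            for member in tar.getmembers()
            if Path(member.name).parts
        }
        # Only extract archives you trust; member paths are not vetted here.
        tar.extractall(dest)
    if len(top_level) != 1:
        raise ValueError(f"expected one top-level directory, found {sorted(top_level)}")
    return dest / top_level.pop()
```

With the tarball in the current directory, `extract_archive(Path(spark_archive_name("3.0.0-preview2")), Path.home())` would unpack it under the home directory and return the path to the unpacked Spark folder.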

Publisher Resources

ISBN: 9787115576019