Video description
In this event, we'll examine Spark SQL, a new alpha component of the Apache Spark 1.0 release. Spark SQL lets developers natively query data stored in both existing RDDs and external sources such as Apache Hive. A key feature of Spark SQL is that it blurs the line between relational tables and RDDs, making it easy for developers to intermix SQL commands that query external data with complex analytics. We'll also explore the Catalyst optimizer framework, which allows Spark SQL to automatically rewrite query plans so that they execute more efficiently.
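To give a rough sense of what "automatically rewriting a plan" can mean, here is a minimal sketch of rule-based tree rewriting in plain Python. It is a toy illustration only, not Catalyst's actual API: the node classes (`Literal`, `Column`, `Add`), the two rules, and the `transform_up` and `optimize` helpers are all hypothetical names chosen for this example. The idea it shows is the general one of applying small rewrite rules to an expression tree until nothing changes.

```python
from dataclasses import dataclass


# Toy expression nodes (hypothetical; these are not Catalyst classes).
@dataclass(frozen=True)
class Literal:
    value: int


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Add:
    left: object
    right: object


def constant_fold(node):
    """Rule: replace Add(Literal, Literal) with a single Literal."""
    if (isinstance(node, Add)
            and isinstance(node.left, Literal)
            and isinstance(node.right, Literal)):
        return Literal(node.left.value + node.right.value)
    return node


def remove_add_zero(node):
    """Rule: rewrite x + 0 or 0 + x to x."""
    if isinstance(node, Add):
        if node.right == Literal(0):
            return node.left
        if node.left == Literal(0):
            return node.right
    return node


def transform_up(node, rule):
    """Rewrite the children first, then apply the rule to the node itself."""
    if isinstance(node, Add):
        node = Add(transform_up(node.left, rule),
                   transform_up(node.right, rule))
    return rule(node)


def optimize(node, rules):
    """Apply every rule repeatedly until the tree stops changing."""
    while True:
        rewritten = node
        for rule in rules:
            rewritten = transform_up(rewritten, rule)
        if rewritten == node:
            return node
        node = rewritten


# x + (1 + 2) is rewritten so the constant part is computed only once.
plan = Add(Column("x"), Add(Literal(1), Literal(2)))
optimized = optimize(plan, [constant_fold, remove_add_zero])
print(optimized)
```

In this sketch each rule is a small function that either returns a rewritten node or returns its input unchanged, so rules can be added or reordered without touching the traversal logic. A real optimizer works on full query plans and carries many more rules; the sketch only covers integer addition.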
Product information
- Title: Performing Advanced Analytics on Relational Data with Spark SQL
- Author(s):
- Release date: July 2014
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 978149190828
You might also like
video
Querying NoSQL with SQL
An experienced C#, ASP.NET, JavaScript, and PHP developer demonstrates how SQL can be applied to NoSQL …
video
Mastering Big Data Analytics with PySpark
PySpark helps you perform data analysis at scale; it enables you to build more scalable analyses and …
book
SQL for Data Analysis
With the explosion of data, computing power, and cloud data warehouses, SQL has become an even …
video
From 0 to 1: Hive for Processing Big Data
End-to-End Hive: HQL, Partitioning, Bucketing, UDFs, Windowing, Optimization, Map Joins, Indexes. About This Video: Analytical Processing: …