© Benjamin Weissman and Enrico van de Laar 2019
B. Weissman, E. van de Laar, SQL Server Big Data Clusters, https://doi.org/10.1007/978-1-4842-5110-2_6

6. Working with Spark in Big Data Clusters

Benjamin Weissman (1) and Enrico van de Laar (2)
(1) Nurnberg, Germany
(2) Drachten, The Netherlands
 

So far, we have queried data inside our SQL Server Big Data Cluster using external tables and T-SQL code. There is, however, another way to query data stored in the HDFS filesystem of our Big Data Cluster. As you read in Chapter 2, “Big Data Cluster Architecture,” Big Data Clusters also include Spark in their architecture, which means we can use Spark to query data stored inside our Big Data Cluster.

Spark is a very ...
