2. Setting Up Spark

To develop a .NET for Apache Spark application, we need to install Apache Spark on our development machines and then configure .NET for Apache Spark so that our application executes correctly. When we run an Apache Spark application in production, we use a cluster, either a self-managed one such as a YARN cluster or a fully managed environment such as Databricks. When we develop applications, we use the same version of Apache Spark locally as we do when we run against a cluster of many machines. Having the same version on our development machines means that when we develop and test ...
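The local setup described above can be sketched as the following shell session. The Spark version, archive URL, and paths are illustrative assumptions for this sketch, not values prescribed by the book; choose the release that matches your cluster.

```shell
# Sketch of a local Apache Spark setup for .NET for Apache Spark development.
# Version numbers and paths below are illustrative assumptions.

# 1. Spark runs on the JVM, so confirm a compatible Java runtime is present.
java -version

# 2. Download and unpack an Apache Spark release (example version shown).
curl -LO https://archive.apache.org/dist/spark/spark-3.2.4/spark-3.2.4-bin-hadoop3.2.tgz
tar -xzf spark-3.2.4-bin-hadoop3.2.tgz

# 3. Point SPARK_HOME at the unpacked directory and expose its binaries.
export SPARK_HOME="$PWD/spark-3.2.4-bin-hadoop3.2"
export PATH="$SPARK_HOME/bin:$PATH"

# 4. Verify the local version matches the version your cluster runs.
spark-submit --version

# 5. Reference the Microsoft.Spark package from the application project.
dotnet add package Microsoft.Spark
```

Keeping the locally installed version in lockstep with the cluster's version is the point of step 4: mismatched Spark versions between development and production are a common source of hard-to-diagnose failures.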