In this section, we will use the Dataset API in an immutable way. We will cover the following topics:
- Dataset immutability
- Creating two leaves from one root dataset
- Adding a new column by issuing a transformation
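
The three topics above can be illustrated with a minimal sketch. It assumes Spark SQL is on the classpath and uses a hypothetical `UserRecord` case class with made-up fields as a stand-in for the book's `UserData`, whose definition is not shown here. The object name and the `processed` column are likewise illustrative only. The point it tries to show is that each transformation returns a new dataset and leaves the root untouched.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}
import org.apache.spark.sql.functions.lit

// Hypothetical stand-in for the book's UserData class (fields are assumed).
case class UserRecord(userId: String, data: String)

object ImmutableDatasetSketch {
  def main(args: Array[String]): Unit = {
    val spark: SparkSession = SparkSession
      .builder()
      .master("local[2]")
      .appName("immutable-dataset-sketch")
      .getOrCreate()
    import spark.implicits._

    // Root: toDS() turns a local Seq of case-class instances into a typed Dataset.
    val root: Dataset[UserRecord] =
      Seq(UserRecord("a", "1"), UserRecord("b", "2")).toDS()

    // Leaf 1: a typed filter returns a new Dataset; root is not modified.
    val onlyA: Dataset[UserRecord] = root.filter(_.userId == "a")

    // Leaf 2: withColumn returns a new (untyped) DataFrame with one extra column.
    val withFlag: DataFrame = root.withColumn("processed", lit(true))

    // The root still has its original rows and columns.
    assert(root.count() == 2)
    assert(root.columns.sameElements(Array("userId", "data")))

    // Each leaf reflects only its own transformation.
    assert(onlyA.count() == 1)
    assert(withFlag.columns.sameElements(Array("userId", "data", "processed")))

    spark.stop()
  }
}
```

Note that `withColumn` yields a `DataFrame` rather than a `Dataset[UserRecord]`, because the added column is not part of the case class. Mapping back to a typed dataset would need a second case class that includes the new field.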
The test case for the dataset is quite similar, but we need to call toDS() on our data to make it type safe. The element type of the dataset is UserData, as shown in the following example:
import com.tomekl007.UserData
import org.apache.spark.sql.SparkSession
import org.scalatest.FunSuite

class ImmutableDataSet extends FunSuite {
  val spark: SparkSession = SparkSession
    .builder().master("local[2]").getOrCreate()

  test("Should use immutable DF API") {
    import spark.sqlContext.implicits._
    // given
    val userData ...