July 2017
Intermediate to advanced
796 pages
18h 55m
English
Spark 2.x supports a different way of defining schema for complex data types. First, let's look at a simple example.
To use Encoders, first import them:
import org.apache.spark.sql.Encoders
Let's look at a simple example that defines a tuple as a data type for use with the Dataset API:
scala> Encoders.product[(Integer, String)].schema.printTreeString
root
 |-- _1: integer (nullable = true)
 |-- _2: string (nullable = true)
Writing out the tuple type every time is cumbersome, so we can instead define a case class that fits our needs and use that. Here we define a case class Record with two fields, an Integer and a String:
scala> case class Record(i: Integer, s: String)
defined class Record
...
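As a minimal sketch of how such a case class might be used with Encoders, the following standalone program derives a schema from Record the same way the tuple example does. It assumes Spark 2.x is on the classpath; the object name RecordSchemaSketch is hypothetical and not from the original text. The fields should be named after the case class parameters (i and s) rather than _1 and _2.

```scala
import org.apache.spark.sql.Encoders
import org.apache.spark.sql.types.{IntegerType, StringType, StructType}

// Same shape as the Record case class defined above.
case class Record(i: Integer, s: String)

// Hypothetical wrapper object, used only to make the sketch runnable.
object RecordSchemaSketch {
  def main(args: Array[String]): Unit = {
    // Derive the schema from the case class via its product encoder.
    val schema: StructType = Encoders.product[Record].schema

    // Print the schema as a tree, as in the tuple example.
    schema.printTreeString()

    // Field names come from the case class parameters.
    assert(schema.fieldNames.toSeq == Seq("i", "s"))
    assert(schema("i").dataType == IntegerType)
    assert(schema("s").dataType == StringType)
  }
}
```

Deriving the schema does not require a running SparkSession, since the encoder is built from the case class definition alone.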