July 2017
Intermediate to advanced
796 pages
18h 55m
English
User-defined functions (UDFs) define new column-based functions that extend the functionality of Spark SQL. The built-in functions provided in Spark often do not cover the exact need we have. In such cases, Apache Spark lets us create our own UDFs.
Let's go through an example of a UDF that simply converts State column values to uppercase.
First, we create the function we need in Scala.
import org.apache.spark.sql.functions._
scala> val toUpper: String => String = _.toUpperCase
toUpper: String => String = <function1>
Then, we wrap the function in udf() to create the UDF.
scala> val toUpperUDF ...
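The two steps above can be sketched end to end as a small standalone program. This is a minimal sketch rather than the book's own listing: the object name, the local SparkSession settings, the sample rows, and the StateUpper and Year column names are all assumptions made for illustration. Only the toUpper function and the State column come from the text.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object ToUpperUdfSketch {
  // Plain Scala function, as defined in the text.
  val toUpper: String => String = _.toUpperCase

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; app name and master are assumptions.
    val spark = SparkSession.builder()
      .appName("udf-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Wrap the Scala function so Spark SQL can apply it to a column.
    val toUpperUDF = udf(toUpper)

    // Hypothetical sample data with a State column.
    val df = Seq(("Alabama", 2010), ("Texas", 2011)).toDF("State", "Year")

    // Apply the UDF to State and keep the result in a new column.
    df.withColumn("StateUpper", toUpperUDF(col("State"))).show()

    spark.stop()
  }
}
```

Note that this function throws a NullPointerException on null input, so a column that may contain nulls would need a guard. For plain uppercasing, Spark also ships a built-in upper function in org.apache.spark.sql.functions; the UDF form here is useful mainly as a template for logic the built-ins do not cover.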