July 2017
Intermediate to advanced
796 pages
18h 55m
English
Skewness measures the asymmetry of the values in your data around the average or mean.
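To make the definition concrete, here is a minimal plain-Scala sketch of the population skewness, sqrt(n) * m3 / m2^1.5, where m2 and m3 are the sums of squared and cubed deviations from the mean. The object name SkewnessSketch, the function name populationSkewness, and the sample values are assumptions made for illustration; we believe Spark's aggregate uses the same population form, but treat that as an assumption rather than a guarantee.

```scala
object SkewnessSketch {
  // Population skewness: sqrt(n) * m3 / m2^1.5, where m2 and m3 are the
  // sums of squared and cubed deviations from the mean.
  def populationSkewness(xs: Seq[Double]): Double = {
    require(xs.nonEmpty, "need at least one value")
    val n = xs.size.toDouble
    val mean = xs.sum / n
    val m2 = xs.map(x => math.pow(x - mean, 2)).sum
    val m3 = xs.map(x => math.pow(x - mean, 3)).sum
    // With zero variance the ratio is undefined; returning NaN is a choice
    // made for this sketch.
    if (m2 == 0.0) Double.NaN else math.sqrt(n) * m3 / math.pow(m2, 1.5)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample values, chosen only to show the sign of the result.
    val symmetric   = Seq(1.0, 2.0, 3.0, 4.0, 5.0)   // balanced around the mean
    val rightTailed = Seq(1.0, 1.0, 1.0, 1.0, 10.0)  // one large value on the right
    println(populationSkewness(symmetric))
    println(populationSkewness(rightTailed))
  }
}
```

A symmetric sample gives a skewness of zero, while a sample with a long right tail gives a positive value, which is the kind of result you would expect for population figures dominated by a few very large states.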
The skewness API has two overloads, as follows. Which one you use depends on whether you refer to the column by name or pass a Column expression.
def skewness(columnName: String): Column
Aggregate function: returns the skewness of the values in a group.

def skewness(e: Column): Column
Aggregate function: returns the skewness of the values in a group.
Let's look at an example of invoking skewness on the Population column of the DataFrame:
import org.apache.spark.sql.functions._

scala> statesPopulationDF.select(skewness("Population")).show
+--------------------+
|skewness(Population)|
+--------------------+
|  2.5675329049100024|
+--------------------+
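The Column overload is called the same way and, because skewness is an aggregate function, it can also be used per group. The following is a small sketch under stated assumptions: the local SparkSession, the object name SkewnessOverloads, the State and Year columns, and the toy rows are all hypothetical and are not the dataset used above.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, skewness}

object SkewnessOverloads {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("skewness-overloads")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical toy rows; these values are made up for the sketch.
    val df = Seq(
      ("A", 2010, 100L), ("A", 2011, 110L), ("A", 2012, 400L),
      ("B", 2010, 50L),  ("B", 2011, 55L),  ("B", 2012, 60L)
    ).toDF("State", "Year", "Population")

    // String overload: the column is referenced by name.
    df.select(skewness("Population")).show()

    // Column overload: accepts any Column expression and can be aliased.
    df.select(skewness(col("Population")).alias("skew")).show()

    // As an aggregate, it also works per group.
    df.groupBy("State").agg(skewness(col("Population")).alias("skew")).show()

    spark.stop()
  }
}
```

The Column overload is the more flexible of the two, since it accepts derived expressions as well as plain column references.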