July 2017
Intermediate to advanced
796 pages
18h 55m
English
The average of the values is calculated by adding the values and dividing by the number of values.
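As a minimal sketch of that definition in plain Scala, independent of Spark: the object name MeanExample, the helper mean, and the sample values are hypothetical, chosen only for illustration.

```scala
object MeanExample {
  // Arithmetic mean: the sum of the values divided by the number of values.
  // Returns None for an empty input, where the mean is undefined.
  def mean(values: Seq[Double]): Option[Double] =
    if (values.isEmpty) None else Some(values.sum / values.size)

  def main(args: Array[String]): Unit = {
    val sample = Seq(2.0, 4.0, 9.0) // hypothetical values
    println(mean(sample))           // prints Some(5.0)
  }
}
```

Returning an Option makes the empty case explicit instead of producing NaN from a division by zero.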
The avg API has two overloads, shown below. Which one you use depends on whether you refer to the column by name or by a Column expression:
def avg(columnName: String): Column
Aggregate function: returns the average of the values in a group.

def avg(e: Column): Column
Aggregate function: returns the average of the values in a group.
Let's look at an example of invoking avg on the DataFrame to print the average population:
import org.apache.spark.sql.functions._

scala> statesPopulationDF.select(avg("Population")).show
+-----------------+
|  avg(Population)|
+-----------------+
|6253399.371428572|
+-----------------+
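The call above uses the String overload. The following sketch shows the Column overload, both over the whole DataFrame and per group. It assumes a local SparkSession and builds a small stand-in DataFrame; the object name, the helper overallAverage, the State column, and the sample rows are all hypothetical and are not the statesPopulationDF data used in the text.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, col}

object AvgColumnExample {
  // Computes the overall average of a hypothetical Population column
  // using the avg(e: Column) overload.
  def overallAverage(spark: SparkSession): Double = {
    import spark.implicits._
    // Hypothetical rows standing in for real population data.
    val df = Seq(("A", 100L), ("A", 300L), ("B", 50L)).toDF("State", "Population")

    // The same overload also works inside groupBy(...).agg(...),
    // yielding one average per group.
    df.groupBy("State").agg(avg(col("Population")).alias("avg_population")).show()

    // avg always yields a double: (100 + 300 + 50) / 3 = 150.0
    df.select(avg(col("Population"))).first().getDouble(0)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("AvgColumnExample")
      .master("local[*]") // assumption: run locally for illustration
      .getOrCreate()
    println(overallAverage(spark))
    spark.stop()
  }
}
```

Passing a Column rather than a name is useful when the input to avg is itself an expression, such as a cast or an arithmetic combination of columns.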