July 2017
Intermediate to advanced
796 pages
18h 55m
English
The sum function computes the total of all values in a column. To add up only the distinct values, use sumDistinct instead.
The sum API has several overloads, listed below. Which one to use depends on whether you pass a column name or a Column expression:
def sum(columnName: String): Column
Aggregate function: returns the sum of all values in the given column.

def sum(e: Column): Column
Aggregate function: returns the sum of all values in the expression.

def sumDistinct(columnName: String): Column
Aggregate function: returns the sum of distinct values in the expression.

def sumDistinct(e: Column): Column
Aggregate function: returns the sum of distinct values in the expression.
Let's look at an example of invoking sum on the DataFrame to print the total of the Population column.
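A minimal sketch of such a call follows. It assumes a local SparkSession and a small in-memory DataFrame; the name statesPopulationDF, the State and Year columns, and the sample rows are hypothetical and used only for illustration. Only the Population column name comes from the text above. Note that newer Spark releases (3.2 and later) deprecate sumDistinct in favor of sum_distinct, so the import may need adjusting depending on your version.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, sum, sumDistinct}

object SumExample {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumption, not from the text).
    val spark = SparkSession.builder()
      .appName("SumExample")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample rows; replace with your own data source.
    val statesPopulationDF = Seq(
      ("StateA", 2010, 100L),
      ("StateA", 2011, 100L),
      ("StateB", 2010, 250L)
    ).toDF("State", "Year", "Population")

    // sum(columnName: String): total of every value in the column.
    // For the sample rows: 100 + 100 + 250 = 450.
    statesPopulationDF.select(sum("Population")).show()

    // sum(e: Column): same total, but accepts any Column expression.
    statesPopulationDF.select(sum(col("Population"))).show()

    // sumDistinct(columnName: String): each distinct value counted once.
    // For the sample rows: 100 + 250 = 350.
    statesPopulationDF.select(sumDistinct("Population")).show()

    spark.stop()
  }
}
```

The String overload is the shortest form when you only need a column by name. The Column overload is useful when the value being summed is itself an expression, such as col("Population") * 2.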