July 2017
Intermediate to advanced
796 pages
18h 55m
English
Rollup is a multi-dimensional aggregation used to perform hierarchical or nested calculations. For example, if we want to show the number of records for each State+Year group, as well as for each State (aggregating over all years to give a subtotal for each State irrespective of the Year), we can use rollup as follows:
scala> statesPopulationDF.rollup("State", "Year").count.show(5)
+------------+----+-----+
|       State|Year|count|
+------------+----+-----+
|South Dakota|2010|    1|
|    New York|2012|    1|
|  California|2014|    1|
|     Wyoming|2014|    1|
|      Hawaii|null|    7|
+------------+----+-----+
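The rollup computes the count for each State and Year combination, such as California+2014, and also a subtotal for each State across all years. Subtotal rows carry null in the Year column, as in the Hawaii row above, which counts 7 records over all years. A final row with null in both columns holds the grand total over all records, although it is not among the five rows shown.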
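To make the three aggregation levels concrete, here is a minimal sketch that emulates what rollup("State", "Year").count produces, using plain Scala collections rather than Spark. The object name RollupSketch, the function rollupCount, and the sample rows are hypothetical and chosen only for illustration. None stands in for the null that Spark prints in a rolled-up column. This is not how Spark implements rollup internally.

```scala
object RollupSketch {
  // Hypothetical sample rows of (State, Year); not the book's dataset.
  val records: Seq[(String, Int)] = Seq(
    ("California", 2014),
    ("California", 2015),
    ("Hawaii", 2014),
    ("Hawaii", 2015),
    ("Hawaii", 2016)
  )

  // Emulates rollup("State", "Year").count.
  // A None in the key marks a column that has been rolled up (null in Spark).
  def rollupCount(rows: Seq[(String, Int)]): Map[(Option[String], Option[Int]), Long] = {
    // Level 1: one count per (State, Year) pair.
    val byStateYear = rows.groupBy(identity).map { case ((state, year), group) =>
      (Option(state), Option(year)) -> group.size.toLong
    }
    // Level 2: one subtotal per State, with Year rolled up.
    val byState = rows.groupBy(_._1).map { case (state, group) =>
      (Option(state), Option.empty[Int]) -> group.size.toLong
    }
    // Level 3: the grand total, with both columns rolled up.
    val grandTotal = Map((Option.empty[String], Option.empty[Int]) -> rows.size.toLong)

    byStateYear ++ byState ++ grandTotal
  }

  def main(args: Array[String]): Unit =
    rollupCount(records).foreach(println)
}
```

Note that the hierarchy follows the column order: rolling up ("State", "Year") yields State subtotals but no per-Year subtotals across states.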