July 2017
Intermediate to advanced
796 pages
18h 55m
English
Cube is a multi-dimensional aggregation used to perform hierarchical or nested calculations, just like rollup, with the difference that cube performs the same operation across all dimensions. For example, suppose we want to show the number of records for each State and Year group, as well as for each State (aggregating over all Years to give a total for each State irrespective of the Year). Rollup gives us exactly that. Cube additionally shows a total for each Year (irrespective of the State), as follows:
scala> statesPopulationDF.cube("State", "Year").count.show(5)
+------------+----+-----+
|       State|Year|count|
+------------+----+-----+
|South Dakota|2010|    1|
|    New York|2012|    1|
|        null|2014|   50|
|     Wyoming|2014|    1|
|      Hawaii|null|    7|
+------------+----+-----+
...
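To make the difference between the two operators concrete, here is a minimal sketch that runs both on the same data and counts the result rows. It assumes a local SparkSession and a small made-up DataFrame (two states, two years, one record each) rather than the book's statesPopulationDF; the object name `CubeVsRollup` and the helper `rowCounts` are hypothetical names chosen for illustration.

```scala
import org.apache.spark.sql.SparkSession

object CubeVsRollup {
  // Returns (rollup row count, cube row count, cube rows holding per-Year totals).
  def rowCounts(spark: SparkSession): (Long, Long, Long) = {
    import spark.implicits._

    // Hypothetical sample data, not the book's dataset.
    val df = Seq(
      ("Alabama", 2010), ("Alabama", 2011),
      ("Alaska", 2010), ("Alaska", 2011)
    ).toDF("State", "Year")

    // rollup: (State, Year) groups, per-State totals, and one overall total.
    val rolled = df.rollup("State", "Year").count()

    // cube: everything rollup produces, plus per-Year totals.
    val cubed = df.cube("State", "Year").count()

    // Per-Year totals are the rows where State is null but Year is not.
    val perYear = cubed.filter($"State".isNull && $"Year".isNotNull)

    rolled.orderBy("State", "Year").show()
    cubed.orderBy("State", "Year").show()

    (rolled.count(), cubed.count(), perYear.count())
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: local mode, for experimentation only
      .appName("CubeVsRollup")
      .getOrCreate()
    try println(rowCounts(spark)) finally spark.stop()
  }
}
```

With this sample data, rollup yields 7 rows (4 State and Year groups, 2 per-State totals, 1 overall total), while cube yields 9, the 2 extra rows being the per-Year totals in which State is null.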