July 2017
Intermediate to advanced
796 pages
18h 55m
English
combineByKey is very similar to aggregateByKey; in fact, combineByKey internally invokes combineByKeyWithClassTag, which is also invoked by aggregateByKey. Like aggregateByKey, combineByKey works by applying an operation within each partition and then between the resulting combiners.
combineByKey turns an RDD[(K, V)] into an RDD[(K, C)], where C is a combined type, such as a list of Vs, collected or combined under the same key K.
There are three functions expected when you call combineByKey.
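To make the two phases concrete, here is a minimal sketch in plain Scala that imitates what combineByKey does, without depending on Spark. The function names createCombiner, mergeValue, and mergeCombiners follow the parameter names in Spark's combineByKey signature; the helper simulateCombineByKey, the object name, and the sample data are hypothetical and exist only for illustration. Partitions are modelled as ordinary sequences, which is an assumption of this sketch and not how Spark stores data.

```scala
// Plain-Scala simulation of combineByKey semantics; no Spark dependency.
object CombineByKeySketch {
  // Hypothetical helper, not part of Spark: applies the three functions
  // to data that has already been split into "partitions".
  def simulateCombineByKey[K, V, C](
      partitions: Seq[Seq[(K, V)]],
      createCombiner: V => C,
      mergeValue: (C, V) => C,
      mergeCombiners: (C, C) => C
  ): Map[K, C] = {
    // Phase 1: inside each partition, build one combiner per key.
    val perPartition: Seq[Map[K, C]] = partitions.map { part =>
      part.foldLeft(Map.empty[K, C]) { case (acc, (k, v)) =>
        acc.get(k) match {
          case Some(c) => acc.updated(k, mergeValue(c, v)) // key seen before in this partition
          case None    => acc.updated(k, createCombiner(v)) // first value for this key here
        }
      }
    }
    // Phase 2: merge the per-partition combiners for each key.
    perPartition.foldLeft(Map.empty[K, C]) { (acc, m) =>
      m.foldLeft(acc) { case (a, (k, c)) =>
        a.get(k) match {
          case Some(existing) => a.updated(k, mergeCombiners(existing, c))
          case None           => a.updated(k, c)
        }
      }
    }
  }

  // Sample data (made up): two partitions of (String, Int) pairs.
  val partitions: Seq[Seq[(String, Int)]] = Seq(
    Seq("a" -> 1, "b" -> 2, "a" -> 3),
    Seq("a" -> 4, "b" -> 5)
  )

  // Here C = List[Int]: collect every value seen for a key into a list.
  val combined: Map[String, List[Int]] =
    simulateCombineByKey[String, Int, List[Int]](
      partitions,
      (v: Int) => List(v),                        // createCombiner
      (c: List[Int], v: Int) => c :+ v,           // mergeValue
      (c1: List[Int], c2: List[Int]) => c1 ++ c2  // mergeCombiners
    )
}
```

On a real pair RDD, the same three functions would be passed to combineByKey in the same order. Unlike this sketch, Spark does not guarantee the order in which partition results are merged, so the order of elements inside each list should not be relied upon.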