November 2017
Beginner to intermediate
290 pages
7h 34m
English
The top hashtags computation is essentially a windowed word count that outputs only the n largest counts for each window. As shown in the DAG figure, the operation is split into countByKey and topN, and both transforms use existing operators and accumulations. countByKey is implemented as a key-based window operator with SumLong accumulation:
KeyedWindowedOperatorImpl<String, Long, MutableLong, Long> countByKey = new KeyedWindowedOperatorImpl<>();
countByKey.setAccumulation(new SumLong());
countByKey.setDataStorage(new InMemoryWindowedKeyedStorage<String, MutableLong>());
countByKey.setWindowOption(new WindowOption.TimeWindows(Duration.standardMinutes(5)));
countByKey.setWindowStateStorage(new InMemoryWindowedStorage<WindowState>());
...
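To make the two-step split concrete, the following is a minimal sketch in plain Java of what countByKey and topN compute. It does not use the windowed operator library shown above. The class TopHashtagsSketch, the Event type, and both method signatures are hypothetical names introduced only for illustration. The sketch assumes fixed-size, non-overlapping time windows and breaks ties between equal counts alphabetically, which is a choice made here and not something the original operators are known to do.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of any library.
final class TopHashtagsSketch {

    // One hashtag occurrence with its event time in milliseconds (hypothetical type).
    static final class Event {
        final String hashtag;
        final long timestampMillis;

        Event(String hashtag, long timestampMillis) {
            this.hashtag = hashtag;
            this.timestampMillis = timestampMillis;
        }
    }

    // Step 1 (countByKey): assign each event to a fixed-size time window
    // and sum the occurrences of each hashtag within that window.
    // The result maps window start time -> (hashtag -> count).
    static Map<Long, Map<String, Long>> countByKey(List<Event> events, long windowMillis) {
        Map<Long, Map<String, Long>> counts = new TreeMap<>();
        for (Event e : events) {
            long windowStart = Math.floorDiv(e.timestampMillis, windowMillis) * windowMillis;
            counts.computeIfAbsent(windowStart, w -> new HashMap<>())
                  .merge(e.hashtag, 1L, Long::sum);
        }
        return counts;
    }

    // Step 2 (topN): keep only the n largest counts of a single window.
    // Ties are broken alphabetically by hashtag (an assumption of this sketch).
    static List<Map.Entry<String, Long>> topN(Map<String, Long> windowCounts, int n) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(windowCounts.entrySet());
        entries.sort(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                .thenComparing(Map.Entry.<String, Long>comparingByKey()));
        return new ArrayList<>(entries.subList(0, Math.min(n, entries.size())));
    }
}
```

With a window size of 300,000 milliseconds (five minutes, matching the window option above), calling countByKey on a list of events and then topN on each window's counts yields the n most frequent hashtags per window. The sketch keeps all state in ordinary in-memory maps and ignores late data and window state, which the real operators handle through their storage and window-state settings.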