Index
A
addAccumulator method
addInPlace method
Alternating least squares (ALS)
awaitTermination() method
B
Batch processing
Big Data systems, Spark
acyclic graph
canonical word-count
MapReduce programming model
Samza messages
sensor network
SQL to NoSQL
stream-processing system
Web 2.0 applications
local execution
.sbt file
standalone cluster mode
YARN
C
cache() function
Call data record (CDR)
Case-class method
Cassandra Query Language (CQL)
ChiSqSelector
Chi-square selection
Clickstream Dataset
Collaborative filtering
compute() method
createCombiner function
Custom receiver
HttpInputDStream
receiver interface method
D
Data frame
avoid shuffling
cache aggressively
MLlib
persistence
query transformation
action
aggregation expression
cube operation
DataFrameNaFunctions
DataFrameStatFunctions ...