Index
A
addAccumulator method
addInPlace method
Alternating least squares (ALS)
awaitTermination() method
B
Batch processing
Big Data systems, Spark
acyclic graph
canonical word-count
MapReduce programming model
Samza messages
sensor network
SQL to NoSQL
stream-processing system
Web 2.0 applications
local execution
.sbt file
standalone cluster mode
YARN
C
cache() function
Call data record (CDR)
Case-class method
Cassandra Query Language (CQL)
ChiSqSelector
Chi-square selection
Clickstream Dataset
Collaborative filtering
compute() method
createCombiner function
Custom receiver
HttpInputDStream
receiver interface method
D
Data frame
avoid shuffling
cache aggressively
MLlib
persistence
query transformation
action
aggregation expression
cube operation
DataFrameNaFunctions
DataFrameStatFunctions ...