Index
A
- actions, Bringing Data from the Cluster to the Client
- ADAM (genomics tool set)
  - ingesting data with ADAM CLI, Ingesting Genomics Data with the ADAM CLI-Parquet Format and Columnar Storage
  - Parquet format and columnar storage, Parquet Format and Columnar Storage
- aggregations, computing with DataFrame, Analyzing Data with the DataFrame API
- alternating least squares (ALS) algorithm, The Alternating Least Squares Recommender Algorithm
- anomaly detection
  - about, Anomaly Detection
  - categorical variables, Categorical Variables
  - choosing K, Choosing k
  - clustering of full normalized data set, Clustering in Action
  - data visualization with SparkR, Visualization with SparkR
  - defined for K-means clustering, Clustering in Action
  - feature normalization, Feature Normalization
  - initial clustering attempt, A First Take on Clustering
  - K-means clustering for, Network Intrusion-Where to Go from Here
  - KDD Cup 1999 data set, KDD Cup 1999 Data Set
  - network intrusion, Network Intrusion
  - using labels with entropy, Using Labels with Entropy
- Apache Maven, Getting Started: The Spark Shell and SparkContext
- area under the curve (AUC)
  - computing for recommendation engine, Computing AUC
  - defined, Evaluating Recommendation Quality
- Audioscrobbler data set, Data Set
- average path length, Computing Average Path Length with Pregel
- Avro, Decoupling Storage from ...