Index
A
- actions, Bringing Data from the Cluster to the Client
- ADAM (genomics tool set)
  - ingesting data with ADAM CLI, Ingesting Genomics Data with the ADAM CLI-Parquet Format and Columnar Storage
  - Parquet format and columnar storage, Parquet Format and Columnar Storage
- aggregations, computing with DataFrame, Analyzing Data with the DataFrame API
- alternating least squares (ALS) algorithm, The Alternating Least Squares Recommender Algorithm
- anomaly detection
  - about, Anomaly Detection
  - categorical variables, Categorical Variables
  - choosing K, Choosing k
  - clustering of full normalized data set, Clustering in Action
  - data visualization with SparkR, Visualization with SparkR
  - defined for K-means clustering, Clustering in Action
  - feature normalization, Feature Normalization
  - initial clustering attempt, A First Take on Clustering
  - K-means clustering for, Network Intrusion-Where to Go from Here
  - KDD Cup 1999 data set, KDD Cup 1999 Data Set
  - network intrusion, Network Intrusion
  - using labels with entropy, Using Labels with Entropy
- Apache Maven, Getting Started: The Spark Shell and SparkContext
- area under the curve (AUC)
  - computing for recommendation engine, Computing AUC
  - defined, Evaluating Recommendation Quality
- Audioscrobbler data set, Data Set
- average path length, Computing Average Path Length with Pregel
- Avro, Decoupling Storage from ...