Index
A
adduser command
allBigData
Apache Hadoop
Apache Hive
    code execution flow
    HiveQL commands
    RDBMS
    SQL
Apache Kafka
    broker
    consumer
    development
    message flow
    producer
Apache Mesos cluster manager
Apache Pig
Apache Spark
    description
    GraphFrames
    MLlib
    Resilient Distributed Datasets
Apache Storm
Apache Tez
Atomicity, Consistency, Isolation and Durability (ACID) principles
B
Big Data
    Apache Hadoop
    Apache Storm
    Apache Tez
    variety
    velocity
    veracity
    volume
Big Data frameworks
Breadth first algorithm
    API
    path-finding algorithms
C
Cassandra
Cassandra installation
Catalyst optimizer
Cluster By clause
Cluster managers
    Apache Mesos
    distributed system
    standalone
    YARN
Comma-separated value (CSV) file
    DataFrame
    header argument
    inferSchema argument
    reading
    spark.read.csv() function
    swimmerData.csv
Console sink ...