Mastering Apache Cassandra 3.x - Third Edition
by Aaron Ploetz, Tejaswi Malepati
October 2018
Content level: Beginner to intermediate
348 pages
10h
English
Packt Publishing
Content preview from Mastering Apache Cassandra 3.x - Third Edition

PySpark

PySpark is an interactive CLI bundled with Spark that offers a Pythonic way to process large amounts of data, whether from a single source or aggregated from multiple sources. It is one of the most widely used CLIs for data interaction and has a large community, owing to how simple it makes developing data-processing applications from five different sources. Developing in Python can achieve this more efficiently and with less effort than developing in Scala, R, or Java.

PySpark is located in the bin directory of the binary installation. It can be run directly in local or pseudo mode, where all of the resources of a single instance are available to it. But as PySpark is an application CLI for Spark, there ...
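As a rough illustration of launching the shell from the bin directory in local mode, the sketch below only builds the command line; it does not start Spark. The helper name `pyspark_command` and the install path `/opt/spark` are hypothetical and chosen for illustration, while `--master` and `local[*]` (use all local cores) are standard Spark launcher options.

```python
import os
import shlex


def pyspark_command(spark_home, master="local[*]"):
    """Build the argument list for launching the PySpark shell.

    spark_home is the root of a Spark binary installation (assumed path);
    master selects the run mode, with "local[*]" meaning local mode using
    every core available on the instance.
    """
    launcher = os.path.join(spark_home, "bin", "pyspark")
    return [launcher, "--master", master]


# Hypothetical install location; adjust to wherever Spark is unpacked.
cmd = pyspark_command("/opt/spark")

# shlex.join quotes the master URL so a shell does not try to expand
# the brackets and asterisk in "local[*]" as a glob pattern.
print(shlex.join(cmd))
```

Passing a value such as `local[2]` instead limits the shell to two worker threads, which can be useful when the instance is shared with other processes.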


ISBN: 9781789131499