March 2019
Beginner to intermediate
182 pages
4h 6m
English
We will go back into our Spark folder, which is spark-2.3.2-bin-hadoop2.7, and start our PySpark binary by typing .\bin\pyspark.
We can see that we've started a shell session with Spark in the following screenshot:

Spark is now available to us through the spark variable. Let's try something simple in Spark. The first thing to do is to load a file. Every Spark installation includes a README.md Markdown file, so let's load it as follows:
text_file = spark.read.text("README.md")
If we use spark.read.text and then put in README.md, we get a few warnings, but we shouldn't be too concerned about that at the moment, ...
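To make the loading step concrete, here is a minimal sketch of how we might inspect the result. The helper name summarize_text_file is hypothetical and not part of Spark. It assumes we pass in the spark variable that the shell provides, and that the shell was started from the Spark folder so that the relative path README.md resolves.

```python
def summarize_text_file(spark, path="README.md"):
    """Load a text file through the given Spark session and report basic facts.

    `spark` is assumed to be the session object the PySpark shell provides.
    This helper is an illustrative sketch, not part of the Spark API.
    """
    # spark.read.text returns a DataFrame with one row per line of the file,
    # stored in a single string column named "value".
    text_file = spark.read.text(path)

    # first() returns None for an empty file, so guard before indexing.
    first_row = text_file.first()
    first_line = first_row["value"] if first_row is not None else None

    return {"lines": text_file.count(), "first": first_line}
```

Inside the shell, we could call summarize_text_file(spark) to get the number of lines in README.md together with its first line.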