September 2018
Intermediate to advanced
472 pages
12h 2m
English
Installing Apache Spark in full from scratch is not an easy task. Spark usually runs on a cluster of computers, often hosted in the cloud, and its setup is delegated to specialists in the technology (namely, data engineers). This can be a limitation, because you may not have access to an environment in which to test what you will learn in this chapter.
However, testing the contents of this chapter does not require a complex installation. By using Docker (https://www.docker.com/), you can have access to an installation of Spark, together with a Jupyter notebook and PySpark, on a Linux server running on your own computer (it does not matter if it is a ...