Another app: Apache Spark

Now that we have some practice with services, let's step up to the next level. We'll deploy Apache Spark on Swarm. Spark is an open source cluster computing framework from the Apache Software Foundation, used mainly for large-scale data processing.

Spark can be used for (among other things) tasks such as the following; a minimal job-submission sketch appears after the list:

  • Analysis of big data (Spark Core)
  • Fast, scalable processing of structured data (Spark SQL)
  • Streaming analytics (Spark Streaming)
  • Graph processing (Spark GraphX)
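
To give a flavor of what running a Spark job looks like, here is a minimal sketch that submits the SparkPi example bundled with every Spark distribution. It assumes a local Spark installation with SPARK_HOME set; the examples jar path and the Scala/Spark versions in its filename vary between distributions and are illustrative only:

    # Submit the bundled SparkPi example to a local in-process "cluster",
    # using all available cores; 100 is the number of partitions to compute.
    $SPARK_HOME/bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master 'local[*]' \
      $SPARK_HOME/examples/jars/spark-examples_2.12-3.3.0.jar 100

The same command, pointed at a real master URL instead of local[*], submits the job to a full cluster, which is exactly what we will set up on Swarm.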

Here we will focus mainly on the infrastructural side: running Spark on top of Swarm. If you want to learn how to program with Spark or use it in depth, have a look at Packt's selection of books on Spark. We suggest starting with Fast Data Processing with Spark 2.0 - Third Edition.
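
Before diving in, here is a minimal sketch of what a Spark standalone cluster can look like when expressed as Swarm services. The overlay network, the service names, the bitnami/spark image, and its SPARK_MODE/SPARK_MASTER_URL environment variables are assumptions for illustration, not necessarily the exact setup we will build in this chapter:

    # Overlay network so the Spark services can reach each other by name
    docker network create --driver overlay spark-net

    # One master service, publishing the web UI (8080) and RPC port (7077)
    docker service create --name spark-master --network spark-net \
      -p 8080:8080 -p 7077:7077 \
      -e SPARK_MODE=master \
      bitnami/spark

    # Worker services that find the master through Swarm's built-in DNS
    docker service create --name spark-worker --network spark-net \
      --replicas 3 \
      -e SPARK_MODE=worker \
      -e SPARK_MASTER_URL=spark://spark-master:7077 \
      bitnami/spark

The appeal of Swarm here is that scaling the compute layer becomes a one-liner, for example docker service scale spark-worker=10.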

Spark is a neat and clear alternative ...
