Learning Real-time Processing with Spark Streaming by Sumit Gupta

Chapter 1. Installing and Configuring Spark and Spark Streaming

Apache Spark (http://spark.apache.org/) is a general-purpose, open-source cluster computing framework originally developed at UC Berkeley's AMPLab in 2009.

The emergence of Spark has not only opened new data processing possibilities for a variety of business use cases but has also introduced a unified platform for performing batch and real-time operations with a common framework. Depending on user and business needs, data can be consumed or processed every second (or even more frequently) or perhaps only once a day, which covers the range of processing intervals that enterprises need.

Spark, being a general-purpose distributed processing framework, enables Rapid Application Development (RAD) and at the same ...
