Spark for Python Developers by Amit Nandi

Connecting to social networks

Let's delve into the first steps of the integration layer of the data-intensive app architecture. We focus on harvesting the data, ensuring its integrity, and preparing it for batch and streaming processing by Spark at the next stage. This phase consists of five process steps: connect, correct, collect, compose, and consume. These are iterative steps of data exploration that acquaint us with the data and help us refine its structure for further processing.
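To make the five steps more concrete, here is a minimal sketch in plain Python that chains them over a few in-memory records. The text has not yet defined what each step does, so the behavior given to each function below is an assumption made for illustration. All function names and the sample records are hypothetical, and no real social network API or Spark call is involved.

```python
import json


def connect():
    """Connect: stand-in for calling a social network API (hypothetical data)."""
    return [
        {"id": 1, "user": "alice", "text": "  Learning Spark  "},
        {"id": 2, "user": "bob", "text": None},                    # missing text
        {"id": 1, "user": "alice", "text": "  Learning Spark  "},  # duplicate id
    ]


def correct(records):
    """Correct: drop records with missing text or a repeated id, trim whitespace."""
    seen = set()
    cleaned = []
    for rec in records:
        if rec["text"] is None or rec["id"] in seen:
            continue
        seen.add(rec["id"])
        cleaned.append({**rec, "text": rec["text"].strip()})
    return cleaned


def collect(records):
    """Collect: serialize to JSON lines, a stand-in for persisting the data."""
    return [json.dumps(rec, sort_keys=True) for rec in records]


def compose(lines):
    """Compose: parse the stored lines and keep only the fields needed downstream."""
    return [{"user": r["user"], "text": r["text"]} for r in map(json.loads, lines)]


def consume(rows):
    """Consume: hand rows to the next stage; here, count words per user."""
    return {row["user"]: len(row["text"].split()) for row in rows}


result = consume(compose(collect(correct(connect()))))
print(result)  # {'alice': 2}
```

Because each step takes the previous step's output as its input, any one of them can be rerun or refined on its own, which is one way to read the iterative nature of the process.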

The following diagram depicts the iterative process of data acquisition and refinement for consumption:

[Diagram: Connecting to social networks]

We connect to the social networks ...
