Connecting to social networks
Let's delve into the first steps of the integration layer of the data-intensive app architecture. We are going to focus on harvesting the data, ensuring its integrity, and preparing it for batch and streaming processing by Spark at the next stage. This phase is described by five process steps: connect, correct, collect, compose, and consume. These are iterative steps of data exploration that will get us acquainted with the data and help us refine the data structure for further processing.
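To make the five steps a little more concrete, here is a minimal sketch of how they might chain together in plain Python. Everything in it is an assumption made for illustration: the sample records, the function names, and in particular what each step does (for example, treating "correct" as dropping empty records and "collect" as deduplicating by id). No real social network API is called.

```python
import json

# Hypothetical raw records standing in for an API response.
RAW_RECORDS = [
    {"id": 1, "user": "alice", "text": "  Spark is fast  "},
    {"id": 2, "user": "bob", "text": None},                    # missing text
    {"id": 1, "user": "alice", "text": "  Spark is fast  "},  # duplicate
]


def connect():
    """Connect: stand-in for authenticating and fetching from an API."""
    return list(RAW_RECORDS)


def correct(records):
    """Correct: drop records with no text and trim surrounding whitespace."""
    return [dict(r, text=r["text"].strip()) for r in records if r.get("text")]


def collect(records):
    """Collect: keep the first record seen for each id."""
    seen = {}
    for r in records:
        seen.setdefault(r["id"], r)
    return list(seen.values())


def compose(records):
    """Compose: serialize each record as one JSON line."""
    return [json.dumps(r, sort_keys=True) for r in records]


def consume(lines):
    """Consume: parse the JSON lines back, as a downstream stage might."""
    return [json.loads(line) for line in lines]


def run_pipeline():
    """Run the five steps in order on the sample data."""
    return consume(compose(collect(correct(connect()))))


if __name__ == "__main__":
    for record in run_pipeline():
        print(record)
```

Because the process is iterative, each function can be revisited and tightened as we learn more about the data, without changing how the steps are chained.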
The following diagram depicts the iterative process of data acquisition and refinement for consumption:
We connect to the social networks ...