Learning Real-time Processing with Spark Streaming by Sumit Gupta

Chapter 5. Persisting Log Analysis Data

Systems like Hadoop and Spark have reduced the overall cost of solutions and infrastructure. They have revolutionized the industry with low-cost yet robust and stable frameworks that are scalable, extensible, and customizable, and that can process a variety of data formats.

There are many enterprise products for performing ETL (Extraction, Transformation, and Loading) operations, but when we have to deal with huge volumes of data, varied data sources, or reports that must be produced within an hour, these tools fall short. As a result, architects and developers have been evaluating and implementing Hadoop- and Spark-like systems purely for ETL use cases, where the source or raw data is processed or transformed and finally stored ...
