Learning Real-time Processing with Spark Streaming by Sumit Gupta

Chapter 4. Applying Transformations to Streaming Data

Data is of no use if it cannot be transformed and no meaningful analysis can be derived from the overall process!

Data analysis is a multi-step process that includes inspecting, transforming, and finally modeling data, with the goal of discovering useful information that can then be applied in arriving at critical business decisions.

In simple terms, transformation is a process in which a series of rules or functions is applied to extracted data so that it can be loaded into the end target for further analysis.
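
The idea of applying a series of rules to extracted data can be sketched in a few lines. This is a minimal illustration in plain Python, not Spark Streaming code; the rule functions (`trim`, `to_fields`, `to_record`), the `transform` helper, and the sample input are all hypothetical names and values chosen for this example.

```python
from typing import Any, Callable, Iterable, List

# Hypothetical rules: each one takes a value and returns the transformed value.
def trim(line: str) -> str:
    """Remove leading and trailing whitespace from a raw line."""
    return line.strip()

def to_fields(line: str) -> List[str]:
    """Split a comma-separated line into trimmed fields."""
    return [field.strip() for field in line.split(",")]

def to_record(fields: List[str]) -> dict:
    """Build a structured record from the fields (assumed layout: user, amount)."""
    return {"user": fields[0].lower(), "amount": float(fields[1])}

def transform(extracted: Iterable[Any], rules: List[Callable[[Any], Any]]) -> List[Any]:
    """Apply every rule, in order, to each extracted item."""
    loaded = []
    for item in extracted:
        for rule in rules:
            item = rule(item)
        loaded.append(item)
    return loaded

# Made-up sample input standing in for extracted data.
extracted = ["  Alice, 10.5 ", "BOB,3"]
loaded = transform(extracted, [trim, to_fields, to_record])
print(loaded)
```

Keeping each rule as a small, independent function makes it easy to reorder, add, or drop steps without touching the rest of the pipeline.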

An important activity in transformation is data cleansing, which aims to prepare only valid and relevant data that the system can analyze and interpret. ...
