Chapter 11. Data Transformation Service

So far, in the build phase, we have finalized the methodology to handle polyglot data models and the query processing required to implement the insight logic. In this chapter, we dig deeper into the implementation of business logic, which traditionally follows the Extract-Transform-Load (ETL) or Extract-Load-Transform (ELT) pattern.
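To make the distinction between the two patterns concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for a data store. The order records, table names, and the cents-to-dollars conversion are hypothetical and chosen only for illustration. In the ETL path the transformation runs in application code before loading; in the ELT path the raw data is loaded first and the transformation runs as SQL inside the store.

```python
import sqlite3

# Hypothetical raw records: (order_id, amount_in_cents). Illustrative only.
RAW_ORDERS = [(1, 1250), (2, 399), (3, 10000)]


def etl(conn, rows):
    """Extract-Transform-Load: transform in application code, then load."""
    transformed = [(order_id, cents / 100) for order_id, cents in rows]
    conn.execute("CREATE TABLE orders_etl (order_id INTEGER, amount_usd REAL)")
    conn.executemany("INSERT INTO orders_etl VALUES (?, ?)", transformed)


def elt(conn, rows):
    """Extract-Load-Transform: load raw rows first, then transform in the store."""
    conn.execute("CREATE TABLE raw_orders (order_id INTEGER, amount_cents INTEGER)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", rows)
    # The transformation is expressed as SQL and executed by the data store.
    conn.execute(
        "CREATE TABLE orders_elt AS "
        "SELECT order_id, amount_cents / 100.0 AS amount_usd FROM raw_orders"
    )


def run_both():
    """Run both patterns against one in-memory database and return the results."""
    conn = sqlite3.connect(":memory:")
    etl(conn, RAW_ORDERS)
    elt(conn, RAW_ORDERS)
    etl_rows = conn.execute("SELECT * FROM orders_etl ORDER BY order_id").fetchall()
    elt_rows = conn.execute("SELECT * FROM orders_elt ORDER BY order_id").fetchall()
    conn.close()
    return etl_rows, elt_rows


if __name__ == "__main__":
    etl_rows, elt_rows = run_both()
    print(etl_rows == elt_rows)
```

Both paths produce the same transformed table; what differs is where the transformation runs and whether the raw data is retained in the store for later reprocessing.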

There are a few key pain points associated with developing transformation logic. First, data users are experts in business logic but need engineering support to implement that logic at scale. With the exponential growth in data, distributed programming models are required to implement the logic in a reliable and performant fashion. This often slows down the overall process, since data users must explain the business logic to engineers and then work through user acceptance testing (UAT) with them. Second, there is an increasing need to build real-time business logic transformers. Traditionally, transformation has been batch-oriented: reading from files, transforming formats, joining with different data sources, and so on. Data users are not experts in evolving programming models, especially those for real-time insights. Third, running transformations in production requires continuous support to track availability, quality, and change management of data sources and processing logic. These pain points increase the time to transform. Typically, transformation logic is not built from scratch but as a variant of the existing ...

