Chapter 2. Introducing the Scaled Architecture: Organizing Data at Scale

What architecture does an enterprise need to become data-driven? How can you distribute data efficiently while retaining agility, security, and control? This chapter addresses these questions, lays the foundation for data management, and begins building the architecture.

The trends we have seen require us to rethink the way data management and data integration are done. We’ve discussed the tight couplings that arise when making exact copies of application data and the difficulties of operationalizing analytics on raw data. We’ve also discussed the unification problems and tremendous effort involved in building an integrated data warehouse, and its impact on agility. We need to shift away from funneling all data into a single silo and toward an approach that empowers domains, teams, and users to distribute, consume, and use data themselves, easily and securely. Platforms, processes, and patterns should simplify the work for others. We need interfaces that are simple, well-documented, fast, and easy to consume. We need a data management architecture that works at scale. This chapter discusses such an architecture, starting with how to organize the landscape and how to do data integration differently.

The large-scale architecture I envision focuses on data management and data integration. It is an architecture for enterprises that is meant to allow teams to provide data securely and easily while retaining agility, control, and insight. Just like many other architectures, ...
