Chapter 7. Data Architectures Supported by Data Virtualization Systems
In this chapter we discuss several popular data architectures that enterprises use to manage the data that exists across the organization, and how data virtualization systems can enable these architectures. A data architecture describes the management practices that specify where data is stored within an enterprise, which systems are used to store and provide access to the data, who is in charge of managing and maintaining these systems, and which datasets are contained within them. We start by discussing one of the most popular solutions for giving analytical teams a space to perform data science and analysis: the data warehouse. We then discuss more modern approaches, including enterprise-wide data products, the data mesh, and the data fabric.
Data Warehouse
The term data warehouse was originally defined by computer scientist Bill Inmon as “a subject-oriented, integrated, time-variant, and non-volatile collection of data in support of management’s decision-making process.” These four parts of the definition can be explained as follows:
- Subject-oriented
  The data warehouse collects data from across an enterprise and organizes that data into subjects that represent important real-world entities for that enterprise, such as customer, transaction, and product.
- Integrated
  The data warehouse brings together data from multiple data sources and combines it into a single view or format. These views could ...
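To make the first two properties concrete, the following is a minimal sketch, not a real warehouse implementation. It assumes two hypothetical operational sources (a CRM export and a billing export) whose field names and formats differ, and combines them into a single customer subject with one consistent format. All names here (`CRM_ROWS`, `BILLING_ROWS`, `Customer`, `integrate`) are illustrative assumptions and do not come from any particular product.

```python
from dataclasses import dataclass

# Hypothetical records from two operational systems with differing schemas.
CRM_ROWS = [
    {"cust_id": "C-1", "full_name": "Ada Lovelace", "email": "ADA@EXAMPLE.COM"},
    {"cust_id": "C-2", "full_name": "Alan Turing", "email": "alan@example.com"},
]
BILLING_ROWS = [
    {"customer": "C-1", "amount_cents": 1250},
    {"customer": "C-1", "amount_cents": 800},
    {"customer": "C-2", "amount_cents": 4300},
]


@dataclass
class Customer:
    """One row of the 'customer' subject in the integrated view."""
    customer_id: str
    name: str
    email: str
    total_billed: float  # currency units, converted from cents


def integrate(crm_rows, billing_rows):
    """Combine both sources into a single, consistently formatted customer view."""
    # Aggregate billing amounts per customer key used by the billing system.
    totals = {}
    for row in billing_rows:
        key = row["customer"]
        totals[key] = totals.get(key, 0) + row["amount_cents"]

    # Organize the result around the customer subject, normalizing formats.
    return [
        Customer(
            customer_id=row["cust_id"],
            name=row["full_name"],
            email=row["email"].lower(),
            total_billed=totals.get(row["cust_id"], 0) / 100,
        )
        for row in crm_rows
    ]


customers = integrate(CRM_ROWS, BILLING_ROWS)
for customer in customers:
    print(customer)
```

In this sketch the source-specific keys (`cust_id` versus `customer`) and formats (mixed-case email, amounts in cents) are reconciled once, so consumers see one customer record regardless of which system the data originated in.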