Chapter 1. Why Build a Data Fabric?

Why do you need this thing called a data fabric? It’s not just because of the sheer size of your data. You also face access and integration challenges because of where the data comes from, where it’s stored, and what form it takes. You’ve got data on premises. In the public cloud. In private clouds. You have data in multicloud and hybrid cloud ecosystems. Within these various silos, some of the data is structured but most is unstructured, which makes access and integration harder still. And don’t forget streaming data, which is an important part of the picture, too.

What’s the state of enterprise data, then? Fragmented. A full 93% of enterprises have a multicloud strategy, and 87% have a hybrid cloud environment in place, according to Flexera’s 2020 State of the Cloud survey. On average, companies have data stored in 2.2 public and 2.2 private clouds, as well as in various on-premises data repositories (see Figure 1-1).


Businesses are pushing the limits of what they can do with existing data management tools.

Figure 1-1. The fragmented state of enterprise data (Source: Flexera)

The reasons for this fragmentation are varied, and include the following:

Time-to-data-insight is a competitive differentiator

Today nearly every business transformation—whether aiming for greater customer intimacy, more optimized operations, or faster ...
