Chapter 7. Substrate

Let’s assume we have figured out the major components of our data architecture, and the pieces are starting to fit into the puzzle that will serve the business use case. The next question is: how do we handle the infrastructure layer that supports those components? Our base requirement is that it runs our systems and components efficiently and cost-effectively. We also require this layer to provide resource management, monitoring, multitenancy, easy scaling, and other crucial operational capabilities so that we can implement our architecture on top of it.

As usual in computer science, the solution is to add another layer of abstraction. This infrastructure abstraction, which we will call the substrate, allows us to run a wide variety of software components while providing several core operational capabilities. In essence, this layer is an abstraction on top of hardware resources and operating system functions in a distributed setting. As with an operating system, we want it to provide a set of basic services to the applications running on top of it:

  • Allocate sufficient resources on demand, distributed fairly among applications

  • Provide application-level isolation to securely run applications from different business owners

  • Ensure application resilience in case of failure of the underlying hardware

  • Expose usage metrics to enable system operators to decide on capacity planning

  • Provide management and monitoring interfaces ...
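
To make the first service in the list more concrete, here is a minimal sketch of one possible reading of "fairly distributed": max-min fairness over a single resource such as CPU cores. The text does not name a specific policy, so the policy choice, the function name max_min_fair_share, and the application names in the usage note are assumptions made purely for illustration.

```python
def max_min_fair_share(capacity, demands):
    """Split `capacity` among applications using max-min fairness.

    `demands` maps a (hypothetical) application name to the amount of a
    single resource it requests. No application receives more than it
    asked for, and leftover capacity is shared equally among the
    applications that still want more.
    """
    allocation = {app: 0.0 for app in demands}
    remaining = float(capacity)
    unsatisfied = {app for app, demand in demands.items() if demand > 0}
    eps = 1e-9  # tolerance for floating-point comparisons

    while unsatisfied and remaining > eps:
        # Offer every unsatisfied application an equal slice of what is left.
        share = remaining / len(unsatisfied)
        satisfied = set()
        for app in unsatisfied:
            need = demands[app] - allocation[app]
            grant = min(share, need)
            allocation[app] += grant
            remaining -= grant
            if demands[app] - allocation[app] <= eps:
                satisfied.add(app)
        if not satisfied:
            # Everyone took a full equal slice, so the capacity is used up.
            break
        unsatisfied -= satisfied

    return allocation
```

For example, calling max_min_fair_share(10, {"ingest": 2, "stream": 4, "batch": 8}) grants the two smaller applications their full requests and leaves roughly 4 units for the largest one. Real substrates typically have to balance several resource types at once, which this single-resource sketch does not attempt.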
