Chapter 9. Extending a Data Platform Using Hybrid and Edge

So far in this book, we have discussed how to plan, design, and implement a data platform using the capabilities of a public cloud. However, in many situations a single public cloud is not enough, because the use case inherently requires data to originate, be processed, or be stored somewhere else: on premises, across multiple hyperscalers, or in connected intelligent devices such as smartphones or sensors. Such situations raise a new challenge: how do you provide a holistic view of the platform so that users can effectively mix and join data spread across different places? In this chapter, you will learn the approaches, techniques, and architectural patterns that your organization can adopt when dealing with such distributed architectures.

Furthermore, there are situations where your data must remain usable in a partially connected or fully disconnected environment. In this chapter, you will also learn how to handle such situations with a new approach called edge computing, which brings a portion of storage and compute resources out of the cloud and closer to whatever is generating or using the data.

Why Multicloud?

As a data leader, your organization wants you to continuously look for ways to boost business outcomes while minimizing the technology costs you incur. When it comes to data platforms, you are expected ...
