Chapter 2. Big Data Architectures on the Cloud
Big data may mean more information, but it also means more false information.
Nassim Taleb
As we learned in Chapter 1, there are two key takeaways about cloud data lakes that set the foundation for this chapter:
- A data lake approach starts with the ability to store and process any type of data regardless of its source, size, or structure, thereby allowing an organization to extract high-value insights from many disparate sources of data with variable value density (i.e., signal-to-noise ratio).
- Building your data lake on the cloud involves a disaggregated architecture where you assemble different components of IaaS, PaaS, and SaaS solutions together.
Keep in mind that building your cloud data lake solution also gives you many architectural options, each with its own set of strengths. This article on Future.com provides a comprehensive overview of the various components of a modern data architecture. In this chapter, we will dive deep into some of the more common architectural patterns, covering what they are and the strengths of each as they apply to a fictitious organization called Klodars Corporation.
Why Klodars Corporation Moves to the Cloud
Klodars Corporation is a thriving company that sells rain gear and other supplies in the Pacific Northwest region. The rapid growth in its business is driving its move to the cloud for the following reasons: ...