Chapter 21
Ten Differences between a Data Warehouse and a Data Lake
IN THIS CHAPTER
Exploring key distinctions between a data warehouse and a data lake
Seeing how to leverage data warehouse knowledge to jump-start building a data lake
In the beginning (say, around 1990), data warehousing came onto the scene, promising previously unachievable levels of data integration. Data warehousing and its sibling discipline, business intelligence, kicked off a new era of data-driven insights for companies and government agencies.
For all the power of data warehousing, organizations soon ran up against the limits of what a data warehouse could support. In many ways, data lakes are the next-generation successors to data warehouses, overcoming many of the technical and architectural barriers that capped what a data warehouse could do.
Data warehouses are still widely used and are often incorporated into an overall data lake environment. You should understand the distinction between the two disciplines to make the best architectural decisions for your organization.
Types of Data Supported
A data warehouse almost always contains structured data (numbers, fixed-length and relatively short strings of characters, and dates). You typically store this structured data in ...
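To make "structured data" concrete, here is a minimal Python sketch of a single warehouse-style record whose fields are limited to the kinds of values named above: numbers, a short fixed-length string, and a date. The names `SaleRow`, `product_code`, and `MAX_CODE_LEN` are hypothetical and chosen only for illustration; they don't come from any particular warehouse product.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal

# Hypothetical column width, standing in for a fixed-length character column.
MAX_CODE_LEN = 10


@dataclass(frozen=True)
class SaleRow:
    """One structured record: every field has a fixed, known type."""
    product_code: str    # short, fixed-length string
    quantity: int        # number
    unit_price: Decimal  # number
    sale_date: date      # date

    def __post_init__(self):
        # Reject values that would not fit the fixed-width column.
        if len(self.product_code) > MAX_CODE_LEN:
            raise ValueError("product_code exceeds fixed column width")


row = SaleRow("SKU-001", 3, Decimal("19.99"), date(2021, 7, 1))
total = row.quantity * row.unit_price  # simple arithmetic on typed fields
```

Because every field has a known type and size, a record like this can be validated before it's loaded, which is the property that sets structured data apart from free-form content.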