Chapter 5. Looking Ahead
Most companies are only beginning to understand how to optimize their data storage and analytics platforms. An estimated 70% of the market ignores big data today, and because these companies rely on data warehouses, it is tough for them to accommodate business changes quickly. Approximately 20 to 25% of the market stores some of its data in data lakes, using scale-out architectures such as Hadoop and Amazon Simple Storage Service (Amazon S3) to manage big data more cost-effectively. However, most of these implementations have turned into data swamps. Data swamps are essentially unmanaged data lakes, so although they are still more cost-effective than data warehouses, they are really useful only for some ad hoc exploratory use cases. Finally, 5 to 10% of the market uses managed, governed data lakes, which allow for richer business insights through a scalable, modern data architecture.
While mainstream adopters and laggards play catch-up with big data, today’s innovators are looking at automation, machine learning, and intelligent data remediation to construct more usable, optimized data lakes. Companies such as Zaloni are working to make this a reality.
As the data lake becomes an important part of next-generation data architectures, we see multiple trends emerging based on different vertical use cases that indicate what the future of data lakes will look like.
Logical Data Lakes
We are seeing more and more requirements for hybrid ...