Every decade since the 1960s, researchers at companies such as IBM and Amazon have introduced major new frameworks and techniques to handle emerging data management problems. This concise ebook explains how these new systems helped data science evolve quickly, from hierarchical and relational databases to big data and cloud computing to streaming and graph data.
Computer scientist Paco Nathan shows members of your data science team how major companies created each of these data management systems not just to deal with new data types but also to take full advantage of the opportunities the data presented. Their efforts over the years have propelled an entire industry.
This report covers the historical progression of data management topics including:
- Hierarchical databases—1960s mainframe batch systems are still used in finance, healthcare, manufacturing, energy, and other industries.
- Relational databases—these enabled faster transactions, mathematical optimization, and budgeting guarantees for many businesses.
- Big data—this includes relatively cheap horizontal scale-out systems for collecting huge amounts of customer data.
- Cloud computing—large companies began managing reliable, scalable, cost-effective data centers; Amazon turned the concept into a business.
- Cluster schedulers—managing horizontal clusters was difficult before schedulers such as Apache Mesos appeared.
- Streaming data—data continuously generated by different sources requires responses in "real time"—generally milliseconds.
Product information
- Title: Fifty Years of Data Management and Beyond
- Release date: April 2019
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781492057505