Every decade since the 1960s, researchers at companies like IBM, Amazon, and many others have introduced major new frameworks and techniques to handle rising data management problems. This concise ebook explains how these new systems helped data science evolve quickly—from hierarchical and relational databases to big data and cloud computing to streaming and graph data.
Computer scientist Paco Nathan shows members of your data science team how major companies created each of these data management systems not just to deal with new data types but also to take full advantage of the opportunities the data presented. Their efforts over the years have propelled an entire industry.
This report covers the historical progression of data management topics, including:
- Hierarchical databases—1960s mainframe batch systems are still used in finance, healthcare, manufacturing, energy, and other industries.
- Relational databases—these enabled faster transactions, mathematical optimization, and budgeting guarantees for many businesses.
- Big data—this includes relatively cheap horizontal scale-out systems for collecting huge amounts of customer data.
- Cloud computing—large companies began managing reliable, scalable, cost-effective data centers; Amazon turned the concept into a business.
- Cluster schedulers—managing horizontal clusters was difficult before schedulers such as Apache Mesos appeared.
- Streaming data—data continuously generated by different sources requires responses in "real time"—generally milliseconds.
Product information
- Title: Fifty Years of Data Management and Beyond
- Release date: April 2019
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781492057505