The Self-Service Data Roadmap

by Sandeep Uttamchandani
September 2020
Beginner to intermediate
284 pages
7h 40m
English
O'Reilly Media, Inc.
Content preview from The Self-Service Data Roadmap

Chapter 1. Introduction

Data is the new oil. There has been exponential growth in the amount of structured, semi-structured, and unstructured data collected within enterprises. Insights extracted from data are becoming a valuable differentiator for enterprises in every industry vertical, and machine learning (ML) models are used both in product features and to improve business processes.

Enterprises today are data-rich, but insights-poor. Gartner predicts that 80% of analytics insights will not deliver business outcomes through 2022. Another study highlights that 87% of data projects never make it to production deployment. Sculley et al. from Google show that less than 5% of the effort of implementing ML in production is spent on the actual ML algorithms (as illustrated in Figure 1-1). The remaining 95% of the effort is spent on data engineering related to discovering, collecting, and preparing data, as well as building and deploying the models in production.

While an enormous amount of data is being collected within data lakes, it may not be consistent, interpretable, accurate, timely, standardized, or sufficient. Data scientists spend a significant amount of time on engineering activities related to aligning systems for data collection, defining metadata, wrangling data to feed ML algorithms, deploying pipelines and models at scale, and so on. These activities are outside of their core insight-extracting skills, and bottlenecked by dependency on data engineers and platform IT ...


Publisher Resources

ISBN: 9781492075240