Preface
In recent years, many enterprises have begun experimenting with big data and cloud technologies to build data lakes and to support a data-driven culture and data-driven decision making. These projects often stall or fail, however, because the approaches that worked at internet companies have to be adapted for the enterprise, and there is no comprehensive, practical guide on how to do that successfully. I wrote this book in the hope of providing such a guide.
In my roles as an executive at IBM and Informatica (major data technology vendors), Entrepreneur in Residence at Menlo Ventures (a leading VC firm), and founder and CTO of Waterline (a big data startup), I’ve been fortunate to speak with hundreds of experts, visionaries, industry analysts, and hands-on practitioners about the challenges of building successful data lakes and creating a data-driven culture. This book is a synthesis of the themes and best practices that I’ve encountered across industries (from social media to banking and government agencies) and roles (from chief data officers and other IT executives to data architects, data scientists, and business analysts).
Big data, data science, and analytics supporting data-driven decision making promise to bring unprecedented levels of insight and efficiency to everything from how we work with data to how we work with customers to the search for a cure for cancer. But data science and analytics depend on having access to historical data. In recognition of this, ...