Alex Gorelik

How to build a successful enterprise data lake

Date: This event took place live on January 12, 2017

Presented by: Alex Gorelik

Duration: Approximately 60 minutes.

Cost: Free



Big data and data science promise to bring unprecedented levels of insight and efficiency to everything from working with data and working with customers to curing cancer. To successfully deliver on this promise, traditional enterprises are building data lakes, which bridge the gap between enterprise data warehouses, where data is a precious commodity carefully tended to by professional IT personnel, and the freewheeling culture of modern Internet companies.

An enterprise data lake must provide three new capabilities: cost-effective, scalable storage and computing; cost-effective data access and governance; and tiered, governed access, based on user needs, skill levels, and applicable data-governance policies. Drawing on a 30-year career developing leading-edge data technology and working with some of the world's largest enterprises on their thorniest data problems, Alex Gorelik, author of the forthcoming O'Reilly book The Enterprise Data Lake, discusses the key considerations and best practices for building data lakes, with examples taken from the world's leading big data companies and enterprises.

Topics include:

  • How to start and grow a data lake, including data warehouse offloading, analytical sandboxes, and "data puddles"
  • Setting up different tiers of data—from raw, untreated landing areas to carefully managed and summarized data
  • How to enable self-service to help users find, understand, and provision data and provide different interfaces to users with different skill levels
  • Staying in compliance with enterprise data-governance policies

About Alex Gorelik

Alex Gorelik is the founder and CEO of Waterline Data, a startup focused on enhancing the value of Hadoop through data self-service and governance. Alex is a serial entrepreneur and innovator who has spent over 30 years inventing and bringing to market cutting-edge data-oriented technology. Prior to Waterline, Alex was an EIR at Menlo Ventures; held several executive roles at Informatica, including GM of Informatica's Data Quality Business Unit—driving marketing, product management, and R&D for an $80M business—and SVP of R&D for Core Technology—driving innovation in big data and social media while managing a team of 400 engineers and product managers developing Informatica's platform and data-integration technology; and served as an IBM distinguished engineer for IBM's Information Integration team. Alex is a former founder, CTO, and VP of engineering at Exeros (acquired by IBM in 2009) and cofounder, CTO, and VP of engineering at Acta Technology (acquired by Business Objects in 2002 and marketed as Business Objects Data Services). Prior to Acta, Alex managed development of the replication server at Sybase and worked on Sybase's strategy for enterprise application integration. Earlier, he developed the database kernel at Amdahl's Design Automation group. Alex holds a BS in computer science from Columbia University's School of Engineering and a master's degree in computer science from Stanford University.

Related title

The Enterprise Big Data Lake
By Alex Gorelik
O'Reilly Media
June 2016
Ebook $33.99 USD