Chapter 90. When to Say No to Data

Robert J. Abate

While I was working as the director of a large retailer, we started to build a data lake from all the information that could be collected, both inside and outside the enterprise, in order to get a 360-degree view of the customer for marketing and other purposes. This would become a huge dataset incorporating customer data, syndicated sales data, shopping cart information, marketing (promo) data, demographics (from the US Census Bureau), store locations, weather, and so on.

This data lake would contain information on the who (shopper), what (product), where (location), when (time), how (transaction type), and why (external data such as weather, stock market, income around store locations, etc.). Its primary use would be to support visualizations inside the Data CAFÉ (Collaborative Analytics Facility for the Enterprise). The Data CAFÉ was a room with nine large-screen displays, designed so that executives could walk in and make critical business decisions in real time. For example, live feeds of Black Friday sales on the East Coast allowed management to change distribution in the Mountain and West Coast time zones.

Inside the Data CAFÉ, we would slice and dice the data so that visualizations could be created with filters (specific store, state, region, etc.), and then simply by pushing ...
