Appendix. Cloud Data Lake Decision Framework Template

In this appendix, I’ll provide a template that you can use to plan your cloud data lake solution. You can customize this template as needed for your specific scenarios or customers. I recommend planning over a horizon of at least one to two years, so that your cloud data lake design stays sustainable for that period without drastic changes.

Phase 1: Assess Framework

Target 60%–70% accuracy and completeness

The objective of this phase is to define and prioritize the requirements for the cloud data lake that will drive your architecture and implementation decisions. In this phase, the data engineering team needs to identify the requirements based on two key aspects: customer requirements and business drivers. I strongly recommend doing your due diligence in the assess phase, which will set you up for smooth planning and execution of the subsequent phases.

Use Table A-1 to record the findings from your stakeholder interviews.

Table A-1. Inventory of problems and requirements
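| Customer | Problem | Severity of problem | Helpfulness of data lake | How cloud data lake can help |
|----------|---------|---------------------|--------------------------|------------------------------|
|          |         | High/Medium/Low     | High/Medium/Low          |                              |
|          |         | High/Medium/Low     | High/Medium/Low          |                              |
|          |         | High/Medium/Low     | High/Medium/Low          |                              |
|          |         | High/Medium/Low     | High/Medium/Low          |                              |
|          |         | High/Medium/Low     | High/Medium/Low          |                              |

...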
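If you prefer to keep the Table A-1 inventory in a script rather than a document, the following is a minimal sketch of one way to record and order the entries. The `InventoryEntry` class, the `prioritize` function, and the ordering rule (severity first, then helpfulness) are assumptions made for illustration; the template itself does not prescribe how to prioritize.

```python
from dataclasses import dataclass

# Ordinal ranks for the High/Medium/Low ratings used in Table A-1.
RATING_RANK = {"High": 3, "Medium": 2, "Low": 1}


@dataclass
class InventoryEntry:
    """One row of the inventory; field names mirror the table columns."""
    customer: str
    problem: str
    severity: str             # "High", "Medium", or "Low"
    helpfulness: str          # "High", "Medium", or "Low"
    how_data_lake_helps: str

    def __post_init__(self):
        # Reject ratings outside the three levels the template allows.
        for rating in (self.severity, self.helpfulness):
            if rating not in RATING_RANK:
                raise ValueError(f"unknown rating: {rating!r}")


def prioritize(entries):
    """Order entries by severity, then helpfulness, highest first.

    This ordering is an assumed rule, not one given by the template.
    """
    return sorted(
        entries,
        key=lambda e: (RATING_RANK[e.severity], RATING_RANK[e.helpfulness]),
        reverse=True,
    )


if __name__ == "__main__":
    # Placeholder rows; replace them with findings from your own interviews.
    rows = [
        InventoryEntry("Team A", "placeholder problem 1", "Medium", "High", ""),
        InventoryEntry("Team B", "placeholder problem 2", "High", "High", ""),
    ]
    for row in prioritize(rows):
        print(row.customer, row.severity, row.helpfulness)
```

Sorting on both ratings keeps problems that are severe and well suited to a data lake at the top of the list; adjust the key if your business drivers weigh the two columns differently.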
