The Informed Company

Book description

Learn how to manage a modern data stack and get the most out of data in your organization!

Thanks to the emergence of new technologies and the explosion of data in recent years, we need new practices for managing and getting value out of data. In the modern, data-driven competitive landscape, the "best guess" approach (reading blog posts here and there and patching together data practices without any real visibility) is no longer going to hack it. The Informed Company provides definitive direction on how best to leverage the modern data stack, including cloud computing, columnar storage, cloud ETL tools, and cloud BI tools. You'll learn how to work with Agile methods and set up processes that are right for your company to use your data as a key weapon for your success . . . You'll discover best practices for every stage, from querying production databases at a small startup all the way to setting up data marts for different business lines of an enterprise.

In their work at Chartio, authors Fowler and David have learned that most businesspeople are almost completely self-taught when it comes to data. If they are using resources, those resources are outdated, so they're missing out on the latest cloud technologies and advances in data analytics. This book will firm up your understanding of data and bring you into the present with knowledge of what works and what doesn't.

  • Discover the data stack strategies that are working for today's successful small, medium, and enterprise companies
  • Learn the different Agile stages of data organization, and the right one for your team
  • Learn how to maintain Data Lakes and Data Warehouses for effective, accessible data storage
  • Gain the knowledge you need to architect Data Warehouses and Data Marts
  • Understand your business's level of data sophistication and the steps you can take to "level up" your data

The Informed Company is the definitive data book for anyone who wants to work faster and more nimbly, armed with actionable data for decision-making.

Table of contents

  1. Cover
  2. Title Page
  3. Copyright
  4. Dedication
  5. About This Book
    1. Why Write This Book
    2. Who This Book Is For
    3. Who This Book Is Not For
    4. Who Wrote the Book
    5. Who Edited the Book
    6. Influences
    7. How This Book Was Written
    8. How to Read This Book
  6. Foreword
  7. Introduction
    1. Merging Business Context with Data Information
    2. The Four Stages of Agile Data Organization
  8. STAGE 1: SOURCE aka Siloed Data
    1. Chapter One: Starting with Source Data
      1. Common Options for Analyzing Source Data
    2. Chapter Two: The Need to Replicate Source Data
    3. Chapter Three: Source Data Best Practices
      1. Keep a Complexity Wiki Page
      2. Snippet Dictionary
      3. Use a BI Product
      4. Double Check Results
      5. Keep Short Dashboards
      6. Design Before Building
  9. STAGE 2: DATA LAKE aka Data Combined
    1. Chapter Four: Why Build a Data Lake?
      1. What Is a Data Lake?
      2. Reasons to Build a Data Lake Summarized
    2. Chapter Five: Choosing an Engine for the Data Lake
      1. Modern Columnar Warehouse Engines
      2. Modern Warehouse Engine Products
      3. Database Engines
      4. Recommendation
    3. Chapter Six: Extract and Load (EL) Data
      1. ETL versus ELT
      2. EL/ETL Vendors
      3. Extract Options
      4. Load Options
      5. Multiple Schemas
      6. Other Extract and Load Routes
    4. Chapter Seven: Data Lake Security
      1. Access in Central Place
      2. Permission Tiers
    5. Chapter Eight: Data Lake Maintenance
      1. Why SQL?
      2. Data Sources
      3. Performance
      4. Upgrade Snippets to Views
  10. STAGE 3: DATA WAREHOUSE aka the Single Source of Truth
    1. Chapter Nine: The Power of Layers and Views
      1. Make Readable Views
      2. Layer Views on Views
      3. Start with a Single View
    2. Chapter Ten: Staging Schemas
      1. Orient to the Schemas
      2. Pick a Table and Clean It
      3. Other Staging Modeling Considerations
      4. Building on Top of Staging Schemas
    3. Chapter Eleven: Model Data with dbt
      1. Version Control
      2. Modularity and Reusability
      3. Package Management
      4. Organizing Files
      5. Macros
      6. Incremental Tables
      7. Testing
    4. Chapter Twelve: Deploy Modeling Code
      1. Branch Using Version Control Software
      2. Commit Message
      3. Test Locally
      4. Code Review
      5. Schedule Runs
    5. Chapter Thirteen: Implementing the Data Warehouse
      1. Manage Dependencies
      2. Combine Tables Within Schemas
      3. Combine Tables Across Schemas
      4. Keep the Grain Consistent
      5. Create Business Metrics
      6. Keeping Accurate History
    6. Chapter Fourteen: Managing Data Access
      1. How to Secure Sensitive Data in the Data Warehouse
      2. How to Secure Sensitive Data in a BI Tool
    7. Chapter Fifteen: Maintaining the Source of Truth
      1. Track New Metrics
      2. Deprecate Old Metrics
      3. Deprecate Old Schemas
      4. Resolve Conflicting Numbers
      5. Handling Ongoing Requests and Ongoing Feedback
      6. Updating Modeling Code
      7. Manage Access
      8. Tuning to Optimize
      9. Code Review All Modeling
      10. Maintenance Checklist
  11. STAGE 4: DATA MARTS aka Data Democratized
    1. Chapter Sixteen: Data Mart Implementation
      1. Views on the Data Warehouse
      2. Segment Tables
      3. Access Update
    2. Chapter Seventeen: Data Mart Maintenance
      1. Educate Team
      2. Identifies Issues
      3. Identify New Needs
      4. Help Track Success
    3. Chapter Eighteen: Modern versus Traditional Data Stacks: What's Changed?
      1. What's Changed?
    4. Chapter Nineteen: Row‐ versus Column‐Oriented Database
      1. Row‐Oriented Databases
      2. Column‐Oriented Databases
      3. Summary
    5. Chapter Twenty: Style Guide Example
      1. Simplify
      2. Clean
      3. Naming Conventions
      4. Share It
    6. Chapter Twenty-One: Building an SST Example
      1. First Attempt—Same Tables with Prefixes
      2. Second Attempt—Operational Schema (Source Agnostic)
      3. Third Attempt—Application Separate, Other Sources Smashed
      4. Less Planning, More Implementing
  12. Acknowledgments and Contributions
    1. Thank‐yous
  13. Index
  14. End User License Agreement

Product information

  • Title: The Informed Company
  • Author(s): Dave Fowler, Matthew C. David
  • Release date: October 2021
  • Publisher(s): Wiley
  • ISBN: 9781119748007