Book description
All cloud architects need to know how to build data platforms—the key to enabling businesses with data and delivering enterprise-wide intelligence in a fast and efficient way. This handbook is ideal for learning how to design, build, and modernize cloud native data and machine learning platforms using AWS, Azure, Google Cloud, or multicloud tools like Fivetran, dbt, Snowflake, and Databricks.
Authors Marco Tranquillin, Valliappa Lakshmanan, and Firat Tekiner cover the entire data lifecycle in a cloud environment, from ingestion to activation, using real-world enterprise architectures. You'll learn how to transform and modernize familiar solutions, like data warehouses and data lakes, and you'll be able to leverage recent AI/ML patterns to get faster, more accurate insights that drive competitive advantage.
This book shows you how to:
- Design a modern cloud native or hybrid data analytics and machine learning platform
- Accelerate data-led innovation by consolidating enterprise data in a data platform
- Democratize access to enterprise data and allow business teams to extract insights and build AI/ML capabilities
- Enable your business to make decisions in real time using streaming pipelines
- Move from a descriptive analytics approach to a more predictive and prescriptive one by building an MLOps platform
- Make your organization more effective in working with data analytics and machine learning in a cloud environment
Table of contents
- Preface
- 1. Modernizing Your Data Platform: An Introductory Overview
  - Why do organizations need a data platform?
  - Creating a Unified Analytics Platform
  - Hybrid Cloud
  - Applying AI
  - Why Cloud for AI?
  - Core Principles
  - Summary
- 2. Strategic steps to innovate with data
  - Step 1: Strategy and Planning
  - Step 2: Reduce Total Cost of Ownership adopting a cloud approach
  - Step 3: Break down silos
  - Step 4: Make decisions in context faster
  - Step 5: Leapfrog with packaged AI solutions
  - Step 6: Operationalize AI-driven Workflows
  - Step 7: Product Management for Data
    - Applying Product Management Principles to Data
    - 1. Understand and maintain a map of data flows in the enterprise
    - 2. Identify key metrics
    - 3. Agreed criteria, committed roadmap, and visionary backlog
    - 4. Build for the customers you have
    - 5. Don’t shift the burden of change management
    - 6. Interview customers to discover their data needs
    - 7. Whiteboard and prototype extensively
    - 8. Build only what will be used immediately
    - 9. Standardize common entities and KPIs
    - 10. Provide self-service capabilities in your data platform
  - Summary
- 3. Creating a Modern Data Analytics Capability
  - The Data Life Cycle
  - Foundational elements
  - Governance and security
  - Moving to the public cloud
  - Modernizing Data Workflows
  - Summary
- 4. Designing your data team
  - Classifying data processing organizations
  - Data-analysis driven organization (DADO)
  - Data-engineering driven organization (DEDO)
  - Data-science driven organization (DSDO)
  - Summary
- 5. A Migration Framework
  - A Four-Step Migration Framework
  - Estimating the overall cost of the solution
  - Setting up security and data governance
  - Schema, pipeline and data migration
  - Summary
- 6. Architecting a data lake
  - Data Lake and the cloud - A perfect marriage
  - Architecture design and implementation details
  - Integrating the data lake: the real superpower
  - Democratizing data processing and reporting
  - Machine Learning in the Data Lake
  - Summary
- 7. Innovate with an enterprise data warehouse
  - A modern data platform
  - Hub-and-Spoke architecture
  - Data Warehouse to enable Data Scientists
  - Summary
- 8. Converging to a Lakehouse
- 9. Architectures for Streaming
  - The value of streaming
  - Streaming Ingest
  - Real-time Dashboards
  - Stream Analytics
  - Continuous Intelligence through ML
  - Summary
- About the Authors
Product information
- Title: Architecting Data and Machine Learning Platforms
- Author(s): Marco Tranquillin, Valliappa Lakshmanan, Firat Tekiner
- Release date: December 2023
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781098151614