Book description
As data management and integration continue to evolve rapidly, storing all your data in one place, such as a data warehouse, is no longer scalable. In the very near future, data will need to be distributed and available for several technological solutions. With this practical book, you’ll learn how to migrate your enterprise from a complex and tightly coupled data landscape to a more flexible architecture ready for the modern world of data consumption.
Executives, data architects, analytics teams, and compliance and governance staff will learn how to build a modern scalable data landscape using the Scaled Architecture, which you can introduce incrementally without a large upfront investment. Author Piethein Strengholt provides blueprints, principles, observations, best practices, and patterns to get you up to speed.
- Examine data management trends, including technological developments, regulatory requirements, and privacy concerns
- Go deep into the Scaled Architecture and learn how the pieces fit together
- Explore data governance and data security, master data management, self-service data marketplaces, and the importance of metadata
Table of contents
- Foreword
- Preface
- 1. The Disruption of Data Management
- Data Management
- Analytics Is Fragmenting the Data Landscape
- Speed of Software Delivery Is Changing
- Networks Are Getting Faster
- Privacy and Security Concerns Are a Top Priority
- Operational and Transactional Systems Need to Be Integrated
- Data Monetization Requires an Ecosystem-to-Ecosystem Architecture
- Enterprises Are Saddled with Outdated Data Architectures
- Summary
- 2. Introducing the Scaled Architecture: Organizing Data at Scale
- 3. Managing Vast Amounts of Data: The Read-Only Data Stores Architecture
- Introducing the RDS Architecture
- Command and Query Responsibility Segregation
- Read-Only Data Store Components and Services
- Metadata
- Data Quality
- RDS Tiers
- Data Ingestion
- Integrating Commercial Off-the-Shelf Solutions
- Extracting Data from External APIs and SaaSs
- Historical Data Service
- Design Variations
- Data Replication
- Access Layer
- File Manipulation Service
- Delivery Notification Service
- De-Identification Service
- Distributed Orchestration
- Intelligent Consumption Services
- Populating RDSs on Demand
- RDS Direct Usage Considerations
- Summary
- 4. Services and API Management: The API Architecture
- 5. Event and Response Management: The Streaming Architecture
- Introducing the Streaming Architecture
- The Asynchronous Event Model Makes the Difference
- What Do Event-Driven Architectures Look Like?
- A Gentle Introduction to Apache Kafka
- The Streaming Architecture
- Streaming as the Operational Backbone
- Guarantees and Consistency
- Metadata for Governance and Self-Service Models
- Summary
- 6. Connecting the Dots
- 7. Sustainable Data Governance and Data Security
- 8. Turning Data into Value
- Consumption Patterns
- Target Operating Model
- Data Professionals as a Target User Group
- Business Requirements
- Nonfunctional Requirements
- Building the Data Pipeline and Data Model
- Distributing Integrated Data
- Business Intelligence Capabilities
- Self-Service Capabilities
- Analytical Capabilities
- Advanced Analytics Reference Architecture
- Summary
- 9. Mastering Enterprise Data Assets
- 10. Democratizing Data with Metadata
- 11. Conclusion
- Glossary
- Index
Product information
- Title: Data Management at Scale
- Author(s): Piethein Strengholt
- Release date: July 2020
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781492054788