Traditional data processing infrastructures—especially those that support applications—weren’t designed for our mobile, streaming, and online world. This O’Reilly report examines how today’s distributed, in-memory database management systems (IMDBMS) enable you to make quick decisions based on real-time data.
In this report, executives from MemSQL Inc. provide options for using in-memory architectures to build real-time data pipelines. If you want to instantly track user behavior on websites or mobile apps, generate reports on a changing dataset, or detect anomalous activity in your system as it occurs, you’ll learn valuable lessons from some of the largest and most successful tech companies focused on in-memory databases.
- Explore the architectural principles of modern in-memory databases
- Understand what’s involved in moving from data silos to real-time data pipelines
- Run transactions and analytics in a single database, without ETL
- Minimize complexity by architecting a multipurpose data infrastructure
- Learn guiding principles for developing an optimally architected operational system
- Provide persistence and high availability mechanisms for real-time data
- Choose an in-memory architecture flexible enough to scale across a variety of deployment options
Conor Doherty, Data Engineer at MemSQL, is responsible for creating content around database innovation, analytics, and distributed systems.
Gary Orenstein, Chief Marketing Officer at MemSQL, leads marketing strategy, product management, communications, and customer engagement.
Kevin White is the Director of Operations and a content contributor at MemSQL.
Steven Camiña is a Principal Product Manager at MemSQL. His experience spans B2B enterprise solutions, including databases and middleware platforms.
Table of contents
1. When to Use In-Memory Database Management Systems (IMDBMS)
   - Improving Traditional Workloads with In-Memory Databases
   - Modern Workloads
   - The Need for HTAP-Capable Systems
   - Common Application Use Cases
2. First Principles of Modern In-Memory Databases
3. Moving from Data Silos to Real-Time Data Pipelines
4. Processing Transactions and Analytics in a Single Database
   - Requirements for Converged Processing
   - Benefits of Converged Processing
5. Spark
6. Architecting Multipurpose Infrastructure
7. Getting to Operational Systems
8. Data Persistence and Availability
9. Choosing the Best Deployment Option
   - Considerations for Bare Metal
   - Virtual Machine (VM) and Container Considerations
   - Considerations for Cloud or On-Premises Deployments
   - Choosing the Right Storage Medium
   - Deployment Conclusions
10. Conclusion
- Title: Building Real-Time Data Pipelines
- Release date: November 2015
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491935491