Traditional data processing infrastructures—especially those that support applications—weren’t designed for our mobile, streaming, and online world. This O’Reilly report examines how today’s distributed, in-memory database management systems (IMDBMS) enable you to make quick decisions based on real-time data.
In this report, executives from MemSQL Inc. provide options for using in-memory architectures to build real-time data pipelines. If you want to instantly track user behavior on websites or mobile apps, generate reports on a changing dataset, or detect anomalous activity in your system as it occurs, you’ll learn valuable lessons from some of the largest and most successful tech companies focused on in-memory databases.
- Explore the architectural principles of modern in-memory databases
- Understand what’s involved in moving from data silos to real-time data pipelines
- Run transactions and analytics in a single database, without ETL
- Minimize complexity by architecting a multipurpose data infrastructure
- Learn guiding principles for developing an optimally architected operational system
- Provide persistence and high availability mechanisms for real-time data
- Choose an in-memory architecture flexible enough to scale across a variety of deployment options
Conor Doherty, Data Engineer at MemSQL, is responsible for creating content around database innovation, analytics, and distributed systems.
Gary Orenstein, Chief Marketing Officer at MemSQL, leads marketing strategy, product management, communications, and customer engagement.
Kevin White is the Director of Operations and a content contributor at MemSQL.
Steven Camiña is a Principal Product Manager at MemSQL. His experience spans B2B enterprise solutions, including databases and middleware platforms.
Table of contents
1. When to Use In-Memory Database Management Systems (IMDBMS)
   - Improving Traditional Workloads with In-Memory Databases
   - Modern Workloads
   - The Need for HTAP-Capable Systems
   - Common Application Use Cases
2. First Principles of Modern In-Memory Databases
3. Moving from Data Silos to Real-Time Data Pipelines
4. Processing Transactions and Analytics in a Single Database
   - Requirements for Converged Processing
   - Benefits of Converged Processing
5. Spark
6. Architecting Multipurpose Infrastructure
7. Getting to Operational Systems
8. Data Persistence and Availability
9. Choosing the Best Deployment Option
   - Considerations for Bare Metal
   - Virtual Machine (VM) and Container Considerations
   - Considerations for Cloud or On-Premises Deployments
   - Choosing the Right Storage Medium
   - Deployment Conclusions
10. Conclusion

- Title: Building Real-Time Data Pipelines
- Release date: November 2015
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491935491