Fast Data: Smart and at Scale

Book Description

The need for fast data applications is growing rapidly, driven by the IoT, the surge in machine-to-machine (M2M) data, global mobile device proliferation, and the monetization of SaaS platforms. So how do you combine real-time, streaming analytics with real-time decisions in an architecture that’s reliable, scalable, and simple?

In this O’Reilly report, Ryan Betts and John Hugg from VoltDB examine ways to develop apps for fast data using predefined patterns. These patterns are general enough to suit both the do-it-yourself, hybrid batch/streaming approach and the simpler, proven in-memory approach available with certain fast database offerings.

The authors' goal is to create a collection of fast data app development recipes. They welcome your contributions, which will be tested and included in future editions of this report.

Table of Contents

  1. Foreword
  2. Fast Data Application Value
    1. Looking Beyond Streaming
  3. Fast Data and the Enterprise
  4. 1. What Is Fast Data?
    1. Applications of Fast Data
      1. Ingestion
      2. Streaming Analytics
      3. Per-Event Transactions
    2. Uses of Fast Data
      1. Front End for Hadoop
      2. Enriching Streaming Data
      3. Queryable Cache
  5. 2. Disambiguating ACID and CAP
    1. What Is ACID?
      1. What Does ACID Stand For?
    2. What Is CAP?
      1. What Does CAP Stand For?
    3. How Is CAP Consistency Different from ACID Consistency?
    4. What Does “Eventual Consistency” Mean in This Context?
  6. 3. Recipe: Integrate Streaming Aggregations and Transactions
    1. Idea in Brief
    2. Pattern: Reject Requests Past a Threshold
    3. Pattern: Alerting on Variations from Predicted Trends
    4. When to Avoid This Pattern
    5. Related Concepts
  7. 4. Recipe: Design Data Pipelines
    1. Idea in Brief
    2. Pattern: Use Streaming Transformations to Avoid ETL
    3. Pattern: Connect Big Data Analytics to Real-Time Stream Processing
    4. Pattern: Use Loose Coupling to Improve Reliability
    5. When to Avoid Pipelines
  8. 5. Recipe: Pick Failure-Recovery Strategies
    1. Idea in Brief
    2. Pattern: At-Most-Once Delivery
    3. Pattern: At-Least-Once Delivery
    4. Pattern: Exactly-Once Delivery
  9. 6. Recipe: Combine At-Least-Once Delivery with Idempotent Processing to Achieve Exactly-Once Semantics
    1. Idea in Brief
    2. Pattern: Use Upserts Over Inserts
    3. Pattern: Tag Data with Unique Identifiers
      1. Subpattern: Fine-Grained Timestamps
      2. Subpattern: Unique IDs at the Event Source
    4. Pattern: Use Kafka Offsets as Unique Identifiers
    5. Example: Call Center Processing
      1. Version 1: Events Are Ordered
      2. Version 2: Events Are Not Ordered
    6. When to Avoid This Pattern
    7. Related Concepts and Techniques
  10. Glossary