Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data

Book Description

Construct a robust end-to-end solution for analyzing and visualizing streaming data

Real-time analytics is the hottest topic in data analytics today. In Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data, expert Byron Ellis teaches data analysts the technologies needed to build an effective real-time analytics platform. Such a platform can then be used to make sense of constantly changing data, which is beginning to outpace traditional batch-based analysis platforms.

The author is among a very few leading experts in the field. He has a prestigious background in research, development, analytics, real-time visualization, and Big Data streaming and is uniquely qualified to help you explore this revolutionary field. Moving from a description of the overall analytic architecture of real-time analytics to using specific tools to obtain targeted results, Real-Time Analytics leverages open source and modern commercial tools to construct robust, efficient systems that can provide real-time analysis in a cost-effective manner. The book includes:

  • A deep discussion of streaming data systems and architectures

  • Instructions for analyzing, storing, and delivering streaming data

  • Tips on aggregating data and working with sets

  • Information on data warehousing options and techniques

Real-Time Analytics includes in-depth case studies for website analytics, Big Data, visualizing streaming and mobile data, and mining and visualizing operational data flows. The book's "recipe" layout lets readers quickly learn and implement different techniques. All of the code examples presented in the book, along with their related data sets, are available on the companion website.

    Table of Contents

    1. Cover
    2. Chapter 1: Introduction to Streaming Data
      1. Sources of Streaming Data
      2. Why Streaming Data Is Different
      3. Infrastructures and Algorithms
      4. Conclusion
    3. Part I: Streaming Analytics Architecture
    4. Chapter 2: Designing Real-Time Streaming Architectures
      1. Real-Time Architecture Components
      2. Features of a Real-Time Architecture
      3. Languages for Real-Time Programming
      4. A Real-Time Architecture Checklist
      5. Conclusion
    5. Chapter 3: Service Configuration and Coordination
      1. Motivation for Configuration and Coordination Systems
      2. Maintaining Distributed State
      3. Apache ZooKeeper
      4. Conclusion
    6. Chapter 4: Data-Flow Management in Streaming Analysis
      1. Distributed Data Flows
      2. Apache Kafka: High-Throughput Distributed Messaging
      3. Apache Flume: Distributed Log Collection
      4. Conclusion
    7. Chapter 5: Processing Streaming Data
      1. Distributed Streaming Data Processing
      2. Processing Data with Storm
      3. Processing Data with Samza
      4. Conclusion
    8. Chapter 6: Storing Streaming Data
      1. Consistent Hashing
      2. “NoSQL” Storage Systems
      3. Other Storage Technologies
      4. Choosing a Technology
      5. Warehousing
      6. Conclusion
    9. Part II: Analysis and Visualization
    10. Chapter 7: Delivering Streaming Metrics
      1. Streaming Web Applications
      2. Visualizing Data
      3. Mobile Streaming Applications
      4. Conclusion
    11. Chapter 8: Exact Aggregation and Delivery
      1. Timed Counting and Summation
      2. Multi-Resolution Time-Series Aggregation
      3. Stochastic Optimization
      4. Delivering Time-Series Data
      5. Conclusion
    12. Chapter 9: Statistical Approximation of Streaming Data
      1. Numerical Libraries
      2. Probabilities and Distributions
      3. Working with Distributions
      4. Random Number Generation
      5. Sampling Procedures
      6. Conclusion
    13. Chapter 10: Approximating Streaming Data with Sketching
      1. Registers and Hash Functions
      2. Working with Sets
      3. The Bloom Filter
      4. Distinct Value Sketches
      5. The Count-Min Sketch
      6. Other Applications
      7. Conclusion
    14. Chapter 11: Beyond Aggregation
      1. Models for Real-Time Data
      2. Forecasting with Models
      3. Monitoring
      4. Real-Time Optimization
      5. Conclusion
    15. Introduction
      1. Overview and Organization of This Book
      2. Who Should Read This Book
      3. Tools You Will Need
      4. What's on the Website
      5. Time to Dive In
    16. End User License Agreement