Chapter 3, “Service Configuration and Coordination,” introduces the concept of distributed state and the difficulties of maintaining it. One of the most common reasons to require distributed state is the collection and processing of data in a scalable way.
Distributed data flows, which include both processing and collection, have been around for a long time. Generally, the systems designed to handle this task have been bespoke applications developed either in-house or through consulting agreements. More recently, the technologies used to implement these data flow systems have matured to the point of being common infrastructure. Data flow systems can be split into a separate service in much the same way that coordination and configuration can. They are now general enough in their interfaces and their assumptions that they can be used outside of their originally intended applications.
The earliest of these systems were arguably the queuing systems, such as ActiveMQ, which started to come onto the scene in the early 2000s. However, they were not really designed for high-volume throughput (although many of them can now achieve fairly good performance) and tended to be very Java-centric.
The next systems on the scene were those open-sourced by the large Internet companies such as Facebook. One of the best-known systems of this generation was a tool called Scribe, which was released in 2008. It used an RPC-like mechanism to concentrate data from ...