Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data

by Byron Ellis
July 2014
Beginner to intermediate
432 pages
10h 54m
English
Wiley

Chapter 4: Data-Flow Management in Streaming Analysis

Chapter 3, “Service Configuration and Coordination,” introduces the concept and difficulties of maintaining a distributed state. One of the most common reasons to require this distributed state is the collection and processing of data in a scalable way.

Distributed data flows, which include both collection and processing, have been around for a long time. Generally, the systems designed to handle this task were bespoke applications developed either in-house or through consulting agreements. More recently, the technologies used to implement these data-flow systems have matured into common infrastructure. Data-flow systems can now be split out into a separate service in much the same way that coordination and configuration can. Their interfaces and assumptions are general enough that they can be used outside of their originally intended applications.

The earliest of these systems were arguably the queuing systems, such as ActiveMQ, which started to appear in the early 2000s. However, they were not really designed for high-throughput workloads (although many of them can now achieve fairly good performance) and tended to be very Java-centric.

The next systems to appear were those open-sourced by large Internet companies such as Facebook. One of the best-known systems of this generation was a tool called Scribe, which was released in 2008. It used an RPC-like mechanism to concentrate data from ...

ISBN: 9781118838020