Building Real-Time Analytics Systems

by Mark Needham
September 2023
Content level: Beginner to intermediate
220 pages
4h 36m
English
O'Reilly Media, Inc.

Chapter 5. The Serving Layer: Apache Pinot

AATD has concluded that it needs to introduce a new piece of infrastructure to achieve scalable real-time analytics, but it isn’t yet convinced that a full-blown OLAP database is necessary.

In this chapter, we’ll start by explaining why we can’t just use a stream processor to serve queries on streams, before introducing Apache Pinot, one of the new breed of OLAP databases designed for real-time analytics. We’ll learn about Pinot’s architecture and data model, before ingesting the orders stream. After that, we’ll learn about timestamp indexes and how to write queries against Pinot using SQL.

Figure 5-1 shows how we’re going to evolve our infrastructure in this chapter.

Figure 5-1. Evolution of the orders service

Why Can’t We Use Another Stream Processor?

At the end of the last chapter, we described some of the limitations of using Kafka Streams to serve queries on top of streams. (See “Limitations of Kafka Streams”.) These were by no means a criticism of Kafka Streams as a technology; it’s just that we weren’t really using it for the types of problems for which it was designed.

A reasonable question might be, Why can’t we use another stream processor instead, such as ksqlDB or Flink? Both of these tools offer SQL interfaces, solving the issue of having to write Java code to query streams.

Unfortunately, it still doesn’t ...

ISBN: 9781098138783