Building Better Distributed Data Pipelines
on-demand course

with Patrick McFadin
November 2017
Intermediate
53m
English
O'Reilly Media, Inc.
Closed Captioning available in German, English, Spanish, French, Japanese, Korean, Portuguese (Portugal, Brazil), Chinese (Simplified), Chinese (Traditional)

Overview

Patrick McFadin explains the basics of building more efficient data pipelines, using Apache Kafka to organize data, Apache Cassandra to store it, and Apache Spark to analyze it. He gives an overview of how Cassandra works and why it can be a perfect fit for data-driven projects, then demonstrates how adding Spark and Kafka lets you maintain a highly distributed, fault-tolerant, and scalable solution. You’ll leave with a broad view of the available options, so you can make considered choices in your own data pipeline projects.

Publisher Resources

ISBN: 9781492031000