Big Data SMACK: A Guide to Apache Spark, Mesos, Akka, Cassandra, and Kafka by Isaac Ruiz, Raul Estrada

© Raul Estrada and Isaac Ruiz 2016

Raul Estrada and Isaac Ruiz, Big Data SMACK, 10.1007/978-1-4842-2175-4_10

10. Data Pipelines

Raul Estrada and Isaac Ruiz (1)

(1) Mexico City, Mexico

We have reached the chapter where we connect everything, especially theory and practice. This chapter has two parts: the first enumerates the data pipeline strategies, and the second shows how to connect the technologies:

  • Spark and Cassandra

  • Akka and Kafka

  • Akka and Cassandra

  • Akka and Spark

  • Kafka and Cassandra

Data Pipeline Strategies and Principles

The following are data pipeline strategies and principles:

  • Asynchronous message passing

  • Consensus and gossip

  • Data locality

  • Failure detection

  • Fault tolerance / no single point of failure

  • Isolation

  • Location transparency ...
