Comparing Samza and Spark Streaming

It is useful to compare Samza and Spark Streaming to help identify the areas in which each can best be applied. As it has been hopefully made clear in this book, these technologies are very much complimentary. Even though Spark Streaming might appear competitive with Samza, we feel both products offer compelling advantages in certain areas.

Samza shines when the input data is truly a stream of discrete events and you wish to build processing that operates on this type of input. Samza jobs running on Kafka can have latencies in the order of milliseconds. This provides a programming model focused on the individual messages and is the better fit for true near real-time processing applications. Though it lacks support ...

Get Learning Hadoop 2 now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.