Adam Kawa

Hadoop Adventures at Spotify

Date: This event took place live on February 27, 2014

Presented by: Adam Kawa

Duration: Approximately 60 minutes.

Cost: Free

Questions? Please send email to


Operating a small Hadoop cluster is a calm walk in a forest, while working with a large Hadoop cluster is a big adventure in a real jungle. The bigger the elephant is, the more love and care it demands, and we have discovered this the hard way.

In this webcast led by Adam Kawa, we will talk about real-world Hadoop issues that either broke our cluster or made it very unstable, especially while we were growing rapidly from a 60-node to a 690-node Hadoop cluster.

Each issue comes from our JIRA dashboard and is based on facts. We will also show real graphs, numbers, and even excerpts from our emails and conversations. We will honestly share the mistakes that we made, describe the lessons that we learned (including an embarrassing one!), and explain the fixes that finally domesticated this love-demanding yellow elephant and its friends.

About Adam Kawa

Adam Kawa works as a Data Engineer at Spotify, where his main responsibility is to maintain one of the largest Hadoop-YARN clusters in Europe. Every so often, he implements and troubleshoots Python MapReduce, Hive and Pig jobs. He also works as a Hadoop instructor at Compendium (an Authorized Cloudera Training Partner).

Adam is a frequent speaker at Hadoop conferences and Hadoop User Group meetups. He co-organizes the Stockholm and Warsaw Hadoop User Groups. He regularly blogs about the Hadoop ecosystem at