I ♥ Logs: Apache Kafka and Real-time Data Integration
Date: This event took place live on May 21, 2014
Presented by: Jay Kreps
Duration: Approximately 60 minutes.
Questions? Please send email to
Hosted By: Ben Lorica
This webcast will discuss how logs and stream processing can form a backbone for data flow, ETL, and real-time data processing. It will describe the challenges and lessons learned as LinkedIn built out its real-time data subscription and processing infrastructure. It will also discuss the role of real-time processing and its relationship to offline processing frameworks such as MapReduce.
About Jay Kreps
Jay Kreps is a Principal Staff Engineer at LinkedIn, where he is the lead architect for online data infrastructure. He is among the original authors of several open source projects, including a distributed key-value store called Project Voldemort, a messaging system called Kafka, and a stream processing system called Samza. Twitter: @jaykreps
About Ben Lorica
Ben Lorica is the Chief Data Scientist and Director of Content Strategy for Data at O'Reilly Media, Inc. He has applied Business Intelligence, Data Mining, Machine Learning and Statistical Analysis in a variety of settings including Direct Marketing, Consumer and Market Research, Targeted Advertising, Text Mining, and Financial Engineering. His background includes stints with an investment management company, internet startups, and financial services. He writes regularly about Big Data and Data Science on the O'Reilly Data blog.