Building a Continuously Curated Ingestion Pipeline: Recipes for Success
Date: This event took place live on December 9, 2015.
Presented by: Arvind Prabhakar
Duration: Approximately 60 minutes.
Questions? Please send email to
Are you polluting your data lake?
Modern data infrastructures are fed by vast volumes of data, streamed from an ever-changing variety of sources. Standard practice has been to store the data as ingested and push data cleaning onto each consuming application. This approach saddles data scientists and analysts with substantial work, delays insights, and makes real-time or near-real-time analysis practically impossible.
In this webcast you will discover:
About Arvind Prabhakar, CTO & Co-founder — StreamSets
Arvind Prabhakar is CTO and Co-Founder of StreamSets, a Big Data startup headquartered in San Francisco. He is an Apache Software Foundation member, a former PMC Chair for the Flume and Sqoop projects, and a PMC member on the Storm and MetaModel projects. Prior to StreamSets, Arvind was a director of engineering at Cloudera and a software architect in the core platform engineering team at Informatica.