Description:

Capturing and processing big data isn't easy. That's why open-source tools like Apache Spark, Kafka, Hadoop, and NiFi, which scale, process, and manage immense data volumes, are so popular. Powerful as they are, these tools do not let business users access, optimize, and analyze big data the way some enterprise-friendly tools do.

This webcast will introduce Kylo, an open-source data lake platform based on Apache Spark and NiFi. Kylo automates many of the tasks associated with data lakes, such as data ingest, preparation, discovery, profiling, and management.

Already in use at a number of global enterprises, Kylo integrates on-premises, cloud, and hybrid platforms with an engineering and data science control framework. By simplifying ingestion and pipeline control, Kylo accelerates time-to-value, often from weeks and months to hours and sometimes minutes. Using Kylo, enterprises can empower business analysts, data scientists, and others to perform analytics on the data lake.

After this webcast, you'll be able to:

Understand the three phases of building a foundation for enterprise analytics using open source
Examine options to encourage "data democratization," including Kylo, Apache Spark, and NiFi, and how they work together in a data lake environment
Learn more about the newly emerging discipline of Analytics Ops and how it enables the continuous delivery of analytics results

About Matt Hutton, Director of Research & Development at Think Big, a Teradata Company

Matt directs our Research and Development team, which develops technology assets for Think Big Data Lake solutions. Matt has 20 years of director-level experience building and managing software teams in Silicon Valley that develop large-scale, distributed software and data solutions. Matt provided consulting for early technology leaders in Big Data using Hadoop, such as Quantcast. Prior to joining Think Big, Matt led software engineering at Lawrence Livermore National Laboratory for the National Ignition Facility program, a fusion energy research program and the world's largest laser. Matt designed the software and data architecture for the petascale data processing cluster used for fusion sciences. Before LLNL, Matt was the Director of Software Engineering at ThinkLink (technology purchased by Microsoft), building a unified messaging and IP Telephony application service provider exceeding 500,000 customers. Prior to this, Matt was Director of Software Engineering at Netcom Online, an Internet Service Provider and early pioneer during the emergence of the Internet. Prior to Netcom (now EarthLink), Matt held software engineering positions at Symantec and Delrina Software.

About Scott Reisdorf, Principal Software Engineer, Research and Development, at Think Big

Scott Reisdorf has been with Think Big for two years as a Principal Software Engineer on the Research and Development team. Scott has over 15 years of experience in software development. At Think Big, Scott has helped successfully implement Big Data solutions for many Fortune 500 companies. Scott is the technical lead for Kylo, an open-source data lake platform built on Hadoop and Spark that provides a turnkey solution for quickly building out data lakes.
