Alexander Ulanov

Distributed deep learning on Spark

Date: This event took place live on June 15, 2016

Presented by: Alexander Ulanov

Duration: Approximately 60 minutes.

Hosted by: Ben Lorica

Description:

Deep learning is a popular area of machine learning. The deep learning models used in practice for image classification and speech recognition contain a huge number of weights, require substantial computation, and are trained on large datasets. Training such models can take days or even months on a single powerful machine, and ongoing research explores how to scale out training through distributed computation and data processing. Since Apache Spark provides these capabilities, along with many other libraries essential to analytics and machine learning, it is a strong contender as a distributed training platform for deep learning. Alexander Ulanov offers an overview and comparison of the tools and frameworks that have been proposed for performing deep learning on Spark. Alexander also considers the limitations of distributed training itself in the context of Spark and modern hardware.
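As a small illustration of neural network training on Spark, the sketch below uses Spark MLlib's MultilayerPerceptronClassifier (the classifier Alexander contributed to Apache Spark, as noted in his bio). It is a minimal example assuming a Spark 2.x environment; the dataset path and layer sizes are placeholders, not part of the talk.

import org.apache.spark.ml.classification.MultilayerPerceptronClassifier
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.sql.SparkSession

object MlpOnSpark {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("MlpOnSpark").getOrCreate()

    // Load a multiclass dataset in LIBSVM format (placeholder path).
    val data = spark.read.format("libsvm")
      .load("data/sample_multiclass_classification_data.txt")

    val Array(train, test) = data.randomSplit(Array(0.8, 0.2), seed = 1234L)

    // Network layers: input features, two hidden layers, output classes.
    val layers = Array[Int](4, 5, 4, 3)

    val trainer = new MultilayerPerceptronClassifier()
      .setLayers(layers)
      .setBlockSize(128)   // batches data into matrices for faster BLAS-backed training
      .setSeed(1234L)
      .setMaxIter(100)

    // Training is distributed across the Spark executors holding the data.
    val model = trainer.fit(train)
    val predictions = model.transform(test)

    val evaluator = new MulticlassClassificationEvaluator().setMetricName("accuracy")
    println(s"Test accuracy = ${evaluator.evaluate(predictions.select("prediction", "label"))}")

    spark.stop()
  }
}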

About Alexander Ulanov

Alexander Ulanov is a senior researcher at Hewlett Packard Labs, where his research focuses on large-scale machine learning. Currently, Alexander works on deep learning and graphical models. He has made several contributions to Apache Spark; in particular, he implemented the multilayer perceptron classifier. Previously, he worked on text mining, classification, and recommender systems, and their real-world applications. Alexander holds a PhD in mathematical modeling from the Russian Academy of Sciences.

About Ben Lorica

Ben Lorica is the Chief Data Scientist and Director of Content Strategy for Data at O'Reilly Media, Inc. He has applied business intelligence, data mining, machine learning, and statistical analysis in a variety of settings, including direct marketing, consumer and market research, targeted advertising, text mining, and financial engineering. His background includes stints with an investment management company, internet startups, and financial services.