Andreas Pfadler

Large-scale Machine Learning in Spark

Tackle regression problems with the open-source Fregata library

Date: This event took place live on August 29, 2017

Presented by: Andreas Pfadler

Duration: Approximately 60 minutes.

Cost: Free

Questions? Please send email to


If you're a data scientist or engineer who needs to perform large-scale machine learning in Spark, you may face these challenges:

  • Tuning hyperparameters
  • Parallelizing efficiently
  • Dealing with large regression models

TalkingData, China's largest independent Big Data platform, has developed an open-source solution to these challenges: Fregata.

Join Andreas Pfadler, machine learning engineer at TalkingData, for this webcast as he walks through key challenges in large-scale machine learning, practical solutions to them, and machine learning use cases at TalkingData. Pfadler will introduce Fregata, a lightweight, large-scale machine learning library on Spark that aims to tackle logistic regression and softmax regression problems involving hundreds of millions of training records.

In this webcast, participants will:

  • Get an overview of common challenges in large-scale machine learning
  • Learn practical methods to address these challenges
  • Get introduced to Fregata, a lightweight, large-scale machine learning library developed at TalkingData

About Andreas Pfadler, Machine Learning Engineer at TalkingData

Andreas Pfadler is a machine learning engineer at TalkingData. He holds a PhD in mathematics and previously worked as a consultant in the financial industry. He is passionate about math, machine learning, software architecture, and cooking. He currently lives in Beijing.