Building a Recommendation Engine with Scala by Saleem Ansari

Chapter 2. Data Processing Pipeline Using Scala

Chapter 1, Introduction to Scala and Machine Learning, gave a first look at Scala, Apache Spark, and machine learning. In this chapter, we will explore ways to compose a data processing pipeline using Scala. In particular, we will discuss:

  • Entree: a sample dataset for recommendation systems
  • ETL: extract, transform, load
  • Extraction and transformation for machine learning
  • Setting up MongoDB and Apache Kafka
  • Data processing pipeline for Entree

By the end of the chapter, we should be able to compose the different components of the processing pipeline.
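To make the idea of composing pipeline components concrete, here is a minimal sketch in plain Scala. It is an illustration under stated assumptions, not the pipeline built later in the chapter: the `Rating` type, the comma-separated input format, and the in-memory `load` step are all hypothetical stand-ins, and none of them reflect the actual Entree schema, MongoDB, or Kafka.

```scala
// Hypothetical record type for illustration; this is NOT the Entree schema.
final case class Rating(user: String, item: String, score: Double)

object EtlSketch {
  // Extract: split each raw line into trimmed, comma-separated fields.
  val extract: Seq[String] => Seq[Array[String]] =
    lines => lines.map(_.split(",").map(_.trim))

  // Transform: keep only rows with exactly three fields and a numeric score.
  val transform: Seq[Array[String]] => Seq[Rating] =
    rows => rows.collect {
      case Array(u, i, s) if scala.util.Try(s.toDouble).isSuccess =>
        Rating(u, i, s.toDouble)
    }

  // Load: a stand-in for a database write; here we only group by user in memory.
  val load: Seq[Rating] => Map[String, Seq[Rating]] =
    ratings => ratings.groupBy(_.user)

  // Each stage is an ordinary function, so the pipeline is their composition.
  val pipeline: Seq[String] => Map[String, Seq[Rating]] =
    extract andThen transform andThen load

  def main(args: Array[String]): Unit = {
    val raw = Seq("alice, pizza, 4.5", "bob, sushi, 3.0", "malformed line")
    pipeline(raw).foreach { case (user, rs) => println(s"$user -> $rs") }
  }
}
```

Because each stage is a plain function value, any one of them can be swapped (for example, replacing the in-memory `load` with a real database writer) without touching the others.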

Entree – a sample dataset for recommendation systems

In this chapter, we will base our discussion on a dataset that is well suited to recommendation engines. We have selected ...