Mastering Machine Learning with Spark 2.x by Michal Malohlava, Max Pumperla, Alex Tellez

Association rule mining

Recall from the introduction to association rules that once we have the frequent item sets, that is, the patterns that meet the specified minimum support threshold, we are about halfway to computing association rules. In fact, Spark's implementation of association rules assumes that we provide an RDD of FreqItemset[Item], an example of which we have already seen in the preceding call to model.freqItemsets. Moreover, association rule computation is available not only as a standalone algorithm but also through FPGrowth.
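To make these two entry points concrete, here is a minimal sketch on made-up data rather than on the running example. It assumes Spark 2.x MLlib on the classpath and a local master. The object name AssociationRulesSketch, the helper names standalone and viaFPGrowth, the item sets, their counts, and the thresholds are all hypothetical and chosen only for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.fpm.{AssociationRules, FPGrowth}
import org.apache.spark.mllib.fpm.AssociationRules.Rule
import org.apache.spark.mllib.fpm.FPGrowth.FreqItemset

object AssociationRulesSketch {

  // Standalone route: we supply the frequent item sets ourselves.
  // The items and counts below are invented for this sketch.
  def standalone(sc: SparkContext, minConfidence: Double): Array[Rule[String]] = {
    val freqItemsets = sc.parallelize(Seq(
      new FreqItemset(Array("a"), 15L),
      new FreqItemset(Array("b"), 35L),
      new FreqItemset(Array("a", "b"), 12L)
    ))
    new AssociationRules()
      .setMinConfidence(minConfidence)
      .run(freqItemsets)
      .collect()
  }

  // FPGrowth route: mine frequent item sets first, then ask the
  // resulting model for rules above the given confidence.
  def viaFPGrowth(sc: SparkContext, minConfidence: Double): Array[Rule[String]] = {
    val transactions = sc.parallelize(Seq(
      Array("a", "b"),
      Array("a", "b", "c"),
      Array("b", "c")
    ))
    val model = new FPGrowth().setMinSupport(0.5).run(transactions)
    model.generateAssociationRules(minConfidence).collect()
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("assoc-rules-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      (standalone(sc, 0.7) ++ viaFPGrowth(sc, 0.7)).foreach { rule =>
        println(s"${rule.antecedent.mkString("{", ",", "}")} => " +
          s"${rule.consequent.mkString("{", ",", "}")} (confidence ${rule.confidence})")
      }
    } finally {
      sc.stop()
    }
  }
}
```

In the standalone route, the counts for the single-item sets have to be present alongside the pair, because the confidence of a rule is the count of the full item set divided by the count of its antecedent.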

Before showing how to run the respective algorithm on our running example, let's quickly explain how association rules are implemented in Spark:

  1. The algorithm is already provided with frequent item sets, ...