O'Reilly logo

Mastering Machine Learning with Spark 2.x by Michal Malohlava, Max Pumperla, Alex Tellez

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Frequent pattern mining with FP-growth

When we introduced the frequent pattern mining problem, we also quickly discussed a strategy to address it based on the apriori principle. The approach was based on scanning the whole transaction database again and again to expensively generate pattern candidates of growing length and checking their support. We indicated that this strategy may not be feasible for very large data. 

The so called FP-growth algorithm, where FP stands for frequent pattern, provides an interesting solution to this data mining problem. The algorithm was originally described in Mining Frequent Patterns without Candidate Generation, available at https://www.cs.sfu.ca/~jpei/publications/sigmod00.pdf. We will start by explaining ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required