O'Reilly logo

Big Data Analytics with Java by Rajat Mehta

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Implementation of the Apriori algorithm in Apache Spark

We have gone through the preceding algorithm. Now we will try to write the entire algorithm in Spark. Spark does not have a default implementation of Apriori algorithm, so we will have to write our own implementation as shown next (refer to the comments in the code as well).

First, we will have the regular boilerplate code to initiate the Spark configuration and context:

SparkConf conf = new SparkConf().setAppName(appName).setMaster(master);
JavaSparkContext sc = new JavaSparkContext(conf);

Now, we will load the dataset file using the SparkContext and store the result in a JavaRDD instance. We will create the instance of the AprioriUtil class. This class contains the methods for calculating ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required