Big Data Analytics with Java

by RAJAT MEHTA
July 2017
Beginner to intermediate
418 pages
9h 46m
English
Packt Publishing

Implementation of the Apriori algorithm in Apache Spark

We have now walked through the Apriori algorithm; next we will implement it end to end in Spark. Spark does not ship an implementation of the Apriori algorithm (its MLlib library provides FP-growth for frequent itemset mining instead), so we will write our own, as shown next. Refer to the comments in the code as well.
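Before moving to Spark, the level-wise logic of Apriori can be sketched in plain Java on an in-memory list of transactions. This is a minimal sketch under stated assumptions, not the book's code: the class name AprioriSketch and its methods are hypothetical, support is expressed as an absolute transaction count, and the subset-based prune step is left out for brevity (candidates with an infrequent subset are simply counted and rejected).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: AprioriSketch is a hypothetical name, not the book's AprioriUtil.
final class AprioriSketch {

    // Number of transactions that contain every item of the candidate itemset.
    static int supportCount(List<Set<String>> transactions, Set<String> candidate) {
        int count = 0;
        for (Set<String> transaction : transactions) {
            if (transaction.containsAll(candidate)) {
                count++;
            }
        }
        return count;
    }

    // Join step: merge pairs of frequent (k-1)-itemsets whose union has exactly k items.
    static Set<Set<String>> generateCandidates(Set<Set<String>> frequent, int k) {
        Set<Set<String>> candidates = new HashSet<>();
        for (Set<String> a : frequent) {
            for (Set<String> b : frequent) {
                Set<String> union = new TreeSet<>(a);
                union.addAll(b);
                if (union.size() == k) {
                    candidates.add(union);
                }
            }
        }
        return candidates;
    }

    // Returns every frequent itemset mapped to its support count.
    static Map<Set<String>, Integer> frequentItemsets(List<Set<String>> transactions,
                                                      int minSupport) {
        Map<Set<String>, Integer> result = new HashMap<>();

        // Level 1 candidates: every distinct item as a singleton itemset.
        Set<Set<String>> candidates = new HashSet<>();
        for (Set<String> transaction : transactions) {
            for (String item : transaction) {
                candidates.add(new TreeSet<>(Arrays.asList(item)));
            }
        }

        int k = 1;
        while (!candidates.isEmpty()) {
            // Keep only the candidates that meet the minimum support.
            Set<Set<String>> frequent = new HashSet<>();
            for (Set<String> candidate : candidates) {
                int count = supportCount(transactions, candidate);
                if (count >= minSupport) {
                    frequent.add(candidate);
                    result.put(candidate, count);
                }
            }
            // Build the next level from the survivors of this one.
            k++;
            candidates = generateCandidates(frequent, k);
        }
        return result;
    }
}
```

Calling AprioriSketch.frequentItemsets(transactions, 2) returns each itemset that appears in at least two transactions together with its count. In the Spark version, the support counting over all transactions is the part that benefits from being distributed.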

First comes the usual boilerplate that creates the Spark configuration and context (appName and master are String variables assumed to be defined elsewhere):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
SparkConf conf = new SparkConf().setAppName(appName).setMaster(master);
JavaSparkContext sc = new JavaSparkContext(conf);

Now we will load the dataset file using the JavaSparkContext and store the result in a JavaRDD instance. We will then create an instance of the AprioriUtil class. This class contains the methods for calculating ...
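Once the file is loaded as lines of text, each line has to be turned into a set of items before any support can be counted. The helper below is a minimal sketch of that step. It is not part of the book's AprioriUtil: the class name TransactionParser is hypothetical, and the file layout (one transaction per line, items separated by commas) is an assumption about the dataset.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not the book's code. Assumes one transaction per line
// with comma-separated items; adjust the delimiter to match the real dataset.
final class TransactionParser {

    // Converts one raw line into a de-duplicated, sorted set of item names.
    static Set<String> parseLine(String line) {
        Set<String> items = new TreeSet<>();
        for (String token : line.split(",")) {
            String item = token.trim();
            if (!item.isEmpty()) {
                items.add(item);
            }
        }
        return items;
    }

    // With Spark on the classpath, the parser could be applied to the loaded
    // lines roughly like this (path is a placeholder):
    //   JavaRDD<String> lines = sc.textFile(path);
    //   JavaRDD<Set<String>> transactions = lines.map(TransactionParser::parseLine);
}
```

Keeping the parsing in a static method means it can be unit tested without a Spark cluster and then passed to the RDD's map call as a method reference.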


ISBN: 9781787288980