Data Algorithms

by Mahmoud Parsian
July 2015
Intermediate to advanced
778 pages
17h 9m
English
O'Reilly Media, Inc.

Chapter 7. Market Basket Analysis

Market Basket Analysis (MBA) is a popular data mining technique, frequently used by marketing and ecommerce professionals to reveal affinities between individual products or product groupings. The general goal of data mining is to extract interesting correlated information from a large collection of data, such as millions of supermarket or credit card sales transactions. Market Basket Analysis helps us identify items likely to be purchased together, and association rule mining finds correlations between items in a set of transactions. Marketers may then use these association rules to place correlated products next to each other on store shelves or online so that customers buy more items. Finding frequent itemsets when mining association rules for Market Basket Analysis is a computationally intensive problem, which makes it a good fit for MapReduce.

This chapter provides two Market Basket Analysis solutions:

  • A MapReduce/Hadoop solution for tuples of order N (where N = 1, 2, 3, ...). This solution just finds the frequent patterns.

  • A Spark solution, which not only finds frequent patterns, but also generates association rules for them.
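To make the difference between a frequent pattern and an association rule concrete, here is a minimal single-machine sketch in plain Python. It is not the chapter's Hadoop or Spark code. The toy baskets, the function names `count_patterns` and `association_rules`, and the 0.7 threshold are all assumptions made for illustration. It uses the standard definition of confidence for a rule X => Y, which is count(X and Y together) divided by count(X).

```python
from collections import Counter
from itertools import combinations

# Toy baskets, invented for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
    ["bread", "milk", "beer"],
]

def count_patterns(transactions, max_order):
    """Count every itemset of size 1..max_order across all baskets."""
    counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # canonical order, duplicates removed
        for n in range(1, max_order + 1):
            for pattern in combinations(items, n):
                counts[pattern] += 1
    return counts

def association_rules(counts, min_confidence):
    """Return (antecedent, consequent, confidence) for patterns of size >= 2."""
    rules = []
    for pattern, freq in counts.items():
        if len(pattern) < 2:
            continue
        # Split the pattern into every non-empty antecedent/consequent pair.
        for k in range(1, len(pattern)):
            for antecedent in combinations(pattern, k):
                consequent = tuple(i for i in pattern if i not in antecedent)
                confidence = freq / counts[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, consequent, confidence))
    return rules

counts = count_patterns(transactions, max_order=2)
rules = association_rules(counts, min_confidence=0.7)
for antecedent, consequent, confidence in sorted(rules):
    print(antecedent, "=>", consequent, round(confidence, 2))
```

On this toy data only bread => milk and milk => bread clear the 0.7 threshold, each with confidence 0.75 (the pair appears in 3 baskets, each single item in 4). A real solution would also prune patterns below a minimum support before generating rules.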

MBA Goals

This chapter presents a MapReduce solution for data mining analysis that finds the most frequently occurring sets of products (of order 1, 2, ...) in baskets at a given supermarket or ecommerce store. Our MapReduce solution is expandable to find the most frequently occurring TupleN (where N = 1, 2, 3, ...) ...
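The TupleN idea can be sketched as a map phase, a shuffle, and a reduce phase, all simulated here in plain Python rather than on Hadoop. This is a hedged illustration, not the book's implementation. The names `map_basket`, `shuffle`, `reduce_counts`, and `frequent_tuples`, and the sample baskets, are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def map_basket(basket, n):
    """Map phase: emit (tuple_of_n_items, 1) for every n-combination in a basket."""
    items = sorted(set(basket))  # sort so (a, b) and (b, a) share one key
    for combo in combinations(items, n):
        yield combo, 1

def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle/sort step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_counts(key, values):
    """Reduce phase: sum the 1s emitted for one tuple."""
    return key, sum(values)

def frequent_tuples(baskets, n, top=3):
    """Return the `top` most frequent tuples of order n with their counts."""
    mapped = (pair for basket in baskets for pair in map_basket(basket, n))
    reduced = [reduce_counts(k, v) for k, v in shuffle(mapped).items()]
    # Most frequent first; ties are broken alphabetically for stable output.
    return sorted(reduced, key=lambda kv: (-kv[1], kv[0]))[:top]

# Toy baskets, invented for illustration only.
baskets = [
    ["bread", "milk", "eggs"],
    ["bread", "milk", "jam"],
    ["milk", "eggs"],
    ["bread", "milk", "eggs"],
]

pairs = frequent_tuples(baskets, n=2)
triples = frequent_tuples(baskets, n=3, top=1)
print(pairs)
print(triples)
```

Sorting the items before generating combinations is the key design choice: without it, the same set of products bought in a different order would land under different keys and be undercounted. Changing `n` is all it takes to move from single items to pairs, triples, and so on.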


Publisher Resources

ISBN: 9781491906170