Market Basket Analysis

Association rules are a popular data mining technique. The association rule algorithm was initially developed by Rakesh Agrawal, Tomasz Imielinski, and Arun Swami at the IBM Almaden Research Center.[61] It was designed to find interesting relationships efficiently in large databases of customer transactions, especially databases that don’t fit into a computer’s memory. The algorithm finds associations: sets of items that frequently occur together. For example, when analyzing supermarket data, you might find that consumers often purchase eggs and milk together.
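To make the eggs-and-milk example concrete, here is a minimal base R sketch of the two measures usually used to judge such an association: support (the fraction of transactions containing a set of items) and confidence (how often the right-hand side appears when the left-hand side does). The baskets and the helper function `support` are invented for illustration and are not part of any package.

```r
# Hypothetical toy data: each element is one customer's basket.
baskets <- list(
  c("eggs", "milk", "bread"),
  c("eggs", "milk"),
  c("milk", "butter"),
  c("eggs", "milk", "butter"),
  c("bread", "butter")
)

# Illustrative helper (not from arules): the fraction of baskets
# that contain every item in `items`.
support <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

supp_eggs      <- support("eggs", baskets)            # eggs appear in 3 of 5 baskets
supp_eggs_milk <- support(c("eggs", "milk"), baskets) # eggs and milk together in 3 of 5

# Confidence of the rule {eggs} => {milk}:
# of the baskets containing eggs, the share that also contain milk.
conf_eggs_milk <- supp_eggs_milk / supp_eggs
```

In this toy data every basket with eggs also contains milk, so the rule {eggs} => {milk} has the highest possible confidence. Real transaction data would rarely be this clean.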

Several algorithms for mining association rules are available in R. One of the most popular is the Apriori algorithm. To try it in R, use the apriori function in the arules package:

library(arules)
apriori(data, parameter = NULL, appearance = NULL, control = NULL)

Here is a description of the arguments to apriori.

| Argument | Description | Default |
|---|---|---|
| data | An object of class transactions (or a matrix or data frame that can be coerced into that form) in which associations are to be found. | |
| parameter | An object of class APparameter (or a list with named components) that is used to specify mining parameters. Parameters include support level, minimum rule length, maximum rule length, and types of rules (see the help file for ASparameter for more information). | NULL |
| appearance | An object of class APappearance (or a list with ... | |
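As a rough illustration of how the data and parameter arguments fit together, the sketch below mines a small invented set of baskets. The data and the threshold values are assumptions chosen only so that the example returns something; they are not recommendations. The parameter list uses the support, confidence, and minlen components, and control = list(verbose = FALSE) merely suppresses progress output.

```r
library(arules)

# Hypothetical toy data: each element is one customer's basket.
baskets <- list(
  c("eggs", "milk", "bread"),
  c("eggs", "milk"),
  c("milk", "butter"),
  c("eggs", "milk", "butter"),
  c("bread", "butter")
)

# Coerce the list into the transactions class that apriori expects.
trans <- as(baskets, "transactions")

# Mine rules with illustrative thresholds: itemsets must appear in at
# least 40% of baskets, rules need at least 80% confidence, and each
# rule must involve at least two items (so no empty left-hand sides).
rules <- apriori(
  trans,
  parameter = list(support = 0.4, confidence = 0.8, minlen = 2),
  control = list(verbose = FALSE)
)

# Print the rules that were found, strongest lift first.
inspect(sort(rules, by = "lift"))
```

Lowering support or confidence returns more rules; on real data these thresholds usually need some experimentation.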
