
Practical Predictive Analytics by Ralph Winters


Filtering out single-item transactions

Association rules need baskets of items, so we will filter out the transactions that have only one item per invoice. Single-item invoices might be useful for a separate analysis of customers who purchased only one item, but they do not help with finding associations between multiple items, which is the goal of this exercise.

  • Let's use sqldf to find all of the single-item transactions, and then create a separate dataframe containing the number of items per customer invoice:
        library(sqldf)
  • First, query how many distinct invoices there are. The result shows 25900 separate invoices:
        sqldf("select count(distinct InvoiceNo) from OnlineRetail") ...
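
The filtering idea described above can be sketched on a small example. This is a minimal illustration under stated assumptions, not the book's own code: the toy OnlineRetail data frame, the Description column, and the names items_per_invoice, n_items, and baskets are hypothetical, and each row is assumed to represent one line item on an invoice. The main path uses base R so that it runs without extra packages; the guarded block at the end shows how the same per-invoice count might be expressed with sqldf when that package is installed.

```r
# Toy stand-in for OnlineRetail (hypothetical data, one row per line item).
OnlineRetail <- data.frame(
  InvoiceNo   = c("536365", "536365", "536366",
                  "536367", "536367", "536367"),
  Description = c("LANTERN", "HEART HOLDER", "HAND WARMER",
                  "BIRD ORNAMENT", "PLAYHOUSE", "TEA SET"),
  stringsAsFactors = FALSE
)

# Count the number of line items on each invoice.
items_per_invoice <- aggregate(Description ~ InvoiceNo,
                               data = OnlineRetail, FUN = length)
names(items_per_invoice)[2] <- "n_items"

# Keep only rows that belong to invoices with more than one item.
multi_item <- items_per_invoice$InvoiceNo[items_per_invoice$n_items > 1]
baskets <- OnlineRetail[OnlineRetail$InvoiceNo %in% multi_item, ]

# Optional: the same per-invoice count via sqldf, if it is installed.
if (requireNamespace("sqldf", quietly = TRUE)) {
  library(sqldf)
  counts_sql <- sqldf("select InvoiceNo, count(*) as n_items
                       from OnlineRetail group by InvoiceNo")
}
```

In this toy data, invoice 536366 has a single line item, so its row is dropped from baskets while the other two invoices are kept.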
