Practical Predictive Analytics by Ralph Winters

Method one – Coercing a dataframe to a transaction file

Now we are ready to coerce the dataframe. We will create a temporary dataframe containing just the transaction ID (InvoiceNo) and the descriptor (lastword).
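As a rough illustration of what such a temporary dataframe looks like and how it can be grouped by transaction, here is a minimal base R sketch. The data values and the names `toy`, `tmp`, and `baskets` are invented for this example and are not taken from the book; only the column names InvoiceNo and lastword come from the text.

```r
# Toy stand-in for OnlineRetail2; all values are invented for illustration.
toy <- data.frame(
  InvoiceNo = c("A1", "A1", "A1", "B2", "B2"),
  Quantity  = c(6, 2, 1, 4, 3),
  lastword  = c("HOLDER", "LANTERN", "HOLDER", "BOTTLE", "LANTERN"),
  stringsAsFactors = FALSE
)

# Temporary dataframe: keep only the transaction ID and the item descriptor,
# and drop repeated (InvoiceNo, lastword) pairs.
tmp <- unique(toy[, c("InvoiceNo", "lastword")])

# Group the descriptors by invoice: one character vector of items per transaction.
baskets <- split(tmp$lastword, tmp$InvoiceNo)
print(baskets)
```

A list of item vectors like `baskets` is one of the inputs that the arules package can coerce with `as(baskets, "transactions")`, assuming arules is installed and loaded; whether this matches the exact steps used in the book is an assumption.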

First, we will verify the column names and positions of these two variables. Running the colnames function on OnlineRetail2 shows that they correspond to columns 1 and 11 of the dataframe:

colnames(OnlineRetail2)
>  [1] "InvoiceNo"   "StockCode"   "Description" "Quantity"    "InvoiceDate"
>  [6] "UnitPrice"   "CustomerID"  "Country"     "itemcount"   "Desc2"
> [11] "lastword"    "firstword"
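Reading positions off printed output by eye is easy to get wrong, so the indices can also be looked up by name. The sketch below uses a plain character vector, `cols`, as a stand-in for `colnames(OnlineRetail2)`; the variable names `cols` and `idx` are hypothetical.

```r
# Stand-in for colnames(OnlineRetail2), in the order printed above.
cols <- c("InvoiceNo", "StockCode", "Description", "Quantity", "InvoiceDate",
          "UnitPrice", "CustomerID", "Country", "itemcount", "Desc2",
          "lastword", "firstword")

# match() returns the position of each requested name within cols.
idx <- match(c("InvoiceNo", "lastword"), cols)
print(idx)  # positions 1 and 11
```

With the real dataframe, the same idea would be `match(c("InvoiceNo", "lastword"), colnames(OnlineRetail2))`, and selecting the columns by name avoids numeric indices altogether.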

As a double-check, display the first 5 rows of these two columns, using the indices found previously:

kable(head(OnlineRetail2[, c(1, 11)], 5)) 

First, create the ...
