Data download and exploration

When you go to the preceding link, there are a few different data options; the one we will use is called Let’s Get Sort-of-Real. This dataset covers over two years of transactions for a fictional retail loyalty scheme. The purchases are linked by basket ID and customer code, so we can track each customer's transactions over time. Several download options are available, including the full dataset, which is 4.3 GB zipped and over 40 GB unzipped. For our first models, we will use the smallest option, titled All transactions for a randomly selected sample of 5,000 customers; this is 1/100th the size of the full database.
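To make the idea of linking purchases by basket ID and customer code concrete, here is a minimal sketch in base R. It does not read the downloaded files; it builds a tiny toy data frame with invented values. The column names (CUST_CODE, BASKET_ID, SHOP_DATE, SPEND) are assumptions made for illustration and should be checked against the headers of the files you actually download.

```r
# Toy stand-in for the transaction data. All values are invented for
# illustration, and the column names are assumptions, not confirmed headers.
transactions <- data.frame(
  CUST_CODE = c("C1", "C1", "C1", "C2", "C2"),
  BASKET_ID = c("B1", "B1", "B2", "B3", "B4"),
  SHOP_DATE = as.Date(c("2006-04-13", "2006-04-13", "2006-05-02",
                        "2006-04-20", "2006-06-11")),
  SPEND     = c(1.50, 2.25, 4.00, 3.10, 0.99),
  stringsAsFactors = FALSE
)

# Collapse item-level rows into one row per basket (one shopping trip).
baskets <- aggregate(SPEND ~ CUST_CODE + BASKET_ID + SHOP_DATE,
                     data = transactions, FUN = sum)

# Order each customer's baskets by date to see their history over time.
baskets <- baskets[order(baskets$CUST_CODE, baskets$SHOP_DATE), ]
print(baskets)

# Summarise per customer: number of baskets and total spend.
n_baskets   <- aggregate(BASKET_ID ~ CUST_CODE, data = baskets, FUN = length)
total_spend <- aggregate(SPEND ~ CUST_CODE, data = baskets, FUN = sum)
per_customer <- merge(n_baskets, total_spend, by = "CUST_CODE")
names(per_customer) <- c("CUST_CODE", "n_baskets", "total_spend")
print(per_customer)
```

The same pattern should carry over to the real sample once it is loaded, for example with read.csv, provided the column names are adjusted to match the files.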

I wish to thank dunnhumby for releasing this dataset and ...
