This section builds on the earlier binary classification task and aims to increase the accuracy of that model. The first thing we can do to improve the model is to use more data, 100 times more data in fact! We will download the entire dataset, which is over 4 GB of data in zip files and 40 GB of data when the files are unzipped. Go back to the download link (https://www.dunnhumby.com/sourcefiles), select Let’s Get Sort-of-Real again, and download all the files for the Full dataset. There are nine files to download, and the CSV files should be unzipped into the dunnhumby/in folder. Remember to check that the CSV files are in this folder and not in a subfolder. You need to run the code in Chapter4/prepare_data.R ...
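The folder check mentioned above can be done from R before running any preparation code. The following is only a minimal sketch: the helper name check_csv_location is hypothetical and not part of the book's code, and it assumes the working directory is the one that contains the dunnhumby folder.

```r
# Hypothetical helper (not from the book's code): report which CSV files sit
# directly in the input folder and which ended up in a subfolder.
check_csv_location <- function(in_dir = file.path("dunnhumby", "in")) {
  # All CSV files at any depth below in_dir, as paths relative to in_dir
  all_csv <- list.files(in_dir, pattern = "\\.csv$",
                        recursive = TRUE, ignore.case = TRUE)
  # Files directly in in_dir have no directory component
  top_level <- all_csv[dirname(all_csv) == "."]
  nested <- setdiff(all_csv, top_level)
  list(top_level = top_level, nested = nested)
}

# Assumes the working directory contains the dunnhumby folder
res <- check_csv_location()
cat(length(res$top_level), "CSV file(s) directly in the folder\n")
if (length(res$nested) > 0) {
  cat("CSV file(s) found in subfolders, move these up:\n")
  print(res$nested)
}
```

If the second list is not empty, the unzip step created a subfolder and those files should be moved up into dunnhumby/in before continuing.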
Improving the binary classification model