Understanding the dataset

This section describes the input dataset that we use to develop the application. You can find the dataset at https://github.com/jalajthanaki/credit-risk-modelling/tree/master/data.

Let's look at the dataset and its attributes in detail. The dataset contains the following files:

  • cs-training.csv
    • The records in this file are used to train our machine learning models, so this is our training dataset.
  • cs-test.csv
    • The records in this file are used to test our machine learning models, so this is our testing dataset.
  • Data Dictionary.xls
    • This file describes each attribute of the dataset, so we refer to it as our data dictionary.
  • sampleEntry.csv
    • This file gives us an idea about the format in which we need to generate ...
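
Before going through the attributes, it can help to take a quick look at the CSV files listed above. The following is a minimal sketch, not code from the book: the helper name summarize_csv and the local data folder are assumptions made for illustration, and the sketch only assumes that each CSV file starts with a header row. It uses Python's standard csv module to report the column names and the number of records in a file.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column_names, record_count) for a CSV file with a header row.

    Hypothetical helper for illustration; it is not part of the dataset
    repository.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # first row holds the attribute names
        record_count = sum(1 for _ in reader)  # every remaining row is a record
    return header, record_count


if __name__ == "__main__":
    # Assumption: the repository's data folder was downloaded to ./data
    data_dir = Path("data")
    for name in ("cs-training.csv", "cs-test.csv"):
        csv_path = data_dir / name
        if csv_path.exists():
            columns, rows = summarize_csv(csv_path)
            print(f"{name}: {rows} records, {len(columns)} columns")
            print("  columns:", columns)
        else:
            print(f"{name}: not found under {data_dir}")
```

Comparing the printed column names against Data Dictionary.xls is a simple way to check that every attribute in the CSV files has a description.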
