November 2018
Intermediate to advanced
322 pages
7h 54m
English
For this project, we are going to use the credit card fraud dataset from Kaggle (https://www.kaggle.com/mlg-ulb/creditcardfraud; see Andrea Dal Pozzolo, Olivier Caelen, Reid A. Johnson, and Gianluca Bontempi, "Calibrating Probability with Undersampling for Unbalanced Classification," in Symposium on Computational Intelligence and Data Mining (CIDM), IEEE, 2015). It consists of credit card transactions made over two days by European cardholders. The dataset is highly imbalanced: it contains approximately 284,000 transactions, of which only 492 are fraudulent (0.172% of the total).
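To make the imbalance concrete, here is a minimal sketch that computes the share of each class from a list of labels. It does not load the Kaggle file; the labels are a synthetic stand-in built from the rounded figures quoted above (492 frauds out of an assumed total of 284,000), so the printed percentage will differ slightly from the 0.172% computed on the exact row count. The function name `class_share` and the 0/1 label encoding are assumptions made for illustration.

```python
from collections import Counter


def class_share(labels):
    """Return a dict mapping each label to its fraction of all rows."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Synthetic stand-in for the label column (assumption: 1 = fraud, 0 = genuine).
# The total of 284,000 is the rounded figure from the text, not the exact row count.
n_total = 284_000
n_fraud = 492
labels = [1] * n_fraud + [0] * (n_total - n_fraud)

shares = class_share(labels)
print(f"fraud share:   {shares[1]:.3%}")
print(f"genuine share: {shares[0]:.3%}")
```

With the real data, the same function could be applied to the label column after loading the CSV. A share this small is why plain accuracy is a poor metric here: a model that always predicts "genuine" would be right more than 99.8% of the time.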
There are 31 numerical columns in the dataset. Two of them are time and amount. Time denotes the amount of time elapsed (in seconds) ...