Summary
In this chapter, we treated fraud detection as an anomaly detection task rather than a supervised learning problem. We did not use random forests, decision trees, or logistic regression (LR). Instead, we used the Gaussian distribution to build an algorithm that separates fraudulent from non-fraudulent transactions by identifying anomalous samples. The importance of picking an appropriate epsilon, the threshold below which a sample is treated as anomalous, cannot be overstated. With a poorly chosen epsilon, the algorithm can go off the mark and label non-fraudulent examples as anomalies or outliers that indicate a fraudulent transaction. In short, tuning the epsilon parameter carefully leads to better fraud detection.
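To make the role of epsilon concrete, here is a minimal sketch of Gaussian anomaly detection in Python with NumPy. It is not the chapter's own code. It assumes each feature is modeled by an independent univariate Gaussian and that a sample is flagged when its estimated density falls below epsilon. The function names, the synthetic data, and the two epsilon values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_gaussian(X):
    """Estimate the per-feature mean and variance from the training rows."""
    return X.mean(axis=0), X.var(axis=0)


def density(X, mu, var):
    """Density of each row, assuming independent Gaussian features."""
    coef = 1.0 / np.sqrt(2.0 * np.pi * var)
    expo = np.exp(-((X - mu) ** 2) / (2.0 * var))
    # Multiply the per-feature densities to get one value per row.
    return np.prod(coef * expo, axis=1)


def flag_anomalies(X, mu, var, epsilon):
    """Return a boolean mask: True where the density is below epsilon."""
    return density(X, mu, var) < epsilon


# Synthetic, made-up data: 500 ordinary samples with two features,
# plus one sample that lies far from the bulk of the data.
rng = np.random.default_rng(0)
normal = rng.normal(loc=[50.0, 10.0], scale=[5.0, 2.0], size=(500, 2))
suspicious = np.array([[120.0, 30.0]])

mu, var = fit_gaussian(normal)

# Hypothetical thresholds: a larger epsilon flags more samples.
tight_eps, loose_eps = 1e-4, 1e-3
flagged_tight = int(flag_anomalies(normal, mu, var, tight_eps).sum())
flagged_loose = int(flag_anomalies(normal, mu, var, loose_eps).sum())
suspicious_flagged = bool(flag_anomalies(suspicious, mu, var, tight_eps)[0])

print("ordinary samples flagged (tight epsilon):", flagged_tight)
print("ordinary samples flagged (loose epsilon):", flagged_loose)
print("distant sample flagged:", suspicious_flagged)
```

Raising epsilon can only add to the set of flagged samples, so a threshold that is too large starts marking ordinary transactions as anomalies, while one that is too small risks letting genuine outliers through. In practice, epsilon would typically be chosen against a labeled validation set rather than fixed by hand as it is here.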
A good part of the computational ...