Chapter 6. No Phishing Allowed!

One of the most important uses for anomaly detection is to identify potentially fraudulent behavior, which reduces the risk of loss and improves security. The nefarious behaviors to be found could be credit card fraud, identity theft, or phishing attacks on a secure website such as an online banking site. The challenge is not only to create an effective model and alert system but also to stay one step (or even two) ahead of the fraudsters. As you find ways to foil their attacks, they keep looking for new ways to commit theft. In this situation, you need agility, innovation, and approaches that are both cost-effective and practical.

Let’s take a look at a method that lets a machine-learning model quickly identify a hypothetical phishing attack on a bank site and flag it as suspicious. This example will extend the concepts of a probabilistic model that we have developed in previous chapters to situations that involve sequences of events.

The Phishing Attack

The attack is based on luring bank customers to a fake website in order to capture their private login details. The plan also relies on having the customer unknowingly type in the CAPTCHA security code on the fraudsters' behalf, something their fraud-bot script could not do without human help. A description of how the fraud might be attempted is given here and summarized in Figure 6-1.

Step 1
A huge number of customers receive an automated email that appears to be from the ...

