76 Enhance Your Business Applications: Simple Integration of Advanced Data Mining Functions
5.1 The business issue
Fraudulent behavior is detected when a telecommunications company tries to bill
a customer but never receives any payment. These cases can be evaluated
manually and classified as fraud, customer bankruptcy, or a technical
processing error. But fraudulent customers are creative and use techniques that
were not previously recognized as characteristic of fraud. Detecting these cases
as early as possible requires the ability to detect unusual behavior without
knowing in advance which features make a certain behavior unusual.
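To illustrate what detecting unusual behavior without predefined fraud features can look like, here is a minimal sketch that scores each phone number against its peers on every available feature. It is not the technique used in this scenario; the robust z-score, the 3.5 threshold, the feature names, and the sample data are all assumptions chosen for illustration.

```python
from statistics import median


def robust_z_scores(values):
    """Distance of each value from the median, in units of the
    median absolute deviation (MAD)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread in the peer group: nothing is unusual on this feature.
        return [0.0] * len(values)
    # 0.6745 rescales the MAD so that scores are comparable to ordinary
    # z-scores when the data is roughly normal.
    return [0.6745 * (v - med) / mad for v in values]


def flag_unusual(records, threshold=3.5):
    """Return {record_id: [features on which the record is an outlier]}.

    No feature is designated as a fraud indicator in advance; every
    feature is scored against the peer population in the same way.
    The threshold is an assumed, commonly used cutoff.
    """
    ids = list(records)
    features = sorted({name for rec in records.values() for name in rec})
    flagged = {}
    for name in features:
        scores = robust_z_scores([records[i][name] for i in ids])
        for record_id, score in zip(ids, scores):
            if abs(score) > threshold:
                flagged.setdefault(record_id, []).append(name)
    return flagged


# Hypothetical weekly usage per calling number (invented sample data).
usage = {
    "A": {"minutes": 12, "calls": 4},
    "B": {"minutes": 8, "calls": 3},
    "C": {"minutes": 15, "calls": 5},
    "D": {"minutes": 10, "calls": 4},
    "E": {"minutes": 9, "calls": 2},
    "F": {"minutes": 840, "calls": 60},
}

print(flag_unusual(usage))  # {'F': ['calls', 'minutes']}
```

The median and MAD are used instead of the mean and standard deviation because a single extreme caller would otherwise inflate the spread and hide itself.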
This business scenario is based on a real-life fraud detection exercise performed
by IBM data mining experts in a large telecommunications provider in Europe.
This organization was aware that it was a victim of fraud in its
Premium Rate services business unit. Premium Rate services provide expert
hotline services on such topics as:
• Advice on insurance and judicial matters
• Legal advice
• Stock market tips
• Adult services
The charge for these services can be up to 2 euro per minute and is billed by the
telecommunications company. The business model is that a company offering
such a service earns quite a high share of this rate immediately from the phone
company, whereas the phone company itself charges the caller for the entire
amount.
Given this arrangement, fraud can be carried out in the following way:
1. The (fraudulent) company offering a service via a premium number
cooperates conspiratorially with a partner.
2. The partner makes frequent and long phone calls from one or a few other
phone numbers to the (expensive) premium number.
3. A high amount of call charges accumulates within a few weeks (this task can
be technically facilitated by using a computer or an automatic dialing device to
perform the phone calls).
4. The service provider receives its comparatively high share of the call charges
from the phone company.
5. Meanwhile, the phone company tries in vain to collect its money from the
conspiratorial partner. But the partner most likely used a false name and
disappeared, so the conspiracy with the service provider cannot be proven.
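The money flow in the steps above can be sketched with a small calculation. The 2 euro per minute rate is the upper bound quoted in the text; the 50% provider share and the dialing pattern (8 hours a day for 4 weeks) are assumptions made purely for illustration, since the text only says the share is "quite high".

```python
def scheme_cash_flow(minutes, rate_per_minute, provider_share):
    """Follow the money through the premium-rate scheme.

    provider_share is the fraction of the call charge that the phone
    company pays out to the service provider (an assumed figure).
    """
    billed_to_caller = minutes * rate_per_minute
    paid_to_provider = billed_to_caller * provider_share
    collected_from_caller = 0.0  # the partner disappears without paying
    return {
        "billed_to_caller": billed_to_caller,
        "paid_to_provider": paid_to_provider,
        "phone_company_cash_loss": paid_to_provider - collected_from_caller,
    }


# Assumed dialing pattern: an automatic dialer running 8 hours a day for 28 days.
minutes = 8 * 60 * 28

result = scheme_cash_flow(minutes, rate_per_minute=2.0, provider_share=0.5)
for name, euros in result.items():
    print(f"{name}: {euros:,.2f} EUR")
```

Under these assumptions, a single calling number generates 26,880 EUR in charges in four weeks, of which 13,440 EUR has already been paid out to the service provider by the time the bill goes unpaid.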
Chapter 5. Fraud detection example 77
In the past, fraud detection models were typically created in a workbench
environment by data mining experts using algorithms that produce predictive
models. There are a few practical inconveniences with this approach:
• Effort to model fraud
In general, fraudulent behavior is relatively scarce, and labeled fraudulent
behavior is even more so. With a classification approach, fraud has to be
identified up front and fraudulent cases must be labeled in the data. In practice,
this is both time consuming and resource intensive for most organizations.
Even where a business can accurately identify fraudulent behavior,
fraudulent transactions typically make up less than 1% of the total number of
transactions. Analyzing rare cases, such as fraud, requires artificial injection
of data for modeling purposes, which in turn requires more planning,
documentation, and effort.
• Timeliness of the fraud detection model
It usually takes a significant amount of time to produce the model. By the time
it is deployed to the fraud detection system, the model has lost some of its
predictive power. The result is fraud detection models that perform well in the
workbench but fail when deployed to the business.
• Shelf life of the fraud model
In addition, because fraud is elusive by nature, improper behaviors change over
time, and at increasing speed. It is typically nontrivial to define known
fraudulent behavior and model it in a timely manner. By the time the model is
deployed, manual recalibration of model performance may be required. The
fraud model takes a long time to build but has only a short useful life, or shelf
life. This is similar to perishable goods, which must be used by a certain
date and are no longer good after it.
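The scarcity problem described under "Effort to model fraud" can be made concrete with a small calculation. The sketch below assumes a hypothetical data set of 10,000 transactions, 50 of them fraudulent (0.5%, consistent with the "less than 1%" figure above), and evaluates a trivial model that labels every transaction as legitimate.

```python
def evaluate(labels, predictions):
    """Return (accuracy, fraud recall) for boolean labels, where True = fraud."""
    pairs = list(zip(labels, predictions))
    correct = sum(1 for actual, predicted in pairs if actual == predicted)
    true_pos = sum(1 for actual, predicted in pairs if actual and predicted)
    false_neg = sum(1 for actual, predicted in pairs if actual and not predicted)
    accuracy = correct / len(pairs)
    fraud_cases = true_pos + false_neg
    recall = true_pos / fraud_cases if fraud_cases else 0.0
    return accuracy, recall


# Hypothetical data: 50 fraudulent transactions out of 10,000.
labels = [True] * 50 + [False] * 9950

# A "model" that never predicts fraud.
always_legitimate = [False] * len(labels)

accuracy, recall = evaluate(labels, always_legitimate)
print(f"accuracy={accuracy:.3f}, fraud recall={recall:.3f}")
# accuracy=0.995, fraud recall=0.000
```

A model can therefore look almost perfect on overall accuracy while catching no fraud at all, which is one reason rare cases need extra preparation, such as the data injection mentioned above, before a classifier is useful.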
Note: This kind of fraud falls into a category of fraud commonly known as
ghosting in the telecommunications industry. This kind of fraudulent behavior
is also common in other industries where service providers charge for
services that were never carried out.
The technique used in this scenario may equally apply to other industries
where there is a high incidence of fraud.
