Mining Your Own Business in Telecoms Using DB2 Intelligent Miner for Data

Chapter 5. Can you predict the customers who are likely to leave?

5.3 Sourcing and preprocessing the data

To create our data model, we have to take the raw data that we collect and convert it into the format required by the data models. We call this stage sourcing and preprocessing the data, and it is the third stage in our data mining method.

However, before the data is sourced into one integrated table, view, or flat file, which is the format that data mining requires, churn prediction needs additional consideration because of the nature of prediction modeling: it predicts the future based on the past.

Determine time window

When sourcing all the data defined so far, you must specify the time frame from which the data is to be gathered.

No.  Variable name       Description

25   Total_dur           Total minutes of calls
26   Inbound_dur         Duration of inbound calls
27   Discount_share      Share of discount calls relative to regular calls
28   Complet_call        Number of calls completed over 3 months

BILLING / PAYMENT
29   Revenue             Revenue
30   Bill_amt            Amount of bill
31   Pay_delayed_before  Number of times the payment was delayed

DERIVED INDICATORS
32   Outsphere           Number of different telephone numbers for outbound calls
33   Mobility            Number of network cells visited during calls
34   Concentration       Calls to the top 2 most frequently used phone numbers relative to total calls
35   Quality             Successful calls relative to failed calls
36   Call_trend          Slope of the call minutes over N months
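Two of the derived indicators in the table, Call_trend and Concentration, can be illustrated with a short sketch. The text does not say how the slope is fitted or how the inputs are stored, so the code below makes two assumptions: Call_trend is taken as the ordinary least-squares slope of monthly call minutes against the month index, and Concentration is taken as the share of all calls that went to the two most frequently called numbers. The function names and input shapes are hypothetical and are not part of Intelligent Miner for Data.

```python
from collections import Counter
from typing import Sequence


def call_trend(minutes_per_month: Sequence[float]) -> float:
    """Least-squares slope of call minutes over N months (assumed definition).

    The months are indexed 0..N-1, oldest first. A negative result means
    that usage is declining across the window.
    """
    n = len(minutes_per_month)
    if n < 2:
        raise ValueError("need at least two months to fit a slope")
    mean_x = (n - 1) / 2
    mean_y = sum(minutes_per_month) / n
    sxy = sum((i - mean_x) * (y - mean_y)
              for i, y in enumerate(minutes_per_month))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    return sxy / sxx


def concentration(called_numbers: Sequence[str]) -> float:
    """Share of calls made to the two most frequently called numbers."""
    if not called_numbers:
        return 0.0
    counts = Counter(called_numbers)
    top_two = sum(count for _, count in counts.most_common(2))
    return top_two / len(called_numbers)
```

For example, a customer whose monthly minutes fall steadily from 150 to 100 over six months gets a Call_trend of -10 minutes per month, which is the kind of declining usage pattern a churn model may pick up on.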


You should define the following three items to decide which time frame of customer data and churn information is going to be used in the model.

• Data window: The time frame for the input variables that are used to construct the model.

• Forecasting window: The time frame for the prediction, used when sourcing the target prediction variable (the churn indicator). The churn prediction model is often referred to as a “WHO and WHEN” model, which means that it tries to answer two questions: who is going to leave the company, and when are they going to leave? The forecasting window is the “WHEN” part of churn prediction modeling. In the model building phase, the forecasting window is the time frame examined to determine whether or not the customers left the company.

• Time lag: The interval between the data window and the forecasting window.

In this case, we used six months as a data window, two months as a time lag and

one month as a forecasting window, as shown in Figure 5-1.

In the model building phase, six months of historical data, from February to July, for customers who are active as of the end of July is used together with churn information, that is, whether or not these customers left the company in October. The resulting model can then be applied to customers who are active as of the end of August to predict probable churners in November.
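The window arithmetic described above can be sketched in a few lines. This is a minimal illustration, not the book's implementation: the function and class names are hypothetical, the windows are assumed to align to whole calendar months, and the year 2001 in the usage note is an arbitrary choice because the text names only the months.

```python
import calendar
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Windows:
    data_start: date
    data_end: date
    forecast_start: date
    forecast_end: date


def _shift(year: int, month: int, delta: int) -> tuple[int, int]:
    """Move (year, month) forward or backward by delta months."""
    index = year * 12 + (month - 1) + delta
    return index // 12, index % 12 + 1


def _first_day(year: int, month: int) -> date:
    return date(year, month, 1)


def _last_day(year: int, month: int) -> date:
    return date(year, month, calendar.monthrange(year, month)[1])


def churn_windows(year: int, last_data_month: int, data_months: int = 6,
                  lag_months: int = 2, forecast_months: int = 1) -> Windows:
    """Derive the data and forecasting windows from the last data month.

    The defaults mirror the case in the text: a six-month data window,
    a two-month time lag, and a one-month forecasting window.
    """
    data_start = _shift(year, last_data_month, -(data_months - 1))
    forecast_start = _shift(year, last_data_month, lag_months + 1)
    forecast_end = _shift(year, last_data_month, lag_months + forecast_months)
    return Windows(
        data_start=_first_day(*data_start),
        data_end=_last_day(year, last_data_month),
        forecast_start=_first_day(*forecast_start),
        forecast_end=_last_day(*forecast_end),
    )
```

Calling churn_windows(2001, 7) gives a data window of February through July and a forecasting window of October, which matches the model building case. Calling churn_windows(2001, 8) shifts everything by one month, so the data window runs from March through August and the forecasting window is November, which matches the scoring case.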

Therefore, in early September, marketing personnel can get the list of customers who are likely to leave the company in November, and a two-month time frame is available for them to set up and execute the proper marketing actions.

You can decide on a data window after studying historical churn patterns. It is better to avoid certain time frames if they contain abnormal patterns caused by external influences. The time frame of the latest data available for building the prediction model is a good example of a data window.


Mining Your Own Business in Telecoms Using DB2 Intelligent Miner for Data by Corinne Baragoin
