Predictive Analytics: The Problem of Churn Forecasting
In this lesson, you’re expected to understand the concepts of predictive modeling and machine learning.
Introduction to Predictive Analytics
There are many business problems that can be addressed with predictive analytics or machine learning.
For example, a telecom company wants to infer which customers are most likely to migrate to another carrier next month. If it can anticipate this, it can run personalized retention campaigns. When a new customer walks into a bank asking for a mortgage, the manager needs to estimate the probability that this particular customer will default. Similarly, a health insurance company needs to estimate next year’s accident rate for its policies so that it can review its rates.
All of these cases have something in common – the company has historical data on past events.
For example, a telco company does not know which customers will churn next month, but it does know which customers churned the previous month, two months ago, or the year before. Information about those past events is useful not only for understanding why customers churn and what keeps them engaged with a particular carrier, but also for predicting which customers will churn in the future.
Predictive analytics and machine learning are the tools that enable us to use historical data to find patterns and use them to predict future events.
A good churn model is vital for a telecom company because acquiring new customers is often more costly than retaining existing ones. Hence, most companies have proactive customer retention policies.
Predictive Modeling: The Learning Problem
The first step in applying predictive analytics and machine learning in a company is to identify a business question that needs to be answered. The type of question, together with the type of answer we are seeking, determines the most appropriate technique.
If the business question is answered by YES/NO, we have a binary classification problem. We could also have a multiclass problem, where the answer is a categorical variable with more than two unique values. Classifiers are thus a family of techniques used when our problem has a discrete set of answers. Some examples include:
• Is a customer going to leave the company or not? This is binary yes/no.
• Given the results of a clinical test, does this patient have cancer? This is binary yes/no.
• Given the characteristics of a new customer, what product will they buy? This is multiclass.
• Given the past activity associated with a bank account, will the account fall in arrears? This is binary yes/no.
If our question is answered by estimating a numerical value, we are presented with a regression problem. Some examples include:
• Given the characteristics of an apartment, what is its expected price? How many dollars will the price increase if it includes a parking space?
• Given the past records and demographic characteristics of a policyholder, what will their accident rate be next year?
• Given the characteristics of a car, such as the brand, horsepower, and miles per gallon, what will its price be?
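The distinction between classification and regression above comes down to the type of answer. As a rough illustration, the sketch below guesses the problem type from a handful of example answers. The function name `problem_type`, the product names, and the prices are hypothetical, and the rule is only a heuristic: a numeric answer that is really a category code (for example, a product ID) would be misread as regression.

```python
from numbers import Real


def problem_type(target_values):
    """Guess the learning problem from example answers (a rough heuristic)."""
    distinct = set(target_values)
    # Yes/no answers coded as 1/0; True == 1 and False == 0, so booleans match too.
    if distinct <= {0, 1}:
        return "binary classification"
    # Any other purely numeric answer is treated as a quantity to estimate.
    if all(isinstance(v, Real) for v in distinct):
        return "regression"
    # Remaining answers are categories: two values is binary, more is multiclass.
    if len(distinct) <= 2:
        return "binary classification"
    return "multiclass classification"


print(problem_type([1, 0, 0, 1]))                      # will the customer leave?
print(problem_type(["prepaid", "postpaid", "fiber"]))  # which product will they buy?
print(problem_type([250000.0, 310500.0, 199900.0]))    # expected apartment price
```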
So the first step is to find a business problem whose outcome we need to predict. The second is to define exactly what it is that we are trying to predict.
Take, for example, the churn problem in a telco company. The company wants to know in advance which customers will leave the provider so that it can make them an offer before they actually decide to leave. It is therefore important to define exactly what the company understands by churn. We can define churn as customers who will migrate from the current provider to another one within the next month.
However, note that the company could have used an alternative definition of churn, which would address a different business problem.
For example, another definition could be: lines that become inactive, i.e. lines that are not used for three consecutive weeks.
Both definitions of churn are valid; however, they solve different business problems. Narrowing down the business problem and defining the question and its answer precisely is therefore vital before one starts to build a model.
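To see how the two definitions of churn can label the same line differently, consider the minimal sketch below. The `Line` record, its field names, the phone numbers, and the dates are all hypothetical, and "within the next month" is approximated as 30 days, which is an assumption rather than part of the definition.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Line:
    number: str
    last_used: date                # most recent recorded use of the line
    ported_out_on: Optional[date]  # date the line moved to another provider, if any


def churned_by_migration(line, as_of, window_days=30):
    """Definition 1: the line migrates to another provider within the next month."""
    if line.ported_out_on is None:
        return False
    return as_of < line.ported_out_on <= as_of + timedelta(days=window_days)


def churned_by_inactivity(line, as_of, idle_days=21):
    """Definition 2: the line has not been used for three consecutive weeks."""
    return (as_of - line.last_used).days >= idle_days


lines = [
    Line("555-0101", last_used=date(2024, 3, 30), ported_out_on=date(2024, 4, 10)),
    Line("555-0102", last_used=date(2024, 3, 1), ported_out_on=None),
]
as_of = date(2024, 4, 1)

for line in lines:
    print(line.number,
          churned_by_migration(line, as_of),
          churned_by_inactivity(line, as_of))
```

With this made-up data, the first line counts as churn only under the migration definition and the second only under the inactivity definition, so a model trained on one target would be answering a different question from a model trained on the other.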
The Churn Case Example
Once we have defined the problem and the answer to that problem, we build our target variable, which is the answer to the question.
In our churn example, the target is whether the customer switches their line to another provider. This is a binary target, where 1 means that the line migrated and 0 that it did not.
Thus, in a supervised classification problem, given a set of examples with their corresponding labels or targets, our goal is to predict which of a predefined, discrete set of classes a given instance belongs to.
In our example, we need to assign each telephone number a label indicating whether it migrates or not. We also need to create a set of predictors.
Predictors are the variables that we use to describe each observation and from which we predict the target. For example, in the churn case, we can characterize each phone number by the total day calls, number of text messages received/sent, total night calls, internet plan, city of residence, etc.
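To make predictors and target concrete, the sketch below arranges a few made-up telephone lines into a predictor matrix X and a target vector y. The field names, phone numbers, cities, and values are hypothetical; only the kinds of attribute (day calls, night calls, text messages, internet plan, city) come from the churn example.

```python
# Hypothetical raw records: one dictionary per telephone line.
raw_lines = [
    {"phone": "555-0101", "day_calls": 110, "night_calls": 45, "texts": 230,
     "internet_plan": "yes", "city": "Madrid", "migrated": True},
    {"phone": "555-0102", "day_calls": 95, "night_calls": 60, "texts": 12,
     "internet_plan": "no", "city": "Lisbon", "migrated": False},
    {"phone": "555-0103", "day_calls": 130, "night_calls": 20, "texts": 75,
     "internet_plan": "yes", "city": "Madrid", "migrated": False},
]

# Columns used to describe each line; the phone number only identifies the row.
PREDICTORS = ["day_calls", "night_calls", "texts", "internet_plan", "city"]


def build_table(records):
    """Split records into predictor rows X and a binary target y (1 = migrated)."""
    X = [[record[name] for name in PREDICTORS] for record in records]
    y = [1 if record["migrated"] else 0 for record in records]
    return X, y


X, y = build_table(raw_lines)
for row, label in zip(X, y):
    print(row, "->", label)
```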
At this point, we create a table with data about our customers, where each row represents a telephone line and the columns contain customer attributes, including the target variable.
The basic pipeline for supervised learning then involves two main steps:
• Training: Given a set of data instances X and their corresponding label Y, we want to fit a model that captures the underlying trends and patterns.
• Testing or exploitation: Given a model, we want to apply it to new unseen data in order to predict its label. Usually we aim to predict future values.
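The two steps above can be sketched in a few lines of plain Python. The model here is a decision stump (a single threshold on a single predictor), chosen only because it is short enough to read in full; it is not meant as a realistic churn model. The function names, the data, and the choice of predictors (total day calls and total night calls) are made up for illustration.

```python
def train_stump(X, y):
    """Training: search for the single-feature rule with the fewest errors on (X, y)."""
    best = None  # (errors, feature index, threshold, label assigned above threshold)
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            for label_above in (0, 1):
                preds = [label_above if row[j] > threshold else 1 - label_above
                         for row in X]
                errors = sum(p != t for p, t in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, j, threshold, label_above)
    _, j, threshold, label_above = best
    return {"feature": j, "threshold": threshold, "label_above": label_above}


def predict(model, X):
    """Testing or exploitation: apply the fitted rule to rows the model has not seen."""
    j, threshold, above = model["feature"], model["threshold"], model["label_above"]
    return [above if row[j] > threshold else 1 - above for row in X]


# Made-up historical data: [total day calls, total night calls] per line,
# with target 1 = the line migrated and 0 = it did not.
X_train = [[120, 40], [15, 35], [95, 50], [10, 42], [130, 38], [20, 45]]
y_train = [0, 1, 0, 1, 0, 1]

model = train_stump(X_train, y_train)

# New, unseen lines whose label we want to predict.
X_new = [[12, 44], [110, 41]]
print(model)
print(predict(model, X_new))
```

In practice a library model would take the place of the stump, but the separation stays the same: one function learns from labeled historical rows, and another applies what was learned to rows without a label.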