Predicting the likelihood of customer churn in B2C
It is widely accepted that retaining an existing customer is more valuable than acquiring a new one. Predicting customer churn and taking preventive action is therefore central to the marketing strategy of any business.
The methodology for predicting customer churn depends on the business model. Churn prediction in a subscription business is more straightforward: the model can be built from each customer's usage pattern and demographics. See our earlier blog for details of creating such a churn model.
Ecommerce is arguably the toughest business model for predicting customer churn. A customer who appears to have churned can return, say one year later, and make another transaction. That return invalidates any churn model that had already treated the customer as churned.
However, there is a proven methodology for predicting churn in such businesses. It is based on the RFM (Recency-Frequency-Monetary) framework, which has a track record of over fifty years as a dependable customer value framework and is backed by sound academic research. Here is a quick overview of the three variables involved in RFM:
Recency: Days since the last purchase by the customer
Frequency: Number of transactions by the customer in a period, ideally the last full year.
Monetary: Total amount spent by the customer in the above period.
Refer to our earlier blog on RFM for more details.
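As a rough illustration of the three variables above, here is a minimal sketch in plain Python. The function name compute_rfm, the (customer_id, purchase_date, amount) record layout, and the sample customers are assumptions made for this example. It also assumes that recency is measured from the customer's most recent purchase even when that purchase falls outside the one-year window.

```python
from collections import defaultdict
from datetime import date, timedelta

def compute_rfm(transactions, as_of, window_days=365):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer_id, purchase_date, amount) tuples
    (a hypothetical layout chosen for this sketch).
    """
    window_start = as_of - timedelta(days=window_days)
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, purchase_date, amount in transactions:
        if purchase_date > as_of:
            continue  # ignore anything after the snapshot date
        # Recency uses the latest purchase, even one outside the window.
        if customer_id not in last_purchase or purchase_date > last_purchase[customer_id]:
            last_purchase[customer_id] = purchase_date
        # Frequency and monetary only count purchases inside the window.
        if purchase_date > window_start:
            frequency[customer_id] += 1
            monetary[customer_id] += amount
    return {
        cid: ((as_of - last).days, frequency[cid], monetary[cid])
        for cid, last in last_purchase.items()
    }

# Illustrative data only.
transactions = [
    ("alice", date(2023, 12, 1), 40.0),
    ("alice", date(2023, 6, 15), 25.0),
    ("bob", date(2023, 2, 10), 120.0),
    ("carol", date(2022, 11, 5), 60.0),
]
rfm = compute_rfm(transactions, as_of=date(2023, 12, 31))
```

With this sample data, carol has no purchases in the last year, so her frequency and monetary values are zero while her recency keeps growing. This is exactly the kind of customer who may or may not come back.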
The churn prediction model should be created in conjunction with RFM analysis. The churn model itself is a regression equation that takes each customer's raw recency, frequency, and monetary values as inputs and predicts the dependent variable "likelihood to churn". The dependent variable ranges from 0 to 1, with 1 being the highest likelihood of churn.
Likelihood to churn is derived from the customer's rank in the RFM analysis: the customer lowest in the RFM list gets a normalized value of 1, and the customer highest in the list gets 0.
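The rank-to-likelihood step can be sketched as follows. The text does not say how customers are ordered within the RFM list, so the ordering rule here (lowest recency first, then highest frequency, then highest monetary value) is an assumption, as are the function name churn_targets and the sample values.

```python
def churn_targets(rfm):
    """Map {customer_id: (recency, frequency, monetary)} to likelihoods in [0, 1].

    The top-ranked customer gets 0.0 and the bottom-ranked customer gets 1.0.
    """
    # Assumed ranking rule: recent buyers first, ties broken by higher
    # frequency and then higher spend. Substitute your own RFM ordering.
    ordered = sorted(rfm, key=lambda cid: (rfm[cid][0], -rfm[cid][1], -rfm[cid][2]))
    n = len(ordered)
    if n == 1:
        return {ordered[0]: 0.0}  # a single customer has no relative rank
    # Normalize the rank position so it spans exactly 0 to 1.
    return {cid: position / (n - 1) for position, cid in enumerate(ordered)}

# Illustrative values only.
rfm = {
    "alice": (30, 2, 65.0),
    "bob": (324, 1, 120.0),
    "carol": (421, 0, 0.0),
}
targets = churn_targets(rfm)
```

Here alice ranks highest and receives 0.0, carol ranks lowest and receives 1.0, and bob lands in the middle at 0.5.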
The benefit of this methodology is that, once the regression equation has been derived, churn probabilities can be calculated at any time without repeating the full RFM analysis. Predictions can be made for an individual customer or for the entire customer population.
To improve the regression model further, you can add the following variables, as explained in a relevant research paper (link):
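To make the idea of a reusable regression equation concrete, here is a minimal sketch that fits an ordinary least squares model and then scores a customer from raw RFM values alone. Linear least squares is an assumption, since the text only says "regression". The names fit_linear and predict_churn and the sample numbers are hypothetical. Predictions are clipped because a linear model can fall outside the 0 to 1 range.

```python
def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations.

    rows: list of (recency, frequency, monetary); targets: list of floats.
    Returns [intercept, b_recency, b_frequency, b_monetary].
    Assumes the predictors are not perfectly collinear.
    """
    X = [[1.0, *row] for row in rows]
    k = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, targets))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict_churn(coeffs, recency, frequency, monetary):
    """Apply the fitted equation and clip the result to [0, 1]."""
    raw = coeffs[0] + coeffs[1] * recency + coeffs[2] * frequency + coeffs[3] * monetary
    return min(1.0, max(0.0, raw))

# Illustrative training data: raw RFM values and rank-derived targets.
rows = [(10, 8, 400.0), (60, 5, 250.0), (200, 2, 90.0), (330, 1, 30.0), (120, 3, 200.0)]
targets = [0.0, 0.25, 0.75, 1.0, 0.5]
coeffs = fit_linear(rows, targets)

# Score a customer at any later date without redoing the RFM ranking.
score = predict_churn(coeffs, recency=150, frequency=2, monetary=80.0)
```

In practice you would fit on the full customer base and might prefer a statistics library over hand-rolled elimination. The point of the sketch is only that scoring needs nothing beyond the stored coefficients.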
- Length of relationship: Days from the first transaction till present
- Average inter-purchase time: Divide 365 days by the number of transactions in the same period
- Last inter-purchase time: Days between the last two transactions
- Number of categories purchased: This reflects the breadth of the customer's purchases
- Customer type flag: Consumer versus business customer
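The additional variables listed above could be derived from a customer's purchase history roughly as follows. The function name extra_features, the (purchase_date, category) record layout, and the sample purchases are assumptions made for this sketch. The customer type flag is left out because it is an attribute you already hold, not something computed from transactions.

```python
from datetime import date

def extra_features(purchases, as_of, window_days=365):
    """Derive the extra variables for one customer.

    purchases: non-empty list of (purchase_date, category) tuples
    (a hypothetical layout chosen for this sketch).
    """
    dates = sorted(d for d, _ in purchases)
    # Purchases that fall inside the trailing window ending at as_of.
    in_window = [d for d in dates if 0 <= (as_of - d).days < window_days]
    return {
        # Days from the first transaction until the snapshot date.
        "length_of_relationship": (as_of - dates[0]).days,
        # Window length divided by the number of transactions in the window.
        "avg_inter_purchase_time": window_days / len(in_window) if in_window else None,
        # Days between the last two transactions, if there are at least two.
        "last_inter_purchase_time": (dates[-1] - dates[-2]).days if len(dates) > 1 else None,
        # Number of distinct categories ever purchased.
        "categories_purchased": len({c for _, c in purchases}),
    }

# Illustrative data only.
purchases = [
    (date(2022, 3, 1), "books"),
    (date(2023, 5, 1), "toys"),
    (date(2023, 11, 1), "books"),
    (date(2023, 12, 1), "garden"),
]
features = extra_features(purchases, as_of=date(2023, 12, 31))
```

Customers with no purchases in the window, or with only one purchase overall, get None for the corresponding inter-purchase value, so you would need to decide how to encode those cases before feeding them to the regression.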
Once a satisfactory churn prediction model is created, it needs to be applied periodically (say quarterly) to the customer population. This lets you prioritize which customers to reach out to in order to prevent possible churn.
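A periodic scoring run might end with a simple prioritization step like the sketch below. The function name prioritize, the 0.7 cut-off, and the sample scores are hypothetical, and the right threshold depends on how many customers your team can realistically contact.

```python
def prioritize(scores, threshold=0.7, limit=None):
    """Return (customer_id, score) pairs at or above threshold, riskiest first.

    scores: {customer_id: churn likelihood in [0, 1]}.
    limit: optional cap on how many customers to return.
    """
    at_risk = [(cid, s) for cid, s in scores.items() if s >= threshold]
    at_risk.sort(key=lambda item: item[1], reverse=True)
    return at_risk[:limit]

# Illustrative scores from one quarterly run.
scores = {"alice": 0.10, "bob": 0.85, "carol": 0.95, "dave": 0.70}
outreach = prioritize(scores)
top_one = prioritize(scores, limit=1)
```

With these sample scores the outreach list is carol, bob, then dave, and alice is left out because her score is below the cut-off.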
The churn prediction model itself should be refined in conjunction with RFM analysis, which is usually done once or twice a year.