Reducing Customer Churn in SaaS Companies Using Data Science

The need to reduce customer churn is now common wisdom: it is estimated that retaining a customer is 5 to 20 times more profitable than acquiring a new one.

Data science can be used to predict customer churn, but the approach differs by business model. For example, a B2B SaaS company like Salesforce needs a different approach from a B2C subscription business like Netflix or a B2C non-subscription business like eBay.

For this discussion, let us focus on churn at B2B SaaS companies that have annual contracts with their customers. Our goal is to predict the likelihood of churn at the end of the contract. Assume we need to know this likelihood three months before the contract ends so that we can take preventive action.

A key principle of data science is that the data used to build a model must have the same structure as the data used for prediction. Because predictions will be made three months before the end of the contract, the training data must describe each customer as of three months before their contract ended. Each record also needs the known outcome at the end of that contract (i.e. churn versus no churn).
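
The snapshot idea above can be sketched in a few lines of Python. This is only an illustration: the 90-day approximation of "three months", the `(event_date, description)` event format, and the function names are all assumptions made for the example.

```python
from datetime import date, timedelta

# Assumption: "three months" is approximated as 90 days before contract end.
SNAPSHOT_LEAD = timedelta(days=90)

def snapshot_date(contract_end: date) -> date:
    """Date on which a customer's features are frozen."""
    return contract_end - SNAPSHOT_LEAD

def build_training_row(contract_end, events, renewed):
    """Keep only events known on the snapshot date and attach the outcome.

    events: list of (event_date, description) tuples (hypothetical format)
    renewed: True if the customer renewed at the end of the contract
    """
    cutoff = snapshot_date(contract_end)
    visible = [e for e in events if e[0] <= cutoff]
    return {"snapshot_date": cutoff, "n_events": len(visible), "churned": not renewed}

row = build_training_row(
    date(2018, 12, 31),
    [(date(2018, 6, 1), "support ticket"), (date(2018, 11, 15), "cancellation inquiry")],
    renewed=False,
)
print(row)
```

The November event is dropped because it happened after the snapshot date. Including it would give the model information that will not be available when real predictions are made.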

In the B2B world especially, the main obstacle to effective modeling is incomplete and inconsistent data. Essential work before any modeling exercise is to make sure the data is in good shape. Third-party tools such as Zoominfo can help with this.

The essential data for churn modeling falls into the following categories, each shown with some examples:

Firmographic: Employee size, Annual revenue, Industry, Region

Demographic: Title of the main contact person

Behavioral: Time since last contact from customer, Time since last communication, Number of communications in the last two months

Usage: Number of seats, Number of active users

Sentiment: Mood of the customer (as indicated by customer rep)
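
As a rough sketch, a few of the behavioral and usage features listed above could be derived like this. The input format, the 60-day reading of "two months", and the derived `active_user_ratio` are assumptions for illustration, not part of any particular CRM.

```python
from datetime import date, timedelta

def behavioral_usage_features(snapshot, contact_dates, seats, active_users):
    """Derive a few of the listed features as of the snapshot date.

    contact_dates: dates of contacts from the customer (hypothetical input)
    """
    past = [d for d in contact_dates if d <= snapshot]
    window_start = snapshot - timedelta(days=60)  # assumption: two months is about 60 days
    return {
        "days_since_last_contact": (snapshot - max(past)).days if past else None,
        "contacts_last_two_months": sum(1 for d in past if d > window_start),
        "seats": seats,
        # Assumption: the share of seats in active use is a helpful derived feature
        "active_user_ratio": active_users / seats if seats else 0.0,
    }

features = behavioral_usage_features(
    snapshot=date(2018, 10, 1),
    contact_dates=[date(2018, 5, 10), date(2018, 8, 20), date(2018, 9, 21)],
    seats=50,
    active_users=20,
)
print(features)
```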

Once the data are compiled for each customer in the window three months before the contract end date, it is time to do some modeling. Most modeling nowadays uses cross-validation, so carving out a separate holdout group for evaluation is not strictly essential. However, a holdout group is still useful for proving to senior management that your churn model works.
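
One possible way to set aside a holdout group and still cross-validate on the rest is sketched below. The 20% holdout share, the five folds, and the function name are illustrative assumptions.

```python
import random

def split_holdout_and_folds(customer_ids, holdout_fraction=0.2, k=5, seed=42):
    """Set aside a holdout group, then split the rest into k cross-validation folds."""
    ids = list(customer_ids)
    random.Random(seed).shuffle(ids)  # fixed seed so the split is repeatable
    n_holdout = int(len(ids) * holdout_fraction)
    holdout, modeling = ids[:n_holdout], ids[n_holdout:]
    folds = [modeling[i::k] for i in range(k)]  # round-robin assignment to folds
    return holdout, folds

holdout, folds = split_holdout_and_folds(range(100))
for i, validation in enumerate(folds):
    training = [c for j, fold in enumerate(folds) if j != i for c in fold]
    # Fit on `training` and score on `validation` here (model fitting omitted)
    print(f"fold {i}: train={len(training)} validate={len(validation)}")
```

The holdout customers are never touched during model selection, which is what makes the final demonstration to management credible.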

Try several modeling approaches to find the one that performs best. The most relevant for churn modeling are:

  • Classification tree
  • Logistic regression
  • Nearest neighbor
  • Naive Bayes
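
To give a feel for one item on this list, here is a minimal nearest neighbor classifier in plain Python. It is a teaching sketch under made-up data, not a production approach; in practice a library such as scikit-learn provides all four methods. The feature choice and scaling are assumptions.

```python
import math

def nearest_neighbor_predict(train_rows, train_labels, query, k=3):
    """Predict churn (1) or no churn (0) by majority vote of the k closest rows."""
    distances = sorted(
        (math.dist(row, query), label) for row, label in zip(train_rows, train_labels)
    )
    votes = [label for _, label in distances[:k]]
    return 1 if sum(votes) * 2 > len(votes) else 0

# Made-up, pre-scaled features: (active user ratio, contacts in last two months / 10)
train_rows = [(0.9, 0.5), (0.8, 0.4), (0.85, 0.6), (0.2, 0.0), (0.1, 0.1), (0.3, 0.0)]
train_labels = [0, 0, 0, 1, 1, 1]

print(nearest_neighbor_predict(train_rows, train_labels, (0.15, 0.05)))
print(nearest_neighbor_predict(train_rows, train_labels, (0.8, 0.5)))
```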

When evaluating the models, keep the following in mind:

  • Reduce false negatives (i.e. predictions that a customer won’t churn when the customer actually does)
  • Avoid over-fitting
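
Both points can be checked with simple counts. The sketch below uses made-up scores and outcomes, and the thresholds are arbitrary; it shows how lowering the score threshold trades false negatives for false positives.

```python
def confusion_counts(actual, predicted):
    """Count outcomes where 1 = churned and 0 = renewed."""
    pairs = list(zip(actual, predicted))
    return {
        "tp": sum(1 for a, p in pairs if a == 1 and p == 1),
        "fn": sum(1 for a, p in pairs if a == 1 and p == 0),
        "fp": sum(1 for a, p in pairs if a == 0 and p == 1),
        "tn": sum(1 for a, p in pairs if a == 0 and p == 0),
    }

def recall(counts):
    """Share of actual churners the model caught; 1 - recall is the false negative rate."""
    churners = counts["tp"] + counts["fn"]
    return counts["tp"] / churners if churners else 0.0

def predict(scores, threshold):
    """Flag a customer as a likely churner when the score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]

# Made-up outcomes and model scores, for illustration only
actual = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.6, 0.35, 0.4, 0.2, 0.1, 0.55, 0.45]

for threshold in (0.5, 0.3):
    counts = confusion_counts(actual, predict(scores, threshold))
    print(threshold, counts, recall(counts))
```

For over-fitting, a common check is to compute the same metric on the training folds and on the validation fold. A model that scores much better on the data it was trained on than on unseen data is likely over-fitted.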

Once the best model is identified, it is time to operationalize it. In the example we discussed, this means predicting each month the likelihood of churn for customers whose contracts end in three months.
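
The monthly selection step might look something like this. The customer names, the dictionary input, and the choice to match on calendar month are assumptions for the sake of the example.

```python
from datetime import date

def month_offset(year, month, offset):
    """Return (year, month) shifted forward by `offset` calendar months."""
    index = year * 12 + (month - 1) + offset
    return index // 12, index % 12 + 1

def customers_to_score(run_date, contract_ends, lead_months=3):
    """Pick customers whose contract ends in the calendar month `lead_months` ahead.

    contract_ends: dict of customer id -> contract end date (hypothetical input)
    """
    target = month_offset(run_date.year, run_date.month, lead_months)
    return sorted(
        cid for cid, end in contract_ends.items() if (end.year, end.month) == target
    )

contract_ends = {
    "acme": date(2019, 1, 31),
    "globex": date(2019, 2, 15),
    "initech": date(2019, 1, 1),
}
print(customers_to_score(date(2018, 10, 5), contract_ends))
```

Each selected customer would then be passed through the same feature pipeline used for training before being scored.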

Once these churn likelihoods are estimated, a marketing strategy is needed to keep at-risk customers from cancelling. One idea is to divide the customers along two axes: “Likelihood to churn” and “Customer value”.

  • The segment with a high likelihood to churn and high customer value needs very personalized, high-touch treatment, probably involving the customer support team and potential discounts.
  • At the other end, the segment with a low likelihood to churn and low customer value can be addressed with low-touch marketing tactics.
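
A simple version of this two-axis segmentation is sketched below. The cutoffs, the use of annual contract value as "customer value", and the customer names are placeholders; only the two treatments described above are filled in.

```python
def segment(churn_likelihood, customer_value, churn_cutoff=0.5, value_cutoff=50_000):
    """Place a customer in one of four quadrants.

    Cutoffs are illustrative assumptions; customer_value is taken here to be
    annual contract value.
    """
    risk = "high" if churn_likelihood >= churn_cutoff else "low"
    value = "high" if customer_value >= value_cutoff else "low"
    return f"{risk}-risk/{value}-value"

# Only the two segments discussed above have a treatment assigned
TREATMENT = {
    "high-risk/high-value": "personalized, high-touch outreach",
    "low-risk/low-value": "low-touch marketing tactics",
}

for name, p, v in [("acme", 0.8, 120_000), ("initech", 0.1, 8_000), ("globex", 0.7, 5_000)]:
    s = segment(p, v)
    print(name, s, TREATMENT.get(s, "to be decided per company"))
```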

Specific marketing strategies and tactics to reduce churn will depend on the individual company and will need to be fine-tuned over time.


Posted by Joju Mangalam

 © 2018 HireJar