Predicting churn risk with CLV model
Customers behave in different ways, and businesses need to understand the whys and hows of their behavior.
Churn prediction, or churn risk prediction, is a process that helps business owners understand how likely a customer is to cancel a subscription or leave a specific brand. It is a critical aspect of marketing because acquiring a new customer is generally more expensive than retaining an existing one.
Making retention the core of your business helps sustain and grow revenue. Once churn risk is identified, you can create marketing strategies that help support retention and maximize the value of each customer on your list.
Churn risk prediction and marketing automation platforms
For businesses that use marketing automation platforms like Klaviyo, predicting churn risk becomes easier because the platform includes features that help you stay on top of it.
The key to understanding churn risk is associating it with Customer Lifetime Value (CLV).
Marketing platforms have access to a large volume of e-commerce data. This helps them assess and predict churn in a more sophisticated manner. Some marketing automation platforms have this feature built in, but what matters most is how accurate the underlying model is.
Here, we will be looking at the Klaviyo churn risk predictor, and how it uses its AI-powered Customer Lifetime Value to connect predictions to its segmentation engine.
Whether you use Klaviyo or not, you can use the examples of the CLV model in action to understand how it can affect your business. To build confidence in the model, we will also discuss how we validated it and show how it improves on the academic model.
All these features work together to improve churn prediction for customers.
E-commerce-focused churn risk model
The Klaviyo churn risk model is tailored to predicting “purchasing in an e-commerce setting”.
Through an analysis of the data, we have seen that the model is out-predicting published academic models of customer behavior.
Klaviyo churn risk prediction
Klaviyo computes a churn risk prediction for each customer. (We use the terms “churn risk” and “probability of churn” interchangeably.)
The example customer below is currently predicted to have a 21% risk of churn.
The colored bar shows what the model would have predicted on each past day, all the way back to the customer’s first purchase in March 2016.
Churn Risk model in action
Let’s dive in with an extreme example that illustrates limitations in some models from academic literature.
We’ll look at Phones Forever (anonymized name), an online store for cell phone accessories. They are fortunate to have many repeat, low churn-risk customers.
However, like most businesses, they also have a ton of one-and-done, high churn-risk customers.
The Klaviyo churn model differentiates between these two types of customers with much higher accuracy than the academic model.
In the academic model, churn prediction increased too slowly over time. Churn prediction started at around 20%, and even after 15 months without a purchase, it had only crept up to 25%.
The Klaviyo model is better at modeling customers with one purchase in their history.
Using this model for the same customer, we see they initially have a medium risk of churn that rises to a 96% churn risk prediction after 15 months without a purchase.
Realistic estimates of churn probability over time for one-time customers differentiate the Klaviyo model from the models implemented in the academic literature.
The Klaviyo model also improves prediction quality for repeat customers.
The new model learns when one-and-done customers transition to being repeat customers:
The customer above is classified as medium-high risk for churn after their first purchase. After their second, risk decreases a little. After their third, they are firmly medium risk.
After each of their remaining purchases, they start as low risk, and the risk gradually increases as the time since the last purchase grows.
Here’s another example:
Each time a purchase is made, churn probability decreases.
During long gaps between purchases, churn probability increases over time.
The model can identify when it has been too long since a loyal customer’s last purchase, and it shows their churn risk going from low to high.
In the example above, it’s been about nine months since the customer’s last purchase, so it is very likely they have churned.
Let’s look at a loyal customer with fewer, less frequent orders. For the customer below, the same general behavior is observed.
However, the model learns the longer interval between purchases, so churn risk rises more slowly between purchases. It has also been about nine months since this customer’s last purchase, but their churn probability has increased at a slower rate.
The churn risk is 85%, 10 percentage points lower than for the customer above.
Relative to their usual buying pattern, nine months is a shorter gap for this customer, who typically waits 118 days between orders, than for the customer above, who typically waits 42 days between orders.
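One rough way to see why the same nine-month gap means different things for these two customers is to express it in units of each customer's typical interval between orders. The sketch below is only an illustrative heuristic, not how the Klaviyo model computes churn risk. It assumes "about nine months" is 270 days, and the function name is hypothetical.

```python
# Illustrative heuristic only: this is not the Klaviyo model's calculation.
DAYS_SINCE_LAST_PURCHASE = 270  # assumption: "about nine months" taken as 270 days


def gap_in_typical_intervals(days_since_last: float, typical_interval_days: float) -> float:
    """Express the current gap as a multiple of the customer's usual gap between orders."""
    return days_since_last / typical_interval_days


frequent_buyer = gap_in_typical_intervals(DAYS_SINCE_LAST_PURCHASE, 42)
infrequent_buyer = gap_in_typical_intervals(DAYS_SINCE_LAST_PURCHASE, 118)

print(f"42-day customer: {frequent_buyer:.1f} typical intervals have passed")
print(f"118-day customer: {infrequent_buyer:.1f} typical intervals have passed")
```

Under these assumptions the frequent buyer has missed several of their usual purchase cycles, while the infrequent buyer has missed only a couple, which is consistent with the lower churn risk.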
How do we know the new Klaviyo model performs better than the academic model?
We tested all of our models on e-commerce data from Klaviyo customers.
For each company, we constructed a set of training and validation customers. We randomly assigned 80% of customers to the training group. The remaining 20% were used in the validation set.
We withheld one year of data from training and used our model to predict what would happen during that year.
Then, we compared the predictions to what actually happened.
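The setup above can be sketched as follows. This is a minimal illustration, assuming orders are stored as (customer_id, order_date) pairs; the function names, the fixed random seed, and the sample data are hypothetical and not taken from Klaviyo's pipeline.

```python
import random
from datetime import date, timedelta


def split_customers(customer_ids, train_fraction=0.8, seed=0):
    """Randomly assign customers to a training group and a validation group."""
    ids = list(customer_ids)
    random.Random(seed).shuffle(ids)
    cutoff = int(len(ids) * train_fraction)
    return ids[:cutoff], ids[cutoff:]


def split_by_holdout(orders, last_observed_day, holdout_days=365):
    """Separate orders the model may see from orders in the withheld final year."""
    holdout_start = last_observed_day - timedelta(days=holdout_days)
    visible = [order for order in orders if order[1] <= holdout_start]
    withheld = [order for order in orders if order[1] > holdout_start]
    return visible, withheld


# Hypothetical data: (customer_id, order_date) pairs.
orders = [
    ("c1", date(2016, 3, 1)),
    ("c1", date(2017, 6, 15)),
    ("c2", date(2016, 5, 20)),
]
train_ids, validation_ids = split_customers([f"c{i}" for i in range(1, 11)])
visible, withheld = split_by_holdout(orders, last_observed_day=date(2017, 12, 31))

print(len(train_ids), len(validation_ids))  # 8 2
print(len(visible), len(withheld))          # 2 1
```

Predictions are made from the visible orders only, and the withheld orders reveal which customers actually came back.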
Evaluating the accuracy of probability of churn for a single individual is impossible.
When a customer is assigned a probability of churn of 75%, we can never tell if that prediction was accurate or not because they either do or don’t return to make a purchase.
However, if we have 100 customers with a probability of churn of 75% and 75 never return, our prediction is likely accurate. If 95 never return, the prediction is inaccurate.
By grouping our customers by their probability of churn, we can measure accuracy by comparing the number of customers expected to churn to the number who actually churned in each group.
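The 100-customer example can be worked through in a few lines. This is a sketch with a hypothetical helper name; it takes the expected number of churners to be the sum of the individual churn probabilities.

```python
def expected_and_actual_churn(churn_probabilities, churned_flags):
    """Compare the number of customers expected to churn with the number who did."""
    expected = sum(churn_probabilities)
    actual = sum(1 for churned in churned_flags if churned)
    return expected, actual


# 100 customers, each with a predicted churn probability of 75%.
probabilities = [0.75] * 100

# Case 1: 75 of them never return, matching the prediction.
accurate_case = expected_and_actual_churn(probabilities, [True] * 75 + [False] * 25)

# Case 2: 95 of them never return, so the prediction was too optimistic.
inaccurate_case = expected_and_actual_churn(probabilities, [True] * 95 + [False] * 5)

print(accurate_case)    # (75.0, 75)
print(inaccurate_case)  # (75.0, 95)
```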
Binning customers by probability of churn shows where the Klaviyo model outdid the academic model. Below, we show the churn categories and predictions for Best Bag Bargains (anonymized name), a company that sells handbags.
Klaviyo model vs the academic model
The Klaviyo model is much better at identifying customers who have 90%+ probability of churn.
The academic model is overly optimistic – assigning a medium 40-70% probability of churn to a large number of customers.
When we compare these predictions to reality, we see that 88-97% of these medium risk customers churn, showing the academic model cannot differentiate between medium risk customers and high risk customers.
In contrast, the Klaviyo model assigns a much smaller subset of customers to the 40-70% group and the prediction is more accurate: 58-71% of these medium risk customers churn.
The Klaviyo model correctly identifies that most customers should be assigned an 80-100% churn probability.
To compare model performance, we needed to put a single number on how well or poorly the different models did at predicting churn probability.
We binned customers by their predicted churn rate. Customers were separated into 10 groups of churn probability: a 0-10% chance group, a 10-20% chance group, continuing all the way to a 90-100% chance group.
Then, we counted how many of each group made a purchase during the holdout period.
A well-performing model will predict well in each group: the 0-10% group should have a 5% churn rate, the 10-20% group should have a 15% churn rate, and so on up to the 90-100% group, which should have a 95% churn rate.
Well-performing models have low misclassification rates for all bins.
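The binning step can be sketched as below. This is an illustration under stated assumptions rather than the evaluation code used for the results in this article: a customer counts as churned if they made no purchase during the holdout period, bins are equal width, and the function name and sample data are hypothetical.

```python
from collections import defaultdict


def calibration_by_bin(churn_probabilities, churned_flags, n_bins=10):
    """Bin customers by predicted churn probability and report observed churn per bin."""
    customers = defaultdict(int)
    churners = defaultdict(int)
    for probability, churned in zip(churn_probabilities, churned_flags):
        # Equal-width bins; a probability of exactly 1.0 falls in the top bin.
        index = min(int(probability * n_bins), n_bins - 1)
        customers[index] += 1
        churners[index] += int(churned)

    rows = []
    for index in sorted(customers):
        low, high = index / n_bins, (index + 1) / n_bins
        rows.append({
            "bin": f"{low:.0%}-{high:.0%}",
            "customers": customers[index],
            "ideal_churn_rate": (low + high) / 2,
            "observed_churn_rate": churners[index] / customers[index],
        })
    return rows


# Hypothetical holdout results: "churned" means no purchase during the holdout year.
probabilities = [0.05] * 10 + [0.85] * 20
churned = [True] * 1 + [False] * 9 + [True] * 17 + [False] * 3

rows = calibration_by_bin(probabilities, churned)
for row in rows:
    print(row)
```

The closer the observed churn rate is to the ideal rate in every bin, the better calibrated the model is.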
To count misclassifications, we evaluate how “confused” the models are for each prediction bin. Since we are measuring a probability, we don’t know exactly how many false positives and negatives there are in our predictions, so instead we find the bounds on our predictions.
The lower bound is the least possible rate of confusion between the prediction and the performance. The upper bound is the maximum mismatch possible.
We show a visual calculation and the formulas for this in the FAQ below. In the example above, the Klaviyo model confuses between 3% and 22% of customers (40 to 246).
This is a huge improvement over the academic model, which confuses between 36% and 53% of customers (406 to 587).
Even in the worst possible case, the Klaviyo model beats the academic model by a significant margin. And Best Bag Bargains isn’t an exception: company after company showed massive improvements in accuracy with the new Klaviyo model.
Looking at our 700 test companies, the lower bound decreases for the Klaviyo model:
In this plot, we show the confusion scores for 700 randomly chosen companies. The score on the x-axis is the fraction of customers put in the wrong churn bin.
For example, a score of 0.2 means that 20% of customers were misclassified. The y-axis shows the number of companies with this score.
The academic model has a long tail of high confusion scores, with some companies seeing as much as 80% of their customers misclassified.
The Klaviyo model has a confusion score of less than 10% for almost all companies. Overall, more than half of companies will see at least a twofold decrease in confusion scores.
The Klaviyo model is very good at correctly assigning churn probabilities to customers.
Our model lets you identify customers with a high, medium, or low probability of churn. By differentiating between these groups, it lets you target customers differently based on how likely they are to return.
FAQ about using churn probability
After exporting my data, I made a list of customers with churn probability between 80-90%. How many customers should I expect to see make another purchase?
The average churn probability will be around 85%, so about 15% of customers in this segment should return to make another purchase.
I see that a customer has an 87% chance of churn and yet they are expected to make 3 purchases in the next year. How is that possible?
Churn probability only predicts the likelihood the customer will not come back. The expected number of purchases shows the expected value of the number of purchases the customer will make.
To calculate the expected value, we multiply each possible number of purchases by the probability of that number of purchases, then sum the results.
So, even though this customer has an 87% chance of never making a purchase again, if they have a 13% chance of making 23 purchases in the future, that adds up to an expected value of about 3 purchases. (0.87*0 + 0.13*23 = 2.99 ≈ 3)
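The same arithmetic as a short sketch. The two-outcome distribution is the simplified example from the answer above, not a real model output, and the function name is hypothetical.

```python
def expected_purchases(distribution):
    """Expected value of a purchase-count distribution given as {count: probability}."""
    total_probability = sum(distribution.values())
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(count * probability for count, probability in distribution.items())


# Simplified example: 87% chance of 0 purchases, 13% chance of 23 purchases.
simplified = {0: 0.87, 23: 0.13}
expected = expected_purchases(simplified)

print(round(expected, 2))  # 2.99, which rounds to 3 purchases
```

A high churn probability and a sizeable expected purchase count can coexist because the expected value is driven by the small chance of many purchases.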
Most of my customers have a 99% chance of churn. How can I use this?
If your business has a lot of one-and-done customers, most of your customers will have a high probability of churn. Customers with a lower probability of churn are likely your most valuable customers.
Churn risk predictions let you identify the small segment of low churn-risk customers even if most of your customers never return. The model also learns from your data – so if your business changes over time and can recapture more customers, the model will update and learn from your new data to predict churn accurately for your customers.
FAQ about evaluating models
Can you explain the upper and lower bounds in more detail?
When we calculate the confusion score for each bin, we find the maximum and minimum number of correct predictions possible given the expected churn and the actual churn. Take a look at this diagram:
We predict 85% churn for 100 customers. We actually see a 90% churn rate. In the best case, we have 5% error: the difference between how many customers we expected to churn and how many actually churned. The lower bound is absolute_value(#predicted_to_churn – #actual_churn) = absolute_value(85 – 90) = 5.
In the worst case, we have the maximum possible mismatch between our prediction and what actually happened.
Every customer we predicted to return churns, and every customer who returns was predicted to churn. The upper bound, the maximum mismatch possible, is #total – absolute_value(#actual_didn’t_churn – #predicted_to_churn) = 100 – absolute_value(10 – 85) = 100 – 75 = 25.
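The two formulas translate directly into code. In this sketch the per-bin function follows the formulas above. The aggregation across bins is an assumption: it sums the per-bin bounds and divides by the total number of customers, which this article does not spell out. All names are hypothetical.

```python
def confusion_bounds(total, predicted_to_churn, actual_churn):
    """Return (lower, upper) bounds on misclassified customers in one prediction bin."""
    actual_did_not_churn = total - actual_churn
    lower = abs(predicted_to_churn - actual_churn)
    upper = total - abs(actual_did_not_churn - predicted_to_churn)
    return lower, upper


def confusion_score_bounds(bins):
    """Combine per-bin bounds into fractions of all customers.

    Assumption: the overall score is the summed per-bin bound divided by the
    total number of customers across bins.
    """
    total_customers = sum(total for total, _, _ in bins)
    lowers, uppers = zip(*(confusion_bounds(*one_bin) for one_bin in bins))
    return sum(lowers) / total_customers, sum(uppers) / total_customers


# The worked example: 100 customers, 85 predicted to churn, 90 actually churned.
single_bin = confusion_bounds(100, 85, 90)
print(single_bin)  # (5, 25)

# Hypothetical two-bin company, each tuple is (total, predicted_to_churn, actual_churn).
overall = confusion_score_bounds([(100, 85, 90), (100, 50, 50)])
print(overall)  # (0.025, 0.625)
```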
Why don’t you show the graph for the upper bound?
The upper bound is a less meaningful bound on accuracy. It is large for churn predictions between 25% and 75%, even if we see the same percentage of churn in the actual data. We’ll explain visually for the most extreme example, 50% churn:
Unlike the 85% prediction, most customers are not expected to have the same behavior. This means there are more possible mismatches, so in the worst case, we see more error.
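To make this concrete, the sketch below applies the upper-bound formula from the previous answer to bins of 100 customers where the prediction exactly matches the actual churn. The function name and the chosen percentages are illustrative.

```python
def worst_case_mismatch(total, predicted_to_churn, actual_churn):
    """Upper bound: the largest number of customers that could have been mismatched."""
    actual_did_not_churn = total - actual_churn
    return total - abs(actual_did_not_churn - predicted_to_churn)


# Perfectly calibrated bins: predicted churn equals actual churn.
upper_bounds = {
    churn_pct: worst_case_mismatch(100, churn_pct, churn_pct)
    for churn_pct in (5, 25, 50, 75, 85, 95)
}

for churn_pct, upper in upper_bounds.items():
    print(f"{churn_pct}% churn predicted and observed: upper bound {upper} of 100")
```

Even with a perfect match between predicted and observed churn, the upper bound reaches 100 at 50% churn and stays at 50 or more across the 25% to 75% range, so it says little about model quality there.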
We plot the upper bounds below. The Klaviyo model still beats the academic model. However, these results are less meaningful because the upper bound for both models is very large for churn predictions around 50%.
Implementing churn risk in your marketing strategy
For marketers, understanding the data they are using is as important as the result. The email marketing platform or marketing automation software you choose should have features that help you scale your business. Retaining your current subscriber base is the best way to scale in this era of cookies and data privacy, where third-party data is becoming more and more elusive.