AI-Powered Customer Lifetime Value

Editor’s Note: This post is co-authored by Ezra Freedman, Christina Jaworsky, and Eric Silberstein.

Ecommerce’s secret weapon is detailed, granular data about customer behavior. Your business is generating data faster and at a greater scale than you can interpret using traditional methods. The only way to take full advantage of your data is through modern data science. To stay competitive, it will be critical to use AI to automatically interpret your data so you can use it for customer segmentation, marketing automation, and business growth.

Today, we’re excited to announce the launch of our latest AI-powered feature: Customer Lifetime Value (CLV). But this post is more than just an announcement. First, we explain what customer lifetime value is and what makes it so valuable for ecommerce marketers. Next, we show how predictions operate for various companies and customers through actual examples. Finally, we explain the underlying math and talk about what’s next.

First things first, our new CLV feature is now live. Beginning today, Klaviyo will train (and keep up-to-date) a mathematical model of CLV for your company. This model presents you with:

  • Historic CLV
  • Predicted future CLV
  • Total CLV
  • Probability of churn
  • Average time between purchases for each of your customers

We’re particularly excited to share that, as far as we can tell, we’ve been able to test our model across more companies than any academic researchers or commercial vendors.

What is CLV?

Customer Lifetime Value represents the value of a customer to your business over the entire length of your relationship with that customer. It’s considered the holy grail of marketing metrics because it tells you how much you should spend to acquire different types of customers.

Churn probability goes hand-in-hand with CLV, and you can think of probability of churn as one of the inputs to predicting future spend. If a customer has a low probability of churn, it means that they are likely to buy from you again, so you should expect to earn more from that customer in the future. On the other hand, if a customer has a high probability of churn, that person is unlikely to buy again and so their expected future spend will be close to zero.

Definitions of Customer Lifetime Value are inconsistent across vendors. Here’s how we’ve defined things. For each customer, we first determine Historic CLV as the revenue generated from all orders placed to date. We then calculate Predicted CLV, the total dollars the customer is expected to spend in the next year. (Much more on that below.) We add those two numbers together and that gives Total CLV.

An ecommerce business typically has customers that buy when they want and spend what they want, which makes it difficult to predict CLV. At the other extreme, you may be in a multiyear contract with your cell phone carrier, which means they can easily project each of your payments and they know when you churn. Our CLV feature supports the more challenging ecommerce situation of no contracts and no regular purchase intervals.

In addition to CLV, we also calculate average time between orders. This metric is not a prediction: it’s based strictly on actual historical orders and considers the time between a customer’s first and last order. For example, if a customer’s only orders were 10 days apart two years ago, the average time between orders would be 10 days, even though two years have passed since the last order.
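Here is a minimal sketch of that calculation. It assumes the average is the span between the first and last order divided by the number of gaps between orders. The function name and the dates are hypothetical, chosen only to reproduce the 10-day example.

```python
from datetime import date


def average_days_between_orders(order_dates):
    """Average gap in days between orders, or None with fewer than two orders.

    Assumption: the average is (last order - first order) / (number of gaps).
    """
    if len(order_dates) < 2:
        return None  # a single order has no gap to measure
    ordered = sorted(order_dates)
    span_days = (ordered[-1] - ordered[0]).days
    return span_days / (len(ordered) - 1)


# Two orders placed 10 days apart, long ago (hypothetical dates).
orders = [date(2015, 3, 1), date(2015, 3, 11)]
print(average_days_between_orders(orders))
```

Because only historical order dates enter the calculation, the result stays at 10 days no matter how much time has passed since the last order.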

Predictive Analytics Module

Predictive analytics are shown in a box on the right side of profile pages. This way, you can quickly see the historic and predicted value of any given customer. Sometimes, we don’t have enough information from your company or about a customer. If you don’t see a box on profile pages, it means we don’t have enough data for your company. If you see a box but everything is gray and marked as n/a, it means we don’t have enough information for that customer.

The timeline illustrates orders over time. The example customer shown above placed their first order on September 19, 2015. They made six orders, indicated by the vertical tick marks, and spent a total of $888. When you hover over a tick mark, you can see the date and value of that specific order.

Now for the predictive part. We model the customer’s behavior and predict that they will make 1.43 orders and spend $189 over the next year. Those numbers probably sound oddly precise. Of course, nobody can make 1.43 orders, and nobody can predict exactly how many orders someone will place or how many dollars they will spend. In this case, 1.43 orders means we expect the customer to make one or two orders (though there’s also a chance that they won’t make any orders, or that they’ll make 10). These expectations start to make sense when you group multiple customers together, because you can predict the total number of orders or the total spend for the group.

For example, if we have five customers with predicted purchase counts of 1.43, 0.25, 3.12, 0.78 and 2.97, the predictions sum to 8.55, so we can expect approximately 9 purchases across this group. As another example, say we have 100 customers and for each we predict 0.1 orders over the next year. For any one customer our best guess is they won’t order anything, but for the group as a whole we predict 100 × 0.1 = 10 orders. Expectations let us see how a group will perform on average even though we cannot predict exactly what individuals will do.
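The group arithmetic is simply a sum of the individual expectations. A small sketch, using the numbers from the two examples (the function name is illustrative):

```python
def expected_group_orders(predicted_orders_per_customer):
    """Expected total orders for a group: the sum of each customer's expectation."""
    return sum(predicted_orders_per_customer)


# Five customers with different predictions.
five_customers = [1.43, 0.25, 3.12, 0.78, 2.97]
print(round(expected_group_orders(five_customers), 2))

# One hundred customers, each predicted to make 0.1 orders.
hundred_customers = [0.1] * 100
print(round(expected_group_orders(hundred_customers), 2))
```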

Combining the historic CLV and predictive CLV helps us understand the total expected value of a specific customer relationship. In the example screenshot above we show a total CLV of $1,077.
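As a quick check of that arithmetic, here is a sketch using the figures from the example profile ($888 spent to date, $189 predicted for the next year). The function name is illustrative and not part of any Klaviyo API.

```python
def total_clv(historic_clv, predicted_clv):
    """Total CLV = revenue from orders to date + expected spend over the next year."""
    return historic_clv + predicted_clv


print(total_clv(888, 189))
```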

We use color to represent churn risk. Think of the relationship a customer has with your brand. When they buy, they are engaged and are more likely to buy again. When someone goes a long time without purchasing, the likelihood they purchase again goes down and their likelihood of churning goes up. Our model works the same way. Each time the customer makes a purchase, their churn probability goes down (green), but as time elapses between purchases, the churn probability increases (red). In our example, the timeline starts with a medium (yellow) probability of churn back in September 2015 and improves over time. All of that leads up to today when we’re predicting a 17% chance of churn.

Let’s walk through some examples.

BBQ Masters

BBQ Masters (not the real name) sells barbeque supplies. Most customers fall into one of two categories: one-and-done or high-loyalty repeat purchasers. Our model predicts customers to have a high probability of churn up until they make a few purchases. After this, the model recognizes them as a repeat purchaser, and their churn probability goes down.

Here’s an example. This customer was predicted to have a medium probability of churn after their first purchase. Their second purchase temporarily decreased their churn probability, but because they hadn’t shown they were a repeat customer yet, they still had a high risk of churn. Once they made their third, fourth and fifth purchases, it became clear that this customer was going to continue to return and their churn probability is now low (13%).

BBQ Masters – Customer A

Unfortunately customers aren’t guaranteed to stay loyal forever. The model for BBQ Masters has learned customer buying behavior and can identify when previously loyal customers have waited uncharacteristically long between purchases. The once loyal customer shown below hasn’t made a purchase in a year, so the model predicts a very high churn risk and a low predicted CLV.

BBQ Masters – Customer B

Accessories Galore

Accessories Galore (not the real name) sells jewelry and other accessories. Like BBQ Masters, Accessories Galore has many one-and-done customers, but unlike BBQ Masters, repeat purchasers are few and far between.

The model correctly predicts that new customers have a high risk of churn immediately after their first purchase. Here’s an example of a customer who placed a single order and is immediately classified as high churn risk. Churn risk continues to increase over time, and the predicted CLV is essentially zero.

Accessories Galore – Customer A

Even the few frequent purchasers at Accessories Galore have a high churn risk between purchases. Each purchase temporarily decreases the churn probability, but it quickly goes up because customer loyalty is not expected.

Accessories Galore – Customer B

Word of Caution

Nobody can predict the future! Predictive analytics are a powerful tool for optimizing marketing spending and personalizing customer communication. However, predictions work best when averaged over many customers and are not expected to be exact for any single individual. Some individuals will spend more than their predicted CLV and some will spend less, but as a whole they will average each other out. The larger the group, the more the individual errors cancel and the more accurate the group prediction becomes, even if individuals behave radically differently than expected. That’s why it makes sense to design automation based on predicted CLV and churn probability, but it doesn’t make sense to count on a specific amount of future revenue from a specific individual.
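To make the averaging effect concrete, here is a rough sketch under a simplifying assumption that is ours, not a description of the production model: each customer’s order count over the next year is an independent Poisson draw. Under that assumption the variance of the group total equals its mean, so the typical relative error of the group prediction is 1 / sqrt(expected total).

```python
import math


def relative_spread(n_customers, expected_orders_each):
    """Standard deviation of the group's total orders divided by its expected total.

    Assumes independent Poisson order counts (an illustrative simplification).
    """
    expected_total = n_customers * expected_orders_each
    return math.sqrt(expected_total) / expected_total


# Same per-customer prediction (0.1 orders), increasingly large groups.
for n in (1, 100, 10_000):
    print(n, round(relative_spread(n, 0.1), 3))
```

The relative spread shrinks with the square root of the group size, which is why the same prediction that is nearly meaningless for one customer becomes useful for a segment of thousands.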

Input Data and Requirements

Klaviyo automatically builds a CLV model using your company’s data and retrains the model at least once a week. You don’t need to do anything special to turn on our predictive analytics capability, but you do need to:

  • Have an ecommerce integration (e.g. Shopify, BigCommerce, Magento) or be using our API to send placed orders. We only consider orders with a value greater than zero.
  • Have at least 500 customers over your company’s history. These are customers who actually placed orders, not emailable profiles.
  • Have at least 180 days of order history and have orders within the last 90 days.
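The requirements above can be written as a simple eligibility check. This is only a sketch: the class, field, and function names are hypothetical, and we assume “within the last 90 days” means the most recent order is at most 90 days old.

```python
from dataclasses import dataclass


@dataclass
class OrderHistorySummary:
    customers_with_orders: int   # customers who actually placed orders
    days_of_order_history: int   # days since the first recorded order
    days_since_last_order: int   # days since the most recent order


def countable_order_values(order_values):
    """Keep only orders that have a value greater than zero."""
    return [v for v in order_values if v is not None and v > 0]


def meets_clv_requirements(summary):
    """True when all three data thresholds from the list above are met."""
    return (
        summary.customers_with_orders >= 500
        and summary.days_of_order_history >= 180
        and summary.days_since_last_order <= 90
    )


print(meets_clv_requirements(OrderHistorySummary(1200, 400, 3)))
print(meets_clv_requirements(OrderHistorySummary(450, 400, 3)))
```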

The Math

You don’t need to know how an engine works to drive a car, but you might find it interesting. Here’s a taste of what’s happening under the hood.

You’ll see the Greek letters lambda (λ) and alpha (α) here. Mathematicians have used Greek letters for centuries and we stick with them so you can link our explanation to literature on customer lifetime value models, including the academic paper we link to below.

The reason modeling is so useful is that you use math to explain and predict things that happen in the real world. You’ve probably done modeling without necessarily thinking of it that way. When you calculate the average amount that a customer spends with you in a year, that’s a simple model of customer behavior. The model we use is based on decades of academic research and uses computing power to consider millions of data points (if you have that many) to learn the model parameters we explain below. Our aim is much greater accuracy than a broad-brush average could provide, but the idea is the same: figure out a math function that explains what a customer has done and predicts what they will do.

In our CLV feature, customer behavior is modeled with a beta-geometric (BG) model. In the BG model, customers make purchases with frequency λ, and after each purchase, they have a likelihood p of becoming inactive. The BG model learns what the values of the parameters λ and p are for individual customers based on their purchase history. λ and p can then be used to calculate the values we cannot measure directly: the probability of churn for a customer, the number of purchases they will make in the future, and, when combined with a model of purchase value, the predicted future spending of that customer.
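To show how λ and p turn into churn probability and expected purchases, here is a sketch based on the textbook individual-level formulas for this family of models in the academic literature. It is an illustration under stated assumptions, not the production implementation: purchases arrive at rate λ while the customer is active, the customer drops out with probability p after each purchase, λ is measured in purchases per year, and all times are in years. The function names are hypothetical.

```python
import math


def prob_alive(lam, p, repeat_purchases, time_since_last_purchase):
    """Probability the customer is still active, given known lam and p (0 < p < 1).

    In the textbook formulation a customer with no repeat purchases has had
    no opportunity to drop out, so the probability is 1 in that case.
    """
    if repeat_purchases == 0:
        return 1.0
    # Weight of "survived the last purchase, then stayed silent by chance".
    still_active = (1.0 - p) * math.exp(-lam * time_since_last_purchase)
    # Weight of "dropped out right after the last purchase".
    dropped_out = p
    return still_active / (still_active + dropped_out)


def expected_purchases_if_alive(lam, p, horizon):
    """Expected purchases over `horizon` for a customer who is active now."""
    return (1.0 - math.exp(-lam * p * horizon)) / p


def expected_future_purchases(lam, p, repeat_purchases, time_since_last_purchase, horizon):
    """Expected purchases over `horizon`, weighted by the chance of being active."""
    alive = prob_alive(lam, p, repeat_purchases, time_since_last_purchase)
    return alive * expected_purchases_if_alive(lam, p, horizon)


# Hypothetical customer: 3 purchases per year while active, 20% dropout chance.
for gap in (0.0, 0.25, 1.0):
    churn = 1.0 - prob_alive(3.0, 0.2, 4, gap)
    orders = expected_future_purchases(3.0, 0.2, 4, gap, 1.0)
    print(gap, round(churn, 3), round(orders, 2))
```

Right after a purchase the churn probability equals p, and it climbs as the silence since the last purchase grows relative to the customer’s usual buying rate. Multiplying expected purchases by a model of purchase value gives predicted future spend.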

To estimate λ and p for each customer, the model needs to learn what “typical” customers look like for a company. That’s done by finding a probability distribution that generates λ’s and p’s for each customer that approximate the actual transaction history. We show examples of probability distributions below and you’ll see how the shape of the distribution relates to types of customer behavior.

The BG model captures the overall profile of behaviors for a company with a gamma distribution for purchase frequency (λ) and a beta distribution for customer attrition (p), and then estimates λ and p for individual customers according to that profile. Customers act independently, meaning the values of λ and p for one customer are not correlated with those of another customer, other than that they come from the same distributions.

Values for λ for the customers of a given company follow a gamma distribution in the BG model, meaning they are centered around a particular value with a long tail to the right. Many λ’s will be similar (many customers purchase at approximately the same frequency). However, there will be outliers who purchase very often or very infrequently. The shape of the gamma distribution for a particular company is defined by the parameters r and α.

Gamma Distribution
(r = 5, α = 2)
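For readers who want to reproduce the curve, here is a minimal sketch of the gamma density with r = 5 and α = 2. It assumes r is the shape parameter and α is the rate parameter, the convention commonly used in the CLV literature; if α were a scale parameter instead, the numbers would differ.

```python
import math


def gamma_pdf(lam, r, alpha):
    """Gamma density at lam with shape r and rate alpha (assumed convention)."""
    if lam <= 0:
        return 0.0
    return alpha ** r * lam ** (r - 1) * math.exp(-alpha * lam) / math.gamma(r)


def gamma_mean(r, alpha):
    """Average purchase frequency across the customer base."""
    return r / alpha


def gamma_mode(r, alpha):
    """Most common purchase frequency (valid for r >= 1)."""
    return (r - 1) / alpha


r, alpha = 5, 2
print(gamma_mean(r, alpha), gamma_mode(r, alpha))

# The right tail: frequencies well above the mode still carry some density.
for lam in (1.0, 2.0, 4.0, 6.0):
    print(lam, round(gamma_pdf(lam, r, alpha), 4))
```

The mean sits to the right of the mode, which is the long right tail described above: a minority of customers buy far more often than the typical customer.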

The BG model finds the distribution of values for p for customers in a similar fashion but using a beta distribution. The beta distribution can have many different shapes, defined by the parameters a and b, representing the many different possibilities for attrition behavior: some companies may primarily have one-and-done customers, some may primarily have customers who are frequent purchasers, and others primarily have few-and-done customers who make a few repeat purchases and then stop.

One and Done
Beta Distribution (a = 5, b = 1)

Frequent Purchasers
Beta Distribution (a = 1, b = 10)

Few and Done
Beta Distribution (a = 5, b = 5)

Your company may have a mix of the three different behaviors. In that case the beta distribution learned by the model may look like this:

A mix in behavior
Beta Distribution (a = 2, b = 4)

Let’s say your customers are all over the map and there is no rhyme or reason to their propensity to churn. Even that will be learned and the beta distribution will look like this:

All behaviors
Beta Distribution (a = 1, b = 1)
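The shapes above can be compared numerically. This sketch computes the beta density and its mean for each (a, b) pair shown in the figures. Here p is the chance of becoming inactive after a purchase, so a mean near 1 corresponds to one-and-done behavior and a mean near 0 to loyal repeat purchasers. The function names are illustrative.

```python
import math


def beta_pdf(p, a, b):
    """Beta density at p, for 0 < p < 1."""
    normalizer = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return normalizer * p ** (a - 1) * (1 - p) ** (b - 1)


def beta_mean(a, b):
    """Average dropout probability after a purchase."""
    return a / (a + b)


profiles = {
    "One and Done": (5, 1),
    "Frequent Purchasers": (1, 10),
    "Few and Done": (5, 5),
    "A mix in behavior": (2, 4),
    "All behaviors": (1, 1),
}

for name, (a, b) in profiles.items():
    print(name, round(beta_mean(a, b), 3), round(beta_pdf(0.5, a, b), 3))
```

With a = 1 and b = 1 the density is flat, meaning every dropout probability is equally likely, which matches the “no rhyme or reason” case.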

Modeling customer behavior with probability distributions in the BG model has many advantages over using averages to estimate customer behavior. The long tail of the gamma distribution for purchase frequency in the BG model is able to explain outlier customers who purchase at a much higher frequency, whereas a simple average would be skewed by the outliers. Additionally, models let us estimate properties we can’t measure directly, such as probability of churn for an individual customer. While we can estimate the overall probability of churn with a simple average, for example that one third of customers will not return, we cannot identify which customers within the population are likely in the churn group and which are unlikely to churn. Estimating underlying customer behavior is a powerful way to characterize customers and personalize marketing activities.

Next Up

Going back to our car analogy…we’ve built a powerful engine and you can watch it do stuff, but it’s not really that useful until you can hook it up to wheels and drive. We get that, and viewing predictive analytics on profile pages is just our starting point. Our next step is to connect these predictive analytics to our segmentation engine. That will enable you to get insights and take actions using CLV, churn probability and average time between orders.

We’ll also be improving our model. One of the exciting things about Klaviyo’s scale is that we’ve now built and tested CLV models for hundreds of millions of customers across tens of thousands of companies. As far as we know, we’ve been able to test our model across more companies than any academic researchers or commercial vendors. We have a huge list of ideas for model enhancements and we’ll be implementing many of them over the coming weeks and months.

Feedback

If you have ideas or want to share your feelings about our work, our data science team would love to hear from you. Please send feedback to datascience@klaviyo.com.
