Video Blog: Demonstrating Customer Lifetime Value


Customer Lifetime Value Modeling as a Win-Win for Both the Vendor and the Customer


Author: Janne Flinck, Codento

Introduction to Customer Lifetime Value

Customer analytics is not about squeezing out every penny from a customer, nor should it be about short-term thinking and actions. Customer analytics should seek to maximize the full value of every customer relationship. This metric of “full value” is called the lifetime value (LTV) of a customer. 

Obviously a business should look at how valuable customers have been in the past, but purely extrapolating that value into the future might not give the most accurate picture.

The more valuable a customer is likely to be to a business, the more that business should invest in that relationship. One should think about customer lifetime value as a win-win situation for the business and the customer. The higher a customer’s LTV is to your business, the more likely your business should be to address their needs.

The so-called Pareto principle is often invoked here: 20% of your customers represent 80% of your sales. What if you could identify these customers, not just in the past but in the future as well? Predicting LTV is a way of identifying those customers in a data-centric manner.
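As a quick illustration, the Pareto claim can be checked directly from a transaction log. A minimal pandas sketch with invented data (the column names are hypothetical):

```python
import pandas as pd

# Hypothetical transaction log: one row per purchase.
transactions = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3, 4, 5],
    "revenue": [120.0, 80.0, 15.0, 300.0, 250.0, 90.0, 20.0, 10.0],
})

# Total revenue per customer, sorted from largest to smallest.
per_customer = (transactions.groupby("customer_id")["revenue"]
                .sum()
                .sort_values(ascending=False))

# Share of total sales contributed by the top 20% of customers.
top_n = max(1, int(len(per_customer) * 0.2))
top_share = per_customer.head(top_n).sum() / per_customer.sum()
print(f"Top 20% of customers generate {top_share:.0%} of sales")
```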

 

Business Strategy and LTV

There are some more or less “standard” ways of calculating LTV that I will touch upon a little later in this article. These out-of-the-box calculation methods can be good, but more importantly, they provide good starting points.

What I mean by this is that a business leader will have to consider and weigh in on which factors are included in the LTV calculation. LTV should set the direction for your business, because LTV is also about business strategy: it will not be the same for every business, and it might even change over time for the same business.

If your business strategy is about sustainability, then the LTV should include some factors that measure it. Perhaps a customer has more strategic value to your business if they buy the more sustainable version of your product. This is not a set-and-forget metric either: it should be revisited over time to check that it still reflects your business strategy and goals.

The LTV is also important because other major metrics and decision thresholds can be derived from it. For example, the LTV is naturally an upper limit on the spending to acquire a customer, and the sum of the LTVs for all of the customers of a brand, known as the customer equity, is a major metric for business valuations.

 

Methods of Calculating LTV

At their core, LTV models can be used to answer these types of questions about customers:

  • How many transactions will the customer make in a given future time window?
  • How much value will the customer generate in a given future time window?
  • Is the customer in danger of becoming permanently inactive?

When you are predicting LTV, there are two distinct problems which require different data and modeling strategies:

  • Predict the future value for existing customers
  • Predict the future value for new customers

Many companies predict LTV only by looking at the total monetary amount of sales, without using context. For example, a customer who makes one big order might be less valuable than another customer who buys multiple times, but in smaller amounts.

LTV modeling can help you better understand the buying profile of your customers and help you value your business more accurately. By modeling LTV, an organization can:

  • Decide how much to invest in advertising
  • Decide which customers to target with advertising
  • Plan how to move customers from one segment to another
  • Plan pricing strategies
  • Decide which customers to dedicate more resources to

LTV models are used to quantify the value of a customer and estimate the impact of actions that a business might take. Let us take a look at two example scenarios for LTV calculation.

Non-contractual and contractual businesses are two common settings for approaching LTV, covering two different types of businesses or products. Other types include multi-tier products, cross-selling of products, or ad-supported products, among others.

 

Non-contractual Business

One of the most basic ways of calculating LTV is by looking at your historical purchase and customer-interaction data and calculating the number of transactions per customer and the average value of a transaction.

Then, using the data available, you need to build a model that can estimate each customer's probability of purchase in a future time window. Once you have the following three metrics, you can get the LTV by multiplying them:

LTV = Number of transactions x Value of transactions x Probability of purchase
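To make the multiplication concrete, here is a minimal Python sketch, assuming per-customer aggregates in a pandas DataFrame; the column names and the purchase probabilities (which would come from a separate propensity model) are hypothetical:

```python
import pandas as pd

# Hypothetical per-customer aggregates; p_purchase would come from
# a separate propensity model trained on historical data.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "n_transactions": [12, 3, 7],         # transactions in the window
    "avg_transaction_value": [45.0, 180.0, 60.0],
    "p_purchase": [0.9, 0.4, 0.7],        # P(purchase in future window)
})

customers["ltv"] = (customers["n_transactions"]
                    * customers["avg_transaction_value"]
                    * customers["p_purchase"])
print(customers[["customer_id", "ltv"]])
```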

There are some gotchas in this way of modeling the problem. First of all, as discussed earlier, what is value? Is it revenue or profit or quantity sold? Does a certain feature of a product increase the value of a transaction? 

The value should be something that adheres to your business strategy and discourages short-term profit seeking and instead fosters long-term customer relationships.

Second, as mentioned earlier, predicting LTV for new customers will require different methods as they do not have a historical record of transactions.

 

Contractual Business

For a contractual business with a subscription model, the LTV calculation is different, as a customer is locked into buying from you for the duration of the contract. You can also directly observe churn, since customers who churn simply do not re-subscribe. Examples include a magazine with a monthly subscription or a streaming service.

For such products, one can calculate the LTV from the expected number of months for which the customer will stay subscribed:

LTV = Survival rate x Value of subscription x Discount rate

The survival rate by month would be the proportion of customers that maintain their subscription. This can be estimated from the data by customer segment using, for example, survival analysis. The value of a subscription could be revenue minus cost of providing the service and minus customer acquisition cost.

Again, your business has to decide what counts as value. The discount rate is there because the subscription payments arrive in the future, and future cash flows are worth less than money today.
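One way to make the formula concrete is to treat the LTV as a discounted sum over future months, where each month's subscription value is weighted by the probability that the customer is still subscribed. A minimal sketch under that interpretation, with made-up retention, margin, and discount numbers:

```python
def subscription_ltv(monthly_value: float,
                     monthly_survival: float,
                     annual_discount_rate: float,
                     horizon_months: int = 120) -> float:
    """Discounted sum of monthly value, weighted by survival probability."""
    monthly_discount = (1 + annual_discount_rate) ** (1 / 12)
    ltv = 0.0
    for t in range(1, horizon_months + 1):
        survival = monthly_survival ** t            # P(still subscribed at month t)
        ltv += monthly_value * survival / monthly_discount ** t
    return ltv

# E.g. 10 EUR/month margin, 95% monthly retention, 8% annual discount rate.
print(f"LTV = {subscription_ltv(10.0, 0.95, 0.08):.2f} EUR")
```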

 

Actions and Measures

So you now have an LTV metric that decision makers in your organization are happy with. Now what? Do you just slap it on a dashboard? Do you recalculate the metric once a month and show the evolution of this metric on a dashboard?

Is LTV just another metric that the data analysis team provides to stakeholders and expects them to somehow use it to “drive business results”? Those are fine ideas but they don’t drive action by themselves. 

The LTV metric can be used in multiple ways. For example, in marketing one can design treatments by segment and run experiments to see which treatments maximize LTV instead of short-term profit.

Multiplying the probability that a customer reacts favorably to a designed treatment by the LTV gives the expected reward. Subtracting the treatment cost from that reward gives the expected business value. Thus, one gets the expected business value of each treatment and can choose the one with the best effect for each customer or customer segment.

Doing this calculation for our entire customer base will give a list of customers for whom to provide a specific treatment that maximizes LTV given our marketing budget. LTV can also be used to move customers from one segment to another.
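As a rough sketch of how this selection could be computed, assuming a response-probability matrix from some upstream model and invented uplift and cost figures:

```python
import numpy as np

# Rows: customers, columns: candidate treatments (incl. "do nothing").
# p_respond[i, j]: modeled probability that customer i responds to treatment j.
p_respond = np.array([[0.05, 0.20, 0.10],
                      [0.02, 0.08, 0.15],
                      [0.10, 0.12, 0.30]])
ltv_uplift = np.array([0.0, 50.0, 120.0])   # hypothetical LTV gain on response
cost = np.array([0.0, 2.0, 10.0])           # cost of each treatment

# Expected reward minus treatment cost, broadcast over all customers.
expected_value = p_respond * ltv_uplift - cost
best_treatment = expected_value.argmax(axis=1)
print(best_treatment)  # index of the best treatment per customer
```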

For pricing, one could estimate how different segments of customers react to different pricing strategies and use price to steer the LTV trajectory of the customer base toward a more optimal level. For example, if using dynamic pricing algorithms, the LTV can be taken into account in the reward function.

Internal teams should track the KPIs they control that feed into the LTV calculation. For example, in a non-contractual context the product team can be measured on how well they increase the average number of transactions, and in a contractual context on the number of months that a typical customer stays subscribed.

The support team can be measured on the way that they provide customer service to reduce customer churn. The product development team can be measured on how well they increase the value per transaction by reducing costs or by adding features. The marketing team can be measured on the effectiveness of treatments to customer segments to increase the probability of purchase. 

After all, you get what you measure.

 

A Word on Data

LTV models generally aim to predict customer behavior as a function of observed customer features. This means that it is important to collect data about interactions, treatments and behaviors. 

Purchasing behavior is driven by fundamental factors such as valuation of a product or service compared with competing products or services. These factors may or may not be directly measurable but gathering information about competitor prices and actions can be crucial when analyzing customer behavior.

Other important data is created by the interaction between a customer and a brand. These properties characterize the overall customer experience, including customer satisfaction and loyalty scores.

The most important category of data is observed behavioral data. This can be in the form of purchase events, website visits, browsing history, and email clicks. This data often captures interactions with individual products or campaigns at specific points in time. From purchases one can quantify metrics like frequency or recency of purchases. 

Behavioral data carries the most important signals for modeling, as customer behavior is at the core of predicting LTV.

The data described above should also be augmented with additional features from your business's side of the equation, such as catalog data, seasonality, prices, discounts, and store-specific information.
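As a small illustration, here is a sketch of deriving the recency, frequency, and monetary features mentioned above from a purchase-event log with pandas; the column names and dates are hypothetical:

```python
import pandas as pd

# Hypothetical purchase-event log: one row per purchase event.
events = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2, 3],
    "ts": pd.to_datetime(["2024-01-05", "2024-03-20", "2024-02-11",
                          "2024-02-25", "2024-04-02", "2024-04-10"]),
    "amount": [40.0, 60.0, 15.0, 25.0, 30.0, 200.0],
})
now = pd.Timestamp("2024-05-01")

# Classic RFM features per customer.
rfm = events.groupby("customer_id").agg(
    recency_days=("ts", lambda s: (now - s.max()).days),
    frequency=("ts", "count"),
    monetary=("amount", "sum"),
)
print(rfm)
```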

 

Prerequisites for Implementing LTV

Thus far in this article we have discussed why LTV is important, shown some examples of how to calculate it, and briefly discussed how to make it actionable. Here are some questions that need to be answered before implementing an LTV calculation method:

  • Do we know who our customers are?
  • What is the best measure of value?
  • How do we incorporate business strategy into the calculation?
  • Is the product a contractual or non-contractual product?

If you can answer these questions then you can start to implement your first actionable version of LTV.

See a demo here.

 

 

About the author: Janne Flinck is an AI & Data Lead at Codento. Janne joined Codento from Accenture in 2022, with extensive experience in Google Cloud Platform, Data Science, and Data Engineering. His interests are in creating and architecting data-intensive applications and tooling. Janne has three professional certifications and one associate certification in Google Cloud, and a Master's Degree in Economics.

 

Please contact us for more information on how to utilize machine learning to optimize your customers’ LTV.

Piloting Machine Learning at Speed – Utilizing Google Cloud and AutoML


 

Can modern machine learning tools do a week's work in an afternoon? The development of machine learning models has traditionally been a very iterative process. A traditional machine learning project starts with the selection and preparation of data sets: cleaning and pre-processing. Only then can the actual development of the machine learning model begin.

It is very rare, virtually impossible, for a new machine learning model to make sufficiently good predictions on the first try. Development work traditionally involves a significant number of failed attempts, both in the selection of algorithms and in their fine-tuning, known in technical language as hyperparameter tuning.

All of this requires working time, in other words, money. What if, after cleaning the data, all the remaining development steps could be automated? What if the development project could be carried out in a single fast-paced sprint of a day?

 

Machine learning and automation

In recent years, the automation of building machine learning models (AutoML) has taken significant leaps. Roughly described: in traditional machine learning, the data scientist builds a machine learning model and trains it with a large dataset. AutoML, on the other hand, is a relatively new approach in which the machine learning model, in effect, builds and trains itself using a large dataset.

All the data scientist needs to do is tell the tool what the problem is. This can be, for example, a machine vision, pricing, or text analysis problem. However, data scientists will not be made unemployed by AutoML models: the workload shifts from fine-tuning the model to validating it and using explainable AI tools.

 

Google Cloud and AutoML used to solve a practical challenge

Some time ago, we at Codento tested Google Cloud's AutoML-based machine learning tools [1]. Our goal was to find out how well the Google Cloud AutoML tool solves the Kaggle House Prices – Advanced Regression Techniques challenge [2].

The goal of the challenge is to build the most accurate tool possible for predicting the selling prices of properties based on their characteristics. The data set used to build the pricing model contained data on approximately 1,400 properties: in total, 80 different parameters that could potentially affect the price, as well as the actual sales prices. Some of the parameters were numerical, some categorical.

 

Building a model in practice

The data used was pre-cleaned, so the first phase of building the machine learning model was already complete. First, the data set, a file in CSV format, was uploaded as-is to the Google Cloud BigQuery data warehouse. The upload took advantage of BigQuery's ability to infer the table schema directly from the file structure. The AutoML Tabular feature found in the Vertex AI tool was used to build the actual model.

After some clicking, the tool was told which of the price-predicting parameters were numeric and which were categorical variables. In addition, the tool was told which column contains the target to be predicted. All of this took about an hour of work. After that, the training was started and we began waiting for the results. About 2.5 hours later, the Google Cloud robot sent an email stating that the model was ready.
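We did this through the console UI, but roughly the same steps can be scripted with the Vertex AI Python SDK. A sketch of that route; the project, location, and BigQuery table names below are placeholders, not the ones from our experiment:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="europe-west1")  # placeholders

# Dataset created from the BigQuery table the CSV was loaded into.
dataset = aiplatform.TabularDataset.create(
    display_name="house-prices",
    bq_source="bq://my-project.housing.train",  # hypothetical table
)

job = aiplatform.AutoMLTabularTrainingJob(
    display_name="house-prices-automl",
    optimization_prediction_type="regression",
)

# ~1 node hour of training budget; AutoML handles the model search itself.
model = job.run(
    dataset=dataset,
    target_column="SalePrice",
    budget_milli_node_hours=1000,
)
```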

 

The final result was a positive surprise

The accuracy of the model created by AutoML surprised the developers. Google Cloud AutoML was able to independently build a pricing model that predicts home prices with approximately 90% accuracy. That level of accuracy per se does not differ from the general level of accuracy of pricing models. What is noteworthy, however, is that the development of this model took a total of half a working day.

However, the benefits of GCP AutoML do not end there. The model could be integrated with very little effort into a data pipeline in Google Cloud. It could also be exported as a container and deployed on other cloud platforms.

 

Approach which pays off in the future as well

For good reason, tools based on AutoML can be considered the latest major development in machine learning. Thanks to these tools, the development of an individual machine learning model no longer has to be treated as a project or an investment. Utilizing the full potential of these tools, models can be built on a near-zero budget. New forecasting models based on machine learning can be built almost on a whim.

However, the effective deployment of AutoML tools requires a significant initial investment: the entire data infrastructure, data warehouses and lakes, data pipelines, and visualization layers must first be built with cloud-native tools. Codento's certified cloud architects and data engineers can help with these challenges.

 

Sources:

[1] Google Cloud AutoML, https://cloud.google.com/automl/

[2] Kaggle, House Prices – Advanced Regression Techniques, https://www.kaggle.com/competitions/house-prices-advanced-regression-techniques/

 

The author of the article is Jari Rinta-aho, Senior Data Scientist & Consultant, Codento. Jari is a consultant and physicist interested in machine learning and mathematics, with extensive experience in utilizing machine learning in nuclear energy. He has also taught physics at several universities and led international research projects. Jari’s interests include ML-Ops, AutoML, Explainable AI and Industry 4.0.

 

Ask more about Codento’s AI and data services:

Business-driven Machine Learning with Google Cloud: Multilingual Customer Feedback Classifier


Author: Jari Rinta-aho, Codento

At Codento, we have rapidly expanded our services to demanding implementations and services for data and machine learning. When discussing with our customers, the following business goals and expectations have often come to the fore:

  • Disclosure of hidden regularities in data
  • Automation of analysis
  • Minimizing human error
  • New business models and opportunities
  • Improving and safeguarding competitiveness
  • Processing of multidimensional and versatile data material

In this blog post, I will go through the lessons from our recent customer case.

Competitive advantage from a deep understanding of customer feedback

A very concrete business need arose this spring for a Finnish B-to-C player: huge amounts of customer feedback data were coming in, but how could that feedback be used intelligently to support the right business decisions?

Codento recommended the use of machine learning

Codento's recommendation was to tackle the challenge with a machine learning approach, using Google Cloud's off-the-shelf features to get the customer feedback classifier ready within a week.

The goal was to automatically classify short customer feedback into three baskets: positive, neutral, and negative. The feedback consisted mainly of short Finnish texts. However, there were also a few texts written in Swedish and English, so the classifier also had to recognize the language of the source text automatically.

Can you really expect results in a week?

At the same time, the project schedule was tight and the goal ambitious. There was no time to waste, and in practice the results had to be obtained on the first try. Codento therefore decided to make the most of ready-made cognitive services.

Google Cloud plays a key role

It was decided to implement the classifier by combining two ready-made tools found on the Google Cloud Platform: the Translate API and the Natural Language API. The idea was to machine-translate the texts into English and then determine their tone. Because the Translate API can automatically detect the source language from about a hundred different languages, the tool met the requirements, at least on paper.
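A minimal sketch of this two-step pipeline with the Google Cloud Python client libraries is shown below; the sentiment-score thresholds used to form the three baskets are our assumption for illustration, not the exact values used in the project:

```python
from google.cloud import translate_v2 as translate
from google.cloud import language_v1

translate_client = translate.Client()
language_client = language_v1.LanguageServiceClient()

def classify_feedback(text: str) -> str:
    # Step 1: translate to English; the source language is auto-detected.
    translated = translate_client.translate(text, target_language="en")

    # Step 2: sentiment of the translated text, score in [-1, 1].
    document = language_v1.Document(
        content=translated["translatedText"],
        type_=language_v1.Document.Type.PLAIN_TEXT,
    )
    score = language_client.analyze_sentiment(
        request={"document": document}
    ).document_sentiment.score

    # Bucket thresholds are an assumption, not from the original project.
    if score > 0.25:
        return "positive"
    if score < -0.25:
        return "negative"
    return "neutral"

print(classify_feedback("Ei valittamista"))  # Finnish for "No complaining"
```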

Were the results useful?

Random sampling and manual work were used to validate the results. From the existing data, 150 texts were selected at random for validating the classifier. First, these texts were sorted by hand into three categories: positive, neutral, and negative. Then the same classification was made with the tool we developed. Finally, the results of the tool and the manual classification were compared.
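A sketch of this comparison step, using scikit-learn to compute the agreement rate and the confusion matrix; the label lists here are tiny placeholders for the 150 validated texts:

```python
from sklearn.metrics import accuracy_score, confusion_matrix

labels = ["positive", "neutral", "negative"]
# Hypothetical stand-ins for the manually and automatically classified texts.
manual = ["positive", "neutral", "negative", "positive", "neutral"]
tool   = ["positive", "neutral", "negative", "neutral",  "neutral"]

print("Agreement:", accuracy_score(manual, tool))
print(confusion_matrix(manual, tool, labels=labels))
```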

What was achieved?

The tool and the human validator agreed on about 80% of the feedback, and in no case did they take directly opposite views. The validation results were collected into a confusion matrix.

The numbers 18, 30, and 75 on the diagonal of the confusion matrix describe the feedback for which the validator and the tool agreed on the tone. A total of 11 feedbacks were cases where the validator considered the tone positive but the tool judged it neutral.

 

The most significant factor explaining the tool's differing interpretations is the cultural context of the wording of the customer feedback: when a Finn says “No complaining”, he is giving praise.

Coming from an American, the same phrase would be neutral feedback. This cultural difference alone is sufficient to explain why the largest single error group was “positive in the view of the validator, neutral in the view of the tool”. The rest of the error is explained by the difficulty of borderline cases: it is impossible to say unambiguously when slightly positive feedback turns neutral, and vice versa.

Utilizing the solution in business

The data-validated approach was well suited to solving the challenge and is an excellent starting point for understanding the nature of feedback going forward, developing further models for more detailed analysis, speeding up analysis, and reducing manual work. The solution can also be applied to a wide range of similar situations and needs in other processes or industries.

The author of the article is Jari Rinta-aho, Senior Data Scientist & Consultant, Codento. Jari is a consultant and physicist interested in machine learning and mathematics, with extensive experience in applying machine learning in, for example, nuclear technology. He has also taught physics at university and led international research projects.