Machine Learning & Predictive Analytics to Identify High CLTV Segments

Business Objective
A subscription-based clothing e-tailer in Los Angeles had been collecting a large amount of potentially valuable information on its customers, but had not yet been able to mine this data for the insights needed to optimize marketing channel performance and drive the business forward.

The client wanted to use these insights to improve its tailored marketing, customer acquisition, long-term retention and ongoing customer engagement. The e-tailer was using several customer acquisition channels but did not know which of them were producing the customers with the highest lifetime value.

eSage Group was brought in to create a Business Intelligence solution that would let the client analyze and predict customer lifetime value (CLTV) in order to optimize marketing spend and increase customer profitability.

Solution
The CLTV solution required a place to store large amounts of historical customer data, as well as a Machine Learning platform to analyze and prepare this data for data mining efforts. The Business Intelligence solution eSage Group developed for this client was built on the Microsoft Azure cloud platform using Azure predictive machine learning models to extract insights from the historical customer data.

Microsoft Azure Storage and the Microsoft HDInsight Hadoop service were chosen as the core platform because the client wanted to work within the flexible Microsoft stack and use the Azure Machine Learning capabilities to determine CLTV. Much of the heavy lifting involved provisioning and tuning HDInsight, processing the data through Hive, and then moving the data to the Azure Machine Learning (ML) workspace. HDInsight is a cloud implementation of the rapidly expanding Apache Hadoop Big Data technology stack.

HDInsight allows e-tailers and others with Big Data needs to collect and mine large amounts of structured and unstructured data to greatly improve customer intelligence, optimize marketing performance and ultimately increase sales revenue.

eSage Group first consolidated existing customer information from various US and worldwide business units operating under the parent company. eSage Group cleansed, transformed, mapped and loaded this customer data into Azure Storage before it could be used for analysis via HDInsight and Machine Learning.

eSage Group built an Azure datamart to store this consolidated customer information. The available customer data was surfaced through external and managed Hive tables via the HDInsight service. With the new customer datamart in place, eSage Group was able to test various machine learning models against the customer data to determine which attributes of new or existing customers best indicated a propensity for a high lifetime value.

The Azure machine learning model leveraged data such as email domain, age, buying behavior, VIP status and historical profitability to score customers on their propensity to become high lifetime value customers and predict specific future buying behavior.

A data mining model is only as good as the data provided to it, so we used only those features in the dataset that contributed to an accurate prediction of high-value customers. In Azure Machine Learning, we used a “Filter Based Feature Selection” (FBFS) component to help isolate the 10 best features in each dataset. These features were then used to train the models with their respective methods (Score, Propensity, etc.). Within the FBFS, a variety of algorithms such as the chi-squared test, mutual information and multi-class logistic regression were used.
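As an illustration of filter-based feature selection, the following minimal Python sketch scores each categorical feature against a high-value label using the chi-squared statistic and keeps the top k. It is not the Azure Machine Learning component itself; the column names, toy data and function names are assumptions made for the example.

```python
from collections import Counter


def chi_squared(feature_values, labels):
    """Pearson chi-squared statistic for one categorical feature vs. the label."""
    n = len(labels)
    feature_counts = Counter(feature_values)
    label_counts = Counter(labels)
    observed = Counter(zip(feature_values, labels))
    stat = 0.0
    for f, f_count in feature_counts.items():
        for y, y_count in label_counts.items():
            # Expected cell count if the feature and the label were independent.
            expected = f_count * y_count / n
            stat += (observed[(f, y)] - expected) ** 2 / expected
    return stat


def select_top_features(rows, label_key, k=10):
    """Score every non-label column and return the k highest-scoring names."""
    labels = [row[label_key] for row in rows]
    feature_names = [name for name in rows[0] if name != label_key]
    scores = {
        name: chi_squared([row[name] for row in rows], labels)
        for name in feature_names
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical toy data: columns and values are illustrative only.
sample_customers = [
    {"email_domain": "gmail", "vip": "yes", "age_band": "25-34", "high_value": 1},
    {"email_domain": "gmail", "vip": "yes", "age_band": "35-44", "high_value": 1},
    {"email_domain": "yahoo", "vip": "no", "age_band": "25-34", "high_value": 0},
    {"email_domain": "gmail", "vip": "no", "age_band": "35-44", "high_value": 0},
    {"email_domain": "yahoo", "vip": "yes", "age_band": "25-34", "high_value": 1},
    {"email_domain": "yahoo", "vip": "no", "age_band": "35-44", "high_value": 0},
]

top_features, feature_scores = select_top_features(sample_customers, "high_value", k=2)
print(top_features)
```

In this toy data the `vip` column separates the two classes perfectly, so it receives the highest score. A filter method like this scores each feature independently of any model, which is what makes it cheap to run before training.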

Results
The team set out to define a “most valuable customer” by looking at past purchasing behavior and overall actions by the customer over time. Customers were ranked by metrics such as gross margin against their peers, by store and by the number of months since they first activated as a VIP. By determining each customer’s relative value position, it was possible to separate the truly valuable customers from the less financially valuable customers and zero in on the common attributes of these most desirable customers. Additional segmentation on attributes like billing actions (Cancel, Return, Skip, Purchase, etc.) further refined the behavioral profile of the most valuable customers.
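One way to read the peer ranking described above is as a percentile rank of gross margin within each peer group. The sketch below assumes that peers share a store and a VIP tenure bucket; the field names, the yearly bucket size and the 0.9 cutoff are hypothetical and not taken from the project.

```python
from collections import defaultdict


def rank_within_peers(customers):
    """Map each customer id to the percentile rank of its gross margin among peers."""
    groups = defaultdict(list)
    for c in customers:
        # Assumption: peers share a store and a yearly VIP tenure bucket.
        groups[(c["store"], c["vip_months"] // 12)].append(c)

    ranks = {}
    for members in groups.values():
        margins = [m["gross_margin"] for m in members]
        for m in members:
            # Share of the peer group with a margin at or below this customer's.
            at_or_below = sum(1 for x in margins if x <= m["gross_margin"])
            ranks[m["id"]] = at_or_below / len(margins)
    return ranks


# Hypothetical customers for illustration.
peer_sample = [
    {"id": "a", "store": "US", "vip_months": 3, "gross_margin": 120.0},
    {"id": "b", "store": "US", "vip_months": 5, "gross_margin": 40.0},
    {"id": "c", "store": "US", "vip_months": 8, "gross_margin": 75.0},
    {"id": "d", "store": "UK", "vip_months": 14, "gross_margin": 60.0},
    {"id": "e", "store": "UK", "vip_months": 20, "gross_margin": 90.0},
]

peer_ranks = rank_within_peers(peer_sample)
most_valuable = [cid for cid, rank in peer_ranks.items() if rank >= 0.9]
print(most_valuable)
```

Ranking within a peer group rather than across the whole customer base keeps a long-tenured customer in a large store from automatically outranking a newer customer in a smaller one.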

eSage Group created a new web service that exposes the output of the new machine learning models, providing the deeper segmentation and intelligence required to identify and target high-value customer prospects with additional marketing campaigns, outreach and engagement. Conversely, customers with a lower estimated lifetime value could now be identified and approached in a way that either converts them into valued customers or limits our client’s exposure to them.

The actionable insights derived from CLTV and customer behaviors were used to optimize customer acquisition efforts (i.e., seeking new leads from the customer sources that deliver the highest-value customers) and to optimize customer retention efforts (i.e., focusing efforts on extending the value of high-potential customers).
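To illustrate how CLTV scores can inform acquisition spend, this sketch averages a predicted CLTV per acquisition channel and ranks the channels from highest to lowest. The channel names, the `predicted_cltv` field and all values are invented for the example and do not reflect the client's data.

```python
from collections import defaultdict


def rank_channels(scored_customers):
    """Return (channel, average predicted CLTV) pairs, highest average first."""
    totals = defaultdict(lambda: [0.0, 0])  # channel -> [sum of CLTV, customer count]
    for c in scored_customers:
        bucket = totals[c["channel"]]
        bucket[0] += c["predicted_cltv"]
        bucket[1] += 1
    averages = {ch: total / count for ch, (total, count) in totals.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scored customers for illustration.
scored_sample = [
    {"id": 1, "channel": "paid_search", "predicted_cltv": 300.0},
    {"id": 2, "channel": "paid_search", "predicted_cltv": 500.0},
    {"id": 3, "channel": "email", "predicted_cltv": 200.0},
    {"id": 4, "channel": "affiliate", "predicted_cltv": 650.0},
    {"id": 5, "channel": "affiliate", "predicted_cltv": 550.0},
]

channel_ranking = rank_channels(scored_sample)
for channel, avg_cltv in channel_ranking:
    print(f"{channel}: {avg_cltv:.2f}")
```

A fuller comparison would also weigh each channel's acquisition cost against its average CLTV; that input is left out here because the case study does not describe it.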

The new CLTV scores are being brought into the client’s data warehouse for use across a variety of strategic initiatives to drive marketing strategies and improve customer insights.

Deliverables
  • Designed and built Microsoft Azure Customer Datamart on HDInsight
  • Determined attributes of high value customers through Azure Machine Learning CLTV Model analysis
  • Summarized results of Azure Machine Learning CLTV analysis and findings
  • Created a CLTV Web Service and logic that allows the client to run a set of customers against the CLTV service to determine predicted lifetime value