FasterAnalytics for Consumer Products & Financial Services - Telephone Carrier Churn Case Study
DecisionQ has developed FasterAnalytics,
a unique analytics package that enables researchers, analysts,
and managers to use sophisticated predictive analytics from the
desktop. FasterAnalytics is fast and creates high quality, predictive
models from data that enable efficient review of consumer and financial data,
real-time hypothesis testing, and rapid decisions.
FasterAnalytics uses a modeling approach called Bayesian Networks to provide a
mapping of the complex relationships in data, which can then be
used to make high quality predictions. Users can:
- Get an instant global view of their data.
- Understand the driving factors in the data.
- Test hypotheses in real time in our model Explorer.
- Produce reports that can be exported to other applications.
- Make determinations that can help prioritize the use of scarce resources.
Market Overview
Product, Marketing, and Sales managers consistently desire better
tools to enhance their sales efforts. FasterAnalytics can help
organizations turn consumer and financial data into valuable information,
allowing them to make decisions with advanced knowledge of high-probability
consumer and market reactions. To assist in this effort, DecisionQ
has developed FasterAnalytics
for Consumer Products & Financial Services, a tool for modeling
consumer and financial data. DecisionQ's FasterAnalytics software
can combine data and business experience to create powerful predictive
models, models that can be used to understand future consumer
behavior.
Value to the Customer
FasterAnalytics enables both experts and non-experts in statistics
to discover and leverage knowledge from large quantities of data
quickly. Examples include:
- Automatically mapping data where targets are unknown to reveal correlations.
- Discovering new relationships between variables and identifying new opportunities
to improve sales or reduce cost.
- Identifying potential concerns early.
- Discovering populations that may have substantially different responses
from the population at large.
- Predicting the behavior of any factor or combination of factors in the
model.
FasterAnalytics is designed for real-time environments. Bayesian models are highly
effective at identifying emerging trends that can be used either
to identify potential adverse events or to improve the quality of outcomes.
Product and Technology
DecisionQ Corporation has produced a range of modules that include
data analysis, modeling, visualization, reporting, and decision
optimization. FasterAnalytics modules include:
- Discretizer. Automatically configures the data for modeling.
- Modeler. Quickly creates a visual model of the data.
- Explorer. Allows real-time generation and testing of hypotheses.
- Reporter. Extracts insights and key points for inclusion in reports and
presentations.
Using the System: A Prognostic Example
The following is an example application of our software to analyze
the influencing attributes associated with the customer account
non-renewal or cancellation ("churn") data set. We have used a
data set comprising 5,000 telephone customers with 21 attributes
or markers. FasterAnalytics built the model in this example, from
start to finish, in less than 5 minutes.
To build predictive models, our learning engine requires the data
to be in a flat tabular format. The data can be numerical or variable
character strings. Our software also handles missing values automatically
and will either impute a value or treat missing values as a special
category, at the user's discretion.
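The two missing-value options described above can be illustrated with a short Python sketch. This is not FasterAnalytics code: the function names, the "MISSING" label, and the choice of the most frequent value as the imputation rule are assumptions made for the example, since the text does not say how the software imputes values.

```python
from collections import Counter

MISSING_LABEL = "MISSING"  # hypothetical name for the special category


def impute_most_frequent(values):
    """Replace None with the most frequent observed value (one simple imputation rule)."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


def missing_as_category(values):
    """Keep missing entries, but give them their own explicit category."""
    return [MISSING_LABEL if v is None else v for v in values]


# Made-up column with two missing entries.
vm_plan = ["yes", "no", None, "no", "no", None]
print(impute_most_frequent(vm_plan))
print(missing_as_category(vm_plan))
```

Imputation keeps the number of categories unchanged, while the special-category option lets a model learn whether the absence of a value is itself informative.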
Figure 1: This example uses a data set held in an Excel spreadsheet, as shown below
Once the data is selected, a fully automated process runs until
a full model is presented; alternatively, the user can stop each part of the
process to change parameters manually. The software begins by
categorizing the data and 'binning' it in accordance with the default
settings. The data is then passed seamlessly to the Modeler for
automated model development. Once the software has mapped the
complex correlations in the data, a model is presented in the Explorer.
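As a rough illustration of the 'binning' step, the sketch below assigns numeric values to equal-width bins. The default settings used by the Discretizer are not described here, so equal-width binning, the function name, and the sample values are all assumptions chosen for the example.

```python
def equal_width_bins(values, n_bins=3):
    """Assign each numeric value a bin index from 0 to n_bins - 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so everything falls into one bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Made-up daytime charges, not taken from the case-study data set.
total_day_charge = [12.5, 30.1, 45.0, 27.3, 60.0, 18.9]
print(equal_width_bins(total_day_charge, n_bins=3))
```

Turning continuous charges into a small number of states such as low, medium, and high is what allows a discrete Bayesian network to hold a probability table for each node.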
Figure 2: Base case model of the data presented in Explorer
The display illustrates conditional dependence between variables and
the pathways existing in the predictive model. Notice that the
network has multiple branches, and that the data is interrelated
in a "web", one of the strengths of multivariate Bayesian networks.
In the example below, we examine likely prognostic and predictive
indicators that correlate with Churn. We begin by selecting our
target variable, CHURN, and setting it to "True." We are looking
for variables that were influenced when Churn occurred. We can
see that the coloring and distribution of the surrounding nodes
have changed to indicate the positive or negative effects on other
predictive markers associated with the current case. Note that
our indicators (VM PLAN, INTL PLAN, TOTAL DAY CHRG, and TOTAL
EVE CHRG) have indicated either positive or negative correlation.
It is clear that CHURN is associated with high phone usage as
both Total Day Charges and Total Evening Charges increase
when CHURN is set to true. Additionally, we see that expensive
international call usage is associated with Churn. These three
variables suggest that customers are price sensitive and may switch
to a competitor seeking cheaper alternatives when they are active
phone users and international callers. Interestingly, we see an
inverse correlation with respect to the use of voicemail plans.
This suggests that customers are much less likely to churn when
they have a voicemail plan in place.
Figure 3: CHURN set to True
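Setting CHURN to "True" and watching the surrounding nodes shift amounts to comparing each variable's distribution among churned customers with its distribution in the whole population. The sketch below estimates both by counting rows. It is a simplified illustration, not the Explorer's inference method; the records and the helper name `share` are made up for the example.

```python
def share(records, column, state, given=None):
    """Estimate P(column == state | given) by counting matching rows."""
    given = given or {}
    subset = [r for r in records if all(r[k] == v for k, v in given.items())]
    if not subset:
        return None  # no rows match the evidence
    return sum(r[column] == state for r in subset) / len(subset)


# Made-up records for illustration; not the 5,000-customer data set.
records = [
    {"VM_PLAN": "yes", "INTL_PLAN": "no", "CHURN": False},
    {"VM_PLAN": "yes", "INTL_PLAN": "no", "CHURN": False},
    {"VM_PLAN": "no", "INTL_PLAN": "no", "CHURN": False},
    {"VM_PLAN": "no", "INTL_PLAN": "yes", "CHURN": True},
    {"VM_PLAN": "no", "INTL_PLAN": "yes", "CHURN": True},
    {"VM_PLAN": "yes", "INTL_PLAN": "yes", "CHURN": False},
    {"VM_PLAN": "no", "INTL_PLAN": "no", "CHURN": True},
    {"VM_PLAN": "no", "INTL_PLAN": "no", "CHURN": False},
]

base = share(records, "VM_PLAN", "yes")
churned = share(records, "VM_PLAN", "yes", given={"CHURN": True})
print(f"P(VM_PLAN=yes) = {base:.2f}")
print(f"P(VM_PLAN=yes | CHURN=True) = {churned:.2f}")
```

A drop in the share of voicemail-plan holders once CHURN is fixed to "True" corresponds to the inverse correlation described in the text.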
Next, we set CHURN to "False" and examine the effect on the likely prognostic
and predictive indicators. Note that our indicators (VM PLAN,
INTL PLAN, TOTAL DAY CHRG, and TOTAL EVE CHRG) retain approximately
the same value.
Figure 4: CHURN set to False
Compare the two models in Figures 3 and 4 above with the base level in
Figure 2. While CHURN shares conditional dependence with the same
nodes, the behavior of those markers changes based upon the presence
of churn. The coloring in the graphical model shows the change
in population profile quickly and effectively.
It is also possible to select two or more variables simultaneously.
For example, the extent to which VM PLAN and INTL PLAN affect
CHURN can be studied together. If we wish to test hypotheses,
we can modify any node and see how our hypothesis affects the
model. Notice how information flows through the network.
Suppose we are interested in examining how VM PLAN and INTL PLAN affect
CHURN as a prognostic indicator (Figure 5). We first select these
nodes and click "Graph" to display the states within these nodes.
This can be done for as many variables as we choose.
In Figure 5, we set VM PLAN to "no" and INTL PLAN to "yes." Here
we see very clearly the prognostic indication for CHURN level.
If a customer does not have a voice mail plan but has an international
plan, the likelihood of churn increases by 32%.
Figure 5: VM PLAN combined with INTL PLAN and correlation with CHURN
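The two-variable scenario above can be sketched as a small conditional probability table: the churn rate for each combination of VM PLAN and INTL PLAN, compared with the overall churn rate. The rows below are invented for illustration and do not reproduce the 32% figure. The text does not say whether that figure is an absolute or a relative increase, so the sketch prints both kinds of change.

```python
from collections import defaultdict


def churn_rate_table(rows):
    """Return {(vm_plan, intl_plan): churn rate} by counting rows."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for vm_plan, intl_plan, churn in rows:
        key = (vm_plan, intl_plan)
        totals[key] += 1
        churned[key] += int(churn)
    return {key: churned[key] / totals[key] for key in totals}


# Made-up rows (VM PLAN, INTL PLAN, CHURN) for illustration only.
rows = [
    ("no", "yes", True), ("no", "yes", True), ("no", "yes", False), ("no", "yes", False),
    ("no", "no", True), ("no", "no", False), ("no", "no", False), ("no", "no", False),
    ("yes", "no", False), ("yes", "no", False), ("yes", "no", False), ("yes", "yes", False),
]

baseline = sum(churn for _, _, churn in rows) / len(rows)
table = churn_rate_table(rows)
scenario = table[("no", "yes")]

print(f"baseline churn rate: {baseline:.2f}")
print(f"churn rate given VM PLAN=no, INTL PLAN=yes: {scenario:.2f}")
print(f"absolute change: {scenario - baseline:+.2f}")
print(f"relative change: {scenario / baseline - 1:+.0%}")
```

Counting rows like this only works when every combination is well represented in the data; a Bayesian network instead combines the probability tables of connected nodes, which is one reason it can handle many variables at once.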
Conclusion:
Churn is one of the major concerns of telephone providers. The
cost of customer acquisition is high, and the highest value customers,
those with high usage and international plans, are most likely
to churn. The model clearly reveals the importance of voicemail
as a deterrent to churn.
The Reporter module can be used to create a report that will show
the conditional probabilities (or predicted likelihood) of any
target variables, given the expression of any independent variable(s).
Any part of the model visualization can be pasted into Reporter
and then transferred into other applications. Figure 6 shows a
sample report. Figure 7 shows the sample report pasted into a
Microsoft Excel worksheet.
Figure 6: A sample report listing the relationship between the target
variable CHURN and the states of the predictive variables VM PLAN
and INTL PLAN
Figure 7: A sample report pasted into a Microsoft Excel worksheet
ROC curves are a new feature added to FasterAnalytics. Accessed through
the Modeler, the ROC curves estimate the predictive accuracy of
the model by testing the algorithm created by the model against
the data set. Figure 8, below, shows the ROC curves and AUC (area-under-curve)
values for both instances of churn (True and False). These numbers
indicate that the model/algorithm created by FasterAnalytics correctly
predicts the value of CHURN 87.4% of the time.
Figure 8: ROC Curves
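For readers who want to see how an AUC value is obtained, the sketch below computes it as the fraction of (churner, non-churner) pairs in which the churner receives the higher predicted score, and compares it with plain accuracy at a fixed threshold. The labels, scores, and function names are invented for the example; how FasterAnalytics computes its own figures is not described here. In general the two measures can differ, so it is worth checking which one a reported percentage refers to.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of cases classified correctly at a fixed probability threshold."""
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


# Made-up churn labels and predicted churn probabilities.
labels = [True, True, False, True, False, False, False, False]
scores = [0.90, 0.70, 0.60, 0.40, 0.35, 0.20, 0.10, 0.05]

print(f"AUC = {roc_auc(labels, scores):.3f}")
print(f"accuracy at 0.5 = {accuracy(labels, scores):.3f}")
```

The pairwise definition is quadratic in the number of cases, which is fine for a small illustration; on larger data sets a rank-based calculation gives the same value more efficiently.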
DecisionQ sells predictive modeling software and complementary professional
services. Alternatively, components from FasterAnalytics
can be integrated into third party applications as part of a broad
data management and analysis platform. If you have any further
questions or would like to schedule a more detailed demonstration
in person or over the web, please contact us.
DecisionQ Corporation
531 Howard Street, 3rd Floor
San Francisco, CA 94105
www.decisionq.com
Phone: 415-254-7996
Fax: 415-276-6356
Email: info@decisionq.com