Telephone Carrier Churn Case Study

FasterAnalytics for Consumer Products & Financial Services - Telephone Carrier Churn Case Study

DecisionQ has developed FasterAnalytics, a unique analytics package that enables researchers, analysts, and managers to use sophisticated predictive analytics from the desktop. FasterAnalytics quickly creates high-quality predictive models from data, enabling efficient data review, real-time hypothesis testing, and rapid decisions.

FasterAnalytics uses a modeling approach called Bayesian Networks to provide a mapping of the complex relationships in data, which can then be used to make high quality predictions. Users can:

  • Get an instant global view of their data.
  • Understand the driving factors in the data.
  • Test hypotheses in real time in our model Explorer.
  • Produce reports that can be exported to other applications.
  • Make determinations that can help prioritize the use of scarce resources.

Market Overview
Product, Marketing, and Sales managers consistently desire better tools to enhance their sales efforts. FasterAnalytics can help organizations turn consumer and financial data into valuable information, allowing them to make decisions with advanced knowledge of high-probability consumer and market reactions. To assist in this effort, DecisionQ has developed FasterAnalytics for Consumer Products & Financial Services, a tool for modeling consumer and financial data. DecisionQ's FasterAnalytics software can combine data and business experience to create powerful predictive models, models that can be used to understand future consumer behavior.

Value to the Customer
FasterAnalytics enables both experts and non-experts in statistics to discover and leverage knowledge from large quantities of data quickly. Examples include:

  • Automatically mapping data where targets are unknown to reveal correlations.
  • Discovering new relationships between variables and identifying new opportunities to improve sales or reduce cost.
  • Identifying potential concerns early.
  • Discovering populations that may have substantially different responses from the population at large.
  • Predicting the behavior of any factor or combination of factors in the model.

FasterAnalytics is designed for real-time environments. Bayesian models are highly effective at identifying emerging trends, which can be used either to identify potential adverse events or to improve the quality of outcomes.

Product and Technology
DecisionQ Corporation has produced a range of modules that include data analysis, modeling, visualization, reporting, and decision optimization. FasterAnalytics modules include:

  • Discretizer. Automatically configures the data for modeling.
  • Modeler. Quickly creates a visual model of the data.
  • Explorer. Allows real-time generation and testing of hypotheses.
  • Reporter. Extracts insights and key points for inclusion in reports and presentations.

Using the System: A Prognostic Example
The following example applies our software to a customer account non-renewal or cancellation ("churn") data set to analyze the attributes that influence churn. The data set comprises 5,000 telephone customers with 21 attributes, or markers. FasterAnalytics built the model in this example, from start to finish, in less than 5 minutes.

To build predictive models, our learning engine requires the data to be in a flat tabular format. The data can be numeric values or variable-length character strings. Our software also handles missing values automatically and will either impute a value or treat missing values as a special category, at the user's discretion.
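The two missing-value options described above can be illustrated with a minimal Python sketch. This is not FasterAnalytics code: the function name `fill_missing`, the `MISSING` label, and the choice of the most common value as the imputed value are all assumptions made for illustration.

```python
from collections import Counter

MISSING = "MISSING"  # hypothetical label for the special "missing" category


def is_missing(value):
    """Treat None and empty strings as missing (an assumption for this sketch)."""
    return value is None or value == ""


def fill_missing(rows, column, strategy="category"):
    """Return copies of `rows` with missing values in `column` handled.

    strategy="category": replace missing values with a special label.
    strategy="impute":   replace missing values with the most common observed value.
    """
    if strategy == "category":
        fill = MISSING
    elif strategy == "impute":
        observed = [r[column] for r in rows if not is_missing(r.get(column))]
        if not observed:
            raise ValueError(f"no observed values to impute from in {column!r}")
        fill = Counter(observed).most_common(1)[0][0]
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return [
        {**r, column: fill if is_missing(r.get(column)) else r[column]}
        for r in rows
    ]


# A flat table: one dictionary per customer, one key per attribute.
rows = [
    {"INTL PLAN": "no", "VM PLAN": "yes"},
    {"INTL PLAN": "no", "VM PLAN": ""},      # missing value
    {"INTL PLAN": "yes", "VM PLAN": "no"},
    {"INTL PLAN": "no", "VM PLAN": "yes"},
]

as_category = fill_missing(rows, "VM PLAN", strategy="category")
imputed = fill_missing(rows, "VM PLAN", strategy="impute")
print(as_category[1]["VM PLAN"], imputed[1]["VM PLAN"])
```

Treating missing values as their own category keeps the fact that a value was absent visible to the model, while imputation keeps the number of categories unchanged.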

Figure 1: This example uses a data set held in an Excel spreadsheet as shown below

Once the data is selected, a fully automated process runs until a full model is presented; alternatively, the user can stop the process at each stage to change parameters manually. The software begins by categorizing and 'binning' the data in accordance with the default settings. The data is then passed seamlessly to the Modeler for automated model development. Once the software has mapped the complex correlations in the data, a model is presented in the Explorer.
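To make the 'binning' step concrete, the following Python sketch discretizes a numeric column into equal-width bins. The text does not describe the method the Discretizer actually uses, so equal-width binning, the function name `equal_width_bins`, and the sample charges are assumptions chosen only to illustrate the idea.

```python
def equal_width_bins(values, n_bins=3):
    """Assign each value a bin index from 0 to n_bins - 1 using equal-width bins."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so they all fall into the first bin.
        return [0] * len(values)
    bins = []
    for v in values:
        index = int((v - lo) / width)
        # The maximum value would land one past the last bin, so clamp it.
        bins.append(min(index, n_bins - 1))
    return bins


# Hypothetical daytime charges for six customers.
total_day_charge = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
labels = ["low", "medium", "high"]

binned = equal_width_bins(total_day_charge, n_bins=3)
print([labels[i] for i in binned])
```

After this step every attribute is categorical, which is the form a discrete Bayesian network works with.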

Figure 2: Base case model of the data presented in Explorer

The display illustrates the conditional dependence between variables and the pathways that exist in the predictive model. Notice that the network has multiple branches and that the data is interrelated in a "web," one of the strengths of multivariate Bayesian networks.

In the example below, we examine likely prognostic and predictive indicators that correlate with CHURN. We begin by selecting our target variable, CHURN, and setting it to "True." We are looking for variables that were influenced when churn occurred. The coloring and distribution of the surrounding nodes change to indicate the positive or negative effects on other predictive markers associated with the current case. Note that our indicators (VM PLAN, INTL PLAN, TOTAL DAY CHRG, and TOTAL EVE CHRG) each show either a positive or a negative correlation. CHURN is clearly associated with high phone usage, as both Total Day Charges and Total Evening Charges increase when CHURN is set to "True." Additionally, we see that expensive international call usage is associated with churn. These three variables suggest that customers are price sensitive and may switch to a competitor in search of cheaper alternatives when they are active phone users and international callers. Interestingly, we see an inverse correlation with the use of voicemail plans. This suggests that customers are much less likely to churn when they have a voicemail plan in place.

Figure 3: CHURN set to True
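Setting a target node and watching the other nodes shift amounts to comparing each variable's distribution with and without the evidence. The Python sketch below does this by direct counting on a tiny, made-up table; the data, the helper name `share`, and the counting approach are illustrative assumptions and do not reproduce the Explorer's inference engine or the 5,000-customer data set.

```python
def share(rows, column, value, given=None):
    """Fraction of rows where `column` equals `value`, among rows matching `given`."""
    given = given or {}
    subset = [r for r in rows if all(r[k] == v for k, v in given.items())]
    if not subset:
        return None  # no rows match the evidence
    return sum(r[column] == value for r in subset) / len(subset)


# Hypothetical customers: (CHURN, INTL PLAN, VM PLAN)
columns = ("CHURN", "INTL PLAN", "VM PLAN")
records = [
    (True, "yes", "no"), (True, "yes", "no"), (True, "no", "no"), (True, "no", "yes"),
    (False, "yes", "yes"), (False, "no", "yes"), (False, "no", "yes"),
    (False, "no", "no"), (False, "no", "no"), (False, "no", "no"),
]
rows = [dict(zip(columns, rec)) for rec in records]

for column in ("INTL PLAN", "VM PLAN"):
    base = share(rows, column, "yes")
    churned = share(rows, column, "yes", given={"CHURN": True})
    direction = "up" if churned > base else "down"
    print(f"{column}: base {base:.2f}, given CHURN=True {churned:.2f} ({direction})")
```

In this toy table the share of international-plan customers rises and the share of voicemail-plan customers falls once CHURN is fixed to True, which is the kind of shift the node coloring is described as showing.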

Next, we set CHURN to "False" and examine the effect on the likely prognostic and predictive indicators. Note that our indicators (VM PLAN, INTL PLAN, TOTAL DAY CHRG, and TOTAL EVE CHRG) retain approximately the same value.

Figure 4: CHURN set to False

Compare the two models in Figures 3 and 4 above with the base level in Figure 2. While CHURN shares conditional dependence with the same nodes, the behavior of those markers changes based upon the presence of churn. The coloring in the graphical model shows the change in population profile quickly and effectively.

It is also possible to select two or more variables simultaneously. For example, the extent to which VM PLAN and INTL PLAN affect CHURN can be studied together. If we wish to test hypotheses, we can modify any node and see how our hypothesis affects the model. Notice how information flows through the network.

Suppose we are interested in examining how VM PLAN and INTL PLAN affect CHURN as a prognostic indicator (Figure 5). We first select these nodes and click "Graph" to display the states within these nodes. This can be done for as many variables as we choose.

In Figure 5, we set VM PLAN to "no" and INTL PLAN to "yes." Here we see very clearly the prognostic indication for CHURN. If a customer does not have a voicemail plan but has an international plan, the likelihood of churn increases by 32%.

Figure 5: VM PLAN combined with INTL PLAN and correlation with CHURN
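A hypothesis test of this kind compares the probability of churn given the selected states with the base rate. The Python sketch below uses invented counts (not the case-study data) and a hypothetical helper, `churn_probability`, to show the calculation. Because a change in likelihood can be expressed in percentage points or as a relative change, the sketch reports both.

```python
# Hypothetical counts of customers by (VM PLAN, INTL PLAN, CHURN).
counts = {
    ("no", "yes", True): 2,
    ("no", "yes", False): 2,
    ("no", "no", True): 1,
    ("no", "no", False): 7,
    ("yes", "yes", True): 1,
    ("yes", "yes", False): 1,
    ("yes", "no", False): 6,
}


def churn_probability(counts, vm_plan=None, intl_plan=None):
    """P(CHURN = True | evidence), where None means the variable is unobserved."""
    total = churned = 0
    for (vm, intl, churn), n in counts.items():
        if vm_plan is not None and vm != vm_plan:
            continue
        if intl_plan is not None and intl != intl_plan:
            continue
        total += n
        if churn:
            churned += n
    if total == 0:
        raise ValueError("no customers match the evidence")
    return churned / total


base = churn_probability(counts)
scenario = churn_probability(counts, vm_plan="no", intl_plan="yes")

point_change = scenario - base               # absolute change in probability
relative_change = (scenario - base) / base   # change relative to the base rate

print(f"base rate:       {base:.0%}")
print(f"VM=no, INTL=yes: {scenario:.0%}")
print(f"change: {point_change * 100:.0f} percentage points, {relative_change:.0%} relative")
```

Stating which of the two measures is meant avoids ambiguity when a figure such as "increases by N%" is quoted in a report.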

Conclusion: Churn is one of the major concerns of telephone providers. The cost of customer acquisition is high, and the highest-value customers, those with high usage and international plans, are the most likely to churn. The model clearly reveals the importance of voicemail as a deterrent to churn.

The Reporter module can be used to create a report that will show the conditional probabilities (or predicted likelihood) of any target variables, given the expression of any independent variable(s). Any part of the model visualization can be pasted into Reporter and then transferred into other applications. Figure 6 shows a sample report. Figure 7 shows the sample report pasted into a Microsoft Excel worksheet.

Figure 6: A sample report listing the relationship between the prognostic marker CHURN and the states of the predictive and prognostic markers VM PLAN and INTL PLAN

Figure 7: A sample report pasted into a Microsoft Excel worksheet

ROC curves are a new feature of FasterAnalytics. Accessed through the Modeler, the ROC curves estimate the predictive accuracy of the model by testing the model against the data set. Figure 8, below, shows the ROC curves and AUC (area-under-curve) values for both instances of churn (True and False). These numbers indicate that the model created by FasterAnalytics correctly predicts the value of CHURN 87.4% of the time.

Figure 8: ROC Curves
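For readers who want to see how an AUC value is obtained, here is a minimal Python sketch. It uses the standard pairwise definition of AUC: the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner, with ties counted as one half. This is a measure of ranking quality and is distinct from the share of correct predictions at a single cutoff. The function name `roc_auc`, the labels, and the scores are made up for illustration and are not output from FasterAnalytics.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed over all (positive, negative) pairs."""
    positives = [s for y, s in zip(labels, scores) if y]
    negatives = [s for y, s in zip(labels, scores) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical model scores for P(CHURN = True) and the observed outcomes.
churned = [True, True, False, True, False, False]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]

auc = roc_auc(churned, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of customers, which is fine for a small illustration; rank-based formulas give the same value more efficiently on large data sets.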

DecisionQ sells predictive modeling software and complementary professional services. Alternatively, components from FasterAnalytics can be integrated into third-party applications as part of a broad data management and analysis platform. If you have any further questions or would like to schedule a more detailed demonstration in person or over the web, please contact us.

DecisionQ Corporation
531 Howard Street, 3rd Floor
San Francisco, CA 94105
www.decisionq.com
Phone: 415-254-7996
Fax: 415-276-6356
Email: info@decisionq.com