2010 International Conference on Intelligent Systems, Modelling and Simulation
Data Mining Applications in Customer Churn Management
Sahand KhakAbi
Mohammad R. Gholamian
Industrial Engineering Department Iran University of Science and Technology, IUST Tehran, Iran
Morteza Namvar
978-0-7695-3973-7/10 $26.00 © 2010 IEEE DOI 10.1109/ISMS.2010.49

Abstract—Given the lack of a comprehensive literature review on the application of data mining in customer churn management, which has become a central issue in customer relationship management, this paper provides a brief review of research in that field from two perspectives: the techniques used and statistical reports. From the first perspective, a taxonomy based on the models exploited in the papers is provided. The second perspective gives insight into the trend, venue and frequency of publications in the area. Some recent papers are also summarized for interested readers. The paper is intended to point out the gaps and the strengths in this research area, which may be of interest to researchers and to business managers, respectively.

Keywords-churn management, data mining techniques

I. INTRODUCTION

It is stated that the cost of acquiring a new customer is five to ten times greater than that of retaining an existing one [3]. Also, the results of Dawkins and Reichheld's research show that an increase of only five percent in the retention rate yields an increase of about 25 to 95 percent in the net present value of customers across a wide range of industries, such as credit cards, auto services and insurance brokerage [38]. As a result of these and similar facts, enterprises are becoming more interested in retaining customers than in acquiring new ones. Indeed, firms are concluding that the best core marketing strategy for the future is to retain existing customers and avoid customer churn ([36] and [37]). This is exactly what churn management does. “Churn management consists of developing techniques that enable firms to keep their profitable customers and it aims at increasing customer loyalty.” [35] That is, treating customers such that “they remain client of their original supplier even if a competitor proposes more advantageous conditions.” [35]

One approach to customer churn management is proactive management. In that approach, one should first be able to predict churn and then design an appropriate strategy for preventing it. One tool able to aid both of these steps, by improving prediction accuracy and speed and by proposing predefined strategies for specific customer groups, is data mining. “Data mining is the process that uses statistical, mathematical, artificial intelligence and machine-learning techniques to extract and identify useful information and subsequently gain knowledge from large data bases” [34]. Data mining techniques can help organizations derive valuable knowledge from their enormous customer databases. Customer churn prediction gives managers and marketers the opportunity to design preventive strategies, and applying data mining techniques enhances that process further by speeding it up and improving its accuracy. Lejeune [35] argues that “churn costs may be substantially reduced by developing an adequate CRM framework, that includes data warehousing, data mining and Online Analytical Processing (OLAP) functionalities.”

There have been a few literature reviews on the application of data mining techniques in customer relationship management (CRM). One of them is the work of Ngai et al. [1], in which about 87 papers are classified by the CRM dimension they focus on (Customer Identification, Customer Attraction, Customer Retention and Customer Development) and by the kind of data mining technique they exploit (Association, Classification, Clustering, Forecasting, Regression, Sequence Discovery, Visualization). However, owing to the breadth of its scope, that paper does not mention the applied data mining techniques by name. For example, it uses "classification" instead of Neural Networks, Decision Tree and other techniques. It also sometimes provides rough categorizations. For instance, data mining by evolutionary learning is listed alongside neural networks as a data mining technique, although the former expression may cover a wide range of techniques, including the neural network itself.

Having said this, the current paper focuses only on customer retention and emphasizes churn management, which is the heart of retention. Given the lack of a comprehensive literature review on the application of data mining techniques in customer churn prediction, an overview of the existing literature on that topic is provided in this paper. It examines 32 papers on the subject from two different points of view: technical and statistical. From the technical point of view, a classification of the papers according to the data mining techniques they use is provided. Some of those techniques are neural networks, random forests, support vector machines and decision trees. In each class, one paper is discussed briefly as a representative of that class, along with a report of the various statistical techniques used in different stages of the research. Finally, the papers are analyzed from a statistical point of view. Some of the factors considered are the year of publication and the techniques used. This paper gives an overview of less explored areas to anyone who may want to contribute to the current literature. It also provides useful information for managers who want to implement data mining interventions, by introducing the most dominant techniques.

The remainder of the paper is organized as follows. First, in this section, a list of fundamental definitions is provided. The second section includes two subsections, which examine the perspectives mentioned above. The last section is dedicated to the conclusion.
A. Preliminary
Before reviewing the existing literature, it is useful for beginners to become familiar with the terminology of data mining. This subsection therefore provides short definitions of the most widely used data mining techniques:
• Neural Networks: A neural network is “a computer system or a type of computer program that is designed to copy the way in which the human brain operates.” [43] It consists of a set of neurons organized in layers. Each neuron in a layer is connected to neurons in the subsequent layer via a number of weighted edges. In other words, the structure is a weighted directed acyclic graph. Construction of the network starts with a set of initial edge weights and continues until the optimal weights are reached.
• Decision Trees: The primary definition of a decision tree is: “A binary tree where every nonterminal node represents a decision. Depending upon the decision taken at such a node, control passes to the left or right subtree of the node. A leaf node then represents the outcome of taking the sequence of decisions given by the nodes on the path from the root to the leaf.” [39] It should be noted, however, that a decision tree is not obliged to be binary. Each decision node relates to one input variable, and decisions are made by comparing the input value with a split point to determine whether the path continues to the left or to the right of that node.
• Logistic Regression (Logit): “A form of regression analysis that is specifically tailored to the situation in which the dependent variable is dichotomous (or binary).” [41]
• Random Forests: L. Breiman and A. Cutler [42], the inventors of the Random Forests method, state: “Random Forests grows many classification trees. To classify a new object from an input vector, put the input vector down each of the trees in the forest. Each tree gives a classification, and we say the tree "votes" for that class. The forest chooses the classification having the most votes (over all the trees in the forest).” The method is resistant to missing data and can handle high-dimensional inputs. It is also efficient and interpretable.
• Support Vector Machine (SVM): “The support vector (SV) machine implements the following idea: It maps the input vectors x into a high-dimensional feature space Z through some nonlinear mappings, chosen a priori. In this space, an optimal separating hyperplane is constructed.” [40] This hyperplane separates input items with different target values from each other. When an input is fed into the machine, the machine maps it into the space Z and determines its position relative to the hyperplane, so that the input is assigned to the same class as the other items on that side.

II. LITERATURE REVIEW
In this section, the existing literature on the application of data mining and statistical techniques in customer churn prediction is reviewed from two perspectives. First, recent relevant papers (published during the last seven years) are categorized according to the data mining models applied in them. Then, a statistical report on those publications is provided.
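As an illustration of the decision tree and Random Forests definitions given in the Preliminary subsection, the following minimal sketch builds three one-node trees, each comparing a single input variable with a split point, and lets them vote. The feature meanings, split points and class labels are hypothetical and chosen only for the example; no tree is learned from data here, and a real forest would grow its trees from bootstrap samples.

```python
from collections import Counter
from typing import Callable, List, Sequence

# A "tree" is modelled as any function from an input vector to a class label.
Tree = Callable[[Sequence[float]], str]


def make_stump(feature: int, split: float, left: str, right: str) -> Tree:
    """One decision node: compare one input variable with a split point."""
    def stump(x: Sequence[float]) -> str:
        return left if x[feature] <= split else right
    return stump


def forest_predict(trees: List[Tree], x: Sequence[float]) -> str:
    """Put x down every tree and return the class with the most votes."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical hand-built forest (an odd number of trees avoids ties
# when there are only two classes).
forest = [
    make_stump(0, 3.0, "stay", "churn"),   # feature 0: complaints filed
    make_stump(1, 12.0, "churn", "stay"),  # feature 1: tenure in months
    make_stump(2, 50.0, "stay", "churn"),  # feature 2: monthly bill
]

customer = [5.0, 6.0, 40.0]
print(forest_predict(forest, customer))
```

For this customer, two of the three trees vote "churn", so the forest returns "churn".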
A. Data Mining Perspective
In this section, papers are categorized based on the data mining techniques used in them. Table 1 lists, for each method, all the reviewed references that have exploited it. Classes are sorted by two factors: the number of papers in the class and the recency of publication. For each of the top nine techniques, an example paper, marked with an asterisk in the table, is briefly explained. The explained papers are selected on the basis of recency and benchmarking results; that is, the latest published paper in each category in which the related model outperformed the other benchmarks is summarized. This gives beginners and contributors suitable examples of current research in the field.

TABLE I: MAPPING REFERENCES TO METHODS
Data Mining Technique          References
Neural Networks                [4], [5], [6], [7], [12], [13], [14], [16], [17]*, [22], [23], [25], [26], [27], [28]
Decision Tree                  [3], [4], [6], [9], [10], [11]*, [12], [13], [14], [16], [19], [25], [28]
Logistic Regression            [2], [6]*, [11], [13], [18], [19], [20], [23], [28], [29], [30], [31], [33]
Random Forests                 [2], [16]*, [18], [20], [23], [31], [32]
Support Vector Machine         [2], [13], [14], [16], [18], [19]*, [20]
Survival Analysis              [22]*, [24], [32]
Bayesian Network               [4], [13], [15]*
Self Organizing Maps           [3]*, [5]
AdaCost                        [28]*
Gradient Boosting Machine      [20]
Linear Discriminant Analysis   [14]
AdaBoost                       [14]
Rough Set Theory               [8]
K-Nearest Neighbor             [6]
K-Means                        [12]
Time Series                    [33]
Tailor-Butina                  [33]
ROCK (RObust Clustering using linKs)   [25]
Regression Forests             [31]
Linear Regression              [31]
Association Rules              [5]
Sequence Discovery             [21]

Reference [17] uses a genetic algorithm to improve a neural network by optimizing the network weights. The research is conducted in the context of a mobile telecommunications company and develops two models, with fitness functions based on cross entropy (log maximum likelihood) and on model accuracy. To optimize these two models, the authors use cross-validation. The models are compared with a statistical method (i.e. z-score) in terms of accuracy, area under the receiver operating characteristic curve (AUC) and top-decile lift. The best performance belongs to the neural network model with model accuracy as the fitness function.

In [11], the input variables, which are selected according to the receiver operating characteristic (ROC) curve of a single input predictor, are first divided into several groups (e.g. demographic, bill information, etc.) based on the concepts they describe. Each group is then used as the input of a dedicated Alternating Decision Trees (ADTrees) model, which is an improvement on decision trees. The outputs of the ADTrees models constitute the inputs of a Logit model that predicts churn. Recursive feature elimination (RFE) is used at this stage to eliminate ineffective features. The model is benchmarked against the TreeNet model (i.e. winner of Gold Prize) on a telecom company dataset and records a very similar performance according to ROC.

Reference [6] assumes a complex relationship between the independent variables and the target variable, and proposes the K-Nearest Neighbor (KNN) method as a way of dealing with this challenge. It first applies a single-input KNN to transform the values of each independent feature. The new dataset is then used to train and apply a Logit model. The proposed method outperforms LR, C4.5 and RBF on four distinct datasets in terms of accuracy and ROC.

By combining the cost-sensitive learning of Weighted Random Forests with the sampling technique used in Balanced Random Forests, [16] implements Improved Balanced Random Forests. In the new approach, repeated sampling and higher penalties for misclassifying members of the minority class help in selecting the best subset of input features. The chief aim of this method is to deal with class imbalance in applications such as churn management. A comparison with Weighted Random Forests, Balanced Random Forests, Decision Tree, Neural Network and SVM shows that the new method achieves better accuracy, lift and lift curve.

In [19], a two-step framework for using SVM in churn prediction is suggested. In the first step, the best C (an SVM model parameter) and the best subset of features are chosen by applying RFE and L2-SVM to the initial C candidates and feature set. Sampling is also conducted in this step, by Support Vector Sampling. In the next step, the selected C, the subset of features and the sample are used to train a nonlinear model called RBF-SVM, whose other parameters are also determined at this stage. The final model is the better of the linear and nonlinear models generated. The authors use auxiliary methods such as 10-fold cross-validation and line and pattern search to improve their results, which outperform those of Logit, C4.5, L1-SVM and CART in terms of AUC.

Reference [22] uses a neural network to predict the hazard and survival rates of customers. First, the hazard rate is calculated by the Kaplan-Meier equation. The resulting values are then set as the target values for training a neural network model. The new model is named BP ANN & Survival Analysis. The second contribution of this paper is a customer segmentation model that defines six customer types based on Life Time Value (LTV) and Customer Survival Phase. The latter measure is computed from a proposed curve, which is derived from the Product Life Cycle. The authors suggest appropriate retention strategies for each segment.

In [15], the simple Bayesian method is improved by applying a genetic algorithm (GA) to it: GA is employed to optimize the structure of the network and the weights of its edges. Compared with Bayesian Networks, TAN and APRI in the context of a credit company, in terms of True Positives, True Negatives, True Positive Rate and True Negative Rate, the new method shows an outstanding performance.

Reference [3] constructs a two-step model in which the probability of attrition is first calculated for a customer and then, if required, an appropriate retention strategy is proposed according to the cluster that contains the customer. The authors use C5.0 and the Growing Hierarchical Self-Organizing Map (GHSOM) in the two steps, respectively. They also amend the GHSOM method and improve the C5.0 results by exploiting cross-validation based on entropy and gain ratio.

Reference [28] describes research conducted in the context of financial services. It provides a new definition of churn, namely a slump in LTV. It also defines a new profit-based loss function and uses it to propose a new model evaluation criterion called area under the profit curve (AUPROC). Another contribution of this research is improving the traditional AdaBoost technique (a model improvement method) into a newer one called AdaCost, which incorporates the new loss function in its calculations. Taking AUPROC, AUC, cumulative profit percentage and model accuracy into account for different models with time horizons of 3 and 6 months, AdaCost outperforms the other models.
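Several of the papers summarized above compare models on accuracy, AUC and top-decile lift. The sketch below shows one common way to compute these three measures in plain Python. The function names, the 0.5 threshold and the toy scores are illustrative assumptions and are not taken from any of the reviewed papers, whose exact evaluation protocols may differ.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Share of customers whose thresholded score matches the true label."""
    hits = sum((s >= threshold) == bool(y) for y, s in zip(y_true, scores))
    return hits / len(y_true)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random churner is scored above a random
    non-churner (ties count one half); equals the area under the ROC curve."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_decile_lift(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Churn rate among the 10% highest-scored customers divided by
    the overall churn rate."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    k = max(1, len(ranked) // 10)
    top_rate = sum(y for _, y in ranked[:k]) / k
    base_rate = sum(y_true) / len(y_true)
    return top_rate / base_rate


# Toy data: 20 customers, 4 churners (label 1), scores already sorted.
y_true = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.95, 0.90, 0.85, 0.80, 0.70, 0.65, 0.60, 0.55, 0.50, 0.45,
          0.40, 0.35, 0.30, 0.25, 0.20, 0.15, 0.10, 0.08, 0.05, 0.02]

print(accuracy(y_true, scores))         # 13 of 20 correct
print(auc(y_true, scores))              # 53 of 64 churner/non-churner pairs ordered correctly
print(top_decile_lift(y_true, scores))  # both top-decile customers churn; base rate is 0.2
```

Top-decile lift is of particular interest in churn management because retention budgets usually allow only the highest-risk customers to be contacted.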
B. Statistical Report
This section provides a statistical report on the research conducted on the application of data mining techniques in customer churn prediction. The papers, published in the years 2003 to 2009, were gathered by searching the online databases Science Direct, IEEE Xplore, ACM Portal, SCOPUS and Emerald. Table 2 and Fig. 1 examine the papers along two dimensions: the applied model and the publication year. According to these two illustrations, research in the area has become more focused over the years 2003 to 2009. In other words, research over the last seven years has converged on a specific set of models, which are the top five in Table 2. One reason may be the strong performance of these five models (i.e. Neural Networks, Decision Tree, Logistic Regression, Random Forests and Support Vector Machine) in past research. Another reason is that some of these models, like Logit, are suitable for benchmarking as traditional and widely used models. Business managers who are seeking robust models can exploit these models, and researchers might be interested in benchmarking or even improving them.

Another noticeable point is the persistence of Neural Networks as a useful model throughout these years. This can be explained by features of the method such as ease of application after training, applicability in situations where there are complex nonlinear relationships between inputs and outputs, robustness to noise and compatibility with different types of input variables. In addition, the methods that have been used recently but not explored enough include AdaCost, Gradient Boosting Machine, Linear Discriminant Analysis, AdaBoost and Rough Set Theory. These methods might be of interest to researchers.

Fig. 2 depicts the frequency of publications in the area during the last seven years. The trend is generally consistent with the results of Ngai et al. [1]. As can be seen, the number of papers on the topic has risen noticeably in the last two years. This might be the result of greater pressure and competition in recent years. Progress in data mining knowledge may be another explanation.

Table 3 classifies the papers by the publication in which they appeared. Publications with the same number of papers are sorted by date of publication, so that interested readers and contributors can choose among these top publications. The most active publication in the area is Expert Systems with Applications. After it, skipping

TABLE II: MODELS AND PUBLICATION DATES
                               Publication Year
Data Mining Technique          2003  2004  2005  2006  2007  2008  2009  Total
Neural Networks                -     2     3     3     2     2     3     15
Decision Tree                  -     -     2     1     2     3     5     13
Logistic Regression            -     -     2     2     1     3     5     13
Random Forests                 -     -     2     -     -     2     3     7
Support Vector Machine         -     -     -     -     -     3     4     7
Survival Analysis              -     1     -     -     1     1     -     3
Bayesian Network               -     -     1     -     -     2     -     3
Self Organizing Maps           -     1     -     -     1     -     -     2
AdaCost                        -     -     -     -     -     -     1     1
Gradient Boosting Machine      -     -     -     -     -     -     1     1
Linear Discriminant Analysis   -     -     -     -     -     1     -     1
AdaBoost                       -     -     -     -     -     1     -     1
Rough Set Theory               -     -     -     -     -     -     1     1
K-Nearest Neighbor             -     -     -     -     1     -     -     1
K-Means                        -     -     -     1     -     -     -     1
Tailor-Butina                  -     -     -     1     -     -     -     1
Time Series                    -     -     -     1     -     -     -     1
ROCK                           -     -     1     -     -     -     -     1
Regression Forests             -     -     1     -     -     -     -     1
Linear Regression              -     -     1     -     -     -     -     1
Association Rules              -     1     -     -     -     -     -     1
Sequence Discovery             1     -     -     -     -     -     -     1
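The Total column of Table 2 can be cross-checked against the reference lists of Table 1, since each total should equal the number of references mapped to that technique. The following sketch performs that check; the dictionary simply transcribes Table 1, and the variable names are illustrative. It also confirms that the mapping covers 32 distinct papers, the number stated in the Introduction.

```python
# Table 1 of the paper: technique -> reviewed reference numbers.
TABLE_1 = {
    "Neural Networks": [4, 5, 6, 7, 12, 13, 14, 16, 17, 22, 23, 25, 26, 27, 28],
    "Decision Tree": [3, 4, 6, 9, 10, 11, 12, 13, 14, 16, 19, 25, 28],
    "Logistic Regression": [2, 6, 11, 13, 18, 19, 20, 23, 28, 29, 30, 31, 33],
    "Random Forests": [2, 16, 18, 20, 23, 31, 32],
    "Support Vector Machine": [2, 13, 14, 16, 18, 19, 20],
    "Survival Analysis": [22, 24, 32],
    "Bayesian Network": [4, 13, 15],
    "Self Organizing Maps": [3, 5],
    "AdaCost": [28],
    "Gradient Boosting Machine": [20],
    "Linear Discriminant Analysis": [14],
    "AdaBoost": [14],
    "Rough Set Theory": [8],
    "K-Nearest Neighbor": [6],
    "K-Means": [12],
    "Time Series": [33],
    "Tailor-Butina": [33],
    "ROCK": [25],
    "Regression Forests": [31],
    "Linear Regression": [31],
    "Association Rules": [5],
    "Sequence Discovery": [21],
}

# "Total" column of Table 2: the eight multi-paper techniques, then 14 ones.
TABLE_2_TOTALS = [15, 13, 13, 7, 7, 3, 3, 2] + [1] * 14

# Number of reviewed papers per technique, in Table 1 order.
totals = {technique: len(refs) for technique, refs in TABLE_1.items()}

# A paper that applies several techniques appears in several rows,
# so distinct papers are counted through a set union.
distinct_papers = set().union(*TABLE_1.values())

print(list(totals.values()) == TABLE_2_TOTALS)
print(len(distinct_papers))
```

Because one paper can apply several techniques, the row totals sum to more than the number of distinct papers.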
Figure 1. Trends in application of data mining techniques in publications between years 2003 and 2009 (vertical axis: number of papers; horizontal axis: years 2003 to 2009; one series per data mining technique)
However, it should be noted that this research might not include all the papers in the area; it is the result of the authors' search in the most prominent online databases, as described before. Further research in this area could focus on other statistical methods used in the different phases of knowledge discovery, such as sampling, feature selection and model evaluation, either by classifying papers according to those methods or by designing a comprehensive table (like Table 2) that includes both the data mining model and the statistical method dimensions. In addition, a similar study could be repeated for other functions of CRM, such as customer attraction and customer development.
Figure 2. Number of articles in each year (horizontal axis: year of publication; vertical axis: number of papers)

REFERENCES
[1] E.W.T. Ngai, Li Xiu and D.C.K. Chau, “Application of data mining techniques in customer relationship management: A literature review and classification,” Expert Systems with Applications, vol. 36, 2009, pp. 2592–2602.
[2] K. Coussement and D. Van den Poel, “Churn prediction in subscription services: An application of support vector machines while comparing two parameter-selection techniques,” Expert Systems with Applications, vol. 34, 2008, pp. 313–327.
[3] Bong-Horng Chu, Ming-Shian Tsai and Cheng-Seen Ho, “Toward a hybrid data mining model for customer retention,” Knowledge-Based Systems, vol. 20, 2007, pp. 703–718.
[4] X. Hu, “A Data Mining Approach for Retailing Bank Customer Attrition Analysis,” Applied Intelligence, vol. 22, 2005, pp. 47–60, Springer.
[5] H. S. Song, J. K. Kim, Y. B. Cho and S. H. Kim, “A Personalized Defection Detection and Prevention Procedure based on the Self-Organizing Map and Association Rule Mining: Applied to Online Game Site,” Artificial Intelligence Review, vol. 21, 2004, pp. 161–184.
[6] Y. M. Zhang, J. Y. Qi, H. Y. Shu and J. T. Cao, “A Hybrid KNN-LR Classifier and its Application in Customer Churn Prediction,” Proc. IEEE International Conference on Systems, Man and Cybernetics, Oct. 2007, pp. 3265–3269.
[7] G. Song, D. Yang, L. Wu, T. Wang and Sh. Tang, “A Mixed Process Neural Network and its Application to Churn Prediction in Mobile Communications,” Proc. Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06), 2006.
[8] James J.H. Liou, “A novel decision rules approach for customer relationship management of the airline market,” Expert Systems with Applications, vol. 36 (3), April 2009, pp. 4374–4381.
[9] M. Zan, Z. Shan, L. Li and L. Ai-jun, “A Predictive Model of Churn in Telecommunications Based on Data Mining,” Proc. IEEE International Conference on Control and Automation, IEEE Press, 2007.
[10] Yi-Fan Wang, Ding-An Chiang, Mei-Hua Hsu, Cheng-Jung Lin and I-Long Lin, “A recommender system to avoid customer churn: A case study,” Expert Systems with Applications, vol. 36, 2009, pp. 8071–8075.
[11] J. Qi et al., “ADTreesLogit model for customer churn prediction,” Annals of Operations Research, vol. 168, 2009, pp. 247–265, Springer.
[12] Shin-Yuan Hung, David C. Yen and Hsiu-Yu Wang, “Applying data mining to telecom churn management,” Expert Systems with Applications, vol. 31, 2006, pp. 515–524.
[13] J. Zhao and Xing-Hua Dang, “Bank Customer Churn Prediction Based on Support Vector Machine: Taking a Commercial Bank's VIP Customer Churn as the Example,” Proc. 4th International Conference on Wireless Communications, Networking and Mobile Computing, 2008 (WiCOM'08), Oct. 2008, pp. 1–4.

TABLE III. DISTRIBUTION OF PAPERS IN PUBLICATIONS
Publication Title                              Count   %
Expert Systems with Applications               12      37.5
Conference Proceedings (Sponsored by IEEE)     5       15.625
IEEE Conference Proceedings                    4       12.5
European Journal of Operational Research       3       9.375
Annals of Operations Research                  1       3.125
Information & Management                       1       3.125
Knowledge-Based Systems                        1       3.125
Telecommunications Policy                      1       3.125
Decision Support Systems                       1       3.125
Applied Intelligence                           1       3.125
Artificial Intelligence Review                 1       3.125
IEEE Intelligent Systems                       1       3.125
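The percentage column of Table 3 is each publication's paper count divided by the 32 reviewed papers. A minimal sketch of that arithmetic follows, with the counts transcribed from the table; the variable names are illustrative.

```python
# Paper counts per publication, transcribed from Table 3.
counts = {
    "Expert Systems with Applications": 12,
    "Conference Proceedings (Sponsored by IEEE)": 5,
    "IEEE Conference Proceedings": 4,
    "European Journal of Operational Research": 3,
    "Annals of Operations Research": 1,
    "Information & Management": 1,
    "Knowledge-Based Systems": 1,
    "Telecommunications Policy": 1,
    "Decision Support Systems": 1,
    "Applied Intelligence": 1,
    "Artificial Intelligence Review": 1,
    "IEEE Intelligent Systems": 1,
}

total = sum(counts.values())  # number of reviewed papers

# Share of each publication, in percent of all reviewed papers.
shares = {title: 100 * n / total for title, n in counts.items()}

for title, share in shares.items():
    print(f"{title:45s} {counts[title]:3d} {share:8.3f}")
```

The counts sum to 32, which agrees with the number of papers the review says it examines.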
conferences, comes the European Journal of Operational Research. These results are consistent with those of Ngai et al. [1], which does not include conference proceedings.

III. CONCLUSION
In this paper, the recent literature on the application of data mining techniques in customer churn management is reviewed from two different perspectives: the applied model and the statistics of publication. The primary aim of the paper is to provide a big picture for contributors, to help them identify potential research points and areas. The paper can also be of use to business managers who are planning to implement a data mining intervention for churn management. In addition, some papers are summarized in order to give examples of recent research in the field. In fact, this paper is a more focused and detailed version of the few previous works such as [1]. The review of 32 papers shows that researchers currently exhibit a tendency toward techniques such as Neural Networks, Decision Tree, Logit, Random Forests and Support Vector Machine. AdaCost, Gradient Boosting Machine, Linear Discriminant Analysis, AdaBoost and Rough Set Theory appear to be promising methods that merit further research. The results also indicate that the number of publications on the application of data mining techniques in customer churn management has risen sharply in the last two years, and that the most active publication in this field is Expert Systems with Applications.
[14] Y. Xie and X. Li, “Churn Prediction with Linear Discriminant Boosting Algorithm,” Proc. Seventh International Conference on Machine Learning and Cybernetics, Kunming, July 2008.
[15] Hongmei Shao, Gaofeng Zheng and Fengxian An, “Construction of Bayesian Classifiers with GA for Predicting Customer Retention,” Proc. Fourth International Conference on Natural Computation, IEEE Computer Society Press, 2008.
[16] Y. Xie, X. Li, E.W.T. Ngai and W. Ying, “Customer churn prediction using improved balanced random forests,” Expert Systems with Applications, vol. 36, 2009, pp. 5445–5449.
[17] P. C. Pendharkar, “Genetic algorithm based neural network approaches for predicting churn in cellular wireless network services,” Expert Systems with Applications, vol. 36, 2009, pp. 6714–6720.
[18] K. Coussement and D. Van den Poel, “Improving customer attrition prediction by integrating emotions from client/company interaction emails and evaluating multiple classifiers,” Expert Systems with Applications, vol. 36, 2009, pp. 6127–6134.
[19] S. Lessmann and S. Voß, “A reference model for customer-centric data mining with support vector machines,” European Journal of Operational Research, vol. 199 (2), Dec. 2009, pp. 520–530.
[20] J. Burez and D. Van den Poel, “Handling class imbalance in customer churn prediction,” Expert Systems with Applications, vol. 36, 2009, pp. 4626–4636.
[21] Ding-An Chiang, Yi-Fan Wang, Shao-Lun Lee and Cheng-Jung Lin, “Goal-oriented sequential pattern for network banking churn analysis,” Expert Systems with Applications, vol. 25, 2003, pp. 293–302.
[22] G. Zhang, “Customer Retention Based on BP ANN and Survival Analysis,” Proc. International Conference on Wireless Communications, Networking and Mobile Computing, 2007 (WiCom), Sept. 2007, pp. 3406–3411.
[23] W. Buckinx and D. Van den Poel, “Customer base analysis: partial defection of behaviourally loyal clients in a non-contractual FMCG retail setting,” European Journal of Operational Research, vol. 164, 2005, pp. 252–268.
[24] B. Larivière and D. Van den Poel, “Investigating the role of product features in preventing customer churn, by using survival analysis and choice modeling: The case of financial services,” Expert Systems with Applications, vol. 27, 2004, pp. 277–285.
[25] Lian Yan, Michael Fassino and Patrick Baldasare, “Predicting Customer Behavior via Calling Links,” Proc. International Joint Conference on Neural Networks, Montreal, Canada, August 2005.
[26] E Xu, S. Liangshan, G. Xuedong and Z. Baofeng, “An Algorithm for Predicting Customer Churn via BP Neural Network Based on Rough Set,” Proc. 2006 IEEE Asia-Pacific Conference on Services Computing (APSCC'06).
[27] L. Yan, R. H. Wolniewicz and R. Dodier, “Predicting Customer Behavior in Telecommunications,” IEEE Intelligent Systems, IEEE Computer Society.
[28] N. Glady, B. Baesens and C. Croux, “Modeling churn using customer lifetime value,” European Journal of Operational Research, vol. 197, 2009, pp. 402–411.
[29] Jae-Hyeon Ahn, Sang-Pil Han and Yung-Seop Lee, “Customer churn analysis: Churn determinants and mediation effects of partial defection in the Korean mobile telecommunications service industry,” Telecommunications Policy, vol. 30, 2006, pp. 552–568.
[30] K. Coussement and D. Van den Poel, “Integrating the voice of customers through call center emails into a decision support system for churn prediction,” Information & Management, vol. 45, 2008, pp. 164–174.
[31] B. Larivière and D. Van den Poel, “Predicting customer retention and profitability by using random forests and regression forests techniques,” Expert Systems with Applications, vol. 29, 2005, pp. 472–484.
[32] J. Burez and D. Van den Poel, “Separating financial from commercial customer churn: A modeling step towards resolving the conflict between the sales and credit department,” Expert Systems with Applications, vol. 35, 2008, pp. 497–514.
[33] A. Prinzie and D. Van den Poel, “Incorporating sequential information into traditional classification models by using an element/position-sensitive SAM,” Decision Support Systems, vol. 42, 2006, pp. 508–526.
[34] E. Turban, J. E. Aronson, T. P. Liang and R. Sharda, Decision support and business intelligence systems, 8th ed., Pearson Education, 2007.
[35] M. Lejeune, “Measuring the Impact of Data Mining on Churn Management,” Journal of Electronic Network Applications and Policy, vol. 11 (5), 2001, pp. 375–387.
[36] M. Kim, M. Park and D. Jeong, “The effects of customer satisfaction and switching barrier on customer loyalty in Korean mobile telecommunication services,” Telecommunications Policy, vol. 28, 2004, pp. 145–159.
[37] H.S. Kim and C.H. Yoon, “Determinants of subscriber churn and customer loyalty in the Korean mobile telephony market,” Telecommunications Policy, vol. 28, 2004, pp. 751–765.
[38] P. M. Dawkins and F. F. Reichheld, “Customer retention as a competitive weapon,” Directors & Boards, Summer 1990, pp. 42–7.
[39] J. Daintith, "decision tree." A Dictionary of Computing. 2004. Encyclopedia.com, 20 Oct. 2009, http://www.encyclopedia.com.
[40] V. N. Vapnik, The Nature of Statistical Learning Theory, Springer, New York, 1995.
[41] G. Marshall, "logistic regression." A Dictionary of Sociology. 1998. Encyclopedia.com, 20 Oct. 2009, http://www.encyclopedia.com.
[42] L. Breiman and A. Cutler, “Random Forests”, http://stat-www.berkeley.edu/users/breiman/RandomForests.
[43] Cambridge Advanced Learner’s Dictionary, Cambridge University Press, 2009, http://dictionary.cambridge.org.