Customer relationship management and data mining

Customer relationship management and data mining: A classification decision tree to predict customer purchasing behavior in global market. In P. Vasant (Ed.), Soft computing intelligent algorithms in engineering, management, and technology. IGI Global.

Niccolò Gordini, University of Milan-Bicocca, Italy
Valerio Veglio, University of Milan-Bicocca, Italy

ABSTRACT

In today’s global market, customer relationship management (CRM) plays a fundamental role in market-oriented companies, helping them to understand customer behaviors, to achieve and maintain long-term relationships with customers, and to maximize customer value. Moreover, the digital revolution has made information easy and fairly inexpensive to capture. Thus, companies have stored a large amount of data about their current and potential customers. However, this data is often raw and meaningless. Within the CRM framework, data mining (DM) is a very popular tool for extracting useful information from this data and for predicting customer behaviors in order to make profitable marketing decisions. This research aims to show that the classification decision tree is one of the main computational data mining models able to forecast marketing performance accurately within global organizations. Particular attention is paid to identifying the marketing activities on which firms should concentrate their future marketing investments. The evaluation criterion is based on loss functions, which confirm the accuracy of this model.

INTRODUCTION

Globalization draws new competitive boundaries, modifying the traditional concepts of time and space (Brondoni, 2008a, 2008b, 2008c). In the global market, companies have crossed spatial and temporal borders (both regional and cultural), selling their products to customers located everywhere, and have been aided greatly by the digital revolution, which has permitted them to interact with a large amount of data. In this new, global, and highly competitive (D’Aveni, 1994) arena, a company has to think in terms of market-orientation, and not only in terms of product orientation and marketing orientation, in order to achieve competitive advantage. If competitive advantage was once based on structural characteristics such as market power, today the emphasis has shifted to capabilities that enable a company to consistently deliver superior value to its customers (Slater & Narver, 1994). As a result, the concept of market-orientation (Kohli & Jaworski, 1990; Narver & Slater, 1990) was developed, which emphasized the establishment of effective information processes and capabilities within the firm to understand the expressed and latent needs of customers, thus making firms more efficient and effective in managing customer relationships (Boulding, 2004). A large body of research (Day, 1994; Gordini, 2010, 2012; Kohli & Jaworski, 1990; Jaworski & Kohli, 1993; Narver & Slater, 1990; Slater & Narver, 1994, 1998, 1999) has highlighted that a market-orientation provides a strong foundation for reaching superior customer value and competitive advantage. Among the most commonly cited, Narver and Slater (1990) stated that “market orientation consists of three behavioral components (customer orientation, competitor orientation, and inter-functional coordination) and two decision criteria (long-term focus and profitability)”. Jaworski and Kohli (1993) suggest that market-orientation is “the organization wide generation of market intelligence pertaining to current and future needs of the customers, dissemination of intelligence horizontally and vertically within the organization, and organization wide action or responsiveness to it”. Day (1994) simply states that market orientation represents superior skills in understanding and satisfying customers. In a later study, Slater and Narver (1994) highlight that a company is market-oriented when “its culture is systematically and entirely committed to the continuous creation of superior customer value”. Deshpandé et al. (1993) define customer value “as the set of beliefs that puts the customer’s interest first, while not excluding those of other stakeholders such as owners, managers, and employees, in order to develop a long-term profitable enterprise.” Specifically, “this entails collecting and coordinating information on customers, competitors, and other significant market influencers to use in building that value” (Slater & Narver, 1994). Thus, the heart of market orientation is its customer focus. In addition, today’s customers have such varied needs and preferences that it is not possible to group them into large homogeneous populations to develop marketing strategies (Shaw et al., 2001). In fact, each customer wants to be served according to his or her individual and unique needs. Hence, marketing decisions based on traditional segmentation approaches, and not on a market orientation, result in a poor response rate and increased costs. Slater and Narver (1994) suggest that market-oriented companies have to understand the cost and revenue dynamics not only of current customers but also of future target buyers.
Therefore, market-oriented companies are committed to understanding both the expressed and latent needs of their customers (Day, 1994; Gordini, 2010, 2012; Kohli & Jaworski, 1990; Slater & Narver, 1994, 1998, 1999), as reacting only to customers' expressed needs is usually inadequate for the creation of competitive advantage in a global market. Certainly, market-oriented companies do not ignore the expressed needs of their customers, but they have to realize that, since competitive advantage is often temporary, the firm must understand how customers' needs are evolving and develop innovative solutions to such needs (D’Aveni, 1994). Therefore, the greater the market orientation of the firm, the greater the proportion of its activities oriented towards understanding latent needs. Developing a long-term relationship with customers is the best way to know the customers’ expressed and latent needs and to enable them to become loyal (Dowling, 2002). Consequently, customer relationship management becomes a fundamental concept in market-oriented companies in order to analyze and understand customer behaviors, to acquire and retain customers, to achieve and maintain a strong, long-term relationship with current and potential customers, to maximize customer value and, consequently, to obtain a competitive advantage. Alongside this paradigm shift and the rise of market orientation, there was an explosion of available customer data. In fact, the advent of information technology and the digital revolution has made data easy to capture and fairly inexpensive to store, transforming the way marketing is done and how companies manage data. As a result, companies, especially global companies, have realized that these huge amounts of data on their current and potential customers, suppliers, business partners and competitors are key factors in supporting important marketing decisions, and they have started to collect and store them in their own databases.
The main problem for companies is understanding which of this data contains competitive information and, hence, knowledge relevant to strategic marketing decision-making. In fact, this data often remains raw and meaningless, stored in databases without being transformed into meaningful information that creates customer value and profit for the company. According to Belbaly et al. (2007), there are strong reasons to convert raw data into information and to spread this information throughout the company in order to develop knowledge useful in the decision-making process. To this end, Buchnowska (2011) suggests: “customer information is obtained through filtering, integrating, extracting or formatting customer data. Transforming customer data into customer information, companies use various information systems”. One of the most important is data mining. Within the CRM framework, data mining techniques are a very popular tool for extracting useful information from enormous customer databases. In fact, data mining tools can help companies to uncover hidden information about the expressed and latent needs of their customers and, hence, to understand the customer better, channeling this information into effective marketing strategies. Thus, the application of data mining in CRM is justified in market-oriented companies. Despite its relevance, however, data mining and its application to CRM have not been thoroughly studied. The aim of this study is to examine the concepts of customer relationship management and data mining and their relationship, and to test the predictive power of a data mining technique in estimating the probability of customer conversion in global competitive companies. In the following section we give a brief overview of the concepts of CRM and data mining, illustrating their relationship and analyzing the advantages and disadvantages, along with the challenges, deriving from that relationship.
The next three sections (data and methodology, findings, future research directions) present the empirical analysis conducted on the effectiveness of data mining in estimating the probability of customer conversion. A concluding section discusses and summarizes the results of our research.
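To make the idea of estimating a conversion probability with a classification tree more concrete, the following is a minimal sketch of a single tree split chosen by Gini impurity. It is not the model or the data used in this study: the records, the field names `visits` and `converted`, and the helper `best_split` are hypothetical and serve only as an illustration.

```python
# Hypothetical customer records: "visits" and "converted" are illustrative
# names and values, not variables from the study's dataset.
customers = [
    {"visits": 1, "converted": 0},
    {"visits": 2, "converted": 0},
    {"visits": 3, "converted": 1},
    {"visits": 4, "converted": 0},
    {"visits": 6, "converted": 1},
    {"visits": 7, "converted": 1},
    {"visits": 8, "converted": 0},
    {"visits": 9, "converted": 1},
]


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 for an empty list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, feature, target="converted"):
    """Return the threshold on one numeric feature that minimizes the
    size-weighted Gini impurity of the two resulting groups, together with
    the observed conversion rate in each group. Returns None when the
    feature takes a single value and no split is possible."""
    values = sorted({r[feature] for r in rows})
    best = None
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [r[target] for r in rows if r[feature] <= threshold]
        right = [r[target] for r in rows if r[feature] > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or impurity < best["impurity"]:
            best = {
                "threshold": threshold,
                "impurity": impurity,
                "p_left": sum(left) / len(left),    # conversion rate at or below threshold
                "p_right": sum(right) / len(right),  # conversion rate above threshold
            }
    return best


split = best_split(customers, "visits")
print(split)
```

A full decision tree repeats this step recursively inside each group; the conversion rate observed in a leaf is then read as the estimated probability of conversion for customers who fall into it.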

BACKGROUND

Customer Relationship Management

The globalization of markets, characterized by the lack of spatial-temporal boundaries, fast technological development, and an ever-increasing number of competitors, has obliged companies to renew their strategies in order to adapt to the new competitive arena. In this ever more competitive, complex and saturated market, the digital revolution has made data easy to capture and fairly inexpensive to store. The large amount of data available and the new database technologies have enabled companies to learn who their customers are, what and when they bought (expressed needs), and even to predict what they would like to buy in the future (latent needs). Thus, many companies have collected and stored a large amount of data about their current and potential customers in their databases, and customer relationship management has become a fundamental tool for competing in today’s global and highly competitive markets. According to Rigby and Bilodeau (2009), CRM was the fourth most used management tool in 2008. CRM is a business model that dynamically integrates sales, marketing and customers’ expressed and latent needs in order to help firms build long-term, profitable relationships with current and potential customers (Ling & Yen, 2001), to manage these relationships in an organized way (Xu et al., 2002), and to add value for both the company and its customers. The CRM concept emerged first in the practitioner community (especially the vendor community) in the mid-1990s and later in the academic community. It is not a new concept in the academic literature, being a development of another well-known marketing concept, relational marketing, defined by Reinartz and Kumar (2003) as “the establishment and maintenance of long-term buyer–seller relationships”.
Although CRM has become widely recognized as an important marketing tool, there is no single definition of this concept in the literature (Khanna, 2001; Kim, 2006; Ling & Yen, 2001; Ngai, 2005; Ngai et al., 2009; Payne & Frow, 2005; Parvatiyar & Sheth, 2001; Stone & Woodcock, 2001; Swift, 2001). According to Payne and Frow (2005), CRM can be defined from three perspectives. The first perspective defines CRM in a narrow way, as a specific technology solution project (Khanna, 2001). The second perspective, instead, defines CRM as an integrated series of customer-oriented IT solutions (Stone & Woodcock, 2001). Finally, the third perspective defines CRM in a more strategic and holistic way that highlights the management of the customer relationship to create value for both the firm and the customers. Within this perspective, Swift (2001) defines CRM as an “enterprise approach to understanding and influencing customer behavior through meaningful communications in order to improve customer acquisition, customer retention, customer loyalty, and customer profitability”. Parvatiyar and Sheth (2001) consider CRM as “a comprehensive strategy and process of acquiring, retaining, and partnering with selective customers to create superior value for the company and the customer. It involves the integration of marketing, sales, customer service, and the supply chain functions of the organization to achieve greater efficiencies and effectiveness in delivering customer value”. Scott (2001) suggests that CRM is “a set of business processes and overall policies designed to capture, retain and provide service to customers”. Chang et al. (2002) analyzed the importance of CRM in enhancing the ability of a firm to obtain and retain key customers, while Hansotia (2002) suggests that “at the heart of CRM is the organization’s ability to leverage customer data creatively, effectively and efficiently to design and implement customer-focused strategies”.
Kincaid (2003) views CRM as “the strategic use of information, processes, technology and people to manage the customer’s relationship with your company (Marketing, Sales, Services, and Support) across the whole customer life cycle”. According to Injazz and Karen (2004), CRM is “a coherent and complete set of processes and technologies for managing relationships with current and potential customers and associates of the company, using the marketing, sales and service departments, regardless of the channel of communication”. Payne and Frow (2005) state that “CRM is a strategic approach that is concerned with creating improved shareholder value through the development of appropriate relationships with key customers and customer segments. CRM unites the potential of relationship marketing strategies and IT to create profitable, long-term relationships with customers and other key stakeholders. CRM provides enhanced opportunities to use data and information to both understand customers’ needs and create value with them. This requires a cross-functional integrating of processes, people, operations, and marketing capabilities, enabled through information, technology, and applications”. Panagiotis et al. (2007) suggest that “CRM optimizes values as profitability, revenue and customer satisfaction (what and why) by organizing around customer segments, fostering customer satisfying behaviours and implementing customer-centric business models (how). Therefore, CRM should enable greater customer insight, increased customer access and more effective customer interactions (outcomes)”. Ngai et al. (2009) state that to analyze and understand “customer behaviors characteristics is the foundation of a competitive CRM strategy, so as to acquire and retain potential customers and maximize customer value”. Finally, Feng (2012) suggests that the “goals of CRM are, on the one hand, to attract and retain more customers by providing more quickly good quality service, on the other hand, to reduce costs through the overall management of business process”. As seen, many definitions of CRM exist. This plethora of definitions, ranging from CRM as the implementation of specific technology solutions, to the implementation of an integrated series of customer-oriented technology solutions, to a holistic approach to managing customer relationships that simultaneously creates value for both the customer and the firm, has caused some confusion. Parvatiyar and Sheth (2001) have noted that a prerequisite for a new concept to merge into an established field is to institute an acceptable definition that captures all the major aspects of the concept itself. In fact, the way a concept is defined is not merely a semantic question; it directly affects the way a company accepts and uses the concept itself. Thus, it is possible to quibble about the specific wording used in all the above definitions, but we have to identify the basic elements of this concept. Therefore, in this paper, according to Boulding et al.
(2005), and to Payne and Frow (2005), CRM relates to strategy, the management of the dual creation of value (for both firm and customers), the intelligent use of data and technology, the acquisition of customer knowledge and the diffusion of this knowledge to the appropriate stakeholders, the development of profitable long-term relationships with specific customers and/or customer groups, and the integration of processes across the many areas of the firm and across the network of firms that collaborate to generate customer value. What emerges from all these definitions is that CRM requires a “cross-functional integration of processes, people, operations, and marketing capabilities that is enabled through information, technology, and applications” (Payne & Frow, 2005). Thus, CRM goes beyond a customer focus. Not only does CRM build relationships and use systems to collect and analyze data, but it also includes the integration of all these activities across the firm, linking these activities to both firm and customer value, extending this integration along the value chain, and developing the capability of integrating these activities across the network of firms in order to collaborate and generate customer value, while creating shareholder value for the firm. From an architecture point of view, the CRM can be classified into operational and analytical (Berson et al., 2000, He et al., 2004; Teo et al., 2006). According to He et al. (2004) operational CRM includes all activities concerning the direct customer contact, such as campaigns, hotlines or customer clubs. Every CRM activity is generally implemented in one of the following three activities: marketing, sales or service, since these are the activities concerned with direct customer contact (ECCS, 1999). According to He et al. 
(2004) analytical CRM refers to the analysis of customer characteristics and behaviors so as to support the organization’s customer management strategies and to accomplish operational CRM activities with respect to the customers’ needs and expectations (Skinner, 1990). Thus, the ideal goal is to provide all the information necessary to create a tailored cross-channel dialogue with each single customer on the basis of his or her actual reactions (Arndt & Gersten, 2001). As such, analytical CRM could help an organization to better discriminate among customers and more effectively allocate resources to the most profitable group of customers. Many organizations have collected and stored a wealth of data about their current customers, potential customers, suppliers and business partners. However, the inability to discover valuable information hidden in the data prevents the organizations from transforming these data into valuable and useful knowledge (Berson et al., 2000). Data mining tools are a popular means of analyzing customer data within the analytical CRM framework. The application of data mining tools in CRM is an emerging trend in the global economy. Analyzing and understanding customer behaviors and characteristics is the foundation of the development of a competitive CRM strategy, so as to acquire and retain potential customers and maximize customer value. Appropriate data mining tools, which are good at extracting and identifying useful information and knowledge from enormous customer databases, are one of the best supporting tools for making different CRM decisions (Berson et al., 2000). As such, the application of data mining techniques in CRM is worth pursuing in a customer-centric economy. According to Swift (2001), Parvatiyar and Sheth (2001), Kracklauer et al. (2004) and Ngai et al. (2009), CRM consists of four dimensions: 1. Customer identification. Customer identification or, in other words, customer acquisition is the first phase of CRM.
This phase involves targeting the population who are most likely to become customers or to be most profitable to the company and, at the same time, it involves analysing customers who are being lost to the competition and how they can be won back (Kracklauer et al., 2004). This phase includes two main elements: target customer analysis and customer segmentation. Target customer analysis involves seeking the profitable segments of customers through analysis of customers’ underlying characteristics, whilst customer segmentation involves the subdivision of an entire customer base into smaller customer groups or segments, consisting of customers who are relatively similar within each specific segment (Woo et al., 2005). 2. Customer attraction. This phase follows customer identification. In fact, after identifying the segments of potential customers, companies can direct effort and resources into attracting these customer segments. Elements of customer attraction include new product development, advertising and direct marketing (Cheung et al., 2003; He et al., 2004; Liao and Chen, 2004; Prinzie and Poel, 2005). 3. Customer retention. This is the core concern of CRM. In fact, building more relationships is not always better; instead, building the right long-term relationships is critical. Thus, customer satisfaction, which refers to the comparison of customers’ expectations with their perception of being satisfied, becomes an essential condition for retaining customers (Kracklauer et al., 2004). Elements of customer retention are, for example, one-to-one marketing and loyalty programs (Chen et al., 2005; Jiang & Tuzhilin, 2006; Kim & Moon, 2006). 4. Customer development. This phase aims at increasing transaction intensity, transaction value and individual customer profitability. Elements of customer development include customer lifetime value analysis, up/cross selling and market basket analysis. Customer lifetime value analysis is defined as the prediction of the total net income a company can expect from a customer (Drew et al., 2001; Etzion et al., 2005; Rosset et al., 2003). Up/cross selling refers to promotion activities that aim at increasing the number of associated or closely related services that a customer uses within a firm (Prinzie & Poel, 2006).
Market basket analysis aims at maximizing customer transaction intensity and value by revealing regularities in the purchase behaviour of customers (Aggarval & Yu, 2002; Brijs et al., 2004; Carrier & Povel, 2003; Chen et al., 2005; Giudici & Passerone, 2002; Kubat et al., 2003). Finally, in analyzing the efficiency and effectiveness of these four phases, companies should recognize that relationships evolve through distinct phases (Dwyer et al., 1987; Reinartz et al., 2004), that firms interact with customers (Reinartz et al., 2004; Srivastava et al., 1998), and that the distribution of relationship value to the firm is not homogeneous (Mulhern, 1999; Niraj et al., 2001; Reinartz et al., 2004). Firstly, according to Dwyer et al. (1987), the CRM process should recognize that relationships evolve through distinct phases and, consequently, that relationships cannot be viewed as multiple independent transactions; rather, the interdependency of the transactions creates its own dynamic over time. In other words, CRM processes are longitudinal phenomena. Secondly, firms should interact with customers and manage relationships differently at each stage (Srivastava et al., 1998). In fact, the main goal of CRM is to manage the various stages of the relationship systematically and proactively. Finally, the value of each relationship to the firm is not homogeneous (Mulhern, 1999; Niraj et al., 2001). According to Reinartz et al. (2004), “a common finding is that best customers do not receive their fair share of attention and that some companies overspend on marginal customers”. In a CRM paradigm, a core objective is to define different resource allocations for different tiers of customers, where the customer's tier membership depends on the economic value of that customer or segment to the firm (Zeithaml et al., 2001).
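As an illustration of the "regularities in purchase behaviour" that market basket analysis looks for, the following minimal sketch computes the three measures commonly used to evaluate an association rule (support, confidence and lift). The transactions, the item names and the helper functions are hypothetical and are not taken from the study; the sketch only assumes the standard definitions of these measures.

```python
# Hypothetical purchase baskets; the item names are illustrative only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def rule_metrics(antecedent, consequent, baskets):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    support    = share of baskets containing both sides of the rule
    confidence = support(both) / support(antecedent)
    lift       = confidence / support(consequent)
    """
    s_ante = support(antecedent, baskets)
    s_cons = support(consequent, baskets)
    if s_ante == 0 or s_cons == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    s_both = support(set(antecedent) | set(consequent), baskets)
    confidence = s_both / s_ante
    lift = confidence / s_cons
    return s_both, confidence, lift


s, c, l = rule_metrics({"bread"}, {"milk"}, transactions)
print(f"support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

A lift above 1 would suggest that the two items are bought together more often than expected if purchases were independent, which is the kind of regularity a cross-selling program could exploit; a lift below 1 suggests the opposite.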
Beyond its theoretical evolution, a prerequisite for a concept to be truly applicable and useful in any marketing activity is that it should demonstrably enhance firm performance (Boulding et al., 2004; Lehmann, 2004; Rust et al., 2004). Several studies analyze this topic using various research methods and different measures of firm performance. Firstly, these studies demonstrate that CRM leads to a process of dual creation of value for both the customers and the firm (Boulding et al., 2005; Levitt, 1960, 1969; Payne & Frow, 2005; Rogers, 2005; Vargo & Lusch, 2004). The dual creation of customer and firm value is a core concept in CRM because the emphasis is not only on how to sell a product, but rather on how to create value for the customers and, consequently, for the firm. In fact, according to Levitt (1960, 1969), one of the main ideas in marketing is that, if a firm wants to grow, it should focus on fulfilling the expressed and latent needs of its customers. According to Boulding et al. (2005), it is essential that the firm “develops measures that are directly connected with this value dual-creation process, enabling the firm to understand the drivers of value and thus to ensure long-term success”. Therefore, through CRM, firms normally obtain not only final measures such as profit or shareholder value, but also intermediate measures, such as customer lifetime value and acquisition and retention costs, which relate to the value dual-creation process. As Boulding et al. (2005) state, “good CRM process measures provide the firm with the opportunity to gain deeper insights into how these intermediate process measures link to downstream firm performance” such as profit or shareholder value. Secondly, CRM helps firms to begin to treat marketing costs as firm investments (Rust et al., 2004). Furthermore, this implies that marketing could gain a central role in managing a key asset of the firm, namely the customer asset. Finally, these studies also show that CRM has a positive impact on the performance of firms operating in different industry sectors. This is an important result because it indicates that the success and positive impact of CRM on firm performance is not contingent on a particular industry, but is applicable across industry sectors. Among these studies, Ryals (2005), using a case study approach and measuring the costs and revenues associated with CRM to assess overall profit, shows that one of the business units analyzed was able to achieve a 270% increase in business unit profit above target by implementing several CRM measures. In addition, she shows that firms reduce their attention to customers after they determine that they are not able to garner enough value from these customers. Thus, for certain customers, value is taken away so that firms can increase the value they receive. Mithas et al. (2005), using a multifirm database, demonstrate that the use of CRM is associated with increased customer knowledge and, consequently, with greater customer satisfaction. Srinivasan and Moorman (2005), using a cross-sectional database, show that firms that invest more in CRM have greater customer satisfaction than other firms. Jayachandran et al. (2005) suggest that firms that use CRM obtain better performance in terms of retention and customer satisfaction than firms that do not implement CRM. Cao and Gruca (2005), using data collected over time within a single firm and from its customers, focus their attention on acquiring the right customers in order to develop a specific CRM model to increase firm performance.
The authors provide a framework whereby the firm can better limit its target market to customers who both want to hear about the firm's particular offer and qualify for that offer. As a result, the firm does not send messages to customers who are unlikely to respond, thus minimizing the disturbance to these customers. This leads to an obvious win-win situation for the firm and its customers (i.e., the dual creation of value). Lewis (2005) elaborates a process that identifies dynamic customer behavior, thus helping firms to create a pricing scheme that increases long-term profits. Thomas and Sullivan (2005), using a corporate database, develop a CRM model that allows the firm to modify its communication tools depending on where customers live and how they shop and, consequently, to increase its profit. The authors also examine the dual creation of value from the firm's perspective, proposing a process that enables firms to migrate customers into more profitable channels.

Data Mining

The origin of data mining goes back to the first storage of data on computers; it developed with improvements in data access, and today technology allows users to navigate through data in real time. The digital revolution has made a huge amount of data easy to capture. Nowadays, information and knowledge are strategic and indispensable in the search for competitive advantage and in decision-making, with the time available getting shorter and shorter. Therefore, for CRM to make real use of this enormous amount of data, appropriate tools are needed. In fact, most organizations have built up massive databases about their customers and their purchase transactions. However, this data is often raw, and raw data is often meaningless and rarely of direct benefit. According to Berson et al. (2000), the inability to discover valuable information hidden in these sets of data prevents companies from transforming this data into a valuable and useful knowledge base; thus a wealth of customer information remains hidden and unutilized in such databases. The true value of this data is predicated on the ability to extract information useful for decision support or exploration, and on understanding the phenomenon governing the data source. Therefore, companies need adequate statistical techniques to analyze and transform this data into useful knowledge. Traditional statistical techniques and data management tools are no longer adequate for this purpose. All this has prompted a requirement for soft computing data analysis methodologies that can discover useful knowledge from data. Under these conditions, several companies have adopted data mining tools to monitor funding and client consumption, prevent fraud and foresee customer behaviors (Galvão & Marin, 2009). In fact, data mining tools can help uncover hidden knowledge in large datasets and understand the customer better.
As a consequence, the support of corporate decision making through data mining has received increasing interest and importance in operational research. Progress in technology and storage capacity has enabled the accumulation of customer data, producing large, rich datasets of heterogeneous scales. On the one hand, the enhanced data has created particular challenges in transforming attributes of different scales into a mathematically feasible and computationally suitable format. On the other hand, this has advanced the application of data mining methods capable of mining large datasets, such as decision trees (DT), artificial neural networks, genetic algorithms (GAs) and support vector machines (SVM). Turban et al. (2007) define data mining as “the process that uses statistical, mathematical, artificial intelligence and machine-learning techniques to extract and identify useful information and subsequently gain knowledge from large databases”. Berson et al. (2000), Lejeune (2001), Ahmed (2004) and Berry and Linoff (2004) define data mining in a similar way, as the process of extracting or detecting hidden patterns or information from large databases. In addition, scholars (among others, Langley & Simon, 1995; Lau et al., 2003; Su et al., 2002) agree that, with comprehensive customer data, data mining can provide business intelligence to generate new opportunities. According to Berson et al. (2000), appropriate data mining tools appear to be one of the best supporting tools for making different CRM decisions, as they are efficient at extracting and identifying useful information and knowledge from enormous customer databases. In fact, analyzing and understanding customer behaviors and characteristics are the foundation of the development of a competitive CRM strategy, so as to acquire and retain potential customers and maximize customer value. Within CRM, data mining can be seen as a business-driven process aimed at the discovery and consistent use of profitable knowledge from organizational data (Ling & Yen, 2001). It can be used to guide decision-making and forecast the effects of decisions. For instance, data mining can increase the response rates of a marketing campaign by segmenting customers into groups with different characteristics and needs, and it can predict how likely an existing customer is to take his or her business to a competitor (Carrier & Povel, 2003). Data mining has formed a branch of soft computing techniques (Liao, 2012; Mitra et al., 2002) and assists companies in discovering and extracting the hidden knowledge in large volumes of data (Ahmed et al., 2004; Berson et al., 2000; Lejeune, 2001). In fact, data mining is an interdisciplinary field with a general goal of predicting outcomes and uncovering relationships in data.
It uses automated tools employing sophisticated algorithms to discover hidden patterns, associations, anomalies and/or structures from large amounts of data stored in data warehouses or other information repositories. Thus, data mining is a tool to give meaning to the data, and it is actually part of a larger process called “knowledge discovery”, which describes the steps that must be taken to ensure meaningful results. Although data mining techniques are used in several areas such as fraud detection, bankruptcy prediction, medical diagnosis, and scientific discovery, their use for marketing decision support highlights unique and interesting issues such as real-time interactive marketing, customer profiling, cross-organizational management of knowledge, and CRM. It should be clear from the discussion above that CRM is a broad topic with many layers, one of which is data mining, and that data mining is a method or tool that can aid companies in their quest to better understand and manage raw data (Rygielski et al., 2002). The DM development process involves various models and, within each model, different statistical techniques that make the extraction of new knowledge possible. According to various researchers (Ahmed, 2004; Carrier & Povel, 2003; Mitra et al., 2002; Shaw et al., 2001; Turban et al., 2007), CRM can be supported by different data mining models that generally include the following seven:
1. Association. It aims at establishing relationships between items that exist together in a given record (Ahmed, 2004; Jiao et al., 2006; Mitra et al., 2002). Market basket analysis and cross-selling programs are typical examples for which association modelling is usually adopted.
2. Classification. It is one of the most common learning models in data mining (Ahmed, 2004; Berry & Linoff, 2004; Carrier & Povel, 2003). It aims at classifying data records into one of several predefined classes based on certain criteria and, consequently, at building a model to predict future customer behaviours (Ahmed, 2004; Berson et al., 2000; Chen et al., 2003; Mitra et al., 2002). Common tools used for classification are neural networks, decision trees and if-then-else rules.
3. Clustering. It aims at segmenting a heterogeneous population into a number of more homogeneous clusters based on similarity metrics or probability density models (Ahmed, 2004; Berry & Linoff, 2004; Carrier & Povel, 2003; Mitra et al., 2002). It differs from classification in that the clusters are unknown, that is, not predefined, when the algorithm starts. Common tools for clustering include neural networks and discrimination analysis.
4. Forecasting. It estimates a future value based on a record's patterns. It deals with continuously valued outcomes (Ahmed, 2004; Berry & Linoff, 2004). It relates to modelling and the logical relationships of the model at some time in the future. Demand forecasting is a typical example of a forecasting model. Common tools for forecasting include neural networks and survival analysis.
5. Regression. It is a statistical estimation technique used to map a data item to a real-valued prediction (Carrier & Povel, 2003; Mitra et al., 2002). It includes curve fitting, prediction, modeling of causal relationships, and testing scientific hypotheses about relationships between variables. Common tools for regression include linear regression and logistic regression.
6. Sequence discovery. It is the identification of associations or patterns over time, as in time-series analysis (Berson et al., 2000; Carrier & Povel, 2003; Mitra et al., 2002). According to Mitra et al. (2002), “the goal is to model the states of the process generating the sequence or to extract and report deviation and trends over time”. Common tools for sequence discovery are statistics and set theory.
7. Visualization. It refers to the presentation of data so that users can view complex patterns (Shaw et al., 2001). It is used in conjunction with other data mining models to provide a clearer understanding of the discovered patterns or relationships (Turban et al., 2007). Examples of visualization models are 3D graphs, “Hygraphs” and “SeeNet” (Shaw et al., 2001).
Each of these types of data mining model uses various statistical techniques. The most widely used in CRM are association rules, neural networks, genetic algorithms, fuzzy logic, and decision trees. Association rules concern the discovery of association relationships hidden in databases that exceed an interestingness threshold (Berry & Linoff, 2004; Brijs et al., 2004; Ngai et al., 2009; Wang et al., 2005). The threshold tells how strong the pattern is and how likely the rule is to occur again (Berson et al., 2000). This technique can be used to build a model for predicting the value of a future customer (Wang et al., 2005). Fuzzy logic (Elamvazuthi et al., 2010; 2012; Feng & Yuan, 2011; Madronero et al., 2010; Vasant, 2006; Vasant et al., 2010; Vasant et al., 2011) is a mathematical theory that imitates the human ability to make decisions under uncertainty and imprecision. Through fuzzy logic, intelligent systems of control and decision support can be built (Han & Kamber, 2006). According to Galvão and Marin (2009), fuzzy logic can be used mainly in two forms. Firstly, fuzzy logic can extend classical logic into a more flexible one, aiming at formalizing imprecise concepts. 
Secondly, fuzzy sets are applied to several theories and technologies to process inaccurate information, such as in decision-making processes. Artificial neural network (ANN) is an artificial intelligence technique introduced by McCulloch and Pitts in 1943. It mimics the biological neural network of the human nervous system. Thus, the basic idea of ANNs is that it learns from examples using several constructs and algorithms just like a human being learns new things. Unlike Fuzzy sets, It is effective in function approximation, forecasting, classification, clustering and optimization tasks depending on the neural network architecture (Berry & Linoff, 2004; Mitra et al., 2002; Turban et al., 2007). Genetic algorithms (GAs), developed by Holland in 1975, mimic Darwinian principles of natural selection and evolution to solve nonlinear, non-convex global optimization problem. (Armin & Babak, 2011; Gaby et al., 2010; Ganesan et al., 2011, 2012; Leng et al., 2012; Pinkey et al., 2011; Provas Kumar & Dharmadas 2012; Svancara et al., 2012; Vasant, 2012, 2013). GAs are stochastic search techniques that are able to seek out large and complicated spaces on the ideas from natural genetics and evolutionary principle (Davis, 1991; Etemadi et al., 2009; Holland, 1975; Goldberg, 1989; Shin & Lee, 2002; Varetto, 1998). GAs are based on three fundamental steps: selection of better individuals, crossover and mutation. The selection usually starts from a population of randomly generated individuals and happens in generations. In each generation, the fitness of every individual in the population is evaluated, multiple individuals are stochastically selected from the current population based on their fitness, and modified, recombined and possibly randomly mutated to form a new population. The new population is then used in the next iteration of the algorithm (Gordini, 2013). 
As the algorithm evolves, only the solutions with higher predictive power survive, until an ideal solution is reached. A decision tree (DT) is a technique that can be used to extract models describing sequences of interrelated decisions or predicting future data trends (Berry & Linoff, 2004; Chen et al., 2003; Kim et al., 2005). It classifies specific entities into particular classes based upon the features of the entities: a root is followed by internal nodes, each node is labeled with a question, and an arc associated with each node covers all possible responses (Buckinx et al., 2004; Chen et al., 2003). Graphically, a decision tree is represented by branches, similar to a tree (Han & Kamber, 2006). Each branch of the tree represents a decision about a variable that determines how the data are divided among the subsequent branches. Thus, it describes an association between the attribute and the target variable or, in other words, the association of each branch with other branches (Galvão & Marin, 2009). The aim of the induction of a decision tree is to produce an accurate prediction model or to discover the predictive structure of the problem. A decision tree has several advantages: it is simple to understand and interpret; it has value even with little hard data; possible scenarios can be added; worst, best and expected values can be determined for different scenarios; unlike ANNs, it uses a white-box model; and it can be combined with other decision techniques. A variety of DT paradigms have been developed, such as ID3, C4.5, CART and CHAID.

Customer Relationship Management and Data Mining The application of data mining tools in CRM is an emerging trend in the global economy that have many advantages, but also some disadvantages and some interesting challenges for both researchers and practitioners (especially for marketers). The most important advantages are the following. First, DM is a fundamental tool for the firms to manage and organize data because it helps companies to begin a cleaning process that eliminates errors and ensures consistency. In fact, companies have to organize raw customer data that must be transformed into information. The data preparation is a critical, vital process for the success of CRM because the data comes from various sources. In fact, companies are able to obtain data about their customers both from inside sources (massive databases that contain marketing, human resources and financial data) and from outside sources (i.e. they can purchase data from external consultant companies). According to Bergeron (2001), data mining allows firms to obtain a large amount of detailed, complete, homogeneous information about customers organized in a database, to know how to segment customers, differentiating profitable customers from those who are not, and to establish appropriate business plans for each case, to reach a deeper knowledge of expressed and latent needs and perceptions of the customers in real time, and consequently to better discriminate and more effectively allocate resources to the most profitable group of customers and reach lower costs, to reach a greater customer satisfaction and improve and extend customer relationships, generating new business opportunities. Second, in order to create high value for companies, DM focuses the attention on consumers in respect to both the “internal” (price, product, positioning) and “external” (competition, demographic) factors, which help to determine the customer consumptions, customer satisfaction and corporate profits. 
Third, it provides a link between individual transactions and analytical systems. Thus, the relationship between CRM and DM enables the company to manage a vast amount of data and to discover hidden affiliations with high business value. Data mining also has some disadvantages. It is clear that the role of technology is relevant throughout the CRM process, but technology alone is not sufficient for building a profitable and lasting relationship. Past experience has shown that misunderstandings on this point have often penalized companies. According to Veglio (2013), some of the main disadvantages can be summarized as follows:
- Drawing up customer-centric marketing strategies after the implementation of the CRM process. Decision-makers believe that implementing CRM software is equivalent to creating marketing strategies, whereas a company must first formulate its strategy and clarify its purpose.
- Assuming that more CRM technology is better. Many executives also mistakenly believe that CRM is a technology-intensive product and are apt to put emphasis on new functions of CRM software. When a company begins to use a CRM software package, it is very important to narrow down the specifications of the software in order to minimize the burden on its users and to suppress bugs. If a company concentrates excessively on new functions, the CRM software will be poorly integrated with the existing system.
- Focus on customers. Organizations must establish contact only with individuals who have a real interest in their company and product. When they approach the wrong people, they can be perceived as stalkers and lose potential customers.
- Cost of the tools. Data mining software is very expensive because of the complexity of the specific algorithms and models it implements. For this reason, firms sometimes cannot carry out data mining analysis and limit themselves to statistical analysis.
- The link between data mining software and campaign management. In the past this link was mostly manual. It required physical copies of the scoring from the data model to be created and transferred to the database. This separation of data mining and marketing management software introduced considerable inefficiency and was prone to human error. Today the trend is to integrate the two components. Businesses can gain a competitive advantage by ensuring that their data mining software and campaign management software share the same definition of the customer segments in order to model the entire database (Thearling, 1998).
The relationship between CRM and data mining also presents interesting challenges for both researchers and practitioners (especially for marketers). A first challenge regards the need to manage data that crosses organizational boundaries and is distributed across supply chain partners. Customer knowledge is typically distributed across supply chain partners, and marketing is an important beneficiary of this knowledge. However, managing cross-organizational knowledge requires organizational and industry-level efforts. Further research should analyze and develop appropriate inter-organizational CR models, protection of knowledge

rights, and distribution of knowledge benefits amongst partners. A second challenge arises when customers can belong to more than one category. Increasingly complex customer preferences make this issue particularly relevant for marketers, as they may encounter customers with multiple memberships. A marketer may also want to use multiple memberships to gain important knowledge about customers, instead of simplifying the classes and losing valuable information. However, according to Spangler (1999), current data mining techniques have been shown to be limited in handling memberships of multiple classes. Therefore, marketers need reliable classification tools. A third challenge is Web mining. According to Shaw et al. (2001), in recent years the Web has become an important and convenient tool for purchasing goods and, consequently, a source of customer data that is very useful for marketers. However, the multiple data formats and the distributed nature of knowledge on the Web make it a challenge to collect, discover, organize and manage this knowledge in a manner that is useful for marketing decision support. Therefore, Web mining needs to be addressed as an important marketing knowledge management issue. Finally, the choice of the data mining model and of the most accurate techniques depends on the data available and the business requirements. In fact, choosing and developing an effective and efficient data mining model, one that aids in measuring customer value and in addressing the heterogeneous expressed and latent needs of customers, increases the accuracy with which potential customers are predicted and, at the same time, decreases the likelihood of committing Type I and Type II errors. According to Reinartz et al. (2004), companies want to avoid the mistake of not identifying a good customer and subsequently not rewarding the customer accordingly (Type I error) and, at the same time, they also want to prevent the wrongful classification of low-value customers as high-value customers and the subsequent over-spending of resources (Type II error). In this paper we use a decision tree classification model to predict customers' purchasing behavior in the global market and to reduce both Type I and Type II errors. Classification is one of the most common learning models in data mining (Ahmed, 2004; Berry & Linoff, 2004; Carrier & Povel, 2003). It aims at building a model to predict future customer behaviors by classifying database records into a number of predefined classes based on certain criteria (Ahmed, 2004; Berson et al., 2000; Chen et al., 2003; Mitra et al., 2002). Common techniques used for classification are neural networks, if-then-else rules, and decision trees (Chen et al., 2003). According to Gehrke et al. (1999), Quinlan (1987), Ravi et al. (2008), Ravi Kumar et al. (2007) and Rud (2000), decision trees form a part of machine learning, an important area of artificial intelligence. A vast number of algorithms are used for building decision trees, including CART, Chi-squared Automatic Interaction Detection (CHAID), QUEST and C5.0. For the purpose of our research, a classification decision tree model based on the “CART” algorithm and on the Gini Impurity was chosen as the most suitable data mining model. The main reason for this choice is that the “CART” algorithm is one of the most popular approaches used to manage business problems within global organizations (Breiman et al., 1984) and it is relatively simple for decision makers to interpret, while the C4.5 and C5.0 algorithms are considered less suited to business problems because they are mainly used in engineering research (Giudici, 2010).

DATA AND METHODOLOGY The main purpose of this research is to demonstrate the strategic predictive power of data mining models in forecasting accurate marketing performance in globally competitive companies. Special attention is paid to identifying the marketing drivers that move potential customers toward becoming customers, that is, the drivers that increase the probability of customer conversion. The data analyzed concern the launch of a quarterly online marketing campaign. The first part of the campaign ran in December 2010, while the second part ran in January and February 2011. The database was provided by a global marketing consultancy that offers digital data-driven solutions across all interactive channels. Currently, this firm operates in many countries across Europe, South America, Asia, Africa and Australia. We cannot give other details about the company for privacy reasons. An explorative data mining analysis is used to accomplish our research goal. The entire database contained more than 1,463,199 potential customers and an initial set of 42 variables related to their purchase behavior. Table 1 provides a description of the variables.

Table 1 Description of the Variables in the Dataset

Variable | Description | Measure
Activity Timestamp | Timestamp for the activity on the advertiser's website | Nominal
Activity Tag name | Activity tag name associated with the action that the potential customer performed on the client (company) site | Nominal
Advertisement Name | Advertisement name associated with the exposure | Nominal
Type of Banner | Type of banner proposed to the potential customer | Nominal
Name of the Advertiser | Name of the advertiser | Nominal
Amount of Model Conversion | Amount of conversion attributed to a specific potential customer in a journey based on the results of the model | Scale
Activity Quantity | Quantity associated with the activity. Quantity can typically be 1 representing the activity but in some cases this will be > 1 | Scale
Revenue Activity | Revenue associated with the activity | Scale
Cost per Click | Cost per Click. It is an internet advertising model used to direct traffic to websites, where advertisers pay the publisher when the advertisement is clicked | Scale
Click Through Rate | Click Through Rate. It is a way of measuring the success of an online campaign for a particular product or service. The click through rate of an advertisement is defined as the number of clicks on the advertisement divided by the number of times the advertisement is shown | Scale
Average Position | The average position of a search term in the search engine | Scale
Brand Search | The search term includes any of the branded terms | Scale
Name of the Campaign | Name of the campaign | Nominal
Creative Height | Creative Height | Scale
Creative Type | Creative Type | Scale
Creative Width | Creative Width | Scale
Creative Name | Creative for display advertisement | Scale
Head Flag | Search flag used to mark high volume keywords | Nominal
Timestamp for impression and clicks | Timestamp for impressions and clicks | Nominal
Impression or Click | ‘Imp’ if the event is an impression, ‘Click’ if the event is a click | Nominal
Keywords Advertising Group | Group of Advertising Keyword | Nominal
Keywords Campaign | Keywords Campaign | Nominal
Keywords Category | Keywords Category | Nominal
Keywords Name | Keywords on which search advertisements appear in the web | Nominal
Match Type | Type of key word in form users | Nominal
Max Search Click | If 1 then includes search, if 0 then no search | Scale
Price Paid | Price paid by the consumer for the purchase | Scale
Quantity Sold | Number of items sold | Scale
Purchases | Number of items purchased by potential customers | Scale
Min Search Click | If 1 then only search, if 0 then includes display | Scale
Single Conversion Activity | Flags journeys where the potential customer only has a single conversion/activity | Scale
Site Placement | Placement on the Site (homepage) | Scale
Rank0 | Variable not identified by the company | Scale
Rank1 | Auto-incrementing value representing all touch points of a potential customer journey. Does not reset on a conversion/activity | Scale
Rank2 | Auto-incrementing value representing all touch points of a potential customer journey. Does reset on a conversion/activity | Scale
Rank3 | Auto-incrementing value representing the conversion number for the user. The same value will repeat across all exposures leading to an activity and then will increment and repeat for the next set of exposures leading to a conversion/activity | Scale
Record Number | Record number | Scale
Search Engine Name | Potential customers search on the search engine information related to the marketing campaign | Nominal
Search Click | Represents an exposure that is a search click | Nominal
Segment | Client specific field for this report. For instance shows a segment applied to users based on location/energy consumption | Nominal
Site Name | Site (for display advertisement) | Nominal
User Id | ID of the potential customers | Scale

The database shown in Table 1 is not aggregated by potential customer. It is therefore useful to aggregate the dataset by ‘user id’ (a unique code for each potential customer), so that each ‘user id’ corresponds to a different potential customer. Table 2 lists the aggregated variables and the aggregation criteria used.

Table 2 Variables Aggregation Criteria

Variable | Aggregation Criteria | Variable Measure | New Variable Measure
Activity Timestamp | Categorization into groups and creation of dummy variables | Nominal | Scale
Advertisement Name | Categorization into groups and sum | Nominal | Scale
Type of Banner | | Nominal | Scale
Name of the Advertiser | | Nominal | Scale
Cost per Click | Average Value | Scale | Scale
Click Through Rate | Average Value | Scale | Scale
Average Position | Average Value | Scale | Scale
Brand Search | | Scale | Scale
Name of the Campaign | | Nominal | Scale
Creative Height | Categorization into groups | Scale | Scale
Creative Type | | Scale | Scale
Creative Width | | Scale | Scale
Head Flag | | Nominal | Scale
Timestamp for impression and click | Categorization into groups and creation of dummy variables | Nominal | Scale
Impression or Click | | Nominal | Scale
Keywords Advertising Group | | Nominal | Scale
Keywords Campaign | | Nominal | Scale
Keywords Category | | Nominal | Scale
Keywords Name | | Nominal | Scale
Match Type | | Scale | Scale
Max Search Click | Sum | Scale | Scale
Quantity Sold | Sum | Scale | Scale
Purchases | Sum | Scale | Scale
Min Search Click | | Scale | Scale
Rank 1 | Maximum Value | Scale | Scale
Rank 2 | Maximum Value | Scale | Scale
Rank 3 | Maximum Value | Scale | Scale
Search Engine Name | | Nominal | Scale
Search Click | | Nominal | Scale
Segment | | Nominal | Scale
Site Name | | Nominal | Scale
User Id | Parameter of aggregation | Scale | Scale
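The aggregation by ‘user id’ summarized in Table 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the column names (`cost_per_click`, `purchases`, `rank1`) and the helper `aggregate_by_user` are hypothetical, and only three of the criteria in Table 2 (average value, sum, maximum value) are shown. It is not the procedure actually run on the company database.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical event-level records: one row per impression or click.
events = [
    {"user_id": 1, "cost_per_click": 0.25, "purchases": 0, "rank1": 1},
    {"user_id": 1, "cost_per_click": 0.75, "purchases": 1, "rank1": 2},
    {"user_id": 2, "cost_per_click": 0.10, "purchases": 0, "rank1": 1},
]

# One aggregation rule per column, mirroring three criteria from Table 2:
# average value, sum, and maximum value.
RULES = {"cost_per_click": mean, "purchases": sum, "rank1": max}


def aggregate_by_user(rows, rules):
    """Collapse event-level rows into a single record per user_id."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for column in rules:
            grouped[row["user_id"]][column].append(row[column])
    return {
        user: {column: rules[column](values[column]) for column in rules}
        for user, values in grouped.items()
    }


customers = aggregate_by_user(events, RULES)
for user, record in customers.items():
    print(user, record)
```

After aggregation each key of `customers` is one potential customer, which is the unit of analysis the classification tree needs.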

Finally, in order to retain only those variables with both the greatest predictive capacity and the lowest correlation level, we eliminated the non-significant variables from the dataset. To do so, we adopted a two-stage variable selection process. In the first stage, we eliminated variables on the basis of the opinion of the Executive Vice President (EVP) of the company. Table 3 shows the results of this first stage.

Table 3 Non-significant variables according to the Executive Vice President of the Company

Variable | Description | Measure | Notes
Activity Tag name | Activity tag name associated with the action that the potential customer performed on the client (company) site | Nominal | Unclear Variable
Amount of Model Conversion | Amount of conversion attributed to a specific potential customer in a journey based on the results of the model | Scale | Unclear Variable
Activity Quantity | Quantity associated with the activity. Quantity can typically be 1 representing the activity but in some cases this will be > 1 | Scale | Redundant Variable
Revenue Activity | Revenue associated with the activity | Scale | Redundant Variable
Creative Name | Creative for display advertisement, belongs to an advertiser | Scale | Low Predictive Value
Price Paid | Price paid by the consumer for the purchase | Scale | Many Missing Data
Single Conversion Activity | Flags journeys where the potential customer only has a single conversion/activity | Scale | Redundant
Site Placement | Placement on the Site (for instance: homepage) | Scale | Low Predictive Value
Rank 0 | Variable not identified by the company | Scale | Unclear Variable

In the second stage, in order to select only those variables with the lowest correlation level, we carried out a multicollinearity analysis. Multicollinearity analysis is a good method for discovering redundant variables. The literature identifies different causes of multicollinearity including, among others, the improper use of dummy variables, the use of a variable that is computed from other variables in the equation, or the use of the same variable twice. Or, simply, it may just be that the variables actually are highly correlated (Lattin et al., 2003). In this study we use the Variance Inflation Factor (VIF) method (Montgomery & Peck, 1992) as an indicator of multicollinearity. Although in the literature there is no general rule about the interpretation of the VIF value, high values of the VIF mean that the variables within the model are highly correlated (Caramanis & Spathis, 2006; Judge et al., 1987; Studenmund, 2006). A VIF greater than 10 could indicate a multicollinearity problem (Neter et al., 1996), while VIF values less than 2 mean that the variables are almost independent (Fernandez, 2007; Judge et al., 1987; Leow & Mues, 2012). Table 4 shows the variables with a VIF value less than 10.

Table 4 Multicollinearity Analysis (Collinearity Statistics)

Variable | Tolerance | VIF
Mean Click Through Rate | 0.906 | 1.103
Average Position | 0.804 | 1.243
Dynamic Click | 0.721 | 1.388
Average Position Best Five | 0.132 | 7.599
Brand Search | 0.176 | 5.692
Impression or Click at 1pm | 0.126 | 7.928
Impression or Click at 2pm | 0.110 | 9.065
Impression or Click at 3pm | 0.123 | 8.124
Impression or Click at 4pm | 0.175 | 5.719
Match Type: Broad | 0.484 | 2.066
Match Type: Exact | 0.135 | 7.430
Search Engine on Google | 0.125 | 7.997
Site Name: ConionMCUK | 0.924 | 1.082
Site Name: Adjug6 | 0.855 | 1.170
Site Name: Affiliate Window | 0.721 | 1.387
Site Name: Drive PM | 0.904 | 1.106
Site Name: MCUK Quidco | 0.700 | 1.429
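To make the two columns of Table 4 concrete: for predictor j, the tolerance is 1 − R²_j, where R²_j comes from regressing that predictor on all the others, and VIF_j = 1 / tolerance_j. The sketch below uses the equivalent identity that VIF_j is the j-th diagonal element of the inverse of the predictors' correlation matrix. The data and the function names are hypothetical, and the routine is a pure-Python illustration rather than the statistical package the authors used.

```python
from statistics import mean, pstdev


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long columns."""
    n = len(columns[0])
    means = [mean(c) for c in columns]
    sds = [pstdev(c) for c in columns]
    k = len(columns)
    return [
        [
            sum((columns[i][r] - means[i]) * (columns[j][r] - means[j])
                for r in range(n)) / (n * sds[i] * sds[j])
            for j in range(k)
        ]
        for i in range(k)
    ]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vif(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical predictors; their correlation is 0.8.
x1 = [1, 2, 3, 4]
x2 = [1, 3, 2, 4]
vifs = vif([x1, x2])
tolerances = [1 / v for v in vifs]
print(vifs, tolerances)
```

With two predictors correlated at 0.8, each VIF is 1 / (1 − 0.8²), about 2.78. The same reciprocal relationship can be checked on Table 4, for example 1 / 0.906 ≈ 1.10.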

The final dataset contains 1,463,199 potential customers and 18 quantitative variables (1 target variable and 17 independent variables) relating to their purchase behaviour. The target variable is dichotomous and takes two values: 0 when the potential customer does not purchase the service (bad customer) and 1 when the potential customer buys the service offered by the company (good customer). Table 5 describes the variables used in the study. In addition, it provides an encoding of the variables analysed in order to better understand the classification tree output. Finally, particular attention must be paid to the statistical measure of the variables.

Table 5 Description of the Selected Variables

Variable | Description | Measure
Dynamic Click | Number of times that a potential customer clicks on a moving banner | Scale
Click Through Rate | Click Through Rate. It is a way of measuring the success of an online campaign for a particular product or service. The click through rate of an advertisement is defined as the number of clicks on the advertisement divided by the number of times the advertisement is shown | Scale
Average Position Best Five | Number of times that the site name appears in the top five results after a search in the search engine | Scale
Brand Search | Number of times that a potential customer types one of the company's brand names in the search engine | Scale
Match Type: Broad | Number of times that a potential customer types one of the keywords in the search engine | Scale
Match Type: Exact | Number of times that a potential customer types an exact keyword in the search engine | Scale
Purchases | Target variable. If the potential customer purchases a service online the variable is marked with 1, otherwise 0 | Scale
Impression or Click at 1pm | Number of times that a potential customer performs an impression or click at 1pm | Scale
Impression or Click at 2pm | Number of times that a potential customer performs an impression or click at 2pm | Scale
Impression or Click at 3pm | Number of times that a potential customer performs an impression or click at 3pm | Scale
Impression or Click at 4pm | Number of times that a potential customer performs an impression or click at 4pm | Scale
Average Position | Average position of a search term in the search engine | Scale
Search Engine on Google | Number of times that potential customers insert a company keyword on Google | Scale
Site Name: Conion MCUK | Number of times that potential customers are exposed to a specific banner | Scale
Site Name: Affiliate Window | Number of times that potential customers are exposed to a specific banner | Scale
Site Name: Drive PM | Number of times that potential customers are exposed to a specific banner | Scale
Site Name: Yahoo Q2006 | Number of times that potential customers are exposed to a specific banner | Scale
Site Name: Adjug6 | Number of times that potential customers are exposed to a specific banner | Scale

Given the nature of the target variable, a classification decision tree model based on the “CART” algorithm (Equations 1-3) and on the Gini Impurity (Equations 4 and 5) has been developed in this research. According to Pendharkar et al. (2005), CART constructs a binary decision tree by splitting a database in such a way that the data in the descendant subsets are purer than the data in the parent set. For example, let $(x_n, y_n)$ represent the $n$th example, where $x_n$ is the $n$th example vector of independent variables and $y_n$ is the value of the target variable. If there are a total of $N$ examples, then CART calculates a best split $s^*$ so that the following is maximized over all possible splits $S$ (Eq. 1):

$$\Delta R(s^*, t) = \max_{s \in S} \Delta R(s, t) \qquad \text{(Eq. 1)}$$

where $\Delta R(s, t) = R(t) - R(t_L) - R(t_R)$ is the improvement in the re-substitution estimate for split $s$ of $t$. The re-substitution estimate $R(t)$ is defined as follows (Eq. 2):

$$R(t) = \frac{1}{N} \sum_{x_n \in t} \left( y_n - \bar{y}(t) \right)^2 \qquad \text{(Eq. 2)}$$

The variables $t_L$ and $t_R$ are the left and right values for the split of $t$. The variable $\bar{y}(t)$ is defined as follows (Eq. 3):

$$\bar{y}(t) = \frac{1}{N(t)} \sum_{x_n \in t} y_n \qquad \text{(Eq. 3)}$$

where N (t ) is the total number of cases in t . The tree continues to grow until a node is reached such that no significant decrease in the re-substitution estimate is possible. This node is the terminal node. According to Giudici (2010), the Gini Impurity is equal to (Eq. 4): k ( m)

I G (m) = 1 − ∑ π i2

(Eq. 4)

i =1

where π i are the fitted probabilities of the levels present at node m , which are the most k (m) . The fitted success probability is given by (Eq. 5)

πi

∑ =

nm

i =1

ylm

nm

(Eq. 5)

y

where the observations lm can take the value 0 or 1, and the fitted probability corresponds to the observation proportion of success in group M (Figini & Giudici, 2009; Giudici, 2010). The output of the analysis is represented through a tree. This implies that the partition performed at a certain level is influenced by the previous choices. For a classification tree, a discriminant rule can be derived at each leaf of the tree. Each leaf points out a clear allocation rule of the observations, which is read by going through the path that connects the initial node to each of them (Berry & Linoff, 2011). In particular, the final output includes the following information at each terminal node. Firstly, it shows a probability, which represents the membership degree of a terminal node of the class. Secondly, it provides the class assigned to the terminal node related to its probability and to pre-specified misclassification costs (Colombet et al., 2000). Finally, in order to evaluate the correctness of the predict model proposed, the criteria based on the loss functions such as Percentage Correctly Classified (PPC) and Area Under the receiver operating Characteristic curve (AUC) have been implemented in this research. Both measures are commonly used as performance criteria (Mozer et al., 2000). The PPC compares the ‘posterior’ probability of defection with the true status of the customer. The resulting confusion matrix is used to calculate the accuracy of the models. It contains the number of elements that have been correctly or incorrectly classified for each class. The main diagonal shows the number of observations that have been correctly classified for each class; the offdiagonal elements indicate the number of observations that have been incorrectly classified. A disadvantage of this measure is that it is not very robust concerning the chosen cut off value in the ‘a posterior’ probabilities (Baesens et al., 2002). 
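The sensitivity of the PPC to the cut-off can be illustrated with a minimal sketch. The labels and fitted probabilities below are hypothetical toy values, not data from this study, and the function names `confusion_matrix` and `ppc` are introduced here only for illustration.

```python
def confusion_matrix(y_true, scores, cutoff):
    """Return (tn, fp, fn, tp); a score >= cutoff is classified as 1 (buyer)."""
    tn = fp = fn = tp = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= cutoff else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def ppc(y_true, scores, cutoff):
    """Percentage correctly classified: main diagonal over all observations."""
    tn, fp, fn, tp = confusion_matrix(y_true, scores, cutoff)
    return (tn + tp) / (tn + fp + fn + tp)


# Hypothetical toy data: 6 non-buyers and 2 buyers with fitted probabilities.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.01, 0.02, 0.03, 0.06, 0.08, 0.10, 0.30, 0.70]

# The same fitted probabilities give a different PPC at each cut-off.
ppc_by_cutoff = {c: ppc(y_true, scores, c) for c in (0.05, 0.20, 0.50)}
print(ppc_by_cutoff)
```

On these toy values the PPC is 0.625 at a cut-off of 0.05, 1.0 at 0.20 and 0.875 at 0.50, although the underlying scores never change.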
The AUC measure takes into account all possible cut-off levels. For each of these points, it considers the sensitivity (the number of true positives versus the total number of events) and the specificity (the number of true negatives versus the total number of non-events) of the confusion matrix in a two-dimensional graph, resulting in a ROC curve. The area under this curve can be used to evaluate the predictive accuracy of the classification model (Hanley & McNeil, 1982). Both PPC and AUC weigh the opportunity cost of misclassifying a buyer as a non-buyer and the cost of misclassifying a non-buyer as a buyer equally. It is easier to incorporate unequal misclassification costs into the PPC criterion than into the AUC. For instance, the probability of a misclassification is multiplied by the cost of misclassification (for both buyers and non-buyers). Both performance criteria will be calculated on a test or holdout sample, which consists only of observations not used during model estimation, and which is half the size of the total sample (Van den Poel, 2003).
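The split search described in this section can be sketched for a single numeric variable. This is a simplified illustration under stated assumptions, not the implementation used in this research: it scores each candidate threshold by the weighted decrease in Gini impurity (Eq. 4), which is the usual convention for CART classification trees, and the data and the names `gini` and `best_split` are hypothetical.

```python
def gini(labels):
    """Gini impurity 1 - sum_i pi_i^2 of a list of class labels (Eq. 4)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(x, y):
    """Return (threshold, impurity decrease) of the best binary split x <= threshold."""
    n = len(y)
    parent = gini(y)
    best_thr, best_gain = None, 0.0
    # Every observed value except the largest is a candidate threshold.
    for thr in sorted(set(x))[:-1]:
        left = [yi for xi, yi in zip(x, y) if xi <= thr]
        right = [yi for xi, yi in zip(x, y) if xi > thr]
        # Assumption: child impurities are weighted by the share of cases they receive.
        weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
        gain = parent - weighted
        if gain > best_gain:
            best_thr, best_gain = thr, gain
    return best_thr, best_gain


# Hypothetical toy data: banner exposures on an affiliate site and purchase flag.
affiliate_exposures = [0, 0, 0, 0, 0, 1, 2, 3]
purchases = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, gain = best_split(affiliate_exposures, purchases)
print(threshold, round(gain, 3))
```

Here the best split separates customers with no exposure from those with at least one, because that partition leaves the purest descendant nodes. A full tree would repeat the search over all variables and recurse on each descendant.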

DISCUSSIONS Before explaining the results of the classification decision tree, an exploratory analysis of the target variable is given below. Table 6 shows the purchase frequency of potential customers in a given period. Just 1.74% of potential customers purchased the service at least once. In other words, 25,433 potential customers are “good” because they purchased the service offered through a marketing campaign, while 1,437,766 are “bad” inasmuch as they did not purchase the service online.

Table 6 Frequency of the Variable “Purchases”
Purchases | Absolute Frequency | Relative Frequency (%)
0 | 1,437,766 | 98.26
1 | 25,433 | 1.74
Total | 1,463,199 | 100

Table 7 provides some position and dispersion indexes related to the target variable. On average, about 1.7% of potential customers purchase the service online. The minimum and maximum values confirm that no outliers are present in the distribution of the variable. Finally, the high value of the coefficient of variation indicates that the arithmetic mean is a poor summary of the distribution of the variable.

Table 7 Descriptive Values of the Variable “Purchases”
Variable | Minimum Value | Maximum Value | Arithmetic Mean | Variance | Standard Deviation | Coefficient of Variation
Purchases | 0.00 | 1.00 | 0.017 | 0.017 | 0.130 | 7.64
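Because “Purchases” is a 0/1 variable, its descriptive values follow directly from the two counts in Table 6. The sketch below recomputes them, assuming the population formulas for a binary variable; the variable names are illustrative.

```python
import math

# Counts taken from Table 6.
buyers, non_buyers = 25_433, 1_437_766
n = buyers + non_buyers

mean = buyers / n                # for a 0/1 variable the mean is the share of 1s
variance = mean * (1 - mean)     # population variance of a 0/1 variable
std_dev = math.sqrt(variance)
coeff_var = std_dev / mean       # dispersion relative to the mean

print(f"mean={mean:.4f} variance={variance:.4f} "
      f"std_dev={std_dev:.4f} coeff_var={coeff_var:.2f}")
```

The unrounded inputs give a coefficient of variation of about 7.5. Dividing the rounded table entries (0.130 by 0.017) gives about 7.65, which may explain the 7.64 reported in Table 7.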

Owing to the huge volume of the data, in order to identify the main variables to enter into the classification tree, we have estimated the Pearson Correlation Index between the target variable (Purchases) and each independent variable collected in the database. Table 8 shows the results of the Pearson Correlation Analysis.

Table 8 Pearson Correlation Analysis
Variables | Pearson Correlation Value
Purchases | 1.00
Dynamic Click | 0.62**
Average Click Through Rate | 0.29**
Average Position Best Five | 0.38**
Brand Search | 0.39**
Match Type: Broad | 0.16**
Match Type: Exact | 0.38**
Impression or Click at 1pm | 0.10**
Impression or Click at 2pm | 0.10**
Impression or Click at 3pm | 0.10**
Impression or Click at 4pm | 0.10**
Average Position | 0.40**
Search Engine on Google | 0.39**
Site Name: Aconiom MCUK | 0.09**
Site Name: Adjug6 | 0.05**
Site Name: Affiliate Window | 0.41**
Site Name: Drive PM | 0.05**
Site Name: Yahoo Q2006 | 0.09**

** Correlation is significant at the 0.01 level
* Correlation is significant at the 0.05 level

The Pearson Correlation Coefficient (PCC) measures the strength and direction (decreasing or increasing, depending on the sign of the coefficient) of a linear relationship between two variables without identifying cause and effect (Ahlgren et al., 2003). From a statistical point of view, only the variables with a p-value < 0.005 are significantly correlated with the target variable (Baum, 2006). The p-value is a decreasing index of the reliability of a result (Moody, 2009). In addition, the PCC provides information about the collinearity of the variables. In our case, the correlation values suggest that the variables are not redundant. According to the Executive Vice President (EVP), Data Platforms, of the company, the variables related to a specific banner to which the potential customer was exposed will be included in the classification tree because they could contain important strategic value. In addition, the EVP suggests continuous monitoring of the variable “Dynamic Click” because it represents the number of ‘fraud clicks’ generated by potential customers. According to Asdemir et al. (2008), click fraud occurs when a web user clicks on a sponsored link with the malicious intent of hurting a competitor or gaining undue monetary benefits. Competitors could generate a dynamic click by creating a direct connection with the payment page. Despite its strategic value, this variable must be removed from the classification tree because only 18,515 out of 1,463,199 potential customers generated a dynamic click in the web marketing campaign. Finally, variables such as “Average Position Best Five” and “Brand Search” seem to be redundant even though they contain different business information. A first look at the exploratory analysis shows some very interesting outcomes, but a predictive model is needed to better understand the strength of these relationships.
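As a minimal sketch of the screening step, the Pearson coefficient between the binary target and a count variable can be computed as follows. The eight observations are hypothetical toy values, not rows of the campaign database, and `pearson` is an illustrative helper name.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical toy data: purchase flag and number of affiliate banner exposures.
purchases = [0, 0, 0, 0, 1, 0, 1, 1]
affiliate_views = [0, 0, 1, 0, 2, 0, 3, 1]

r = pearson(purchases, affiliate_views)
print(round(r, 3))
```

The same function applied to two independent variables gives a quick check for collinearity: a coefficient close to 1 or -1 would suggest that the two variables carry largely the same information.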
A classification decision tree has been developed in this research in order to identify the best marketing drivers that could increase the probability of customer conversion. The classification decision tree is one of the main computational data mining models able to help marketers detect the marketing drivers on which organizations should concentrate their future marketing investments in order to maximize the probability of customer conversion. In addition, tree models aim to identify the main marketing activities performed by potential customers before purchasing the service proposed by the marketing campaign. Finally, because it is very simple to interpret, the tree could be an important knowledge structure for classifying future events and supporting the decision-making process. Graph 1 provides a representation of the classification decision tree, highlighting its efficiency, effectiveness, and validity in identifying potential customer purchase behaviour (the target variable) using our explanatory variables. [Graph 1 Classification Decision Tree]

Table 9 gives a more detailed interpretation of the decision tree, identifying how the explanatory variables (in the terminal nodes) influence the purchase probability and the number of potential customers.

Table 9 Interpretation of the Classification Tree
Node | Customer Category | Purchase probability | N. of potential customers
2 | The potential customer has been exposed to a banner on an affiliate website at least one time. | 91% | 7,362
7 | The potential customer has never been exposed to a banner on an affiliate website AND has been exposed to banners whose mean click through rate is greater than 0.5% AND has never typed the company's brand name on a search engine. | 14% | 1,726
10 | The potential customer has never been exposed to a banner on an affiliate website AND has been exposed to banners whose mean click through rate is lower than 0.5% AND has never typed a campaign keyword on a search engine AND has visited the ‘Conion MCUK’ website more than 6 times. | 31% | 3,904
11 | The potential customer has never been exposed to a banner on an affiliate website AND has been exposed to banners whose mean click through rate is lower than 0.5% AND has searched on Google at least one time AND has never typed a specific campaign keyword on a search engine. | 35.40% | 915
12 | The potential customer has never been exposed to a banner on an affiliate website AND has been exposed to banners whose mean click through rate is lower than 0.5% AND has searched on Google at least one time AND has typed a specific campaign keyword on a search engine at least once. | 75% | 1,583
13 | The potential customer has never been exposed to a banner on an affiliate website AND has been exposed to banners whose mean click through rate is greater than 0.5% AND has typed the company's brand name on a search engine at least one time AND has typed a specific campaign keyword on a search engine either zero times or one time. | 61% | 7,001
15 | The potential customer has never been exposed to a banner on an affiliate website AND has been exposed to banners whose mean click through rate is lower than 0.5% AND has never typed a campaign keyword on Google AND has visited the Conion MCUK website less than seven times AND has visited the MCUK Yahoo Q2006 website less than 21 times. | 1% | 1,438,292
16 | The potential customer has never been exposed to a banner on an affiliate website AND has been exposed to banners whose mean click through rate is lower than 0.5% AND has never typed a campaign keyword on Google AND has visited the Conion MCUK website less than seven times AND has visited the MCUK Yahoo Q2006 website at least 22 times. | 41% | 680
17 | The potential customer has never been exposed to a banner on an affiliate website AND has been exposed to banners whose mean click through rate is greater than 0.5% AND has typed the company's brand name on a search engine at least one time AND has typed an exact campaign keyword on a search engine at least two times AND has visited websites that have an average search position lower than 1.014. | 80% | 890
18 | The potential customer has never been exposed to a banner on an affiliate website AND has been exposed to banners whose mean click through rate is greater than 0.5% AND has typed the company's brand name on a search engine at least one time AND has typed an exact campaign keyword on a search engine at least two times AND has visited websites that have an average search position greater than 1.014. | 92% | 846

The classification decision tree shows that “Affiliate Websites” are a key driver of customer conversion: 91% of the 7,362 potential customers who visited an affiliate website purchased the company's service online. Node 15 (MCUK Yahoo Q2006) distinguishes a customer category that is very unlikely to buy the service proposed by the company. More precisely, potential customers who neither visited affiliate websites nor searched for a specific campaign keyword had a probability of conversion lower than 1%. Node 10 (Conion MCUK) identifies a group of potential customers with a low probability of purchase (31%). This group is mainly characterized by “Conion MCUK” visitors who do not visit “Affiliate Websites” and do not search for campaign keywords on the internet. By contrast, node 13 (Match Type: Exact) shows that activities such as typing the company's name on a search engine and being exposed to banners with a higher click through rate increase the conversion probability. This category is composed of 7,001 potential customers and has a purchase probability of 61%.

Finally, the criteria based on the loss functions confirm that the classification decision tree based on the “CART” algorithm is an accurate computational data mining predictive tool for identifying the main marketing activities that help to increase the probability of customer conversion. Table 10 shows that the classification tree correctly classifies 13,627 of the 25,433 potential customers who purchased the service. On the other hand, the classification tree correctly identifies 1,433,711 of the 1,437,766 “bad” potential customers.
Therefore, the classification tree correctly classifies 99.7% of the potential customers who did not purchase and 53.6% of those who did. Overall, with a cut-off of 0.05, the accuracy of the classification tree is very high, at 98.9% (Swets, 1988).

Table 10 Confusion Matrix
Observed values (Purchase Variable) | Predicted 0 | Predicted 1 | Percentage Correct
0 | 1,433,711 | 4,055 | 99.7
1 | 11,806 | 13,627 | 53.6
Overall Percentage | | | 98.9*
*The cut-off value is 0.05
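The percentages in Table 10 can be recomputed from its four cells. The sketch below does so and, as an added point of comparison that is not part of the original analysis, also computes the accuracy of a trivial rule that labels every potential customer as a non-buyer.

```python
# Cells of Table 10 (rows: observed, columns: predicted).
tn, fp = 1_433_711, 4_055    # observed 0: predicted 0, predicted 1
fn, tp = 11_806, 13_627      # observed 1: predicted 0, predicted 1
total = tn + fp + fn + tp

specificity = tn / (tn + fp)     # share of non-buyers classified correctly
sensitivity = tp / (tp + fn)     # share of buyers classified correctly
accuracy = (tn + tp) / total     # overall percentage correct

# Comparison value (an assumption added here): always predicting "no purchase".
baseline_accuracy = (tn + fp) / total

print(f"specificity={specificity:.1%} sensitivity={sensitivity:.1%} "
      f"accuracy={accuracy:.1%} baseline={baseline_accuracy:.2%}")
```

With only 1.74% of buyers in the data, the trivial rule already reaches about 98.26% accuracy, so the 53.6% of buyers recovered by the tree is the more informative figure.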

Because the confusion matrix depends on the chosen cut-off, Graph 2 provides a more robust measure to evaluate the accuracy of the classification decision tree model. [Graph 2 ROC Curve Purchases] The area under the ROC curve is equal to 80.20%, which indicates a moderate predictive accuracy of the classification decision tree (Swets, 1988).
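The area under the ROC curve can be read as the probability that a randomly chosen buyer receives a higher fitted probability than a randomly chosen non-buyer, which is the interpretation discussed by Hanley and McNeil (1982). A minimal sketch of that pairwise computation follows; the labels and scores are hypothetical toy values and `auc` is an illustrative helper name.

```python
def auc(y_true, scores):
    """Share of (buyer, non-buyer) pairs ranked correctly; ties count one half."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: purchase flag and fitted purchase probability.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.01, 0.02, 0.05, 0.30, 0.20, 0.10, 0.60, 0.90]

area = auc(y_true, scores)
print(round(area, 3))
```

No cut-off appears anywhere in the computation, which is why this measure is not affected by the cut-off choice. The pairwise loop is quadratic, so a dataset of the size analysed here would call for a rank-based formulation instead.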

FUTURE RESEARCH DIRECTIONS This research focused on the effectiveness of computational data mining predictive models in estimating the probability of customer conversion. Particular attention has been paid to the classification decision tree in order to test its predictive power in forecasting accurate marketing performance within global organizations in today’s competitive landscape. Future research could test other computational predictive data mining models, such as hierarchical logistic regression based on Enter Methods, classification decision trees based on the CHAID algorithm, parametric models based on the Exponential Family of distributions, and graphical models known as expert systems or Bayesian networks, in order to detect the models best able to maximize the profitability of the marketing campaign. Future research could also analyse a hybrid model (e.g. neural networks and genetic algorithms) in order to build an accurate churn prediction model able to discover the main marketing drivers of customer churn. A framework to identify a loyalty index and an analysis of the causes of churn should also be provided. A text mining analysis combined with opinion mining could support the design of this new framework. Moreover, with new data, the proposed model can be enhanced to predict future sales using current marketing drivers. It could be interesting to develop a cluster analysis to partition potential customers based on their behavior, and a survival analysis to estimate how long customers will remain with the company. Finally, we could test the effectiveness of data mining techniques on a sample of small and medium-sized enterprises characterized by a completely different strategic management.

CONCLUSIONS Data mining needs to become an essential business process, incorporated into other processes including marketing, sales, customer support, product design, finance, engineering and inventory control. This virtuous cycle places the methodology in the larger context of business, shifting the focus away from the discovery mechanism to the actions based on the discovery. In addition, data mining methodology could be a strategic tool for global and international entrepreneurship in order to discover new business opportunities and maximize company profitability. This research focused on the effectiveness of computational data mining predictive models in estimating the probability of customer conversion. Particular attention has been paid to the classification decision tree in order to test its predictive power in forecasting accurate marketing performance within global organizations in today’s competitive landscape. Our research question - are innovative computational data mining models more efficient than traditional statistical models in forecasting marketing performance within global organizations? - is answered affirmatively. In fact, traditional statistical models based on a double moving average and exponential smoothing are inadequate for forecasting precise marketing performance within global organizations. First, the double moving average smooths the data, so its predicted values are less sensitive to actual changes, and the moving average is not always an effective trend indicator. Second, the predicted value always remains at the level of the past and cannot be expected to anticipate higher or lower volatility in the future. The exponential smoothing model, in turn, requires more complete historical data before prediction can start, and if seasonal factors greatly influence business sales, time series decomposition is more applicable than exponential smoothing.
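For reference, the two traditional techniques named above can be sketched in a few lines. This is an illustration under stated assumptions: the double moving average uses the common level-and-trend formulation (level = 2·MA1 − MA2, trend = 2/(k−1)·(MA1 − MA2)), the sales series is hypothetical, and the function names are introduced here only for the example.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level, used as the forecast for every future period."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def moving_average(series, window):
    """Trailing moving averages of the given window length."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]


def double_moving_average_forecast(series, window, horizon=1):
    """Forecast from a moving average of the moving average (level plus trend)."""
    ma1 = moving_average(series, window)
    ma2 = moving_average(ma1, window)
    level = 2 * ma1[-1] - ma2[-1]
    trend = 2 / (window - 1) * (ma1[-1] - ma2[-1])
    return level + trend * horizon


# Hypothetical sales series with a steady upward trend.
sales = [10, 12, 14, 16, 18, 20]

ses_forecast = simple_exponential_smoothing(sales, alpha=0.5)
dma_forecast = double_moving_average_forecast(sales, window=3)
print(ses_forecast, dma_forecast)
```

On this series simple exponential smoothing forecasts 18.0625, below the last observed value of 20, which shows how its forecast stays near past levels. Both functions take a single series as input and use only its own history.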
In our case, it would have been impossible to implement double moving average or exponential smoothing models because of the large number of variables collected in the dataset analyzed. In the current literature, few scientific articles discuss this business problem. The exponential smoothing model is the main approach used in many organizations to predict marketing performance, a practice that is inaccurate in this setting (Geng & Du, 2010; Hyndman et al., 2008). Against this background, the strategic importance of our research question emerges. In other words, many companies develop inaccurate marketing forecasts because they implement inadequate predictive techniques that are unable to manage huge amounts of data. Predictive data mining models, compared with traditional statistical models, are the key to solving business problems in the presence of enormous amounts of data (Geng & Du, 2010). More precisely, the main strengths of the classification decision tree are the following. First, classification trees are predictive rather than descriptive. Second, they classify the observations on the basis of all independent variables, supervised by the presence of the target variable. Third, in classification decision trees the segmentation is typically carried out using only the maximally independent variables. In contrast, the main weaknesses are the following. First, compared with traditional statistical models, classification decision tree models produce rules that are less explicit analytically, although easier to understand graphically. Second, tree models demand more computational resources because they do not require assumptions about the probability distribution of the target variable. Finally, the structure of these predictive models can change at any time, so it is impossible to generalize their structure to another context.
From this discussion emerges the use of sophisticated and computationally intensive analytical methods, which are expected to become even more commonplace with recent research breakthroughs in computational methods and their commercialization by leading vendors in global business today (Grossman et al., 2001). As a consequence, many authors have discussed the main differences between data mining and statistics. Sato (2000) observes that data mining analysis differs from statistical data analysis. For instance, statisticians use sample observations to study population parameters by estimation, testing and prediction, whilst data mining analysis is governed by the need to uncover emerging trends in a timely manner; statistical data analysis is related to historical facts and is based on observed data. On the other hand, in adopting a data mining methodology decision-makers should consider some limits such as:
• privacy and security issues;
• misuse of information; and
• cost of the tools.

People are often afraid that somebody may have access to their personal information and then use that information in an unethical way. Although companies have much information about us available online, they do not have sufficient security systems in place to protect the information. Unethical businesses may use the information obtained from data mining to take advantage of vulnerable people or discriminate against a certain group of people. In addition, the price of data mining software is very high because of its complexity in terms of the specific algorithms and models implemented. For this reason, firms sometimes cannot develop data mining analysis but merely statistical analysis.

In summary, our research suggests that the decision tree is an accurate prediction model able to forecast marketing performance in today’s competitive landscape. In particular, the classification decision tree provided a precise description of the main marketing activities that could increase the level of sales. “Affiliate Web Site” is the main driver of customer conversion. In addition, this model accurately discriminates a potential customer category that is very unlikely to buy the service proposed by the company. The ROC curve confirms that the classification decision tree is an effective predictive tool for forecasting precise marketing performance within global organizations. In conclusion, success in using data will transform organizations from reactive to proactive.

NOMENCLATURE
ANN stands for Artificial Neural Network;
AUC stands for Area Under the receiver operating Characteristic Curve;
CART stands for Classification and Regression Tree;
CHAID stands for CHI-Squared Automatic Interaction Detection;
CRM stands for Customer Relationship Management;
DM stands for Data Mining;
DT stands for Decision Tree;
EVP stands for Executive Vice President;
GAs stands for Genetic Algorithms;
GSP stands for Generalized Sequential Patterns;
ID3 stands for Iterative Dichotomiser 3;
PCC stands for Pearson Correlation Coefficient;
ROC Curve stands for Receiver Operating Characteristic Curve;
SVM stands for Support Vector Machine.

REFERENCES
Abbott, P.A., & Lee, S.M. (2006). Data mining and knowledge discovery. In V.K. Saba & K.A. McCormick (Eds.), Essentials of nursing informatics (4th ed.). New York, NY: McGraw-Hill Medical Pub. Division.
Aggarwal, C.C., & Yu, P.S. (2002). Finding localized associations in market basket data. IEEE Transactions on Knowledge and Data Engineering, 14, 51–62.
Ahlgren, P., Jarneving, B., & Rousseau, R. (2003). Requirement for a cocitation similarity measure, with special reference to Pearson’s correlation coefficient. Journal of the American Society for Information Science and Technology, 54(6), 550-560.
Ahmed, S.R. (2004). Applications of data mining in retail business. Information Technology: Coding and Computing, 2, 455–459.
Armin, E. M., & Babak, M. (2011). Genetic algorithm based on optimal load frequency control in two-area interconnected power system. Global Journal of Technology & Optimization, 2(1), 6-10.
Arndt, D., & Gersten, W. (2001). Data management in analytical customer relationship management. Data Mining for Marketing Applications Workshop at ECML/PKDD.
Asdemir, K., Yurtseven, O., & Yahya, M.A. (2008). An economic model of click fraud in publisher networks. International Journal of Electronic Commerce, 13, 61-90.
Baesens, B., Viaene, S., Van den Poel, D., Vanthienen, J., & Dedene, G. (2002). Bayesian neural network learning for repeat purchase modelling in direct marketing. European Journal of Operational Research, 138(1), 191-211.
Baum, C.F. (2006). An introduction to modern econometrics using Stata. Stata Press.
Belbay, N., Benbya, H., & Meissonier, R. (2007). An empirical investigation of the customer knowledge creation impact on NPD performance. In Proceedings of the 40th Hawaii International Conference on System Science.
Bergeron, B. (2001). Essentials of CRM: Customer relationship management for executives. New York, USA: John Wiley & Sons.
Berry, M.J.A., & Linoff, G.S. (2004). Data mining techniques: For marketing, sales, and customer relationship management (2nd ed.). Wiley.
Berry, M.J.A., & Linoff, G.S. (2011). Data mining techniques for marketing, sales, and customer relationship management. New York, USA: John Wiley & Sons, Inc.
Berson, A., Smith, S., & Thearling, K. (2000). Building data mining applications for CRM. New York, NY: McGraw-Hill.
Bloemer, J.M.M., Brijs, T., Vanhoof, K., & Swinnen, G. (2003). Comparing complete and partial classification for identifying customers at risk. International Journal of Research in Marketing, 20(2), 117-131.
Boulding, W., Staelin, R., Ehret, M., & Johnston, W.J. (2005). A customer relationship management roadmap: What is known, potential pitfalls, and where to go. Journal of Marketing, 69, 155-166.
Breiman, L., Friedman, J.H., Olshen, R.A., & Stone, C.J. (1984). Classification and regression trees. California: Wadsworth.
Brijs, T., Swinnen, G., Vanhoof, K., & Wets, G. (2004). Building an association rules framework to improve product assortment decisions. Data Mining and Knowledge Discovery, 8, 7–23.
Brondoni, S. M. (2008a). Overture the Market-Driven Management and Global Markets - 1. Symphonya. Emerging Issues in Management, 1, 1-13.
Brondoni, S. M. (2008b). Market-Driven Management, Competitive Space and Global Networks. Symphonya. Emerging Issues in Management, 1, 14-27.
Brondoni, S. M. (2008c). Overture the Market-Driven Management and Global Markets - 2. Symphonya. Emerging Issues in Management, 2, 1-12.

Buchnowska, D. (2011). Customer knowledge management models: Assessment and proposal. Springer-Verlag Berlin Heidelberg, 93, 25–38.
Buckinx, W., Moons, E., Poel, D.V.D., & Wets, G. (2004). Customer-adapted coupon targeting using feature selection. Expert Systems with Applications, 26, 509–518.
Cao, Y., & Gruca, T.S. (2005). Reducing adverse selection through customer relationship management. Journal of Marketing, 69, 219-229.
Caramanis, C., & Spathis, C. (2006). Auditee and audit firm characteristics as determinants of audit qualifications: Evidence from the Athens stock exchange. Managerial Auditing Journal, 21(9), 905-920.
Carrier, C.G., & Povel, O. (2003). Characterising data mining software. Intelligent Data Analysis, 7, 181–192.
Cavique, L. (2004). Graph-based structures for the market baskets analysis. Inv Op., 24(2), 233-246.
Chang, J., Yen, D., Young, D., & Ku, C. (2002). Critical issues in CRM adoption and implementation. International Journal of Service Technology and Measurement, 3(3), 311–324.
Chen, Y. L., Hsu, C. L., & Chou, S. C. (2003). Constructing a multi-valued and multilabeled decision tree. Expert Systems with Applications, 25, 199–209.
Chen, Y.L., Tang, K., Shen, R.J., & Hu, Y.H. (2005). Market basket analysis in a multiple store environment. Decision Support Systems, 40, 339–354.
Cheung, K.W., Kwok, J.T., Law, M.H., & Tsui, K.C. (2003). Mining customer product ratings for personalized marketing. Decision Support Systems, 35, 231–243.
Colombet, I., Ruelland, A., Chatellier, G., Gueyffier, F., Degoulet, P., & Jaulent, M.C. (2000). Models to predict cardiovascular risk: Comparison of CART, multilayer perceptron and logistic regression. Symposium Proceedings Archive, 156-160.
Corniani, M. (2002). Demand Bubble Management. Symphonya. Emerging Issues in Management (www.unimib.it/symphonya), Issue 1, 87-99.
Crone, S.V., Lessmann, S., & Stahlbock, R. (2006). The impact of preprocessing on data mining: An evaluation of classifier sensitivity in direct marketing. European Journal of Operational Research, 173, 781-800.
D’Aveni, R. (1994). Hypercompetition: Managing the dynamic of strategic maneuvering. New York, NY: Free Press.
Davis, L. (1991). Handbook of genetic algorithms. New York, NY: Van Nostrand Reinhold.
Day, G.S. (1994). The capabilities of market-driven organization. Journal of Marketing, 58(4), 37-52.
Deshpandé, R., Farley, J.U., & Webster, F. Jr (1993). Corporate culture, customer orientation, and innovativeness in Japanese firms: A quadrad analysis. Journal of Marketing, 57, 23-73.
Dowling, G. (2002). Customer relationship management: In B2C markets, often less is more. California Management Review, 44(3), 87-104.
Drew, J.H., Mani, D.R., Betz, A.L., & Datta, P. (2001). Targeting customers with statistical and data-mining techniques. Journal of Service Research, 3, 205–220.
Dwyer, R.R., Schur, P.H., & Oh, S. (1987). Developing Buyer-Seller Relations. Journal of Marketing, 51, 11-28.
ECCS (1999). CRM defining customer relationship marketing and management. http://www.eccs.uk.com/crm/crmdefinitions/define.asp.

Elamvazuthi, I., Vasant, P., & Ganesan, T. (2010). Fuzzy linear programming using modified logistic membership function. International Review of Automatic Control (IREACO), 3(4), 370-377.
Elamvazuthi, I., Vasant, P., & Ganesan, T. (2012). Integration of Fuzzy Logic techniques into DSS for profitability quantification in a manufacturing environment. In M. Khan, & A. Ansari (Eds.), Handbook of research on industrial informatics and manufacturing intelligence: Innovations and solutions (pp. 171-192). doi:10.4018/978-1-4666-0294-6.ch007.

Etemadi, H., Rostamy, A.A.A., & Dehkordi, H.F. (2009). A genetic programming model for bankruptcy prediction: Empirical evidence from Iran. Expert System with Applications, 36, 3199-3207. Etzion, O., Fisher, A., & Wasserkrug, S. (2005). E-CLV: A modelling approach for customer lifetime evaluation in e-commerce domains, with an application and case study for online auction. Information Systems Frontiers, 7, 421–434. Feng, F. (2012). Research of customer relationship management solutions based on data mining. Information Engineering and Applications, 154, 47–52. Feng, X., & Yuan, G. (2011). Optimizing two-stage fuzzy multi-product multi-period production planning problem. Information, 14(6), 1879-1893. Fernandez, G. C. J. (2007). Effects of multicollinearity in all possible mixed model selection. PharamaSUG conference, statistics and pharmacokinetics. Colorado: Denver. Figini, S., & Giudici, P. (2009). Applied Data Mining for Business and Industry. New York, USA: John Wiley & Sons, Inc. Gaby, P, Uriel, I., Inessa, A., & Gad, R. (2010). GA for the resource sharing and scheduling problem. Global Journal on Technology & Optimization, 1, 84-87. Galvão, N.D., & Marin, H. (2009). Data mining: A literature review. Acta Paul Enferm, 22(5), 686-690. Ganesan, T., Vasant, P., & Elamvazuthi, I. (2011). Hybrid Neuro-Genetic Programming approach for optimizing operations in 3-D land seismic surveys. Mathematics and Computer Modeling, 54, 2913-2922. Ganesan, T., Vasant, P., & Elamvazuthi, I. (2012). Hybrid PSO approach for solving nonconvex optimization problems. Archives of Control Sciences, 22(1), 5-23. Gehrke, J., Ramakrishnan, R., & Loh, W.Y. (1999). BOAT-optimistic decision tree construction, Proceedings ACM SIGMOD International Conference Management of Data. Philadelphia, PA, 169–180. Geng, Y., & Du, X. (2010). The Research of Data Mining Based Sales Forecast. IEEE. Giudici, P. (2010). Data Mining. Milano, Italy: McGraw-Hill. Giudici, P., & Passerone, G. (2002). 
Data mining of association structures to model consumer behaviour. Computational Statistics and Data Analysis, 38, 533–541. Goldberg, D. (1989). Genetic algorithms in search, optimization, and machine learning. Reading, MA: Addison-Wesley. Gordini, N. (2010). Market-driven management: A critical literature review. Symphonya. Emerging Issues in Management (www.unimib.it/symphonya), Issue 2, 95-107. Gordini, N. (2012). Market-driven management: From the Japanese experience to the European Schools. In S. M. Brondoni (Ed.), Market-Driven Management and Corporate Growth (pp. 95-115). Torino: Giappichelli. Gordini, N. (2013). Genetic algorithms for small enterprises default prediction: Empirical evidence from Italy. In P. Vasant (Ed.), Soft computing intelligent algorithms in engineering, management, and Technology. IGI-Global. Grossman, R.L., Kamath, C., Kegelmeyer, P., Kumar, V., & Namburu, R. (2001). Data Mining for scientific and engineering applications. London, UK: Springer-Verlag. Hadden, J., Tiwari, A., Roy, R., & Ruta, D. (2005). Computer assisted customer churn management: State-of-the-art and future trends. Computers & Operations Research, 34, 2902-2917. Han, J., & Kamber, M. (2006). Data mining: Concepts and techniques (2nd ed.). Amsterdam; Boston: Elsevier: Morgan Kaufmann. Hand, D. (1997). Construction and Assessment of Classification Rules. New York, USA: John Wiley & Sons, Ltd. Hand, D.J., Mannila, H., & Smyth, P. (2001). Principles of Data Mining. Cambridge, MA: MIT Press. Hanley, J.A., & McNeil, B.J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology, 143, 29-36.

Hansotia, B. (2002). Gearing up for CRM: Antecedents to successful implementation. Journal of Database Marketing, 10(2), 121-132. He, Z., Xu, X., Huang, J.Z., & Deng, S. (2004). Mining class outliers: Concepts, algorithms and applications in CRM. Expert Systems with Applications, 27, 681–697. Holland, J.H. (1975). Adaptation in natural and artificial systems. Ann Arbor, MI: University of Michigan Press. Hyndman, R.J., Koehler, A.B., Ord, J.K., & Snyder, D.S. (2008). Forecasting with Exponential Smoothing. Springer. Injazz, D., & Karen, P. (2004). Understanding customer relationship management (CRM). People, process and technology. Available from: http://www.emeraldinsight.com/1463-7154.htm. Jaworski, B.J., & Kohli, A.K. (1993). Market orientation: Antecedents and consequences. Journal of Marketing, 57, 53-70. Jayachandran, S., Sharma, S., Kaufman, P., & Raman, P. (2005). The role of relational information processes and technology use in customer relationship management. Journal of Marketing, 69, 177-92. Jiang, T., & Tuzhilin, A. (2006). Segmenting customers from population to individuals: Does 1-to-1 keep your customers forever. IEEE Transactions on Knowledge and Data Engineering, 18, 1297–1311. Jiao, J.R., Zhang, Y., & Helander, M. (2006). A Kansei mining system for affective design. Expert Systems with Applications, 30, 658–673. Judge, G.G., Hill, R.C., Griffiths, W.E., Lutkepohl, H., & Lee, T. (1987). Theory and Practice of Econometrics (2nd ed.). New York, NY: Wiley. Kass, G.V. (1980). An exploratory technique for investigating large quantities of categorical data. Applied Statistics, 29(2), 119-127. Khanna, S. (2001). Measuring the CRM ROI: Show them benefits. Available at http://www.crm-forum.com. Kim, J.K., Song, H.S., Kim, T.S., & Kim, H.K. (2005). Detecting the change of customer behavior based on decision tree analysis. Expert Systems with Applications, 22, 193–205. Kim, Y. H., & Moon, B. R. (2006). Multicampaign assignment problem. 
IEEE Transactions on Knowledge and Data Engineering, 18, 405–414. Kim, Y. S. (2006). Toward a successful CRM: Variable selection, sampling, and ensemble. Decision Support Systems, 41, 542–553. Kincaid, J.W. (2003). Customer relationship management: getting it right! Upper Saddle River, NJ: Prentice-Hall. Kohli A.K, & Jaworski, B. (1990). Market orientation: The construct, research propositions, and managerial implications. Journal of Marketing, 54, 1-18. Kracklauer, A.H., Mills, D.Q., & Seifert, D. (2004). Customer management as the origin of collaborative customer relationship management. Collaborative Customer Relationship Management - taking CRM to the next level, 3–6. Kubat, M., Hafez, A., Raghavan, V.V., Lekkala, J.R., & Chen, W. K. (2003). Item set trees for targeted association querying. IEEE Transaction on Knowledge and Data Engineering, 15, 1522–1534. Langley, P., & Simon, H. A. (1995). Applications of machine learning and rule induction. Communication of the ACM, 38, 55–64. Lattin, J., Douglas, C., and Green, P. (2003). Analyzing Multivariate Data. Duxbury Applied Series, Hardcover. Lau, H.C.W., Wong, C.W.Y., Hui, I.K., & Pun, K.F. (2003). Design and implementation of an integrated knowledge system. Knowledge-Based Systems, 16, 69–76. Lehmann, D. R. (2004). Metrics for making marketing matter. Journal of Marketing, 68, 73-75. Lejeune, M.A.P.M. (2001). Measuring the impact of data mining on churn management. Internet Research: Electronic Networking Applications and Policy, 11, 375–387.

Leng, K., & Chen, X. (2012). A genetic algorithm approach for TOC-based supply chain coordination. Applied Mathematics and Information Sciences, 6(3), 767-774. Leow, M., & Mues, C. (2012). Predicting loss given default (LGD) for residential mortgage loans: A two-stage model and empirical evidence for UK bank data. International Journal of Forecasting, 28, 183-195. Levitt, T. (1960). Marketing myopia. Harvard Business Review, 38, 45-60. Levitt, T. (1969). The marketing mode: Pathways to corporate growth. New York, NY: McGraw-Hill. Lewis, M. (2005). Incorporating strategic consumer behavior into customer valuation. Journal of Marketing, 69, 230-38. Liao, S. H., & Chen, Y. J. (2004). Mining customer knowledge for electronic catalog marketing. Expert Systems With Applications, 27, 521–532. Liao, S.H., Chu, P.H., & Hsiao, P.-Y. (2012). Data mining techniques and applications. A decade review from 2000 to 2011. Expert Systems with Applications, 39, 11303-11311. Ling, R., & Yen, D.C. (2001). Customer relationship management: An analysis framework and implementation strategies. Journal of Computer Information Systems, 41, 82–97. Madronero, M.D., Peidro, D., & Vasant, P. (2010). Vendor selection problem by using an interactive fuzzy multi-objective approach with modified s-curve membership functions. Computers and Mathematics with Applications, 60, 1038-1048. McCulloch, W.S., & Pitts, W.A. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5, 115–133. Mithas, S., Krishnan, M.S., & Fornell, C. (2005). Why do customer relationship management applications affect customer satisfaction? Journal of Marketing, 69, 201-209. Mitra, S., Pal, S.K., & Mitra, P. (2002). Data mining in soft computing framework: A survey. IEEE Transactions on Neural Networks, 13, 3–14. Montgomery, D.C., & Peck, E.A. (1992). Introduction to linear regression analysis. New York, NY: John Wiley & Sons. Moody, C. (2009). Basic Econometric with Stata. 
(http://www.scribd.com/doc/54549213/Stata-Manual2009). Mozer, M.C., Wolneiewicz, R., Grimes, D.B., Johnson, E., & Kaushansky, H. (2000). Predicting subscriber dissatisfaction and improving retention in the wireless telecommunications industry. IEEE Transactions on Neural Networks, 11, 690-696. Mulhern, F. (1999). Customer profitability analysis: Measurement, concentration, and research directions. Journal of Interactive Marketing, 13, 25-40. Narver, J.C., & Slater, S.F. (1990). The effect of a market orientation on business profitability. Journal of Marketing, 54, 20-35. Neter, J., Kutner, M.H., Nachtsheim, C.J., & Wasserman, W. (1996). Applied Linear Regression Models. USA: Chicago. Ngai, E.W.T. (2005). Customer relationship management research (1992–2002): An academic literature review and classification. Marketing Intelligence & Planning, 23, 582–605. Ngai, E.W.T., Xiu, L., & Chau, D.C.K. (2009). Application of data mining techniques in customer relationship management: A literature review and classification. Expert Systems with Applications, 36, 2592-2602. Niraj, R., Gupta, M., & Narasimhan, C. (2001). Customer Profitability in a supply chain. Journal of Marketing, 65, 1-16. Panagiotis, I., Soukakos, Nikolaos, B., Georgopoulos, & Victoria, P.E. (2007). Two interrelated frameworks proposed for mapping and performance management of customer relationship management strategies. International Journal of Knowledge and Learning, 3(2-3), 299–315. Parvatiyar, A., & Sheth, J. N. (2001). Customer relationship management: Emerging practice, process, and

discipline. Journal of Economic & Social Research, 3, 1–34. Payne, A., & Frow, P. (2005). A strategic framework for customer relationship management. Journal of Marketing, 69, 167-179. Pendharkar, P.C., Subramanian, G.H., & Roger, J.A. (2005). A probabilistic model for predicting software development effort, IEEE Transactions on Software Engineering, 31(7), 615-624. Pinkey, C., Deep, K., & Pant, M. (2011). Optimizing CNC Turning Process Using Real Coded Genetic Algorithm and Differential Evolution. Global Journal Technology and Optimization, 2(2). Prinzie, A., & Poel, D.V.D. (2005). Constrained optimization of data-mining problems to improve model performance: A direct-marketing application. Expert Systems With Applications, 29, 630–640. Prinzie, A., & Poel, D.V.D. (2006). Investigating purchasing-sequence patterns for financial services using Markov, MTD and MTDg models. European Journal of Operational Research, 170, 710–734. Provas Kumar, R., & Dharmadas, M. (2012). Optimal reactive power dispatch using quasi-oppositional biogeography-based optimization. International Journal of Energy Optimization and Engineering, 1(4), 3855. Quinlan, J.R. (1987). Decision trees as probabilistic classifiers. Proceedings of 4th International Workshop Machine Learning, Irvine, CA, 31–37. Quinlan, J.R. (1993). C4.5: Programs for machine learning. San Mateo: Morgan Kaufmann. Rajshekhar (Raj) G.J., Martin C.L., & Young, R.B. (2006), Marketing research, market orientation and customer relationship management: a framework and implications for service providers, Journal of Services Marketing, 20(1), 12 – 23. Ravi Kumar, P., & Ravi, V. (2007). Bankruptcy prediction in banks and firms via statistical and intelligent techniques – a review. European Journal of Operational Research, 180, 1-28. Ravi, V., Kurniawan, H., Thai, P.N.K., & Ravi Kumar, P. (2008). Soft computing system for bank preformance prediction. Applied Soft Computing, 8, 305-315. Reinartz, W., Krafft, M., & Hoyer, W.D. 
(2004). The customer relationship management process: Its measurement and impact on performance. Journal of Marketing Research, XLI, 293-305. Reinartz, W.J., & Kumar, V. (2003), The impact of customer relationship characteristics on profitable lifetime duration. Journal of Marketing, 67, 77–99. Rigby, D., & Bilodeau, B. (2009). Management Tools and Trends 2009, research report, Bain & Company, (accessed November 11, 2010), [available at http://www.bain.com/bainweb/PDFs/cms/Public/Management Tools_2009.pdf]. Rogers, M. (2005). Customer strategy: Observations from the Trenches. Journal of Marketing, 69, 262-63. Rosset, S., Neumann, E., Eick, U., & Vatnik, N. (2003). Customer lifetime value models for decision support. Data Mining and Knowledge Discovery, 7, 321–339. Rud, O.P. (2000). Data Mining Cook Book. New York, NY: John Wiley and Sons. Rust, R. T., Ambler, T., Carpenter, G.S., Kumar, V., & Srivastava, R.K. (2004). Measuring marketing productivity: current knowledge and future directions. Journal of Marketing, 68, 76-89. Ryals, L. (2005). Making customer relationship management work: The measurement and profitable management of customer relationships. Journal of Marketing, 69, 252-61. Rygielski, C., Wang, J.C., & Yen, D.C. (2002). Data mining techniques for customer relationship management, Technology in Society. 24, 483-502. Sato, Y. (2000). Perspective on data mining from statistical viewpoints. Knowledge Discovery and Data Mining. In: Current issues and New Applications, 4th Pacific-Asia Conference, PAKDD, Kyoto, Japan. Scott, D. (2001). Understanding Organizational Evolution: Its Impact on Management and Performance. Quorum Books. Shaw, M.J., Subramaniam, C., Tan, G.W., & Welge, M.E. (2001). Knowledge management and data mining for marketing. Decision Support Systems, 31, 127–137.

Shin, K.S., & Lee, Y.J. (2002). Genetic algorithm application in bankruptcy prediction modeling. Expert Systems with Applications, 1-8. Skinner, S.J. (1990). Marketing. Boston: Houghton Mifflin. Slater, S., & Narver, J.C. (1994). Market orientation, customer value, and superior performance. Business Horizons, 37(2), 22-28. Slater, S.F., & Narver, J.C. (1998). Customer-led and market-oriented: Let’s not confuse the two. Strategic Management Journal, 19(10), 1001-1006. Slater, S.F., & Narver, J.C. (1999). Market-oriented is more than being customer-led. Strategic Management Journal, 20(12), 1165-1168. Song, H.S., Kim, J.K., Cho, Y.B., & Kim, S.H. (2004). A personalized defection detection and prevention procedure based on the self-organizing map and association rule mining: Applied to online game site. Artificial Intelligence Review, 21, 161–184. Spangler, W.E., May, J.H., & Vargas, L.G. (1999). Choosing data-mining methods for multiple classification: Representational and performance measurement implications for decision support. Journal of Management Information Systems, 16(1), 37-62. Srinivasan, R., & Moorman, C. (2005). Strategic firm commitments and rewards for customer relationship management in online retailing. Journal of Marketing, 69, 193-200. Srivastava, R.K., Shervani, T., & Fahey, L. (1998). Market-based assets and shareholder value: A framework for analysis. Journal of Marketing, 62, 2-18. Stone, M., & Woodcock, N. (2001). Defining CRM and assessing its quality. In B. Foss & M. Stone (Eds.), Successful Customer Relationship Marketing. London, UK: Kogan Page. Studenmund, A.H. (2006). Using econometrics: A practical guide (5th ed.). New York, NY: Addison-Wesley. Su, C.T., Hsu, H.H., & Tsai, C.H. (2002). Knowledge mining from trained neural networks. 
Journal of Computer Information Systems, 42, 61–70. Svancara, J., Kralova, Z., & Blaho, M. (2012). Optimization of HMLV manufacturing systems using genetic algorithm and simulation. International Review on Modelling and Simulations, 5(1), 482-488. Sweet, J.A. (1988). Measuring the accuracy of diagnostic system. Science, 240, 1285–1293. Swift, R.S. (2001). Accelerating customer relationships: Using CRM and relationship technologies. Upper Saddle River, NJ: Prentice Hall PTR. Teo, T.S.H., Devadoss, P., & Pan, S.L. (2006). Towards a holistic perspective of customer relationship management implementation: A case study of the housing and development board, Singapore. Decision Support Systems, 42, 1613–1627. Thearling, K. (1998). Data Mining and privacy: A conflict in the making? www.thearling.com. Thomas, J.S., & Sullivan, U.Y. (2005). Managing marketing communications with multichannel customers. Journal of Marketing, 69, 239-51. Turban, E., Aronson, J.E., Liang, T.P., & Sharda, R. (2007). Decision support and business intelligence systems (8th ed.). Pearson Education. Van den Poel, D. (2003). Predicting mail-order repeat buying. Which variables matter? Review of Business and Economics, XLVIII(3), 371-404. Varetto, F. (1998). Genetic algorithms applications in the analysis of insolvency risk. Journal of Banking & Finance, 22, 1421-1439. Vargo, S.L., & Lusch, R.R. (2004). Evolving to a new dominant logic for marketing. Journal of Marketing, 68, 1-17. Vasant, P. (2006). Fuzzy decision making of profit function in production planning using S-curve

membership function. Computers and Industrial Engineering, 51(4), 715-725. Vasant, P. (2012). Solving Fuzzy Optimization Problems of Uncertain Technological Coefficients with Genetic Algorithms and Hybrid Genetic Algorithms Pattern Search Approaches. In P. Vasant, N. Barsoum, & J. Webb (Eds.), Innovation in Power, Control, and Optimization: Emerging Energy Technologies (pp. 344-368). doi:10.4018/978-1-61350-138-2. Vasant, P. (2013). Meta-heuristics optimization algorithms in engineering, business, economics, and finance, IGI-Global, doi:10.4018/978-1-4666-2086-5. Vasant, P., Elamvazuthi , I., & Webb, J.F. (2010). Fuzzy technique for optimization of objective function with uncertain resource variables and technological coefficients. International Journal of Modeling, Simulation, and Scientific Computing, 1(3), 349-367. Vasant, P., Ganesan, T., Elamvazuthi, I., & Webb, J.F. (2011). Fuzzy linear programming for the production planning: The case of textile firm. International Review on Modelling and Simulations, 4(2), 961-970. Veglio, V. (2013). The strategic importance of data mining analysis for customer-centric marketing strategies. In Kaufmann, H.R., & Panni, M.F.A.H. (eds), Customer Centric Marketing Strategies: Tools for building organizational performance. IGI Global. Wang, K., Zhou, S., Yang, Q., & Yeung, J. M. S. (2005). Mining customer value: From association rules to direct marketing. Data Mining and Knowledge Discovery, 11, 57–79. Woo, J.Y., Bae, S.M., & Park, S.C. (2005). Visualization method for customer targeting using customer map. Expert Systems with Applications, 28, 763–772. Xu, Y., Yen, D.C., Lin, B., & Chou, D.C. (2002). Adopting customer relationship management technology. Industrial Management & Data Systems, 102(8/9), 442-452. Zeithaml, V.A., Rust, R.T., & Lemon, K.N. (2001). The customer pyramid: Creating and serving profitable customers. California Management Review, 43(4), 118-42. 
Zhu, D., Porter, A., Cunningham, S., Carlisie, J., & Nayak, A.A. (1999). Process for mining science & technology documents databases, illustrated for the case of “knowledge discovery and data mining”. Ci Inf., 28(1), 7-14.

ADDITIONAL READING Ahmed, S. R. (2004). Applications of data mining in retail business. Information Technology: Coding and Computing, 2, 455-459. Ahn, JH., Han, SP., & Lee, YS. (2006). Customer churn analysis: churn determinants and mediation effects of partial defection in the Korean mobile telecommunications service industry. Telecommunications Policy, 30(10-11), 552-568. Ahn, WH., Kim, WJ., & Par, D. (2004). Content-aware cooperative caching for cluster-based web servers. Journal of Systems and Software, 69(1-2), 75-86. Berger, P.D., Bolton, R.N., Bowman, D., Briggs, E., Kumar, V., Parasuraman, A., & Creed, T. (2002). Marketing actions and the value of customer assets. Journal of Service Research, 5, 39-54. Bhattacharya, A., & Vasant, P. (2007). Soft-sensing of level of satisfaction in TOC product-mix decision heuristic using robust fuzzy-LP. European Journal of Operational Research, 177(1), pp. 55-70. Bhattacharya, A., Abraham, A., Vasant, P., & Grosan, C. (2007). Evolutionary artificial neural network for selecting flexible manufacturing systems under disparate level-of-satisfaction of decision maker. International Journal of Innovative Computing, Information and Control, 3(1), 131-140. Cebi, S., Kahraman, C., & Kaya, I. (2012). Soft computing and computational intelligent techniques in the evaluation of emerging energy technologies. In P. Vasant, N. Barsoum, & J. Webb (Eds.), Innovation in power, control, and optimization: Emerging energy technologies (pp. 164-197). Hershey, PA: Engineering Science Reference. doi:10.4018/978-1-61350-138-2.ch005. Chang, C.T. (2010). An approximation approach for representing S-shaped membership functions. IEEE Transactions on Fuzzy Systems, 18(2), 412-424.

Chen, G.Q., Wei, Q., Liu, D., & Wets, G. (2002). Simple association rules (SAR) and the SAR-based rule discovery. Computers & Industrial Engineering, 43(4), 721-733. Chen, W.C., Hsu, C.C., & Hsu, J.N. (2011). Optimal selection of potential customer range through the union sequential pattern by using a response model. Expert Systems with Application, 38, 7451-7461. Coviello, N.E., Brodie, R.J., Danaher, P.J., & Johnston, W.J. (2002). How firms relate to their markets: An empirical examination of contemporary marketing practices. Journal of Marketing, 66, 33-47. Dieu, V. N., & Ongsakul, W. (2012). Hopfield lagrange network for economic load dispatch. In P. Vasant, N. Barsoum, & J. Webb (Eds.), Innovation in power, control, and optimization: Emerging energy technologies (pp. 57-94). Hershey, PA: Engineering Science Reference. doi:10.4018/978-1-61350-1382.ch002. Dostál, P. (2013). The use of soft computing for optimization in business, economics, and finance. In P. Vasant (Ed.), Meta-heuristics optimization algorithms in engineering, business, economics, and finance (pp. 41-86). Hershey, PA: Information Science Reference. doi:10.4018/978-1-4666-2086-5.ch002. Elamvazuthi, I., Ganesan, T., & Vasant, P. (2011). A comparative study of HNN and Hybrid HNN-PSO techniques in the optimization of distributed generation power systems. In Proceedings of the 2011 International Conference on Advanced Computer Science and Information Systems (ICACSIS’11), 195-199, Jakarta, Indonesia. Elamvazuthi, I., Ganesan, T., Vasant, P., & Webb, J.F. (2009). Application of a fuzzy programming technique to production planning in the textile industry. International Journal of Computer Science and Information Security, 6(3), 238-243. Feinberg, F. M., Krishna, A., & Zhang, Z.J. (2002). Do we care what others get? A behaviorist approach to targeted promotions. Journal of Marketing Research, 39, 277-92. Ganesan, T., Vasant, P., & Elamvazuthi, I. (2011). 
Solving engineering optimization problems with KKT Hopfield neural networks. International Review of Mechanical Engineering, 7(7), 1333-1339. Ganesan, T., Vasant, P., & Elamvazuthi, I. (2012). Hybrid neuro-swarm optimization approach for design of distributed generation power system. Neural Computing and Applications, DOI:10.1007/s00521-012-0976-4. Grönroos, C. (1994). Quo vadis, marketing? Toward a relationship marketing paradigm. Journal of Marketing Management, 10(5), 347-60. Hasuike, T., & Ishii, H. (2009). Product mix problems considering several probabilistic conditions and flexibility of constraints. Computers and Industrial Engineering, 56(3), 918-936. Johnson, M.D., & Selnes, F. (2004). Customer portfolio management: Toward a dynamic theory of exchange relationships. Journal of Marketing, 68, 1-17. Khan, M.A., & Ansari, A.Q. (Eds.) (2012). Handbook of research on industrial informatics and manufacturing intelligence: Innovations and solutions (pp. 104-131). doi:10.4018/978-1-4666-0294-6.ch005. Leng, K., & Wang, Y. (2012). Research on capacity allocation in a supply chain system based on TOC. Lecture Notes in Electrical Engineering, 140, 517-524. Liu, S.T. (2011). Solution of fuzzy integrated production and marketing planning based on extension principle. Computers and Industrial Engineering, 63(4), 1201-1208. Loveman, G.W. (1998). Employee satisfaction, customer loyalty and financial performance. Journal of Service Research, 1, 18–31. Peacock, P.R. (1998). Data Mining in Marketing: Part 1. Marketing Management, 7, 9–18. Peidro, D., & Vasant, P. (2011). Transportation planning with modified s-curve membership functions using an interactive fuzzy multi-objective approach. Applied Soft Computing, 11, 2656-2663. Pettit, R. (2002). The state of CRM: Addressing efficiencies and the achilles’ heel of CRM. Business Intelligence Advisory Service Executive Report, 2 (3). Purnomo, H. D., & Wee, H. (2013). 
Soccer game optimization: An innovative integration of evolutionary algorithm and swarm intelligence algorithm. In P. Vasant (Ed.), Meta-heuristics optimization algorithms in engineering, business, economics, and finance (pp. 386-420). Hershey, PA: Information Science Reference.

doi:10.4018/978-1-4666-2086-5.ch013. Rubin, M. (1997). Creating customer-oriented companies. Prism, 4(4), 5–27. Sadrnia, A., Nezamabadi-Pour, H., Nikbakht, M., & Ismail, N. (2013). A gravitational search algorithm approach for optimizing closed-loop logistics network. In P. Vasant (Ed.), Meta-heuristics optimization algorithms in engineering, business, economics, and finance (pp. 616-638). Hershey, PA: Information Science Reference. doi:10.4018/978-1-4666-2086-5.ch020. Senvar, O., Turanoglu, E., & Kahraman, C. (2013). Usage of metaheuristics in engineering: a literature review. In P. Vasant (Ed.), Meta-heuristics optimization algorithms in engineering, business, economics, and finance (pp. 484-528). Hershey, PA: Information Science Reference. doi:10.4018/978-1-4666-2086-5.ch016. Subramaniam, L.V., Faruquie, T.A., Ikbal, S., Godbole, S., & Mohania, M.K. (2009). Business Intelligence from Voice of Customer. IEEE International Conference on Data Engineering. Svancara, J., Kralova, Z., & Blaho, M. (2012). Optimization of HMLV manufacturing systems using genetic algorithm and simulation. International Review on Modelling and Simulations, 5(1), 482-488. Van den Poel, D., & Larivière, B. (2004). Customer attrition analysis for financial services using proportional hazard models. European Journal of Operational Research, 157(1), 196-217. Varadarajan, R., & Jayachandran, S. (1999). Marketing strategy: An assessment of the State of the field and outlook. Journal of the Academy of Marketing Science, 27(2), 120-43. Vasant, P. (2010). Hybrid simulated annealing and genetic algorithms for industrial production management problems. International Journal of Computational Methods, 7(2), 279-297. Vasant, P. (2012). A novel hybrid genetic algorithms and pattern search techniques for industrial production planning. International Journal of Modeling, Simulation, and Scientific Computing, DOI: 10.1142/S1793962312500201. Vasant, P., & Barsoum, N. (2009). 
Hybrid genetic algorithms and line search method for industrial production planning with non-linear fitness function. Engineering Applications of Artificial Intelligence, 22(4-5), 767-777. Vasant, P., & Barsoum, N. (2010). Hybrid pattern search and simulated annealing for fuzzy production planning problems. Computers and Mathematics with Applications, 60(4), 1058-1067. Vasant, P., Bhattacharya, A., Sarkar, B., & Mukherjee, S.K. (2007). Detection of level of satisfaction and fuzziness patterns for MCDM model with modified flexible S-curve MF. Applied Soft Computing Journal, 7(3), 1044-1054. Vasant, P., Ganesan, T., Elamvazuthi, I. (2012). Improved Tabu Search Recursive Fuzzy method for crude oil industry. International Journal of Modeling, Simulation, and Scientific Computing, 3(1). Vasant, P., Nagarajan, R., & Yaacob, S. (2004). Decision making in industrial production planning using fuzzy linear programming. IMA Journal of Management Mathematics, 15(1), 53-65. Verhoef, P.C. (2003). Understandingthe Effect of customer relationship management efforts on customer retention and customer share development. Journal of Marketing, 67, 30-45. Viaene, S., Baesens, B., Van Gestel, T., Suykens, J.A.K., Van den Poel, D., Vanthienen, J., De Moor, B., & Dedene, G. (2001). Knowledge discovery in a direct marketing case using least square support vector machines. International Journal of Intelligent Systems, 16(9), 1023-1036. Vo, D.N., & Schegner, P. (2013). An improved particle swarm optimization for optimal power flow. In Vasant, P. (Ed.), Meta-heuristics optimization algorithms in engineering, business, economics, and finance (pp. 1-40). Hershey, PA: Information Science Reference. doi:10.4018/978-1-4666-2086-5.ch001. Whiting, R. (2001). CRM's realities don't match hype. Information Week, 79-80. Wright, P. (1986). Schemer schema: Consumers' intuitive theories about marketers' influence tactics. In Advances in Consumer Research, 13, Richard Lutz, ed. 
Provo, UT: Association for Consumer Research, 1-3.

Zanjani, M.S., Rouzbehani, R., & Dabbagh, H. (2008). Proposing conceptual model of customer knowledge management: A study of CKM tools in British dotcoms. World Academy of Science, Engineering and Technology, 38, 303-307. Zhang, G., Hu, M.Y., Patuwo, B.E., & Indro, D.C. (1999). Artificial neural networks in bankruptcy prediction: General framework and cross validation analysis. European Journal of Operational Research, 116, 16-32.

KEY TERMS AND DEFINITIONS

Keywords: Customer relationship management, data mining, decision tree, customer purchasing behavior, market-oriented companies, globalization, customer value.

Definitions:

Customer Relationship Management is a coherent and complete set of methodologies for managing relationships with current and potential customers.

Data Mining is the process that uses soft computing techniques to extract and identify useful information and subsequently gain knowledge from large databases.

Decision Trees are hierarchical collections of rules that describe how to divide a large collection of records into successively smaller groups of records. With each successive division, the members of the resulting segments become more and more similar to one another with respect to the dependent variable.

Customer Purchasing Behavior is the set of factors and beliefs that lead a customer to make a purchase.

Market-oriented companies are companies committed to understanding the expressed and latent needs of their customers and of the other players in the market better and earlier than their competitors.

Globalization is a process that, since the 1980s, has drawn new competitive boundaries and rules.

Customer Value is a process that puts the customer’s interest first, while not excluding those of other stakeholders, in order to create a competitive advantage for the firm.
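
The decision tree definition above can be illustrated with a minimal sketch of a single division step. It assumes Gini impurity as the measure of how similar the records in a group are with respect to the dependent variable; this is one common split criterion and is not necessarily the one used by the model in this chapter. The field names (visits, bought) and the six customer records are hypothetical toy data invented for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means all labels agree)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(records, feature, target):
    """Find the threshold on one numeric feature that minimises the
    size-weighted Gini impurity of the two resulting groups."""
    values = sorted({r[feature] for r in records})
    # Start from the impurity of the undivided group (no split).
    best = (None, gini([r[target] for r in records]))
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2  # midpoint between neighbouring values
        left = [r[target] for r in records if r[feature] <= threshold]
        right = [r[target] for r in records if r[feature] > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(records)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


# Hypothetical toy records: number of store visits and whether a purchase was made.
customers = [
    {"visits": 1, "bought": "no"},
    {"visits": 2, "bought": "no"},
    {"visits": 3, "bought": "no"},
    {"visits": 6, "bought": "yes"},
    {"visits": 7, "bought": "yes"},
    {"visits": 9, "bought": "yes"},
]

parent_impurity = gini([c["bought"] for c in customers])
threshold, child_impurity = best_split(customers, "visits", "bought")
print(parent_impurity, threshold, child_impurity)
```

On this toy data the undivided group is evenly mixed, while the rule "visits <= 4.5" separates it into two groups whose members all share the same outcome. A full tree would apply the same step recursively to each resulting group, which is the "successively smaller groups" in the definition.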

APPENDIX

Graphical Abstract

[Figure: graphical abstract with the boxes Expressed and Latent Needs of Customers, Customer Purchasing Prediction Model, Customer Relationship Management, Data Mining, and Decision Tree.]