customer insights from transactional database: database marketing case

IADIS European Conference Data Mining 2008

CUSTOMER INSIGHTS FROM TRANSACTIONAL DATABASE: DATABASE MARKETING CASE

Filipe Pinto, Alzira Ascensão Marques

School of Technology and Management, Polytechnic Institute of Leiria, Portugal

Manuel Filipe Santos

Department of Information Systems, University of Minho, Portugal

ABSTRACT
Marketing databases are currently one of the most important resources in any marketing department. To meet their customer knowledge needs, many departments have developed their own database repository systems. Despite their immediate usefulness, as these databases grew this approach proved insufficient and created other barriers to the customer knowledge acquisition process, such as data quality and data heterogeneity problems. This paper focuses on one relationship marketing program that was conceived to be supported by a huge marketing database. Despite its size, the database has many structural problems that conditioned the initial approach. During this work, a knowledge discovery in databases process using data mining algorithms was proposed and developed in order to answer marketers’ questions: which customers use a promotional voucher? What are their main characteristics? The results show promise in terms of the predictive accuracy of self-organizing maps integrated with the C5 decision tree algorithm.

KEYWORDS
Data mining, Database Marketing, Marketing databases, Knowledge.

1. INTRODUCTION
Contemporary marketing departments rely on information technologies to support their marketing research activities. Marketing databases (MD) in particular represent one of the most important assets for marketing practitioners. With advances in information and communication technologies, organizations can effectively obtain and store transactional and demographic data on individual customers at reasonable cost [1]. The challenge now is how to extract important knowledge from these vast databases in order to gain a competitive advantage [2]. The emphasis on customer relationship management makes the marketing function an ideal application area to benefit from the use of Data Mining (DM) tools for decision support. Through DM, organizations can identify valuable customers, predict future behaviors, and make proactive, knowledge-driven decisions. This includes understanding customers’ preferences through facts and customers’ behavior through the analysis of their transaction data. Much research has been done in this direction, and clustering transactions to learn segments is one research stream that has generated a variety of useful approaches [3][4].

In this paper, the database marketing (DBM) process involved the development of models to classify which clients use (or do not use) a voucher, taking five answers as inputs and predicting the customer response, which enables the commercial organization to offer suitable products to the right customers. First, a description of the adopted data is given. Then, a brief presentation of KDD is made. Next, the experiments are presented and the results analyzed. Finally, conclusions are drawn.


ISBN: 978-972-8924-63-8 © 2008 IADIS

2. MARKETING DATA

2.1 Marketing Database
At the beginning of this project a huge database with more than 630 000 registered customers was constructed. In this case study two databases (DB) from it were used: one containing personal information about registered customers (including the responses to the inquiries mailed in direct marketing projects with discount vouchers, and data obtained by renting external DB), and another with the transactional data recording the vouchers discounted in the supermarkets. The first phase of the project consisted of merging the two DB in order to make the DM task possible (64 482 customers were registered and 63 961 discount vouchers were mailed). This project focused on the Most Valuable Consumers (MVC) who received the discount voucher for a specific hygiene product (19 382 customers). The classification rule for these consumers is given by the following pseudo-code:

If household >= 2 and dishwater = yes and washing machine = yes Then Consumer = MVC
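The MVC rule above can be written as a small predicate. This is only a sketch: the function name and the assumption that the three answers are already decoded into an integer and two booleans are illustrative, not taken from the project's database.

```python
def is_mvc(household: int, dishwater: bool, washing_machine: bool) -> bool:
    """Return True when a customer satisfies the MVC rule quoted above.

    Assumes the answers were already decoded: household as an integer,
    the other two as booleans (hypothetical encoding).
    """
    return household >= 2 and dishwater and washing_machine


# Hypothetical customers, for illustration only.
print(is_mvc(household=3, dishwater=True, washing_machine=True))
print(is_mvc(household=1, dishwater=True, washing_machine=True))
```

Records with a Non Response in any of the three attributes would need to be handled before calling such a predicate.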

2.2 Database Problems
The database of this project presented several problems, mainly in the inquiries’ data. The inquiries DB contained a considerable amount of missing data, which had to be treated (as described in Section 3). This drawback did not occur in the discount vouchers DB, because its data was acquired automatically. After the merging process, a single dataset was created. Table 1 shows a synopsis of the main attributes of this data.

Table 1. Main database attributes

Attribute                     Domain
1. Household                  {NR, 1, 2, 3, 4, 5, 6 or more}
2. Dishwater                  {NR, Yes, No}
3. Monthly Consumption (€)    {NR, [0…150], [151…350], [351…500], [501…650], [651…[}
4. Household Income (€)       {NR, [0…500], [501…750], [751…1000], [1001…1500], [1501…2250], [2251…[}
5. Childs                     {NR, No, Yes}
6. Number of Childs           {NR, 1, 2, 3, 4, 5, 6, 7, 8, 9, [10…[}
7. Voucher Use                {No, Yes}

3. KNOWLEDGE DISCOVERY FROM DATABASES
The aim of a Knowledge Discovery in Databases (KDD) process is to mine a database in order to identify valid, novel, potentially useful and ultimately understandable knowledge nuggets. To this end, it involves different tasks, such as domain understanding, data preprocessing, data mining, and knowledge interpretation/evaluation. Each of these can be carried out with a variety of tools, which come from different research areas such as databases, machine learning, artificial intelligence and statistics. This implies that a user needs a range of skills and expertise in order to manage all of them [12]. In a KDD process the user has to choose, set up, compose and execute the tools most suitable for the problem.


3.1 Pre-Processing
Two approaches were taken as initial steps. The first, somewhat unusual in the scientific community, was a visualization task, pursued in order to identify the origin of the missing data [11] and to understand the data graphically. The second and most relevant preprocessing task led the study to a more specific and accurate analysis. In this task, several problems in the data were addressed, for example:
a) Several records contained blank values in the first six attributes. As presented in Table 1, this is missing data, because the domain already includes one value (Non Response) for customers who do not answer the question. Non-response arises when respondents find some questions too personal or sensitive and refuse to answer.
b) Inconsistency between the related questions Childs and Number of Childs (Table 2).
c) Inconsistency in many attributes that use coding, such as gender or post code.
d) Too many classes that are never used in some attributes, e.g. Number of Childs. A significant bias occurs due to inadequate questionnaire options. A levelling step was carried out to reduce the number of classes, in order to improve the learning phase.

To address the missing data, several techniques were used, taking into account statistical significance after pre-processing. Blank values were treated as Non Response, on the assumption that they result from an operator’s data entry error, which reinforces the need for validation mechanisms in the software used. The attributes Childs and Number of Childs were processed together, given the importance of agreement between the two. For example, when Childs was No and Number of Childs was Non Response (n=3004), the value of Number of Childs was changed to 0. Cases in which Childs was Yes and Number of Childs was 0 or Non Response, or in which Childs was No and Number of Childs was neither 0 nor Non Response, were discarded. The number of classes of the attribute Number of Childs was reduced by merging the classes 5, 6, 7, 8, 9 and 10 or more into a single class, 5 or more. After this cleaning stage, the number of cases was reduced to 14 710.
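The consistency rules for Childs and Number of Childs described above can be sketched as a per-record function. The string encodings ("NR", "10 or more", "5 or more"), the function names, and the choice to keep records where Childs itself is Non Response are assumptions made for this illustration; the original cleaning was not published as code.

```python
from typing import Optional, Tuple

NR = "NR"  # Non Response marker, as in Table 1


def level_number(number: str) -> str:
    """Merge the classes 5, 6, 7, 8, 9 and '10 or more' into '5 or more'."""
    if number == "10 or more" or (number.isdigit() and int(number) >= 5):
        return "5 or more"
    return number


def clean_children(childs: str, number: str) -> Optional[Tuple[str, str]]:
    """Return the cleaned (Childs, Number of Childs) pair, or None to discard."""
    # Blank values are treated as Non Response.
    childs = childs.strip() or NR
    number = number.strip() or NR

    if childs == "No" and number == NR:
        number = "0"        # no children: Non Response becomes 0
    elif childs == "No" and number != "0":
        return None         # contradictory answers: discard the case
    elif childs == "Yes" and number in ("0", NR):
        return None         # contradictory or unusable answers: discard
    # Assumption: records with Childs == NR are kept unchanged.
    return childs, level_number(number)


print(clean_children("No", ""))
print(clean_children("Yes", "7"))
print(clean_children("Yes", "0"))
```

Keeping the levelling step in its own function makes it easy to apply the same merge to other attributes with rarely used classes.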

3.2 Modelling Objectives and Data Selection
The objective of this project was to classify clients according to their voucher usage. Accordingly, two indexes were considered in this study:
- Customer activity index (CAI): represents the tendency of a customer to use vouchers, given by the number of vouchers used divided by the total number of vouchers received:

$$\mathrm{CAI} = \frac{V_{E-R}}{V_{E-R} + V_{E-NR}} \times 100\%$$

- Voucher rebate index (VRI): percentage of usage for each voucher:

$$\mathrm{VRI} = \frac{V_{E-R}}{V_{E-R} + V_{E-NR}} \times 100\%$$
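As a quick check of the CAI definition, the sketch below computes the index from the number of vouchers used and not used. The function and parameter names are illustrative; the input values are those of the first row of Table 2 (V-R = 20, V-NR = 25).

```python
def activity_index(redeemed: int, not_redeemed: int) -> float:
    """Customer activity index in percent: vouchers used over vouchers received."""
    received = redeemed + not_redeemed
    if received == 0:
        raise ValueError("customer received no vouchers")
    return 100.0 * redeemed / received


# First row of Table 2: 20 vouchers used, 25 not used.
print(f"{activity_index(20, 25):.2f}%")  # 44.44%
```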

Each customer record was therefore assigned its corresponding index value, and each voucher its corresponding rebate index. As a result, it was possible to rank the customers by activity (Table 2) and the vouchers by success. To group customers and analyse them, a customer segmentation model known as the pyramid model [16] was developed. The pyramid model groups customers according to the index used to order them, in this case the CAI. Four categories were considered, corresponding to 1%, 5%, 20% and 80% of all active MVC (Table 2). Where customers had the same CAI but would fall into different categories, the group was settled individually.

Table 2. Customer classification with pyramid model

cli_ID    V-NR   V-R   CAI       N     CAI      Group
900914    25     20    44,44%    203   0,97%    Top
905666    25     20    44,44%    204   0,98%    Top
904507    25     20    44,44%    205   0,99%    Top
905565    25     20    44,44%    206   1,00%    Big
904945    25     20    44,44%    207   1,01%    Big
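The pyramid grouping can be sketched as a ranking followed by percentage cut-offs. Several points here are assumptions rather than details given in the text: the cut-offs 1%, 5% and 20% are read as cumulative shares of the ranked customers with the remainder forming the last group, the comparison is strict (so that a customer at exactly 1% falls in Big, as in Table 2), the group names Medium and Small are hypothetical (only Top and Big appear in Table 2), and ties are broken by input order instead of being settled individually.

```python
def pyramid_groups(cai_by_customer, cutoffs=((1, "Top"), (5, "Big"), (20, "Medium")),
                   last_group="Small"):
    """Assign each customer to a pyramid group by rank on the CAI.

    cai_by_customer maps a customer id to its CAI; cutoffs are cumulative
    percentages of the ranked customer list (assumed interpretation).
    """
    # Highest CAI first; sorted() is stable, so ties keep their input order.
    ranked = sorted(cai_by_customer, key=cai_by_customer.get, reverse=True)
    total = len(ranked)
    groups = {}
    for position, customer in enumerate(ranked, start=1):
        label = last_group
        for percent, name in cutoffs:
            # Integer arithmetic avoids floating-point edge cases at the cut-offs.
            if position * 100 < percent * total:
                label = name
                break
        groups[customer] = label
    return groups


# Hypothetical data: 200 customers with distinct CAI values.
example = {f"c{i}": 100.0 - i * 0.1 for i in range(200)}
groups = pyramid_groups(example)
print(groups["c0"], groups["c1"], groups["c9"], groups["c39"])
```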


3.3 Data Mining
All experiments reported in this section were conducted using the Clementine Data Mining System, version 7.0. Initially, a decision tree algorithm, C5.0 [13], was applied to the whole dataset in order to obtain a set of rules that could explain the Voucher Use behaviour in terms of the set of five golden questions. However, the algorithm produced a single naïve rule (Voucher Use equal to No), with a classification accuracy of 75%, corresponding to the percentage of No cases. In order to improve the predictive results and to obtain more explainable information about the clients’ behaviour, a different strategy was pursued.

i) Finding groups of customers - clusters

With a well-defined segmentation introduced in the database, a clustering approach was adopted in order to find clusters of clients with homogeneous behaviour.

Table 3. Clusters description

Cluster   N        Not use   %      Used    %
1         1 895    1389      73%    506     27%
2         35       17        49%    18      51%
…
25        487      385       79%    102     21%
Total     15 965   11 892    75%    3 975   25%

This task consists of two essential modules: an exploratory clustering module based on a neural network, the Self-Organizing Map (SOM), and a rule extraction module that employs a decision tree to extract rules for each homogeneous cluster. Depending on each cluster’s characteristics, different marketing strategies can be adopted by making use of the set of classification rules. A SOM is a special kind of neural network model that performs unsupervised learning. It takes the input vectors and performs a type of spatially organized clustering, or feature mapping, to group similar records together. For each record the algorithm produces a pair of values (KX, KY) that identifies the cluster to which the record was assigned. Several SOM parameter settings were tried; the final topology was set to 20 input nodes and 25 output nodes, arranged as a 5x5 grid map corresponding to 25 clusters. Table 3 shows the distribution of each cluster.
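To illustrate how a SOM assigns each record a (KX, KY) position on a 5x5 grid, here is a minimal NumPy sketch. It is not the Clementine implementation: the learning-rate and neighbourhood schedules, the number of epochs, and the random binary input data are all assumptions chosen for the example.

```python
import numpy as np


def train_som(data, rows=5, cols=5, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small Self-Organizing Map and return its weight grid."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every output node, shape (rows, cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                 # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3    # shrinking neighbourhood
            # Best matching unit: the node whose weights are closest to x.
            dist = np.linalg.norm(weights - x, axis=2)
            bmu = np.unravel_index(np.argmin(dist), dist.shape)
            # Gaussian neighbourhood around the BMU, measured on the grid.
            grid_dist2 = np.sum((grid - np.array(bmu)) ** 2, axis=2)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            weights += lr * h[:, :, None] * (x - weights)
            step += 1
    return weights


def assign_cluster(weights, x):
    """Return the (KX, KY) grid position of the node closest to record x."""
    dist = np.linalg.norm(weights - x, axis=2)
    kx, ky = np.unravel_index(np.argmin(dist), dist.shape)
    return int(kx), int(ky)


# Hypothetical data: 200 records with 20 binary inputs.
records = np.random.default_rng(1).integers(0, 2, size=(200, 20)).astype(float)
weights = train_som(records)
kx, ky = assign_cluster(weights, records[0])
print(kx, ky)
```

Each of the 25 grid positions then plays the role of one cluster, to which a separate rule extraction step can be applied.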

ii) Explaining the groups – rules generation

The C5.0 algorithm [13] was applied to each of the 25 clusters (given by the SOM) in order to obtain a set of explanatory rules. The training sets used a random sample with 2/3 of the data, while the test sets contained the remaining 1/3. The set of classification rules given by C5.0 provided accurate results, with a 99.93% predictive accuracy in describing the SOM clustering. As an example, the rules for Cluster 5, which are simple to understand, are depicted below:

Rule 1
Dishwater? Yes
Childs? Yes
Household? 4
Monthly Consumption? [151-350],[501,750],[750..[
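The 2/3 training and 1/3 test split can be sketched with the standard library alone. Whether the original split was drawn separately inside each cluster is an assumption of this example, as are the function name and the record layout; the decision tree induction itself (C5.0) is not reproduced here.

```python
import random
from collections import defaultdict


def split_by_cluster(records, train_fraction=2 / 3, seed=0):
    """Split records into (train, test) lists separately inside each cluster.

    records is a list of (cluster_id, record) pairs (hypothetical layout).
    """
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for cluster_id, record in records:
        by_cluster[cluster_id].append(record)

    splits = {}
    for cluster_id, items in by_cluster.items():
        shuffled = items[:]          # copy, so the caller's order is untouched
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        splits[cluster_id] = (shuffled[:cut], shuffled[cut:])
    return splits


# Hypothetical records: 30 in cluster (0, 0) and 9 in cluster (0, 1).
data = [((0, 0), i) for i in range(30)] + [((0, 1), i) for i in range(9)]
splits = split_by_cluster(data)
for cluster_id, (train, test) in splits.items():
    print(cluster_id, len(train), len(test))
```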

It should be stressed that in 60% of the data, the Voucher Use distribution within a cluster is higher than the original dataset (75%). This means that in such cases, the prediction given by the SOM is better than the one given by the default C5.0 naïve rule.
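One possible reading of this comparison is that, inside a cluster, the share of the dominant Voucher Use class is set against the share obtained by the naïve rule on the whole dataset. The sketch below applies that reading to the three clusters listed in Table 3; the function name is illustrative and the baseline is computed from the Not use and Used totals of that table.

```python
# (not used, used) counts for the clusters listed in Table 3.
clusters = {1: (1389, 506), 2: (17, 18), 25: (385, 102)}
total_not_used, total_used = 11892, 3975


def majority_share(not_used: int, used: int) -> float:
    """Share of the dominant class, i.e. the accuracy of always predicting it."""
    return max(not_used, used) / (not_used + used)


# Accuracy of the single naive rule on the whole dataset.
baseline = majority_share(total_not_used, total_used)

for cluster_id, counts in clusters.items():
    share = majority_share(*counts)
    verdict = "above" if share > baseline else "not above"
    print(f"cluster {cluster_id}: {share:.1%} ({verdict} the baseline of {baseline:.1%})")
```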

4. RESULTS AND DISCUSSION
In order to test the results, the model was applied to the rest of the group (1/3, 560 cases) and a confusion matrix [17] was computed (Table 4). A confusion matrix reflects the number of correct and incorrect predictions made over a set of examples.


Table 4. Confusion matrix

Verified / Predicted   Negative   Positive
Negative               290        18
Positive               17         235

Specificity: the ratio of correctly predicted negatives (Tn) to all actual negative cases (Tn+Fp). Accuracy: the ratio of correctly predicted cases (Tp+Tn) to all cases (Tp+Tn+Fp+Fn). Sensitivity: the ratio of correctly predicted positives (Tp) to all actual positive cases (Tp+Fn). The results are summarized in Table 5.

Table 5. Quality of model

Accuracy      76.03%
Specificity   76.06%
Sensitivity   75.99%
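The three measures defined above can be computed directly from the four counts of a confusion matrix. The sketch below uses hypothetical counts chosen only to make the arithmetic easy to follow; they are not the values of Table 4.

```python
def quality(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Accuracy, specificity and sensitivity from confusion matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "specificity": tn / (tn + fp),   # correct among actual negatives
        "sensitivity": tp / (tp + fn),   # correct among actual positives
    }


# Hypothetical counts, for illustration only.
scores = quality(tp=40, tn=45, fp=5, fn=10)
for name, value in scores.items():
    print(f"{name}: {value:.2%}")
```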

5. CONCLUSIONS
Paying more attention to the main function of the marketing database, by means of well-defined data collection procedures and more sophisticated data quality processes, could greatly improve its use in database marketing projects. Moreover, our study confirmed the importance of a well considered customer accomplishment strategy. It demonstrated that, when data about customers is present, it is possible to build a model for marketing decision makers using customers for whom textual information is available. The novelty of the proposed clustering approach is that it allowed the generation of homogeneous clusters according to personal and transactional characteristics, together with the extraction of rules that characterize each cluster, which favoured the accuracy of the proposed models and the readability of the rules. These can help DB marketers to perform their customer/product fragmentation and one-to-one marketing. The results obtained, although not authoritative, are encouraging. Furthermore, this study allowed the identification of important issues that need to be improved in future DBM projects, namely: reviewing the inquiry questions and response options, using automatic validation of the responses (in order to reduce inconsistencies and missing data), and merging data from more databases (e.g. credit card and geographical data). If these issues are addressed in future direct marketing projects, intelligent systems could be developed that take more characterization attributes into account.

REFERENCES
Naik, A. P., Tsai, C., Isotonic single-index model for high-dimensional database marketing, Computational Statistics and Data Analysis, 2006.
Cohen, M. D., Exploiting Response Models – Optimizing cross-sell and up-sell opportunities in banking, Information and Systems, 29, 327-341, 2004.
Tao, Y., Yeh, C., Rosa, C., Simple database marketing tools in customer analysis and retention, International Journal of Information Management, 23, 291-301, 2003.
Verhoef, P. C., Spring, P. N., Hoekstra, J. C., Leeflang, P. S. H., The commercial use of segmentation and predictive modelling techniques for database marketing in the Netherlands, Decision Support Systems, 34, 471-481, 2005.
Wheeler, R., Aitken, S., Multiple algorithms for fraud detection, Knowledge-Based Systems, Volume 13, Issues 2-3, 93-99, 2007.
Cielen, A., Peeters, L., Vanhoof, K., Bankruptcy prediction using a data envelopment analysis, European Journal of Operational Research, Volume 154, Issue 2, 526-532, 2006.
Shaw, M. J., Subramaniam, C., Tan, G., Welge, M., Knowledge management and data mining for marketing, Decision Support Systems, 31, 127-137, 2004.
Changchien, S. W., Lu, T. C., Mining association rules procedure to support on-line recommendation by customers and products fragmentation, Expert Systems with Applications, 20, 325-335, 2001.
Brown, L. M., Kros, J. F., Data Mining and the impact of missing data, Industrial Management & Data Systems, 103/8, 611-621, 2003.
Cellini, J., Diamantini, C., Potena, D., Kddbroker: Description and discovery of KDD services, 15th Italian Symposium on Advanced Database Systems, 2007.
Kohonen, T., Self Organizing Maps, Springer-Verlag, 1995.
Flexer, A., On the use of Self-Organizing Maps for clustering and visualization, Proceedings of the 3rd International European Conference PKDD 99, 1999.
Curry, J., Curry, A., The Customer Marketing Method: How to Implement and Profit from Customer Relationship Management, Free Press, 2000.
Kohavi, R., Wrappers for Feature Subset Selection, Artificial Intelligence, 97, 273-324, 1997.
