Expert Systems with Applications 29 (2005) 291–299 www.elsevier.com/locate/eswa
Targeting customers via discovery knowledge for the insurance industry
Chien-Hsing Wu (a,*), Shu-Chen Kao (b), Yann-Yean Su (c), Chuan-Chun Wu (d)
(a) Department of Information Management, National University of Kaohsiung, Kaohsiung 811, Taiwan, ROC
(b) Department of Information Management, Kun Shan University of Technology, Tainan 700, Taiwan, ROC
(c) Department of Information Management and Communication, Wenzao Ursuline College of Languages, Kaohsiung, Taiwan, ROC
(d) Department of Information Management, I-Shou University, Kaohsiung 840, Taiwan, ROC
Abstract
In this paper, knowledge discovery in databases and data mining (KDD/DM), one of the data-based decision support technologies, is applied to help target customers for the insurance industry. Most KDD/DM applications require several major tasks: data preparation, data preprocessing, data mining, interpretation, application, and evaluation. A case study is presented in which KDD/DM is used to explore decision rules for a leading insurance company. The decision rules can be used to identify potential customers for an existing or new insurance product. The research first constructed the application framework, then defined and conducted each required task, and finally obtained feedback from the case company. Discussions and implications of this research are also presented.
© 2005 Elsevier Ltd. All rights reserved.
Keywords: Knowledge discovery in databases; Data mining; Insurance; Decision support
1. Introduction
Research on applications of operations research methodologies and fuzzy logic reported in the literature over the past few years has contributed significantly to decision support for the insurance industry (Brockett & Xiaohua, 1997; Shapiro, 2004). Management has also paid increasing attention to usage data that reflects changes in customers, suppliers, or even the organizations themselves. Importantly, the advent of modern information technology both facilitates the use of past data and enhances its value in various applications. As a consequence, data-based decision support technology has developed extensively. Shim et al. (2002) presented a comprehensive description of the capability of decision support technology in discussing its past, present, and future implementation. To disclose the information in usage data effectively, information technology has introduced the development and application of knowledge discovery in databases (KDD) and data mining (DM). KDD/DM is
* Corresponding author. Tel.: +886 75919509; fax: +886 75919328. E-mail address: [email protected] (C.-H. Wu).
0957-4174/$ - see front matter © 2005 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2005.04.002
basically a data-oriented paradigm that integrates database management, knowledge representation, and machine learning (Fayyad & Stolorz, 1997). It offers both descriptive and predictive capability to discover patterns in historical data that were previously unknown but are meaningful and useful for decision support. In business, KDD/DM has been extensively employed in support of management decisions (Han & Fu, 1999). For example, the discovered knowledge that ‘new customers who prefer low annual premium are most likely to purchase the product of whole life health’ would be very useful for decision support. Pitta (1998) highlighted KDD/DM as an important tool that marketers can use to reveal patterns in databases, while also emphasizing the one-to-one marketing strategy. Feelders, Daniels, and Holsheimer (2000) presented a fundamental concept of data mining that focuses on its application process. More importantly, applications in various areas of business reported in the literature over the past few years show the increased use of KDD/DM. Relevant work can be found in the hotel data market (Sung & Sang, 1998), personal bankruptcy prediction (Donato et al., 1999), customer service support (Hui & Jha, 2000), the special issue edited by Kohavi and Provost (2001) of a journal of data mining and knowledge engineering, and material acquisitions budget allocation for libraries (Wu, 2003a). Focusing on the insurance domain, the current study presents a practical KDD/DM application
case that may serve as a reference for applications in similar domains. Technically, there is much to contend with in order to implement a KDD/DM application plan successfully. A KDD/DM application in a field systematically follows six stages: identifying the problem, collecting datasets, preprocessing the collected datasets, mining the preprocessed datasets, interpreting and implementing the discovered informatics, and evaluating the discovered informatics. Identifying the problem specifies the critical question(s) and/or issues associated with improving decision-making capability. Collecting datasets deals mainly with compiling data that may include information useful for achieving the defined objective. Preprocessing data involves refinement and reconstruction of data tables, consistency of multi-typed data tables, elimination of redundant (or unnecessary) attributes, combination of highly correlated attributes, and discretization of continuous attributes. Mining data employs a mining mechanism to perform knowledge discovery in the form of association, classification, regression, clustering, and summarization/generalization [11]. Interpreting discovered knowledge uses visualization techniques to display the discovered results, such as text, tables, figures, graphs, animations, and diagrams. Implementation deals with transforming the discovered results into decisions. Evaluating the implemented discovered knowledge is a performance test. The criteria used can be validity, significance/uniqueness, effectiveness, simplicity, and generality (Hirota & Pedrycz, 1999). Validity considers whether or not the discovered informatics is practically applicable. Uniqueness/significance deals with how different the discovered informatics is from the knowledge that an organization already has.
Effectiveness concerns the impact of the discovered informatics on decisions that have already been made and implemented. Simplicity considers the degree of understandability, while generality considers the degree of scalability. Equally important, decision support using KDD/DM also requires knowledge of the application domain and of database management, as well as KDD/DM professionals (Feelders et al., 2000). The domain specialists need to help in defining the problem(s). For example, a question such as ‘who are the potential customers to whom a new insurance product with special features can be effectively introduced’ may be raised. They also need to provide information about the structure of the database being mined, the meaning of its attributes, and the representation of possible values for each attribute, in order to help in preprocessing the database (e.g. elimination of redundant attributes), interpreting the discovered knowledge, implementing the discovered informatics (as a decision), and evaluating the implemented decisions. Techniques in database management are needed for data preprocessing and for the representation and interpretation of results. Generally, specialists in KDD/DM are
responsible for the overall application project, particularly for efficient and effective collaboration between domain specialists and database management specialists. Essentially, they need to find relevant tools, or develop a special one, to serve as a data miner that fully matches the mining requirements (Giraud-Carrier & Povel, 2003). The remainder of this paper is organized as follows. Section 2 describes the application framework in which KDD/DM is used to disclose knowledge in an insurance transaction database for the case company; it also covers data collection and preparation, the mining mechanism used, learning accuracy testing, and the discovered knowledge and its interpretation. Discussions are presented in Section 3. Section 4 concludes this research and addresses future research.
2. Application framework
The application framework is illustrated in Fig. 1. It consists of two major components. The first deals mainly with the task of knowledge discovery and feedback from the application case. The second is the discussions drawn from the application case, as a reference for any case that is the same as or similar to the current study. The application features are described in Table 1, including the objective of the application, the size of the database used, the way the mined data was prepared (e.g. attribute selection), the way attributes were granulated, the mining mechanism used, the representation of the discovered knowledge, and the purpose of the discovered knowledge. Details are described below.

[Fig. 1. Application framework. Box labels: Database; Data preparation; Data selection; Granulation; Data mining; Results; Accuracy test; Discovered knowledge and interpretation; Feedback; Discussions of case company; Discussions: (1) the experiences of KDD/DM adoption in targeting customers for insurance industry, (2) usability of KDD/DM in supporting decision-making operation, (3) a lesson of theory toward practice.]

2.1. Case background
The application case participating in the current study was a leading insurance company in Taiwan. In the first meeting, we found, surprisingly, that the case
Table 1. Features of the application case

| Features | Description |
|---|---|
| Objective | KDD/DM application in insurance industry |
| Size of database used | 188,464 cases |
| Data preparation | 1. Transformation from text to database; 2. 7 numeric attributes and 1 semantic class selected by a discussion with the case company; 3. 15 concepts in class attribute |
| Granulation regulation | Listed in Table 2, by a discussion with the case company |
| Mining mechanism used | Induction-based ID3 |
| Representation of discovered knowledge | Decision rules via classification |
| Use of discovered knowledge | Targeting customers for insurance products |
company seemed unfamiliar with KDD/DM. What they knew was that the company's customer information and transaction data were stored in databases for general maintenance operations, such as insertion, deletion, update, and reference. Importantly, although the case company was willing to support our study, their main concern was information security, particularly that of customers’ personal data. Consequently, it took us considerable time to explain the value of usage data to them so that the study could continue. After a series of in-depth interviews with managers, the case company finally agreed to provide their transaction database with customer names removed and identifiers replaced with serial numbers. The database they provided contained 188,464 cases. The case company also said that they might not be able to provide any support
on technical issues, but could offer managerial opinions. For example, they could comment on whether or not the discovered knowledge made sense for their marketing strategy. This part is described later in the section on implications and discussions.

2.2. Data preparation and selection
The original database from the case company was in text format, so we first imported the data into a database management system. The total number of attributes was nineteen. Too many attributes are very likely to result in discovered information that is difficult to interpret, or even meaningless. Therefore, through in-depth discussion with domain managers, we eliminated some of the attributes and settled on seven: (1) annual premium (Ap), (2) age of insured policy (Aip), (3) policy period (Pp), (4) insured age (Ia), (5) financial resources (Fr), (6) installation premium (Ip), and (7) insured amount (Iam), plus one class attribute, namely the type of insurance product. The class attribute in the database showed 15 types of insurance products (see Table 2). Furthermore, because some attributes contained numerical values, granulation was necessary for the mining mechanism to perform the knowledge explosion operation (Wu & Urpani, 1999). Generally, the number of granules for a context greatly affects the information granularity to be defined and the final decision rules to be discovered (Wu, 2003b). However, this choice poses a dilemma, since the generated decision tree is very likely to be too complex to interpret if the number of granules is too large. More specifically, the decision tree generated using an induction-based technique will be too big for its patterns to be interpreted if labeling
Table 2. Granulation regulation for attributes

| Attr. # | Name | Scale 1 | Scale 2 | Scale 3 | Scale 4 | Scale 5 |
|---|---|---|---|---|---|---|
| 1 | Annual premium (NTD) | Very little (0–5000) | Little (5001–20,000) | Medium (20,001–50,000) | High (50,001–150,000) | Very high (above 150,000) |
| 2 | Age of insured policy (years) | Fresh (1–5) | Junior (6–15) | Senior (above 15) | – | – |
| 3 | Policy period (years) | Short (1–10) | Medium (11–20) | Long (above 20) | – | – |
| 4 | Insured age (years) | Child (1–14) | Young (15–30) | Adult (31–50) | Oldster (above 51) | – |
| 5 | Financial resource (10 thousands NT) | Low (below 100) | Medium (101–300) | High (300–1000) | Very high (above 1000) | – |
| 6 | Installation premium | Monthly | Quarterly | Bi-yearly | Yearly | Single-premium |
| 7 | Insured amount (10 thousands NT) | Low (below 50) | Medium (51–200) | High (201–1000) | Very high (above 1000) | – |

Class (15 types): CA01, whole life annuity; CA02, term annuity; CA03, whole life return; CA04, term return; CA05, term deposit; CA06, whole life endowment; CA07, term for juvenile; CA08, term for women; CA09, whole life protection; CA10, whole life health; CA11, term health; CA12, whole life health return; CA13, term health return; CA14, whole life family; CA15, term family
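As an illustration of the granulation regulations in Table 2, the following is a minimal Python sketch that maps raw values to granule labels for three of the numeric attributes. The function name `granulate`, the scale constants, and the sample record are hypothetical; the paper does not say how the granulation was implemented, and treating each upper bound as inclusive is an assumption read off the ranges in the table.

```python
from bisect import bisect_left

# Inclusive upper bounds and labels transcribed from Table 2.
# Values below the first stated lower bound also fall into the first label
# (an assumption; the paper does not discuss out-of-range values).
ANNUAL_PREMIUM = ([5_000, 20_000, 50_000, 150_000],
                  ["Very little", "Little", "Medium", "High", "Very high"])
POLICY_PERIOD = ([10, 20], ["Short", "Medium", "Long"])
AGE_OF_POLICY = ([5, 15], ["Fresh", "Junior", "Senior"])

SCALES = {"Ap": ANNUAL_PREMIUM, "Pp": POLICY_PERIOD, "Aip": AGE_OF_POLICY}


def granulate(value, scale):
    """Return the label of the first range whose upper bound covers `value`."""
    bounds, labels = scale
    return labels[bisect_left(bounds, value)]


# Hypothetical raw record (not taken from the case database).
record = {"Ap": 18_000, "Pp": 20, "Aip": 3}
granulated = {name: granulate(value, SCALES[name]) for name, value in record.items()}
```

A lookup table of bounds keeps the regulation itself in data rather than in code, so a different number of granules per attribute can be tried without touching `granulate`.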
too many granules. If a granule is too small, it may not contain an acceptable number of instances, and consequently the discovered results may be meaningless. Although the granulating method can be chosen freely, the number of granules should be chosen carefully to avoid unnecessary problems. Following a discussion with the case company, the granulation operation was based on the granulation regulations listed in Table 2. For example, the attribute annual premium was granulated into five levels, namely Very little, Little, Medium, High, and Very high. The defined ranges were 0 to 5000 NT dollars for Very little, 5001 to 20,000 for Little, 20,001 to 50,000 for Medium, 50,001 to 150,000 for High, and above 150,000 for Very high. Note that the class attribute contains 15 types of insurance products, denoted CA01, CA02, ..., CA15. The collected transaction database, denoted by D, was then granulated for the mining operation.

2.3. Data mining
2.3.1. Mining mechanism used
This research used classification as the mining mechanism. ID3, an important and widely applied classification method, was used to mine decision rules in the collected database. The ID3 algorithm, introduced by Quinlan (1986), has been widely used to measure the information entropy of a set of data with multiple classes (Murthy, 1998; Sestito & Dillon, 1994; Wu, 2003a). Based on information theory, it uses top-down induction to return the degree of ability (DA) with which a variable can separate the others. More precisely, the larger the value of DA, the less evenly the data is distributed, and so the better the variable is at separating the others. Therefore, the first step in using this algorithm is to obtain the expected information for the class, then the expected information for each variable in the dataset (e.g.
annual premium), and finally the expected entropy for each variable, that is the difference of expected information between the class and the variable. Importantly, the main advantages of using ID3 were accuracy of prediction and ability of interpretation with respect to decision rule discovery. This decision was supported by previous studies such as (Mak & Munakata, 2002; Ohmann, Moustakis, Yang, & Lang, 1996; Stark & Pfeiffer, 1999). The ID3 uses the following three functions to perform its computation of expected entropy.

$$I(n_{c_1}, n_{c_2}, \ldots, n_{c_n}) = -\frac{n_{c_1}}{M}\log_2\frac{n_{c_1}}{M} - \cdots - \frac{n_{c_n}}{M}\log_2\frac{n_{c_n}}{M} \tag{1}$$

where $n_{c_i}$ is the number of records that return to class $C_i$, $i = 1, 2, \ldots, n$, and $M$ is the total number of records.

$$E(A) = \sum_{i=1}^{t}\left[\frac{n_{v_i}}{M}\, I(a_{v_i c_1}, a_{v_i c_2}, \ldots, a_{v_i c_m})\right] \tag{2}$$

where $t$ is the number of different values that appear in attribute $A$; $n_{v_i}$ is the total number of records in which attribute $A$ takes value $V_i$, $i = 1, 2, \ldots, t$; $a_{v_i c_j}$ is the total number of records in which attribute $A$ takes value $V_i$ and which return to class $C_j$, $i = 1, 2, \ldots, t$, $j = 1, 2, \ldots, m$; and $M$ is the total number of records.

$$\mathrm{Exp}\,E(A) = I(n_{c_1}, n_{c_2}, \ldots, n_{c_n}) - E(A) \tag{3}$$
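Formulas (1)–(3) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names and the four toy records are hypothetical, and the sketch assumes the standard ID3 convention that, inside each branch of formula (2), class proportions are taken relative to the number of records in that branch.

```python
from collections import Counter
from math import log2


def expected_information(labels):
    """Formula (1): entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def attribute_expected_information(records, attribute, target):
    """Formula (2): branch entropies weighted by the share of records per value."""
    total = len(records)
    branches = {}
    for row in records:
        branches.setdefault(row[attribute], []).append(row[target])
    return sum(len(b) / total * expected_information(b) for b in branches.values())


def expected_entropy(records, attribute, target):
    """Formula (3): class information minus the attribute's expected information."""
    class_info = expected_information([row[target] for row in records])
    return class_info - attribute_expected_information(records, attribute, target)


# Hypothetical granulated records (not taken from the case database).
toy = [
    {"Ap": "Low", "Fr": "Low", "Class": "CA10"},
    {"Ap": "Low", "Fr": "High", "Class": "CA10"},
    {"Ap": "High", "Fr": "Low", "Class": "CA01"},
    {"Ap": "High", "Fr": "High", "Class": "CA01"},
]
gains = {a: expected_entropy(toy, a, "Class") for a in ("Ap", "Fr")}
best = max(gains, key=gains.get)  # attribute chosen for the first split
```

In this toy set `Ap` separates the two classes perfectly while `Fr` does not separate them at all, so `Ap` would be placed at the root of the tree, which mirrors how the attribute ordering is obtained from the expected entropy values.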
Basically, all records (or instances) in a granulated database have to be taken into account because they all participate in the generation of decision rules. The discovery process starts by partitioning on the values that an attribute takes on to form a decision tree, from which the decision rules can be generated. For example, when the attribute Annual premium takes on the value ‘high’, all such records are checked to see whether or not the class takes on the same value. If so (say, Life insurance), the rule ‘IF Annual premium = high THEN Class = Life insurance’ is obtained directly. However, if the conclusion is inconsistent, the attribute of the second order (e.g. Financial resource) needs to be considered. This process is repeated until a consistent conclusion is reached, which yields the decision rules.

2.3.2. Learning accuracy test
Although ID3 has shown adequate prediction accuracy, our study did not skip the accuracy test. We randomly divided D into two parts, denoted DA and DB. DA contained about 2/3 of the cases for the purpose of learning (120,094 examples in total), and DB about 1/3 for the purpose of accuracy testing (68,370 examples in total). Both DA and DB were granulated based on the granulation regulations in Table 2. DA was then used as the input to the ID3 mining mechanism to generate decision rules. The purpose of using ID3 is to derive the information gain for each attribute, so that the decision rules can be generated accordingly. We conducted this process for each attribute

Table 3. Expected entropy from each attribute

| Attributes | Expected information | Expected entropy | Order |
|---|---|---|---|
| Class | 1.8877 | – | – |
| Annual premium | 1.5815 | 0.3062 | 1 |
| Financial resource | 1.7408 | 0.1469 | 2 |
| Policy period | 1.7770 | 0.1107 | 3 |
| Age of insured policy | 1.8195 | 0.0682 | 4 |
| Insured age | 1.8351 | 0.0526 | 5 |
| Installation premium | 1.8381 | 0.0496 | 6 |
| Insured amount | 1.8478 | 0.0399 | 7 |
Table 4. Part of the generated decision rules

| Rule # | Ap | Fr | Pp | Aip | Ia | Ip | Iam | Class | Marks |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Medium | Medium | Medium | Junior | Oldster | Bi-yearly | Medium | CA01 | 12 |
| 2 | Adequate | Medium | Medium | Junior | Adult | Monthly | Medium | CA10 | 10 |
| 3 | Adequate | Medium | Medium | Junior | Adult | Monthly | Low | CA01 | 152 |
| 4 | Adequate | Medium | Medium | Junior | Adult | Bi-yearly | Low | CA01 | 225 |
| 5 | Adequate | Medium | Medium | Junior | Adult | Yearly | Medium | CA10 | 24 |
| 6 | Adequate | Medium | Medium | Junior | Adult | Yearly | Low | CA01 | 134 |
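Each row of Table 4 can be read as an IF–THEN rule over the seven granulated attributes. The following is a minimal sketch of how such rules might be matched against held-out records to measure learning accuracy. The names `predict` and `accuracy` and the two hold-out records are hypothetical, and counting a record that matches no rule as a miss is an assumption; the paper does not describe how unmatched records were handled.

```python
# Column order follows Table 4; rule_1 is transcribed from its first row.
ATTRS = ("Ap", "Fr", "Pp", "Aip", "Ia", "Ip", "Iam")

rule_1 = {
    "conditions": dict(zip(ATTRS, ("Medium", "Medium", "Medium", "Junior",
                                   "Oldster", "Bi-yearly", "Medium"))),
    "product": "CA01",
    "marks": 12,
}


def predict(record, rules):
    """Return the product of the first rule whose conditions all match, else None."""
    for rule in rules:
        if all(record.get(k) == v for k, v in rule["conditions"].items()):
            return rule["product"]
    return None


def accuracy(test_records, rules, target="Class"):
    """Share of test records whose predicted product equals the recorded class."""
    hits = sum(predict(r, rules) == r[target] for r in test_records)
    return hits / len(test_records)


# Hypothetical hold-out records: the first matches rule_1, the second matches no rule.
holdout = [
    {**rule_1["conditions"], "Class": "CA01"},
    {**rule_1["conditions"], "Ia": "Adult", "Class": "CA10"},
]
score = accuracy(holdout, [rule_1])
```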
by using formulas (1)–(3) and obtained the results listed in descending order in Table 3. The attribute annual premium obtained the highest information gain, 0.3062, followed by financial resource with 0.1469, and so on. Part of the final generated decision rules is shown in Table 4. For example, the first record can be read as ‘If the annual premium is ‘medium’, financial resource is ‘medium’, policy period is ‘medium’, age of insured policy is ‘junior’, insured age is ‘oldster’, installation premium is ‘bi-yearly’, and insured amount is ‘medium’, Then the insurance type is ‘whole life annuity’’. Note that the number 12 in the Marks column indicates that 12 instances were returned for this rule. In total, 2689 rules were generated from DA. The complete set of decision rules was then used to check every example in DB for accuracy testing. Of the 68,370 examples
in DB, 55,172 were classified correctly, indicating a learning accuracy of 80.70%. Although not outstanding, this accuracy is acceptable.

2.3.3. Discovered knowledge and interpretation
The study then moved to knowledge discovery on the whole granulated database, that is, all 188,464 cases. The information gain computed by ID3 ranked the attributes in the same order as listed in Table 3, so we performed the decision rule generation and obtained 3103 decision rules in total. The discovered knowledge was then demonstrated to the case company to obtain responses with respect to decision supportability. First, the number of marks implicitly indicates the strength of a rule. More precisely, the more marks a decision rule has, the more reliable it is, because it covers
Fig. 2. Part of the discovered decision rules.
more cases in the transaction database. Management is likely to prefer a decision rule that is supported by a larger number of cases. Therefore, it is necessary to define a threshold as a filter that retains only the decision rules with at least a certain number of marks. However, there is so far no evidence from which an adequate threshold can be derived, and the choice is usually highly domain-dependent. Since 188,464 cases were used and 3103 rules discovered, each rule is supported by 60.7 marks on average. Following discussion with the case company, the threshold was set at 61, which left 1264 decision rules. Part of these is illustrated in Fig. 2. These decision rules serve as an advisor to help evaluate whether or not a customer will give a certain reply based on his/her personal characteristics. Second, although more reliable decision rules were obtained, the case company still had difficulty
understanding what these rules meant when looking at the decision supportability of the discovered knowledge as a whole. We therefore re-summarized the obtained rules to disclose the obvious relationships between insurance products and customer characteristics, so that the case company could target the right customers effectively when new and/or existing products were placed on the market. Part of the summarized results, organized by three ranges of marks, is listed concisely in Table 5 together with the opinions of the case company. Although they did not completely understand the rules, the case company in general gave positive support with respect to the discovered knowledge. Notable observations based on the customers’ characteristics are described below. (1) For the annual premium, it was found that the lower the annual premium is, the more the tendency is that
Table 5. Interpretation and opinions from the case company for discovered knowledge

| Discovery knowledge | Marks | Interpretation | Opinions from the case company |
|---|---|---|---|
| If Ap is medium or high Then product is CA01 | 61–200 | Customers who prefer medium or high annual premiums are most likely to purchase the product of whole life annuity | Makes sense. A high annual premium implies a high insured amount as long as investment is concerned |
| If Fr is high and Ia is child Then product is CA01 | 61–200 | Customers whose financial resources are high and age is from 1 to 14 are most likely to purchase the product of whole life annuity | Makes sense. Parents who have a high income are willing to make a long-term investment for their children |
| If Ap is low and Aip is fresh Then the product is CA10 | 61–200 | New customers who prefer low annual premium are most likely to purchase the product of whole life health | Not very sure, but interesting |
| If Fr is low or medium and Ia is adult Then product is CA09 | 61–200 | Customers whose financial resources are low or medium and whose age is from 31 to 50 are most likely to purchase the product of whole life protection | Not very sure, but interesting |
| If Fr is high and Ia is child Then product is CA09 | 61–200 | Customers whose financial resources are high and age is from 1 to 14 are most likely to purchase the product of whole life protection | Very interesting in comparison with the previous one |
| If Pp is short Then product is CA09 | 61–200 | There is a trend that customers who purchase the product of whole life protection will prefer their policy period to be from 1 to 10 years | Not very sure, but very interesting |
| Most products purchased in this mark range are CA01, CA02, CA10, and CA12 | 61–200 | Precisely, hot products are whole life annuity and whole life health | Very true |
| If Ap is medium or low and Fr is low or medium Then product is CA01 | 201–500 | Customers who prefer medium or low annual premiums and whose financial resources are low or medium are most likely to purchase whole life annuity | Interesting. Does this mean that the product of whole life annuity likely captures the attention of customers, even though their financial resource is medium or low? |
| If Ap is medium, Fr is low, Pp is long, and Ia is child Then product is CA01 | 201–500 | Parents are willing to purchase the product of whole life annuity for their children, as a long-term investment, even if they do not have good financial resources | Makes sense. Investment for children via an insurance plan has become a trend. This may be the answer to the previous question |
| If Ap is low, Fr is medium or low, and Pp is short Then product is CA09 | 201–500 | Customers who prefer low annual premiums and like their policy period to be from 1 to 10 years, and whose financial resource is low, are willing to purchase whole life protection | Not very sure in regard to the policy period. However, it is sure that customers like whole life protection, no matter what level the annual premium and/or the financial resource is |
| More than 72% of the products purchased in this range is CA10 | Above 500 | A very high proportion of customers prefer whole life health | Makes sense |
| If Pp is medium and Aip is fresh Then product is CA10 | Above 500 | New customers who like their policy period to be from 11 to 20 will be most likely to purchase whole life health | Makes sense. Most new customers purchased whole life health |
customers will prefer products of both whole life health and whole life protection. However, these customers are most likely to shift to whole life annuity once the annual premium increases to the level of high and/or very high. Interestingly, customers who can accept a high (or very high) annual premium are also willing to accept term return, in addition to whole life annuity. This is quite sensible because a high annual premium reasonably produces a high insured amount, which can be a sign of investment concern. For financial resources, it was found that the fewer financial resources customers have, the more likely they are to accept whole life health. Most customers with poor financial resources need medical compensation in case of accident or sickness. In contrast, those with good financial resources are more willing to accept whole life annuity because they are more capable of paying medical expenses for any accident or sickness.
(2) For the policy period, it is quite clear that most customers choose the medium level. There are two reasons for this. One is that a short policy period causes a higher annual premium. The other is that, although it has a lower rate of annual premium, a long policy period causes a longer financial load. Consequently, customers tend to take the middle-of-the-road strategy. For the age of insured policy, it was found that most seniors purchase whole life annuity instead of whole life health. This is because the medical care in the past fifteen years was not as good as it is nowadays. Increasingly, we find that more than half of the younger people accept whole life health. As people currently pay more attention to medical care, this product is becoming increasingly popular.
Furthermore, the products accepted by new customers vary, primarily due to changing insurance concepts and the innovation of diverse insurance products. Importantly, the disclosed results indicate that new customers do not accept the product of term deposit because of its lack of health care.
(3) In regard to insured age, it was found that the lower the insured age is, the stronger the tendency for customers to purchase insurance products of both protection and return. As the insured age increases, this tendency shifts toward products of the protection type only. The major reasons for this are: (1) the contents of protection, for example, medical compensation for accidents, rather than heart attack, should be included for children; and (2) the rate of annual premium increases as insured age increases. Ultimately, the case company should help customers better understand their requirements so that they can effectively select the most advantageous products. After all, knowledge in
the insurance domain is not easy to grasp and customers’ characteristics are too varied to identify.
(4) Concerning the installation premium, it was found that most customers choose the yearly level because of the premium discount. However, the products purchased with this installation method vary. This implies that customers are willing to accept the yearly installation premium, no matter what type of product they purchase. Furthermore, the products paid for by single premium are mostly term deposit and whole life protection. The reason is that customers choosing this installation method usually have good financial resources, so investment and asset keeping are their major concerns. Regarding the insured amount, it was found that almost 2/3 of the cases are at the low level (below 500,000 NT dollars) and almost 1/3 at the medium level (from 500,001 to 2,000,000). Most products purchased at these two levels of insured amount center on whole life health and whole life annuity. However, as the insured amount increases, the number of both products decreases. This implies that customers are more likely to purchase products with a low or medium level of insured amount.
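The figures reported in Sections 2.3.2 and 2.3.3 and the mark-threshold filter can be reproduced with a short Python sketch. The constants are taken from the text; the function `mark_band` is hypothetical, deriving the threshold by rounding the average up is an assumption (the paper states only that 61 was agreed with the case company), and the marks reused from Table 4 come from the learning part DA, so they serve purely as sample numbers.

```python
import math

# Figures reported in the text.
TOTAL_CASES, TRAIN_CASES, TEST_CASES = 188_464, 120_094, 68_370
CORRECT_ON_TEST = 55_172
TOTAL_RULES = 3_103

accuracy_pct = round(100 * CORRECT_ON_TEST / TEST_CASES, 2)  # learning accuracy in %
avg_marks = round(TOTAL_CASES / TOTAL_RULES, 1)              # average marks per rule
threshold = math.ceil(TOTAL_CASES / TOTAL_RULES)             # assumed rounding rule


def mark_band(marks):
    """Return the Table 5 mark range for a rule, or None if it is filtered out."""
    if marks < threshold:
        return None
    if marks <= 200:
        return "61-200"
    if marks <= 500:
        return "201-500"
    return "above 500"


# Marks of rules 1-6 in Table 4, used here only as sample values.
table4_marks = {1: 12, 2: 10, 3: 152, 4: 225, 5: 24, 6: 134}
kept = {rule: mark_band(m) for rule, m in table4_marks.items() if mark_band(m)}
```

With these sample values, three of the six rules would survive the filter, which illustrates how a threshold close to the average number of marks removes a large share of the generated rules.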
3. Discussions
As indicated in Fig. 1, the discussions focus mainly on three points: the experience of applying KDD/DM to target customers for the insurance industry, the usability of KDD/DM in supporting decision-making operations, and a lesson on moving from theory to practice. First, KDD/DM technology is basically a data-based paradigm used as a management tool in support of decision-making. Although experience-oriented decision-making has been used for a long time, and will continue to play an important role in support of management, the decision support mode has increasingly shifted from experience-oriented to information-oriented. In this case application, at the beginning of the first interview the case company seemed unfamiliar with what KDD/DM is about and what objective it was attempting to reach. Generally, the case company hesitated to accept an innovative technique unless they were confident that using KDD/DM is a way of doing things better, and not just of doing things differently. Therefore, for insurance companies that do not understand well what data reuse is, what can be obtained from it, and how the obtained knowledge can help in making decisions, consultants and vendors should deepen their awareness of KDD/DM. More importantly, we suggest that researchers, vendors, consultants, and government agencies should first help users understand how KDD/DM can help in their decision-making.
Moreover, we found that the top management of an insurance company is the main decision-maker, particularly when it comes to major decisions. Therefore, our study strongly suggests that gaining top management’s attention, so as to increase the likelihood that KDD/DM is accepted, is the first step toward spreading the use of KDD/DM technology. In regard to the usability of KDD/DM, we explained the discovered knowledge to the case company: what a rule means, how to interpret a rule, what information a rule can show, and so on. The case company then conducted an informal review of the discovered knowledge based on its experience and gave some opinions (see Table 5). Importantly, although not all of it made sense to the reviewers, the discovered knowledge attracted much attention in the case company, because the results showed connections between its products and customers that the company had never considered. The case company realized that the competitive environment does in fact put significant pressure on companies to use information technologies, particularly those that can promptly reflect changes in customers, suppliers, and even the organizations themselves. To support this, they need to keep tracking changes in customers’ and competitors’ behavior by exploring the information in their databases. The usability of the discovered knowledge presented in our research showed that, at this stage, KDD/DM could be a possible path toward better customer targeting. Assessing the impact of the discovered knowledge is a second stage, and it will take time for the case company to collect data and make comparisons. For example, does the discovered knowledge significantly help increase customer response in the next year (or the next 2, 3, 4, ... years)? This is one of our important future research focuses. Finally, it is generally believed that theory is one thing, and practice is another.
From this application, we have seen that bringing theory into the practical environment is not as easy as expected. KDD/DM expertise, the application process, and knowledge of the insurance domain are all issues that need to be considered carefully. More importantly, the ultimate goal of KDD/DM technology is not the technology itself, but to help enhance the quality of management, in particular the quality of decision-making. This needs the continuous participation of domain researchers and experts. Knowledge hidden in data can change so fast that decision-makers often overlook it. Usually, an insurance company offers a number of products and services. These products are diverse and constantly change in order to satisfy a variety of customers’ preferences and requirements, especially as one-to-one marketing strategies are increasingly used. Basically, an insurance company must be aware of what can be learned from its data, the potential benefits, and how to distribute the resulting knowledge properly, so that it can more easily accept and implement a KDD/DM plan. Our suggestion is that the academic and government agencies responsible for narrowing the gap between the theory and practice of KDD/DM should make more effort to raise the visibility of KDD/DM. To help in this, they can provide financial support for KDD/DM seminars and training programs designed especially for the top management of insurance and other applicable companies.
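The follow-up question raised in this section, whether the discovered knowledge significantly increases customer response in later years, could be examined with a one-sided two-proportion z-test once the case company has collected the comparison data. The sketch below is an illustration only and is not part of the reported study: the function name and the counts in the usage example are hypothetical, and the test assumes independent samples that are large enough for the normal approximation.

```python
import math


def response_rate_z_test(responses_before: int, contacts_before: int,
                         responses_after: int, contacts_after: int) -> tuple[float, float]:
    """One-sided two-proportion z-test for an increase in response rate.

    Returns (z, p_value); a small p_value suggests the rate after applying
    the discovered rules is higher than the rate before.
    """
    if contacts_before <= 0 or contacts_after <= 0:
        raise ValueError("contact counts must be positive")
    rate_before = responses_before / contacts_before
    rate_after = responses_after / contacts_after
    # Pooled response rate under the null hypothesis of no change.
    pooled = (responses_before + responses_after) / (contacts_before + contacts_after)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / contacts_before + 1 / contacts_after))
    if std_err == 0:
        raise ValueError("pooled rate is 0 or 1; the test is undefined")
    z = (rate_after - rate_before) / std_err
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: customers contacted and responding in each period.
    z, p = response_rate_z_test(50, 1000, 80, 1000)
    print(f"z = {z:.3f}, one-sided p = {p:.4f}")
```

Repeating the comparison for each subsequent year would give the multi-year view mentioned above, although repeated testing would then call for some correction for multiple comparisons.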
4. Concluding remarks
In this paper, we have described the importance of practical KDD/DM application, the major tasks it requires, the case application at an insurance company, the company's feedback, and the discussions and implications of the application experience. Rather than focusing only on technical points, the core aim of this paper is to gain experience in the practical application of KDD/DM. Three points obtained from this application case are also addressed: (1) the experience of applying KDD/DM to target customers for the insurance industry; (2) the usability of KDD/DM in supporting insurance decision-making; (3) a lesson on moving from theory to practice. It is seen that the gap between theory and practice still needs to be narrowed, in particular by persuading insurance companies to change their view of using KDD/DM where information security is concerned. This research is of value because of the practical evidence it provides, which draws attention to the significant role that KDD/DM can play in the insurance industry. However, although our research successfully demonstrated the application of KDD/DM in the insurance domain, other issues may also be significant, such as considering other customer attributes (e.g. occupation), computerizing the whole process for the complex computation (e.g. module development), and updating the discovered results over time (e.g. software development). As personalized marketing becomes increasingly important for customer response, KDD/DM can be used to target customers effectively, as shown in this application case for the insurance industry. Personalized marketing emphasizes that the marketing strategy should center on customers' individual characteristics. Accordingly, the experience-oriented style that most insurance companies adopt is no longer sufficient for such a complicated task, because so many customers and so many characteristics need to be considered.
With the rapid growth in the volume of data that information systems collect, more and more marketers are likely to use data-based decision support tools to improve the efficiency and effectiveness of their marketing decisions. However, despite the many KDD/DM tools available (e.g. Giraud-Carrier & Povel, 2003), particularly for business, the use of this technology is a highly domain-specific task that may require domain knowledge as well as KDD/DM professionals to accomplish, as we have seen in this application case. In particular, data preparation is the most time-consuming process in a KDD/DM application. From the application point of view, what is most
important may be a methodology that can both deal with data preparation and carry out knowledge interpretation, utilization and evaluation on a domain-by-domain basis. As suggested by Clifton and Thuraisingham (2001), although standardizing the KDD/DM process is not easy, it can help KDD/DM technology to be confidently accepted by business management. Reaching this goal will require considerable effort.
References
Brockett, P., & Xiaohua, X. (1997). Operations research in insurance: A review. Insurance: Mathematics and Economics, 19(2), 154.
Clifton, C., & Thuraisingham, B. (2001). Emerging standards for data mining. Computer Standards & Interfaces, 23, 187–193.
Donato, J. M., Schryver, J. C., Hinkel, G. C., Schmoyer, R. L., Leuze, M. R., & Grandy, N. W. (1999). Mining multi-dimensional data for decision support. Future Generation Computer Systems, 15(3), 433–441.
Fayyad, U., & Stolorz, P. (1997). Data mining and KDD: Promise and challenge. Future Generation Computer Systems, 13(2–3), 99–115.
Feelders, A., Daniels, M., & Holsheimer, M. (2000). Methodological and practical aspects of data mining. Information & Management, 37, 271–281.
Giraud-Carrier, G., & Povel, O. (2003). Characterising data mining software. Intelligent Data Analysis, 7(3), 181–192.
Han, J., & Fu, Y. (1999). Mining multiple-level association rules in large databases. IEEE Transactions on Knowledge and Data Engineering, 11(5), 798–805.
Hirota, K., & Pedrycz, W. (1999). Fuzzy computing for data mining. Proceedings of the IEEE, 87(9), 1575–1600.
Hui, S. C., & Jha, G. (2000). Data mining for customer service support. Information & Management, 38(1), 1–13.
Kohavi, R., & Provost, F. (2001). Data mining and knowledge engineering. 5(1/2).
Mak, B., & Munakata, T. (2002). Rule extraction from expert heuristics: A comparative study of rough sets with neural networks and ID3. European Journal of Operational Research, 136(1), 212–229.
Murthy, S. K. (1998). Automatic construction of decision trees from data: A multi-disciplinary survey. Data Mining and Knowledge Discovery, 2, 345–389.
Ohmann, C., Moustakis, V., Yang, Q., & Lang, K. (1996). Evaluation of automatic knowledge acquisition techniques in the diagnosis of acute abdominal pain. Artificial Intelligence in Medicine, 8(1), 23–36.
Pitta, D. (1998). Marketing one-to-one and its dependence on knowledge discovery in databases. Journal of Consumer Marketing, 15(5), 468–480.
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1, 81–106.
Sestito, S., & Dillon, T. S. (1994). Automated knowledge acquisition. New York: Prentice Hall.
Shapiro, A. (2004). Fuzzy logic in insurance. Insurance: Mathematics and Economics, 19(2), 399–424.
Shim, J. P., Warkentin, M., Courtney, J. F., Power, D., Sharda, R., & Carlsson, C. (2002). Past, present, and future of decision support technology. Decision Support Systems, 33, 111–126.
Stark, K. D. C., & Pfeiffer, D. U. (1999). The application of non-parametric techniques to solve classification problems in complex data sets in veterinary epidemiology: An example. Intelligent Data Analysis, 3(1), 23–35.
Sung, H. H., & Sang, C. P. (1998). Application of data mining tools to hotel data mart on the intranet for database marketing. Expert Systems With Applications, 15(1), 1–31.
Wu, C. H. (2003a). Data mining applied to material acquisition budget allocation for libraries: Design and development. Expert Systems With Applications, 25(3), 401–411.
Wu, C. H. (2003b). On the granulation simplicity for the decision rule discovery in databases: EWI vs. EFI. International Journal of Science and Technology, 14, 28–36.
Wu, X., & Urpani, D. (1999). Induction by attribute elimination. IEEE Transactions on Knowledge and Data Engineering, 11, 805–812.