Electronic Commerce Research and Applications 6 (2007) 19–28 www.elsevier.com/locate/ecra
Development of an automatic customer service system on the internet
Judy C.R. Tseng a,1, Gwo-Jen Hwang b,*
a Department of Computer Science and Information Engineering, Chung Hua University, Hsinchu, 300, Taiwan, ROC
b Department of Information and Learning Technology, National University of Tainan, 33, Sec. 2, Shulin St., Tainan city 70005, Taiwan, ROC
Received 25 November 2004; received in revised form 5 August 2005; accepted 21 April 2006 Available online 12 September 2006
Abstract

Most existing network-based customer services rely heavily on manpower to reply to e-mails or on-line requests from customers, which not only increases the service cost but also delays the response to service requests. To cope with these problems, this paper proposes a customer service system that automatically handles customer requests by analyzing the contents of the requests and finding the most feasible answers in a frequently asked question (FAQ) database. When a customer is not satisfied with the reply, the system forwards the request to the appropriate service personnel for further processing. An assistance mechanism has been developed to help the service personnel find potential answers in the existing FAQ data or create more appropriate answers. Experimental results on practical applications showed that over 87.3% of users were satisfied with the replies given by the system; therefore, we conclude that the system can significantly reduce the service cost and provide more efficient and effective customer service.
© 2006 Elsevier B.V. All rights reserved.

Keywords: Customer relationship management; Customer service system; Document matching; Call center; Internet applications
* Corresponding author. Tel.: 886 915396558; fax: 886 6 2606132.
E-mail addresses: [email protected] (J.C.R. Tseng), gjhwang@mail.nutn.edu.tw (G.-J. Hwang).
1 Tel.: +886 915396565.
1567-4223/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.elerap.2006.04.009

1. Introduction

Researchers have shown that enterprises can increase their profits by over 60% if they build strong relationships with customers [15]. Nevertheless, investigations also indicate that most businesses lose about 25% of their customers per year on average [14]. Moreover, the cost of finding a new customer is five times that of keeping an existing one. Therefore, one of the most important issues in increasing the competitive advantage of an enterprise is the development of new mechanisms for providing good service to customers. Customer relationship management (CRM) is an integrated solution designed to reduce costs and increase profitability by solidifying customer loyalty. A successful CRM includes four key elements: consumer communication management, customer relationship management, decision support management, and integration of financial or logistic systems.

WorldTalk Corporation estimates that over 60 million people prefer to handle their work by e-mail. An investigation by the Department of Industrial Technique in Taiwan also observed that 96% of companies have employed e-mail systems to serve their customers. E-mail has thus become an important communication tool for most people. However, an investigation in 1998 pointed out that enterprises gave no response to 65% of customer inquiries [17]. The report also indicated that 63% of the customers who did receive a response waited 5 days or longer for it. These observations suggest that an efficient system for dealing with customer requests would be valuable.

In recent years, e-mail based customer service systems have been discussed and presented in several studies [3,5]. In most existing systems, automatic e-mail reply
J.C.R. Tseng, G.-J. Hwang / Electronic Commerce Research and Applications 6 (2007) 19–28
functions and customer request classification functions are not taken into consideration. Li and Tseng [13] proposed a system that can automatically handle customer requests by analyzing the request contents; however, the domain expert must define the weight of each keyword in advance, which is time-consuming and can decrease the accuracy of the analysis results.

To cope with these problems, this study proposes a new system, the automatic customer service system (ACSS), which can automatically reply to customer requests by invoking a knowledge base containing a set of frequently asked questions (FAQ's) and the corresponding character vector (CV) of each FAQ. On receiving a request, ACSS generates a CV for the request and compares it with those of the FAQ's by employing the vector space concept to find the most feasible answer for the customer. If no feasible answer can be found, ACSS forwards the inquiry to the appropriate service staff. Once the service personnel have provided an answer to the request, the new request and its answer are recorded in the FAQ database, so that the system can reply to similar requests automatically.

2. Relevant works

Since 1985, enterprises have been building call centers to provide customer services. Most call centers serve consumers only during regular office hours. Moreover, the service personnel must complete a series of training courses before they can offer appropriate customer service, which implies substantial training costs. Such traditional service systems are therefore not only inefficient and ineffective, but also costly, and the development of automatic customer service systems has become an important issue for enterprises. Witt et al.
[22] examined the relationship of the interaction between emotional exhaustion and conscientiousness with objectively measured call volume performance and subjectively measured service quality ratings among 92 call center customer service representatives of a financial service institution. They found interactive effects on call volume but not on service quality; therefore, it might be a good idea to develop automatic customer service systems. A call center and its associated information technology (IT) provide an opportunity to redesign and improve service-delivery operations [1].

On-line FAQ databases are frequently adopted in traditional customer service systems, especially in the Internet environment [2]. Many technical companies not only employ on-line call centers to improve service quality, but also provide relevant techniques and software to assist other companies in maintaining customer relationships; examples include the call center systems of Ticali, Dell and IBM. Although providing an FAQ database can reduce the service cost, web site users need to select the likely category, link to the database and search for the answers manually,
which is usually time-consuming, and the customers are likely to reach the limits of their patience, especially when the system load is heavy or the network is burdened with a large number of requests.

To identify customer requests more accurately, researchers have proposed several methods in recent years. For example, Hoch [8] presented the statistical information retrieval methods used in INFOCLAS, which is capable of classifying printed business letters according to message types such as order, offer and enclosure. In 1996, Cohen [5] proposed the "key word spotting rule" approach, which can efficiently classify e-mails and has been applied to the development of e-mail management systems. In the meantime, Cooper [6] reported on FAQfinder, which employs a set of weighted parameters to determine the similarity between the customer's request and the FAQ. Later, Li and Tseng [13] proposed an intelligent network-based customer service system (INCSS), which assigns a weighted keyword set to each FAQ, and then compares the keyword set of the customer's request with that of each FAQ to find the most feasible answer. INCSS can automatically reply to the requests submitted by customers; however, the weighted keyword set of each FAQ needs to be assigned manually, which is time-consuming.

Other relevant studies address keyword retrieval and sentence similarity comparison. There are several ways to retrieve keywords, e.g., the term extraction method [17], the phrase extraction method [11], and the statistical analysis method [9]. The term extraction method detects important terms in classified input text; the phrase extraction method detects phrases in the text; the statistical analysis method identifies possible keywords in a large amount of unclassified input text by counting the occurrences of each keyword.

3. Structure of the automatic customer service system

Fig. 1 presents the structure of the automatic customer service system (ACSS), which consists of four databases and six modules. The Keyword Database maintains a set of domain-relevant keywords. Once a customer has submitted a question, the Question Identification Module tries to identify the features of the question by checking whether any of its words is in the Keyword Database. The matched keywords of each question form a characteristic vector (CV), which represents the features of the customer's question and is then used to find the best-fit answer by applying the similarity comparison algorithm addressed in the next section. The FAQ's and the corresponding answers are kept in the FAQ Database. The workflow of ACSS is as follows:

(1) Receive a user's e-mail.
(2) Invoke the Question Identification Module to decompose the e-mail into terms defined in the Keyword Database. Accordingly, the question submitted by the customer is transformed into a CV.
Fig. 1. Structure of the automatic customer service system. [Diagram not reproduced. Components shown: Question Identification Module, Answer Judgment Module, Answer Picking-up Module, Satisfaction Analytical Module, Service Personnel Distribution Module and Service Personnel Assistance Module; Keyword Database, FAQ Database and Service Personnel Database. Flows between the User and the system are labeled Request, Characteristic Vector, Answer and Reply.]
(3) Invoke the Answer Judgment Module and the Answer Picking-up Module to select the best answer from the FAQ Database by matching against the Characteristic Database, and reply to the user with that answer.
(4) If the customer is not satisfied with the answer, the system assigns the inquiry to the appropriate service staff. An assistance system is also provided to help the service staff find a suitable answer in the FAQ Database.

4. Information retrieval algorithms of ACSS

Owing to the growing popularity of computer and network technologies, information retrieval has become a widely discussed issue that has attracted the attention of researchers all over the world in the past decades. Spink [19] studied the retrieval effectiveness of search terms identified by users and intermediaries from retrieved items during term relevance feedback, and found that terms extracted from the users' question statements were the most effective. Later, Savoy [18] proposed a vector-processing scheme for searching information in hypertext systems. Vakkari [21] analyzed certain features of work tasks and related these features to the types of information people look for and use in their tasks, extracting patterns of search strategies for obtaining information and relevance assessments in choosing retrieved documents. Recently, Kerenidis and de Wolf [10] presented an algorithm to cope with the Private Information Retrieval problem. Moreover, Hansen and Jarvelin [7] presented empirical results showing that the patent task performance process involves highly collaborative aspects throughout the stages of the information seeking and retrieval process. They also showed that these activities may be categorized and related to different stages in an information seeking and retrieval process.

In this section, we present the information retrieval algorithms of ACSS, which are based on notions from previous investigations, such as vector processing and the frequency of occurrence of query terms.

4.1. Request analysis algorithm

Among the various issues concerning information retrieval, the central problem is ranking documents according to their relevance to a query [12]. Croft [4] suggested that significant improvements in retrieval performance will require techniques that, in some sense, "understand" the content of documents and queries in order to infer their probable relationships. In this view, information retrieval is an evidential reasoning process in which we estimate the probability that a user's information need is met given a document as "evidence" [20].
TF·IDF is a well-known scheme for ranking N documents according to their relevance to a query containing M query terms [16]. Researchers have attempted to extend the scheme to extract additional information from hypertext links to enhance retrieval effectiveness in the World Wide Web environment [18,23]. The original TF·IDF formula is given as follows:

$$R_{i,q} = \sum_{term_j \in q} \left( 0.5 + 0.5\,\frac{TF_{i,j}}{TF_{i,max}} \right) IDF_j \quad \text{and} \quad IDF_j = \log\left( N \Big/ \sum_{i=1}^{N} C_{i,j} \right)$$

where R_{i,q} is the rating of document D_i with respect to query q, D_i represents the i-th document for 1 ≤ i ≤ N, Q_j represents the j-th query term in q for 1 ≤ j ≤ M, TF_{i,j} represents the number of occurrences of Q_j in D_i, TF_{i,max} represents the maximum number of occurrences of the key terms in D_i, and C_{i,j} = 1 if D_i contains Q_j; C_{i,j} = 0 otherwise.

Compared with World Wide Web or library documents, FAQ documents contain few words on average, so a complex formula such as TF·IDF is not necessary for rating them. Therefore, we simplify the TF·IDF formula so that accurate ratings can still be obtained while the efficiency of document retrieval is improved. As TF·IDF takes into consideration both the frequency of a keyword in a specified document and its frequency across all documents, for a set of questions Q = {Q_1, Q_2, ..., Q_m} we use a character vector (CV) to represent question Q_i as follows:

$$W_{ij} = \frac{f_{ij}}{\sum_{k=1}^{m} f_{kj}} \qquad (1)$$

$$CV(Q_i) = \{(K_1, W_{i1}), (K_2, W_{i2}), \ldots, (K_j, W_{ij}), \ldots, (K_n, W_{in})\} \qquad (2)$$

where K_j represents the j-th keyword, f_{ij} is the number of occurrences of K_j in question Q_i, and \sum_{k=1}^{m} f_{kj} is the total number of occurrences of K_j in the FAQ database.

Assume that there are four FAQ's in the database. Table 1 shows the question part of the FAQ database and Table 2 shows the answer part. Table 3 lists the keyword set and the number of occurrences of each keyword.

Table 1
Question part of the FAQ's

| Serial number | Content of the questions |
|---|---|
| Q1 | What is a business recovery plan, and what is its purpose? |
| Q2 | When considering the purchase of ready-made software, what is the purpose of the RFI (request for information), and what is the purpose of the RFP (request for proposal)? What should the response to the RFP include? |
| Q3 | What are the goals of information security measures? |
| Q4 | What are the risks involved in purchasing ready-made software? |

Table 2
Answer part of the FAQ's

| Serial number | Answers to the questions in Table 1 |
|---|---|
| Q1 | A disaster recovery plan is a document that details what should be done in an organization, and by whom, if critical ISs go down, or if IS operations become untrustworthy. The plan includes a list of preventive measures as well as procedures to implement in the event of a disaster, and should minimize the number of decisions that must be made following the disaster. It is set up to address the worst-case scenario, but should permit parts of the plan to be executed when less severe disruptions occur. It addresses hardware, systems software, applications software, and communications. |
| Q2 | The RFI is a document prepared by the organization interested in purchasing the software. The document is sent to vendors. It requests general, somewhat informal, information about the software package. The RFP is also a document prepared by the project team of the ordering organization. It specifies all the system requirements and solicits a proposal from each vendor contacted. The response to the RFP should include technical requirements and a detailed description of the implementation process as well as a timeline and budget that can be easily transformed into a contractual agreement. |
| Q3 | The goals of information security measures are: (1) to lower the risk that systems and organizations may cease operating; (2) to maintain information confidentiality; (3) to ensure the integrity and reliability of data resources; (4) to ensure the availability of data resources; and (5) to ensure compliance with national security laws and privacy laws. |
| Q4 | (1) Loose fit between needs and features. Ready-made software is developed for the widest common denominator of potential user organizations. It may not fit the needs of each individual organization or its culture. (2) Bankruptcy of the vendor. If the vendor goes out of business, the purchaser is left without support, maintenance service, and the opportunity to purchase upgrades for an application to which it is committed. Much of the investment of training users is lost, too, as the organization may need to adopt totally new software instead of only upgrades for software with which the employees already have much experience. (3) High turnover of vendor personnel. Turnover among IS professionals is significantly higher than in other occupations. If a significant number of employees involved in application development and upgrading leave the vendor, support for adopters is likely to deteriorate, and upgrades will be of poor quality. |

Table 3
Number of occurrences of each keyword

| ID | Keyword | Number of occurrences |
|---|---|---|
| K1 | Recovery plan | 1 |
| K2 | Purpose | 3 |
| K3 | Ready-made software | 2 |
| K4 | RFI | 1 |
| K5 | RFP | 2 |
| K6 | Goal | 1 |
| K7 | Information security measures | 1 |
| K8 | Risk | 1 |
| K9 | Purchase | 1 |
| K10 | Purchasing | 1 |

Each question in Table 1 can be represented as the following concept vectors:
$$CV(Q_1) = \left\{ \left(K_1, \tfrac{1}{1}\right), \left(K_2, \tfrac{1}{3}\right) \right\} \qquad (3)$$

$$CV(Q_2) = \left\{ \left(K_2, \tfrac{2}{3}\right), \left(K_3, \tfrac{1}{2}\right), \left(K_4, \tfrac{1}{1}\right), \left(K_5, \tfrac{2}{2}\right), \left(K_9, \tfrac{1}{1}\right) \right\} \qquad (4)$$

$$CV(Q_3) = \left\{ \left(K_6, \tfrac{1}{1}\right), \left(K_7, \tfrac{1}{1}\right) \right\} \qquad (5)$$

$$CV(Q_4) = \left\{ \left(K_3, \tfrac{1}{2}\right), \left(K_8, \tfrac{1}{1}\right), \left(K_{10}, \tfrac{1}{1}\right) \right\} \qquad (6)$$

Similarly, the CV of a customer's request Q can be expressed as follows:

$$CV(Q) = \{(K_1, W_1), (K_2, W_2), \ldots, (K_j, W_j), \ldots, (K_n, W_n)\} \qquad (7)$$

$$W_j = \frac{f_j}{\sum_{k=1}^{m} f_{kj}} \qquad (8)$$

where f_j is the number of occurrences of K_j in the request Q. Table 4 shows an illustrative customer request. The keywords and the corresponding weights of the request are given in Table 5.

Table 4
Illustrative example of a customer request

| Question |
|---|
| What are the risks involved in purchasing ready-made software? |

Table 5
Keywords and the corresponding weights of the illustrative example

| Keyword | Number of occurrences | W_j |
|---|---|---|
| Risk | 1 | 1/1 |
| Purchasing | 1 | 1/1 |
| Ready-made software | 1 | 1/2 |

From Table 5, the CV of the request is represented as follows:

$$CV(Q) = \left\{ \left(K_3, \tfrac{1}{2}\right), \left(K_8, \tfrac{1}{1}\right), \left(K_{10}, \tfrac{1}{1}\right) \right\} \qquad (9)$$

4.2. Similarity comparison algorithm

As mentioned previously, CV(Q) and CV(Q_i) are determined by the request analysis algorithm. Based on the concept of vector space, an algorithm is used to compare the similarity between the customer's request and the question part of each FAQ in the database. In our approach, the inner product and Euclidean distance methods are employed to compute the degree of similarity.

Inner product method

The inner product method multiplies the two vectors component by component and sums the products. The similarity between the customer's request Q and FAQ i is given as

$$D_i = \sum_{k=1}^{n} (W_k \times W_{ik}) \qquad (10)$$

For the example given above, the inner product method yields the following similarity values (keywords whose weight is zero in both vectors contribute nothing and are omitted):

$$
\begin{aligned}
D_1 &= (0 \times 1/1) + (0 \times 1/3) + (1/2 \times 0) + (1/1 \times 0) + (1/1 \times 0) = 0 \\
D_2 &= (0 \times 2/3) + (1/2 \times 1/2) + (0 \times 1/1) + (0 \times 2/2) + (0 \times 1/1) + (1/1 \times 0) + (1/1 \times 0) = 1/4 \\
D_3 &= (1/2 \times 0) + (0 \times 1/1) + (0 \times 1/1) + (1/1 \times 0) + (1/1 \times 0) = 0 \\
D_4 &= (1/2 \times 1/2) + (1/1 \times 1/1) + (1/1 \times 1/1) = 9/4
\end{aligned}
\qquad (11)
$$

In Eq. (11), D_4 is the maximum value, and hence the answer part of Q4 is sent to the customer as the reply.

Euclidean distance method

By applying the Euclidean distance method, the similarity between the customer's request Q and FAQ i is given as

$$D_i = \sqrt{\sum_{k=1}^{n} (W_k - W_{ik})^2} \qquad (12)$$

For the example given above, the Euclidean distances between the customer's request and the question part of each FAQ are as follows:

$$
\begin{aligned}
D_1 &= \sqrt{(0 - 1/1)^2 + (0 - 1/3)^2 + (1/2 - 0)^2 + (1/1 - 0)^2 + (1/1 - 0)^2} = \sqrt{121/36} \\
D_2 &= \sqrt{(0 - 2/3)^2 + (1/2 - 1/2)^2 + (0 - 1/1)^2 + (0 - 2/2)^2 + (0 - 1/1)^2 + (1/1 - 0)^2 + (1/1 - 0)^2} = \sqrt{49/9} \\
D_3 &= \sqrt{(1/2 - 0)^2 + (0 - 1/1)^2 + (0 - 1/1)^2 + (1/1 - 0)^2 + (1/1 - 0)^2} = \sqrt{17/4} \\
D_4 &= \sqrt{(1/2 - 1/2)^2 + (1/1 - 1/1)^2 + (1/1 - 1/1)^2} = 0
\end{aligned}
\qquad (13)
$$

In Eq. (13), D_4 is the minimum value, which implies that the customer's question Q is most similar to question Q4. This result is the same as that of the inner product method.
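The weighting and similarity steps of Sections 4.1 and 4.2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the dictionary layout are hypothetical, the keyword counts are transcribed from Table 3 and Eqs. (3)-(6), and keyword spotting in the raw request text is assumed to have been done already.

```python
from fractions import Fraction
from math import sqrt

# Keyword occurrences per FAQ question (transcribed from Table 3 and Eqs. (3)-(6)).
FAQ_COUNTS = {
    "Q1": {"K1": 1, "K2": 1},
    "Q2": {"K2": 2, "K3": 1, "K4": 1, "K5": 2, "K9": 1},
    "Q3": {"K6": 1, "K7": 1},
    "Q4": {"K3": 1, "K8": 1, "K10": 1},
}


def keyword_totals(faq_counts):
    """Total occurrences of each keyword over the whole FAQ database."""
    totals = {}
    for counts in faq_counts.values():
        for keyword, f in counts.items():
            totals[keyword] = totals.get(keyword, 0) + f
    return totals


def characteristic_vector(counts, totals):
    """Eqs. (1) and (8): occurrences in this text / occurrences in the FAQ database."""
    return {kw: Fraction(f, totals[kw]) for kw, f in counts.items() if kw in totals}


def inner_product(cv_a, cv_b):
    """Eq. (10): sum of component-wise products; missing keywords weigh 0."""
    return sum(w * cv_b.get(kw, 0) for kw, w in cv_a.items())


def euclidean_distance(cv_a, cv_b):
    """Eq. (12): distance over the union of keywords of both vectors."""
    keywords = set(cv_a) | set(cv_b)
    return sqrt(sum((cv_a.get(kw, 0) - cv_b.get(kw, 0)) ** 2 for kw in keywords))


def score_faqs(request_counts, faq_counts):
    """Return {faq_id: (inner product, Euclidean distance)} for one request."""
    totals = keyword_totals(faq_counts)
    request_cv = characteristic_vector(request_counts, totals)
    scores = {}
    for faq_id, counts in faq_counts.items():
        faq_cv = characteristic_vector(counts, totals)
        scores[faq_id] = (
            inner_product(request_cv, faq_cv),
            euclidean_distance(request_cv, faq_cv),
        )
    return scores


# The request of Table 4 contains K3, K8 and K10 once each.
request = {"K3": 1, "K8": 1, "K10": 1}
scores = score_faqs(request, FAQ_COUNTS)
best_by_inner_product = max(scores, key=lambda faq_id: scores[faq_id][0])
best_by_distance = min(scores, key=lambda faq_id: scores[faq_id][1])
print(best_by_inner_product, best_by_distance)  # prints "Q4 Q4"
```

Exact fractions are used for the weights so that the results can be compared directly with the hand computation; for the request of Table 4, both measures select Q4.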
Fig. 2. ACSS user interfaces for maintaining the keyword database.
4.3. Service request assignment algorithm

Assuming that each member of the service personnel is responsible for a particular category of requests, a character vector CV(P_g) is used to describe the requests handled by service personnel g. In ACSS, we have

$$CV(P_g) = \{(K_1, W_{g1}), (K_2, W_{g2}), \ldots, (K_j, W_{gj}), \ldots, (K_n, W_{gn})\} \qquad (14)$$

where K_j represents the j-th keyword in one particular group and W_{gj} is the relevance of K_j to service personnel g. By applying the Euclidean distance method, the relevance of the customer's request Q to service personnel g is given as

$$D_g = \sqrt{\sum_{k=1}^{n} (W_k - W_{gk})^2} \qquad (15)$$
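A small Python sketch of how Eq. (15) could drive the assignment is given here. It rests on assumptions that the text does not state explicitly: the most relevant person is taken to be the one with the smallest distance D_g (by analogy with Section 4.2), and the staff identifiers, weights and the function name `assign_request` are hypothetical values chosen only for illustration.

```python
from math import sqrt


def euclidean_distance(cv_a, cv_b):
    """Eq. (15): distance over the union of keywords; missing keywords weigh 0."""
    keywords = set(cv_a) | set(cv_b)
    return sqrt(sum((cv_a.get(kw, 0.0) - cv_b.get(kw, 0.0)) ** 2 for kw in keywords))


def assign_request(request_cv, personnel_cvs):
    """Return (personnel_id, D_g) for the profile closest to the request.

    Assumption: the smallest distance marks the most relevant service personnel.
    """
    if not personnel_cvs:
        raise ValueError("no service personnel registered")
    return min(
        ((pid, euclidean_distance(request_cv, cv)) for pid, cv in personnel_cvs.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical staff profiles CV(P_g); the weights are made up for illustration.
personnel = {
    "staff_security": {"K6": 0.9, "K7": 1.0},
    "staff_procurement": {"K3": 0.8, "K8": 0.7, "K9": 0.9, "K10": 0.9},
}

# CV(Q) of the illustrative request, Eq. (9).
request_cv = {"K3": 0.5, "K8": 1.0, "K10": 1.0}

assigned_to, distance = assign_request(request_cv, personnel)
print(assigned_to)  # prints "staff_procurement"
```

In this made-up example the request about purchasing risks lands with the procurement profile, whose keywords overlap with the request, rather than with the security profile, which shares none of them.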
Based on this computation, each request awaiting a reply can be assigned to the most relevant service personnel.

5. Implementation and evaluation

ACSS was implemented on a PC server with an Intel Pentium 4 1.6 GHz CPU and 512 MB RAM. The development environment consisted of the Microsoft Windows 2000 operating system, the Apache 1.3 web server, the MySQL 3.23 database management system, and the J2SDK 1.4 Java development kit.

Fig. 2 shows the ACSS user interface for maintaining the keyword database. Fig. 3 shows the interface for inserting, updating and deleting service personnel information, including the keywords and corresponding weights that describe the experience and knowledge of the service personnel. Fig. 4 shows the interface for editing the FAQ database. Once customer requests have been assigned to the service personnel, new questions and the corresponding answers are entered into the database through this interface.

Several experiments have been conducted to evaluate the performance of ACSS by comparing its results with those of previous approaches. Two FAQ databases in different languages were adopted in the experiments: the Yahoo!-Kimo database contained 578 FAQ's and 953 keywords in Chinese, and the MiTAC International Corp database contained 569 FAQ's and 808 keywords in English. Two versions of the data were generated from each FAQ database: the data classified by six interrogatives, that is, "who", "what", "where", "when", "why" and "how", and the original data without classification by the six interrogatives. Since there are several Chinese phrases that can be used to represent each of the six
Fig. 3. Interface for maintaining service personnel information.
interrogatives, an English-to-Chinese mapping table (see Table 6) is used to deal with Chinese requests and FAQ's. In addition, three types of requests (as shown in Table 7) are used to evaluate the performance of the different approaches: requests that are similar to ones asked before, requests that are quite different from any request asked before, and a mix of the two types.

5.1. Experiments to find the best-fit answer

Tables 8 and 9 present the experimental results of applying ACSS with the Euclidean distance method (ACSSED), ACSS with the inner product method (ACSSIP) and INCSS, respectively, to reply to 550 requests from customers. The correct reply ratio of ACSS was higher than that of INCSS no matter which similarity comparison method was adopted. The experimental results also reveal that ACSS is able to handle questions that have not been asked before.

5.2. Experiments for finding five best-fit answers

Tables 10 and 11 present the comparative results of ACSS and INCSS in retrieving the "Top 5" possible answers. The customer service personnel can choose one of these possible answers and reply directly. From Tables 10 and 11, it is observed that the correctness of ACSSED or ACSSIP is about 2%–5% higher than that of INCSS. In addition, the comparative results show that, for each system, the correctness for the original data and for the classified data is close. On average, ACSSIP provides the best performance of the three systems in retrieving the top-five possible answers, while ACSSED provides the best performance in retrieving the single best answer.

5.3. Limitations of ACSS

From the experimental results, some factors that might influence the finding of correct FAQ's are observed and listed as follows:

(1) Keywords located at different positions in a sentence might reverse the meaning of a request. For example, the sentences "How to copy contact information from Yahoo! Address Book to Microsoft Outlook Express?" and "How to copy contact information from Microsoft Outlook Express to Yahoo! Address Book?" contain the same keywords "Address Book", "Yahoo" and "Microsoft Outlook Express"; however, the meanings of these two questions are opposite. Therefore, if the customer is not satisfied with the answer, the system will try another answer with the same weight.
Fig. 4. ACSS user interfaces for maintaining the FAQ database.

Table 6
English-to-Chinese mapping table

| English word | Chinese phrase |
|---|---|
| What | [missing in source] |
| Where | [missing in source] |
| Who | [missing in source] |
| Why | [missing in source] |
| When | [missing in source] |
| How | [missing in source] |

Table 7
Experiments with different types of requests

| Number | Content of the experiment |
|---|---|
| Case 1 | All of the customer requests are similar to the question part of some FAQ's (i.e., such questions have been asked before). |
| Case 2 | All of the customer requests are quite different to the question part of any FAQ (i.e., such questions have not been asked before); however, the answers of the requests exist in the FAQ database. |
| Case 3 | Customer requests are randomly selected from Case 1 and Case 2. |
(2) Customer requests are not clear or contain too few keywords. For example, the system might have difficulty finding a good answer to the question "What is My Yahoo!?", since the single keyword "Yahoo" does not provide sufficient information.

(3) Customer requests contain too many keywords that are not relevant to the correct answer. For example, the user wants to know "How to find the information about on-line game?"; however, the question submitted is "I want to play Yahoo! Game, the on-line help show me to the system announcement of Yahoo! Game. But I cannot find the hyperlink. Please tell me how can I find the system announcement?" The frequently appearing keywords in such a request (underlined in the original article) might confuse the system in finding the correct answer.

In addition to these factors, poor formulation of the question by the asker and incorrect answers in the FAQ database may significantly affect the performance of the systems. As the Mitac and Yahoo!KIMO FAQ databases used in our experiments are already well tuned, we only needed to pay attention to the search for the best-fit answers. However, in dealing with a newly created FAQ database, validating the answers given in the database will be an important and challenging issue.

Table 8
Experimental results with Yahoo!-Kimo (Chinese) FAQ database

| | Original data: ACSSED (%) | Original data: ACSSIP (%) | Original data: INCSS (%) | Data classified by six interrogatives: ACSSED (%) | Data classified by six interrogatives: ACSSIP (%) | Data classified by six interrogatives: INCSS (%) |
|---|---|---|---|---|---|---|
| Case 1 | 97.2 | 94.4 | 77.7 | 95.1 | 94.4 | 84.7 |
| Case 2 | 81.1 | 80.8 | 62.5 | 78.8 | 81.6 | 59.6 |
| Case 3 | 88.8 | 88.1 | 70.5 | 88.2 | 87.4 | 70.3 |
| Average | 89.3 | 87.7 | 70.2 | 87.3 | 87.8 | 71.5 |

Table 9
Experimental results with Mitac (English) FAQ database

| | Original data: ACSSED (%) | Original data: ACSSIP (%) | Original data: INCSS (%) | Data classified by six interrogatives: ACSSED (%) | Data classified by six interrogatives: ACSSIP (%) | Data classified by six interrogatives: INCSS (%) |
|---|---|---|---|---|---|---|
| Case 1 | 98.1 | 92.9 | 73.2 | 94.7 | 92.1 | 80.4 |
| Case 2 | 92.9 | 90.5 | 66.3 | 84.4 | 82.9 | 66.0 |
| Case 3 | 95.7 | 91.8 | 69.3 | 89.0 | 86.7 | 72.8 |
| Average | 95.5 | 91.7 | 69.6 | 89.3 | 83.9 | 73.0 |

Table 10
Experiments for finding five best-fit answers from Yahoo!KIMO FAQ database

| | Original data: ACSSED (%) | Original data: ACSSIP (%) | Original data: INCSS (%) | Data classified by six interrogatives: ACSSED (%) | Data classified by six interrogatives: ACSSIP (%) | Data classified by six interrogatives: INCSS (%) |
|---|---|---|---|---|---|---|
| Case 1 | 99.3 | 98.5 | 96.3 | 98.5 | 98.5 | 96.8 |
| Case 2 | 85.2 | 95.7 | 81.0 | 81.3 | 91.0 | 75.9 |
| Case 3 | 92.4 | 97.1 | 88.3 | 90.3 | 94.9 | 84.5 |
| Average | 92.3 | 96.1 | 88.5 | 90.0 | 94.8 | 85.7 |

Table 11
Experiments for finding five best-fit answers from Mitac FAQ database

| | Original data: ACSSED (%) | Original data: ACSSIP (%) | Original data: INCSS (%) | Data classified by six interrogatives: ACSSED (%) | Data classified by six interrogatives: ACSSIP (%) | Data classified by six interrogatives: INCSS (%) |
|---|---|---|---|---|---|---|
| Case 1 | 99.0 | 98.1 | 93.5 | 95.8 | 96.3 | 94.6 |
| Case 2 | 95.8 | 96.6 | 90.2 | 87.9 | 86.1 | 85.0 |
| Case 3 | 97.3 | 97.0 | 91.3 | 91.1 | 91.6 | 89.1 |
| Average | 97.3 | 97.2 | 91.6 | 91.6 | 91.3 | 89.5 |

6. Conclusions

This study proposed an automatic customer service system (ACSS) that can automatically reply to customer requests by selecting the most feasible answers from the FAQ database. If a customer is not satisfied with the answer
given by ACSS, a service request assignment algorithm is invoked to assign the inquiry to the most relevant service staff, and the new answers given by the personnel are recorded for further use. The experimental results show that ACSS can provide high-quality service efficiently, and hence the workload of the service personnel can be significantly relieved. Meanwhile, the 24-h service provided by ACSS can also shorten customer waiting time, which not only reduces the service cost but also increases the competitive advantage of enterprises.

The idea of this research can also be applied to other applications, such as the development of an intelligent BBS system that finds past relevant discussions for a newly proposed question to avoid duplicated discussion, and the development of an intelligent tutoring assistance system that solves students' learning problems and eases the burden on teachers.

Acknowledgements

The authors thank Mr. Yi-Shiang Huang for his assistance in implementing the system and conducting the experiments. This study is supported in part by the National Science Council of the Republic of China under contract number NSC-94-2524-S-024-001.

References

[1] M. Adria, S.D. Chowdhury, Centralization as a design consideration for the management of call centers, Information and Management 41 (2004) 497–507.
[2] G. Burnett, L. Bonnici, Beyond the FAQ: explicit and implicit norms in usenet newsgroups, Library and Information Science Research 25 (2003) 333–351.
[3] Y.M. Chang, G.J. Hwang, Development of an Adaptive Online Customer Service System, National Computer Symposium, Taiwan, 2001.
[4] W.B. Croft, Approaches to intelligent information retrieval, Information Processing and Management 23 (4) (1987) 95–110.
[5] W. Cohen, Learning Rules That Classify E-mail, AAAI Spring Symposium on Machine Learning in Information Access, 1996.
[6] E. Cooper, Improving FAQfinder's Performance: Setting Parameters by Genetic Programming, AAAI Spring Symposium on MLIA Technical Papers, 1996.
[7] P. Hansen, K. Jarvelin, Collaborative information retrieval in an information-intensive domain, Information Processing and Management 41 (2005) 1101–1119.
[8] R. Hoch, Using IR techniques for text classification in document analysis, in: Seventeenth ACM International Conference on Research and Development in Information Retrieval, 1994, pp. 31–40.
[9] L.P. Jones, E.W. Gassie, S. Radhakrishnan, INDEX: the statistical basis for an automatic conceptual phrase-indexing system, Journal of American Society for Information Science 41 (2) (1990) 87–98.
[10] I. Kerenidis, R. de Wolf, Quantum symmetrically-private information retrieval, Information Processing Letters 90 (2004) 109–114.
[11] B. Krulwich, Learning Document Category Descriptions through the Extraction of Semantically Significant Phrase, Workshop on Data Engineering for Inductive Learning, IJCAI-1995, Montreal, Canada, August 20, 1995.
[12] L.S. Larkey, M.E. Connell, Structured queries, language modeling, and relevance modeling in cross-language information retrieval, Information Processing and Management 41 (2005) 457–473.
[13] M.Y. Li, Judy C.R. Tseng, Development of an Intelligent Network-based Customer Service System, 2001 Taiwan Academic Network Conference (TANET2001), October 24–26, 2001.
[14] D. Peppers, M. Rogers, Building Relationships One Customer at a Time, The One to One Future, New York, 1993.
[15] F.F. Reichheld, W.E. Sasser, Zero defections: quality comes to service, Harvard Business Review (1990) 105–111.
[16] G. Salton, M. McGill, Introduction to Modern Information Retrieval, McGraw-Hill, New York, NY, 1983.
[17] G. Salton, C. Buckley, Term weighting approaches in automatic information retrieval, Journal of Information Processing and Management 24 (3) (1998) 513–524.
[18] J. Savoy, An extended vector-processing scheme for searching information in hypertext systems, Information Processing and Management 32 (2) (1996) 155–170.
[19] A. Spink, Term relevance feedback and mediated database search: implications for information retrieval practice and system design, Information Processing and Management 31 (2) (1995) 161–171. [20] I. Syu, S.D. Lang, Adapting a diagnostic problem-solving model to information retrieval, Information Processing and Management 36 (2000) 313–330. [21] P. Vakkari, Task complexity, problem structure and information actions Integrating studies on information seeking and retrieval, Information Processing and Management 35 (1999) 819–837. [22] L.A. Witt, M.C. Andrews, D.S. Carlson, When conscientiousness is not enough: emotional exhaustion and performance among call center customer service representatives, Journal of Management 30 (1) (2004) 149–160. [23] B. Yuwono, D.L. Lee, Wise: a world wide web resource database system, IEEE Transactions on Knowledge and Data Engineering 8 (4) (1996) 548–554.