Copyright 2001 IEEE. Published in the Proceedings of the Hawai’i International Conference On System Sciences, January 3-6, 2001, Maui, Hawaii.
Collection and Exploitation of Expert Knowledge in Web Assistant Systems
Johan Aberg and Nahid Shahmehri
Department of Computer and Information Science, Linköping University, Sweden
E-mail: {johab,nahsh}@ida.liu.se
Abstract
Recent research and commercial developments have highlighted the importance of human involvement in user support for web information systems. In our earlier work a web assistant system has been introduced, which is a hybrid support system with human web assistants and computer-based support. An important issue with web assistant systems is how to make optimal use of these support resources. We use a knowledge management approach with frequently asked questions for a question-answering system that acts as a question filter for the human assistants. Knowledge is continuously collected from the assistants and exploited to augment the question-answering capabilities. Our system has been deployed and evaluated by an analysis of conversation logs and questionnaires for users and assistants. The results show that our approach is feasible and useful. Lessons learned are summarised in a set of recommendations.
1. Introduction Previous research and recent commercial developments have highlighted the importance of so-called web assistant systems [1]. A web assistant system is a user support component that can be added to any kind of web information system (WIS) [16]. Such WISs include systems for electronic commerce [1], digital libraries [23], and home banking systems [19]. Web assistant systems have a clear potential for increasing users' trust in a WIS as well as giving the WIS a more personal atmosphere and a human touch. Trust is an important aspect of a WIS (e.g. [15], [25]), especially for WISs where user decisions can have large consequences for the user. During our earlier evaluation, subjects showed great enthusiasm for web assistant support and the special human touch that the assistants give [1]. A web assistant system is a hybrid support system featuring both computer-based and human-based support. Human web assistants provide real-time support for WIS users via some kind of interface (e.g. text chat or voice chat [30]). The computer-based support can be any kind of support executed electronically, such as a recommendation function [26] or a search function.
An important issue with web assistant systems is how to make optimal use of the resources for human-based and computer-based support through knowledge management. Some form of cooperation between the web assistants and the computer-based support is necessary to realise the full strengths of this kind of hybrid support system. There are many parameters to consider for this kind of resource allocation, including the following:
• User trust. Human assistants affect users' trust in a WIS in a very positive way. Computer-based support does not have the same advantage.
• User preferences. Some users prefer to have conversations with human assistants, while others prefer to consider information in peace and quiet.
• Resource cost. Human assistants can be more costly than computer-based support.
• Support efficiency. Human-based support may be the most efficient solution for some problems, while computer-based support is the better alternative in other cases.
The importance of a parameter is highly dependent on the application domain of the WIS. For example, for a WIS for banking or insurance, user trust and user preferences can be crucial while the resource cost might be of less importance. On the other hand, for a traditional information-providing site from a non-profit organisation, resource costs and support efficiency may be the most important parameters. In Figure 1 we illustrate the resource allocation problem for web assistant systems. A user asks a question to the system and expects to receive an answer. The job of the support router is to decide the best way to answer the question, in other words, how to make the most efficient use of the human-based and the computer-based support in order to construct the best possible answer for the user. One approach to support routing is to go through a web assistant first. The router sends the question to an available assistant. The assistant is then responsible for finding an answer and can make use of the computer-based support for this purpose. Another approach is to go through the computer-based support first. If a proper answer to the question cannot be achieved this way, the question is passed on to a web assistant.
In this case, the computer-based support works as a filter that handles routine questions, while difficult questions are delegated to the human web assistants. Yet another approach would be to divide the question into a set of sub-problems that are delegated to the computer-based support or the human-based support as appropriate. An answer is then constructed in the end and returned to the user. Observe that communication between a support component and the user may be necessary in order to answer a question. This is the support component's responsibility and is not part of the routing.
Figure 1. Support routing for web assistant systems (the support routing component mediates between the user and the two support resources, human-based and computer-based support; a balance between the two and the flow of knowledge within the web assistant system are indicated)
We use a knowledge management approach in which knowledge is continuously collected from the web assistants and exploited by an automatic question-answering component that acts as a question filter for the assistants. The knowledge extracted from the assistants is represented as frequently asked questions (FAQs). Our approach to this kind of knowledge management is described in more detail in the next section. We have implemented a web assistant system extended with our question-answering component. When a user asks a question to the system, the question is first matched against the FAQ file. The top matching questions and answers (i.e. the top matching FAQ items) are then shown to the user as a first attempt to solve the problem. If the FAQ items do not help to solve the user's problem sufficiently, the user can choose to proceed and have a help conversation with a web assistant. If no FAQ item exceeds the matching threshold, the user is automatically connected to a web assistant. To test and evaluate our support routing approach we have deployed the extended web assistant system as a support component for an existing WIS. The WIS is in the art and literature domain. It has an international user community and serves around 14,500 daily visitor sessions. We recruited a number of volunteer assistants living in different time zones around the world, giving us almost 24 hours of daily human web-assistant support. The web assistant system was deployed for a period of three weeks. We evaluated the support routing approach by analysing the conversation logs and by employing questionnaires for users and assistants. The remainder of this paper is structured as follows. In section 2 we describe our approach to support routing. In section 3 our implemented system is described from a user's and an assistant's perspective. Section 4 describes our field study and in section 5 we present and discuss our evaluation results. Section 6 summarises lessons learned from our study and in section 7 we discuss related work. In section 8 we conclude and give some directions for future work.
2. Support routing The choice of a routing approach will depend on the parameter setting of the WIS at hand. The WIS of our field study is an information-providing site without commercial interest. For such a site the resource cost is of most importance, but support efficiency and user preferences are important as well. Thus, we take a filtering approach in an attempt to minimise the number of assistants needed. We exploit the web assistants' knowledge for an automatic question-answering component. The component works as a filter, which is extended as knowledge is continuously extracted from the assistants while they are helping users. As more relevant knowledge is extracted from the assistants and exploited, the capabilities of the question-answering system are expanded. This means that more and more of the workload is taken off the assistants. We illustrate this by the balance in Figure 1.
2.1 Overview Our approach to support routing is to filter user questions through a natural language question-answering system before propagating any questions to human web assistants. The task of automatic question answering requires domain knowledge. This domain knowledge needs to be collected and represented in a suitable way. Some form of reasoning is also required in order to generate answers for questions using the domain knowledge. We present and motivate our main design decisions along these lines below.
• For large domains it is infeasible to make an exhaustive collection of domain knowledge in an initial phase. We therefore chose to collect knowledge in an incremental manner. The collection is based on actual user needs, implying that only relevant knowledge will be collected. An alternative approach would be to collect some initial knowledge and then extend this knowledge incrementally based on system usage.
• Knowledge is represented as FAQ items (a question with an associated answer) tailored to an average user (a minimal sketch of such an item follows this list). An alternative to this is to represent knowledge using some knowledge representation mechanism (for example description logics [28]). This would facilitate reasoning about the knowledge. However, creating such a knowledge base for a large domain is a very complex and resource-demanding task [28]. Another alternative is to store all the knowledge in a textbook format or as a number of feature articles. However, the more comprehensive these books or articles are, the more difficult it would be to refer a user to the relevant parts. Our approach, on the other hand, presents the knowledge in short and concise pieces that users can quickly get an overview of. A more advanced approach could have several versions of each FAQ item tailored to different groups of users with different characteristics, known as stereotypes [27].
• The reasoning we perform on the represented knowledge is the similarity computation between represented FAQ items and user questions.
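To make the FAQ-item representation concrete, the following is a minimal sketch of the kind of record the system could store for each item. The Python form, the field names, and the example content are our own illustration and are not taken from the actual implementation.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class FAQItem:
    """One FAQ item as described above: a question with an associated answer,
    formulated for an average user, plus the topic category chosen by the assistant."""
    question: str
    answer: str
    topic: str                                          # one of the pre-defined topic categories
    keywords: List[str] = field(default_factory=list)   # filled in later by the indexer

# Hypothetical item an assistant might create after a help conversation.
item = FAQItem(
    question="How do I upload a new picture to my gallery?",
    answer="Log in, go to your member area and use the upload form there.",
    topic="Member functions at Elfwood",
)
print(item.topic)
```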
For this system we have chosen to collect knowledge from the assistants based on the knowledge they use to answer user questions. The knowledge is represented as questions and answers, in the style of FAQs. The assistants are responsible for constructing these FAQ items based on the help conversations they have with users. The formulation of the FAQs is not tailored to any user in particular, but rather towards an average system user. In Figure 2 we illustrate the tool used by the assistants to create new FAQ items. A number of pre-defined FAQ topic categories are available, and the assistant has to select the most appropriate topic for each new FAQ item being created. In Figure 3 we show the FAQ index file. The assistants are also responsible for checking the FAQ before creating a new item, in order to avoid duplicates. We employ information retrieval techniques to select the FAQ items that best match a user's question. These items are then returned to the user as possible answers to the question. This automatic question-answering system is placed as a preliminary step before a user can be connected to a web assistant. If none of the FAQ items
answer the user's question, the user can choose to be connected to an assistant instead. We will describe our implementation in the context of the web assistant system further in Section 3.
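The filtering flow just described can be summarised in a short routing sketch. This is our own illustration rather than the system's actual code; the 0.1 similarity threshold and the limit of ten returned items are taken from the implementation details given in section 2.2, and the match scores in the example are made up.

```python
from typing import List, Tuple

MATCH_THRESHOLD = 0.1   # similarity threshold used by the deployed system (section 2.2)
MAX_RESULTS = 10        # number of FAQ items shown to the user

def route_question(matches: List[Tuple[float, str]]) -> Tuple[str, List[Tuple[float, str]]]:
    """Decide how to route a question, given (similarity, FAQ item) pairs.

    Returns ("faq", ranked items) when there are FAQ items worth showing first,
    or ("assistant", []) when the user should be connected to a web assistant
    directly. In either case the user can still escalate to an assistant later."""
    ranked = sorted(matches, key=lambda pair: pair[0], reverse=True)
    good = [(score, item) for score, item in ranked if score > MATCH_THRESHOLD][:MAX_RESULTS]
    if not good:
        return "assistant", []
    return "faq", good

# Example with one match above and one below the threshold.
print(route_question([(0.35, "How do I upload art?"), (0.05, "How do I change my password?")]))
```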
Figure 2. Screen shot from the FAQ editor
Figure 3. Screen shot from the FAQ index file
2.2 Information retrieval architecture Our implementation consists of two main parts, namely an indexer and a question manager. The indexer is responsible for creating an internal representation of the FAQ items. The question manager is responsible for
creating an internal representation of users' questions and computing the best matching FAQ items. An overview of the question-answering system architecture is given in Figure 4.

Figure 4. Architecture of the question-answering system (a user client with a question-answering client and an assistant client with an FAQ editor communicate with a server; the server contains the question manager, the indexer, the FAQ items, and the internal FAQ representation; questions/answers and FAQ updates flow between the clients and the server)

Following the approach of the FAQ Finder system described in [8], we use the vector model (e.g. [4]) for representing FAQ items and questions. This way an indexed document is represented as a weighted keyword vector. The actual keywords used are the words that remain in the set of all FAQ items after stopword removal and stemming. We use the stopword algorithm in [10] and the stemming algorithm by Porter (according to the implementation in [11]).

For FAQ items, the weight for a keyword i in FAQ item j is calculated as

\[ w_{i,j} = \frac{freq_{i,j}}{\max_l freq_{l,j}} \times \log \frac{N}{n_i}, \]

where N is the total number of FAQ items in the system, n_i is the number of FAQ items in which the keyword i occurs, and freq_{i,j} is the frequency of keyword i in FAQ item j. The maximum in the formula is computed over all keywords in FAQ item j.

For user questions, the weight for a keyword i in a question q is calculated as

\[ w_{i,q} = \left(0.5 + \frac{0.5 \, freq_{i,q}}{\max_l freq_{l,q}}\right) \times \log \frac{N}{n_i}, \]

where freq_{i,q} is the frequency of keyword i in the question q.

The similarity between a user question q and an FAQ item j is calculated as

\[ sim(j,q) = \frac{\sum_{i=1}^{t} w_{i,j} \times w_{i,q}}{\sqrt{\sum_{i=1}^{t} w_{i,j}^2} \times \sqrt{\sum_{i=1}^{t} w_{i,q}^2}}. \]

The formulas for the weight and similarity calculations are all well known and have been proven to work well in general [4]. The ten FAQ items that are most similar to a user's question and have a similarity value over 0.1 are returned as the result of the question. A result page is generated with a ranked list of links to the best matching FAQ items. To make the question-answering system as fast as possible, all FAQ item vectors are kept in memory, together with the first 60 characters of each item's text, which are used for creating result pages. New FAQ items are indexed when the indexer is run every ten minutes.
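As a concrete illustration of the weighting and similarity formulas above, the following sketch implements them directly. It is our own code, not the original implementation; stopword removal, stemming and the in-memory index are omitted, and the FAQ contents are made up.

```python
import math
from collections import Counter
from typing import Dict, List

def faq_weights(faq_terms: List[List[str]]) -> List[Dict[str, float]]:
    """Compute w_ij for every keyword i in every FAQ item j (max-normalised tf times idf)."""
    N = len(faq_terms)
    df = Counter()                         # n_i: number of FAQ items containing keyword i
    for terms in faq_terms:
        df.update(set(terms))
    weights = []
    for terms in faq_terms:
        freq = Counter(terms)
        max_freq = max(freq.values())
        weights.append({t: (freq[t] / max_freq) * math.log(N / df[t]) for t in freq})
    return weights

def question_weights(q_terms: List[str], faq_terms: List[List[str]]) -> Dict[str, float]:
    """Compute w_iq using the augmented term frequency (0.5 + 0.5 * normalised tf)."""
    N = len(faq_terms)
    df = Counter()
    for terms in faq_terms:
        df.update(set(terms))
    freq = Counter(q_terms)
    max_freq = max(freq.values())
    return {t: (0.5 + 0.5 * freq[t] / max_freq) * math.log(N / df[t])
            for t in freq if df[t] > 0}   # ignore question keywords unseen in the FAQ

def similarity(wj: Dict[str, float], wq: Dict[str, float]) -> float:
    """sim(j, q): cosine of the angle between the two weight vectors."""
    num = sum(wj[t] * wq[t] for t in wj.keys() & wq.keys())
    den = (math.sqrt(sum(w * w for w in wj.values())) *
           math.sqrt(sum(w * w for w in wq.values())))
    return num / den if den else 0.0

# Tiny example: three already stemmed FAQ items and one user question.
faqs = [["upload", "picture", "gallery"], ["change", "password"], ["upload", "story"]]
wjs = faq_weights(faqs)
wq = question_weights(["upload", "picture"], faqs)
print([round(similarity(wj, wq), 3) for wj in wjs])
```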
2.3 Alternative approaches
The MURAX system [20] is a natural language question-answering system for general knowledge questions that have a definite answer represented by a noun phrase rather than a procedural answer. The system makes use of an on-line encyclopedia. The question-answering technique proceeds basically as follows. Keyword queries are constructed from the natural language question and passed to an information retrieval system for an on-line encyclopedia to find text that is relevant to the question. The retrieved text is analysed, and noun phrase hypotheses are extracted and further analysed. The best matching hypothesis is returned as an answer to the question. Quotes from the text where the noun phrase occurs are also returned to show the user why the system gave this particular answer. While this technique seems to work well for questions with noun-phrase answers, it does not work for questions with procedural answers. Thus, the technique could not be used in our system, where many questions require procedural answers. The system called GENIE [29] is an on-line help system that answers user questions through natural language generation techniques. The prototype system is designed for an e-mail application and uses a knowledge base consisting of the possible actions, plans and unique goals in the e-mail system. A user model is kept where the user's knowledge is stored in the form of the plans that the user knows. The help system attempts to identify the user's current goal based on the question and the user model, and then generates a piece of answer text. The system has been shown to perform well in the e-mail application domain. Still, the success of the technique depends on the completeness of the knowledge base. In our case it is infeasible to create a knowledge base covering all possible actions and plans, due to the very large task domain that we attempt to cover.
The INFOMOD system [12] is a knowledge-based e-mail help list moderator aiming to provide automated question answering. An e-mail that is sent to the help list is matched against a knowledge base of previously asked questions. Matching items are then returned to the user. If the items do not answer the user's question, the user can choose to send the question to all subscribers of the help list. Each item in the knowledge base has a predefined expression for question matching. The expression is a conjunction of disjunctions of keywords that have to be satisfied by the user question. While this technique allows for flexible matching, the creation of matching expressions requires some extra manual effort. In our implementation we wanted to limit the manual effort, and thus we chose an automatic matching technique instead. Question-answering techniques are not the only possibility for the kind of support filter that we want to achieve. Alternatives include diagnosis systems, such as expert systems (e.g. [17]), and context help systems (e.g. [14], [9], [24]). An expert system approach would not have been feasible considering the size of our application domain. A context help system could have been employed as part of a support system, but it would not have been enough on its own. The support we want to provide goes beyond the actual WIS page context and is also a user service where users can get help with WIS-related issues, regardless of whether the issues are covered in the WIS pages or not.
3. Web assistant system In this section we give an overview of our implemented web assistant system with automatic question-answering capabilities. Our web assistant system allows a user to ask questions related to the WIS. An example of a user's process for asking a question to the system is illustrated in Figure 5. When a user asks a natural language question to the system, the user also has to provide an appropriate category for the question, based on a predefined topic hierarchy. The user's question is first matched against an FAQ. The best matching FAQ items are returned to the user. If the user does not find the answer to the question in the returned FAQ items, the user can request a chat with a human assistant as a next step. The question category is matched against the expertise models of the assistants that are currently logged into the system. The system attempts to establish a chat connection between the user and the best matching assistant, or places the user in a queue for such an assistant if none is available at the moment. Our decision to have users choose a topic category for each question, in order to match users with appropriate assistants, was an appealingly intuitive solution.

Figure 5. An example of question-answering sessions from a user's point of view (the user asks a question, views the matched FAQ items, and either finds an answer, chats with an available assistant, or gives up after waiting in the queue for an assistant)

The system is illustrated from an assistant's point of view by showing an example of system usage in Figure 6. An assistant who is logged in to the web assistant system, and is available, can receive a request from the system to help a user at any time. If the assistant agrees to help the user, a chat connection is established. The user's question and the question category are displayed to the assistant when the chat starts, as is the user's model. This model is available for viewing and updating for the entire chat session and until a chat with a new user starts. Based on the information in a user model, the assistant can tailor the answers to the current user's skill and experience. At any time when the assistant is logged in, he or she can create new FAQ items.
Figure 6. An example of a question-answering session from an assistant's point of view (the system requests a user chat, the assistant chats with the user while viewing and updating the user model, and adds a new FAQ item once the user is satisfied)
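Section 3 mentions that the question category is matched against the expertise models of the logged-in assistants, but does not spell out the algorithm. The following is a minimal sketch under our own assumptions, in particular that an expertise model can be reduced to per-topic skill levels; the data is hypothetical and the topic names are taken from Table 1.

```python
from typing import Dict, List, Optional

def pick_assistant(question_topic: str, assistants: List[Dict]) -> Optional[Dict]:
    """Pick the available assistant whose (assumed) expertise model best covers the
    question's topic category, or return None so the user can be placed in a queue."""
    candidates = [a for a in assistants
                  if a["available"] and question_topic in a["expertise"]]
    if not candidates:
        return None
    return max(candidates, key=lambda a: a["expertise"][question_topic])

assistants = [
    {"name": "A", "available": True,  "expertise": {"Art techniques": 4, "Digital art": 2}},
    {"name": "B", "available": True,  "expertise": {"Story creation": 5}},
    {"name": "C", "available": False, "expertise": {"Art techniques": 5}},
]
print(pick_assistant("Art techniques", assistants))    # assistant A (C is not available)
print(pick_assistant("Story technical", assistants))   # None: queue the user
```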
4. Field Study For the purpose of evaluation, the outlined web assistant system was attached to an existing WIS for a test period of three weeks. The WIS, called Elfwood, is in the art and literature domain. Amateur artists and writers can display their material at the site. Currently around 4,400 artists and 800 writers display their work in the fantasy and science fiction genres. The WIS has about 14,500 visitor sessions on a daily basis. Interaction is important in the WIS, and users can write comments on almost every page. A part of the site is dedicated to teaching art to the users through a number of feature articles on art creation topics.
The kind of user support we intended to provide with the web assistant system was quite wide-ranging, from basic site-related support (how to use the WIS) to general "how to" aspects of art and literature. We had volunteer assistants living in different countries and on different continents. Of the approximately 35 assistants that were volunteering, 30 actually helped users during the field study. The remaining assistants never got the chance to help users because they were only logged in at times when few users were on the WIS. During the field study, 636 users registered with the support system. 129 of these users went all the way to a help conversation with an assistant.
5. Evaluation The purpose of this evaluation is to test our approach to support routing and to find possible improvements. We have evaluated the question-answering system in two ways. First, we have made an analysis of all conversations that took place in the field study. Second, we have made an evaluation from the assistants' and the users' points of view using questionnaires. We saw questionnaires as appropriate for our interest in the opinions of assistants and users. The use of alternative evaluation tools, such as interviews and think-aloud protocols [18], was considered to be difficult due to the wide geographical distribution of the users and the assistants.
5.1 Methodology Question analysis. To evaluate the proportion of user questions that are suitable for being turned into FAQ items, we have analysed the conversation logs. In total, 177 conversations took place. Based on these conversations, 48 FAQ items were created by the assistants. For each question asked in the conversations we considered the following issues. First, was the question answered by the assistant? Second, had the question been asked before? Third, could the question be made into an FAQ item or was it too idiosyncratic? Fourth, if the question could be made into an FAQ item, under what topic category would it be placed? Questionnaires. In order to evaluate our approach also from the assistants' point of view, we employed a questionnaire. The questionnaire contained questions related to their usage of the question-answering system. It was sent out to all assistants who had participated in the field study and who had actually held help conversations with users. In total, 30 assistants fulfilled these criteria. The questionnaire was sent out via e-mail just after the experiment was over. We received 24 answers. Six of these stated that they had not used the automatic question-answering system and therefore could not answer the questions.
This left us with 18 properly filled out questionnaires (a response rate of 60%). The responding assistants were from North America (63%), Europe (25%), Asia (6%), and Oceania (6%). The assistants were from different age groups: 10-19 (44%), 20-29 (37%), 30-39 (13%), and 40-49 (6%). The gender distribution was even. Due to loss of data, the demographics cover 16 of the 18 responding assistants. The questionnaire contained three statements where the respondent was asked to give a rating of agreement or disagreement according to a 1 to 10 scale. The respondent was asked to motivate the ratings for the statements. We also had three open-ended questions. To see if the question-answering system was useful for the users, we made another questionnaire that was sent to the users who had registered with the web assistant system but had not had a conversation with an assistant. In total, the questionnaire was sent out to 507 users. We received 175 answers, corresponding to a response rate of 35%. A slight majority of the respondents were female (56%). The majority came from North America (75%), followed by Europe (17%) and Oceania (4%). The remaining 4% came from Africa, South America, and Asia. The age distribution was: 10-19 (60%), 20-29 (28%), 30-39 (8%), 40-49 (3%), and 50-59 (1%). Due to loss of data, the demographics represent 98% of the respondents. The questionnaire asked for the reason(s) for not having participated in a help conversation with a human web assistant. Several reasons were listed, and the respondents could also give other reasons of their own. The respondents were allowed to give any number of reasons. In the design of the two questionnaires we considered the guidelines in [5].
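As a quick check, the quoted response rates follow directly from the counts above:
\[ 18/30 = 60\%, \qquad 175/507 \approx 35\%. \]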
5.2 Results Question analysis. In our analysis we found a total of 178 user questions. Some help conversations contained more than one question while others did not contain any real question. Out of these questions we found that 31 questions were duplicates (roughly 17% of the questions), and another 52 could not be made into FAQ items. There were two main reasons for a question not being suitable as an FAQ item. Either the assistant could not answer the question, or the question was about a problem that was too idiosyncratic and thus would not be of any value as an FAQ item. We noticed 24 questions that the assistants could not answer and 29 that were too special. This left us with 95 questions that could be turned into FAQ items (roughly 53% of the questions). We studied the distribution of these questions according to the predefined topic hierarchy for FAQ items. The distribution is
illustrated in Table 1. In some cases it was difficult to decide the appropriate topic category because there was more than one reasonable alternative. In those cases we made arbitrary but consistent choices.

Table 1. Topic distribution of possible FAQ items
Art navigation at Elfwood: 5
Art creation: 3
Art inspiration: 3
Art media: 1
Wet: 3
Dry: 7
Digital art: 11
Art styles: 3
Art techniques: 30
Story navigation at Elfwood: 1
Story creation: 2
Story inspiration: 3
Story styles: 1
Story technical: 2
Member functions at Elfwood: 11
User functions at Elfwood: 9

Questionnaires. The results from the rating questions in the assistant questionnaire are presented in Table 2. From the discussion questions we learned that different assistants had interpreted our instructions for when to create a new FAQ item differently. We stated that a question should be turned into an FAQ item whenever it was general enough. Further, the assistants generally did not use the FAQs themselves when assisting users (with two exceptions). However, two assistants indicated that they would use them once they grew more comprehensive. Finally, all assistants thought it was a feasible approach to have assistants construct FAQs after help sessions.

Table 2. Questions and answers about constructing FAQ items
Statement 1: Deciding if a new FAQ item should be created based on a help conversation was (hard=1, easy=10). Mean: 6, S. Dev.: 2.76
Statement 2: Formulating a new FAQ item based on a help conversation was (hard=1, easy=10). Mean: 6.56, S. Dev.: 2.41
Statement 3: Choosing an appropriate category for a new FAQ item was (hard=1, easy=10). Mean: 6.22, S. Dev.: 2.34
In Table 3 we present the results from the user questionnaire. Notice that we received several alternative reasons in response to question G. Most of these indicated that the users had registered but never found the time to actually test the support system.
Table 3. Question and answers about the reason(s) for not having a conversation with an assistant
Question: Even though you registered with The Elfwood Assistant you never had a help conversation with a human assistant, why? (any number of reasons could be given)
A. Because the system didn't work. Answers: 18 (10%)
B. Because the answers to my questions were directly displayed from the FAQs when I asked questions to the system. Thus, there was no need to talk to a human assistant. Answers: 26 (15%)
C. Because there was no assistant logged in when I used the system. Answers: 50 (29%)
D. I registered just to check things out. I had no real need to chat with a human assistant. Answers: 67 (38%)
E. Because I didn't understand how to use the system. Answers: 24 (14%)
F. Because the system took too long to load. Answers: 20 (11%)
G. For a reason other than the above (please specify). Answers: 24 (14%)
5.3 Discussion Based on the evaluation results we discuss the following questions, which relate to the feasibility of our support routing approach. Are the questions that are asked suitable as FAQ items? Does the question-answering system really help users? How large a share of the questions can be handled by the question-answering system? Can assistants create suitable FAQ items based on their help sessions? From the conversation analysis we emphasise two main findings. First, a large share of the questions that were asked (53%) were indeed appropriate to be made into FAQ items. Second, a significant number of the questions were variants of previously asked questions (17%). This does not mean that the users do not consider the FAQs before asking an assistant. Of course, some users behave that way, but not all. Rather, only around 50% of the possible FAQ items were actually created by the assistants, and thus the likelihood of finding matching FAQ items was smaller than it should have been. A total of 15% of the responding users who did not chat with an assistant said that the question-answering system provided them with the answers they needed. Considering the short time that the system was in use (three weeks) and that the FAQs were built up from scratch, this is a good result. It shows that our support routing approach is indeed useful. It should also be noted that some users may prefer to use the FAQs exclusively,
simply because they prefer to consider information in peace and quiet without having to involve themselves in conversations. To further evaluate the performance of our support routing approach it would be interesting to evaluate the traditional precision and recall measures [4]. This would allow us to better estimate how large a share of the user questions can be handled by the question-answering system. Unfortunately, as has been argued in [8], these measures are not directly applicable to this kind of question-answering system. In contrast to traditional information retrieval we are not interested in retrieving all documents relevant to the question. Rather, we assume that one FAQ item that answers the question is enough. Still, the definitions of precision and recall can be modified as was done for the FAQ Finder system [8], and we adopt these definitions. Recall is re-defined as the percentage of questions for which the question-answering system returns a correct answer when one exists. Instead of precision, a new measure called rejection can be used, defined as the percentage of questions that the question-answering system correctly reports as being unanswered among the existing FAQ items. The FAQ Finder system has been evaluated across several different domains using these measures. Since we use the same traditional statistical information retrieval approach as the FAQ Finder system, the results are likely to be valid for our system as well. If we are willing to sacrifice rejection for a high recall (by using a low matching threshold value) we can expect to reach around 58% recall. If we combine this result with the data about the rate of duplicate questions (17%) from our question analysis, we find that the question-answering system can be expected to handle roughly 10% of all questions. This may seem a low number, but considering the very large domain of our support system we believe it is a good result. In more limited domains we could expect the rate to be higher. It would also be possible to achieve even better recall using more advanced information retrieval methods [8]. Considering the creation of FAQ items, the assistants were initially instructed to create FAQ items based on all questions that were "general" enough. This instruction turned out to be too vague, and was interpreted in different ways. Some assistants expressed their confusion in the questionnaire, as shown in the results. We also noticed that some assistants thought it was difficult to construct FAQ items and to find the right category for new items. Clear instructions and more training could improve this situation.
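A back-of-the-envelope version of the 10% estimate above, under the assumption that the question-answering system can only answer repeats of previously asked questions and that the 58% recall figure from the FAQ Finder evaluation carries over to our system:
\[ P(\mathrm{handled}) \approx P(\mathrm{repeat}) \times \mathrm{recall} = 0.17 \times 0.58 \approx 0.10. \]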
6. Recommendations To summarise the experience from our field study, we provide a number of recommendations for designing and implementing support routing for web assistant systems.
• It must be clearly defined what kinds of questions should be made into FAQ items. Vague definitions can make some web assistants unsure and lead to fewer FAQ items. A good definition is: "Any question that could possibly be asked by another user should be made into an FAQ item".
• Many assistants thought it was difficult to formulate FAQ items. A certain level of fluency in the English language is required. A short training period for assistants is also recommended before they start to work.
• Many assistants had problems choosing an appropriate topic category for new FAQ items. The topic hierarchy may need to be modified at run-time, and the system should be designed to facilitate such modifications.
• Spell checking is important for FAQ items. Since the information retrieval method is based on keywords, users risk missing good FAQ items due to spelling errors.
• It is important to let the users browse the FAQs as well, and not to rely completely on the natural language question-answering system.
7. Related work Since the initiation of our work there have been commercial moves in the directions discussed in this paper. Companies such as LivePerson (liveperson.com) and FaceTime (facetime.org) now offer commercial systems for human assistance on web sites. This trend increases the importance of the kind of studies reported in this paper. While the efforts by FaceTime and LivePerson are well motivated, little is known about how users interact with this kind of system and how to get the most benefit from the combination of human assistants and computer-based support. In the context of the Alexandria Digital Library project, an online help desk called AlexHelp! has been developed [23]. The idea is to have librarians online to help users of the digital library. The design and implementation of a prototype system is presented. In [19] two applications for web-based collaborative customer service are described. The applications are for a banking kiosk setting and for home banking over the Internet. The banking kiosk application uses a shared browser approach with ink annotation for communication. The home banking application also uses a shared browser approach and has support for voice chat.
The two related research approaches we have just described are similar to our approach. They are, however, limited when it comes to providing personalised support. The descriptions of the systems are on an architectural level, and no form of evaluation is provided. The Answer Garden (AG) system [2] by Ackerman is a question-answering system involving human experts and is thus related to our work. Still, there are some important differences. There is no form of support for personalisation in AG similar to our user modelling approach. Further, where our system supports synchronous communication between users and assistants via textual chat, question answering by an expert in AG consists of two asynchronous messages corresponding to the question and the answer. The AG system was later extended to a second version called Answer Garden 2 (AG2) [3]. AG2 mainly features two new functions: an automatic escalation of help sources, and a form of collaborative refinement of the system database that takes sources other than user questions into consideration. These features could very well be incorporated into our system as well, and would be interesting to study as future work. The problem of querying FAQs using natural language questions has been studied previously. The FAQ Finder system [13], [6], [7], [8] is a natural language question-answering system. A number of FAQs on the web have been collected and FAQ Finder works as an interface to these. The system uses a combination of statistical similarity (as we do in this paper) and semantic similarity. In [21] a case-based reasoning approach to finding answers in an FAQ database is outlined. A probabilistic approach to the same problem is described in [22]. Our work on an FAQ-based question-answering system differs from the related work. To our knowledge it is the first time that this kind of question answering has been applied to optimise the use of human and computer-based support resources.
8. Conclusions and future work We have described an approach to support routing for web assistant systems. The approach uses an FAQ-based question-answering system, which makes it possible to consider many parameters such as resource cost and support efficiency in the design of web assistant systems. The web assistant system has been implemented and deployed for the purpose of evaluation. Our field study has shown that our approach is indeed feasible and useful. We have summarised our experiences into a set of recommendations for how to implement and deploy this kind of user support routing.
Constructing FAQ items out of help conversations can be a difficult and tedious task. It would be interesting to investigate what kind of computer-based support can be provided to the assistants for this task. Could natural language techniques or machine learning techniques be used for this purpose? An assistant can tailor an answer to a user's question in a help conversation based on the user model. The same kind of personalised support should also be provided by the FAQ question-answering system, and this needs further work. Perhaps each FAQ item could be available in different versions, with the system choosing which version to present to the user based on the user model. We strongly believe that human involvement in user support is here to stay. Future work will emphasise the great potential of a close collaboration between humans and computers, and how to best exploit their respective advantages for a common purpose.
Acknowledgements This work has been partially supported by TFR. We would like to thank the many assistants and users who participated in this experiment. Also Johan Löwdahl and Jonas Almfeldt deserve a special acknowledgement for their work on the implementation of the system. Finally we are indebted to Thomas Abrahamsson for letting us use Elfwood for our field study.
References
[1] J. Aberg, and N. Shahmehri, "The role of human web assistants in e-commerce: An analysis and a usability study", Internet Research: Electronic Networking Applications and Policy, 10(2), 2000, pp. 114-125.
[2] M. Ackerman, "Augmenting the Organizational Memory: A Field Study of Answer Garden", in Proceedings of the Conference on Computer-Supported Cooperative Work, 1994, pp. 243-252.
[3] M. Ackerman, and D. McDonald, "Answer Garden 2: Merging Organizational Memory with Collaborative Help", in Proceedings of the Conference on Computer-Supported Cooperative Work, 1996, pp. 97-105.
[4] R. Baeza-Yates, and B. Ribeiro-Neto, editors, Modern Information Retrieval, Addison Wesley, 1999.
[5] T. Bouchard Jr., "Field research methods: Interviewing, questionnaires, participant observation, unobtrusive measures", in M. Dunnette (editor), Handbook of Industrial and Organizational Psychology, Rand McNally, 1976, pp. 363-413.
[6] R. Burke, K. Hammond, and J. Kozlovsky, "Knowledge-Based Information Retrieval from Semi-Structured Text", in Working Notes from AAAI Fall Symposium on AI Applications in Knowledge Navigation and Retrieval, 1995, pp. 19-24.
[7] R. Burke, K. Hammond, V. Kulyukin, S. Lytinen, N. Tomuro, and S. Schoenberg, "Natural Language Processing in the FAQ Finder System: Results and Prospects", in Working Notes from AAAI Spring Symposium on NLP on the WWW, 1997, pp. 17-26.
[8] R. Burke, K. Hammond, and V. Kulyukin, "Question Answering from Frequently-Asked Question Files: Experiences with the FAQ Finder System", Technical Report TR-97-05, University of Chicago, Department of Computer Science, June 1997.
[9] M. Encarnacao, and S. Stoev, "An Application Independent Intelligent User Support System Exploiting Action-Sequence Based User Modelling", in Proceedings of the Seventh International Conference on User Modeling, 1999, pp. 245-254.
[10] C. Fox, "Lexical Analysis and Stoplists", in W. Frakes, and R. Baeza-Yates, editors, Information Retrieval: Data Structures and Algorithms, Prentice-Hall, 1992, pp. 102-130.
[11] W. Frakes, "Stemming Algorithms", in W. Frakes, and R. Baeza-Yates, editors, Information Retrieval: Data Structures and Algorithms, Prentice-Hall, 1992, pp. 131-160.
[12] R. Hall, "INFOMOD: A Knowledge-based Moderator for Electronic Mail Help Lists", in Proceedings of the 5th International Conference on Information and Knowledge Management, 1996, pp. 107-114.
[13] K. Hammond, R. Burke, and K. Schmitt, "A Case-Based Approach to Knowledge Navigation", in Proceedings of the AAAI Workshop on Indexing and Reuse in Multimedia Systems, 1994, pp. 46-57.
[14] R. Hellman, "User Support: Illustrating Computer Use in Collaborative Work Contexts", in Proceedings of the Conference on Computer-Supported Cooperative Work, 1990, pp. 255-267.
[15] D. Hoffman, T. Novak, and M. Peralta, "Building Consumer Trust Online", Communications of the ACM, 42(4), April 1999, pp. 80-85.
[16] T. Isakowitz, M. Bieber, and F. Vitali, "Web Information Systems", Communications of the ACM, 41(7), July 1998, pp. 78-80.
[17] H. Jones, "Familiar Contexts, New Technologies: Adapting Online Help to Simulate an Expert System", in Proceedings of the 15th Annual International Conference on Computer Documentation, 1997, pp. 145-151.
[18] J. Karat, "User-Centered Software Evaluation Methodologies", in M. Helander, P. Landauer, and P. Prabhu (editors), Handbook of Human-Computer Interaction, Elsevier Science B.V., 1997, pp. 689-704.
[19] M. Kobayashi, M. Shinozaki, T. Sakairi, M. Touma, S. Daijavad, and C. Wolf, "Collaborative Customer Services Using Synchronous Web Browser Sharing", in Proceedings of the Conference on Computer-Supported Cooperative Work, Seattle, WA, USA, November 14-18, 1998, pp. 99-108.
[20] J. Kupiec, "MURAX: A Robust Linguistic Approach For Question Answering Using An On-Line Encyclopedia", in Proceedings of the 16th Annual International Conference on Research and Development in Information Retrieval, 1993, pp. 181-190.
[21] M. Lenz, and H-D. Burkhard, "CBR for Document Retrieval: The FALLQ Project", in Proceedings of the Second International Conference on Case-Based Reasoning, 1997, pp. 84-93.
[22] R. Maxion, and P. Syme, "Mitigating Operator-Induced Unavailability by Matching Imprecise Queries", in Proceedings of the 26th International Symposium on Fault-Tolerant Computing, 1996, pp. 240-249.
[23] R. Prince, J. Su, H. Tang, and Y. Zhao, "The Design of an Interactive Online Help Desk in the Alexandria Digital Library", in Proceedings of the International Joint Conference on Work Activities, Coordination, and Collaboration, San Francisco, USA, February 22-25, 1999, pp. 217-226.
[24] N. Randall, and I. Pedersen, "Who Exactly is Trying to Help Us? The Ethos of Help Systems in Popular Computer Applications", in Proceedings of the 16th Annual International Conference on Computer Documentation, 1998, pp. 63-69.
[25] P. Ratnasingham, "The Importance of Trust in Electronic Commerce", Internet Research: Electronic Networking Applications and Policy, 8(4), 1998, pp. 313-321.
[26] P. Resnick, and H. Varian, "Recommender Systems", Communications of the ACM, 40(3), March 1997, pp. 56-58.
[27] E. Rich, "Stereotypes and User Modeling", in W. Wahlster, and A. Kobsa, editors, User Models in Dialog Systems, Springer Verlag, 1989, pp. 35-51.
[28] P-H. Speel, "Selecting Knowledge Representation Systems", PhD Thesis, Universiteit Twente, Enschede, Netherlands, 1995.
[29] U. Wolz, "Providing Opportunistic Enrichment in Customized On-Line Assistance", in Proceedings of the International Workshop on Intelligent User Interfaces, 1993, pp. 167-174.
[30] Q. Zhang, C. Wolf, S. Daijavad, and M. Touma, "Talking to Customers on the Web: A Comparison of Three Voice Alternatives", in Proceedings of the Conference on Computer-Supported Cooperative Work, Seattle, WA, USA, November 14-18, 1998, pp. 109-117.