Hotel Recommender System Based on User's Preference Transition

Ryosuke Saga, Yoshihiro Hayashi, and Hiroshi Tsuji
Graduate School of Engineering, Osaka Prefecture University, Sakai, Japan
{saga, hayashi}@mis.cs.osakafu-u.ac.jp, [email protected]

Abstract— This paper proposes a hotel recommender system based on sales records. The basic premise of the research is that sales records include users' preference relations among hotels. When a user selects a hotel, the proposed system recommends hotels based on a preference transition network. This paper describes in detail the four-step procedure proposed for building the preference transition network. The proposed recommender system is applicable to repeatable purchases without explicit product evaluation. The features of the prototype system are also illustrated.

Keywords— information filtering, recommender system, information retrieval, electronic commerce, hotel recommendation

I. INTRODUCTION

Recently, people have come to obtain and handle massive amounts of information via the Internet, so information filtering has become as useful a tool as information retrieval. There are several kinds of information filtering techniques, such as content-based filtering and collaborative filtering [1]-[2]. Recommender systems have been developed as an application of information filtering [2]. The main function of a recommender system is to predict a user's preference from rating information, filter items out of massive information, and suggest candidate items to the user.

In general, recommender systems have been used to support item selection. In e-commerce especially, such systems are attractive as a way to enhance customer satisfaction. They have worked well for selecting music CDs and books: if a user expresses his evaluation of known items, the system recommends unknown items based on the evaluations of others.

In e-commerce, hotel reservation is a legacy application [3]-[4]. Although a recommender function might be attractive for this application, the traditional systems for CDs and books do not work for several reasons. The representative one is reusability. Items such as CDs and books can be used over and over at no further cost once the user has bought them. In the case of hotel reservation, on the other hand, the user may reserve the same hotel again. The basic assumptions of this research are:

• If a user likes a hotel, he will stay there repeatedly.
• If a user dislikes a hotel, he will not stay there again and will stay in another hotel.
• Such preference can be expressed in a preference transition network.

In fact, the recommender system must predict better hotels for users from a global view of hotel reservation. This paper proposes a recommender system for hotel reservation that takes these assumptions into account. The proposed system expresses users' behavior as a preference transition network. The network is created from sales records, which carry implicit ratings of the hotels, and the system makes recommendations based on the network.

The rest of this paper is structured as follows. Chapter 2 describes the issues and requirements of hotel recommendation and gives an overview of a system that meets the requirements. Chapter 3 describes two components: the preference transition network and the recommendation method. Chapter 4 describes the prototype system constructed from real sales records, and Chapter 5 describes related work. Chapter 6 concludes this paper.

II. PREFERENCE TRANSITION-BASED SYSTEM

First of all, let us clarify the requirements of hotel recommendation by considering the features of hotel room reservation.

A. Features of Hotel Recommendation

Hotel recommendation has the following three features, although some points may be common to recommender systems in general.

1) Reusability: There are two kinds of products for customers: products purchased only once and products purchased repeatedly. Music CDs and books are examples of the former, while hotel room reservation is an example of the latter. Therefore, a hotel recommender system must take reusability into account.

2) Transition of user's preference: Suppose that there are several hotels in the area where a customer wants to stay. In some cases he may always stay in the same hotel, while in other cases he may change hotels. Such movement should express the user's preference. If we can express the movement in a computer system, we can implement a recommender function. Intuitively, the user's preference is embedded in recent usage. If the hotel where the user stayed most recently differs from the previous one, the user's preference may have moved from the previous hotel to the new one, although it would be rash to conclude so at once: the user might have used the new hotel on a whim, and even after the preference seems to have moved, the user may stay in the previous hotel again. Therefore, a standard is needed for judging whether the preference has moved or not.

3) Acquirement of rating information: A recommender system needs to acquire rating information that tells how much a user likes some hotels and dislikes others. However, rating hotels places such a heavy load on users that exact feedback identifying a user's preference is unlikely to be obtained, even if some users are given a chance to rate hotels.

1-4244-2384-2/08/$20.00 © 2008 IEEE

B. Requirement of Hotel Recommendation

In order to recommend hotels under the above features, the proposed system is designed around three requirements:

• The system can make a candidate list that includes novel hotels as well as previously used hotels, in consideration of reusability. Additionally, the system can decide the order of hotels in the list based on the user's preference.
• To acquire approximate rating information, the system uses sales records, which are a kind of implicit rating.
• The system analyzes the transition of user's preference from the implicit rating information and shows the transition in a network expressed as a directed graph. Additionally, users tend to make decisions by referring to ratings assigned by third parties. Thus, the system analyzes the transition from implicit rating information that contains not only the individual's information but also third parties' information.

The system based on these requirements is shown in Figure 1. The proposed system has a sales record database and a hotel database, and it makes recommendations to a user based on the hotel that the user selects on the system. The proposed system extracts the relations between hotels from sales records and analyzes them into a directed graph called a preference transition network. The proposed system then makes recommendations based on the preference transition network, as introduced later.

Figure 1. System image [components shown: user, selected hotel, recommender system, sales records (date, charge plan, etc.), hotel, preference transition network, recommended hotels]

2008 IEEE International Conference on Systems, Man and Cybernetics (SMC 2008)

III. SYSTEM COMPONENTS

This chapter describes the two components that are fundamental to the proposed system. First, let us explain the preference transition network, and next the recommendation method based on the network.

A. Creation of Preference Transition Network

To recommend more preferable hotels, the proposed system creates a preference transition network, which expresses the movement of users' preference between hotels. The network is generated from sales records as a directed graph. In this network, a node represents a hotel, a directed link from node A to node B shows that users' preference moves from A into B, and a bidirectional link between A and B shows that A and B are equally preferable for users. The preference transition network is followed when making recommendations. The process of creating the preference transition network consists of the following steps: 1) expressing the relations between hotels, 2) calculating the co-occurrence and filtering the relations, and 3) identifying the user's preference transition.

TABLE I. TRANSITION PATTERN

Transition pattern                                Meaning
i and j joined by a bidirectional link            No difference between i and j
i and j joined by a directed link pointing to i   i is better than j

TABLE II. EXAMPLE OF HOTEL USAGE

Rank   Time of sales   Item
1      1/2004          A
2      12/2003         A
3      10/2003         B
4      5/2003          A
5      4/2003          B
6      1/2003          B
7      10/2002         B
8      7/2002          B

Figure 4. Identification of preference transition [the filtered item network and the resulting preference transition network over hotels i1 to i8]

1) Expressing the relations between hotels: First, the proposed system generates the relations between hotels from sales records. The network generated from sales records is called an item network and can be regarded as a co-occurrence graph of hotel reservations. The process of creating this network consists of three substeps, as shown in Figure 2.

a) Filtering sales records: According to conditions such as season, place, and place range, the system filters the relevant sales records out of the total sales records.

b) Creating a binary graph between users and hotels: From the filtered sales records, the system generates the relations between users and hotels as a binary graph.

c) Connecting hotel nodes: The system links two hotels in which the same user has stayed. For example, in Figure 2, if user u1 stayed in both hotel i1 and hotel i2, then hotel i1 and hotel i2 appear to be related or alike, and the system connects the two hotels by a link.

Figure 2. Expression of the relations between hotels [sales records of (user, hotel) pairs, the binary graph between users u1 to u5 and hotels i1 to i8, and the resulting item network]

2) Filtering relations based on co-occurrence: Next, in order to extract the significant relations, the system filters the relations of the item network based on co-occurrence strength. Co-occurrence strength has been used in several studies, such as trend analysis and keyword analysis [5]-[6], and it is useful for recognizing relations through a quantitative value. By setting a threshold, the system regards links whose co-occurrence is lower than the threshold as unqualified and removes them. For example, in Figure 3, if the threshold is 0.3, the system removes the link between i2 and i6 because the co-occurrence of the link is 0.1.

Figure 3. Filtering relations based on co-occurrence [item network with Simpson's coefficient on each link; links below the threshold of 0.3 are removed]

There are several representative coefficients for measuring co-occurrence, such as the Simpson coefficient and the Jaccard coefficient. In this paper, we adopt the Simpson coefficient, which is defined as formula (1).

$$\mathrm{Simpson}(X, Y) = \frac{\mathrm{count}(X \cap Y)}{\min(\mathrm{count}(X), \mathrm{count}(Y))} \tag{1}$$

Here, Simpson(X, Y) indicates the strength of the co-occurrence between hotel X and hotel Y, count(X) is the number of people who stayed in hotel X, count(X ∩ Y) is the number of people who stayed in both hotel X and hotel Y, and count(X ∪ Y) is the number of people who stayed in either hotel X or hotel Y.
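As an illustration of steps 1) and 2), the following minimal sketch builds the item network from (user, hotel) records and keeps only the links whose Simpson coefficient reaches the threshold. The function name `build_item_network`, the record layout, and the sample records are assumptions made for this example and are not taken from the original system. Keeping a link whose coefficient equals the threshold exactly is also an assumption.

```python
from collections import defaultdict
from itertools import combinations


def build_item_network(sales_records, threshold):
    """Return {frozenset({x, y}): simpson} for the hotel pairs that are kept."""
    # Binary graph between users and hotels: the users who stayed in each hotel.
    users_of = defaultdict(set)
    for user, hotel in sales_records:
        users_of[hotel].add(user)

    network = {}
    for x, y in combinations(sorted(users_of), 2):
        both = len(users_of[x] & users_of[y])  # count(X ∩ Y)
        if both == 0:
            continue  # no common user, so the hotels are not linked at all
        simpson = both / min(len(users_of[x]), len(users_of[y]))
        if simpson >= threshold:  # links below the threshold are removed
            network[frozenset((x, y))] = simpson
    return network


# Hypothetical sales records, already filtered by season and place.
records = [
    ("u1", "i1"), ("u1", "i2"),
    ("u2", "i2"), ("u2", "i5"),
    ("u3", "i2"), ("u3", "i5"),
    ("u4", "i5"), ("u4", "i6"),
    ("u5", "i6"), ("u6", "i6"), ("u7", "i6"),
]
network = build_item_network(records, threshold=0.5)
for pair, value in sorted(network.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(pair), round(value, 3))
```

With these records the links i1-i2 (coefficient 1.0) and i2-i5 (2/3) are kept, while i5-i6 (1/3) falls below the threshold of 0.5 and is removed.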

3) Identifying preference transition: Finally, the system identifies the preference transition and adds information about the transition to the item network. A user's preference seems to be apparent in the hotel used most recently. However, that hotel might have been used by accident, so it is not necessarily appropriate for the system to regard it as more preferred than other hotels. Therefore, the system identifies the preference transition between two hotels from sales records statistically and makes recommendations based on the preference transition.

In order to identify the preference transition, the system uses the Mann-Whitney U test [7]. This is a rank test for assessing whether two populations differ. If the null hypothesis is rejected, the two populations are regarded as different; if not, they are regarded as not different. In this paper, the system takes the most recent k sales records of hotel i and hotel j and carries out the Mann-Whitney U test on them.

If the test between hotel A and hotel B does not reject the null hypothesis, the system regards the preference for the two hotels as not different; that is, the preference for hotel A may be equal to that for hotel B. For convenience, this result is shown in the item network as a bidirectional link. On the other hand, if the test rejects the null hypothesis, the system regards the preference as different (one hotel is preferable to the other). This result is shown in the item network as a directional link that leads from the node with the higher average rank to the node with the lower average rank (Table I).

Let us show an example of deciding the direction between hotels. Assume that k is 8 and that Table II shows the sales records of two hotels, A and B, ranked from the most recent reservation. In this case, the p-value of the Mann-Whitney U test is 0.036.
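The value 0.036 can be checked by direct enumeration. The sketch below is a pure-Python illustration rather than the authors' implementation. It assumes that the reported value is the exact one-sided p-value, which the text does not state, and the function name `one_sided_p_value` is hypothetical. With three records of A among eight ranks there are 56 equally likely placements under the null hypothesis, and only two of them give A a rank sum as small as the observed 1 + 2 + 4 = 7.

```python
from itertools import combinations


def one_sided_p_value(sequence, item):
    """Exact probability that `item` gets a rank sum this small by chance."""
    n = len(sequence)
    # Rank 1 is the most recent sale, as in Table II.
    ranks = [rank for rank, x in enumerate(sequence, start=1) if x == item]
    observed = sum(ranks)
    # Under the null hypothesis, every placement of the item's records
    # among the n ranks is equally likely.
    rank_sums = [sum(c) for c in combinations(range(1, n + 1), len(ranks))]
    return sum(s <= observed for s in rank_sums) / len(rank_sums)


table_2 = "AABABBBB"  # items of Table II, ordered by rank 1 to 8
p = one_sided_p_value(table_2, "A")
print(round(p, 3))  # 0.036, that is 2 of 56 placements
```

Because the U statistic differs from the rank sum only by a constant, comparing rank sums is equivalent to comparing U values when there are no ties, as is the case for ranks taken from the order of sales.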
If the level of significance is 5%, the null hypothesis is rejected and the system concludes that the preferences for A and for B are different. Here, the average rank of A is (1+2+4)/3 = 2.3 and that of B is (3+5+6+7+8)/5 = 5.8. Consequently, the system judges that users' preference seems to move from B into A, and draws a directional link from B to A. Let us show other examples. If the sales records are "AABBBBAB", the null hypothesis is not rejected; the preferences for A and for B are then regarded as not different, and a bidirectional link is drawn between A and B. If the sales records are "ABABABAB", the null hypothesis is not rejected either, and a bidirectional link is also drawn. In this way, our system creates the preference transition network shown in Figure 4.

B. Recommendation based on Preference Transition Network

When a user selects a hotel, the system searches for more preferable hotels, if any exist, based on the proposed preference transition network. The recommendation process consists of the following steps: 1) selection of recommended candidates, and 2) ranking of candidates.

1) Selection of recommended hotels: First, the system selects candidates for the selected hotel based on the preference transition network. There are two patterns for selecting candidates.

a) Selection of the targets of directed links: This pattern (called target-node selection) selects as candidates the nodes located at the destinations of the links leaving the selected hotel. The underlying idea is that the node at the head of a link is more preferred than the node at its tail. In Figure 5, if a user selects hotel S, this pattern selects hotel S1 and hotel S2 as candidates.

Figure 5. Target-node selection [selected hotel S with candidates S1 and S2]

b) Selection by tracking back in-links: This pattern (called tracking-back selection) traces back to the source nodes of the links that point into the selected hotel and selects as candidates the nodes located at the other destinations of those source nodes. The underlying idea is that, from the viewpoint of a source node, the candidate nodes and the selected node seem to be on the same level. To be more precise, in Figure 6, if a user selects hotel T, this pattern selects hotel T1 and hotel T2 as candidates.

Figure 6. Tracking-back selection [selected hotel T with candidates T1 and T2]

To avoid degraded recommendations, our system also adds the previous hotel to the candidates. Including the previous hotel allows the system to make recommendations in consideration of reusability.

2) Ranking candidates: Next, the proposed system ranks the selected hotels in order to reduce the user's burden. The system ranks the hotels based on the relations between nodes in the preference transition network. Concretely, the system uses the number of links into a node and the number of links out of a node, as shown in formula (2).

$$P_i = f(\mathrm{InDegree}_i, \mathrm{OutDegree}_i) \tag{2}$$

Here, P_i is the value used for ranking, InDegree_i is the number of links into hotel i, and OutDegree_i is the number of links out of hotel i. InDegree_i shows from how many hotels preference transfers to i; conversely, OutDegree_i shows to how many hotels preference transfers from i. Using formula (2), the system can grade hotels by users' preference.

In addition, there remains the problem that users grow tired of a hotel when the same hotel, hotel N in Figure 7, is recommended continuously. To resolve this problem, we apply a penalty to the ranking parameter P by adding C, the number of continual stays, to the model, as shown in formula (3).

$$P_i = f(\mathrm{InDegree}_i, \mathrm{OutDegree}_i, C) \tag{3}$$

The system configures formula (3) so that the more continually hotel i is selected, the smaller P_i becomes, which solves the above problem. In the prototype system of the next chapter, we set formula (3) to InDegree_i - OutDegree_i - C^2. For example, in Figure 7, C is 2 because the user has selected N twice, and P of N is 0 - 4 = -4. As a result, N3, N1, N2 and N are recommended in this order because their values of P are 3, 0, -1, and -4, respectively.

Figure 7. Ranking of recommended hotels

Node   InDegree   OutDegree   P
N1     1          1           0
N2     1          2           -1
N3     3          0           3
N      2          2           0

IV. PROTOTYPE SYSTEM

This chapter describes the prototype system, which uses real sales records of an Internet hotel reservation service.

A. Overview of Prototype System

The goal of creating the prototype system is to verify the performance of the preference transition network and of the recommendation based on the network. The prototype system covers hotel usage in Tokyo, Japan.

To implement the system, we used the sales records of 2,227 users who had stayed in hotels in Tokyo more than 20 times. Details are shown in Table III. In addition, as a result of a preliminary investigation, we removed the data of one hotel because its P, the value used for ranking in recommendation, was much higher than those of the other hotels (Figure 8). In fact, the removed hotel appeared in the recommended lists in many cases. The hotels that had no links to any hotel other than the removed one were then also removed, and the number of hotels became 172.

TABLE III. DETAIL DATA OF STAYING

                             Total Number   Usage Number
Hotel                        351            263
Customer                     65,463         2,227
Average Staying Frequency    5.5            40.4

NB: Usage Number: number of those whose frequency is more than twenty

Figure 8. The distribution of the value to make ranking with hotels [x-axis: rank, 1 to 21; y-axis: value to make ranking (P), -50 to 200]

B. Components and Recommendation Example

Figure 9 shows the preference transition network in the prototype system. Note that the threshold on the Simpson coefficient is 0.5 and k in the Mann-Whitney U test is 20 in Figure 9. As shown in Figure 9, certain nodes have many links, and the directions of the links concentrate on these nodes. From the figure, we found that (i) users' preference concentrates on certain hotels, and (ii) the links and nodes converge in the center of the network.

Figure 10 shows the details around hotel a in the preference transition network of Figure 9. When a user has stayed in hotel a and selects the same hotel, the prototype system selects hotels b, c and d by target-node selection. In addition, tracking-back selection newly yields hotel f, as well as hotels b and c. Finally, considering reusability, the system includes hotel a and recommends hotels a, b, c, d and f in order based on P.
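The selection and ranking steps of Section III-B, as exercised in this example, can be sketched as follows. This is a minimal illustration under stated assumptions, not the prototype's code: the function names and the link set are hypothetical, a bidirectional link would be stored as two directed pairs, the previous hotel is taken to be the selected hotel (as in the Figure 10 scenario), and ties in P are broken by hotel name. The score uses the prototype's setting P = InDegree - OutDegree - C^2.

```python
def target_node_selection(links, selected):
    """Hotels at the head of a directed link that leaves `selected`."""
    return {dst for src, dst in links if src == selected}


def tracking_back_selection(links, selected):
    """Other destinations of the hotels that link into `selected`."""
    sources = {src for src, dst in links if dst == selected}
    return {dst for src, dst in links if src in sources and dst != selected}


def preference_score(in_degree, out_degree, continual_stays=0):
    """P = InDegree - OutDegree - C^2, the setting used in the prototype."""
    return in_degree - out_degree - continual_stays ** 2


def recommend(links, selected, continual_stays=0):
    """Rank the candidates for `selected`, best first."""
    candidates = (target_node_selection(links, selected)
                  | tracking_back_selection(links, selected))
    candidates.add(selected)  # reusability: keep the hotel already used

    def score(hotel):
        in_degree = sum(1 for _, dst in links if dst == hotel)
        out_degree = sum(1 for src, _ in links if src == hotel)
        # The penalty C applies only to the hotel used continually.
        penalty = continual_stays if hotel == selected else 0
        return preference_score(in_degree, out_degree, penalty)

    # Ties are broken by name so that the order is deterministic.
    return sorted(candidates, key=lambda hotel: (-score(hotel), hotel))


# Hypothetical network: each pair (x, y) is a directed link from x to y.
links = {("a", "b"), ("a", "c"), ("a", "d"),
         ("e", "a"), ("e", "b"), ("e", "f"),
         ("b", "c"), ("d", "c"), ("f", "c")}
ranking = recommend(links, "a", continual_stays=1)
print(ranking)  # ['c', 'b', 'd', 'f', 'a']
```

Applied to the degrees listed in Figure 7, `preference_score` gives 0, -1 and 3 for N1, N2 and N3, and 2 - 2 - 2^2 = -4 for N with C = 2, which matches the order N3, N1, N2, N stated in Section III-B.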

C. Speculation

From Figure 9, we found that many links concentrate on certain nodes and that the directions of the links also concentrate on those nodes. This indicates that analysis of the preference transition network can effectively reveal the characteristics of the relations between hotels. In addition, the prototype system was able to recommend hotels by the two selection patterns, as shown in Figure 10. We also noticed that the features of the hotels selected by tracking-back selection differ a little from those of the hotels selected by target-node selection, although our system tracks back a link only once. The system might recommend more novel hotels if it traced back links more times.

V. RELATED WORKS

Let us discuss the position of our research by comparing it with related work on recommendation, namely content-based filtering and collaborative filtering, and by considering related work on co-occurrence graphs.

A. Content-based Filtering

A system with content-based filtering makes recommendations by comparing a user profile with the characteristics of a content model [8]. The content model is made by extracting the characteristics of items, and the user profile is made by analyzing questionnaires and users' evaluations of items. Content-based filtering uses only the characteristics of items in recommendation. In contrast, our recommendation method considers the preference transition between hotels, which is shown as a directed graph. In this respect, our system differs from content-based filtering.

B. Collaborative Filtering

A system with collaborative filtering recommends to a user the items liked by other users whose preferences are similar. Collaborative filtering uses not the content of items but the similarity of users or items. Breese et al. classify collaborative filtering methods into memory-based and model-based methods [9].

Figure 9. Preference transition network in prototype system (Simpson coefficient: 0.5, node: 172, link: 254)

Figure 10. Recommendation example [hotels a to f; P values shown in the figure: 12, 5, 2, 1 and -2]

1) Memory-based method: A memory-based method makes recommendations based on users' ratings of items. Similarities of users or items are calculated from this information. For example, in the case of GroupLens, first the similarities of users' preferences are calculated from the values of items with Pearson's correlation [10]. Next, the item values for the user who receives the recommendation are predicted from the similarities between that user and others.

2) Model-based method: A model-based method uses a performative model based on the relations between users and items. Examples are the max-entropy (maxent) model and the Markov decision process (MDP) model, which use sales records [11]. The maxent model recommends the items that the user is most likely to select based on the sales records. Concretely, when sales records exist as a time series, a system with the maxent model calculates the conditional probability of the next usage and recommends the items with the highest probability. The MDP model, on the other hand, is an extension of the maxent model [12]. It considers, through a reward function, the effect of whether or not the system makes a recommendation to users, and it allows systems to make recommendations in consideration of the total rating value of items.

Our system is an application of the model-based method because it creates the preference transition network, a model that expresses the relations between users and hotels, in advance. Let us compare our system with the others. The maxent and MDP models are based on probability. Our model, the preference transition network, uses not probability but a statistical test. Specifically, our system makes recommendations qualitatively, in contrast to the other systems, which make recommendations quantitatively. Although qualitative recommendation has the demerit that the recommendation is ambiguous because it does not express a value, it has the merit that users can understand the process of recommendation more easily. Additionally, our model can be expected to reduce the ambiguity of recommendation because of the statistical judgment by the Mann-Whitney U test.

VI. CONCLUSION

This paper has proposed a hotel recommender system based on preference transition. Our system is applicable if the recommended item has three characteristics: 1) the item is purchased repeatedly, like commodity products, 2) the user may purchase a new item instead of the item purchased before, and 3) it is difficult to acquire rating information for the items.

In order to recommend hotels, we have proposed a system that identifies the user's preference transition and makes recommendations based on it. To identify the preference transition, we have proposed a method for creating the preference transition network in three steps: 1) expressing hotel relations based on volumes of sales records, 2) filtering relations based on the Simpson coefficient, and 3) identifying preference transition by the Mann-Whitney U test. We have also proposed a recommendation method that consists of two steps: 1) selecting the recommended candidates by target-node selection and tracking-back selection, and 2) ranking the candidates in consideration of in-degree and out-degree in the preference transition network.

With the prototype system, we have confirmed the performance of the preference transition network and the recommendation, and have also gained new insight into them. Our system is not quantitative but qualitative, so it has the demerit that the recommendation is ambiguous because it does not express a value. However, it has the merit that users can easily understand the process of recommendation and can see visually how preference transfers between hotels when the relations of hotels are illustrated.

ACKNOWLEDGEMENT

The authors would like to express sincere thanks to Prof. Ikuo Arizono and Prof. Zhonqi Sheng. Prof. Arizono gave us advice on this research, and Prof. Sheng read the manuscript and gave feedback on it. Their advice and comments are gratefully acknowledged. This research is partially supported by the ICOM Electronic Communication Engineering Promotion Foundation, Japan.

REFERENCES

[1] M. J. Pazzani, "A framework for collaborative, content-based and demographic filtering," Artif. Intell. Rev., vol. 13, no. 5-6, pp. 393-408, 1999.
[2] P. Resnick and H. R. Varian, "Recommender systems," Communications of the ACM, vol. 40, no. 3, pp. 56-58, March 1997.
[3] R. Saga, H. Tsuji, and J. Onoda, "Agent System for Notifying Hotel Room Reservation Alternatives," in Proceeding of Human Computer Interaction International 2005 (HCII2005), Las Vegas, August 2005.
[4] R. Saga and H. Tsuji, "Sales Records Based Recommender System for TPO-Goods," IEEJ Trans. EIS, vol. 126, no. 5, March 2006.
[5] Y. Ohsawa, N. E. Benson, and M. Yachida, "KeyGraph: Automatic Indexing by Segmenting and Unifing Co-occurrence Graphs," IEICE D-I, vol. J82-D-I, no. 2, pp. 391-400, 1999.
[6] R. Feldman, Y. Aumann, A. Zilberstein, and Y. Ben-Yehuda, "Trend graphs: Visualizing the evolution of concept relationships in large document collections," in Proceeding of Second European Symposium on Principles of Data Mining and Knowledge Discovery (PKDD 1998), pp. 38-46, 1998.
[7] E. L. Lehmann, Nonparametric Statistical Methods Based on Ranks. New York: McGraw-Hill, 1975.
[8] M. Balabanovic and Y. Shoham, "Fab: Content-Based, Collaborative Recommendation," Communications of the ACM, vol. 40, no. 3, pp. 66-72, March 1997.
[9] J. S. Breese, D. Heckerman, and C. Kadie, "Empirical analysis of predictive algorithms for collaborative filtering," in Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI98). San Francisco: Morgan Kaufmann, 1998, pp. 43-52.
[10] P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl, "Grouplens: An open architecture for collaborative filtering of netnews," in Proceedings of ACM 1994 Conference on Computer Supported Cooperative Work. Chapel Hill, North Carolina: ACM, 1994, pp. 175-186.
[11] D. Pavlov, E. Manavoglu, and C. Giles, "A maximum entropy approach to collaborative filtering in dynamic, sparse, high-dimensional domains," in Proceedings of the Sixteenth Annual Conference on Neural Information Processing Systems (NIPS-2002), vol. 15, December 2002, pp. 1441-1448.
[12] G. Shani, R. I. Brafman, and D. Heckerman, "An mdp-based recommender system," Journal of Machine Learning Research, vol. 6, pp. 1265-1295, December 2005.
