Emphasizing Temporal-based User Profile Modeling in the Context of Session Search
Ameni Kacem
Mohand Boughanem
Rim Faiz
LARODEC, ISG University of Tunis, Tunisia IRIT, University of Paul Sabatier Toulouse, France
IRIT, University of Paul Sabatier Toulouse, France
LARODEC, IHEC Carthage University of Carthage, Tunisia
[email protected]
[email protected]
[email protected]
ABSTRACT
defined by Boldi et al. [2] is a sequence of queries issued by a single user within a specific time limit.
In this paper, we aim at modeling the user profile containing timely relevant information extracted from his interactions with search engines. We considered a time-sensitive user profile that provides relevant and fresh information inferred from his submitted queries, reformulated queries and clicked results. We used a unique profile that integrates current and recurrent interactions within a session giving more importance to recent interactions without ignoring the old ones. We conducted experiments using the 2013 TREC Session track and the ClueWeb12 collection that showed the effectiveness of our approach compared to state-of-the-art ones.
An effective way to personalize the current query in a session is to understand the user’s interests and preferences, which can be modeled through a user profile and are expressed in his previous interactions such as submitted queries, reformulated queries and clicked results. Our aim is to improve session results by taking the user’s interactions into account, under the assumption that recently performed interactions are more related to the current need than earlier ones. In fact, a user reformulates a query in order to find new relevant information adapted to his current information need. The user profile can be represented as vectors of keywords. The weight of the keywords, in most of the prior works, is assigned using the TF.IDF scheme or its variants [23]. Other approaches extract taxonomies from the Open Directory Project (ODP) hierarchy to represent the user profile [20].
CCS Concepts •Information systems → Information retrieval; Query log analysis; Personalization;
Keywords
In this paper, we address the issue of leveraging user interactions with search engines in order to represent his profile as a vector of keywords where terms are weighted according not only to their frequency but also to their freshness. Specifically, user’s activities are presented in the form of a unique time-sensitive profile that merges both current and recurrent interactions giving more importance to recent ones.
personalization, user profile, session search, temporal dynamics
1. INTRODUCTION
Usually, a user interacts with a search engine by submitting a query describing his information need. A list of results is presented and he clicks on one or more results that interest him. When his information need has not been satisfied, the user reformulates the previous query in different ways [9].
The remainder of this paper is organized as follows. In Sect.2, we review related works focusing on personalized search systems and session search. In Sect.3, we propose a temporal-frequency user profile that adjusts the term frequency according to its recency. Sect.4 describes the experimental methodology used to evaluate our proposed approach based on 2013 TREC Session Track followed by the corresponding results and their discussion. The final Section presents a summary of our work and future directions.
Recently, some works have moved from considering each submitted query independently [15] to taking into account the user behavior towards previous queries with the purpose of satisfying his current information need. A session search, as
2. RELATED WORK
In this Section, we review prior works addressing the problem of personalized search and examine approaches proposed within session search. Personalized search techniques can be used in the session search context in order to adapt search results and improve the accuracy of current query results by incorporating prior information and context.
SAC 2017,April 03-07, 2017, Marrakech, Morocco
Copyright 2017 ACM 978-1-4503-4486-9/17/04. . . $15.00
http://dx.doi.org/10.1145/3019612.3019693
2.1 Personalized Search
The personalization of information retrieval can be achieved using several approaches, namely query expansion and result reranking [18]. Result reranking is the most common approach in the IR field. It consists in reranking the initial list of results returned by a standard search engine by combining the scores obtained from query-document and profile-document matching.
2.2.2 Novelty within Session Search
In [10], the authors considered that there is a need to explore approaches balancing performance and novelty. They assumed that the user’s interest degrades each time he views a result, and showed the usefulness of past queries for whole-session search performance and the effectiveness of click-through information in maintaining novelty. In a previous work by the same authors [11], a method to eliminate duplicate results in ranking was introduced. They simulated users’ browsing behavior in a session search under the assumption that a user reformulates a query to find not only relevant but also novel results.
For example, Shen et al. [19] developed an intelligent client-side web search agent that can handle both query expansion based on previous queries and result reranking based on click-through information. Dou et al. [6] proposed an evaluation framework to compare various personalization strategies such as profile-based and click-based history. A rich user profile is presented in [22], constructed from multiple resources such as issued queries, visited Web pages, documents and emails. Sugiyama et al. [21] modeled the user profile using his browsing history and produced predictions of the user behavior by considering the similarity between the active user and a subset of users. They considered both the persistent and ephemeral preferences.
In addition, Clarke et al. [5] presented a framework for evaluation that emphasizes novelty and diversity based on the probability ranking principle (PRP). In order to satisfy the user intent, Gollapudi et al.[7] used an axiomatic framework using the measures of relevance and novelty. The novelty is obtained by computing the number of categories represented in the list of top-N results for a given query.
2.2.3 Recency and Time within Session Search
In session search, relevant information can be exploited in different ways, but all of them have the same aim: delivering the most relevant results for the current information need. One way to achieve this goal is to consider the temporal sequence in which new information is added to the user profile during a session.
2.2 Session Search
A session comprises a current query, for which we need to predict results, and a set of previous interactions containing past submitted queries and clicked results. The goal of session search is to improve the results of the current query by taking user behavior into consideration.
In [17], the authors proposed an algorithm that improves a query’s original ranking by using a reranker trained on feedback for the nearest query, which is chosen based on different distance measures. They showed that fusing the results of the standard ranking with those returned by the rerankers performs well.
Approaches addressing the session search problem can be grouped into three categories: approaches analyzing the user behavior through historical queries, those integrating novelty and those exploiting the recency of results.
Kotov et al. [13] modeled a classification framework based on features of individual queries and long-term user search behavior at different granularity. Their contribution had an impact on complex information needs and on cross-session search tasks. In [1], the authors addressed the analysis of session strategies effectiveness over time. They proved thanks to the time-based evaluation that the more time is available the less it matters how a user searches.
2.2.1 Historical Queries within Session Search
In [24], the authors proposed an approach for current query change using the Markov Decision Process (MDP) by decomposing each adjacent query-pair into three parts: the added terms, the removed terms and the theme terms (common terms between queries) based on cross-session information. Similarly, Guan et al. [8] proposed a novel query change retrieval model (QCM) based on MDP. To enhance session search, they utilized syntactic changes between nearby queries and the relationship between query change and previously retrieved results.
From previous works, we note that integrating a temporal function is highly recommended in order to track additional information and give more importance to the most recent evidence in the user profile.
In [14], the authors studied the influence of both task type and situation on user’s query reformulation behavior. A taxonomy of query reformulation was proposed based on five reformulation types: Generalization, Specialization, Word Substitution, Repeat and New.
Chen et al. [4] used query expansion based on historical queries and the current query. They used unigram, bigram, 3-gram and 4-gram phrases to detect the entity candidates and then weighted each term or phrase accordingly. In [16], the authors exploited a conservative query expansion strategy based on past queries/clicked documents, similar sessions from other users and their clicked results. Specifically, they used different strategies of segmenting the queries for identifying and underlining concepts in the queries.
3. TIME-SENSITIVE USER PROFILE FOR SESSION SEARCH
In this work, the user profile is represented as a vector of terms corresponding to the user interests and extracted from his browsing history. Our goal is to explore the user’s past interactions and activities in order to enhance the results of the current query. If the user submits different queries, we assume that the recent ones are more likely to fit his information need. This way, we naturally combine short-term and long-term interests. Precisely, we adopt a time-sensitive approach under the assumption that older frequent terms should not outweigh current, less frequent ones. Using the session’s interactions, we first collect keywords from past search queries and click-through documents. Then, we compute their weights by combining both their frequency and their time of appearance. We consider the browsing history, but the approach is not affected by the use of any other source.
Formally, we consider a session search $S = \{ I_j = \{Q_j, \{D_j\}, \{C_j\}\},\ Q_C \mid j = 1:N \}$ where $\{I_j\}$ represents all previous interactions. Each interaction comprises a submitted query $Q_j$ for which the search engine generates a list of documents $\{D_j\}$. A user clicks on a set of documents labeled $\{C_j\}$. Each session contains a current query denoted $Q_C$ for which we need to predict results $\{R\}$.
We define the user profile as a vector $\vec{U}$ of terms and their corresponding global weights $W$:
$$\vec{U} = (t_1 : W_1,\ t_2 : W_2,\ \ldots,\ t_n : W_n) \qquad (1)$$
where $\{t_i \mid i = 1:n\}$ are the terms forming the user profile. The temporal weight $W$ of a term $t$ is obtained as a weighted sum of the weight obtained from previous queries $W_Q(t)$ and the weight obtained from clicked documents $W_C(t)$:
$$W(t) = \beta \cdot W_Q(t) + (1 - \beta) \cdot W_C(t) \qquad (2)$$
On the one hand, $W_Q(t)$ is computed through the following linear combination representing the weight obtained from all previous queries $Q_i$:
$$W_Q(t) = \sum_{i} nTF(t) \cdot K(Q_{Curr}, Q_i) \qquad (3)$$
In equation 3, $Q_{Curr}$ represents the current query, $Q_i$ represents each previous query, $nTF$ is the normalized term frequency of a term and $K(Q_{Curr}, Q_i)$ is its time-biased function that boosts the term frequency of recent terms:
$$K(Q_{Curr}, Q_i) = \frac{1}{\sqrt{2\pi}\,\sigma} \cdot \exp\left[ \frac{-(Q_{Curr}ST - Q_iST)^2}{2\sigma^2} \right] \qquad (4)$$
We propose to use the Gaussian Kernel function, which determines the weight of terms propagated between the current query and each previous one, where $Q_{Curr}ST$ represents the start time of a session’s current query and $Q_iST$ represents the start time of each previous query submitted by the user during the same session. This accumulated frequency-biased weight gives the value of a term $t$ at the current query $Q_{Curr}$ by considering its positions in past queries $Q_i$, favoring recent ones.
On the other hand, we measure the weight $W_C(t)$ obtained from all clicked documents in a session:
$$W_C(t) = \sum_{j} TF.IDF(t) \cdot K(C_{Last}ST, C_jST) \qquad (5)$$
where $TF$ and $IDF$ are the term frequency and the inverse document frequency, and $K(C_{Last}ST, C_jST)$ represents the time-biased function described in equation 4, using the start time of the j-th clicked document compared to that of the last one. If a document was clicked more recently, then it is considered more interesting for the user, because s/he did not find the information being sought in the previous documents and tends to explore new ones.
After measuring the resulting global weight of each term in the user profile, we measure the score of each resulting document as follows:
$$Score(R) = \alpha \cdot Sim(\vec{U}, \vec{R}) + (1 - \alpha) \cdot Sim(\vec{Q}_{Curr}, \vec{R}) \qquad (6)$$
where $Sim(\vec{U}, \vec{R})$ and $Sim(\vec{Q}_{Curr}, \vec{R})$ are the similarities between the user profile and the document result on the one hand, and between the current query and the result on the other hand. α is the parameter balancing the two similarities. Both similarities are measured using the cosine function.
We give a sample session in Table 1. This session is composed of 7 queries, 4 of which have at least one clicked document that the user was interested in. Figure 1 shows the distribution of queries’ terms using the cumulative frequency of words (a) as well as their distribution using the cumulative time-based approach (b). The term “scooter” has a uniform distribution for both approaches as it is used in all previous queries. In Fig. 1(a), the last used terms “stores”, “price” and “review” have the same value (tf = 1). However, in Fig. 1(b), terms that appeared more recently have a higher value when using the temporal-based approach: W(stores) > W(price) > W(review). In fact, we assume that a user changes the keywords of a query when s/he has a need that has not been satisfied previously. Consequently, there is a need to consider the timing and to exploit a temporal distribution of added terms rather than considering only their occurrence.
Figure 1: Distribution of queries’ terms using term-frequency (a) and time-sensitive (b) approaches
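As an illustration of equations 2 to 5, the following Python sketch computes time-sensitive term weights for the previous queries of Table 1. It is a minimal sketch under stated assumptions, not the implementation evaluated in Sect. 4: the normalized term frequency is assumed to be the term count divided by the query length, the Gaussian kernel uses the standard normalization 1/(√(2π)·σ), the start time of the current query (700.0) and the value σ = 300 are hypothetical and chosen only so that the kernel does not vanish on start times expressed in seconds, and all function names are illustrative.

```python
import math
from collections import defaultdict


def gaussian_kernel(t_ref, t_past, sigma):
    """Equation 4: Gaussian weight of an interaction started at t_past, seen from t_ref."""
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return norm * math.exp(-((t_ref - t_past) ** 2) / (2.0 * sigma ** 2))


def query_weights(past_queries, t_current, sigma):
    """Equation 3: W_Q(t) = sum_i nTF(t) * K(Q_Curr, Q_i)."""
    weights = defaultdict(float)
    for text, start_time in past_queries:
        terms = text.lower().split()
        k = gaussian_kernel(t_current, start_time, sigma)
        for term in set(terms):
            ntf = terms.count(term) / len(terms)  # assumed normalization
            weights[term] += ntf * k
    return dict(weights)


def click_weights(clicks, sigma):
    """Equation 5: W_C(t) = sum_j TF.IDF(t) * K(C_Last, C_j).

    `clicks` is a list of (start_time, {term: tfidf}) pairs; the TF.IDF
    values are assumed to be precomputed elsewhere.
    """
    if not clicks:
        return {}
    t_last = max(start_time for start_time, _ in clicks)
    weights = defaultdict(float)
    for start_time, tfidf in clicks:
        k = gaussian_kernel(t_last, start_time, sigma)
        for term, value in tfidf.items():
            weights[term] += value * k
    return dict(weights)


def profile_weights(w_q, w_c, beta=0.7):
    """Equation 2: W(t) = beta * W_Q(t) + (1 - beta) * W_C(t)."""
    terms = set(w_q) | set(w_c)
    return {t: beta * w_q.get(t, 0.0) + (1.0 - beta) * w_c.get(t, 0.0) for t in terms}


# Previous queries and start times taken from Table 1.
PAST_QUERIES = [
    ("scooter brands", 79.932),
    ("scooter brands reliable", 229.262),
    ("scooter", 259.409),
    ("scooter cheap", 303.478),
    ("scooter review", 338.978),
    ("scooter price", 645.962),
    ("scooter stores", 690.053),
]

# Hypothetical current-query start time and sigma (see the assumptions above).
w_q = query_weights(PAST_QUERIES, t_current=700.0, sigma=300.0)
profile = profile_weights(w_q, {}, beta=0.7)  # clicked documents omitted here
for term, weight in sorted(profile.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{term:10s} {weight:.6f}")
```

Under these assumed values, the sketch reproduces the ordering W(stores) > W(price) > W(review) discussed for Fig. 1(b), whereas a plain term count assigns the three terms the same value.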
4. EXPERIMENTS AND RESULTS
In this Section, we investigate the impact of the time-sensitive user profile strategy in the context of session search using
Table 1: Session search example with current query “where to buy scooters”
Previous query                Start-time   SAT Clicks
Q1. Scooter brands            79.932       clueweb12-1616wb-28-27881, clueweb12-0103wb-88-30226
Q2. Scooter brands reliable   229.262      clueweb12-0307wb-60-02121
Q3. Scooter                   259.409      None
Q4. Scooter cheap             303.478      None
Q5. Scooter review            338.978      clueweb12-1616wb-28-27883
Q6. Scooter price             645.962      clueweb12-0002wb-43-35858
Q7. Scooter stores            690.053      None
content criterion and match the query’s terms. As already mentioned, Indri is a search engine that provides state-of-the-art text search and is part of the Lemur Toolkit, which is designed to facilitate research in information retrieval. We removed stop words and applied the Porter stemmer.
2013 TREC Session track data. More specifically, we examine the impact of our proposed temporal pattern on improving the accuracy of Web search. We particularly analyze how the proposed Time-Sensitive User Profile (TSUP), described in Sect. 3, affects personalization and achieves better performance compared to two baseline approaches, namely the standard results returned for the current query without considering prior information, and the personalization approach using only the TF.IDF scoring scheme.
3. Results’ Profiles Creation: After getting the content of each document, we create its corresponding keyword-based profile using the TF.IDF scheme, as it is the most common scheme for Web documents.
4.1 Experiments Settings
4. Smoothed Personalization: The similarity is measured using the cosine function between the document and the query, on the one hand, in order to get the content score of a document. On the other hand, it is measured between the document and the profile, which yields the temporal score of the document. Those similarities are aggregated linearly as described in equation 6 by setting, after a set of experiments, α = 0.6, β = 0.7 and σ = 4 as used in our previous work [12].
4.1.1 Dataset
To evaluate our work, we used the 2013 TREC Session track dataset. The track proposed 69 different topics and 87 sessions used for evaluation. Each session has a topic describing the aim of the search and covers historical queries and their issue times, the ranked list of results, the set of clicked URLs/snippets and the time spent by the user visiting a URL, with an average of 4.4 clicks. The Session track used the ClueWeb12 collection¹. The full collection consists of roughly 730 million English language Web pages, comprising approximately 5TB of compressed data. To perform retrieval within this collection, we used Indri, an indexing and retrieval component of the Lemur Toolkit² developed through a collaboration between the University of Massachusetts and Carnegie Mellon University.
5. Relevance Evaluation: We used judgment values provided by Session Track: -2 for spam document; 0 for not relevant; 1 for relevant; 2 for highly relevant; 3 for key (top result) and 4 for navigational (specific result).
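The smoothed personalization step can be sketched in Python as follows. This is a minimal sketch assuming that the profile, the current query and each candidate document are already available as sparse term-weight dictionaries (for instance TF.IDF vectors after stop word removal and stemming); the function names `cosine` and `rerank` and the toy vectors are hypothetical and serve only to illustrate equation 6 with α = 0.6.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as {term: weight} dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rerank(profile, query, documents, alpha=0.6):
    """Equation 6: Score(R) = alpha * Sim(U, R) + (1 - alpha) * Sim(Q_Curr, R).

    `documents` maps a document id to its term-weight vector. The result is a
    list of (doc_id, score) pairs sorted by decreasing score.
    """
    scored = [
        (doc_id, alpha * cosine(profile, vec) + (1.0 - alpha) * cosine(query, vec))
        for doc_id, vec in documents.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors (illustrative values, not taken from the experiments).
profile = {"scooter": 1.0, "stores": 0.8, "price": 0.5}
query = {"buy": 1.0, "scooter": 1.0}
documents = {
    "d1": {"scooter": 1.0, "stores": 1.0},
    "d2": {"scooter": 1.0, "history": 1.0},
}

for doc_id, score in rerank(profile, query, documents):
    print(doc_id, round(score, 4))
```

Both toy documents match the query equally well, so the profile term “stores” is what moves d1 ahead of d2; with α = 0 the ranking falls back to the query-document similarity alone.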
4.1.3 Evaluation Approach
In order to evaluate our retrieval system, we conduct experiments using three runs: Standard Results (SR), which represents the non-personalized results obtained by submitting the current query; the Term-Frequency Inverse-Document-Frequency based user profile (TF.IDF), an approach that does not consider time; and our personalized model that combines both frequency and temporality, the Time-sensitive User Profile (TSUP).
4.1.2 Evaluation Setup
For each session we follow the next steps: 1. Time-Sensitive User Profile Creation: We create a user profile for each session. We used the scoring approach described in detail in Sect. 3, choosing β = 0.7, which gave the best precision after a series of tests. We consider the start time of each query and each clicked result in order to give more value to recently submitted queries and recently seen results. We used Lucene³ to index the session data, with Lucene stop words removal and PorterStemFilter as the stemmer.
4.1.4 Measures
Based on the qrels provided by NIST, we evaluated the submitted models SR, TF.IDF and TSUP for the 87 queries used for evaluation. In addition to the Mean Average Precision, the Precision P@10 and P@20, and the Normalized Discounted Cumulative Gain nDCG@10 and nDCG@20, we used the Expected Reciprocal Rank (ERR) based on the cascade model of search [3]:
2. Current Query Retrieval: We submit the current query of each session to the Indri search engine and select the top 100 results. Those results satisfy the
¹ http://lemurproject.org/clueweb12/
² http://www.lemurproject.org/
³ https://lucene.apache.org/core/
Figure 2: Impact of features on Precision@10
Figure 3: Query position impact on our personalization model
$$ERR = E(1/s) = \sum_{k=1}^{K} \frac{1}{k}\, p(q, d_k) \prod_{i=1}^{k-1} \bigl(1 - p(q, d_i)\bigr) \qquad (7)$$
where s denotes the rank at which we stop, q is a query in a session, K is the number of returned documents, and p(q, d_k), the probability that document d_k satisfies the user query, is given by a transform of the editorial grade assigned to the query-document pair.
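The ERR measure can be sketched in Python as follows. This is a minimal sketch assuming the grade transform of the cascade model in [3], p = (2^g − 1) / 2^g_max, with g_max = 4 as the highest judgment value of the Session track; mapping negative (spam) grades to 0 is an additional assumption of this sketch, and the function names are illustrative.

```python
def satisfaction_probability(grade, max_grade=4):
    """Transform an editorial grade into a satisfaction probability, as in [3].

    Negative grades (spam) are clamped to 0, which is an assumption of this sketch.
    """
    g = max(grade, 0)
    return (2 ** g - 1) / 2 ** max_grade


def expected_reciprocal_rank(grades, k=10, max_grade=4):
    """ERR@k for a ranked list of editorial grades (first element = rank 1)."""
    err = 0.0
    p_reach = 1.0  # probability that the user reaches the current rank
    for rank, grade in enumerate(grades[:k], start=1):
        p = satisfaction_probability(grade, max_grade)
        err += p_reach * p / rank
        p_reach *= 1.0 - p
    return err


# A key result at rank 2 behind a non-relevant document at rank 1.
print(expected_reciprocal_rank([0, 4]))
```

The running product `p_reach` corresponds to the factor that multiplies p(q, d_k)/k in the ERR formula: a document only contributes if none of the documents ranked above it has already satisfied the user.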
4.2 Results and Discussion
This Section provides the results and the impact of a dynamic representation of the user’s search behavior.
4.2.1 Performance Results
The main results of the comparison between runs are presented in Table 2. Our TSUP achieves the best performance on all evaluation metrics compared to the SR and TF.IDF methods. In fact, aggregating the user’s past activities within a session while giving more importance to recently performed ones significantly improves search relevance. We notice that SR always gives the worst performance. This is due to the fact that it only considers the short-term (current) action of the user and does not consider prior ones.
Table 2: Performance comparison of our personalization approach using various measures
            SR      TF.IDF   TSUP
MAP         0.348   0.352    0.397
ERR@10      0.212   0.254    0.261
nDCG@10     0.149   0.158    0.193
nDCG@20     0.109   0.123    0.155
P@10        0.350   0.352    0.361
P@20        0.247   0.259    0.273
4.2.2 Features impact
We further evaluated in Fig. 2 the impact of each feature taken individually by considering past queries only (Q) and then clicked results only (CL). We can see that the aggregation of those features indeed improves the search accuracy, with an improvement of over 70% compared to Q and of over 30% compared to CL.
In fact, previous queries contain few words compared to clicked results. Terms in clicked documents are used to enrich the queries’ terms. A temporal distribution of those terms gives an overview of their moment of appearance in addition to how often they were used.
4.2.3 Dynamic Personalization impact
We now report the impact of the position of previous queries on the performance of personalization. We found that the more queries we consider, the better the quality of the personalized ranking. As the average session length is 11.5, we consider the 11 submitted queries of each session and study their impact on MAP improvement.
From Fig. 3, we notice that the increase of MAP comes from recent history. These results enabled us to see how the recent interactions of a user affect the quality of the profile. In fact, terms used recently by the user reflect a new information need expressed in the most recent queries. This shows that the temporal feature has an impact on the improvement of the ranking.
5. CONCLUSION
In this paper, we investigated how the temporal-based user profile influences the accuracy of results in the context of session search. We proposed a time-sensitive approach that merges both the frequency and the freshness of the user’s actions thanks to the Kernel function. The vector-based representation takes into account the temporal-frequency of previous queries and clicked documents.
We compared our approach to two non-temporal sensitive approaches: the standard results returned by the Indri search engine and the user profiling based on the Normalized Term Frequency scheme. We found promising results showing the impact of temporal-frequency features such as the query issue time and the time spent visiting a Web page.
In addition, we analyzed the aggregation of the current and recurrent information. We found that an increasing number of items that appeared in recent queries yields a greater improvement in retrieval performance. These results show that the temporal dynamics of users’ activities can play a significant role in session search, and our time-sensitive model is applicable to any collection. Thus, in the future, we will investigate how our user profiling method performs on other collections containing temporal dynamics information.
6. ACKNOWLEDGMENTS
This research work is carried out within the MOBIDOC device, under the PASRI program⁴, managed by the ANPR⁵, and funded by the European Union and Orange Tunisia Corporation⁶. Any opinions, findings, conclusions, or recommendations expressed in this paper are those of the authors, and do not necessarily reflect those of the sponsors.
⁴ www.pasri.tn
⁵ www.anpr.tn
⁶ www.orange.tn
7. REFERENCES
[1] F. Baskaya, H. Keskustalo, and K. Järvelin. Time drives interaction: Simulating sessions in diverse searching environments. In Proceedings of ACM SIGIR’12, pages 105–114. ACM, 2012.
[2] P. Boldi, F. Bonchi, C. Castillo, D. Donato, A. Gionis, and S. Vigna. The query-flow graph: Model and applications. In Proceedings of ACM CIKM’08, pages 609–618. ACM, 2008.
[3] O. Chapelle, D. Metzler, Y. Zhang, and P. Grinspan. Expected reciprocal rank for graded relevance. In Proceedings of ACM CIKM’09, pages 621–630. ACM, 2009.
[4] Z. Chen, L. Xia, X. Yu, Y. Liu, and X. Cheng. ICTNET at Session Track TREC 2013. In Proceedings of TREC’13. NIST Special Publication: SP 500-302, 2013.
[5] C. L. Clarke, M. Kolla, G. V. Cormack, O. Vechtomova, A. Ashkan, S. Büttcher, and I. MacKinnon. Novelty and diversity in information retrieval evaluation. In Proceedings of ACM SIGIR’08, pages 659–666. ACM, 2008.
[6] Z. Dou, R. Song, and J.-R. Wen. A large-scale evaluation and analysis of personalized search strategies. In Proceedings of WWW’07, pages 581–590. ACM Press, 2007.
[7] S. Gollapudi and A. Sharma. An axiomatic approach for result diversification. In Proceedings of WWW’09, pages 381–390. ACM, 2009.
[8] D. Guan, S. Zhang, and H. Yang. Utilizing query change for session search. In Proceedings of ACM SIGIR’13, pages 453–462. ACM, 2013.
[9] B. J. Jansen, A. Spink, and T. Saracevic. Real life, real users, and real needs: A study and analysis of user queries on the web. Inf. Process. Manage., 36(2):207–227, Jan. 2000.
[10] J. Jiang and D. He. Pitt at TREC 2013: Different effects of click-through and past queries on whole-session search performance. In Proceedings of TREC’13, 2013.
[11] J. Jiang, D. He, and S. Han. On duplicate results in a search session. In Proceedings of TREC’12, 2012.
[12] A. Kacem, M. Boughanem, and R. Faiz. Time-sensitive user profile for optimizing search personalization. In Proceedings of the UMAP’14, pages 111–121. Springer International Publishing, 2014.
[13] A. Kotov, P. N. Bennett, R. W. White, S. T. Dumais, and J. Teevan. Modeling and analysis of cross-session search tasks. In Proceedings of ACM SIGIR’11, pages 5–14. ACM, 2011.
[14] C. Liu, J. Gwizdka, J. Liu, T. Xu, and N. J. Belkin. Analysis and evaluation of query reformulations in different task types. In Proceedings of ASIS&T ’10, pages 17:1–17:10. American Society for Information Science, 2010.
[15] T.-Y. Liu. Learning to Rank for Information Retrieval. Springer Berlin Heidelberg, 2011.
[16] M. Hagen, M. Völske, J. Gomoll, M. Bornemann, L. Ganschow, F. Kneist, A. H. Sabri, and B. Stein. Webis at TREC 2013: Session and Web track.
[17] N. Neubauer, C. Scheel, S. Albayrak, and K. Obermayer. Distance measures in query space: How strongly to use feedback from past queries. In Proceedings of IEEE/WIC/ACM WI’07, pages 607–613. IEEE Computer Society, 2007.
[18] J. Pitkow, H. Schütze, T. Cass, R. Cooley, D. Turnbull, A. Edmonds, E. Adar, and T. Breuel. Personalized search. Commun. ACM, 45(9):50–55, Sept. 2002.
[19] X. Shen, B. Tan, and C. Zhai. Implicit user modeling for personalized search. In Proceedings of CIKM’05, pages 824–831, 2005.
[20] M. Speretta and S. Gauch. Personalized search based on user search histories. In Proceedings of IEEE/WIC/ACM WI’05, pages 622–628. IEEE Computer Society, 2005.
[21] K. Sugiyama, K. Hatano, and M. Yoshikawa. Adaptive web search based on user profile constructed without any effort from users. In Proceedings of WWW’04, pages 675–684. ACM, 2004.
[22] J. Teevan, S. T. Dumais, and E. Horvitz. Personalizing search via automated analysis of interests and activities. In Proceedings of ACM SIGIR’05, pages 449–456. ACM, 2005.
[23] D. Vallet, I. Cantador, and J. M. Jose. Personalizing web search with folksonomy-based user and document profiles. In Proceedings of ECIR’10, pages 420–431. Springer-Verlag, 2010.
[24] S. Zhang and H. Yang. Applying the query change retrieval model on session search: Georgetown at TREC 2013 Session track. In Proceedings of The 22nd TREC’13, 2013.