Information Retrieval Group & Machine Learning Group
Predicting query difficulty
David Carmel, Adam Darlow, Shai Fine, Elad Yom-Tov
IBM Haifa Labs
What is query difficulty?
Why predict query difficulty?
- Feedback to the user
- Improving search engine effectiveness
- Additional applications
Talk Outline
- Related work
- Predicting query difficulty – our approach
- Experimental results
- Applications of query prediction: improving effectiveness, missing content detection, distributed information retrieval
Related work
- Clarity score (Cronen-Townsend et al., SIGIR 2002)
- Reliable Information Access (RIA) workshop (2003): investigated the reasons for variability between topics and systems. Failure analysis of several systems on TREC topics identified 10 failure categories; five of the 10 relate to a system's failure to identify all aspects of the topic.
- TREC Robust Track 2004, 2005: rank topics by predicted precision. System performance is measured by comparing the ranking of topics based on their actual precision to the ranking based on their predicted precision.
- SIGIR'05 workshop on Predicting Query Difficulty: www.haifa.il.ibm.com/sigir05-qp/
Some topics (from TREC Robust track)

Query (topic's title)      #Relevant docs   Median Average Precision (Robust04 participants)
Overseas Tobacco Sales     38               0.067
Most Dangerous Vehicles    35               0.004
International Art Crime    34               0.016
Black Bear Attacks         12               0.067
Implant Dentistry           5               0.466
Clues for query difficulty: Query-independent features (Mothe & Tanguy 2005)

Morphological features:
- NBWORDS: number of words
- LENGTH: avg word length
- MORPH: avg # of morphemes per word
- SUFFIX: avg # of suffixed tokens
- PN: avg # of proper nouns
- ACRO: avg # of acronyms
- NUM: avg # of numerical values
- UNKNOWN: avg # of unknown tokens

Syntactical features:
- CONJ: avg # of conjunctions
- PREP: avg # of prepositions
- PP: avg # of personal pronouns
- SYNTDEPTH: avg syntactic depth
- SYNTDIST: avg syntactic link span

Semantic feature:
- SYNSETS: avg polysemy value

Semantic ambiguity (SYNSETS) is the only significant indicator of query difficulty (Pearson correlation ~ -0.3).
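To make these features concrete, here is a minimal Python sketch of how a few of them could be computed; the WordNet sense count stands in for the SYNSETS polysemy value, and the function name and whitespace tokenization are our own simplifications, not an implementation from the paper.

```python
# Sketch: a few of the Mothe & Tanguy query-independent features.
# Assumes nltk is installed and the WordNet corpus has been downloaded
# via nltk.download("wordnet"); queries are assumed non-empty.
from nltk.corpus import wordnet as wn

def query_features(query: str) -> dict:
    words = query.lower().split()
    n = len(words)
    return {
        "NBWORDS": n,                                           # number of words
        "LENGTH": sum(len(w) for w in words) / n,               # avg word length
        "NUM": sum(w.isdigit() for w in words) / n,             # avg # of numerical values
        "SYNSETS": sum(len(wn.synsets(w)) for w in words) / n,  # avg polysemy value
    }

print(query_features("Magnetic Levitation Maglev"))
```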
Clues for query difficulty (cont): Result-set features

[Figure: top-10 result titles for the TREC topic "Hubble Telescope Achievements", as retrieved by the full query and by the sub-queries S1 ("Hubble"), S2 ("Telescope"), and S3 ("Achievements"). S1's results largely coincide with the full query's (Hubble-related documents such as "Flaw in Hubble telescope" and "Hubble builders got award fees, magazine says"); S2 returns generic telescope documents ("Touchy telescope torments controllers", "Future of Keck telescope"); S3 returns documents that merely mention achievements ("Li Peng issues rules on teaching achievements"). The agreement between the full query's results and its sub-queries' results is a clue to query difficulty.]
Measuring the agreement between experts: The Kappa statistic

                       Doctor 1
Doctor 2     Healthy   Sick    Total
Healthy      a         b       a+b
Sick         c         d       c+d
Total        a+c       b+d     N = a+b+c+d

$$\kappa = \frac{P_A - P_E}{1 - P_E}, \qquad
P_A = \frac{a+d}{N}, \qquad
P_E = \frac{a+b}{N}\cdot\frac{a+c}{N} + \frac{c+d}{N}\cdot\frac{b+d}{N}$$
Measuring the agreement between queries: The Kappa statistic

                       Full query
Sub-query    Top 10    Other   Total
Top 10       a         b       a+b
Other        c         d       c+d
Total        a+c       b+d     N = a+b+c+d
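As an illustration of the table above, a minimal Python sketch computing kappa from the top-10 sets of a full query and a sub-query over a collection of N documents (the function and variable names are ours):

```python
# Sketch: Cohen's kappa for the agreement between a full query's and a
# sub-query's top-10 results, over a collection of n_docs documents.
def kappa(full_top10: set, sub_top10: set, n_docs: int) -> float:
    a = len(full_top10 & sub_top10)        # in both top-10 lists
    b = len(sub_top10 - full_top10)        # sub-query only
    c = len(full_top10 - sub_top10)        # full query only
    d = n_docs - a - b - c                 # in neither
    n = float(n_docs)
    p_agree = (a + d) / n                                  # observed agreement
    p_chance = ((a + b) / n) * ((a + c) / n) \
             + ((c + d) / n) * ((b + d) / n)               # chance agreement
    return (p_agree - p_chance) / (1 - p_chance)

print(kappa({1, 2, 3}, {2, 3, 4}, n_docs=500_000))
```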
Measuring kappa using the overlap

IDs of the top-10 documents retrieved:
Full query: "Magnetic Levitation Maglev"
Sub-query 1: Magnetic
Sub-query 2: Levitation
Sub-query 3: Maglev
Sub-query 4: levitation*magnetic
Sub-query 5: levitation*maglev
Sub-query 6: magnetic*maglev

[Table: the top-10 document IDs returned for the full query and for each of the six sub-queries, together with the number of overlaps, i.e., how many of each sub-query's top-10 documents also appear in the full query's top 10 (the overlap counts shown are 3, 4, 0, 4, 6, and 3).]
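A small sketch of the overlap computation; the document IDs below are illustrative placeholders in the spirit of the table above, not a faithful reconstruction of its rows:

```python
# Sketch: count how many of each sub-query's top-10 documents also appear
# in the full query's top 10.
def overlaps(full_top10: list, sub_top10s: dict) -> dict:
    full = set(full_top10)
    return {name: len(full & set(top10))
            for name, top10 in sub_top10s.items()}

# Illustrative document IDs for "Magnetic Levitation Maglev".
full_query = [77002, 39741, 76311, 6794, 50129,
              47457, 35274, 75402, 67036, 12944]
sub_queries = {
    "Magnetic":   [39741, 77002, 87941, 33402, 47457,
                   1013, 22953, 17382, 43506, 89657],
    "Levitation": [35273, 69131, 9273, 47457, 77266,
                   17948, 77402, 69775, 77002, 87941],
}
print(overlaps(full_query, sub_queries))
```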
Block diagram of Query Prediction

[Diagram: the query is broken into keywords and lexical affinities; the full query and each sub-query are submitted to the search engine (SE); the overlap between the sub-queries' results and the full query's results is measured.]

Query prediction can be computed efficiently if we have access to the SE scoring process.
Additional clues from the retrieval process:
- Query terms' document frequency
- Top document score (a low top score is a good indication of difficulty)
- Number of terms in the query

A sketch assembling these clues into a feature vector follows.
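This is a hedged sketch of how the clues might feed a learned predictor; the `engine` interface (top-10 IDs and scores, per-term document frequencies) is an assumption for illustration, not the paper's API:

```python
# Sketch: assemble per-query features for a difficulty predictor.
# `engine` is assumed to expose top-10 doc IDs and scores plus per-term
# document frequencies; a real system would substitute its own API.
def prediction_features(query: str, engine) -> list:
    terms = query.split()
    full_top = engine.top10(query)
    sub_overlaps = [len(set(full_top.ids) & set(engine.top10(t).ids))
                    for t in terms]                    # overlap per sub-query
    dfs = [engine.doc_freq(t) for t in terms]          # document frequencies
    return [
        len(terms),                                    # number of query terms
        full_top.scores[0],                            # top document score
        sum(sub_overlaps) / len(terms),                # mean sub-query overlap
        sum(dfs) / len(terms),                         # mean term DF
    ]
```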
Experiments: The Robust Track task
- A collection of approximately 500,000 documents
- 249 queries with known relevant documents
- Each query consists of 3 parts: Title, Description, Narrative

Example:
Title: African Civilian Deaths
Description: How many civilian non-combatants have been killed in the various civil wars in Africa?
Narrative: A relevant document will contain specific casualty information for a given area, country, or region. It will cite numbers of civilian deaths caused directly or indirectly by armed conflict.
Query Prediction evaluation
Following the Robust track of TREC 2004, we use Kendall's tau to measure the distance between the queries sorted by actual difficulty and the queries sorted by predicted difficulty.

Kendall's tau measures the distance between two ordered lists by counting the number S of (bubble-sort) swap operations required to sort one list into the order of the other:

$$\tau = 1 - \frac{2S}{\binom{n}{2}}$$

Examples (n = 5):
- (1 2 3 4 5) vs. (2 3 1 4 5): S = 2, tau = 0.6
- (1 2 3 4 5) vs. (2 1 3 4 5): S = 1, tau = 0.8
- (1 2 3 4 5) vs. (1 2 3 4 5): S = 0, tau = 1
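A compact sketch of the computation, counting discordant pairs directly (a production evaluation would typically call scipy.stats.kendalltau instead):

```python
# Sketch: Kendall's tau between two rankings of the same items,
# via an O(n^2) count of discordant (swap-requiring) pairs.
from itertools import combinations

def kendall_tau(actual: list, predicted: list) -> float:
    n = len(actual)
    pos = {item: i for i, item in enumerate(predicted)}
    # A pair is discordant if `actual` and `predicted` order it differently.
    discordant = sum(1 for x, y in combinations(actual, 2) if pos[x] > pos[y])
    return 1 - 2 * discordant / (n * (n - 1) / 2)

print(kendall_tau([1, 2, 3, 4, 5], [2, 3, 1, 4, 5]))  # -> 0.6
```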
Query Prediction results (Kendall-tau scores): Comparison to other methods in the Robust track 2004
(49 new queries of the TREC 2004 Robust track)

Method                          Title Query   Description Query
Top score                       0.260         0.379
Ave. top-10 scores              0.211         0.294
Standard deviation of the idf   0.110         0.243
Overlap                         0.371         0.571
How the search engine can use Query Prediction
- Selective query expansion (see the sketch below)
- Modifying the search engine parameters
- Switching between query parts
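A hedged sketch of the selective-expansion decision: expand only when the predictor deems the query easy enough for expansion to help rather than cause query drift. The threshold value and the `engine`/`predict` interfaces are illustrative assumptions, not the paper's:

```python
# Sketch: use predicted difficulty to gate automatic query expansion (AQE).
# Expansion tends to help easy queries and hurt hard ones, so expand only
# when the predicted performance clears a tuned threshold.
EXPANSION_THRESHOLD = 0.5  # illustrative; tuned on held-out queries

def search_with_selective_aqe(query: str, engine, predict) -> list:
    if predict(query) >= EXPANSION_THRESHOLD:   # predicted easy enough
        query = engine.expand(query)            # e.g., pseudo-relevance feedback
    return engine.search(query)
```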
Results (TREC Robust Track 2004)

Run name                               MAP      P@10    %no
Description only                       0.281    0.473    9.5
Description with AQE                   0.284    0.467   11.5
Description with selective AQE         0.285    0.478    9.5
Description with modified parameters   0.282    0.467    9.5
Title only                             0.271    0.437   10.0
Title + Description                    0.294    0.484    8.5
Title switch Description               0.295    0.492    6.0

(%no: percentage of topics with no relevant document in the top 10 results)
Identifying missing content queries: Knowing what you (don't) know
Missing content queries (MCQs) are queries for which there are no relevant documents in the index.

Why is identification of MCQs important?
- For the search engine operator: logging the information that interests users but that the document collection cannot answer.
- For the user: identifying queries that cannot be answered by the search engine.

Identification of MCQs is performed using a modified version of the Query Prediction.
Identifying missing content queries: Proof of concept
Goal: distinguish MCQ queries from non-MCQ queries.

Experiment:
- All relevant documents of 170 queries were deleted from the collection, thus generating MCQs.
- An MCQ predictor was trained on that data (based on overlaps).
- Results were evaluated using 10-fold cross-validation.
- The obtained ROC area is over 0.34.
- The original Query predictor is used to pre-filter easy queries; with this pre-filter the obtained ROC area is over 0.9.

[System diagram: Query -> Query predictor -> MCQ predictor -> Decision]
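A schematic sketch of the two-stage pipeline in the diagram above; both predictors are assumed to be already trained, and the threshold is illustrative:

```python
# Sketch: two-stage MCQ detection. The query-difficulty predictor first
# filters out queries that look easy (easy queries are unlikely to be MCQs);
# the MCQ classifier then decides on the remaining hard queries.
def is_missing_content(query: str, difficulty_pred, mcq_pred,
                       easy_threshold: float = 0.5) -> bool:
    if difficulty_pred(query) >= easy_threshold:
        return False                 # predicted easy -> content exists
    return mcq_pred(query)           # hard query -> ask the MCQ classifier
```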
Federation
Given several databases that might contain information relevant to a given question, how do we construct a good unified list of answers from all these datasets?

[Diagram: the query is sent to a search engine over each dataset; the per-dataset results are federated into a merged ranking.]
Federation using Query Prediction
- Train a predictor for each collection
- For a given query: predict its difficulty for each collection, weight the results retrieved from each collection accordingly, and generate the federated list (see the sketch below)
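A hedged sketch of prediction-weighted merging; the linear weighting of scores and the `collections`/`predictors` interfaces are our assumptions for illustration, not the paper's exact scheme:

```python
# Sketch: federate per-collection result lists, weighting each collection's
# scores by its predicted query performance for this query.
import heapq

def federate(query: str, collections: dict, predictors: dict, k: int = 10):
    merged = []
    for name, engine in collections.items():
        weight = predictors[name](query)          # predicted performance
        for doc_id, score in engine.search(query):
            merged.append((weight * score, doc_id, name))
    return heapq.nlargest(k, merged)              # top-k of the unified list
```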
Results of the Federation experiment
- The TREC-8 collection was divided into 4 sub-collections
- A query predictor was trained for each sub-collection
- Search results returned from each sub-collection were merged by different merging schemes

Single collection    P@10     %no
FBIS                 0.176    51.8
FR94                 0.070    75.1
FT                   0.262    32.1
LA-Times             0.256    28.9

Merge method         P@10     %no
Unweighted           0.338    20.9
CORI                 0.347    17.7
Prediction           0.363    15.5
Meta-search using Query Prediction
- Part of the TREC-8 collection (LA Times) was indexed using four search engines.
- A predictor was trained for each search engine.
- For each query, the results-set from each search engine was weighted using the prediction for that specific query.
- The final ranking is a ranking of the union of the results-sets, weighted by the prediction.

[Diagram: the query is sent to each of the four search engines over the collection; the per-engine results are federated into a merged ranking.]
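Raw scores from different engines are generally incomparable, so this sketch merges by prediction-weighted reciprocal ranks; the rank-based weighting is our illustrative choice, not necessarily the scheme used in the experiment:

```python
# Sketch: prediction-weighted metasearch over engines whose raw scores are
# incomparable. Each document earns weight / rank from every engine that
# returns it; the final ranking sorts by the accumulated weight.
from collections import defaultdict

def metasearch(query: str, engines: dict, predictors: dict, k: int = 10):
    votes = defaultdict(float)
    for name, engine in engines.items():
        weight = predictors[name](query)              # predicted performance
        for rank, doc_id in enumerate(engine.top_ids(query), start=1):
            votes[doc_id] += weight / rank            # rank-discounted vote
    return sorted(votes, key=votes.get, reverse=True)[:k]
```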
How similar are the desktop search engines?
Results of the metasearch experiment

Single search engine   P@10     %no
SE 1                   0.139    47.8
SE 2                   0.153    43.4
SE 3                   0.094    55.2
SE 4                   0.171    37.4

Metasearch             P@10     %no
Round-robin            0.164    45.0
MetaCrawler            0.163    34.9
Prediction-based       0.183    31.7
Summary
Query difficulty estimation:
- is achievable, at low computational cost
- provides some insight into what makes a query difficult for search engines
- is useful both in simple applications (improving effectiveness) and in more sophisticated ones (MCQ estimation and distributed IR)

Future work:
- Improve prediction methods
- Better understand what makes a query difficult
Thanks!
Any difficult questions?
Query estimation results (Kendall-tau scores)

Collection                   Query type    KT (MAP)   KT (P@10)
TREC (249 queries)           Title         0.254      0.253
                             Description   0.439      0.360
WT10G (100 queries)          Title         0.110      0.155
                             Description   0.093      0.140
TREC+WT10G (349 queries)     Title         0.312      0.291
                             Description   0.464      0.414

Observations:
- Better results for longer queries
- More training data does help