Cross-Lingual Query Suggestion Using Query Logs of Different Languages

Wei Gao 1*, Cheng Niu 2, Jian-Yun Nie 3, Ming Zhou 2, Jian Hu 2, Kam-Fai Wong 1, Hsiao-Wuen Hon 2

1 The Chinese University of Hong Kong: {wgao, kfwong}@se.cuhk.edu.hk
2 Microsoft Research Asia: {chengniu, mingzhou, jianh, hon}@microsoft.com
3 Université de Montréal: [email protected]

* This work was done while the author was visiting Microsoft Research Asia.

Outline

- Introduction
- Discriminative Model for Cross-Lingual Query Suggestion (CLQS)
- Mono-/Cross-Lingual Features
- CLIR with CLQS
- Performance Evaluation
- Conclusions

Query Suggestion

- A functionality that helps search engine users better specify their information needs with related queries that have been frequently used by other users.
- Example: MSN Live Search

More Examples

- MSN's and Google's keyword tools suggest terms for the pay-for-performance search market.

Query Suggestion and Query Expansion

Query suggestion vs. query expansion:

|           | Query expansion                       | Query suggestion                 |
|-----------|---------------------------------------|----------------------------------|
| Target    | Terms and/or phrases                  | Full queries                     |
| Mechanism | Term/phrase extraction from documents | Query extraction from query logs |

Cross-Lingual Query Suggestion (CLQS)

- CLQS suggests related queries, but in a different language.
- Example: (French) terrorisme international → (English) international terrorism, world terrorism, what is terrorism, terrorist attacks, terrorist groups, september 11, ...

CLQS by Query Log Mining

- A query in a source language is likely to have correspondents in the query log of the target language.

| French Query        | Related English Log Queries             |
|---------------------|-----------------------------------------|
| voyage en france    | travel in France; rail travel in France |
| gagner des millions | win some millions                       |
| produits de luxe    | luxury commodity; luxury cars           |
| météo strasbourg    | strasbourg weather                      |

- CLQS is based on mining query logs of different languages:

  Source F Query → Cross-Lingual Query Suggestion (using the E query log) → E Relevant Queries → Application

Outline

- Introduction
- Discriminative Model for Cross-Lingual Query Suggestion (CLQS)
- Mono-/Cross-Lingual Features
- CLIR with CLQS
- Performance Evaluation
- Conclusions

Principled Approach to Cross-Lingual Similarity Estimation

- Central task: define and estimate cross-lingual query similarity.
- Principled approach to similarity estimation: the cross-lingual similarity is set equal to the monolingual similarity between the target query and the source query's translation.
- Example: q_f = "pages jaunes", T(q_f) = "yellow page". Related English log queries (white pages, phone books, home page, telephone directory, phone book, telephone directories, business directory) are fit as CLQS candidates by their monolingual similarity to "yellow page" as the target: yellow page (1.000), telephone directory (0.992), phone book (0.986), white page (0.837), "lost" (0.000).

SVM Regression for CLQS

- Regression model for learning the cross-lingual query similarity function:

  sim_{CL}(q_f, q_e) = sim_{ML}(T_{q_f}, q_e)

- Advantages:
  1. Only needs a list of manually created query-translation pairs.
  2. Can fit any proper monolingual query similarity measure.
  3. A principled way to integrate multiple features.

- SVM regression (a sketch follows):

  sim_{CL}(q_f, q_e) = w \cdot \phi(f(q_f, q_e))
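As a rough illustration of this regression step, here is a minimal sketch assuming scikit-learn; the four-feature layout and all values are invented placeholders, not the paper's exact feature vector.

```python
# Minimal sketch of the regression step, assuming scikit-learn.
# The four features per pair (dictionary, parallel-corpus, web
# co-occurrence, MLQS) and all values are illustrative placeholders.
import numpy as np
from sklearn.svm import SVR

# One row per (source query, candidate target query) training pair:
# the cross-lingual feature vector f(q_f, q_e).
X_train = np.array([
    [0.83, 0.61, 0.40, 0.95],   # strong candidate
    [0.10, 0.05, 0.00, 0.20],   # weak candidate
])
# Regression target: sim_ML(T(q_f), q_e), computed with the
# monolingual similarity on the manually translated queries.
y_train = np.array([0.92, 0.08])

model = SVR(kernel="rbf")  # the mapping phi is implicit in the kernel
model.fit(X_train, y_train)

# At suggestion time, score an unseen candidate pair by its features.
print(model.predict([[0.75, 0.50, 0.30, 0.88]]))
```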

Outline

- Introduction
- Discriminative Model for Cross-Lingual Query Suggestion (CLQS)
- Mono-/Cross-Lingual Features
- CLIR with CLQS
- Performance Evaluation
- Conclusions

Monolingual Query Similarity

- Combines query content-based similarity and click-through-based similarity, both estimated from the query log (Wen et al., ACM TOIS, 2002):

  sim_{ML}(p, q) = \lambda \cdot sim_{content}(p, q) + (1 - \lambda) \cdot sim_{click-through}(p, q)

  sim_{content}(p, q) = \frac{KN(p, q)}{\max(kn(p), kn(q))}

  sim_{click-through}(p, q) = \frac{RD(p, q)}{\max(rd(p), rd(q))}

  where KN(x, y) is the number of query words in common, kn(x) the number of query words in x, RD(x, y) the number of common clicked URLs, and rd(x) the number of clicked URLs of x.
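These formulas transcribe directly to code. A minimal sketch, assuming queries arrive as word sets plus sets of clicked URLs; the weight lambda_ = 0.5 is a placeholder, not a value from the paper.

```python
# Sketch of sim_ML; lambda_=0.5 is a placeholder weight.
def sim_ml(p_words, q_words, p_urls, q_urls, lambda_=0.5):
    # Content-based part: shared query words, normalized by the
    # larger of the two query lengths.
    sim_content = len(p_words & q_words) / max(len(p_words), len(q_words))
    # Click-through part: shared clicked URLs, normalized likewise.
    if p_urls and q_urls:
        sim_click = len(p_urls & q_urls) / max(len(p_urls), len(q_urls))
    else:
        sim_click = 0.0
    return lambda_ * sim_content + (1 - lambda_) * sim_click

# Example: two near-identical queries with one shared clicked URL.
print(sim_ml({"phone", "book"}, {"phone", "books"},
             {"url1", "url2"}, {"url2", "url3"}))
```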

Cross-Lingual Features

Queries q_f and q_e are bilingually similar if:

- they are translatable by a bilingual dictionary;
- they are statistically associated in word-aligned parallel data;
- their query words co-occur frequently on web pages;
- q_e is monolingually similar to queries generated as above.

Cross-Lingual Features (1): Dictionary-Based Translation with Disambiguation

- Dictionary-based translation disambiguation using word co-occurrence statistics:

  q_f = \{w_{f_1}, w_{f_2}, \ldots, w_{f_n}\}, \quad T(w_{f_i}) = \{t_{i1}, t_{i2}, \ldots, t_{im}\}

  MI(t_{ij}, t_{kl}) = P(t_{ij}, t_{kl}) \log \frac{P(t_{ij}, t_{kl})}{P(t_{ij}) P(t_{kl})}

  S_{dict}(q_f, T_{q_f}) = \sum_{i,k,\, i \neq k} MI(t_{ij}, t_{kl})

- Example: q_f = "la seconde guerre mondiale", with candidate translations la → {the, a, it}; seconde → {second, II}; guerre → {war, wartime, quarrel, warfare}; mondiale → {world, worldwide, war}.
- Use the top-4 translations to retrieve target queries, and assign the corresponding translation scores (a disambiguation sketch follows).
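A sketch of the disambiguation score under stated assumptions: the joint and marginal probabilities would come from word co-occurrence statistics in a target-language corpus, and all numbers below are invented for illustration.

```python
# Sketch of S_dict: score a choice of one translation per source word
# by summing MI over translation pairs from different source words.
# All probabilities below are invented placeholders.
import math
from itertools import combinations

def mi(p_joint, p_x, p_y):
    # Pointwise mutual information weighted by the joint probability.
    return p_joint * math.log(p_joint / (p_x * p_y)) if p_joint > 0 else 0.0

def s_dict(chosen_translations, joint_p, marginal_p):
    return sum(
        mi(joint_p.get((x, y), 0.0), marginal_p[x], marginal_p[y])
        for x, y in combinations(chosen_translations, 2)
    )

marginal_p = {"second": 0.02, "war": 0.03, "world": 0.05,
              "quarrel": 0.001, "worldwide": 0.004}
joint_p = {("second", "war"): 0.004, ("second", "world"): 0.002,
           ("war", "world"): 0.006}

# "second war world" should outscore "second quarrel worldwide".
print(s_dict(("second", "war", "world"), joint_p, marginal_p))
print(s_dict(("second", "quarrel", "worldwide"), joint_p, marginal_p))
```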

Cross-Lingual Features (2): Translation Score Based on Parallel Corpora

- Parallel corpus as an auxiliary bilingual resource:
  - Word alignment optimization: GIZA++ (Och and Ney, 2003).
  - Given q_f, retrieve queries from the log containing the aligned words of q_f.
  - Associate the candidate queries with a bi-directional similarity score based on the IBM model 1 (Brown et al., 1993) translation probability, as sketched below:

  S_{IBM1}(q_f, q_e) = P_{IBM1}(q_f \mid q_e) \cdot P_{IBM1}(q_e \mid q_f)

  P_{IBM1}(t \mid s) = \frac{1}{(|s|+1)^{|t|}} \prod_{j=1}^{|t|} \sum_{i=0}^{|s|} p(t_j \mid s_i) = \frac{1}{(|s|+1)^{|t|}} \prod_{j=1}^{|t|} \sum_{i=0}^{|s|} \frac{C(t_j, s_i) + \delta}{C(s_i) + \delta N}
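A minimal sketch of the smoothed IBM-1 probability and the bi-directional score, assuming co-occurrence counts C(t, s) and marginal counts C(s) have been collected from the aligned corpus; the function names and the smoothing constants delta and n_vocab are placeholders.

```python
# Sketch of the smoothed IBM model 1 probability; delta and the
# vocabulary size n_vocab are placeholder smoothing parameters.
def p_ibm1(target, source, cooc, count, delta=0.1, n_vocab=100_000):
    src = ["NULL"] + list(source)          # position 0 is the NULL word
    prob = 1.0 / len(src) ** len(target)   # 1 / (|s| + 1)^{|t|}
    for t in target:
        prob *= sum((cooc.get((t, s), 0) + delta) /
                    (count.get(s, 0) + delta * n_vocab)
                    for s in src)
    return prob

def s_ibm1(q_f, q_e, cooc_fe, cooc_ef, cnt_e, cnt_f):
    # Bi-directional score: product of both translation directions.
    return (p_ibm1(q_f, q_e, cooc_fe, cnt_e) *
            p_ibm1(q_e, q_f, cooc_ef, cnt_f))
```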

Cross-Lingual Features (3): Online Mining for Related Queries

- If a target-language query often co-occurs with the source-language query in many web pages, they are likely to be semantically related.

Cross-Lingual Features (3): Online Mining for Related Queries (cont'd)

- Format bilingual search queries. Example for "la seconde guerre mondiale":

  (la seconde guerre mondiale) AND (the OR la OR a OR it) AND (second OR II) AND (war OR wartime OR quarrel OR warfare) AND (world OR worldwide OR war)

- Co-Occurrence Double Checking (CODC): two objects a and b are considered associated if b can be found by using a as a query, and vice versa (Chen et al., ACL, 2006). The score (sketched below) is:

  S_{CODC}(q_f, q_e) =
  \begin{cases}
    0, & \text{if } freq(q_e @ q_f) \cdot freq(q_f @ q_e) = 0 \\
    e^{\log \left[ \frac{freq(q_e @ q_f)}{freq(q_f)} \times \frac{freq(q_f @ q_e)}{freq(q_e)} \right]^{\alpha}}, & \text{otherwise}
  \end{cases}
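A sketch of the CODC score under stated assumptions: freq(x @ y) denotes how often x appears in the results retrieved by querying y, alpha = 0.15 is a placeholder, and reading the alpha exponent as applying inside the logarithm (so the score reduces to the co-occurrence ratio raised to alpha) is my interpretation that keeps the value real-valued.

```python
# Sketch of the CODC measure. freq_e_at_f = freq(q_e @ q_f), i.e. how
# often q_e occurs in results retrieved with q_f (and symmetrically).
# alpha=0.15 is a placeholder; reading the exponent as applying inside
# the log keeps the score real-valued in (0, 1].
def s_codc(freq_e_at_f, freq_f_at_e, freq_f, freq_e, alpha=0.15):
    # Double checking: each query must retrieve the other.
    if freq_e_at_f == 0 or freq_f_at_e == 0:
        return 0.0
    ratio = (freq_e_at_f / freq_f) * (freq_f_at_e / freq_e)
    return ratio ** alpha   # equivalently exp(alpha * log(ratio))

# Example with invented snippet frequencies.
print(s_codc(freq_e_at_f=12, freq_f_at_e=8, freq_f=40, freq_e=50))
```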

Cross-Lingual Features (4): Monolingual Query Suggestion

- Further improves the recall of CLQS: the initial set of target candidate queries Q0 is expanded using monolingual query suggestion (a sketch follows).
- Example (French query: pages blanches):

  Q0: telephone directory (1.000), phone book (0.986), white page (0.992), yellow page (0.950), business directory (0.850)

  SQ_ML(business directory): white pages (1.000), phone books (0.986), home page (1.000), ...
  SQ_ML(yellow page): telephone directory (1.000), phone book (0.986), telephone directories (1.000), yellow page (0.950), business directory (0.887), ...
  SQ_ML(white page): white page (1.000), telephone directory (0.975), phone book (0.960), yellow page (0.911), business directory (0.887)

  Suggested queries with q_e ∈ Q0 keep their existing scores; those with q_e ∉ Q0 (e.g. telephone directories, home page) are added as new candidates.
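A hedged sketch of this expansion step. Assumptions: ml_suggest(q) is a hypothetical helper returning (suggested query, sim_ML) pairs from the monolingual suggestion system, and scoring a new candidate by its best base-score-times-similarity over the Q0 queries that suggested it is my reading of the diagram, not a confirmed detail of the paper.

```python
# Sketch of expanding the candidate set Q0 via monolingual query
# suggestion; ml_suggest() is a hypothetical helper, and the scoring
# of new candidates is an assumption for illustration.
def expand_candidates(q0_scores, ml_suggest):
    expanded = dict(q0_scores)  # start from the initial candidates
    for q, base in q0_scores.items():
        for q_new, sim in ml_suggest(q):
            if q_new in q0_scores:
                continue        # queries already in Q0 keep their scores
            expanded[q_new] = max(expanded.get(q_new, 0.0), base * sim)
    return expanded

# Example: "telephone directories" enters via "yellow page".
q0 = {"telephone directory": 1.000, "yellow page": 0.950}
suggestions = {"telephone directory": [("phone books", 0.90)],
               "yellow page": [("telephone directories", 1.00)]}
print(expand_candidates(q0, suggestions.get))
```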

Recap: SVM Regression for CLQS

- Regression model for learning the cross-lingual query similarity function:

  sim_{CL}(q_f, q_e) = sim_{ML}(T_{q_f}, q_e)

- Advantages:
  1. Only needs a list of manually created query-translation pairs.
  2. Can fit any proper monolingual query similarity measure.
  3. A principled way to integrate multiple features.

- SVM regression:

  sim_{CL}(q_f, q_e) = w \cdot \phi(f(q_f, q_e))

Outline

- Introduction
- Discriminative Model for Cross-Lingual Query Suggestion (CLQS)
- Mono-/Cross-Lingual Features
- CLIR with CLQS
- Performance Evaluation
- Conclusions

CLIR with CLQS

- Besides serving as a standalone system, CLQS can be leveraged to support CLIR:
  - Given q_f, compute the CLQS set {q_e}.
  - For each q_e, perform monolingual IR based on the BM25 model (Robertson et al., 1995).
  - Documents are merged and re-ranked by the sum of their BM25 scores (see the sketch below).
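A minimal sketch of this retrieve-and-merge step, assuming a hypothetical bm25_search(query) helper that returns a {doc_id: BM25 score} mapping over the target-language collection:

```python
# Sketch of CLQS-based CLIR: retrieve with every suggested query,
# then rank documents by the sum of their BM25 scores.
# bm25_search() is a hypothetical helper, not a real API.
from collections import defaultdict

def clqs_clir(suggested_queries, bm25_search):
    merged = defaultdict(float)
    for q_e in suggested_queries:
        for doc_id, score in bm25_search(q_e).items():
            merged[doc_id] += score   # accumulate across result lists
    # Highest combined score first.
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)
```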

Outline

- Introduction
- Discriminative Model for Cross-Lingual Query Suggestion (CLQS)
- Mono-/Cross-Lingual Features
- CLIR with CLQS
- Performance Evaluation
- Conclusions

Performance Evaluation

- Data resources:
  - CLIR task: French-to-English.
  - 1-month MSN English query log: 7.29M queries with frequency > 5.
  - 4,171 manually created F-E query-translation pairs: 70% training set, 20% test set, 10% development set.
  - Bilingual dictionary: 120,000 F-E entries.
  - Europarl F-E parallel corpus (http://people.csail.mit.edu/koehn/publications/europarl).
  - CLIR benchmark collection: TREC6 CLIR French-English dataset.
    - Document set: AP88-90 newswire, 750MB.
    - Query set: 25 F-E query pairs (CL#01-25), average length 3.3 words.

Objective CLQS Performance

- Benchmark CLQS against Monolingual Query Suggestion (MLQS).
- Mean squared error (MSE) of the SVM regression:

  MSE = \frac{1}{l} \sum_{i} \left[ sim_{CL}(q_{f_i}, q_{e_i}) - sim_{ML}(T_{q_{f_i}}, q_{e_i}) \right]^2

- Classification precision (P) and recall (R), computed as sketched below:

  P = \frac{|S_{CLQS} \cap S_{MLQS}|}{|S_{CLQS}|}, \quad R = \frac{|S_{CLQS} \cap S_{MLQS}|}{|S_{MLQS}|}
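These three measures transcribe directly to code; a sketch with placeholder inputs:

```python
# Direct transcription of MSE and the set-overlap precision/recall.
def mse(sim_cl, sim_ml):
    # Mean squared error over the l test pairs.
    return sum((c - m) ** 2 for c, m in zip(sim_cl, sim_ml)) / len(sim_cl)

def precision_recall(s_clqs, s_mlqs):
    # Overlap of the CLQS suggestions with the MLQS reference set.
    hit = len(s_clqs & s_mlqs)
    return hit / len(s_clqs), hit / len(s_mlqs)

# Placeholder example.
print(mse([0.9, 0.2], [0.8, 0.1]))
print(precision_recall({"a", "b", "c"}, {"b", "c", "d"}))
```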

CLQS performance with different feature settings:

| Features       | MSE (regression) | Precision (classification) | Recall (classification) |
|----------------|------------------|----------------------------|-------------------------|
| DD             | 0.274            | 0.732                      | 0.098                   |
| DD+PC          | 0.224            | 0.713                      | 0.125                   |
| DD+PC+Web      | 0.115            | 0.808                      | 0.192                   |
| DD+PC+Web+MLQS | 0.174            | 0.796                      | 0.421                   |

DD: dictionary only; DD+PC: dictionary and parallel corpora; DD+PC+Web: dictionary, parallel corpora, and web mining; DD+PC+Web+MLQS: dictionary, parallel corpora, web mining, and monolingual query suggestion.

Subjective CLQS Performance

- Human subjective test of CLQS relevancy:
  - 200 French queries from the French log, none in the training examples.
  - 1,727 English queries produced by the model, averaging 8.7 suggestions per query.
  - 1,407 recognized as relevant; accuracy = 80.9%.
- An example (CL14), "terrorisme international" (international terrorism): international terrorism (0.991); what is terrorism (0.943); counter terrorism (0.920); terrorist (0.911); terrorist attack (0.898); international terrorist (0.853); world terrorism (0.845); global terrorism (0.833); transnational terrorism (0.821); human rights (0.811); terrorist groups (0.777); patterns of global terrorism (0.762); september 11 (0.734)

CLIR Performance Using CLQS

- For comparison, we run 4 experiments:
  - CLQS-based CLIR (all features).
  - MT-based query translation (MT): a commercial F-to-E MT system, i.e. Google's translation tool (http://www.google.com/language_tools).
  - Dictionary-based query translation (DT): an implementation of translation disambiguation based on co-occurrence statistics (Ballesteros and Croft, 1998).
  - Post-translation expansion (Ballesteros and Croft, 1997; McNamee and Mayfield, 2002) based on pseudo-relevance feedback (PRF) using the output of CLQS, MT, and DT. PRF takes the top 10 terms from the top 25 retrieved documents.

CLIR Performance Without Post-Translation Expansion

- Average precision:

| IR method   | Average Precision | % of Monolingual IR |
|-------------|-------------------|---------------------|
| Monolingual | 0.266             | 100%                |
| MT          | 0.217             | 81.6%               |
| DT          | 0.186             | 69.9%               |
| CLQS        | 0.233             | 87.6%               |

Monolingual: monolingual IR; MT: CLIR based on machine translation; DT: CLIR based on dictionary translation; CLQS: CLQS-based CLIR.

[Figure: 11-point interpolated precision-recall curves (TREC6) without post-translation expansion, comparing MLIR, MT, DT, and CLQS.]

CLIR Performance with Post-Translation Expansion

- Average precision comparisons before and after post-translation expansion:

| IR method   | AP without PRF | AP with PRF   | Change  |
|-------------|----------------|---------------|---------|
| Monolingual | 0.266 (100%)   | 0.288 (100%)  | +8.27%  |
| MT          | 0.217 (81.6%)  | 0.222 (77.1%) | +2.30%  |
| DT          | 0.186 (69.9%)  | 0.220 (76.4%) | +18.3%  |
| CLQS        | 0.233 (87.6%)  | 0.259 (89.8%) | +11.2%  |

[Figure: 11-point precision-recall curves with pseudo-relevance feedback (TREC6).]

- Significance test (t-test): p-value
