Ranking Methods in Machine Learning: A Tutorial Introduction

Shivani Agarwal
Computer Science & Artificial Intelligence Laboratory
Massachusetts Institute of Technology

Example 1: Recommendation Systems

Example 2: Information Retrieval


Example 3: Drug Discovery

Problem: Millions of structures in a chemical library. How do we identify the most promising ones?

Example 4: Bioinformatics

Human genetics is now at a critical juncture. The molecular methods used successfully to identify the genes underlying rare mendelian syndromes are failing to find the numerous genes causing more common, familial, nonmendelian diseases . . .

With the human genome sequence nearing completion, new opportunities are being presented for unravelling the complex genetic basis of nonmendelian disorders based on large-scale genomewide studies . . .

Nature 405:847–856, 2000

Types of Ranking Problems

Instance Ranking
Label Ranking
Subset Ranking
Rank Aggregation

Instance Ranking

doc1 > doc2, 10
doc3 > doc5, 20
…

Label Ranking

doc1: sports > politics, health > money
doc2: science > sports, money > politics
…

Subset Ranking

query 1: doc1 > doc2, doc3 > doc5, …
query 2: doc2 > doc4, doc11 > doc3, …
…

Rank Aggregation

query 1
  results of search engine 1, …
  desired ranking
query 2
  …
  desired ranking

Types of Ranking Problems

This tutorial: Instance Ranking and Subset Ranking

Tutorial Road Map

Part I: Theory & Algorithms
  Bipartite Ranking
  k-partite Ranking
  Ranking with Real-Valued Labels
  General Instance Ranking
  RankSVM, RankBoost, RankNet

Part II: Applications
  Applications to Bioinformatics
  Applications to Drug Discovery
  Subset Ranking and Applications to Information Retrieval

Further Reading & Resources

Part I: Theory & Algorithms [for Instance Ranking]

Bipartite Ranking

Relevant (+): doc1+, doc2+, doc3+, …
Irrelevant (-): doc1-, doc2-, doc3-, doc4-, …

Is Bipartite Ranking Different from Binary Classification?
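A minimal sketch of how the two objectives can diverge, assuming the usual definitions: classification error counts instances on the wrong side of a threshold, while bipartite ranking error counts misordered (relevant, irrelevant) pairs, which is one minus the AUC. The function names and the toy scores are hypothetical and chosen only for illustration.

```python
def classification_error(pos_scores, neg_scores, threshold=0.0):
    """Fraction of instances on the wrong side of the threshold."""
    wrong = sum(s <= threshold for s in pos_scores)
    wrong += sum(s > threshold for s in neg_scores)
    return wrong / (len(pos_scores) + len(neg_scores))


def bipartite_ranking_error(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked in the wrong order.

    Ties count as half an error, so the result equals 1 - AUC.
    """
    misordered = 0.0
    for sp in pos_scores:
        for sn in neg_scores:
            if sp < sn:
                misordered += 1.0
            elif sp == sn:
                misordered += 0.5
    return misordered / (len(pos_scores) * len(neg_scores))


# Two hypothetical scoring functions evaluated on the same 3 relevant and
# 3 irrelevant documents (scores are made up for illustration).
scores_a = {"pos": [2.0, 1.0, -0.5], "neg": [0.5, -1.0, -2.0]}
scores_b = {"pos": [2.0, 1.0, -3.0], "neg": [3.0, -1.0, -2.0]}

for name, s in (("A", scores_a), ("B", scores_b)):
    print(name,
          classification_error(s["pos"], s["neg"]),
          bipartite_ranking_error(s["pos"], s["neg"]))
```

Both scorers misclassify two of the six documents at threshold 0, yet A misorders 1 of the 9 pairs while B misorders 5 of 9, so equal classification error does not imply equal ranking quality.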






Bipartite Ranking: Basic Algorithmic Framework

Bipartite RankSVM Algorithm

[Herbrich et al, 2000; Joachims, 2002; Rakotomamonjy, 2004]
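A rough sketch of the idea behind bipartite RankSVM, under simplifying assumptions: a linear scoring function, a regularized hinge loss averaged over all (relevant, irrelevant) pairs, and plain subgradient descent in place of the quadratic-program or kernel solvers used in the cited work. The function names, the toy data, and the values of `lam`, `step`, and `iters` are illustrative choices, not taken from the tutorial.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def ranksvm_objective(w, pos, neg, lam):
    """Mean pairwise hinge loss plus (lam / 2) * ||w||^2."""
    hinge = sum(max(0.0, 1.0 - (dot(w, p) - dot(w, n)))
                for p in pos for n in neg)
    return hinge / (len(pos) * len(neg)) + 0.5 * lam * dot(w, w)


def train_ranksvm(pos, neg, lam=0.01, step=0.1, iters=100):
    """Subgradient descent on the objective above (linear scorer)."""
    w = [0.0] * len(pos[0])
    n_pairs = len(pos) * len(neg)
    for _ in range(iters):
        grad = [lam * wi for wi in w]          # gradient of the regularizer
        for p in pos:
            for n in neg:
                if dot(w, p) - dot(w, n) < 1.0:  # pair violates the margin
                    for j in range(len(w)):
                        grad[j] -= (p[j] - n[j]) / n_pairs
        w = [wi - step * gi for wi, gi in zip(w, grad)]
    return w


# Toy data: two relevant and two irrelevant documents with two features each.
pos = [[2.0, 1.0], [3.0, 0.5]]
neg = [[0.0, 1.0], [1.0, 0.0]]
w = train_ranksvm(pos, neg)
print(w, ranksvm_objective(w, pos, neg, lam=0.01))
```

On this toy data the learned `w` scores every relevant document above every irrelevant one; on real data the pairwise loop grows as the product of the two class sizes, which is one reason specialised solvers are used in practice.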

Bipartite RankBoost Algorithm

[Freund et al, 2003]
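A compact sketch of the boosting loop in the bipartite setting, assuming {0, 1}-valued threshold weak rankers on single features and the closed-form weight alpha = 0.5 * ln((1 + r) / (1 - r)), which is one of the weight choices described by Freund et al. The exhaustive search over thresholds, the clipping constant `eps`, and the toy data are illustrative assumptions rather than details from the tutorial.

```python
import math


def weak(x, feature, theta):
    """Threshold weak ranker with values in {0, 1}."""
    return 1.0 if x[feature] > theta else 0.0


def rankboost(pos, neg, rounds=5, eps=1e-9):
    """Return a list of (alpha, feature, theta) triples."""
    pairs = [(i, k) for i in range(len(pos)) for k in range(len(neg))]
    dist = {pair: 1.0 / len(pairs) for pair in pairs}  # uniform over pairs
    candidates = [(j, x[j]) for j in range(len(pos[0])) for x in pos + neg]
    ensemble = []
    for _ in range(rounds):
        # Pick the weak ranker whose weighted agreement r is largest in magnitude.
        best_r, best_j, best_theta = 0.0, candidates[0][0], candidates[0][1]
        for j, theta in candidates:
            r = sum(d * (weak(pos[i], j, theta) - weak(neg[k], j, theta))
                    for (i, k), d in dist.items())
            if abs(r) > abs(best_r):
                best_r, best_j, best_theta = r, j, theta
        r = max(-1.0 + eps, min(1.0 - eps, best_r))  # keep the log finite
        alpha = 0.5 * math.log((1.0 + r) / (1.0 - r))
        ensemble.append((alpha, best_j, best_theta))
        # Re-weight: pairs the chosen ranker orders correctly lose weight.
        for (i, k) in pairs:
            margin = (weak(pos[i], best_j, best_theta)
                      - weak(neg[k], best_j, best_theta))
            dist[(i, k)] *= math.exp(-alpha * margin)
        z = sum(dist.values())
        for pair in pairs:
            dist[pair] /= z
    return ensemble


def score(ensemble, x):
    """Final ranking function: weighted sum of the chosen weak rankers."""
    return sum(alpha * weak(x, j, theta) for alpha, j, theta in ensemble)


pos = [[2.0, 1.0], [3.0, 0.5]]
neg = [[0.0, 1.0], [1.0, 0.0]]
ensemble = rankboost(pos, neg)
print([round(score(ensemble, x), 2) for x in pos + neg])
```

The distribution is kept over pairs rather than over single instances, which is the main structural difference from AdaBoost for classification.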

Bipartite RankNet Algorithm

[Burges et al, 2005]
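A simplified sketch of the RankNet training signal, assuming every (relevant, irrelevant) pair has target probability 1 for the relevant document, so the pairwise cross-entropy reduces to log(1 + exp(-(f(x+) - f(x-)))). RankNet itself uses a neural network for f; a linear scorer trained by batch gradient descent is substituted here to keep the example self-contained. Names, data, and step size are illustrative.

```python
import math


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def ranknet_loss(w, pos, neg):
    """Mean pairwise logistic loss over all (positive, negative) pairs."""
    total = sum(math.log1p(math.exp(-(dot(w, p) - dot(w, n))))
                for p in pos for n in neg)
    return total / (len(pos) * len(neg))


def train_ranknet_linear(pos, neg, step=0.1, iters=200):
    """Batch gradient descent on the loss above with a linear scorer."""
    w = [0.0] * len(pos[0])
    n_pairs = len(pos) * len(neg)
    for _ in range(iters):
        grad = [0.0] * len(w)
        for p in pos:
            for n in neg:
                diff = dot(w, p) - dot(w, n)
                # Derivative of log(1 + exp(-diff)) with respect to diff.
                coeff = -1.0 / (1.0 + math.exp(diff))
                for j in range(len(w)):
                    grad[j] += coeff * (p[j] - n[j]) / n_pairs
        w = [wi - step * gi for wi, gi in zip(w, grad)]
    return w


pos = [[2.0, 1.0], [3.0, 0.5]]
neg = [[0.0, 1.0], [1.0, 0.0]]
w = train_ranknet_linear(pos, neg)
print(w, ranknet_loss(w, pos, neg))
```

Compared with the hinge loss of RankSVM and the exponential loss of RankBoost, this loss is smooth, which is what makes plain gradient descent (and backpropagation through a network) applicable.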

k-partite Ranking

Rating k: doc1^k, doc2^k, doc3^k, …
…
Rating 2: doc1^2, doc2^2, doc3^2, doc4^2, …
Rating 1: doc1^1, doc2^1, doc3^1, doc4^1, …

k-partite Ranking: Basic Algorithmic Framework

Ranking with Real-Valued Labels

doc1, y1
doc2, y2
doc3, y3
…

Ranking with Real-Valued Labels: Basic Algorithmic Framework
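One way to measure ranking quality when labels are real-valued, sketched under an assumption: each misordered pair is penalized in proportion to the gap between its labels, with ties in the scores counted as half an error. Normalizing by the total label gap is an illustrative choice made here so the result lies in [0, 1]. The same function applies to k-partite data if the ratings 1, …, k are used as labels.

```python
def weighted_pairwise_error(scores, labels):
    """Label-gap-weighted fraction of misordered pairs.

    scores: values assigned by the ranking function.
    labels: real-valued relevance labels, same length as scores.
    """
    total_gap = 0.0
    penalty = 0.0
    n = len(scores)
    for i in range(n):
        for j in range(i + 1, n):
            gap = abs(labels[i] - labels[j])
            if gap == 0:
                continue  # equal labels express no preference
            hi, lo = (i, j) if labels[i] > labels[j] else (j, i)
            total_gap += gap
            if scores[hi] < scores[lo]:
                penalty += gap
            elif scores[hi] == scores[lo]:
                penalty += 0.5 * gap
    return penalty / total_gap if total_gap else 0.0


labels = [3.0, 1.0, 2.0]
print(weighted_pairwise_error([0.9, 0.5, 0.7], labels))  # consistent ordering
print(weighted_pairwise_error([0.9, 0.8, 0.7], labels))  # one small-gap mistake
```

Swapping two documents whose labels are close costs little, while swapping documents with very different labels costs a lot, which is the behaviour a plain pair count cannot express.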

General Instance Ranking

doc1 > doc1’, r1
doc2 > doc2’, r2
…

General Instance Ranking: Basic Algorithmic Framework

General RankSVM Algorithm

[Herbrich et al, 2000; Joachims, 2002]

General RankBoost Algorithm

[Freund et al, 2003]

General RankNet Algorithm

[Burges et al, 2005]

Part II: Applications [and Subset Ranking]

Application to Bioinformatics

Human genetics is now at a critical juncture. The molecular methods used successfully to identify the genes underlying rare mendelian syndromes are failing to find the numerous genes causing more common, familial, nonmendelian diseases . . .

With the human genome sequence nearing completion, new opportunities are being presented for unravelling the complex genetic basis of nonmendelian disorders based on large-scale genomewide studies . . .

Nature 405:847–856, 2000

Identifying Genes Relevant to a Disease Using Microarray Gene Expression Data

Formulation as a Bipartite Ranking Problem

Relevant: …
Not relevant: …

Microarray Gene Expression Data Sets [Golub et al, 1999; Alon et al, 1999]

Selection of Training Genes

Top-Ranking Genes for Leukemia Returned by RankBoost

[Agarwal & Sengupta, 2009]

Biological Validation

[Agarwal et al, 2010]

Application to Drug Discovery

Problem: Millions of structures in a chemical library. How do we identify the most promising ones?

Formulation as a Ranking Problem with Real-Valued Labels

Cheminformatics Data Sets [Sutherland et al, 2004]

DHFR Results Using RankSVM

[Agarwal et al, 2010]

Application to Information Retrieval (IR)

Learning to Rank in IR










General Subset Ranking

query 1: doc1 > doc2, doc3 > doc5, …
query 2: doc2 > doc4, doc11 > doc3, …
…

Subset Ranking with Real-Valued Relevance Labels

query 1: (doc1, y1), (doc2, y2), (doc3, y3), …
query 2: (doc1, y1), (doc2, y2), (doc3, y3), …
…

RankSVM Applied to IR/Subset Ranking: Standard RankSVM

[Joachims, 2002]
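A short sketch of the data preparation step that distinguishes subset ranking from plain instance ranking: preference pairs are formed only between documents retrieved for the same query, never across queries. The function name, the dictionary layout, and the toy features are hypothetical; the resulting difference vectors are what a pairwise learner such as a linear RankSVM would be trained on.

```python
def within_query_pairs(queries):
    """Build difference vectors x_i - x_j for same-query pairs with y_i > y_j.

    queries: dict mapping a query id to a list of (feature_vector, relevance).
    Returns a list of (query_id, difference_vector) tuples.
    """
    diffs = []
    for qid, docs in queries.items():
        for xi, yi in docs:
            for xj, yj in docs:
                if yi > yj:  # document i is preferred to document j
                    diffs.append((qid, [a - b for a, b in zip(xi, xj)]))
    return diffs


# Toy data: query-document feature vectors with graded relevance labels.
queries = {
    "query 1": [([1.0, 0.2], 2), ([0.5, 0.1], 1), ([0.1, 0.9], 0)],
    "query 2": [([0.3, 0.3], 1), ([0.2, 0.8], 0)],
}
pairs = within_query_pairs(queries)
print(len(pairs))  # 3 pairs from query 1 plus 1 pair from query 2
```

Pooling all five documents and pairing them across queries would compare relevance labels that were assigned relative to different information needs, which is why the pairing is restricted to each query's own result set.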

RankSVM Applied to IR/Subset Ranking: RankSVM with Query Normalization & Relevance Weighting

[Agarwal & Collins, 2010; also Cao et al, 2006]

Ranking Performance Measures in IR: Mean Average Precision (MAP)
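A minimal sketch of MAP for binary relevance, assuming each query's ranked list contains all of its relevant documents (so average precision is normalized by the number of relevant documents found in the list). Function names and the example rankings are illustrative.

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance: binary labels (1 = relevant) in ranked order, best first.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant document
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of the per-query average precision values."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


rankings = [
    [1, 0, 1, 0, 0],  # relevant documents at ranks 1 and 3
    [0, 1],           # relevant document at rank 2
]
print(mean_average_precision(rankings))
```

For the first query the precision values at the relevant ranks are 1 and 2/3, giving an average precision of 5/6; the second query gives 1/2, so the MAP is 2/3.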

Ranking Performance Measures in IR: Normalized Discounted Cumulative Gain (NDCG)
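A minimal sketch of NDCG at a cutoff k, assuming one widely used variant: gain 2^rel - 1 and discount 1 / log2(rank + 1). Other gain and discount choices exist, and the variant used on the original slide is not shown here. Returning 0 for a query with no relevant documents is also a convention chosen for this example.

```python
import math


def dcg_at_k(ranked_relevance, k):
    """Discounted cumulative gain of the top k items."""
    return sum((2 ** rel - 1) / math.log2(rank + 1)
               for rank, rel in enumerate(ranked_relevance[:k], start=1))


def ndcg_at_k(ranked_relevance, k):
    """DCG divided by the DCG of the ideal (label-sorted) ordering."""
    ideal = dcg_at_k(sorted(ranked_relevance, reverse=True), k)
    return dcg_at_k(ranked_relevance, k) / ideal if ideal > 0 else 0.0


print(ndcg_at_k([2, 1, 0], 3))  # already in ideal order
print(ndcg_at_k([0, 1], 2))     # relevant document pushed to rank 2
```

Unlike MAP, NDCG uses graded relevance labels and discounts mistakes by position, so errors near the top of the list cost more than errors further down.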

Ranking Algorithms for Optimizing MAP/NDCG

LETOR 3.0/OHSUMED Data Set [Liu et al, 2007]

OHSUMED Results – NDCG

OHSUMED Results – MAP

Further Reading & Resources [Incomplete!]

Early Papers on Ranking

W. W. Cohen, R. E. Schapire, and Y. Singer, Learning to order things, Journal of Artificial Intelligence Research, 10:243–270, 1999.
R. Herbrich, T. Graepel, and K. Obermayer, Large margin rank boundaries for ordinal regression, Advances in Large Margin Classifiers, 2000.
T. Joachims, Optimizing search engines using clickthrough data, KDD 2002.
Y. Freund, R. Iyer, R. E. Schapire, and Y. Singer, An efficient boosting algorithm for combining preferences, Journal of Machine Learning Research, 4:933–969, 2003.
C.J.C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. Hullender, Learning to rank using gradient descent, ICML 2005.

Generalization Bounds for Ranking

S. Agarwal, T. Graepel, R. Herbrich, S. Har-Peled, and D. Roth, Generalization bounds for the area under the ROC curve, Journal of Machine Learning Research, 6:393–425, 2005.
S. Agarwal and P. Niyogi, Generalization bounds for ranking algorithms via algorithmic stability, Journal of Machine Learning Research, 10:441–474, 2009.
C. Rudin and R. Schapire, Margin-based ranking and an equivalence between AdaBoost and RankBoost, Journal of Machine Learning Research, 10:2193–2232, 2009.

Bioinformatics/Drug Discovery Applications

S. Agarwal and S. Sengupta, Ranking genes by relevance to a disease, CSB 2009.
S. Agarwal, D. Dugar, and S. Sengupta, Ranking chemical structures for drug discovery: A new machine learning approach, Journal of Chemical Information and Modeling, DOI 10.1021/ci9003865, 2010.

Other Applications

Natural Language Processing
M. Collins and T. Koo, Discriminative reranking for natural language parsing, Computational Linguistics, 31:25–69, 2005.

Collaborative Filtering
M. Weimer, A. Karatzoglou, Q. V. Le, and A. Smola, CofiRank - Maximum margin matrix factorization for collaborative ranking, NIPS 2007.

Manhole Event Prediction
C. Rudin, R. Passonneau, A. Radeva, H. Dutta, S. Ierome, and D. Isaac, A process for predicting manhole events in Manhattan, Machine Learning, DOI 10.1007/s10994-009-5166, 2010.

IR Ranking Algorithms

Y. Cao, J. Xu, T.-Y. Liu, H. Li, Y. Huang, and H.W. Hon, Adapting ranking SVM to document retrieval, SIGIR 2006.
C.J.C. Burges, R. Ragno, and Q.V. Le, Learning to rank with non-smooth cost functions, NIPS 2006.
J. Xu and H. Li, AdaRank: A boosting algorithm for information retrieval, SIGIR 2007.
Y. Yue, T. Finley, F. Radlinski, and T. Joachims, A support vector method for optimizing average precision, SIGIR 2007.
M. Taylor, J. Guiver, S. Robertson, and T. Minka, SoftRank: Optimizing non-smooth rank metrics, WSDM 2008.
S. Chakrabarti, R. Khanna, U. Sawant, and C. Bhattacharyya, Structured learning for nonsmooth ranking losses, KDD 2008.
D. Cossock and T. Zhang, Statistical analysis of Bayes optimal subset ranking, IEEE Transactions on Information Theory, 54:5140–5154, 2008.
T. Qin, X.D. Zhang, M.F. Tsai, D.S. Wang, T.Y. Liu, and H. Li, Query-level loss functions for information retrieval, Information Processing and Management, 44:838–855, 2008.
O. Chapelle and M. Wu, Gradient descent optimization of smoothed information retrieval metrics, Information Retrieval (To appear), 2010.
S. Agarwal and M. Collins, Maximum margin ranking algorithms for information retrieval, ECIR 2010.

NIPS Workshop 2005: Learning to Rank
SIGIR Workshops 2007-2009: Learning to Rank for Information Retrieval
NIPS Workshop 2009: Advances in Ranking
American Institute of Mathematics Workshop in Summer 2010: The Mathematics of Ranking

Tutorial Articles & Books

Tie-Yan Liu, Learning to Rank for Information Retrieval, Foundations & Trends in Information Retrieval, 2009.
Shivani Agarwal, A Tutorial Introduction to Ranking Methods in Machine Learning, In preparation.
Shivani Agarwal (Ed.), Advances in Ranking Methods in Machine Learning, Springer-Verlag, In preparation.
