Machine learning for inferring networks from data

Gianluca Bontempi, Patrick E. Meyer, Benjamin Haibe-Kains
{gbonte,pmeyer,bhaibeka}@ulb.ac.be

Machine Learning Group
Département d'Informatique
ULB, Université Libre de Bruxelles
Boulevard du Triomphe - CP 212, Bruxelles, Belgium

http://www.ulb.ac.be/di/mlg


Outline
• The ULB Machine Learning Group
• Machine learning and bioinformatics
• Feature selection
• Network inference
• Beyond dependencies and towards causal discovery
• Future work


ULB Machine Learning Group (MLG)
• 8 researchers (2 professors, 3 PhD students, 3 postdocs).
• Research topics: Knowledge discovery from data, Classification, Computational statistics, Data mining, Regression, Time series prediction, Sensor networks, Bioinformatics, Network inference.
• Computing facilities: a high-performance cluster for the analysis of massive datasets, a Wireless Sensor Lab.
• Website: www.ulb.ac.be/di/mlg
• Scientific collaborations within ULB: IRIDIA (Sciences Appliquées), Physiologie Moléculaire de la Cellule, Bioinformatique des génomes et des réseaux (IBMM), CENOLI (Sciences), Microarray Unit (Hôpital Jules Bordet), Laboratoire de Médecine Expérimentale, Service d'Anesthésie (ERASME).
• Scientific collaborations outside ULB: UCL Machine Learning Group (B), Politecnico di Milano (I), Università del Sannio (I), Helsinki Institute of Technology (FIN).


ULB-MLG: research projects
1. "Integrating experimental and theoretical approaches to decipher the molecular networks of nitrogen utilisation in yeast": ARC (Action de Recherche Concertée).
2. TANIA - Système d'aide à la conduite de l'anesthésie (a decision-support system for the conduct of anaesthesia): WALEO II project funded by the Région Wallonne (2006-2010).
3. "COMP2SYS" (COMPutational intelligence methods for COMPlex SYStems): MARIE CURIE Early Stage Research Training funded by the EU (2004-2008).
4. "AIDAR - Adressage et Indexation de Documents Multimédias Assistés par des techniques de Reconnaissance Vocale" (addressing and indexing of multimedia documents assisted by voice recognition techniques): funded by the Région Bruxelles-Capitale (2004-2006). Partners: Voice Insight, RTBF, Titan.
5. "ARMURS - Automatic Recognition for Map Update by Remote Sensing": funded by IRSIB (2007-2009).
6. OASIS - Detection and analysis of social fraud in Social Security databases: funded by the Belgian Science Policy (2007-2009).
7. PIMAN - Pôle de compétence en Inspection et Maintenance Assistée par langage Naturel (a competence pole in inspection and maintenance assisted by natural language): funded by the Région Bruxelles-Capitale (2007-2008).

Knowledge discovery in bioinformatics
• In recent years the complete genome sequence has been determined for humans and a number of other organisms.
• Determining the nucleotide sequence of a DNA molecule, however, is only a first step towards the ultimate goal of understanding system-level functionality.
• The results of the sequencing efforts and the availability of new measurement tools (e.g. microarrays) make a great volume of data available for analysis.
• This has created the need for (semi-)automated methods to analyze massive datasets. Data analysis methods are expected to support biologists in discovering patterns, understanding correlations, reducing complexity and predicting events. This is often referred to as knowledge discovery.
• Machine learning is the discipline which aims to automate the process of knowledge discovery.


Machine Learning: a definition

The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience. [6]


Machine learning tasks
• Many interesting problems in computer science are so complex that it is difficult or even impossible to program a solution directly.
• Think of how to implement a program able to recognize a face in a photo, decide whether an email is spam, recognize handwriting, or categorize a news article.
• The same happens in bioinformatics.
• Machine learning offers an alternative methodological approach to deal with these problems.
• By exploiting the knowledge extracted from a collected sample of data it is possible to design algorithms able to solve this kind of problem.


Challenges for ML in bioinformatics
The adoption of ML may provide life scientists with useful insight into relevant questions like:
• Is it possible to predict whether two proteins interact on the basis of their primary structure?
• Do two genes belong to the same functional class?
• Which sets of genes are predictive of survival in cancer patients?
• Which kind of transcriptional mechanism underlies the observed expression profiles?
The first three problems can be addressed by supervised learning techniques (e.g. classification or regression). The last problem requires an extension of supervised learning techniques to deal with network issues.


Supervised learning

[Figure: the supervised learning setting. A phenomenon produces input-output observations; a model, fitted to the observations, returns predictions whose error with respect to the observed outputs is used to improve the model.]

• Finite amount of noisy observations.
• No a priori knowledge of the dependency between the inputs and the output which characterizes the phenomenon.
Example of a supervised learning model: a predictive model which, once the expression of a certain set of genes (inputs) has been measured, returns the probability of a cancer disease (output). A minimal sketch of such a model is given below.
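To make the example concrete, here is a hedged R sketch on simulated data; the gene names, coefficients and sample sizes are illustrative assumptions, not part of the original slides.

```r
# Minimal sketch (simulated data): a supervised model mapping gene
# expressions (inputs) to the probability of disease (output).
set.seed(1)
n <- 100; p <- 5                                   # 100 samples, 5 genes
X <- matrix(rnorm(n * p), n, p,
            dimnames = list(NULL, paste0("gene", 1:p)))
beta <- c(2, -1, 0, 0, 1)                          # hypothetical true effects
y <- rbinom(n, 1, plogis(X %*% beta))              # simulated noisy outcome
d <- data.frame(X, disease = y)
model <- glm(disease ~ ., data = d, family = binomial)  # logistic regression
predict(model, newdata = d[1:3, ], type = "response")   # predicted probabilities
```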


Feature selection
• In many bioinformatics problems the number of inputs (features) is significantly larger than the number of samples (high feature-to-sample-ratio datasets). Examples are:
  • breast cancer classification on the basis of microarray data,
  • network inference on the basis of microarray data,
  • analysis of sequence/expression correlation.
• In these cases, it is common practice to adopt feature selection algorithms [3] to improve the generalization accuracy.
• There are many potential benefits of feature selection:
  • facilitating data visualization and data understanding,
  • reducing the measurement and storage requirements,
  • reducing training and utilization times.
• In theory, more features should provide more information, but in practice, with a limited amount of data, excessive features will not only slow down the learning process but also confuse the learning algorithm with irrelevant or redundant features.

Information as a measure of dependency
• The aim of supervised learning techniques is to infer from data a model of the dependency between inputs and output.
• Information theory provides a useful formalism to reason about dependence, independence and conditional independence.
• Mutual information is one of the most widely used probabilistic measures of the dependency between variables.
• It is a measure of the amount of information that one random variable contains about another random variable.
• It can also be considered as the distance from independence of the two variables.
• This quantity is always non-negative, and zero if and only if the two variables are stochastically independent.


Mutual information
Mutual information [1] is a functional of two random variables (e.g. an input x and an output y) which quantifies the amount of dependency between them and satisfies the following properties:
• I(y; x) = I(x; y) ≥ 0
• I(y; x) = H(y) − H(y|x) = H(x) − H(x|y), where H(y) is the entropy of y, which quantifies the uncertainty associated with y, and H(y|x) is the conditional entropy of y, which quantifies the residual uncertainty about y once x is known.
• If x and y are independent then I(x; y) = 0.
• If x and y are jointly normally distributed, mutual information is a function of the correlation ρ: I(x; y) = −(1/2) log(1 − ρ²). (See the estimation sketch below.)
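As a concrete illustration, the following base-R sketch estimates mutual information empirically via the plug-in entropy estimator; the equal-width binning and the bin count are illustrative choices among the many discussed in the estimation literature.

```r
# Minimal sketch: empirical mutual information I(x;y) = H(x) + H(y) - H(x,y)
# computed from discretized data with the plug-in entropy estimator.
entropy <- function(counts) {
  p <- counts / sum(counts)
  p <- p[p > 0]
  -sum(p * log(p))
}
mutual_info <- function(x, y, nbins = 5) {
  xd <- cut(x, nbins); yd <- cut(y, nbins)   # equal-width discretization
  entropy(table(xd)) + entropy(table(yd)) - entropy(table(xd, yd))
}
set.seed(1)
x <- rnorm(1000)
mutual_info(x, x + rnorm(1000))   # clearly positive: the variables are dependent
mutual_info(x, rnorm(1000))       # close to zero (up to finite-sample bias)
```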


Feature selection and mutual information
• In terms of mutual information the feature selection problem can be formulated as follows. Given an output target y and a set of input variables X = {x_1, ..., x_n}, selecting the optimal subset of d variables boils down to the optimization problem

$$X_S^\ast = \arg\max_{X_S \subset X,\; |X_S| = d} I(X_S; y)$$

• However, in practice solving this problem is made extremely difficult by the so-called "curse of dimensionality":
  • the search space is too big to be explored efficiently (for n = 1000 genes and d = 10 there are already about 2.6 × 10^23 candidate subsets);
  • the estimation of the quantities I(X_S; y) for a large dimensionality d and a low number of samples is extremely inaccurate.


Approaches to feature selection
The main approaches to feature selection are:

Filter methods: preprocessing methods which assess the relevance of features without using a specific learning algorithm. Examples are methods that select variables by ranking them, through compression techniques (like PCA) or by computing the correlation with the output. Filter approaches often rely on individual evaluation and low-variate approximation. The most typical one consists in selecting the d variables having the highest univariate mutual information with y (a sketch of this ranking filter is given below):

X_RANK = {x_i : I(x_i; y) ≥ I(x_j; y) for all x_j ∉ X_RANK}

Wrapper methods: methods which assess subsets of variables according to their usefulness to a given predictor. The method conducts a search for a good subset using the learning algorithm itself as part of the evaluation function.
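A hedged sketch of the ranking filter, reusing the mutual_info() estimator from the earlier sketch; the simulated dataset and the choice d = 2 are illustrative.

```r
# Minimal sketch of the ranking filter: score each input by its univariate
# mutual information with the output, keep the top d.
# (mutual_info() is the estimator sketched on the mutual information slide.)
rank_filter <- function(X, y, d, score = mutual_info) {
  scores <- apply(X, 2, score, y = y)          # one univariate score per feature
  names(sort(scores, decreasing = TRUE))[1:d]  # names of the d best features
}
set.seed(2)
X <- matrix(rnorm(200 * 20), 200, 20, dimnames = list(NULL, paste0("g", 1:20)))
y <- X[, 3] - 2 * X[, 7] + rnorm(200)          # only g3 and g7 are relevant
rank_filter(X, y, d = 2)                       # typically recovers "g7" and "g3"
```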


Application: Breast Cancer prognosis
• The goal is to predict the prospect of remission of a breast cancer patient after the initial surgery. This information is extremely important because it assists oncologists in determining which breast cancer patients require chemo-, hormono- or other systemic therapies, and which women can safely be treated with radiotherapy alone.
• Jointly with the Bordet Microarray Unit, we developed ML tools to assist physicians in their evaluation of the clinical outcome of breast cancer.

[Figure: patient timeline from diagnosis and breast surgery + radiotherapy through a 5-10 year follow-up ending in recurrence or remission; the prognosis task maps inputs (clinical or microarray data) to outputs (survival data).]


Breast cancer prognosis
• Breast cancer patients with the same stage of disease can have markedly different treatment responses and overall outcome.
• Cancer classification has been based primarily on the morphological appearance of the tumor, but with serious limitations. Tumors with similar histopathological appearance can follow significantly different clinical courses and show different responses to therapy. The strongest predictors for metastasis fail to accurately classify breast tumors according to their clinical behavior.
• Cancer classification has been difficult in part because it has historically relied on specific biological insights, rather than systematic and unbiased approaches for recognizing tumor subtypes.
• The hope is that the analysis of genome-wide molecular data will bring new insights into the critical biological mechanisms underlying breast cancer progression, as well as significantly improve prognostic prediction.
• Problem: prognostic clinical models predict numerous low-risk patients with early breast cancer (nodal status = 0) as high-risk. This leads to overtreatment.

ML Approach to Prognostication
• Improvement of breast cancer prognostication by using machine learning techniques to analyze microarray and survival data.
• Objective: identification of prognostic gene signatures.
• The idea is to identify prognostic gene signatures and their corresponding risk prediction models exhibiting the following characteristics:
  • good performance on independent data,
  • interpretable from a biological point of view,
  • usable with data generated by different microarray platforms and/or normalization techniques.
Joint work with Benjamin Haibe-Kains (MLG, Bordet) and Christos Sotiriou (Bordet) [8, 9, 2].


Prognostic Signature Identification

[Pipeline: clinical data (clinical outcome) and raw microarray data of breast cancer patients → data preprocessing → normalized gene expressions → stability-based feature selection → signature → robust model building / risk prediction modeling → risk predictions.]


Feature selection procedure
• Feature transformation: identifies clusters of similar genes from the whole set of gene expressions.
  1. Hierarchical clustering is used to compute the full dendrogram of the gene expressions.
  2. The dendrogram is cut to identify clusters of highly correlated genes.
  3. Clusters including a sufficient number of annotated genes are retained.
  4. Each cluster of gene expressions is summarized by a single feature.
• Feature ranking:
  1. The relevance of each individual feature is assessed according to a univariate scoring function S, assumed to be proportional to the relevance of the feature with respect to the prediction task.
  2. All the features are ranked in decreasing order of the scores returned by S.
• Feature selection: a criterion assessing the ranking stability with respect to signature size is used to select the size leading to the most stable signature. A sketch of the transformation step is given below.
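A minimal sketch of the feature-transformation step, assuming correlation-based distances, average linkage and an illustrative cut height; the module structure of the data is simulated.

```r
# Minimal sketch: cluster correlated genes hierarchically, cut the dendrogram,
# keep sufficiently large clusters and summarize each one by a single feature.
set.seed(3)
base <- matrix(rnorm(60 * 5), 60, 5)                       # 5 latent modules
expr <- base[, rep(1:5, each = 10)] +
        matrix(rnorm(60 * 50, sd = 0.4), 60, 50)           # 60 samples x 50 genes
colnames(expr) <- paste0("gene", 1:50)
d  <- as.dist(1 - abs(cor(expr)))        # gene-to-gene correlation distance
hc <- hclust(d, method = "average")      # full dendrogram of gene expressions
cl <- cutree(hc, h = 0.5)                # cut height: an illustrative choice
keep <- names(which(table(cl) >= 3))     # clusters with enough genes
features <- sapply(keep, function(k)
  rowMeans(expr[, cl == as.numeric(k), drop = FALSE]))     # one feature per cluster
dim(features)                            # 60 samples x (number of clusters kept)
```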

Prognostic Gene Signatures
• The inferred genomic signature led to the definition of a clinical prognostic index, the Gene expression Grade Index (GGI).
• GGI [8] was designed to discriminate patients with low and high histological grade (proliferation).
• GGI was able to discriminate patients with intermediate histological grade (HG2), an intermediate case with which the practitioner is often in trouble.

[Figure panels: Histological grade; GGI (HG2); GGI.]


Considerations
We observed that:
• biological knowledge matters;
• simple methods often outperform complex ones in genomic studies, suggesting that variance is the most important term to reduce in the bias-variance trade-off;
• once prediction is at stake, simple filter methods such as feature ranking are particularly well suited thanks to their low computational cost and their reduced risk of overfitting;
• robustness is as important as accuracy; the rationale is that the gain in stability largely compensates for the lack of complexity.
Are univariate techniques (e.g. ranking) powerful enough to perform network inference, too?


Inference of regulatory networks


Inference of regulatory networks
• Most biological regulatory processes involve intricate networks of interactions, and it is now increasingly evident that predicting their behaviour and linking molecular and cellular structure to function are beyond the capacity of intuition.
• The idea is that transcriptional processes of induction and repression are determined through specific interactions, and can be predicted in detail by a logical or a mathematical model.
• The ultimate goal is to know, for each specific gene, what other genes it influences and in what way. The process of building a network of dependencies from data is known as network inference or reverse engineering.
• The availability of genome-wide gene expression technologies has enabled scientists to make considerable progress toward the identification of the interactions between genes in living systems.


Beware! Networks are everywhere!


Levels of interaction
Networks of interactions can be constructed at various levels and can represent different types of interactions.

A gene-to-gene network inferred on the basis of transcriptional measurements returns only an approximation of the complete biochemical regulatory network, since many physical connections between macromolecules may be hidden by shortcuts.

Inference of networks from data
• In general terms, revealing the network of the transcriptional regulation process appears to be a very hard problem for several reasons: noisy data, nonlinear effects, the loose connectivity of the regulation network, the risk of overfitting, and dynamic effects (e.g. feedback, stability) to be taken into consideration.
• The adoption of machine learning techniques, and more specifically feature selection techniques, may shed light on the interaction structure by helping to detect, for each gene, the genes which provide information (i.e. reduce uncertainty) about its expression.
• These problems demand the estimation of a number of predictive models, one for each gene, where the number of features equals the number of measured genes.


Inferring networks from data
• We will focus on the formalism of graphical models to infer a graph from a set of measured genomic data.
• Graphical models are graph representations of the stochastic dependence existing between a large number of variables. In probabilistic terms, they are a visual representation of multivariate probabilistic models.
• The nodes of the graph represent the variables, i.e. the genes (more specifically, a node represents a particular characteristic of a gene, such as its expression level).
• The arrows make explicit the dependence between genes, and the lack of arrows the independence.
• But how to represent the dependence? Is a bivariate measure like mutual information sufficient to build a network from data?


Network patterns
Suppose we consider the following regulatory pattern, where the arrows mean that G1 regulates both G2 and G3. Let the random variable x_i denote the expression level of G_i.

G2 ← G1 → G3

• Is it possible to infer this pattern from data?
• The expression levels of G2 and G3 are strongly correlated with that of G1 (direct interaction).
• At the same time the expression levels of G2 and G3 are correlated with each other, since they are the common effect of G1 (indirect interaction).

Dependency graph
Suppose we infer a network from expression data by setting a link between two nodes whenever they are correlated or dependent. In this case the inferred graph for the previous pattern is the triangle

G2 — G1 — G3, plus the extra edge G2 — G3

• The spurious relation between the genes G2 and G3 is wrongly inferred.
• This is the same error which typically underlies problems of false causality.
• Think for example of the correlation which we could infer, from statistics on past catastrophic events, between the number of firemen present and the number of casualties.
• How to avoid such inconsistency?

Conditional independency
• The previous examples showed that it is important to distinguish between direct and indirect relationships.
• To this end, the notion of conditional independence is required.
• For instance, the expressions of G2 and G3 are dependent but conditionally independent given the expression of G1.
• The notions of independence and conditional independence may be formalized in probabilistic terms by means of information theory.


Conditional mutual information
• Consider three r.v.s x, y and z. The conditional mutual information is defined by I(y; x|z) = H(y|z) − H(y|x, z) and quantifies the additional information about y brought by a variable x once another variable z (or set of variables) is already known. (See the estimation sketch below.)
• The conditional mutual information is null if and only if x and y are conditionally independent given z, i.e. I(x; y|z) = 0 ⇔ x ⊥⊥ y | z.
• Example: does the type x of restaurant (e.g. Italian) bring information about the quality y of the pizza once you know the nationality z of the cook (e.g. Dutch)?
• This shows that independence is a context-dependent relation. Though x ⊥⊥ y, the r.v. x may become dependent on y if we observe another variable z. Conversely, x may become independent of y in the context of z even if x and y are dependent.
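A base-R sketch of how conditional mutual information can be estimated on discretized data, using the identity I(x;y|z) = H(x,z) + H(y,z) − H(z) − H(x,y,z); the binning choices are illustrative.

```r
# Minimal sketch: empirical conditional mutual information
# I(x;y|z) = H(x,z) + H(y,z) - H(z) - H(x,y,z), from discretized data.
entropy <- function(counts) {            # plug-in estimator, as sketched earlier
  p <- counts / sum(counts); p <- p[p > 0]; -sum(p * log(p))
}
cond_mutual_info <- function(x, y, z, nbins = 4) {
  xd <- cut(x, nbins); yd <- cut(y, nbins); zd <- cut(z, nbins)
  entropy(table(xd, zd)) + entropy(table(yd, zd)) -
    entropy(table(zd)) - entropy(table(xd, yd, zd))
}
set.seed(4)
z <- rnorm(5000)                          # common cause
x <- z + rnorm(5000, sd = 0.3); y <- z + rnorm(5000, sd = 0.3)
cond_mutual_info(x, y, z)                 # small: x, y nearly independent given z
```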

Conditional independence graph
Let us consider a set X = {x_1, ..., x_n} of n random variables. The conditional independence graph of X is the undirected graph G = (V, E) where V = {1, ..., n} and (i, j) is NOT in the edge set E if and only if I(x_i; x_j | X_{−{i,j}}) = 0. In other terms, a link exists between i and k if and only if I(x_i; x_k | X_{−{i,k}}) > 0; in this case the variable x_k is said to be relevant for the variable x_i.
• The independence graph conveys a description of the variables' pattern of interaction.
• A way to indicate the strength of each edge (i, j) is to attach to it the conditional mutual information I(x_i; x_j | X_{−{i,j}}).
• Note that for n variables there are $2^{\binom{n}{2}}$ different undirected graphs. How to find the one corresponding to the observed data?


Methods for network inference
The most common methods are:

Score-based algorithms: they associate a score function to a candidate network given a training data set and perform a greedy search in the space of networks. These procedures are typically very computationally expensive in high dimensions.

Constraint-based algorithms: they look for dependencies and conditional dependencies in the data and build the network accordingly. Unfortunately, conditional independence is a multivariate quantity and its estimation can be very inaccurate when only few data are available.

Feature selection algorithms: they adapt the feature selection strategy to the problem of network inference. For each gene, they select the variables (i.e. the other genes) which are relevant to predict it, taking the context into consideration.

Ideally we would like to adopt a feature selection technique which is context-dependent while remaining computationally simple and efficient. A good compromise is provided by the MRMR algorithm.


MRNET
• The MRNET method [5] of network inference relies on the Maximum Relevance Minimum Redundancy (MRMR) algorithm [7].
• All the genes play, one at a time, the role of the target variable y.
• The algorithm builds the set of relevant variables incrementally, selecting at each step the variable which provides the highest information about the target while being at the same time as little redundant as possible with the variables already selected.
• At the m-th step:

$$X_{MR}^{(m)} = X_{MR}^{(m-1)} \cup \left\{ \arg\max_{x_k \in X \setminus X_{MR}^{(m-1)}} \left[ I(x_k; y) - \frac{1}{m-1} \sum_{x_i \in X_{MR}^{(m-1)}} I(x_i; x_k) \right] \right\}$$

• This procedure has the computational benefit that only bivariate mutual information terms have to be computed.
• Efficiency is a key factor in the use of feature selection for network inference, since the selection has to be run once per gene. A sketch of the greedy step is given below.
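A hedged sketch of the per-gene MRMR greedy step, assuming a precomputed mutual information matrix; function and variable names here are ours, not the package's.

```r
# Minimal sketch of the MRMR greedy step used by MRNET, given a precomputed
# mutual-information matrix `mim` (genes x genes) and a `target` gene index.
mrmr_select <- function(mim, target, d) {
  candidates <- setdiff(seq_len(ncol(mim)), target)
  selected <- integer(0)
  for (m in seq_len(d)) {
    relevance  <- mim[candidates, target]               # I(x_k; y)
    redundancy <- if (length(selected) == 0) 0          # mean I(x_i; x_k)
                  else rowMeans(mim[candidates, selected, drop = FALSE])
    best <- candidates[which.max(relevance - redundancy)]
    selected   <- c(selected, best)
    candidates <- setdiff(candidates, best)
  }
  selected
}
# Toy usage with a random symmetric matrix standing in for the MI matrix;
# MRNET would run this once per gene and aggregate the scores into a network.
set.seed(7)
m <- matrix(runif(100), 10, 10); mim <- (m + t(m)) / 2; diag(mim) <- 0
mrmr_select(mim, target = 1, d = 3)
```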

MRMR

[Four figure-only slides illustrating successive steps of the MRMR selection.]

Experimental framework

[Pipeline: a network and data generator produces an original network and an artificial dataset; an entropy estimator turns the dataset into a mutual information matrix; the inference method returns an inferred network; a validation procedure compares inferred and original networks via precision-recall curves and F-scores.]

Since experimental data sets of the appropriate size and design are usually not available, there is a clear need to generate well-characterized synthetic data sets that allow thorough testing of learning algorithms in a fast and reproducible manner. We adopted SynTReN, a generator of synthetic gene expression data for the design and analysis of structure learning algorithms, to set up an experimental validation framework. A sketch of the validation metrics is given below.
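A minimal sketch of the validation metrics (precision, recall, F-score) computed from an inferred versus a true adjacency matrix; the toy networks are simulated, not taken from the benchmark.

```r
# Minimal sketch: precision, recall and F-score of an inferred network
# against the true one, both encoded as logical adjacency matrices.
prf <- function(inferred, true) {
  tp <- sum(inferred & true)             # correctly inferred edges
  fp <- sum(inferred & !true)            # spurious edges
  fn <- sum(!inferred & true)            # missed edges
  precision <- tp / (tp + fp); recall <- tp / (tp + fn)
  c(precision = precision, recall = recall,
    fscore = 2 * precision * recall / (precision + recall))
}
set.seed(5)
true_net <- upper.tri(matrix(0, 10, 10)) & matrix(runif(100) < 0.2, 10, 10)
inferred <- (true_net & matrix(runif(100) < 0.8, 10, 10)) |            # misses some
            (upper.tri(true_net) & matrix(runif(100) < 0.05, 10, 10))  # adds noise
prf(inferred, true_net)
```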

Experimental results
Wins/losses, statistically significant [5], on 30 microarray datasets synthetically generated with the following configurations:
• number of genes ranging from 100 to 1000,
• number of experiments (microarrays) ranging from 100 to 1000,
• Gaussian noise with standard deviation ranging from 0 to 30% of the signal intensity.

Match               Score (wins - losses)
MRNET / RELNET                9 - 2
MRNET / ARACNE                8 - 3
MRNET / CLR                  13 - 8

Two recent papers [Lopes et al., 2009, Shimamura et al., 2009] reach similar conclusions.


Validation on six BC datasets
• MRNET applied to each of them [2].
• A meta-network built by aggregating the matrices of pairwise mutual information.
• 100 selected genes connected to the survival variable (up to a graph distance equal to two).
• The selected nodes are highly represented in published prognostic signatures covering many different biological processes:
  • Proliferation: 17 and 21 genes in the AURKA module and the Gene expression Grade Index (GGI), respectively,
  • Immune response: 3 and 1 genes in the STAT1 module and the IR module,
  • Tumor invasion: 1 gene in the PLAU module,
  • Stroma: 4 genes in SDPP,
  • Commercial signatures: 3, 2 and 3 genes in GENE70, GENE76 and ONCOTYPE.
• The performance of the new signature in a cross-validation setting is competitive with the best published prognostic signatures studied.

The MINET package

Available in R/Bioconductor with the following functionalities [4]:
• MRNET and three other state-of-the-art inference algorithms (RELNET, ARACNE, CLR),
• four methods for mutual information estimation,
• accuracy assessment functionality (F-scores, ROC and precision-recall curves),
• interaction with visualization tools (Rgraphviz).
A short usage sketch follows.
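A hedged usage sketch of the package; the calls below follow the minet documentation as we recall it (a one-call minet() interface plus the build.mim()/mrnet() route and the bundled syn.data/syn.net example data), so check ?minet for the exact argument names.

```r
library(minet)
data(syn.data)   # synthetic expression dataset shipped with the package
data(syn.net)    # the corresponding true network
# One-call interface: discretize, estimate the MI matrix, run MRNET.
net <- minet(syn.data, method = "mrnet",
             estimator = "mi.empirical", disc = "equalfreq")
# Equivalent two-step route: build the MI matrix, then infer the network.
mim  <- build.mim(syn.data, estimator = "mi.empirical", disc = "equalfreq")
net2 <- mrnet(mim)
# Accuracy assessment against the true network.
val <- validate(net, syn.net)
max(fscores(val))   # best F-score over thresholds
```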

From dependency to causality?
• As we showed before, correlation (or statistical dependence) does not imply causality.
• But can dependency provide at least some insight about causality?
• Let us consider at first a necessary condition of causality: if the random variable x_i is a direct cause of x_j, then there is no set of variables S that makes x_i and x_j independent:

x_i → x_j ⇒ ∀S ⊆ X_{−{i,j}}, I(x_i; x_j | S) > 0

• At the same time:

x_j → x_i ⇒ ∀S ⊆ X_{−{i,j}}, I(x_i; x_j | S) > 0

• Note that these conditions are necessary but not sufficient.
• In order to make this condition sufficient, we have to make an additional strong assumption (known as the Causal Sufficiency Assumption): no hidden common cause of two variables exists; in other terms, all the relevant causes are included in our set of variables.
• Once this assumption is made we can build the skeleton of a causal graph (edges not oriented). Orientation is possible with additional considerations.

From dependency to causality?

Common cause: G2 ← G1 → G3, for which

I(x2; x3) − I(x2; x3 | x1) > 0

Common effect: G2 → G1 ← G3, for which

I(x2; x3) − I(x2; x3 | x1) < 0

• The common cause and the common effect situations can thus be disambiguated thanks to the notion of conditional independence, as the sketch below illustrates.
• More generally this notion can help to test, and potentially discover, cause-effect relationships between variables in situations in which it is not possible to conduct randomised or otherwise experimentally controlled experiments.
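The sign test can be checked numerically; this sketch reuses the mutual_info() and cond_mutual_info() estimators from the earlier slides on simulated common-cause and common-effect structures.

```r
# Minimal sketch: the sign of I(x2;x3) - I(x2;x3|x1) distinguishes a common
# cause from a common effect (estimators from the earlier sketches).
set.seed(6)
n <- 5000
# Common cause: x1 regulates both x2 and x3.
x1 <- rnorm(n); x2 <- x1 + rnorm(n, sd = 0.3); x3 <- x1 + rnorm(n, sd = 0.3)
mutual_info(x2, x3) - cond_mutual_info(x2, x3, x1)   # positive
# Common effect: x1 is jointly determined by x2 and x3.
x2 <- rnorm(n); x3 <- rnorm(n); x1 <- x2 + x3 + rnorm(n, sd = 0.3)
mutual_info(x2, x3) - cond_mutual_info(x2, x3, x1)   # negative
```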


Research directions in MLG
• Improvement of the stability of feature selection techniques.
• Collaboration with the Eizirik group on signature extraction from microarray beta-cell data.
• Application of network inference to breast cancer data (meta-analysis).
• Extension of the MINET functionalities to causal discovery.


Computational methods and medicine
• With the advent of high-throughput technologies in biomedicine, the need for data management and appropriate data analysis tools in genomics has increased dramatically.
• More and more, competitive research in biomedicine will call for an effective computational analysis of the measured data.
• Specific interdisciplinary profiles at the border between biology, medicine, statistics and computer science will be required.
• In the US we are witnessing the rise of centers (e.g. the Center for Cancer Computational Biology at Harvard) providing both:
  1. analytical services and a support platform providing assistance in the collection, management, analysis and interpretation of large-scale data;
  2. a research program focused on the development of new methods for improving the analysis and interpretation of genomic data through the integration of diverse data types, with the goal of creating open-source software tools made freely available to the research community.
• Are we (you) doing enough?

Questions?


References
[1] T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley, New York, 1990.
[2] Christine Desmedt, Benjamin Haibe-Kains, Pratyaksha Wirapati, Marc Buyse, Denis Larsimont, Gianluca Bontempi, Mauro Delorenzi, Martine Piccart, and Christos Sotiriou. Biological Processes Associated with Breast Cancer Clinical Outcome Depend on the Molecular Subtypes. Clin Cancer Res, 14(16):5158-5165, 2008.
[3] I. Guyon and A. Elisseeff. An introduction to variable and feature selection. Journal of Machine Learning Research, 3:1157-1182, 2003.
[4] Patrick Emmanuel Meyer, Frédéric Lafitte, and Gianluca Bontempi. minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics, 9, 2008.
[5] Patrick Emmanuel Meyer, Kevin Kontos, Frédéric Lafitte, and Gianluca Bontempi. Information-theoretic inference of large transcriptional regulatory networks. EURASIP Journal on Bioinformatics and Systems Biology, 2007.
[6] T. M. Mitchell. Machine Learning. McGraw Hill, 1997.
[7] H. Peng, F. Long, and C. Ding. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27, 2005.


[8] Christos Sotiriou, Pratyaksha Wirapati, Sherene Loi, Adrian Harris, Steve Fox, Johanna Smeds, Hans Nordgren, Pierre Farmer, Viviane Praz, Benjamin Haibe-Kains, Christine Desmedt, Denis Larsimont, Fatima Cardoso, Hans Peterse, Dimitry Nuyten, Marc Buyse, Marc J. Van de Vijver, Jonas Bergh, Martine Piccart, and Mauro Delorenzi. Gene Expression Profiling in Breast Cancer: Understanding the Molecular Basis of Histologic Grade To Improve Prognosis. J. Natl. Cancer Inst., 98(4):262-272, 2006.
[9] Pratyaksha Wirapati, Christos Sotiriou, Susanne Kunkel, Pierre Farmer, Sylvain Pradervand, Benjamin Haibe-Kains, Christine Desmedt, Michail Ignatiadis, Thierry Sengstag, Frederic Schutz, Darlene Goldstein, Martine Piccart, and Mauro Delorenzi. Meta-analysis of gene expression profiles in breast cancer: toward a unified understanding of breast cancer subtyping and prognosis signatures. Breast Cancer Research, 10(4):R65, 2008.

