Cross-validation is dead. Long live cross- validation! Model validation ...

Recommend Documents

Cross-validation and cross-study validation of

cross-validate an Affymetrix 6.0 genome-wide association study of the same samples ... genomic methods, and machine learning methods in particular methods for data .... We automated the analysis pipeline using Perl and various bioinformatics ... ub.i

Performance is Dead, Long Live Performance - International ...

Common Element: All vulnerable applications support embedded scripting languages. (JavaScript, ActionScript, etc.) 26. Ben Zorn CGO 2010 Keynote ...

Fortran is Dead! Long Live Forth!

forms of scientiﬁc computation have found Forth to be a strong contender as the scientiﬁc programming language of the future. One reason is that its extensibility ...

TUNING MODEL COMPLEXITY USING CROSS-VALIDATION FOR ...

In this thesis, we review the use of cross-validation for model selection and pro- ...... In Chapter 3, we propose the MultiTest algorithm using cross-validation.

Simple Model Selection Cross Validation ... - Semantic Scholar

Oct 7, 2009 - Use validation data for tuning learning algorithm, e.g., model selection ... For each data point you leave

Cross-Validation Without Doing Cross-Validation in ...

Aug 3, 2016 - Gelfand (1992) and Albert (2009) can be used to supplement ...... J. M. Bernardo,M. H. DeGroot, D. V. Lindley, and A. F. M. Smith, Cambridge, MA: 611 ... Following Seber and Lee (2003), consider out-of-sample prediction and ...

Cross-Validation Cumulative Learning

CL system implies the use of meta-learning techniques. However .... multiple tasks with shared representations might best ..... provides for hierarchical, domain-.

LONG LIVE THE DEAD

typical obituary poster mentions a whole list of people announcing the death of so- and-so ...... (afunsoa) to find the witch responsible for the premature death.

validation using live piglets

Apr 4, 2012 - We assessed the precision and accuracy of measurements of %FM propor- tions in live animals, using aDP in comparison with biochemi-.

Complexity is dead, long live complexity! How ... - Semantic Scholar

Jun 7, 2014 - specific requirements to be met by software supporting security and compliance man- agement in complex IT outsourcing arrangements, and ...

“stationarity is dead”† — long live transformation: five principles for ...

Feb 23, 2010 ... In so doing, it lays out five principles and several subprinciples for the law of ... Five Principles for Climate Change Adaptation Law ......... 40. R.

Outsourcing is Dead, Long Live Outsourcing! - Horses for Sources

Jun 26, 2013 - Market size and forecasts for global Outsourcing expenditure. Regional and .... Application Development &

Download 'The Revolution is Dead - Long Live the Revolution; From ...

Malevich to Judd- From Deineka to Bartana' Read Books Online Free No ... Deineka to Bartana download free books for andr

FEATURE SELECTION INCREASES CROSS-VALIDATION ...

FEATURE SELECTION INCREASES CROSS-VALIDATION IMPRECISION. Yufei Xiao. 1. , Jianping Hua. 2 and Edward R. Dougherty. 1,2. 1. Department of ...

Cross-validation based Nonlinear Shrinkage

Nov 2, 2016 - the order of the dimensionality p or even smaller, the estimation quality degrades, its entries have ... CVC is the best performing general purpose covariance estimator. ..... [LW+12], the CVC results are calculated on a 2.3 GHz Intel i

Cross-validation and the Bootstrap

Starting with 5000 predictors and 50 samples, find the 100 predictors having .... A graphical illustration of the bootstrap approach on a small sample containing n ...

Standpoint Theory is Dead, Long Live Standpoint Theory! Why ...

Jul 7, 2014 - The paper investigates how the transition from materialist to post- structuralist ... unapologetic, and reflective of an actual historical reality.

Savita Bhabhi is dead. Long live Indian hypocrisy

The AAO is dead: long live the AAO - Australian Astronomical ...

galactic centre, but also the two largest nearby dwarf galaxies and the nearest major globu- lar clusters, there ..... years) by something the size of a laptop. The.

Monetarism is Dead; Long Live the Quantity Theory

In 1961, soon after leaving the Federal Reserve. Bank of Minneapolis, I gave a talk there in which I criticized Federal Reserve operating procedures for focusing ...

The Product Is Dead Long Live the Product-Service!

ovsavit.w: To optimize product cycle time, producers of .... telephone support for software and PC products. It is ... Six accounting firms [specifically, their systems.

The Patent Is Dead; Long Live The Royalties! - McCarter & English

The Patent Is Dead;. Long Live The Royalties! .... words, the licensor provides technical support to the .... transfer

Standpoint Theory is Dead, Long Live Standpoint Theory! - CiteSeerX

Jul 7, 2014 - response to the history of exclusion of feminist scholars from ..... of Marxist materialism as presented by Harding (1997) and Hartsock (1997a), ...

Editorial: The Geography of Tourism is Dead. Long Live Geographies ...

practices and performances; as well as, of course, tourism studies with its fas- cination for destinations. However, matters of location and place cannot be.

Cross-validation is dead. Long live cross- validation! Model validation ...

Download PDF

5 downloads 25492 Views 144KB Size Report

Comment

May 4, 2010 - Long live cross- validation! Model validation based on resampling. Knut Baumann. From 5th German Conference on Cheminformatics: 23. CIC- ...

Baumann Journal of Cheminformatics 2010, 2(Suppl 1):O5 http://www.jcheminf.com/content/2/S1/O5

ORAL PRESENTATION

Open Access

Cross-validation is dead. Long live crossvalidation! Model validation based on resampling Knut Baumann From 5th German Conference on Cheminformatics: 23. CIC-Workshop Goslar, Germany. 8-10 November 2009

Cross-validation was originally invented to estimate the prediction error of a mathematical modelling procedure. It can be shown that cross-validation estimates the prediction error almost unbiasedly. Nonetheless, there are numerous reports in the chemoinformatic literature that cross-validated figures of merit cannot be trusted and that a so-called external test set has to be used to estimate the prediction error of a mathematical model. In most cases where cross-validation fails to estimate the prediction error correctly, this can be traced back to the fact that it was employed as an objective function for model selection. Typically each model has some metaparameters that need to be tuned such as the choice of the actual descriptors and the number of variables in a QSAR equation, the network topology of a neural net, or the complexity of a decision tree. In this case the meta-parameter is varied and the cross-validated prediction error is determined for each setting. Finally, the parameter setting is chosen that optimizes the crossvalidated prediction error in an attempt to optimize the predictivity of the model. However, in these cases crossvalidation is no longer an unbiased estimator of the prediction error and may grossly deviate from the result of an external test set. It can be shown that the “amount” of model selection can directly be related to the inflation of cross-validated figures of merit. Hence, the model selection step has to be separated from the step of estimating the prediction error. If this is done correctly, cross-validation (or resampling in general) retains its property of unbiasedly estimating the prediction error. Matter of factly, it can be shown that data splitting into a training set and an external test set often estimates

the prediction error less precise than proper cross-validation. It is this variabability of prediction errors, which depends on test set size, that causes seemingly paradox phenomena such as the so-called “Kubinyi’s paradoxon” for small data sets. Published: 4 May 2010

Institute of Pharmaceutical Chemistry, University of Technology Braunschweig, Beethovenstr. 55, D-38106 Braunschweig, Germany

© 2010 Baumann; licensee BioMed Central Ltd.

doi:10.1186/1758-2946-2-S1-O5 Cite this article as: Baumann: Cross-validation is dead. Long live crossvalidation! Model validation based on resampling. Journal of Cheminformatics 2010 2(Suppl 1):O5.

Publish with ChemistryCentral and every scientist can read your work free of charge Open access provides opportunities to our colleagues in other parts of the globe, by allowing anyone to view the content free of charge. W. Jeffery Hurst, The Hershey Company. available free of charge to the entire scientific community peer reviewed and published immediately upon acceptance cited in PubMed and archived on PubMed Central yours you keep the copyright Submit your manuscript here: http://www.chemistrycentral.com/manuscript/