F-test in K-fold Cross-Validation - PLOS

The analysis and procedure presented below are based on the least-squares approach. Similar arguments hold for the application of PLS as well, although the concept of degrees of freedom (DOF) is more complicated in PLS.

Least-squares approach: Let X and Y be the input and output data (single output), respectively. Let n be the number of data points and p the number of input predictors. Let the model be

Y = Xb + E                                                                    (1)

Then the least-squares estimate of b is \hat{b} = (X^T X)^{-1} X^T Y. The predicted value of Y is \hat{Y} = X\hat{b} = X(X^T X)^{-1} X^T Y, and the residual vector is \hat{E} = Y - \hat{Y} = [I - X(X^T X)^{-1} X^T] Y. If mean(Y) = 0, then mean(\hat{E}) = 0; this constraint reduces the DOF of \sum_i e_i^2 by 1. Further, X^T \hat{E} = X^T (Y - \hat{Y}) = X^T [I - X(X^T X)^{-1} X^T] Y = 0, regardless of whether Y is mean-centered or not; these p constraints reduce the DOF of \sum_i e_i^2 by a further p. Thus, the effective DOF of \sum_i e_i^2 is (n - p - 1).
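As a numerical illustration of the constraints above, the following sketch (with toy data; the dimensions, seed, and the additional centering of the columns of X are our own illustrative choices, not from the paper) fits a least-squares model and verifies that X^T \hat{E} = 0 and that centering yields mean(\hat{E}) = 0:

```python
import numpy as np

# Illustrative sketch (toy data, not from the paper): least-squares fit
# Y = Xb + E and the DOF-reducing constraints on the residuals.
rng = np.random.default_rng(0)
n, p = 100, 3
X = rng.standard_normal((n, p))
Y = X @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.standard_normal(n)

# Mean-center Y (and, as in typical preprocessing, the columns of X,
# so that mean(E_hat) = 0 holds exactly).
Y = Y - Y.mean()
X = X - X.mean(axis=0)

b_hat = np.linalg.solve(X.T @ X, X.T @ Y)   # b_hat = (X^T X)^{-1} X^T Y
E_hat = Y - X @ b_hat                       # residual vector E_hat = Y - Y_hat

# The p orthogonality constraints X^T E_hat = 0 (hold with or without centering):
assert np.max(np.abs(X.T @ E_hat)) < 1e-8
# The centering constraint mean(E_hat) = 0:
assert abs(E_hat.mean()) < 1e-10
# Effective DOF of sum_i e_i^2:
dof = n - p - 1                             # n - p - 1 = 96 here
```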

In the application of K-fold CV, let n be the total number of data points and f the number of folds (f = 10 here), so that each fold contains m = n/f data points in the test set and m \times (f - 1) data points in the training set. Let p be the number of input predictors, and consider one output variable at a time. Let r_{i,j} be the residual for the j-th sample of the i-th fold's training set. Then

S_i = \sum_j r_{i,j}^2 \sim \chi^2_{m(f-1)-p-1}   (for all i).

However, the distribution of S = \sum_i S_i = \sum_i \sum_j r_{i,j}^2 is not clear because, out of all the f \times m \times (f - 1) = n \times (f - 1) samples in the training sets of the different folds, only n samples are truly independent.

In our F-test, we compute the F-statistic as follows. For the test sets:

S_{test,i} = \sum_j r_{test,i,j}^2, \quad DOF(S_{test,i}) = m; \qquad S_{test} = \sum_i S_{test,i} = \sum_i \sum_j r_{test,i,j}^2, \quad DOF(S_{test}) = m \times f          (2)

For the training sets:

S_{train,i} = \sum_j r_{train,i,j}^2, \quad DOF(S_{train,i}) = m \times (f - 1) - p - 1; \qquad S_{train} = \sum_i S_{train,i} = \sum_i \sum_j r_{train,i,j}^2          (3)
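A minimal sketch of how S_test and S_train in Eqs. (2)-(3) are accumulated over the folds (toy data with an ordinary least-squares fit per fold; n, p, and the seed are our own illustrative choices, with f = 10 as in the text):

```python
import numpy as np

# Sketch of accumulating S_test (Eq. 2) and S_train (Eq. 3) over K folds.
# Toy data; n, p are illustrative choices (f = 10 as in the text).
rng = np.random.default_rng(1)
n, p, f = 100, 3, 10
m = n // f                                   # m = n/f samples per test fold
X = rng.standard_normal((n, p))
Y = X @ rng.standard_normal(p) + rng.standard_normal(n)

S_test, S_train = 0.0, 0.0
for idx in np.array_split(np.arange(n), f):  # idx = test indices of fold i
    train = np.setdiff1d(np.arange(n), idx)
    b = np.linalg.lstsq(X[train], Y[train], rcond=None)[0]
    S_test += np.sum((Y[idx] - X[idx] @ b) ** 2)       # sum_j r_test,i,j^2
    S_train += np.sum((Y[train] - X[train] @ b) ** 2)  # sum_j r_train,i,j^2

dof_test = m * f                             # DOF(S_test) = m*f = n
```

Each of the f training sets here has m(f - 1) = 90 points, so S_train sums n(f - 1) residual terms in total, as noted above.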

S_{train} contains m \times (f - 1) \times f residual terms, in which m \times f independent residual terms are each repeated (f - 1) times (approximately). Hence the statistic S_{train}/(f - 1) contains m \times f independent residual terms and follows a chi-square distribution with DOF(S_{train}/(f - 1)) \approx m \times f. Thus the statistic F follows

F = \frac{S_{test}/(m \times f)}{(S_{train}/(f - 1))/(m \times f)} = \frac{S_{test}/(m \times f)}{S_{train}/(m \times (f - 1) \times f)}          (4)

The denominator of the above expression does not account for the number of input variables p. The effect of p can be incorporated by considering the statistic

F = \frac{S_{test}/(m \times f)}{S_{train}/((m \times (f - 1) - p - 1) \times f)}          (5)

So the equivalent DOF is DOF_{train} = (m \times (f - 1) - p - 1) \times f/(f - 1). Hence F \sim F(n, DOF_{train}), and we test the hypothesis at significance level \alpha:

H_0: F < F_\alpha(n, DOF_{train}); \qquad H_1: F > F_\alpha(n, DOF_{train})          (6)

where F_\alpha(n_1, n_2) denotes the inverse cumulative F-distribution value for DOF n_1 and n_2 at significance level \alpha.
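The test in Eqs. (5)-(6) can then be sketched as follows. S_test and S_train would come from the cross-validation loop; here they are hypothetical placeholder values, and the F quantile is taken from scipy.stats, which is our choice of library, not the paper's:

```python
import numpy as np
from scipy.stats import f as f_dist

# Sketch of the F-test in Eqs. (5)-(6). The residual sums S_test, S_train
# are hypothetical placeholders; n, p, f are illustrative choices.
n, p, f = 100, 3, 10
m = n // f
S_test, S_train = 110.0, 870.0

# Eq. (5): F-statistic with the p-adjusted training-set DOF in the denominator.
F = (S_test / (m * f)) / (S_train / ((m * (f - 1) - p - 1) * f))

# Eq. (6): equivalent training DOF and the one-sided test at level alpha.
dof_train = (m * (f - 1) - p - 1) * f / (f - 1)
alpha = 0.05
F_crit = f_dist.ppf(1 - alpha, n, dof_train)   # F_alpha(n, DOF_train)
reject_H0 = F > F_crit                         # accept H1 if F exceeds F_alpha
```

A large F indicates that the test-set residuals are inflated relative to the training-set residuals, i.e. the model generalizes poorly.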