Local False Discovery Rate Based Methods for Multiple Testing of One-Way Classified Hypotheses✩
Sanat K. Sarkar, Zhigen Zhao
arXiv:1712.05014v1 [stat.ME] 13 Dec 2017
Department of Statistical Science, Temple University, Philadelphia, PA, 19122, USA
Abstract This paper continues the line of research initiated in Liu et al. (2016) on developing a novel framework for multiple testing of hypotheses grouped in a one-way classified form using hypothesis-specific local false discovery rates (Lfdr’s). It is built on an extension of the standard two-class mixture model from single to multiple groups, defining the hypothesis-specific Lfdr as a function of the conditional Lfdr for the hypothesis given that it is within a significant group and the Lfdr for the group itself, and involving a new parameter that measures the grouping effect. This definition captures the underlying group structure for the hypotheses belonging to a group more effectively than the standard two-class mixture model. Two new Lfdr based methods, possessing meaningful optimalities, are produced in their oracle forms. One, designed to control false discoveries across the entire collection of hypotheses, is proposed as a powerful alternative to simply pooling all the hypotheses into a single group and using a commonly used Lfdr based method under the standard single-group two-class mixture model. The other is proposed as an Lfdr analog of the method of Benjamini & Bogomolov (2014) for selective inference. It controls an Lfdr based measure of false discoveries associated with selecting groups concurrently with controlling the average of within-group false discovery proportions across the selected groups. Numerical studies show that our proposed methods are indeed more powerful than their relevant competitors, at least in their oracle forms, in commonly occurring practical scenarios.
Keywords: False Discovery Rate, Grouped Hypotheses, Large-Scale Multiple Testing.
✩ Sanat K. Sarkar is Professor and Zhigen Zhao is Associate Professor in the Department of Statistical Science, Temple University. Sarkar’s research was supported by NSF Grants DMS-1208735 and DMS-1309273. Zhao’s research was supported by NSF Grant DMS-1208735 and NSF Grant IIS-1633283. E-mail addresses: [email protected] (S.K. Sarkar), [email protected] (Z. Zhao). Preprint submitted to Elsevier, December 15, 2017.
1. Introduction
Modern scientific studies aided by high-throughput technologies, such as those related to brain imaging, microarray analysis, astronomy, atmospheric science, drug discovery, and many others, are increasingly relying on large-scale multiple testing as an integral part of statistical investigations focused on high-dimensional inference. With many of these investigations, notably in genome-wide association and neuroimaging studies, giving rise to testing of hypotheses that appear in groups, the multiple testing paradigm seems to be shifting from single to multiple
groups of hypotheses. These groups, forming at single or multiple levels and creating one- or multiway classified hypotheses, can occur naturally due to the underlying biological or experimental process or be created using internal or external information capturing certain specific features of the data. Several new questions arise with this paradigm shift. However, we will focus on the following two questions, which seem relatively more relevant in light of what is available in the literature in the context of controlling an overall measure of false discoveries across the entire collection of hypotheses:
Q1. For multiple testing of hypotheses grouped into a one-way classified form, how to effectively capture the underlying group/classification structure, instead of simply pooling all the hypotheses into a single group, while controlling overall false discoveries across all individual hypotheses?
Q2. For hypotheses grouped into a one-way classified form in the context of post-selective inference where groups are selected before testing the hypotheses in the selected groups, how to effectively capture the underlying group/classification structure to control the expected average of false discovery proportions across the selected groups?
Progress has been made toward answering Q1 (Hu et al. (2010)) and Q2 (Benjamini & Bogomolov (2014)) for one-way classified hypotheses in the framework of Benjamini-Hochberg (Benjamini & Hochberg (1995)) type false discovery rate (FDR) control. However, research addressing these questions based on local false discovery rate (Lfdr) (Efron et al. (2001)) methodologies is largely absent, except for the recent work of Liu et al.
(2016), where a method has been proposed in its oracle form to answer the following question related to Q1: When making important discoveries within each group is as important as making those discoveries across all hypotheses, how to maintain a control over falsely discovered hypotheses within each group while controlling it across all hypotheses? The fact that an Lfdr based approach, with its Bayesian/empirical Bayesian and decision theoretic foundation, can yield powerful multiple testing methods that control false discoveries while effectively capturing dependence as well as other structures of the data in single- and multiple-group settings has been demonstrated before (Sun et al. (2006); Sun & Cai (2007); Efron (2008); Ferkingstad et al. (2008); Sarkar et al. (2008); Sun & Cai (2009); Cai & Sun (2009); Hu et al. (2010); Zablocki et al. (2014); Ignatiadis et al. (2016)). However, the work of Liu et al. (2016) is fundamentally different from these works in that it takes into account the sparsity of signals both across groups and within each active group. Consequently, the effect of a group’s significance in terms of its Lfdr can be explicitly factored into a significance measure of each hypothesis within that group. On the other hand, in those other works, such as Sun & Cai (2009) and Hu et al. (2010), the significance measure of each hypothesis within a group is adjusted for the group’s effect through its size rather than its measure of significance. In this article, we continue the line of research initiated in Liu et al. (2016) to answer Q1 and Q2 in an Lfdr framework. More specifically, we borrow ideas from Liu et al. (2016) in developing methodological steps to present a unified group-adjusted multiple testing framework for one-way classified hypotheses that introduces a grouping effect into overall false discoveries across all
individual hypotheses or the average of within-group false discovery proportions across selected groups. In the next section, we present the current state of knowledge closely pertinent to the present work and make remarks motivating the development of our proposed methodologies.
2. Literature Review and Motivating Remark
Suppose there are N null hypotheses that appear in m non-overlapping families/groups, with Hij being the jth hypothesis in the ith group (i = 1, . . . , m; j = 1, . . . , ni). We refer to such a layout of hypotheses as one-way classified hypotheses. With θij indicating the truth (θij = 0) or falsity (θij = 1) of Hij, the Lfdr, defined by the posterior probability P(θij = 0 | X), where X = {Xij, i = 1, . . . , m; j = 1, . . . , ni}, is the basic ingredient for constructing Lfdr based approaches controlling false discoveries. The single-group case (or the case ignoring the group structure) has been considered extensively in the literature, notably by Sun & Cai (2007), Cai & Sun (2009), and He et al. (2015), who focused on constructing methods that are optimal, at least in their oracle forms. These oracle methods correspond to Bayes multiple decision rules under a single-group two-class mixture model (Efron et al. (2001); Newton et al. (2004); Storey (2002)) that minimize the marginal false non-discovery rate (mFNR), a measure of false non-discoveries closely related to the false non-discovery rate (FNR) introduced in Genovese & Wasserman (2002) and Sarkar (2004), subject to controlling the marginal false discovery rate (mFDR), a measure of false discoveries closely related to the BH FDR and the positive FDR (pFDR) of Storey (2002). Multiple-group versions of single-group Lfdr based approaches to multiple testing have started getting attention recently; among them, the following seem more relevant to our work.
Cai & Sun (2009) extended their work from single to multiple groups (one-way classified hypotheses) under the following model: With i taking the value k with some prior probability πk, (Xij, θij), j = 1, . . . , ni, given i = k, are assumed to be iid random pairs with Xkj | θkj ∼ (1 − θkj)fk0 + θkj fk1, for some given densities fk0 and fk1, and θkj ∼ Bernoulli(pk). They developed a method, which in its oracle form minimizes mFNR subject to controlling mFDR and is defined in terms of thresholding the conditional Lfdr’s: CLfdri(Xij) = (1 − pi)fi0(Xij)/fi(Xij), where fi(Xij) = (1 − pi)fi0(Xij) + pi fi1(Xij), for j = 1, . . . , ni, i = 1, . . . , m, before proposing a data-driven version of the oracle method that asymptotically maintains the original oracle properties. It should be noted that the probability πk relates to the size of group k and provides little information about the significance of the group itself. Ferkingstad et al. (2008) brought the grouped hypotheses setting into testing a single family of hypotheses in an attempt to empower the typical Lfdr based thresholding approach by leveraging an external covariate. They partitioned the p-values into a number of small bins (groups) according to ordered values of the covariate. With the underlying two-class mixture model defined separately for each bin depending on the corresponding value of the covariate, they defined the so-called covariate-modulated Lfdr as the posterior probability of a null hypothesis given the value of the covariate for the corresponding bin. They estimated
the covariate-modulated Lfdr in each bin using a Bayesian approach before proposing their thresholding method, which does not necessarily control an overall measure of false discoveries such as the mFDR or the posterior FDR. An extension of this work from single to multiple covariates can be seen in Zablocki et al. (2014) and Scott et al. (2015). Very recently, Cai et al. (2016) developed a novel grouped hypotheses testing framework for two-sample multiple testing of the differences between two highly sparse mean vectors, having constructed the groups to extract sparsity information in the data by using a carefully constructed auxiliary covariate. They proposed an Lfdr based optimal multiple testing procedure controlling FDR as a powerful alternative to standard procedures based on the sample mean differences. A sudden upsurge of research has taken place recently in selective/post-selection inference due to its importance in light of the realization by the scientific community that the lack of reproducibility of a scientist’s work is often caused by a failure to account for selection bias. When multiple hypotheses are simultaneously tested in a selective inference setting, it gives rise to a grouped hypotheses testing framework with the tested groups being selected from a given set of groups of hypotheses. Benjamini & Bogomolov (2014) introduced the notion of the expected average of false discovery proportions across the selected groups as an appropriate error rate to control in this setting and proposed a method that controls it. Since then, a few papers have been written in this area (Peterson et al. (2016a) and Heller et al. (2017)); however, no research has been produced yet in the Lfdr framework.
Remark 2.1. When grouping of hypotheses occurs, naturally or artificially, an assumption can be made that the significance of a hypothesis is influenced by that of the group it belongs to.
The Lfdr under the standard two-class mixture model, however, does not help in assessing a group’s influence on the true significance of its hypotheses. This has been the main motivation behind the work of Liu et al. (2016), who considered a group-adjusted two-class mixture model that yields an explicit representation of each hypothesis-specific Lfdr as a function of its group-adjusted form and the Lfdr for the group it is associated with. It allows them to produce a method that provides a separate control over within-group false discoveries for truly significant groups in addition to having a control of false discoveries across all individual hypotheses. Their work, as mentioned in the Introduction, motivates us to proceed further with the development of newer Lfdr based multiple testing methods for one-way classified hypotheses as described in the following section.
3. Proposed Methodologies
Let us define $H_i = \cap_{j=1}^{n_i} H_{ij}$ to let Hi = 0 (or = 1) mean that the ith group, and hence each (or at least one) of its component hypotheses, is non-significant (or significant). Let θi indicate the truth (θi = 0) or falsity (θi = 1) of Hi. We express each θij as follows: θij = θi · θj|i, with θj|i indicating the truth or falsity of Hij conditional on the status of Hi, i.e., θij = 0, if θi = 0; and θij = 0 or 1 according to whether θj|i = 0 or 1, if θi = 1. This representation of the θij’s brings the underlying group structure of the hypotheses into their binary hidden states conditional on the binary hidden states of the groups containing them.
Let us now recall from Liu et al. (2016) the model extending the two-class mixture model (Efron et al., 2001) from single to multiple groups under the setting of one-way classified hypotheses. The following distribution, introduced in Liu et al. (2016) with a different name, plays an important role in this model:
Definition 3.1. [Truncated Product Bernoulli (TPBern(π, n))]. A set of n binary variables Z1, . . . , Zn with the following joint probability distribution is said to have a TPBern(π, n) distribution:
\[
\begin{aligned}
P(Z_1 = z_1, \ldots, Z_n = z_n)
&= \frac{1}{1 - (1-\pi)^n} \prod_{i=1}^{n} \pi^{z_i} (1-\pi)^{1-z_i} \, I\Big(\sum_{i=1}^{n} z_i > 0\Big) \\
&= \frac{(1-\pi)^n}{1 - (1-\pi)^n} \Big(\frac{\pi}{1-\pi}\Big)^{\sum_{i=1}^{n} z_i} I\Big(\sum_{i=1}^{n} z_i > 0\Big).
\end{aligned}
\]
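As a small illustration of Definition 3.1, the following Python sketch evaluates the TPBern probability mass function and draws from the distribution by discarding all-zero product-Bernoulli draws. The function names `tpbern_pmf` and `tpbern_sample` are hypothetical, and rejection sampling is only one convenient way to simulate from this distribution; it is not taken from the paper.

```python
import itertools
import random


def tpbern_pmf(z, pi):
    """Joint probability of the binary vector z under TPBern(pi, n), n = len(z)."""
    n, s = len(z), sum(z)
    if s == 0:
        return 0.0  # the all-zero configuration is excluded by the indicator
    return pi ** s * (1 - pi) ** (n - s) / (1 - (1 - pi) ** n)


def tpbern_sample(pi, n, rng):
    """Draw from TPBern(pi, n) by rejecting all-zero product-Bernoulli draws."""
    while True:
        z = [1 if rng.random() < pi else 0 for _ in range(n)]
        if any(z):
            return z


# The probabilities over all 2^n binary configurations should add up to one.
total = sum(tpbern_pmf(z, 0.2) for z in itertools.product([0, 1], repeat=4))
draw = tpbern_sample(0.2, 4, random.Random(0))
```

Truncation only renormalizes the product Bernoulli probabilities over the configurations with at least one non-null, which is why rejection of the all-zero draw suffices.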
When hypotheses belonging to a certain group/family are simultaneously tested, this distribution provides a natural adjustment of the commonly used product Bernoulli distribution for the set of binary hidden states of the hypotheses, conditional on the group/family itself being significant.
Definition 3.2. [Group-Adjusted Two-Class Mixture Model for One-Way Classified Hypotheses (One-Way GAMM)]. Let (Xij, j = 1, . . . , ni, θi, θj|i, j = 1, . . . , ni) be the set of random variables associated with the ith group, for i = 1, . . . , m. The groups are independently distributed with the following model for group i:
\[
\begin{aligned}
X_{ij} \mid \theta_i, \theta_{j|i} &\overset{ind}{\sim} (1 - \theta_i \cdot \theta_{j|i}) f_0(x_{ij}) + \theta_i \cdot \theta_{j|i} f_1(x_{ij}), \quad \text{for some given densities } f_0 \text{ and } f_1, \\
P(\theta_{j|i} = 0 \mid \theta_i = 0) &= 1, \quad \text{for each } j = 1, \ldots, n_i; \\
(\theta_{1|i}, \ldots, \theta_{n_i|i}) \mid \theta_i = 1 &\sim \mathrm{TPBern}(\pi_{2i}; n_i), \\
\theta_i &\sim \mathrm{Bern}(\pi_1).
\end{aligned}
\]
Let
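$\mathrm{Lfdr}_{ij}(\pi_1, \pi_{2i}) \equiv \mathrm{Lfdr}_{ij}(x; \pi_1, \pi_{2i}) = \Pr(\theta_{ij} = 0 \mid X = x)$, $\mathrm{Lfdr}_{i}(\pi_1, \pi_{2i}) \equiv \mathrm{Lfdr}_{i}(x; \pi_1, \pi_{2i}) = \Pr(\theta_{i} = 0 \mid X = x)$, and $\mathrm{Lfdr}_{j|i}(\pi_1, \pi_{2i}) \equiv \mathrm{Lfdr}_{j|i}(x; \pi_1, \pi_{2i}) = \Pr(\theta_{j|i} = 0 \mid \theta_i = 1, X = x)$ be the local FDRs corresponding to Hij (hypothesis), Hi (group), and Hij given Hi = 1 (conditional), respectively, under One-Way GAMM. It is easy to see that
\[
\mathrm{Lfdr}_{ij}(\pi_1, \pi_{2i}) = 1 - [1 - \mathrm{Lfdr}_{i}(\pi_1, \pi_{2i})][1 - \mathrm{Lfdr}_{j|i}(\pi_1, \pi_{2i})], \tag{3.1}
\]
showing how a hypothesis-specific local FDR factors into the local FDRs for the group and for the hypothesis conditional on the group’s significance. Let $\mathrm{Lfdr}_{ij}(\pi_{2i}) = [(1 - \pi_{2i}) f_0(x_{ij})]/m_i(x_{ij})$, with $m_i(x) = (1 - \pi_{2i}) f_0(x) + \pi_{2i} f_1(x)$, and $\mathrm{Lfdr}_{i}(\pi_{2i}) = \prod_{j=1}^{n_i} \mathrm{Lfdr}_{ij}(\pi_{2i})$. Then, as shown in the Appendix,
\[
\mathrm{Lfdr}_{j|i}(\pi_1, \pi_{2i}) \equiv \mathrm{Lfdr}_{j|i}(\pi_{2i}) = \frac{\mathrm{Lfdr}_{ij}(\pi_{2i}) - \mathrm{Lfdr}_{i}(\pi_{2i})}{1 - \mathrm{Lfdr}_{i}(\pi_{2i})}, \tag{3.2}
\]
and
\[
\mathrm{Lfdr}_{i}(\pi_1; \pi_{2i}) \equiv \mathrm{Lfdr}_{i}(\lambda_i; \pi_{2i}) = \frac{\mathrm{Lfdr}_{i}(\pi_{2i})}{\mathrm{Lfdr}_{i}(\pi_{2i}) + \lambda_i [1 - \mathrm{Lfdr}_{i}(\pi_{2i})]}, \tag{3.3}
\]
where
\[
\lambda_i = \frac{\pi_1}{1 - \pi_1} \div \frac{1 - (1 - \pi_{2i})^{n_i}}{(1 - \pi_{2i})^{n_i}}. \tag{3.4}
\]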
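The relations (3.1)-(3.4) can be traced numerically for a single group. The sketch below is illustrative only: the function names, the choice of a N(0, 1) null density and a N(2, 1) alternative density, and the parameter values are all assumptions made for the example, not settings prescribed by the paper.

```python
import math


def normal_pdf(x, mu=0.0):
    """Density of N(mu, 1) at x."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)


def grouping_effect(pi1, pi2, n):
    """lambda_i of (3.4): group odds divided by the odds implied by independence."""
    q = (1 - pi2) ** n
    return (pi1 / (1 - pi1)) / ((1 - q) / q)


def group_adjusted_lfdr(x, pi1, pi2, f0, f1):
    """Return (Lfdr_i, [Lfdr_{j|i}], [Lfdr_ij]) for one group under One-Way GAMM."""
    # Single-group local FDRs Lfdr_ij(pi_2i) and their product Lfdr_i(pi_2i).
    base = [(1 - pi2) * f0(v) / ((1 - pi2) * f0(v) + pi2 * f1(v)) for v in x]
    base_group = math.prod(base)
    lam = grouping_effect(pi1, pi2, len(x))
    cond = [(b - base_group) / (1 - base_group) for b in base]   # (3.2)
    group = base_group / (base_group + lam * (1 - base_group))   # (3.3)
    hyp = [1 - (1 - group) * (1 - c) for c in cond]              # (3.1)
    return group, cond, hyp


x = [0.2, 2.5, -0.4]
group, cond, hyp = group_adjusted_lfdr(
    x, pi1=0.3, pi2=0.2, f0=normal_pdf, f1=lambda v: normal_pdf(v, 2.0)
)
```

Choosing π1 = 1 − (1 − π2i)^ni makes `grouping_effect` return 1, in which case `hyp` coincides with the single-group local FDRs.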
When λi = 1, Lfdrij(π1, π2i) reduces to Lfdrij(π2i), and so One-Way GAMM with λi = 1 for all i represents the case of ‘no group effect’. These results can be summarised in the following:
Proposition 3.1. Let Lfdrij(π2i) be the local FDR associated with Hij in group i under the standard single-group two-class mixture model with π2i being the probability of a hypothesis in the group being significant, and Lfdrij(π1, π2i) be the same under One-Way GAMM that incorporates a similar two-class mixture model across the groups with π1 as the chance of a group being significant. Then, Lfdrij(π1, π2i) can be expressed in terms of Lfdrij(π2i) and λi as follows by making use of (3.1)-(3.3), with λi measuring an effect due to grouping for group i:
\[
\mathrm{Lfdr}_{ij}(\lambda_i, \pi_{2i}) = \frac{\mathrm{Lfdr}_{i}(\pi_{2i}) + \lambda_i [\mathrm{Lfdr}_{ij}(\pi_{2i}) - \mathrm{Lfdr}_{i}(\pi_{2i})]}{\mathrm{Lfdr}_{i}(\pi_{2i}) + \lambda_i [1 - \mathrm{Lfdr}_{i}(\pi_{2i})]}, \tag{3.5}
\]
for each i = 1, . . . , m; j = 1, . . . , ni.
Remark 3.1. The above results bring home the point that in an Lfdr based approach to testing hypotheses belonging to a group/family that itself is likely to be significant with a chance of its own, the Lfdr for the group should be separated out from that for each hypothesis before assessing the true significance of the hypothesis. More specifically, suppose that we have a single group (i.e., m = 1) of hypotheses to test. Then, the hypotheses should be tested by taking away from them the confounding effect of the group’s significance by using Lfdrj|1(π21) or the cumulative averages of them, depending on whether one desires to control the local FDR or the average local FDR (when controlling posterior FDR). Of course, one should test the significance of the group using its local FDR, Lfdr1(λ1, π21), before proceeding to test the hypotheses in it at a level depending on that for Lfdr1(λ1, π21). For instance, if one wants to control the average local FDR, say at α, then we propose to reject the hypotheses associated with Lfdr(j)|1(π21), j = 1, . . . , R1, the first R1 increasingly ordered values of Lfdrj|1(π21), where R1 is the largest value such that
\[
\frac{1}{R_1} \sum_{j=1}^{R_1} \mathrm{Lfdr}_{(j)|1}(\pi_{21}) \le \frac{\alpha - \mathrm{Lfdr}_{1}(\lambda_1, \pi_{21})}{1 - \mathrm{Lfdr}_{1}(\lambda_1, \pi_{21})}.
\]
The Lfdr1 (λ1 , π21 ) equals 0 if the group is assumed to be significant, or it can be controlled at some pre-assigned level η < α to check if the group is significant. Clearly, when λ1 = 1, our proposal reduces to controlling the average local FDR for a single group of hypotheses under the standard two-class mixture model without introducing any group effect. We will extend this proposal from single to multiple groups of hypotheses in the following.
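The single-group proposal of Remark 3.1 can be sketched as a short routine. The function name, the toy local FDR values, and the convention of returning no rejections when the group Lfdr is at or above α are assumptions made for this illustration.

```python
def single_group_rejections(cond_lfdr, group_lfdr, alpha):
    """Indices rejected under the rule of Remark 3.1 (empty if none qualify)."""
    if group_lfdr >= alpha:
        return []  # the adjusted level would be non-positive
    threshold = (alpha - group_lfdr) / (1 - group_lfdr)
    order = sorted(range(len(cond_lfdr)), key=lambda j: cond_lfdr[j])
    running, best = 0.0, 0
    for r, j in enumerate(order, start=1):
        running += cond_lfdr[j]
        if running <= r * threshold:  # average of the r smallest is within level
            best = r
    return order[:best]


cond = [0.01, 0.6, 0.03, 0.2]   # toy conditional local FDRs Lfdr_{j|1}
group_lfdr, alpha = 0.01, 0.05  # toy group local FDR and target level
rejected = single_group_rejections(cond, group_lfdr, alpha)
# Posterior FDR actually attained: 1 - (1 - Lfdr_1)(1 - average rejected Lfdr_{j|1}).
achieved = 1 - (1 - group_lfdr) * (1 - sum(cond[j] for j in rejected) / len(rejected))
```

With these toy inputs the two hypotheses with the smallest conditional local FDRs are rejected, and the attained level stays below α because the within-group average is compared against the adjusted level rather than against α itself.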
We express δij ∈ {0, 1}, the decision rule associated with θij, similarly to θij, as follows: δij(X) = δi(X) · δj|i(X), with δi(X) ∈ {0, 1} and δj|i(X) ∈ {0, 1} being the decision rules for θi and θj|i, respectively. This provides a two-stage approach to deciding between θij = 0 and θij = 1 simultaneously for all (i, j). This paper relates to the development of such two-stage approaches, but focused on controlling the posterior expected proportion of false discoveries across all hypotheses, referred to as the total posterior FDR (PFDRT), or the posterior expected average false discovery proportion across the selected/significant groups, referred to as the selective posterior FDR (PFDRS), at a given level α. In other words, we consider determining (δi(X), δj|i(X)), i = 1, . . . , m, j = 1, . . . , ni, satisfying
\[
\mathrm{PFDR}_T = E\left[ \frac{\sum_{i=1}^{m} \sum_{j=1}^{n_i} (1 - \theta_{ij}) \delta_{ij}(X)}{\max\big\{\sum_{i=1}^{m} \sum_{j=1}^{n_i} \delta_{ij}(X), 1\big\}} \,\Big|\, X \right] \le \alpha, \tag{3.6}
\]
or
\[
\mathrm{PFDR}_S = E\left[ \frac{1}{|S|} \sum_{i \in S} \frac{\sum_{j=1}^{n_i} (1 - \theta_{ij}) \delta_{ij}(X)}{\max\big\{\sum_{j=1}^{n_i} \delta_{ij}(X), 1\big\}} \,\Big|\, X \right] \le \alpha, \tag{3.7}
\]
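where S is the set of indices for the selected groups, with the expectations taken with respect to θij’s conditional on X. For notational convenience, we will often hide the symbol X in the δ’s. Using (3.1), we see that PFDRT and PFDRS simplify, respectively, to
\[
\mathrm{PFDR}_T = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n_i} \mathrm{Lfdr}_{ij}(\lambda_i, \pi_{2i}) \delta_{ij}}{\max\big\{\sum_{i=1}^{m} \sum_{j=1}^{n_i} \delta_{ij}, 1\big\}}
= \frac{\sum_{i=1}^{m} \delta_i R_i \{1 - [1 - \mathrm{Lfdr}_{i}(\lambda_i, \pi_{2i})][1 - \mathrm{PFDR}_{W_i}]\}}{\max\big\{\sum_{i=1}^{m} \delta_i R_i, 1\big\}}, \tag{3.8}
\]
and
\[
\mathrm{PFDR}_S = \frac{\sum_{i=1}^{m} \delta_i \{1 - [1 - \mathrm{Lfdr}_{i}(\lambda_i, \pi_{2i})][1 - \mathrm{PFDR}_{W_i}]\}}{\max\big\{\sum_{i=1}^{m} \delta_i, 1\big\}}, \tag{3.9}
\]
where $R_i = \delta_i \sum_{j=1}^{n_i} \delta_{j|i}$, and $\mathrm{PFDR}_{W_i} = \sum_{j=1}^{n_i} \delta_{j|i} \mathrm{Lfdr}_{j|i}(\pi_{2i}) / \max(R_i, 1)$ is the within-group posterior FDR for group i.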
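To see the two forms of PFDRT in (3.8) agree, the following sketch evaluates both on toy inputs, together with PFDRS from (3.9). All names and numbers are hypothetical, and the example assumes that every selected group has at least one within-group rejection.

```python
def pfdr_total_and_selective(group_lfdr, cond_lfdr, delta_group, delta_within):
    """Grouped forms of PFDR_T (3.8) and PFDR_S (3.9) for given 0/1 decisions."""
    num_t = den_t = num_s = den_s = 0.0
    for g, cond, d_i, d_w in zip(group_lfdr, cond_lfdr, delta_group, delta_within):
        r_i = d_i * sum(d_w)                                          # R_i
        pfdr_w = sum(d * c for d, c in zip(d_w, cond)) / max(r_i, 1)  # PFDR_{W_i}
        false_share = 1 - (1 - g) * (1 - pfdr_w)
        num_t += d_i * r_i * false_share
        den_t += d_i * r_i
        num_s += d_i * false_share
        den_s += d_i
    return num_t / max(den_t, 1), num_s / max(den_s, 1)


group_lfdr = [0.02, 0.9, 0.1]                      # toy Lfdr_i(lambda_i, pi_2i)
cond_lfdr = [[0.01, 0.5, 0.03], [0.4, 0.8], [0.05, 0.2]]  # toy Lfdr_{j|i}
delta_group = [1, 0, 1]
delta_within = [[1, 0, 1], [0, 0], [1, 0]]
pfdr_t, pfdr_s = pfdr_total_and_selective(group_lfdr, cond_lfdr, delta_group, delta_within)

# Pooled form: average Lfdr_ij over the rejected hypotheses, with Lfdr_ij from (3.1).
rejected_lfdr = [
    1 - (1 - g) * (1 - c)
    for g, cond, d_i, d_w in zip(group_lfdr, cond_lfdr, delta_group, delta_within)
    for c, d in zip(cond, d_w)
    if d_i * d
]
pooled = sum(rejected_lfdr) / max(len(rejected_lfdr), 1)
```

The agreement between `pfdr_t` and `pooled` reflects that averaging (3.1) over the rejected hypotheses of a group gives 1 − [1 − Lfdr_i][1 − PFDR_Wi].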
The above representations of PFDRT and PFDRS under One-Way GAMM provide a Group Adjusted TEsting (GATE) framework for one-way classified hypotheses using their local FDRs, allowing us to produce algorithms (in their oracle forms) answering each of Q1 and Q2. We commonly refer to these algorithms as One-Way GATE algorithms.
3.1. Answering Q1
Before we present an algorithm in its oracle form answering Q1, it is important to note the following theorem, which drives its development and gives it an optimality property.
Theorem 3.1. Let
\[
\mathrm{PFNR}_T = E\left[ \frac{\sum_{i=1}^{m} \sum_{j=1}^{n_i} \theta_{ij} (1 - \delta_{ij}(X))}{\max\big\{\sum_{i=1}^{m} \sum_{j=1}^{n_i} (1 - \delta_{ij}(X)), 1\big\}} \,\Big|\, X \right] \tag{3.10}
\]
denote the total posterior FNR (PFNRT) of a decision rule δ(X) = {δij(X), i = 1, . . . , m, j = 1, . . . , ni}. The PFNRT of the decision rule δ(X) with δij(X) = I(Lfdrij(λi, π2i) ≤ c), for c ∈ (0, 1) satisfying PFDRT = α, is always less than or equal to that of any other δ′ij(X) with PFDRT ≤ α.
A proof of this theorem can be seen in the Appendix.
Algorithm 1 One-Way GATE 1 (Oracle).
1: Calculate Lfdrij(λi, π2i), the hypothesis-specific local FDR under One-Way GAMM, from Proposition 3.1, for each i = 1, . . . , m; j = 1, . . . , ni.
2: Pool all these Lfdrij’s together and sort them as Lfdr(1) ≤ · · · ≤ Lfdr(N).
3: Reject the hypotheses associated with Lfdr(k), k = 1, . . . , R, where $R = \max\big\{ l : \sum_{k=1}^{l} \mathrm{Lfdr}_{(k)} \le l\alpha \big\}$.
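Steps 2 and 3 of Algorithm 1 can be sketched as follows, taking the hypothesis-specific local FDRs from Step 1 as given. The function name and the toy values are hypothetical, and returning an empty list when no l satisfies the constraint is a convention assumed here.

```python
def one_way_gate1(lfdr, alpha):
    """Pool, sort, and threshold: lfdr[i][j] holds Lfdr_ij(lambda_i, pi_2i).

    Returns the (group, hypothesis) index pairs of the rejected hypotheses.
    """
    pooled = sorted((v, i, j) for i, row in enumerate(lfdr) for j, v in enumerate(row))
    running, n_reject = 0.0, 0
    for l, (v, _, _) in enumerate(pooled, start=1):
        running += v
        if running <= l * alpha:  # average of the l smallest Lfdr's is at most alpha
            n_reject = l
    return [(i, j) for _, i, j in pooled[:n_reject]]


lfdr = [[0.01, 0.30], [0.02, 0.04, 0.90]]  # toy values for two groups
rejected = one_way_gate1(lfdr, alpha=0.05)
```

Here the three smallest values average to about 0.023, while adding the fourth (0.30) would push the average above 0.05, so three hypotheses are rejected.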
Theorem 3.2. The oracle One-Way GATE 1 controls PFDRT at α.
This theorem can be proved using standard arguments used for Lfdr based approaches to testing a single group of hypotheses (see, e.g., Sun & Cai (2007); Sarkar & Zhou (2008)). It is important to note that PFDRT may not equal a pre-specified value of α, and so Algorithm 1 is, in general, not exactly optimal; rather, it is the closest to the rule that is optimal in the sense of Theorem 3.1.
Remark 3.2. When λi = 1 for all i, i.e., when the underlying grouping of hypotheses is ineffective in the sense that a group’s own chance of being significant is no different from when it is formed by combining a set of independent hypotheses, One-Way GATE 1 reduces to the standard Lfdr based approach (like that in Sun & Cai (2007); He et al. (2015); and in many others). As we will see from simulation studies in Section 4, with λi increasing (or decreasing) from 1, i.e., when a group’s chance of being significant gets larger (or smaller) than what it is if the group consists of independent hypotheses, the standard Lfdr based approach becomes less powerful (or fails to control the error rate).
3.2. Answering Q2
There are applications in the context of selective inference of multiple groups/families of hypotheses where discovering significant groups, and hence a control over a measure of their false discoveries, is scientifically no less meaningful than making such discoveries for individual hypotheses subject to a control over a similar measure of false discoveries across all of them. For instance, as Peterson et al. (2016b) noted, in a multiphenotype genome-wide association study, which is often focused on groups/families of all phenotype-specific hypotheses related to different genetic variants, rejecting Hi corresponding to variant i is considered an important discovery in the process of identifying phenotypes that are significantly associated with that variant.
They borrowed ideas from Benjamini & Bogomolov (2014) and considered a hierarchical testing method that allows control of this so-called between-group FDR in the process of
controlling the expected average of false discovery proportions across significant groups (due to Benjamini & Bogomolov (2014)). The following algorithm in its oracle form answering Q2 offers an Lfdr based alternative to the hierarchical testing method of Peterson et al. (2016b). It allows a control over
\[
\mathrm{PFDR}_B = \frac{\sum_{i=1}^{m} \delta_i \mathrm{Lfdr}_{i}(\lambda_i, \pi_{2i})}{\max\big\{\sum_{i=1}^{m} \delta_i, 1\big\}},
\]
an Lfdr analog of the aforementioned between-group FDR for the selected groups, while controlling PFDRS. The following notation is used in this algorithm: For 0 < α′ < 1, $R_i(\alpha') = \max\big\{1 \le k \le n_i : \sum_{j=1}^{k} \mathrm{Lfdr}_{(j)|i}(\pi_{2i}) \le k\alpha'\big\}$, with Lfdr(j)|i(π2i), j = 1, . . . , ni, being the sorted values of the Lfdrj|i(π2i)’s in group i.
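Algorithm 2 One-Way GATE 2 (Oracle).
1: Given an η ∈ (0, α), select the largest subset of group indices S such that $\frac{1}{|S|} \sum_{i \in S} \mathrm{Lfdr}_{i}(\lambda_i, \pi_{2i}) \le \eta$.
2: For each i ∈ S, and any given α′ ≤ α, find Ri(α′) to calculate
\[
\mathrm{PFDR}_S(\alpha') = 1 - \frac{1}{|S|} \sum_{i \in S} (1 - \mathrm{Lfdr}_{i}(\lambda_i, \pi_{2i})) \Big[ 1 - \frac{1}{R_i(\alpha')} \sum_{j=1}^{R_i(\alpha')} \mathrm{Lfdr}_{(j)|i}(\pi_{2i}) \Big]. \tag{3.11}
\]
3: Find α∗(S) = sup{α′ : PFDRS(α′) ≤ α}.
4: Reject the hypotheses associated with PFDRS(α∗(S)), that is, in each group i ∈ S, those corresponding to Lfdr(j)|i(π2i), j = 1, . . . , Ri(α∗(S)).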
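A minimal sketch of Algorithm 2 is given below, under stated assumptions: the group and conditional local FDRs are taken as given, Ri(α′) is set to 0 (with a within-group average of 0) when no k satisfies its defining inequality, and the supremum in Step 3 is located by scanning the breakpoints of the step function PFDRS(α′) up to α. The function names and toy inputs are hypothetical.

```python
def running_means(values):
    """Running means of the increasingly sorted values."""
    total, out = 0.0, []
    for k, v in enumerate(sorted(values), start=1):
        total += v
        out.append(total / k)
    return out


def one_way_gate2(group_lfdr, cond_lfdr, alpha, eta):
    # Step 1: largest S whose average group Lfdr is at most eta.
    order = sorted(range(len(group_lfdr)), key=lambda i: group_lfdr[i])
    means = running_means(group_lfdr)
    size = max((k for k, m in enumerate(means, start=1) if m <= eta), default=0)
    selected = order[:size]
    if not selected:
        return [], {}
    within = {i: running_means(cond_lfdr[i]) for i in selected}

    def counts(a):  # R_i(a), taken as 0 when no k qualifies (assumption)
        return {i: max((k for k, m in enumerate(within[i], start=1) if m <= a), default=0)
                for i in selected}

    def pfdr_s(a):  # right-hand side of (3.11)
        r = counts(a)
        kept = sum((1 - group_lfdr[i]) * (1 - (within[i][r[i] - 1] if r[i] else 0.0))
                   for i in selected)
        return 1 - kept / len(selected)

    # Steps 2-3: PFDR_S(a) is a nondecreasing step function; scan breakpoints <= alpha.
    candidates = sorted({m for i in selected for m in within[i] if m <= alpha})
    feasible = [a for a in candidates if pfdr_s(a) <= alpha]
    r = counts(feasible[-1]) if feasible else {i: 0 for i in selected}
    # Step 4: reject the R_i smallest conditional Lfdr's in each selected group.
    rejected = {i: sorted(range(len(cond_lfdr[i])), key=lambda j: cond_lfdr[i][j])[:r[i]]
                for i in selected}
    return selected, rejected


group_lfdr = [0.01, 0.60, 0.03]                      # toy Lfdr_i(lambda_i, pi_2i)
cond_lfdr = [[0.01, 0.02, 0.50], [0.1, 0.2], [0.02, 0.30]]  # toy Lfdr_{j|i}
selected, rejected = one_way_gate2(group_lfdr, cond_lfdr, alpha=0.05, eta=0.025)
```

The rejection set is the same for every α′ between the last feasible breakpoint and α∗(S), so returning the set at that breakpoint is enough for this illustration.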
Theorem 3.3. The oracle One-Way GATE 2 controls PFDRS at α subject to a control over PFDRB at η < α.
This theorem can be proved by noting that the left-hand side of (3.11) is the PFDRS of the procedure produced by Algorithm 2.
Let
\[
\mathrm{PFNR}_B = E\left[ \frac{\sum_{i=1}^{m} \theta_i (1 - \delta_i(X))}{\max\big\{\sum_{i=1}^{m} (1 - \delta_i(X)), 1\big\}} \,\Big|\, X \right]
\]
and
\[
\mathrm{PFNR}_{W_i} = E\left[ \frac{\sum_{j=1}^{n_i} \theta_{j|i} (1 - \delta_{j|i}(X))}{\max\big\{\sum_{j=1}^{n_i} (1 - \delta_{j|i}(X)), 1\big\}} \,\Big|\, X \right]
\]
denote the between-group posterior FNR and the within-group posterior FNR for group i, respectively, for a decision rule of the form δij(X) = δi(X)δj|i(X), with δi(X) = I(Lfdri(λi, π2i) ≤ c) and δj|i(X) = I(Lfdrj|i(π2i) ≤ c′), for some 0 < c, c′ < 1, i = 1, . . . , m.
Remark 3.3. From Theorem 3.1, we have the following optimality result regarding One-Way GATE 2: Given any 0 < η < α < 1,
(i) the PFNRB of the decision rule of the form δi(X) = I(Lfdri(λi, π2i) ≤ c) with 0 < c < 1 satisfying PFDRB = η is less than or equal to that of any other δ′i(X) with PFDRB ≤ η.
(ii) Given δi(X), i = 1, . . . , m, with PFDRB ≤ η, there exists an α′(η) ≤ α, subject to PFDRS = α, such that, for each i, the PFNRWi of the decision rule of the form δj|i(X) = I(Lfdrj|i(π2i) ≤ c′) with 0 < c′ < 1 satisfying PFDRWi = α′(η) is less than or equal to that of any other decision rule in that group for which PFDRWi ≤ α′(η).
Remark 3.4. It is important to note that One-Way GATE 2 without Step 1 can be used in situations where the focus is on controlling PFDRS given a selection rule (or S).
4. Numerical Studies
This section presents results of numerical studies we conducted to examine the performances of One-Way GATE 1 and One-Way GATE 2 compared to their relevant competitors in their oracle forms.
4.1. One-Way GATE 1
We considered various simulation settings involving 10,000 or 100,000 hypotheses grouped into equal-sized groups to investigate the performance of One-Way GATE 1 in comparison with its three competitors, all in their oracle forms. The first competitor, named the oracle Naive Method, ignores the group structure by pooling all the hypotheses together into a single group, while the other two are the oracle SC (Sun & Cai (2009)) and oracle GBH (Hu et al. (2010)) methods. They operate under our model setting with equal group size n as follows:
Oracle Naive Method: The single-group Lfdr based method of Sun & Cai (2007) is applied to the mn hypotheses pooled together into a single group under a two-class mixture model $X_{ij} \sim (1 - p) f_0(x_{ij}) + p f_1(x_{ij})$, with $p = \frac{1}{m} \sum_{i=1}^{m} p_i$, where $p_i = \pi_1 \pi_{2|i}$ and $\pi_{2|i} = \pi_{2i} / [1 - (1 - \pi_{2i})^{n_i}]$.
Oracle SC Method: The single-group Lfdr based method of Sun & Cai (2007) is applied to the mn hypotheses pooled together into a single group assuming a two-class mixture model $X_{ij} \sim (1 - \pi_{2|i}) f_0(x_{ij}) + \pi_{2|i} f_1(x_{ij})$ for the n hypotheses in group i, for each i = 1, . . . , m.
Oracle GBH Method: Xij is converted to its p-value Pij before a level α BH method is applied to the weighted p-values $P^{w}_{ij} = p(1 - p_i) P_{ij} / p_i$, i = 1, . . . , m; j = 1, . . . , n, for the mn hypotheses pooled together into a single group.
The simulations involved independently generated triplets of observations (Xij, θi, θj|i), i = 1, . . . , m (= 200 or 2000); j = 1, . . . , ni (= 5 or 50), with (i) θi ∼ Bern(π1 = 0.3); (ii) θj|i’s jointly following TPBern(π2i; ni), with π2i determined from (3.4) using λ = k²/100 for k = 1, 2, . . . , 20; and (iii) Xij | θij ∼ N(0, 1) if θij = 0, and ∼ 0.3N(−2, 1) + 0.7N(µ2, 1) if θij = 1, where µ2 = 1.5, 1.6, . . . , 3.0. The oracle versions of One-Way GATE 1, the Naive Method, the SC method, and the GBH method were applied to the data for testing θij = 0 against θij = 1 simultaneously for all (i, j) at α = 0.05, and the simulated values of the total false discovery rate, the average number of true rejections, and the average number of total rejections were obtained for each of them based on 1000 replications.
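The data-generating mechanism of this simulation can be sketched as below. The function names, the random seed, and the use of rejection sampling for the TPBern draw are assumptions for illustration; the inversion of (3.4) follows from writing q = (1 − π2i)^ni and solving λ = [π1/(1 − π1)] · q/(1 − q) for q.

```python
import random


def pi2_from_lambda(lam, pi1, n):
    """Invert (3.4): the pi_2i giving grouping effect lam for group size n."""
    q = lam * (1 - pi1) / (pi1 + lam * (1 - pi1))  # q = (1 - pi_2i)^n
    return 1 - q ** (1 / n)


def simulate_group(pi1, pi2, n, mu2, rng):
    """One group's (theta_i, [theta_ij], [X_ij]) under the setting described above."""
    theta_i = 1 if rng.random() < pi1 else 0
    theta = [0] * n
    if theta_i:
        while not any(theta):  # TPBern via rejection of the all-zero draw
            theta = [1 if rng.random() < pi2 else 0 for _ in range(n)]
    x = []
    for t in theta:
        if t == 0:
            x.append(rng.gauss(0, 1))       # null: N(0, 1)
        else:
            mean = -2 if rng.random() < 0.3 else mu2
            x.append(rng.gauss(mean, 1))    # non-null: 0.3 N(-2, 1) + 0.7 N(mu2, 1)
    return theta_i, theta, x


rng = random.Random(2017)                  # arbitrary seed for reproducibility
pi2 = pi2_from_lambda(lam=0.25, pi1=0.3, n=5)
groups = [simulate_group(0.3, pi2, 5, 1.5, rng) for _ in range(2000)]
```

Setting `lam=1` returns the π2i for which π1 = 1 − (1 − π2i)^ni, the ‘no group effect’ case.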
[Figure 1 panels: FDR (with reference line alpha = 0.05), True Rej, and Total Rej, each plotted against λ for the Gate, GBH, SC, and Naive methods.]
Figure 1: Oracle One-Way GATE 1: m = 2000, ni = 5, µ2 = 1.5. The x-axis corresponds to λ, varying from 0.01 to 4.
Figures 1-3 and 6-14 display how the four methods compare across different values of π2i (or λ) and µ2 as the group size changes from small to large. The first three of these figures are used here to point out scenarios where One-Way GATE 1 is seen to perform better than its competitors when µ2 = 1.5. The remaining graphs, for larger values of µ2, are placed in the Appendix to show whether the comparative performance pattern among the four methods changes with increasing µ2. Figures 1-3 show that oracle One-Way GATE 1 controls the FDR well at the desired level 0.05. The oracle Naive Method also controls the FDR at the desired level. However, it is seen to be less powerful than oracle One-Way GATE 1, as expected, with the power difference getting larger with increased group size. The superior performance of oracle One-Way GATE 1 over the oracle SC method when λ ≠ 1 is clearly shown by these graphs. The oracle SC method fails to control the FDR when λ < 1, with the resultant FDR getting as large as 0.47. This happens because it uses a larger value of π2i when λ is small, inflating the FDR by an amount related to the value of λ. When λ is larger, it uses a smaller value of π2i, resulting in a method that is overly conservative. The GBH method shows a similar pattern: it fails to control the FDR when λ < 1 and is overly conservative when λ > 1, with the conservativeness becoming more prominent as λ increases. When λ < 1, the SC method yields slightly more rejections, largely due to its inflated error rate. When λ > 1, oracle One-Way GATE 1 performs markedly better than the oracle SC and GBH methods.
[Figure 2 panels: FDR (with reference line alpha = 0.05), True Rej, and Total Rej, each plotted against π2|i for the Gate, GBH, SC, and Naive methods.]
Figure 2: Oracle One-Way GATE 1: m = 200, ni = 50, µ2 = 1.5. The x-axis corresponds to π2i, varying from 0.05 to 0.95.
As seen from Figures 6-14, oracle One-Way GATE 1 retains its improved performance over the oracle versions of the Naive, SC and GBH methods for larger values of µ2.
4.2. One-Way GATE 2
Simulation studies were conducted to compare oracle One-Way GATE 2 to its only competitor, the BB method (Benjamini & Bogomolov (2014)) in its oracle form, which operates as follows:
Oracle BB method using Simes’ combination: Xij is converted to its p-value Pij. With Pi(1) ≤ · · · ≤ Pi(n) denoting the sorted p-values in group i, let Pi = min1≤j≤n {n(1 − π2|i)Pi(j)/j} denote Simes’ combination of the p-values in group i in its oracle form, for i = 1, . . . , m. Let G be the set of indices of the group-specific hypotheses Hi rejected using the oracle level α BH method based on (1 − π1)Pi, i = 1, . . . , m. Reject the hypotheses corresponding to Pi(j) for all i ∈ G and j ≤ Ri = max{j : (1 − π1π2|i)Pi(j) ≤ j|G|α/(mn)}.
The comparison, made in terms of the selective FDR, the average number of total rejections, and the average number of true rejections, was carried out under the same setting as for One-Way GATE 1. Figures 4 and 5 present the comparison for the setting where m = 2,000, ni = 50, π1 = 0.10 and 0.52, respectively, and π2|i = 0.30. The results for other settings are reported in Figures 15-23. First, it is demonstrated that both the oracle One-Way GATE 2 and the oracle BB method control the PFDRS well.
Figure 3: Oracle One-Way GATE 1: m = 2000, ni = 50, µ2 = 1.5. The x-axis corresponds to π2i, varying from 0.05 to 0.95. [Panels: FDR, Total Rej, True Rej; legend entries: Gate, GBH, SC, Naive, alpha=0.05.]
Oracle One-Way GATE 2 is more powerful, in the sense of yielding a larger number of true rejections, when π1 is relatively small, indicating a high level of sparsity between groups. When π1 is as large as 0.8, most of the groups are selected, and there is little adjustment for selection in the oracle BB method, which thus yields more rejections. When the group size is large (= 50), oracle One-Way GATE 2 is more powerful than the oracle BB method; however, the latter can lead to a larger number of rejections when the group size is small (= 5).
Figure 4: Oracle One-Way GATE 2: m = 2000, ni = 50, π1 = 0.10. The x-axis corresponds to η varying from 0.003 to 0.05. [Panels: Sel FDR, Total Rej, True Rej; legend entries: Gate, BB, α.]
Figure 5: Oracle One-Way GATE 2: m = 2000, ni = 50, π1 = 0.52. The x-axis corresponds to η varying from 0.003 to 0.05. [Panels: Sel FDR, Total Rej, True Rej; legend entries: Gate, BB, α.]
5. Concluding Remarks
The primary focus of this article has been to continue the line of research in Liu et al. (2016) to answer Q1 and Q2 for one-way classified hypotheses, providing the groundwork for our broader goal of answering these questions in the setting of two-way classified hypotheses. The two-way classified setting occurs in many applications. For instance, in time-course microarray experiments (see, e.g., Storey et al. (2005); Yuan & Kendziorski (2006); Sun & Wei (2011)), the hypotheses of interest can be laid out in a two-way classified form with 'gene' and 'time-point' representing the two categories of classification. In multiphenotype GWAS (Peterson et al. (2016b); Segura et al. (2012)), the families of hypotheses related to different phenotypes form one level of grouping, while the other level of grouping is formed by the families of hypotheses corresponding to different SNPs. Two-way classified structure of hypotheses also occurs in brain imaging studies (Liu et al. (2009); Stein et al. (2010); Lin et al. (2014); Barber & Ramdas (2015)).
Now that we have a theoretical framework that successfully captures the underlying group effect and yields powerful approaches to multiple testing in the one-way classified setting, we can proceed to extend it to produce newer and powerful Lfdr based approaches answering Q1 and Q2 in the two-way classified setting. We intend to do that in our future communications.
Also, we have focused in this paper on developing the GATE algorithms in their oracle forms. In practice, one can estimate the unknown quantities in these oracle methods using various estimation techniques; see, e.g., Liu et al. (2016). Additionally, one can assume hyper-priors for the parameters and use Bayesian tools to calculate the Lfdr's. We leave these for our future research.
The figures associated with our numerical studies involving the BB method in its oracle form seem to suggest that this method, as proposed in Benjamini & Bogomolov (2014), can potentially be improved by plugging into it an estimated proportion of active groups. This is another important direction that we will pursue in our future research.
A. Appendix
A.1. Proofs of (3.2) and (3.3)
These results, although they appeared before in Liu et al. (2016), will be proved here using different and simpler arguments. They are re-stated, without any loss of generality, for a single group with slightly different notation in the following lemma.
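Lemma A.1. Conditionally given $\theta \sim \mathrm{Bern}(\pi_1)$, let $(X_j, \theta_j)$, $j = 1, \ldots, n$, be distributed as follows: (i) $X_1, \ldots, X_n \mid \theta_1, \ldots, \theta_n \overset{ind}{\sim} (1 - \theta \cdot \theta_j) f_0(x_j) + \theta \cdot \theta_j f_1(x_j)$, and (ii) $\theta_1, \ldots, \theta_n \sim \mathrm{TPBern}(\pi_2; n)$. Let $\mathrm{Lfdr}_j(\pi_2) \equiv \mathrm{Lfdr}(x_j; \pi_2) = (1 - \pi_2) f_0(x_j)/m(x_j)$, with $m(x) = (1 - \pi_2) f_0(x) + \pi_2 f_1(x)$, for $j = 1, \ldots, n$, and $\mathrm{Lfdr}(\pi_2) = \prod_{j=1}^{n} \mathrm{Lfdr}_j(\pi_2)$. Then,
\[
\Pr(\theta_j = 0 \mid \theta = 1, X_1 = x_1, \ldots, X_n = x_n) = \frac{\mathrm{Lfdr}_j(\pi_2) - \mathrm{Lfdr}(\pi_2)}{1 - \mathrm{Lfdr}(\pi_2)}, \tag{A.1}
\]
and
\[
\Pr(\theta = 0 \mid X_1 = x_1, \ldots, X_n = x_n) = \frac{\mathrm{Lfdr}(\pi_2)}{\mathrm{Lfdr}(\pi_2) + \lambda [1 - \mathrm{Lfdr}(\pi_2)]}, \tag{A.2}
\]
where
\[
\lambda = \frac{\pi_1}{1 - \pi_1} \cdot \frac{(1 - \pi_2)^n}{1 - (1 - \pi_2)^n}.
\]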
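A direct consequence of (A.1) and (A.2) is worth spelling out; the step below is a short worked derivation added for illustration and is not part of the lemma. Hypothesis j is null exactly when θ·θj = 0, so the law of total probability combines the two posterior probabilities. The argument π2 is suppressed in this display only.

```latex
\begin{align*}
\Pr(\theta\,\theta_j = 0 \mid x_1, \ldots, x_n)
  &= \Pr(\theta = 0 \mid x_1, \ldots, x_n)
     + \Pr(\theta = 1 \mid x_1, \ldots, x_n)\,
       \Pr(\theta_j = 0 \mid \theta = 1, x_1, \ldots, x_n) \\
  &= \frac{\mathrm{Lfdr}}{\mathrm{Lfdr} + \lambda(1 - \mathrm{Lfdr})}
     + \frac{\lambda(1 - \mathrm{Lfdr})}{\mathrm{Lfdr} + \lambda(1 - \mathrm{Lfdr})}
       \cdot \frac{\mathrm{Lfdr}_j - \mathrm{Lfdr}}{1 - \mathrm{Lfdr}} \\
  &= \frac{\mathrm{Lfdr} + \lambda(\mathrm{Lfdr}_j - \mathrm{Lfdr})}
          {\mathrm{Lfdr} + \lambda(1 - \mathrm{Lfdr})}.
\end{align*}
```

When λ = 1, that is, when π1 = 1 − (1 − π2)^n, the right-hand side reduces to Lfdr_j(π2), the Lfdr under the standard two-class mixture model with independent Bern(π2) indicators.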
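Proof. First, note that
\[
(X_1, \ldots, X_n) \mid \theta = 0 \sim \prod_{j=1}^{n} f_0(x_j) = \frac{\prod_{j=1}^{n} m(x_j)}{(1 - \pi_2)^n}\, \mathrm{Lfdr}(\pi_2), \tag{A.3}
\]
and
\begin{align*}
(X_1, \ldots, X_n) \mid \theta = 1
 &\sim \frac{1}{1 - (1 - \pi_2)^n} \sum_{\sum_{j=1}^{n} \theta_j > 0} \prod_{j=1}^{n} \{(1 - \theta_j) f_0(x_j) + \theta_j f_1(x_j)\} \prod_{j=1}^{n} \{\pi_2^{\theta_j} (1 - \pi_2)^{1 - \theta_j}\} \\
 &= \frac{1}{1 - (1 - \pi_2)^n} \left[ \prod_{j=1}^{n} m(x_j) - (1 - \pi_2)^n \prod_{j=1}^{n} f_0(x_j) \right] \\
 &= \frac{\prod_{j=1}^{n} m(x_j)}{1 - (1 - \pi_2)^n} \left[ 1 - \mathrm{Lfdr}(\pi_2) \right], \tag{A.4}
\end{align*}
from which we get
\[
(X_1, \ldots, X_n) \sim \left\{ \frac{(1 - \pi_1)\, \mathrm{Lfdr}(\pi_2)}{(1 - \pi_2)^n} + \frac{\pi_1 [1 - \mathrm{Lfdr}(\pi_2)]}{1 - (1 - \pi_2)^n} \right\} \prod_{j=1}^{n} m(x_j). \tag{A.5}
\]
Formula (A.2) follows upon dividing $(1 - \pi_1)$ times (A.3) by (A.5). When $\theta_j = 0$, the conditional distribution of $X_1, \ldots, X_n$ given $\theta = 1$ can be obtained similarly to that in (A.4) as follows:
\begin{align*}
 & \frac{(1 - \pi_2) f_0(x_j)}{1 - (1 - \pi_2)^n} \sum_{\sum_{k (\neq j) = 1}^{n} \theta_k > 0} \; \prod_{k (\neq j) = 1}^{n} \{(1 - \theta_k) f_0(x_k) + \theta_k f_1(x_k)\} \prod_{k (\neq j) = 1}^{n} \{\pi_2^{\theta_k} (1 - \pi_2)^{1 - \theta_k}\} \\
 &= \frac{(1 - \pi_2) f_0(x_j)}{1 - (1 - \pi_2)^n} \left[ \prod_{k (\neq j) = 1}^{n} m(x_k) - (1 - \pi_2)^{n-1} \prod_{k (\neq j) = 1}^{n} f_0(x_k) \right] \\
 &= \frac{\prod_{j=1}^{n} m(x_j)}{1 - (1 - \pi_2)^n} \left[ \mathrm{Lfdr}_j(\pi_2) - \mathrm{Lfdr}(\pi_2) \right]. \tag{A.6}
\end{align*}
Formula (A.1) then follows upon dividing (A.6) by (A.4).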
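The two formulas in Lemma A.1 can also be checked numerically by enumerating every configuration of (θ1, . . . , θn). The sketch below is illustrative only and rests on stated assumptions: f0 is taken to be the N(0, 1) density and f1 the N(µ, 1) density, TPBern(π2; n) is read as the product Bernoulli distribution truncated to exclude the all-zero configuration (as the sum over Σθj > 0 in (A.4) suggests), and the function names and test values are hypothetical.

```python
import itertools
import math

def normal_pdf(x, mu=0.0):
    """Density of N(mu, 1); an assumed choice for f0 (mu = 0) and f1."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)

def closed_form(x, pi1, pi2, mu):
    """Right-hand sides of (A.1) (one value per j) and of (A.2)."""
    n = len(x)
    lfdr_j = []
    for xj in x:
        f0, f1 = normal_pdf(xj), normal_pdf(xj, mu)
        lfdr_j.append((1 - pi2) * f0 / ((1 - pi2) * f0 + pi2 * f1))
    lfdr = math.prod(lfdr_j)
    q = (1 - pi2) ** n
    lam = pi1 / (1 - pi1) * q / (1 - q)
    a1 = [(l - lfdr) / (1 - lfdr) for l in lfdr_j]
    a2 = lfdr / (lfdr + lam * (1 - lfdr))
    return a1, a2

def brute_force(x, pi1, pi2, mu):
    """The same posterior probabilities by direct enumeration."""
    n = len(x)
    q = (1 - pi2) ** n
    joint0 = (1 - pi1) * math.prod(normal_pdf(xj) for xj in x)  # theta = 0
    joint1 = 0.0                 # theta = 1
    joint1_null = [0.0] * n      # theta = 1 and theta_j = 0
    for th in itertools.product((0, 1), repeat=n):
        if sum(th) == 0:
            continue  # all-zero configuration is excluded by the truncation
        prior = pi1 * math.prod(pi2 if t else 1 - pi2 for t in th) / (1 - q)
        w = prior * math.prod(
            normal_pdf(xj, mu if t else 0.0) for xj, t in zip(x, th)
        )
        joint1 += w
        for j, t in enumerate(th):
            if t == 0:
                joint1_null[j] += w
    return [v / joint1 for v in joint1_null], joint0 / (joint0 + joint1)

# Arbitrary illustrative inputs.
x_demo = [0.3, 2.1, -0.5, 1.7]
a1, a2 = closed_form(x_demo, pi1=0.3, pi2=0.4, mu=1.5)
b1, b2 = brute_force(x_demo, pi1=0.3, pi2=0.4, mu=1.5)
```

The enumeration costs 2^n evaluations, so it is only practical for small n; its purpose is to confirm that the closed forms agree with the model definition, not to replace them.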
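Proof of Theorem 3.1. For notational simplicity, we will hide $X$ in $\delta_{ij}(X)$, $\delta'_{ij}(X)$, and $\mathrm{Lfdr}_{ij}(X)$. First, we note the following inequalities:
\[
\alpha \sum_{ij} (\delta_{ij} - \delta'_{ij}) \le \sum_{ij} (\delta_{ij} - \delta'_{ij})\, \mathrm{Lfdr}_{ij} \le c \sum_{ij} (\delta_{ij} - \delta'_{ij}), \tag{A.7}
\]
the first of which follows from the fact that the PFDR_T of $\delta'$ is less than or equal to $\alpha$, which is the PFDR_T of $\delta$, while the second one follows from $\sum_{ij} (\delta_{ij} - \delta'_{ij})(c - \mathrm{Lfdr}_{ij}) \ge 0$, because of the definition of $\delta_{ij}$.
Since $\alpha = \sum_{ij} \delta_{ij} \mathrm{Lfdr}_{ij} / \max\{\sum_{ij} \delta_{ij}, 1\} \le c$, we have from (A.7) that $\sum_{ij} (\delta_{ij} - \delta'_{ij})\, \mathrm{Lfdr}_{ij} \ge 0$, that is,
\[
\sum_{ij} (1 - \delta_{ij})\, \mathrm{Lfdr}_{ij} \le \sum_{ij} (1 - \delta'_{ij})\, \mathrm{Lfdr}_{ij}. \tag{A.8}
\]
With $\mathrm{PFNR}_T(\delta)$ and $\mathrm{PFNR}_T(\delta')$ denoting the PFNR_T of $\delta$ and $\delta'$, respectively, we now note that
\begin{align*}
c \left\{ \frac{\mathrm{PFNR}_T(\delta)}{1 - \mathrm{PFNR}_T(\delta)} - \frac{\mathrm{PFNR}_T(\delta')}{1 - \mathrm{PFNR}_T(\delta')} \right\}
 &= c \sum_{ij} \left[ \frac{(1 - \delta_{ij})(1 - \mathrm{Lfdr}_{ij})}{\sum_{ij} (1 - \delta_{ij})\, \mathrm{Lfdr}_{ij}} - \frac{(1 - \delta'_{ij})(1 - \mathrm{Lfdr}_{ij})}{\sum_{ij} (1 - \delta'_{ij})\, \mathrm{Lfdr}_{ij}} \right] \\
 &= \sum_{ij} \left[ \frac{1 - \delta_{ij}}{\sum_{ij} (1 - \delta_{ij})\, \mathrm{Lfdr}_{ij}} - \frac{1 - \delta'_{ij}}{\sum_{ij} (1 - \delta'_{ij})\, \mathrm{Lfdr}_{ij}} \right] \left[ c (1 - \mathrm{Lfdr}_{ij}) - (1 - c)\, \mathrm{Lfdr}_{ij} \right] \\
 &\le 0,
\end{align*}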
with the inequality holding due to the definition of $\delta_{ij}$ and the inequality in (A.8). Thus, we have $\mathrm{PFNR}_T(\delta) \le \mathrm{PFNR}_T(\delta')$, which proves the theorem.
References
Barber, R. F., & Ramdas, A. (2015). The p-filter: multilayer false discovery rate control for grouped hypotheses. Journal of the Royal Statistical Society. Series B, 79, 1247–1268.
Benjamini, Y., & Bogomolov, M. (2014). Selective inference on multiple families of hypotheses. Journal of the Royal Statistical Society. Series B, 76, 297–318.
Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B, 57, 289–300.
Cai, T. T., & Sun, W. (2009). Simultaneous testing of grouped hypotheses: Finding needles in multiple haystacks. Journal of the American Statistical Association, 104, 1467–1481.
Cai, T. T., Sun, W., & Wang, W. (2016). CARS: Covariate assisted ranking and screening for large-scale two-sample inference. Technical Report.
Efron, B. (2008). Microarrays, empirical Bayes and the two-groups model. Statistical Science, 23, 1–22.
Efron, B., Tibshirani, R., Storey, J. D., & Tusher, V. (2001). Empirical Bayes analysis of a microarray experiment. Journal of the American Statistical Association, 96, 1151–1160.
Ferkingstad, E., Frigessi, A., Rue, H., Thorleifsson, G., & Kong, A. (2008). Unsupervised empirical Bayesian multiple testing with external covariates. The Annals of Applied Statistics, 2, 714–735.
Genovese, C., & Wasserman, L. (2002). Operating characteristics and extensions of the false discovery rate procedure. Journal of the Royal Statistical Society. Series B, 64, 499–517.
He, L., Sarkar, S. K., & Zhao, Z. (2015). Capturing the severity of type II errors in high-dimensional multiple testing. Journal of Multivariate Analysis, 142, 106–116.
Heller, R., Chatterjee, N., Krieger, A., & Shi, J. (2017). Post-selection inference following aggregate level hypothesis testing in large scale genomic data. Journal of the American Statistical Association, 113. Available online.
Hu, J. X., Zhao, H., & Zhou, H. H. (2010). False discovery rate control with groups. Journal of the American Statistical Association, 105, 1215–1227.
Ignatiadis, N., Klaus, B., Zaugg, J. B., & Huber, W. (2016). Data-driven hypothesis weighting increases detection power in genome-scale multiple testing. Nature Methods, 13, 577–580.
Lin, D., Calhoun, V. D., & Wang, Y. (2014). Correspondence between fMRI and SNP data by group sparse canonical correlation analysis. Medical Image Analysis, 18, 891–902.
Liu, J., Pearlson, G., Windemuth, A., Ruano, G., Perrone-Bizzozero, N. I., & Calhoun, V. (2009). Combining fMRI and SNP data to investigate connections between brain function and genetics using parallel ICA. Human Brain Mapping, 30, 241–255.
Liu, Y., Sarkar, S. K., & Zhao, Z. (2016). A new approach to multiple testing of grouped hypotheses. Journal of Statistical Planning and Inference, 179, 1–14.
Newton, M. A., Noueiry, A., Sarkar, D., & Ahlquist, P. (2004). Detecting differential gene expression with a semiparametric hierarchical mixture method. Biostatistics, 5, 155–176.
Peterson, C. B., Bogomolov, M., Benjamini, Y., & Sabatti, C. (2016a). Many phenotypes without many false discoveries: error controlling strategies for multitrait association studies. Genetic Epidemiology, 40, 45–56.
Peterson, C. B., Bogomolov, M., Benjamini, Y., & Sabatti, C. (2016b). Many phenotypes without many false discoveries: Error controlling strategies for multitrait association studies. Genetic Epidemiology, 40, 45–56.
Sarkar, S. K. (2004). FDR-controlling stepwise procedures and their false negatives rates. Journal of Statistical Planning and Inference, 125, 119–137.
Sarkar, S. K., & Zhou, T. (2008). Controlling Bayes directional false discovery rate in random effects model. Journal of Statistical Planning and Inference, 138, 682–693.
Sarkar, S. K., Zhou, T., & Ghosh, D. (2008). A general decision theoretic formulation of procedures controlling FDR and FNR from a Bayesian perspective. Statistica Sinica, 18, 925–945.
Scott, J. G., Kelly, R. C., Smith, M. A., Zhou, P., & Kass, R. E. (2015). False discovery rate regression: an application to neural synchrony detection in primary visual cortex. Journal of the American Statistical Association, 110, 459–471.
Segura, V., Vilhjálmsson, B. J., Platt, A., Korte, A., Seren, Ü., Long, Q., & Nordborg, M. (2012). An efficient multi-locus mixed-model approach for genome-wide association studies in structured populations. Nature Genetics, 44, 825–830.
Stein, J. L., Hua, X., Lee, S., Ho, A. J., Leow, A. D., Toga, A. W., Saykin, A. J., Shen, L., Foroud, T., Pankratz, N. et al. (2010). Voxelwise genome-wide association study (vGWAS). Neuroimage, 53, 1160–1174.
Storey, J. D. (2002). A direct approach to false discovery rates. Journal of the Royal Statistical Society. Series B, 64, 479–498.
Storey, J. D., Xiao, W., Leek, J. T., Tompkins, R. G., & Davis, R. W. (2005). Significance analysis of time course microarray experiments. Proceedings of the National Academy of Sciences of the United States of America, 102, 12837–12842.
Sun, L., Craiu, R. V., Paterson, A. D., & Bull, S. B. (2006). Stratified false discovery control for large-scale hypothesis testing with application to genome-wide association studies. Genetic Epidemiology, 30, 519–530.
Sun, W., & Cai, T. T. (2007). Oracle and adaptive compound decision rules for false discovery rate control. Journal of the American Statistical Association, 102, 901–912.
Sun, W., & Cai, T. T. (2009). Large-scale multiple testing under dependence. Journal of the Royal Statistical Society. Series B, 71, 393–424.
Sun, W., & Wei, Z. (2011). Multiple testing for pattern identification, with applications to microarray time-course experiments. Journal of the American Statistical Association, 106, 73–88.
Yuan, M., & Kendziorski, C. (2006). Hidden Markov models for microarray time course data in multiple biological conditions. Journal of the American Statistical Association, 101, 1323–1332.
Zablocki, R. W., Schork, A. J., Levine, R. A., Andreassen, O. A., Dale, A. M., & Thompson, W. K. (2014). Covariate-modulated local false discovery rate for genome-wide association studies. Bioinformatics, (p. btu145).
A.2. More simulation results
Figure 6: Oracle One-Way GATE 1: G = 2000, ni = 5, µ2 = 2. The x-axis corresponds to λ, varying from 0.01 to 4. [Panels: FDR, Total Rej, True Rej; legend entries: Gate, GBH, SC, Naive, alpha=0.05.]
Figure 7: Oracle One-Way GATE 1: G = 2000, ni = 5, µ2 = 2.5. The x-axis corresponds to λ, varying from 0.01 to 4. [Panels: FDR, Total Rej, True Rej; legend entries: Gate, GBH, SC, Naive, alpha=0.05.]
Figure 8: Oracle One-Way GATE 1: G = 2000, ni = 5, µ2 = 3. The x-axis corresponds to λ, varying from 0.01 to 4. [Panels: FDR, Total Rej, True Rej; legend entries: Gate, GBH, SC, Naive, alpha=0.05.]
Figure 9: Oracle One-Way GATE 1: G = 200, ni = 50, µ2 = 2. The x-axis corresponds to π2i, varying from 0.05 to 0.95. [Panels: FDR, Total Rej, True Rej; legend entries: Gate, GBH, SC, Naive, alpha=0.05.]
Figure 10: Oracle One-Way GATE 1: G = 200, ni = 50, µ2 = 2.5. The x-axis corresponds to π2i, varying from 0.05 to 0.95. [Panels: FDR, Total Rej, True Rej; legend entries: Gate, GBH, SC, Naive, alpha=0.05.]
Figure 11: Oracle One-Way GATE 1: G = 200, ni = 50, µ2 = 3. The x-axis corresponds to π2i, varying from 0.05 to 0.95. [Panels: FDR, Total Rej, True Rej; legend entries: Gate, GBH, SC, Naive, alpha=0.05.]
Figure 12: Oracle One-Way GATE 1: G = 2000, ni = 50, µ2 = 2. The x-axis corresponds to π2i, varying from 0.05 to 0.95. [Panels: FDR, Total Rej, True Rej; legend entries: Gate, GBH, SC, Naive, alpha=0.05.]
Figure 13: Oracle One-Way GATE 1: G = 2000, ni = 50, µ2 = 2.5. The x-axis corresponds to π2i, varying from 0.05 to 0.95. [Panels: FDR, Total Rej, True Rej; legend entries: Gate, GBH, SC, Naive, alpha=0.05.]
Figure 14: Oracle One-Way GATE 1: G = 2000, ni = 50, µ2 = 3. The x-axis corresponds to π2i, varying from 0.05 to 0.95. [Panels: FDR, Total Rej, True Rej; legend entries: Gate, GBH, SC, Naive, alpha=0.05.]
Figure 15: Oracle One-Way GATE 2: G = 2000, ni = 5, π1 = 0.16. The x-axis corresponds to η varying from 0.003 to 0.05. [Panels: Sel FDR, Total Rej, True Rej; legend entries: Gate, BB, α.]
Figure 16: Oracle One-Way GATE 2: G = 2000, ni = 5, π1 = 0.55. The x-axis corresponds to η varying from 0.003 to 0.05. [Panels: Sel FDR, Total Rej, True Rej; legend entries: Gate, BB, α.]
Figure 17: Oracle One-Way GATE 2: G = 2000, ni = 5, π1 = 0.83. The x-axis corresponds to η varying from 0.003 to 0.05. [Panels: Sel FDR, Total Rej, True Rej; legend entries: Gate, BB, α.]
Figure 18: Oracle One-Way GATE 2: G = 200, ni = 50, π1 = 0.14. The x-axis corresponds to η varying from 0.003 to 0.05. [Panels: Sel FDR, Total Rej, True Rej; legend entries: Gate, BB, α.]
Figure 19: Oracle One-Way GATE 2: G = 200, ni = 50, π1 = 0.5. The x-axis corresponds to η varying from 0.003 to 0.05. [Panels: Sel FDR, Total Rej, True Rej; legend entries: Gate, BB, α.]
Figure 20: Oracle One-Way GATE 2: G = 200, ni = 50, π1 = 0.8. The x-axis corresponds to η varying from 0.003 to 0.05. [Panels: Sel FDR, Total Rej, True Rej; legend entries: Gate, BB, α.]
Figure 21: Oracle One-Way GATE 2: G = 2000, ni = 50, π1 = 0.14. The x-axis corresponds to η varying from 0.003 to 0.05. [Panels: Sel FDR, Total Rej, True Rej; legend entries: Gate, BB, α.]
Figure 22: Oracle One-Way GATE 2: G = 2000, ni = 50, π1 = 0.5. The x-axis corresponds to η varying from 0.003 to 0.05. [Panels: Sel FDR, Total Rej, True Rej; legend entries: Gate, BB, α.]
Figure 23: Oracle One-Way GATE 2: G = 2000, ni = 50, π1 = 0.81. The x-axis corresponds to η varying from 0.003 to 0.05. [Panels: Sel FDR, Total Rej, True Rej; legend entries: Gate, BB, α.]