Multiple Testing Methods For ChIP-Chip High Density Oligonucleotide Array Data

Sündüz Keleş
Department of Statistics and of Biostatistics & Medical Informatics

University of Wisconsin, Madison

BIRS Workshop, Statistical Science for Genome Biology, August 14-19, 2004

08-18-04

Acknowledgements

Joint work with
• Mark J. van der Laan, Division of Biostatistics, UC Berkeley.
• Sandrine Dudoit, Division of Biostatistics, UC Berkeley.
• Simon E. Cawley, Affymetrix.

Thanks to
• Tom Gingeras and Stefan Bekiranov, Affymetrix.
• Siew Leng Teng, Division of Biostatistics, UC Berkeley.

Outline

• Overview of ChIP-Chip experiments.
• Spatial data structure of ChIP-Chip experiments: blips.
• ChIP-Chip data for transcription factor p53.
• Multiple hypotheses testing procedures to identify blips, i.e., bound probes.
• A model selection framework for determining the blip size.
• Application to ChIP-Chip data of transcription factor p53.
• Conclusions and ongoing work.

ChIP-Chip high density oligonucleotide array data: a new type of genomic data

• Chromatin immunoprecipitation (ChIP) is a procedure for investigating interactions between proteins and DNA. Coupled with whole-genome DNA microarrays (Chip), it facilitates the determination of the entire spectrum of in vivo DNA binding sites for any given protein.
• Data structure of ChIP-Chip experiments:
  (1) With two-color spotted microarrays: a signal is measured for each intergenic sequence (regulatory region) (Ren et al., 2000).
  (2) With high density oligonucleotide arrays: a signal is measured for each probe (25mer) (Cawley et al., 2004).
• Two-step analysis:
  (1) Identification of bound probes, i.e., regulatory regions.
  (2) Search for common regulatory motifs, i.e., exact binding site(s), in these sequences.

ChIP-Chip experiments

1. Cross-link DNA and target protein.
2. Sonicate DNA to ~1kb.
3. IP step: add specific antibody and immunoprecipitate.
4. Reverse cross-links and purify DNA.
5. Amplify, label, and hybridize to microarray.

ChIP-Chip experiments: spatial structure (blips)

• Probes are ordered according to their locations on the genome. [Figure labels: 25bp, 35bp.]
• A DNA fragment of ~1kb carries the bound transcription factor. DNA is separated from the protein, and the ~1kb regions are fragmented into segments of 50-100bp.
• The resulting fragments bind to complementary probes.

Figure 1: ChIP-Chip experiments. Details of the IP-enriched DNA hybridization at the probe level.

ChIP-Chip experiments: spatial structure (blips)

[Figure: four panels, at locations 15703036, 24341295, 15643916, and 11700329; y-axis: test statistic; x-axis: probe no.]

Figure 2: ChIP-Chip experiments: spatial structure. Plot of the two-sample Welch t-statistics around four different locations on chromosome 21. x-axis: probe index.

ChIP-Chip experiments: spatial structure (blips)

[Figure: four panels, at locations 15703036, 24341295, 15643916, and 11700329; y-axis: test statistic; x-axis: genomic location.]

Figure 3: ChIP-Chip experiments: spatial structure. Plot of the two-sample Welch t-statistics around four different locations on chromosome 21. x-axis: genomic location.

ChIP-Chip experiments of Cawley et al. (2004)

• ChIP-Chip data for three transcription factors: p53, cMyc, Sp1.
• ∼1.1 million 25-mer probe-pairs (PM, MM), spanning non-repeat sequences of human chromosomes 21 and 22, distributed across three Affymetrix chips.
• Target DNA samples from cell lines HCT1116 (p53) and Jurkat (cMyc, Sp1).
• Control DNA samples:
  – Whole cell extraction: skip IP step (positive).
  – ControlGST: bacterial antibody at IP step (negative).
• For each TF and control, there are six technical replicates consisting of three hybridization replicates for each of two IP replicates.

Multiple testing procedures for identifying bound probes

$X_{i,j,k}$: quantile normalized (Bolstad et al., 2003) $\log_2(PM)$ value of the $i$-th probe in the $k$-th replicate of the $j$-th group, $i \in \{1, \dots, \sim 1.1 \text{ million}\}$, $j \in \{1, 2\}$, $k \in \{1, \dots, n_j\}$, $n_1 = n_2 = 6$.

$$\bar{Y}_{i,j} = \frac{1}{n_j} \sum_{k=1}^{n_j} X_{i,j,k}, \quad j \in \{1, 2\}.$$

Let $\mu_i = \mu_{2,i} - \mu_{1,i}$ be the mean $\log_2(PM)$ difference between the IP-enriched ($j = 2$) and control ($j = 1$) DNA hybridizations for probe $i$. For each probe $i \in \{1, \dots, \sim 1.1 \text{ million}\}$, we have:

$$H_{0,i}: \mu_i = 0, \qquad H_{1,i}: \mu_i > 0.$$

Multiple testing procedures for identifying bound probes: blips

Two-sample Welch t-statistic:

$$T_{i,n} = \frac{\bar{Y}_{i,2} - \bar{Y}_{i,1}}{\sqrt{\hat{\sigma}_{i,1}^2/n_1 + \hat{\sigma}_{i,2}^2/n_2}}$$

To take into account the blip structure, consider the following scan test statistics:

$$T^*_{i,n} = \frac{1}{w} \sum_{h=i}^{i+w-1} T_{h,n}, \quad i \in \{1, \dots, N - w + 1\},$$

where $T_{h,n}$ is the two-sample Welch t-statistic for probe $h$.

⟹ Aims to borrow strength across a blip of size w when testing the null hypothesis for a given probe: rejections become easier in the vicinity of bound regions and harder around unbound regions.
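The two statistics on this slide can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names `welch_t` and `scan_statistic`, the array layout (one row per probe, one column per replicate), and the toy data are hypothetical and not taken from the talk.

```python
import numpy as np

def welch_t(control, treatment):
    """Two-sample Welch t-statistic for each probe.

    control, treatment: arrays of shape (N, n1) and (N, n2) holding
    normalized log2(PM) values, one row per probe (assumed layout).
    """
    n1, n2 = control.shape[1], treatment.shape[1]
    mean_diff = treatment.mean(axis=1) - control.mean(axis=1)
    # ddof=1 gives the usual unbiased sample variance.
    se = np.sqrt(control.var(axis=1, ddof=1) / n1
                 + treatment.var(axis=1, ddof=1) / n2)
    return mean_diff / se

def scan_statistic(t, w):
    """Average of w consecutive t-statistics; output has length N - w + 1."""
    kernel = np.ones(w) / w
    return np.convolve(t, kernel, mode="valid")

# Toy data: 50 probes, 6 replicates per group, one hypothetical blip.
rng = np.random.default_rng(0)
control = rng.normal(0.0, 1.0, size=(50, 6))
treatment = rng.normal(0.0, 1.0, size=(50, 6))
treatment[20:30] += 2.0
t = welch_t(control, treatment)
t_star = scan_statistic(t, w=10)
```

Entry `t_star[i]` averages `t[i]` through `t[i + w - 1]`, so a window that sits entirely inside a bound region pools the evidence of all w probes.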

Type I error rates

$V_n$: number of falsely rejected hypotheses. $R_n$: total number of rejected hypotheses.

• Family-wise error rate (FWER): probability of at least one false rejection, $\mathrm{FWER} \equiv \Pr(V_n \ge 1)$.
• Tail probability for the proportion of false positives (TPPFP): probability that the proportion $V_n/R_n$ of false positives among the rejected hypotheses exceeds a user-supplied value $q$, $\mathrm{TPPFP} \equiv \Pr(V_n/R_n > q)$, $q \in (0, 1)$.
• False discovery rate (FDR): expected value of the proportion $V_n/R_n$ of false positives among the rejected hypotheses, $\mathrm{FDR} \equiv E[V_n/R_n]$, where $V_n/R_n \equiv 0$ if $R_n = 0$.
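In a simulation, where the true status of every hypothesis is known, the three error rates above can be estimated by averaging over repeated runs. A minimal sketch, assuming `v` and `r` hold the per-run counts $V_n$ and $R_n$; the function name and the toy counts are hypothetical.

```python
import numpy as np

def empirical_error_rates(v, r, q=0.05):
    """Estimate FWER, TPPFP and FDR from repeated simulation runs.

    v[b]: number of false rejections in run b.
    r[b]: total number of rejections in run b.
    """
    v = np.asarray(v, dtype=float)
    r = np.asarray(r, dtype=float)
    # Proportion of false positives, set to 0 when nothing is rejected.
    prop = np.divide(v, r, out=np.zeros_like(v), where=r > 0)
    return {
        "FWER": float(np.mean(v >= 1)),
        "TPPFP": float(np.mean(prop > q)),
        "FDR": float(np.mean(prop)),
    }

# Four toy runs: proportions of false positives are 0, 0.1, 0 and 0.04.
rates = empirical_error_rates(v=[0, 1, 0, 2], r=[10, 10, 0, 50], q=0.05)
```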

Controlling the FWER: Bonferroni adjustment

Assumptions: Under the null hypothesis,
• The test statistics have the same marginal null distribution.
• $X_{i,j,k} \sim N(0, \sigma_j^2)$, $j = 1, 2$.

FWER:

$$P_{Q_0}\left( \max_{i \in \{1, \dots, N - w + 1\}} T^*_{i,n} > c \right) \le \alpha,$$

where $\alpha$ is the nominal Type I error rate and $c$ is an unknown common cut-off.

Bonferroni adjustment: Let $G_0$ represent the null distribution of the scan test statistics, i.e., the null distribution of the r.v. $T^* = \frac{1}{w} \sum_{h=1}^{w} T_h$. The Bonferroni adjusted cut-off is given by

$$c_B = G_0^{-1}\left(1 - \alpha/(N - w + 1)\right).$$
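The Bonferroni cut-off follows from the union bound together with the common marginal null distribution. The short derivation below is a sketch that additionally assumes $G_0$ is continuous, so that the quantile is attained exactly.

```latex
\begin{align*}
P_{Q_0}\Big(\max_{i} T^*_{i,n} > c\Big)
  &\le \sum_{i=1}^{N-w+1} P_{Q_0}\big(T^*_{i,n} > c\big)
   = (N-w+1)\,\{1 - G_0(c)\}, \\
(N-w+1)\,\{1 - G_0(c_B)\} = \alpha
  &\iff c_B = G_0^{-1}\Big(1 - \frac{\alpha}{N-w+1}\Big).
\end{align*}
```

The bound ignores the strong dependence between overlapping windows, which is why it is conservative for $w > 1$.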

Controlling the FWER: Nested-Bonferroni adjustment

• The nested-Bonferroni adjustment is given by

$$c_{NB} = F_0^{-1}(1 - \alpha/K),$$

where $F_0$ is the null distribution of the test statistic $Z = \max_{i \in \{1, \dots, w\}} T_i^*$ and $K = \left\lceil \frac{N - w + 1}{w} \right\rceil$.
• The nested-Bonferroni adjustment is less conservative than the Bonferroni adjustment: $c_{NB} \le c_B$.
• The corresponding null distributions can be estimated by parametric bootstrap (using the normality assumption for control and treatment groups under the null hypothesis and simulating the corresponding random variables). For the Bonferroni adjustment, a normal approximation is also possible.
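A parametric bootstrap for both cut-offs might look as follows. This is a sketch, not the implementation used in the talk: it assumes unit variances, $n_1 = n_2 = 6$, independent probes under the null, and a much smaller number of bootstrap draws than the talk uses; `null_welch_t` and `bootstrap_cutoffs` are hypothetical names.

```python
import math
import numpy as np

def null_welch_t(rng, shape, n1=6, n2=6, sigma1=1.0, sigma2=1.0):
    """Draw Welch t-statistics under the normal null, one per entry of shape."""
    x1 = rng.normal(0.0, sigma1, size=shape + (n1,))
    x2 = rng.normal(0.0, sigma2, size=shape + (n2,))
    se = np.sqrt(x1.var(axis=-1, ddof=1) / n1 + x2.var(axis=-1, ddof=1) / n2)
    return (x2.mean(axis=-1) - x1.mean(axis=-1)) / se

def bootstrap_cutoffs(N, w, alpha=0.05, B=20000, seed=0):
    """Estimate the Bonferroni and nested-Bonferroni cut-offs by simulation."""
    rng = np.random.default_rng(seed)
    # 2w - 1 consecutive probes yield w overlapping windows of size w.
    t = null_welch_t(rng, (B, 2 * w - 1))
    windows = np.lib.stride_tricks.sliding_window_view(t, w, axis=1)
    t_star = windows.mean(axis=-1)                 # shape (B, w)
    K = math.ceil((N - w + 1) / w)
    # Quantile of G0 (a single scan statistic).
    c_b = float(np.quantile(t_star[:, 0], 1 - alpha / (N - w + 1)))
    # Quantile of F0 (maximum over w adjacent scan statistics).
    c_nb = float(np.quantile(t_star.max(axis=1), 1 - alpha / K))
    return c_b, c_nb

c_b, c_nb = bootstrap_cutoffs(N=200, w=5)
c_b1, c_nb1 = bootstrap_cutoffs(N=100, w=1, B=5000)   # the two coincide for w = 1
```

Both quantiles lie far in the tail, so the number of draws B has to be large relative to $(N - w + 1)/\alpha$ for the estimates to be stable; the small B here only keeps the example quick.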

Procedures for controlling different Type I error rates

• For control of the FWER: B-FWER, NB-FWER

| | Bonferroni | Nested Bonferroni |
|---|---|---|
| Null dist | $G_0$: c.d.f. of the r.v. $T^* = (1/w) \sum_{h=1}^{w} T_h$ | $F_0$: c.d.f. of the r.v. $Z = \max_{h \in \{1, \dots, w\}} T_h^*$ |
| cut-off $c$ | $G_0^{-1}(1 - \alpha/(N - w + 1))$ | $F_0^{-1}(1 - \alpha/K)$, where $K = \lceil (N - w + 1)/w \rceil$ |
| Estimation of the null dist | Parametric bootstrap or normal approximation | Parametric bootstrap |

They are equivalent when w = 1.

• For control of the TPPFP: augmentation procedure of van der Laan et al. (2004). VDP-TPPFP
• For control of the FDR: Benjamini and Hochberg (1995). BH-FDR
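The last two procedures can be illustrated on a vector of p-values. The sketch below implements the standard Benjamini-Hochberg step-up rule and a simple form of augmentation, in which the most significant not-yet-rejected hypotheses are added to a FWER-controlling rejection set for as long as the added fraction stays at or below q. Working with p-values, and the names `bh_reject` and `augment_for_tppfp`, are assumptions made for this example; the talk itself works with cut-offs on the scan statistics.

```python
import math
import numpy as np

def bh_reject(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure; returns a boolean rejection mask."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])   # largest sorted index meeting the bound
        reject[order[:k + 1]] = True
    return reject

def augment_for_tppfp(pvalues, fwer_reject, q=0.05):
    """Add the next most significant hypotheses to a FWER rejection set."""
    p = np.asarray(pvalues, dtype=float)
    reject = np.asarray(fwer_reject, dtype=bool).copy()
    r0 = int(reject.sum())
    remaining = np.nonzero(~reject)[0]
    # Largest a with a / (a + r0) <= q; the small constant guards against
    # floating-point round-off, and a is capped by the hypotheses left.
    a = min(math.floor(q * r0 / (1 - q) + 1e-12), remaining.size)
    extra = remaining[np.argsort(p[remaining])[:a]]
    reject[extra] = True
    return reject

p_small = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bh_mask = bh_reject(p_small, alpha=0.05)

p_grid = np.linspace(0.0001, 0.9, 100)
fwer_mask = np.arange(100) < 40          # pretend FWER control rejected 40
aug_mask = augment_for_tppfp(p_grid, fwer_mask, q=0.05)
```

With 40 initial rejections and q = 0.05, two further hypotheses can be added, since 2/42 ≤ 0.05 while 3/43 > 0.05.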

Simulation studies

• ∼ N probes with $n_1 = 6$ control and $n_2 = 6$ treatment observations.
• Non-blip and blip data are generated from distributions $N(\mu_0, \sigma_0)$ and $N(\mu_1, \sigma_1)$, respectively.

| Simulation | N | w | # blips | $(\mu_0, \sigma_0)$ | $(\mu_1, \sigma_1)$ |
|---|---|---|---|---|---|
| 0 | 2000 | 10 | 12 | (0, 1) | (2, 0.75) |
| I | 2000 | 10 | 12 | (0, 1) | (2, 0.75) |
| II | 2000 | 10 | 12 | (0, 1) | (1.5, 1) |
| III | 2000 | ∼ Uniform[5, 16] | 12 | (0, 1) | (1.5, 1) |
| IV | 3000 | ∼ Truncated gamma(10, 1) | 20 | (0, 1) | (1.5, 1) |

Table 1: Summary of the simulation settings.

• Estimation of the null distribution of the test statistics is based on B = 100,000 observations.
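One way to generate data for a fixed-w setting such as Simulation I is sketched below. Several details are assumptions, because the slide does not spell them out: only the treatment observations inside a blip are drawn from $N(\mu_1, \sigma_1)$, the second parameter is treated as a standard deviation, and blips are placed in distinct blocks of length w. The function name `simulate_probes` is hypothetical.

```python
import numpy as np

def simulate_probes(N=2000, w=10, n_blips=12, mu0=0.0, sigma0=1.0,
                    mu1=2.0, sigma1=0.75, n1=6, n2=6, seed=0):
    """Simulate control and treatment intensities with non-overlapping blips."""
    rng = np.random.default_rng(seed)
    control = rng.normal(mu0, sigma0, size=(N, n1))
    treatment = rng.normal(mu0, sigma0, size=(N, n2))
    bound = np.zeros(N, dtype=bool)
    # Choose distinct blocks of length w so that blips cannot overlap.
    blocks = rng.choice(N // w, size=n_blips, replace=False)
    for b in blocks:
        start = b * w
        bound[start:start + w] = True
        # Assumption: only treatment observations shift inside a blip.
        treatment[start:start + w] = rng.normal(mu1, sigma1, size=(w, n2))
    return control, treatment, bound

control, treatment, bound = simulate_probes()
```

The boolean vector `bound` records the truth for each probe, which is what makes it possible to count correct and false rejections afterwards.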

Simulation 0: Comparison of the actual Type I error rates

Table columns: w, Method (B or N), NB-FWER, B-FWER, VDP-TPPFP, BH-FDR. Table rows: w ∈ {1, 2, 5, 10, 20}.

• w = 1, B: NB-FWER 0.042, B-FWER 0.042, VDP-TPPFP 0.042, BH-FDR 0.0440.
• Remaining entries (w = 2, 5, 10, 20), in extraction order, with cell assignment not recoverable: 0.042, 0.032, 0.05, 0.036, 0.04, 0.024, 0.00, 0.014, 0.026, 0.0459, 0.0559, 0.002, 0.054, 0.034, 0.0476, 0.0719, 0.124, 0.002, 0.326, 0.028, 0.0451, 0.0449, 0.0498, 0.004, 0.0415, 0.0449.

Table 2: B: Bootstrap, N: Normal approximation, α = 0.05.

Simulation 0: w = 10

[Figure: two boxplot panels, "number of rejections" and "number of correct rejections", each comparing BH-FDR, VDP-TPPFP, NB-FWER, and B-FWER.]

Figure 4: Boxplot of the number of rejections and number of correct rejections with a blip size of w = 10 for NB-FWER, B-FWER, VDP-TPPFP, BH-FDR.

Summary of the simulations I, II, III, IV

[Figure: four panels (Simulation I, Simulation II, Simulation III, Simulation IV); x-axis: sensitivity; y-axis: specificity.]

Figure 5: Simulations I, II, III and IV. Specificity versus sensitivity plots. Plotting symbols distinguish the procedures NB-FWER, VDP-TPPFP (△), and BH-FDR (+). Different colors represent different assumed blip sizes: w = 1, w = 2, w = 5, w = 10, and w = 20.

Determining the blip size

• The multiple testing procedures considered are indexed by the parameter w, i.e., the blip size.

[Figure: consecutive 25bp probes separated by 10bp gaps (35bp from the start of one probe to the start of the next), along a ~1kb fragment.]

• Theoretical calculation for the blip size: $25w + 10(w - 1) = 1000 \Rightarrow w \approx 30$ probes.
• Empirical plots of the data suggest a smaller blip size: w ≈ 10 probes.
• A model selection framework for selecting the blip size.
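The theoretical blip size comes from counting how many 25bp probes, each followed by a 10bp gap except the last, fit into a 1kb fragment. The arithmetic, written out:

```latex
\begin{align*}
25w + 10(w - 1) &= 1000 \\
35w &= 1010 \\
w &= 1010/35 \approx 28.9
\end{align*}
```

This is the value rounded to roughly 30 probes on the slide.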

Determining the blip size: Piecewise constant mean regression model for the intensity signal

• Let $(Y_i, L_i)$, $i \in \{1, \dots, N\}$, represent the data on N probes, where $Y_i$ is the two-sample Welch t-statistic and $L_i$ is the genomic location for probe $i$.
• Recall that we have two groups of interest: bound and unbound classes.
• Assume $E[Y_i] = I(L_i \notin A)\mu_0 + I(L_i \in A)\mu_1$, where $A$ represents the group of bound probes.
• Estimation: Given the blip start sites, $\mu_0$ and $\mu_1$ can be estimated by ordinary least squares. Use a forward stepwise algorithm to estimate the blip start sites.
• How many blips for a given w?
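The estimation step can be sketched as follows. With a single indicator regressor, the least squares estimates of $\mu_0$ and $\mu_1$ are the two group means, and a greedy forward search adds one blip start site at a time. The sketch makes simplifying assumptions that are not from the talk: probes are indexed by position rather than by genomic location, the search is exhaustive over all start sites, and the names `fit_two_means` and `forward_stepwise` are hypothetical.

```python
import numpy as np

def fit_two_means(y, starts, w):
    """Least squares fit of the two-level model for given blip start indices.

    Returns (mu0, mu1, rss).
    """
    bound = np.zeros(y.size, dtype=bool)
    for s in starts:
        bound[s:s + w] = True
    mu1 = y[bound].mean() if bound.any() else 0.0
    mu0 = y[~bound].mean() if (~bound).any() else 0.0
    rss = float(np.sum((y - np.where(bound, mu1, mu0)) ** 2))
    return mu0, mu1, rss

def forward_stepwise(y, w, n_blips):
    """Greedily add the start site that lowers the residual sum of squares most."""
    starts = []
    for _ in range(n_blips):
        candidates = [s for s in range(y.size - w + 1) if s not in starts]
        best = min(candidates,
                   key=lambda s: fit_two_means(y, starts + [s], w)[2])
        starts.append(best)
    return starts

# Noise-free toy signal with one blip of 5 probes starting at index 20.
y = np.zeros(60)
y[20:25] = 5.0
starts = forward_stepwise(y, w=5, n_blips=1)
mu0, mu1, rss = fit_two_means(y, starts, w=5)
```

The exhaustive search costs on the order of N fits per added blip, so for array-scale data one would restrict the candidate start sites or update the group sums incrementally.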

Monte-Carlo cross-validation

• One observation for each probe, i.e., one realization of the test statistics, $Y_i \equiv T_{i,n}$.

[Figure: the six hybridizations, B1H1, B1H2, B1H3 under B1 and B2H1, B2H2, B2H3 under B2.]

Figure 6: Probe level data: B1: IP replicate 1, B2: IP replicate 2, and Hk represents the k-th hybridization replicate.

• Training sample: 4 hybridizations, 2 from each of B1 and B2.
• Validation sample: the remaining 2 hybridizations, 1 from each of B1 and B2.
• There are 9 different ways to divide up the data in this manner.
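The 9 splits can be enumerated directly. The sketch assumes that each training sample takes two of the three hybridizations from each IP replicate and that the left-over hybridization from each replicate forms the validation sample, which is the reading that yields 3 × 3 = 9 splits.

```python
from itertools import combinations, product

hybs = {"B1": ["B1H1", "B1H2", "B1H3"], "B2": ["B2H1", "B2H2", "B2H3"]}

splits = []
for train_b1, train_b2 in product(combinations(hybs["B1"], 2),
                                  combinations(hybs["B2"], 2)):
    training = list(train_b1) + list(train_b2)
    # Whatever is not used for training is held out for validation.
    validation = [h for h in hybs["B1"] + hybs["B2"] if h not in training]
    splits.append((training, validation))
```

Each split would then be used to recompute the test statistics on the training hybridizations, fit the piecewise constant model, and evaluate its risk on the statistics from the validation hybridizations.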

Cross-validated risk over 500 blips on chip A

[Figure: two panels; x-axis: number of blips; y-axis: cross-validated risk; one curve for each of w = 1, w = 2, w = 10, w = 20, w = 30.]

Figure 7: Left panel: Cross-validated risk over 500 blips with five different blip sizes, w ∈ {1, 2, 10, 20, 30}. Right panel: Zooming into the first 30 blips.

[Figure: one panel per identified blip (panels labeled blip-1, blip-2, ...); x-axis: loc; y-axis: t-stat.]

Figure 8: p53 ChIP-Chip data. Blips identified on chip A using the NB-FWER multiple testing procedure with an assumed blip size of w = 2. The 28 blips displayed are identified by controlling the FWER using the NB-FWER procedure at the nominal level α = 0.05.

Control of the FWER for chip A

| | w = 1 | w = 2 | w = 10 | w = 20 | w = 30 |
|---|---|---|---|---|---|
| # blips identified | 28 | 22 | 14 | 10 | 8 |
| # real blips | 8 | 10 | 13 | 10 | 8 |

Table 3: Multiple testing procedures applied to Chip A. Number of real blips identified by visual inspection. A real blip refers to a small cluster of probes (> 1 probes) that has test statistics greater than its surroundings.

Results on p53 (α = 0.05, q = 0.05)

| Annotation | NB-FWER | VDP-TPPFP | BH-FDR |
|---|---|---|---|
| 1kb 5' UTR | 6 | 6 | 21 |
| 3kb 5' UTR | 14 | 14 | 47 |
| 1kb CpG | 17 | 22 | 86 |
| 3kb CpG | 39 | 45 | 162 |
| Within a gene | 87 | 93 | 231 |
| Within an exon | 1 | 1 | 15 |
| Total | 254 | 269 | 719 |

Table 4: Annotation of the chromosomal regions identified by the multiple testing procedures. 12 of the 15 additional blips identified by VDP-TPPFP fall into potential regulatory regions.

Results on p53 (α = 0.05, q = 0.05)

| w | Procedure | 1kb of 5' | 3kb of 5' | 1kb of CpG | 3kb of CpG | WCR | WE | Total |
|---|---|---|---|---|---|---|---|---|
| 1 | NB-FWER | 1 | 3 | 6 | 13 | 37 | 6 | 128 |
| 1 | VDP-TPPFP | 1 | 3 | 6 | 13 | 39 | 7 | 134 |
| 1 | BH-FDR | 14 | 29 | 31 | 75 | 195 | 18 | 553 |
| 10 | NB-FWER | 6 | 14 | 17 | 39 | 87 | 1 | 254 |
| 10 | VDP-TPPFP | 6 | 14 | 22 | 45 | 93 | 1 | 269 |
| 10 | BH-FDR | 21 | 47 | 86 | 162 | 231 | 15 | 719 |
| 20 | NB-FWER | 5 | 11 | 13 | 27 | 55 | 2 | 188 |
| 20 | VDP-TPPFP | 6 | 11 | 13 | 28 | 60 | 2 | 208 |
| 20 | BH-FDR | 9 | 23 | 32 | 68 | 112 | 4 | 355 |
| 30 | NB-FWER | 2 | 4 | 7 | 23 | 33 | 0 | 145 |
| 30 | VDP-TPPFP | 2 | 4 | 7 | 23 | 34 | 0 | 149 |
| 30 | BH-FDR | 3 | 7 | 15 | 38 | 63 | 1 | 225 |

Results on p53 (α = 0.05, q = 0.05)

• Cawley et al. (2004) identified 48 potential p53 binding regions and verified 14 of these using RT-PCR. 23 of our 221 blips overlap with these.
• Our blips include 13 of these experimentally verified regions and 49 additional blips that show at least as high hybridization signal as this verified group.
• Among these 48, only 1 contains an exact copy of the p53 consensus binding sequence and none of the verified 14 have consensus matching sequences.
• Among our 221 blips, 4 of them have an exact copy of the p53 consensus sequence.

Results on p53

| Annotation | | Our 221 blips | 48 blips by Cawley et al. (2004) |
|---|---|---|---|
| 1kb 5' UTR | # blips | 5 | 0 |
| | % blips | 2 | 0 |
| 1kb CpG | # blips | 17 | 8 |
| | % blips | 8 | 17 |
| p53 consensus sequence | # blips | 4 | 1 |
| | % blips | 2 | 2 |
| Within an ORF | # blips | 81 | |
| | % blips | 37 | ≤ 36* |

*: Average over 3 transcription factors and includes 5kb downstream of the 3' terminal exon.

p53 consensus binding sequence

• Consists of the following arrangement of the consensus DNA sequence RRRCW (▷) and its reverse complement WGYYY (◁): RRRCWWGYYY[0-15]RRRCWWGYYY, i.e., ▷◁ − ▷◁ with spacer − ∈ [0, 15].
• Wang et al. (1995) showed that the tetrameric p53 protein can bind to various arrangements of multiple copies of the consensus RRRCW.
• Inga et al. (2002) showed that sites with as many as 4bp mismatches to the 20mer consensus could be functional and enable high levels of transactivation.
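Exact copies of the consensus arrangement can be located with a regular expression once the IUPAC codes are expanded (R = A or G, W = A or T, Y = C or T). A small sketch; the helper names and the test sequences are made up for illustration, and mismatches are not handled.

```python
import re

# IUPAC nucleotide codes that occur in the consensus.
IUPAC = {"R": "[AG]", "W": "[AT]", "Y": "[CT]", "C": "C", "G": "G"}

def to_regex(motif):
    """Expand an IUPAC motif into a regular expression fragment."""
    return "".join(IUPAC[ch] for ch in motif)

HALF_SITE = to_regex("RRRCWWGYYY")
# Two 10-mers separated by a spacer of 0 to 15 arbitrary bases.
FULL_SITE = re.compile(HALF_SITE + "[ACGT]{0,15}" + HALF_SITE)

def has_full_site(seq):
    """True if seq contains an exact copy of the full consensus arrangement."""
    return FULL_SITE.search(seq.upper()) is not None

with_site = "TT" + "AGACATGCCT" + "ACGT" + "GGGCTAGTTC" + "AA"
without_site = "AGACATGCCT" + "A" * 30
found = has_full_site(with_site)
missing = has_full_site(without_site)
```

Because RRRCWWGYYY is its own reverse complement at the level of the degenerate pattern, scanning one strand is enough for the exact-match case.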

Enrichment for p53 consensus binding sequence

| Arrangement | verified | filtered | all |
|---|---|---|---|
| ▷◁ − ▷, ▷◁ − ◁, ▷ − ▷◁, ◁ − ▷◁ | 7/13 | 21/49 | 86/221 |
| ▷◁ | 8/13 | 33/49 | 118/221 |
| ▷◁ − ▷◁ with at most 2 mismatches | 7/13 | 35/49 | 141/221 |

Table 5: Occurrences of various arrangements of the 5mer RRRCW (▷) and its reverse complement WGYYY (◁) among the 13 experimentally verified blips of Cawley et al. (2004), our 49 filtered blips that show higher hybridization signal than the experimentally verified blips, and all of our 221 blips.

Summary

• The scan statistic allows incorporation of the spatial data structure into multiple testing procedures.
• Identified blips show enrichment in terms of various arrangements of the p53 partial consensus sequence RRRCW as well as enrichment for potential promoter regions.
• Monte-Carlo cross-validation in a piecewise constant regression model provides a guide for choosing the appropriate blip size.
• More ChIP-Chip data will become available as part of the ENCODE project.

Some other issues related to ChIP-Chip data

• Type of controls: whole cell extract versus mock IP experiments.
• Size and spacing of the arrayed elements: design of the arrays for IP-enriched DNA hybridization.
• Detailed characterization of the spatial structure: fragment length distribution as a result of sonication.

References

• S. E. Cawley et al. (2004). Unbiased mapping of transcription factor binding sites along human chromosomes 21 and 22 points to widespread regulation of noncoding RNAs. Cell 116: 499-509.
• S. Keleş, M. J. van der Laan, S. Dudoit, and S. E. Cawley (2004). Multiple Testing Methods for ChIP-Chip High Density Oligonucleotide Array Data. http://www.bepress.com/ucbbiostat/paper147/
• M. J. Buck and J. D. Lieb (2004). ChIP-Chip: considerations for the design, analysis, and application of genome-wide chromatin immunoprecipitation experiments. Genomics 83(3): 349-60.

EXTRA SLIDES


Results on p53 (α = 0.05, q = 0.05)

[Figure: percentage (0 to 100%) of identified regions in each annotation category (1kb 5' UTR, 3kb 5' UTR, 1kb CpG, 3kb CpG, Within a gene, Within an exon) for NB-FWER, VDP-TPPFP, and BH-FDR.]

Figure 9: Annotation of the chromosomal regions identified by the multiple testing procedures.

Results on p53 (α = 0.05, q = 0.05)

| Annotation | 254 blips | 211 blips | 179 blips | 49 blips |
|---|---|---|---|---|
| 1kb 5' UTR | 6 | 6 | 5 | 0 |
| 3kb 5' UTR | 14 | 12 | 9 | 2 |
| 1kb 3' UTR | 2 | 1 | 1 | 1 |
| 3kb 3' UTR | 8 | 6 | 4 | 2 |
| 1kb CpG | 17 | 13 | 13 | 2 |
| 3kb CpG | 39 | 30 | 28 | 4 |
| Within a gene | 87 | 71 | 66 | 10 |
| Within an exon | 1 | 1 | 1 | 0 |

Table 6: Annotation of the post-processed chromosomal regions identified by the NB-FWER procedure.
