Supplementary Figures - Nature

[Supplementary Figure 1 panels. (a) S2 whole chromatin (input) and (b) S2 H3 ChIP: density (0.0 to 0.01) versus fragment length (100 to 500) for MNase 1.5U, 6.25U, 25U and 100U, with 147 marked. (c) Digestion fragments at stable nucleosome positions (nucleosomes at 1.5U and at 100U) from the 1.5U and 100U profiles, with 147bp marked.]
Supplementary Figure 1. Characteristics of digestion fragments in S2 cells. (a,b) Distribution of the digestion fragment lengths in input (a) and H3 ChIP (b) libraries generated for S2 cells. (c) Nucleosomes detectable under light digestion conditions protect loci of subnucleosomal size under deep digestion. Fragments produced in the 100U digestion that are associated with the stable nucleosome positions identified in either the 1.5U profile (red) or the 100U profile (blue) were compared in terms of their lengths. The fragments associated with 1.5U nucleosomes are shorter, with a median below the expected nucleosomal size of 147 bp.

[Supplementary Figure 2 panels. (a) Nucleosome occupancy, H3-ChIP: normalized signal (x10^-3) versus relative position (kb) around the TSS, shown separately for expressed genes and silent genes, with curves for 1.5U, 6.25U, 25U, 100U and pooled data. (b) h-MACC, mean in annotated regions: enhancers, plus TES-prox, gene body, "-gene" (label prefix not recoverable) and promoter regions, each split into high, mod and low exp.]

Supplementary Figure 2. Analysis of H3 ChIP data. (a) MNase-seq profiles around TSS (transcription start sites) for expressed (left) and silent (right) genes.

[Supplementary Figure 4 panels. (a) Pearson's correlations versus c-MACC thresholds (All and cutoffs from >20% to >80%). (b) Length (kb) of inaccessible (Inacc) and accessible (Acc) stretches. (c) Length ratio (obs/rand). (d) Pearson's correlations versus h-MACC thresholds. (e) Heatmap of correlations by region (enhancer, promoter, TES-prox, gene body, genome-wide). (f) Percentage of inaccessible and accessible bins per region and expression class (low, mod, high exp); the bin counts shown include 263,734 and 150,916. Markers named in the panels include H4, H1, NucDens (ex), NucDens (ind. study), H3K27me3, H3K9me3, H3K36me3, H2A.Z, Pellet, Frac80, 80 mM, MeDIP, H3K27Ac, H3.3, H4K16ac, H3K4me1, H3K4me3, CTCF and DHS.]

Supplementary Figure 4. Correlation between c-MACC and h-MACC with other metrics of chromatin structure. (a) Correlations of c-MACC with chromatin markers were computed for all bins (“All”) and for the bins characterized by c-MACC values above specified thresholds (higher than 20% ... 80% of all absolute values). The left panel presents positive correlations and the right panel presents negative correlations. (b) Comparison of the length distributions of continuous stretches of inaccessible and accessible HMM c-MACC states. The distributions of c-MACC states based on actual data (dark blue and purple boxes) are compared to the HMM segmentation based on randomized distributions of c-MACC values (light blue and pink boxes). Blue boxplots correspond to lengths of inaccessible states and purple boxplots to lengths of accessible states. (c) Ratios of the average lengths of continuous stretches of HMM states computed for observed and randomized c-MACC profiles. (d) Correlations of h-MACC with physical properties of chromatin computed as in (a). (e) A heatmap depicting the relation between h-MACC and physical properties of chromatin computed within annotated regions. The values appearing in the heatmap cells represent Pearson’s correlation coefficients multiplied by 100. The color scale encodes the same values, with red and blue standing for positive and negative correlations respectively. The heatmap was clustered by columns. (f) Distribution of h-MACC states in genomic regions. Accessible and inaccessible states were identified with HMM for 300-bp bins. Stacked bars represent fractions of the bins assigned to each state in the corresponding regions. The numbers of bins in each state are shown above the bars.
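The thresholded correlations described in panels (a) and (d) can be illustrated with a minimal sketch. It assumes that each threshold is a quantile of the absolute MACC values and that bins at or above that quantile are retained; the function name `thresholded_pearson`, the quantile grid and the simulated data are hypothetical and are not taken from the original analysis.

```python
import numpy as np

def thresholded_pearson(macc, marker, quantiles=(0.0, 0.2, 0.4, 0.6, 0.8)):
    """Pearson correlation of MACC with a marker, restricted to bins whose
    absolute MACC value is at or above each quantile of all absolute values.
    A quantile of 0.0 keeps every bin (the "All" case)."""
    macc = np.asarray(macc, dtype=float)
    marker = np.asarray(marker, dtype=float)
    abs_macc = np.abs(macc)
    result = {}
    for q in quantiles:
        cutoff = np.quantile(abs_macc, q)   # assumed definition of the threshold
        keep = abs_macc >= cutoff           # bins passing the threshold
        result[q] = float(np.corrcoef(macc[keep], marker[keep])[0, 1])
    return result

# Simulated bins (illustrative only): a marker that partly tracks MACC.
rng = np.random.default_rng(0)
macc = rng.normal(size=5000)
marker = 0.5 * macc + rng.normal(size=5000)
correlations = thresholded_pearson(macc, marker)
```

Restricting to bins with large absolute MACC values removes bins near zero, where noise dominates, which is one reason such a filter may raise the magnitude of the correlation.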

[Supplementary Figure 5 panels: one panel per protein, each plotting c-MACC (y-axis ticks 0, 0.3, 0.5 and 0.8) against position in kb relative to the binding sites. Legible panel titles include dRING, dSFMBT, EZ, GAF, HP1a, HP1c, HP2, JIL1, LBR, LSD1, MLE, MOF, PCL, Rhino, RNA pol II, RPD3, SPT16, WDS, Psc, Ph, Pc, ZW5, XNP, Trx, ISWI, dmTopo II, CTCF, BRE1, CG10630, ASH1 and ACF1.]

Supplementary Figure 5. c-MACC profiles around binding sites of selected proteins. Protein binding data were either generated by the modENCODE consortium or taken from Enderle et al. (Genome Res. 2011 Feb; 21(2):216-26). To identify binding sites, the enrichment z-scores were computed in 300-bp bins genome-wide. Bins with a z-score above 3 were selected as binding sites.
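The binding-site selection described in this caption can be sketched as follows. This is a minimal illustration that assumes the z-score of each bin is computed against the genome-wide mean and standard deviation of the enrichment values, which the caption does not specify; `call_binding_bins` and the simulated enrichment vector are hypothetical.

```python
import numpy as np

def call_binding_bins(enrichment, z_cutoff=3.0):
    """Return per-bin z-scores and the indices of bins whose z-score
    exceeds the cutoff (3 in the caption)."""
    values = np.asarray(enrichment, dtype=float)
    # Assumption: standardize against the mean and std over all bins.
    z = (values - values.mean()) / values.std()
    return z, np.flatnonzero(z > z_cutoff)

# Simulated enrichment in 300-bp bins (illustrative only), with three
# bins given a strong signal.
rng = np.random.default_rng(1)
enrichment = rng.normal(size=10000)
enrichment[[100, 2500, 7000]] += 10.0
z_scores, binding_bins = call_binding_bins(enrichment)
```

The c-MACC profiles in the figure would then be averaged over windows centered on the selected bins.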

[Supplementary Figure 6 panels. (a) Correlation plot of MACC scores (axes from 0 to 2), annotated with Spearman's cor = 0.89 and Pearson's cor = 0.93. (b) chr2L, 27 kb and (c) chr3L, 1,267 kb: tracks for h-MACC (-1 to 2), c-MACC (-1 to 2), H3K27ac (-0.36 to 1.8), H3K27me3 (-1.4 to 0.1), RNA-Seq (0 to 25) and genes (labels include Rab30, CG11266, milt and Mnn1). (d) Heatmap of Pearson's correlation (color scale 0.5 to 0.9) between replicate profiles (r1, r2), including c-MACC.]

Supplementary Figure 6. Similarity between c- and h-MACC. (a) Correlations of MACC scores based on histone H3 and H4 chromatin immunoprecipitation assays (H3- and H4-MACC respectively) and MACC computed for whole chromatin (c-MACC). All MACC profiles were computed for two independent replicates. (b,c) Examples showing the similarity of h-MACC and c-MACC, accompanied by H3K27ac, H3K27me3 and gene expression, at a ~27-kb locus (b) and a ~1,267-kb locus (c). (d) Correlation of the H3- and H4-MACC profiles computed for individual replicates.
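Both Pearson's and Spearman's correlations are used to compare MACC profiles in this figure. The sketch below shows one way to compute them for two per-bin profiles. It is illustrative only: the helper names and the simulated profiles are hypothetical, and the rank-based Spearman implementation assumes no tied values.

```python
import numpy as np

def pearson(x, y):
    """Pearson's correlation coefficient between two profiles."""
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Spearman's correlation as Pearson's correlation of the ranks.
    Assumption: no ties, so a double argsort yields valid ranks."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return pearson(rank_x, rank_y)

# Simulated profiles (illustrative only): two noisy views of a shared signal.
rng = np.random.default_rng(2)
shared = rng.normal(size=2000)
h_macc = shared + 0.3 * rng.normal(size=2000)
c_macc = shared + 0.3 * rng.normal(size=2000)
r_pearson = pearson(h_macc, c_macc)
r_spearman = spearman(h_macc, c_macc)
```

Reporting both coefficients is informative because Spearman's depends only on the ordering of the bins, whereas Pearson's is sensitive to outlying values.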

[Supplementary Figure 7 panels: one panel per protein, each plotting protein signal against position in kb relative to gr1/gr2 peaks. Legible panel titles include Pc, Ph, Rhino, LSD1, EZ, LBR, WDS, MOF, HP2, SPT16, Trx, POF, ZW5, dSFMBT, HP1c, Smc3, JIL1, dRING, PCL, MLE, BRE1, XNP, JHDM1, dmTopo II, RPD3, HP1a, NURF301, CTCF, ISWI, ASH1, GAF and ACF1.]

Supplementary Figure 7. Profiles of protein binding signals around bins characterized as group 1 (blue) and group 2 (orange). Protein binding data were either generated by the modENCODE consortium or taken from Enderle et al. (Genome Res. 2011 Feb; 21(2):216-26).

[Supplementary Figure 8 panels. (a) Heatmap with one row per protein and one column each for group 1 and group 2, colored by fraction; asterisks (*) and hash symbols (#) mark individual proteins. (b) Percentage of overlapped peaks and overrepresentation of overlap relative to expected value (log2), plotted against the Z-score threshold for group 1 and group 2.]

Supplementary Figure 8. Overlap of protein binding sites with the loci from group 1 or group 2. (a) The heatmap shows the fraction of the loci from group 1 and group 2 overlapped by the binding sites of each protein included in the analysis (modENCODE data on protein enrichment were used; a Z-score of 3 was used as the threshold to identify protein binding; see Methods for more details). The asterisk and hash symbols indicate significance of the overlap of the protein binding sites with MACC peaks from group 1 and group 2 respectively (P