SUPPLEMENTARY METHODS and DATA

Histone-derived piRNA biogenesis depends on the ping-pong partners Piwi5 and Ago3 in Aedes aegypti

Erika Girardi1,3,†, Pascal Miesen1,†, Bas Pennings1, Lionel Frangeul2, Maria Carla Saleh2 and Ronald P. van Rij1,*

1 Department of Medical Microbiology, Radboud University Medical Center, Radboud Institute for Molecular Life Sciences, P.O. Box 9101, 6500 HB Nijmegen, The Netherlands.
2 Institut Pasteur, Viruses and RNA interference, CNRS URM 3569, 75724 Paris Cedex 15, France.
3 Current affiliation: Architecture and Reactivity of RNA, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, 67084 Strasbourg, France.


SUPPLEMENTARY MATERIALS AND METHODS

Cells and virus infection
Aag2 cells were cultured at 25°C in Leibovitz’s L-15 medium (Invitrogen) supplemented with 10% heat-inactivated fetal calf serum (PAA), 2% Tryptose Phosphate Broth Solution (Sigma), 1x MEM Non-Essential Amino Acids (Invitrogen), 50 U/ml penicillin and 50 μg/ml streptomycin (Invitrogen). The virus used throughout this study is a Sindbis virus recombinant expressing GFP from a duplicated sub-genomic promoter (pTE-3’2J-GFP, SINV-GFP), which was produced in BHK-21 cells as previously described (3). Aag2 cells were infected with SINV-GFP at a multiplicity of infection (MOI) of 1 for 48 hours.

Mosquito manipulation for small RNA libraries
For small RNA library preparation, field-derived Aedes aegypti mosquitoes originally collected in Nakhon Chum, Muang District, Kamphaeng Phet, Thailand were used within 3 generations of laboratory colonization. Seven-day-old female mosquitoes were allowed to feed on pre-washed rabbit blood meals for 30 minutes at 37°C. After blood feeding, engorged females were incubated at 28°C with 70% humidity for 7 days. Total RNA from a pool of five mosquitoes was isolated with TRIzol (Invitrogen). Size fractionation of small RNAs of 19-33 nt in length was performed as described in (4). Purified RNA was used for library preparation using the NEBNext Multiplex Small RNA Library Prep kit for Illumina (E7300L). Libraries were diluted to 4 nM and sequenced using the NextSeq 500 High Output Kit v2 (75 cycles) on a NextSeq 500 (Illumina, San Diego, CA, USA).

Generation of plasmids and dsRNA production
Insect expression vectors based on the Drosophila Gateway Vector pAGW (kindly provided by the Carnegie Institution for Science) were constructed for N-terminal tagging of proteins with GFP. The full-length coding sequences of Ago3 and Piwi5 were amplified from Aag2 complementary DNA (cDNA) and cloned by recombination downstream of the tag sequences according to the Gateway manufacturer’s instructions (Invitrogen). For dsRNA production, in vitro transcription using T7 RNA polymerase was performed on T7 promoter-flanked PCR products. To allow the formation of double-stranded RNA, the reaction products were heated to 80°C and then gradually cooled to room temperature. Subsequently, the RNA was purified using the GenElute Mammalian Total RNA Miniprep Kit (Sigma) following the manufacturer’s instructions. Primers used for construction of plasmids and dsRNA production are indicated in Table S1.


Northern blotting and qPCR
Small RNA northern blot was performed as described previously (5). Briefly, total RNA was isolated using Isol-RNA Lysis Reagent (5 PRIME) and 5 µg of RNA was separated on a 17.5% polyacrylamide gel, blotted to a nylon membrane (Hybond NX; Amersham) and cross-linked using 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide (EDC; Sigma). For detection of antisense H4piRNAs, 15 µg of total RNA was used. NaIO4 oxidation and β-elimination were performed as described in (5). For northern blot analyses on adult A. aegypti, total RNA from ten male, female, or blood-fed female mosquitoes was used (kindly provided by In2Care, Wageningen, The Netherlands). Hybridization with 32P-labeled DNA oligonucleotides or in vitro-transcribed riboprobes was performed overnight at 42°C. The membrane was washed in 0.1% SDS, 2x SSC, followed by two washing steps in 0.1% SDS, 1x SSC and 0.1% SDS, 0.1x SSC, respectively. All washes were performed at 42°C. For detection of the radioactive signal, the membrane was exposed to a Carestream Kodak Biomax XAR film (Sigma Aldrich). Sequences of northern blot probes are indicated in Table S1.

For quantitative RT-PCR (RT-qPCR), 1 µg of total RNA was DNaseI-treated (Ambion) and reverse-transcribed using the Taqman Reverse transcription kit (Roche) with random primers following the manufacturer’s instructions. qPCR reactions were prepared using GoTaq qPCR SYBR Mastermix (Promega) and measured on a LightCycler 480 (Roche). Expression was internally normalized against the expression of Lysosomal Aspartic Protease (LAP) and the relative mRNA abundance was determined using the ΔΔCt method (3). The primers used for qPCR are indicated in Table S1.

Stem-Loop RT-qPCR for piRNA quantification
For Stem-Loop RT-qPCR assays, 100 ng of total RNA was reverse transcribed in a final volume of 7.5 µl, in the presence of 0.5 µl of each Stem-Loop oligonucleotide (SL_Aae_H4piRNA_A to D, and SL-bantam-3p; 0.75 µM), 1.5 µl 5x First Strand buffer (Invitrogen), 1 µl dNTPs (0.25 mM; Qiagen), 0.125 µl Superscript II Reverse Transcriptase (200 U/µl; Invitrogen) and 0.1 µl RNase Inhibitor (20 U/µl; Applied Biosystems). Reactions were incubated for 30 minutes at 60°C, for 30 minutes at 42°C, and for 5 minutes at 85°C. Samples were placed on ice and were adjusted to 40 µl. Quantification was done by qPCR on a LightCycler 480 (Roche). Briefly, 0.6 µl forward primer (Fw_Aae_H4piRNA_A to D, F-bantam-3p; 10 µM), 0.6 µl universal reverse primer R-univ-sRNAqPCR (10 µM), 3.8 µl MilliQ water and 10 µl GoTaq qPCR Master Mix (2x; Promega) were added to 5 µl RT reaction mix. After an incubation of 5 minutes at 95°C, 40 amplification cycles were performed (10 seconds at 95°C, 20 seconds at 60°C, and 10 seconds at 72°C). Primer sequences are provided in Table S1.
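The ΔΔCt normalization mentioned above (target expression relative to a reference such as LAP, then relative to a control condition such as dsLuc) can be sketched as follows. This is a minimal illustration that assumes roughly 100% amplification efficiency; the function name and the Ct values are hypothetical and are not taken from the study.

```python
def relative_expression(ct_target, ct_reference, ct_target_control, ct_reference_control):
    """Relative abundance by the delta-delta-Ct method (2 ** -ddCt).

    Assumes the amplicon doubles every cycle (about 100% efficiency).
    """
    # Normalize the target to the reference gene within each condition.
    delta_ct_sample = ct_target - ct_reference
    delta_ct_control = ct_target_control - ct_reference_control
    # Compare the treated sample with the control condition.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: target gene and reference gene in a knockdown
# sample, followed by the same two genes in the control sample.
fold = relative_expression(24.0, 18.0, 22.0, 18.0)
print(f"relative expression: {fold:.2f}")
```

The same arithmetic applies when a small RNA such as bantam-3p is used as the reference in the stem-loop assay.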


Strand-specific RT-PCR
For strand-specific RT-PCR assays, cDNA synthesis was performed on 1 µg of DNase I-treated RNA using TaqMan Reverse Transcription Reagents in a 20 μl reaction according to the manufacturer’s instructions (Applied Biosystems), using strand-specific primers tagged with a 5’ T7 promoter sequence (Table S1). Following cDNA synthesis, PCR analysis was performed using a combination of a H4-specific primer and a primer specific for the T7 promoter sequence (Table S1). The following control reactions were run in parallel to each sample: cDNA synthesis without reverse transcriptase was performed to verify the absence of contaminating DNA in RNA preparations; PCR amplification without cDNA template was used to exclude contaminations in PCR reagents.

Immunoprecipitation
Lysates from Aag2 cells expressing GFP-tagged PIWI proteins were incubated with GFP-Trap magnetic beads (Chromotek) according to the manufacturer’s instructions. The immunoprecipitates were washed twice in wash buffer (10 mM Tris-HCl pH 7.5, 150 mM NaCl, 0.5 mM EDTA) and split for either RNA or protein analyses. The bound RNA was isolated from the beads using Isol-RNA Lysis Reagent, extracted, and analyzed by small RNA northern blot. The bound proteins were isolated from the beads using 2x SDS sample buffer (120 mM Tris-HCl pH 6.8, 20% glycerol, 4% SDS, 0.04% bromophenol blue, 10% beta-mercaptoethanol) and analyzed by western blot. Rabbit anti-GFP antibody (1:10,000) and secondary IRdye800 conjugated goat anti-rabbit (1:10,000, LI-COR) were used to detect the proteins of interest. An Odyssey CLx Imaging System was used to acquire images.

Bioinformatic analyses of small RNA libraries

Characterization of control (dsLuc) small RNA libraries
The small RNA libraries from SINV-infected Aag2 cells have been characterized previously (5). Small RNA sequences from A. aegypti mosquitoes have been deposited in the NCBI Sequence Read Archive (accession SRA291268). Small RNA reads were mapped to the A. aegypti genome (AaegL3, downloaded from VectorBase) using Bowtie (Galaxy tool version 1.1.2), allowing no mismatches in the first 28 nt of each read. For mapping to the genomes of persistently infecting viruses (Aedes aegypti densovirus, GenBank accession M37899.1; mosquito X virus segment A and B, GenBank JX403941.1 and JX403942; cell fusing agent virus, GenBank NC_001564.1) one mismatch in the first 28 nt was permitted. Before Aedes aegypti genome-derived piRNAs were analyzed, reads mapping to SINV-GFP or to the persistently infecting viruses were removed from the libraries. Subsequently, reads in the size range of 25-30 nt were selected. The genome positions of the piRNA-sized reads were overlapped with the genome locations of repetitive elements present in the Aedes aegypti genome (‘AaegL3 repeatfeatures’ downloaded from VectorBase). To determine the number of piRNAs that derive from transposable elements, reads that overlap ‘TEfam elements’ within the repeatfeatures library were counted. All piRNA reads that overlapped repeat features other than TEfam elements were designated as ‘other repeats’. For the subsequent analysis of piRNA reads that overlap (non-)coding genes, reads that intersected with any type of repetitive element were excluded. To identify piRNA-sized reads that map to coding RNAs, the ‘mRNA’ elements from the ‘AaegL3 basefeatures library’ (downloaded from VectorBase) were extracted and the transcript IDs from VectorBase were replaced with the corresponding gene ID. Next, the genomic positions were overlapped with the small RNAs using the ‘intersect genomic intervals’ tool in Galaxy. From these intersected datasets, the number of overlapping piRNAs was determined. Non-coding RNAs were defined as the collection of tRNAs, miRNAs, rRNA, snRNAs, snoRNAs, misc RNA, pseudogenes, RNase MRP RNA, RNase P RNA, SRP RNA, and antisense RNAs. Their genomic positions were extracted from the basefeatures library, intersected with the position of piRNAs, and the number of overlapping piRNAs was determined.

Comparison of piRNA levels in PIWI knockdown or IP libraries
25-30 nt piRNA reads that mapped to the Aedes aegypti genome and did not overlap with repetitive elements were selected from the individual PIWI knockdown or IP libraries described in Miesen et al. (5). The genomic positions of piRNAs were joined to mRNA positions using the ‘join’ tool of the ‘operate on genomic intervals’ section in Galaxy. In the joined datasets the occurrence of individual mRNA names was counted to obtain piRNA counts, which were subsequently normalized to the total number of reads in the corresponding library. Finally, the fold change of normalized piRNA levels was calculated for every PIWI knockdown or IP library compared to the control libraries (dsLuc or GFP-IP, respectively).

Characterization of piRNAs mapping to individual/small groups of transcripts
Small RNA libraries were mapped to FASTA-formatted Aedes aegypti transcripts available from VectorBase. Small RNA sequencing data for individual or groups of transcripts (for instance all H2A, H2B, H3 or H4 genes) were selected from the mapped reads file. Small RNA size profiles were generated from all reads that map to the transcript in sense or antisense orientation with a maximum of one nucleotide mismatch in the first 28 nt. The small RNA distribution along the transcript was plotted as the number of 5’ ends starting at the individual nucleotide position of the transcripts. Nucleotide biases were determined with the WebLogo3 program (Sequence logo generator Galaxy tool version 0.4). For presentation of the genome distribution of H4 piRNAs on the total collection of histone 4 genes, the ORFs of the individual histone transcripts were aligned from start codon to stop codon irrespective of a few single-nucleotide polymorphisms. Subsequently, the combined count of small RNA 5’ ends was plotted for every nucleotide position on the ORF. The VectorBase accession numbers for the histone H2A, H2B, H3 and H4 families analyzed in this study are shown in Table S2.

Comparison of Girardi et al. and Haac et al. datasets
To compare our dataset with independently generated small RNA libraries from Aag2 cells, publicly available data (1) were imported into Galaxy (SRA submissions SRR1765315, SRR1765316 and SRR1765317). These libraries were generated from size-purified small RNAs from Aag2 cells transfected with dsRNA targeting EGFP using Illumina’s small RNA Truseq sample prep kit. They were sequenced on a HiSeq2500 and have a combined sequencing depth of more than 84 million reads. Gene-derived piRNAs were analyzed as described above and the correlation between our data and the Haac et al. data was analyzed using Spearman’s correlation analysis.

SUPPLEMENTARY REFERENCES
1. Haac,M.E., Anderson,M.A.E., Eggleston,H., Myles,K.M. and Adelman,Z.N. (2015) The hub protein loquacious connects the microRNA and short interfering RNA pathways in mosquitoes. Nucleic Acids Res., 43, 3688–3700.
2. Akbari,O.S., Antoshechkin,I., Amrhein,H., Williams,B., Diloreto,R., Sandler,J. and Hay,B.A. (2013) The developmental transcriptome of the mosquito Aedes aegypti, an invasive species and major arbovirus vector. G3 (Bethesda), 3, 1493–1509.
3. Vodovar,N., Bronkhorst,A.W., van Cleef,K.W.R., Miesen,P., Blanc,H., van Rij,R.P. and Saleh,M.C. (2012) Arbovirus-Derived piRNAs Exhibit a Ping-Pong Signature in Mosquito Cells. PLoS One, 7, e30861.
4. Gausson,V. and Saleh,M.-C. (2011) Viral small RNA cloning and sequencing. Methods Mol. Biol., 721, 107–122.
5. Miesen,P., Girardi,E. and van Rij,R.P. (2015) Distinct sets of PIWI proteins produce arbovirus and transposon-derived piRNAs in Aedes aegypti mosquito cells. Nucleic Acids Res., 43, 6545–6556.
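The counting, normalization and fold-change steps described under "Comparison of piRNA levels in PIWI knockdown or IP libraries" were performed with Galaxy tools. The arithmetic can be illustrated with a short sketch. It assumes each library has already been reduced to (gene ID, read length) pairs for reads that overlap an mRNA; the function names and toy data are hypothetical, and this is not the authors' pipeline.

```python
from collections import Counter


def normalized_pirna_counts(reads, total_reads, min_len=25, max_len=30):
    """Count piRNA-sized reads per gene and divide by library size.

    reads: iterable of (gene_id, read_length) pairs (assumed input format).
    total_reads: total number of reads in the library.
    """
    counts = Counter(
        gene for gene, length in reads if min_len <= length <= max_len
    )
    return {gene: n / total_reads for gene, n in counts.items()}


def fold_change(sample, control):
    """Ratio of normalized counts, for genes detected in the control."""
    return {
        gene: sample.get(gene, 0.0) / ctrl
        for gene, ctrl in control.items()
        if ctrl > 0
    }


# Toy data: the 21 nt read falls outside the 25-30 nt window and is ignored.
control_reads = [("geneA", 27), ("geneA", 28), ("geneA", 21), ("geneB", 26)]
knockdown_reads = [("geneA", 27), ("geneB", 26), ("geneB", 30)]

control = normalized_pirna_counts(control_reads, total_reads=1000)
knockdown = normalized_pirna_counts(knockdown_reads, total_reads=1000)
print(fold_change(knockdown, control))
```

Genes with no piRNA-sized reads in the control are skipped here to avoid division by zero; how such genes were handled in the original analysis is not stated.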


SUPPLEMENTARY FIGURE LEGENDS

Figure S1: Accumulation of genic PIWI-dependent small RNAs in Aag2 cells. (A) Length distribution of small RNA reads in control libraries (luciferase dsRNA treated, SINV-GFP infected Aag2 cells) before mapping. Reads were normalized to total library size. Bars are the means +/- SEM of three independent small RNA libraries. (B,C) Relative abundance of 25-30 nt reads derived from A. aegypti protein-coding genes in the indicated (B) PIWI knockdown and (C) PIWI IP libraries compared to control libraries (dsLuc and GFP IP, respectively). Black and grey bars indicate, respectively, sense and antisense reads relative to the annotated host transcript. Bars are the means +/- SD of three independent small RNA libraries (IP libraries represent one single experiment). Two-tailed Student’s t-test was used to determine statistical significance (*P < 0.05; **P < 0.01; ***P < 0.001; ****P < 0.0001).

Figure S2: Examples of Ago3 and Piwi5 dependent piRNAs from class IV and V genes. (A) UCSC genome browser views of piRNA producing genes with the distribution of sense 25-30 nt RNAs across the indicated host transcript. The relative position of exons (blue boxes) and introns (grey lines) is schematically represented. Black arrows indicate the direction of transcription. (B) Size profiles of all small RNA reads derived from sense (black) or antisense (grey) transcripts. Bars are the mean +/- SD of three independent small RNA libraries. AAEL012272, FK506-binding protein; AAEL007690, RPTOR-like protein; AAEL003743, vacuolar protein ATPase; AAEL011197, actin; AAEL014915, 26S proteasome; AAEL006582, calcium transporting ATPase sarcoplasmic/endoplasmic reticulum type. Group IV and V are defined in Figure 1. (C) RT-qPCR analysis on group IV and V mRNAs upon transfection of the indicated dsRNAs in mock and SINV-GFP infected Aag2 cells. Expression is normalized to Lysosomal Aspartic Protease (LAP) levels and presented relative to dsLuc. Bars represent means of two biological replicates +/- SD.

Figure S3: piRNAs originate from replication-dependent, clustered histone 4 genes in A. aegypti. (A) Phylogeny of annotated Histone 4 genes. The H3 gene AAEL000492 was used as outgroup. Different classes (I to IV) were defined based on nucleic acid sequence similarity. For each H4 gene, the presence of a canonical polyadenylation signal (AAUAAA), conserved stem loop (SL) (AANGGNNNNNNNNNGNGCC), purine-rich Histone downstream element (HDE), localization in genomic histone clusters, and processing into piRNAs is indicated. Sub-canonical polyadenylation signals, stem loop and HDE sequences are indicated in orange. (B) VectorBase genome view of the three major A. aegypti histone clusters. The relative position and direction of transcription of the annotated histone genes in these regions are indicated. H4 genes are indicated in dark blue. H2A, H2B and H3 are indicated in red. Pseudogenes and other protein-coding genes are indicated in grey and light-blue, respectively.
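The consensus sequences quoted in the Figure S3 legend (polyadenylation signal AAUAAA and stem loop AANGGNNNNNNNNNGNGCC) use N for any nucleotide. One way to scan a 3’ region for such motifs is to translate the consensus into a regular expression. The sketch below assumes only that N matches any base and that U and T are interchangeable; the helper names and the test sequence are hypothetical, and the legend does not state how the authors searched for these elements.

```python
import re

# Map each consensus symbol to a regex fragment; U and T are treated as
# equivalent so that RNA and DNA notations both match.
SYMBOLS = {"A": "A", "C": "C", "G": "G", "U": "[UT]", "T": "[UT]", "N": "[ACGUT]"}


def consensus_to_regex(consensus):
    """Build a regex pattern string from a consensus using A/C/G/U/T/N."""
    return "".join(SYMBOLS[base] for base in consensus.upper())


def find_motif(consensus, sequence):
    """Return 0-based start positions of all (overlapping) matches."""
    # A lookahead finds matches that overlap each other.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]


STEM_LOOP = "AANGGNNNNNNNNNGNGCC"
POLYA_SIGNAL = "AAUAAA"

# Hypothetical sequence fragment containing one stem-loop-like element.
fragment = "TTAACGGCTCTTTTCAGAGCCACC"
print(find_motif(STEM_LOOP, fragment))
print(find_motif(POLYA_SIGNAL, "GGAATAAACC"))
```

Sub-canonical variants, which the legend marks in orange, would need a more permissive pattern than this exact-consensus match.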


Figure S4: Comparison of genic piRNAs between two independent datasets. (A) The top-1000 piRNA expressing genes from three dsLuc libraries (our dataset) were selected and mean piRNA counts in these datasets were compared to the mean piRNA count from three small RNA libraries generated independently by Haac and colleagues (1). piRNA counts were normalized to the corresponding library sizes. One gene was excluded from display on the logarithmic graph since it does not produce piRNAs in the Haac et al. dataset. The correlation is statistically significant as determined by Spearman’s correlation coefficient (rs=0.75; P < 0.001; two-tailed). (B) Size distribution of small RNAs mapping to A. aegypti histone 4 genes. Bars show the mean and SEM of three libraries generated by Haac et al. (C) Sequence logo of 25-30 nt reads mapping in sense (upper panel) or antisense orientation (lower panel) to histone 4. The reads of three libraries were combined. n indicates the number of reads used to generate the logo, u indicates the number of unique sequences.

Figure S5: Ago3 and Piwi5 dependent H4piRNAs accumulate in Aag2 cells. (A) Ping-pong signature of H4piRNAs. Upper panel: the probability for 5’ overlaps between H4piRNAs from opposite strands in three control libraries. Lower panel: examples of H4piRNA ping-pong couples. (B) Accumulation of individual H4piRNAs. Upper panel: the positions of individual sense H4piRNAs on the H4 ORF. H4piRNA A is the most abundant piRNA in the small RNA data of Akbari et al. (2) available on the Aedes UCSC genome browser. Lower panels: northern blot analysis of individual sense H4piRNAs upon knockdown of the indicated PIWI/AGO genes in mock and SINV-GFP infected Aag2 cells. (C) Northern blot analysis of sense H4piRNAs upon knockdown of the indicated PIWI/AGO genes in SINV-GFP infected and mock-infected Aag2 cells. An RNA marker (10 to 150 nt) was loaded to define H4piRNA size. H4 piRNAs were detected using a pool of the four DNA oligonucleotide probes. U6 snRNA serves as loading control. All RNA samples were analyzed on a high-resolution 17.5% polyacrylamide gel. (D) Full image of Ago3 and Piwi5 immunoprecipitation (IP) shown in Figure 3. Aag2 cells were transfected with expression plasmids for GFP (negative control), GFP-Ago3, or GFP-Piwi5 for 48 hours, harvested, and subjected to immunoprecipitation using GFP-trap beads. Immunoprecipitates and total lysates were analyzed by western blot using anti-GFP antibodies. NT, non-transfected. Asterisks indicate the position of GFP-Ago3 (*), GFP-Piwi5 (**), and GFP (***).

Figure S6: Detection of H4 antisense transcripts in Aag2 cells. (A) Schematic representation of strand-specific RT-PCR assay for the detection of sense and antisense histone 4 transcripts. Sense (red arrow) or antisense (blue arrow) strand-specific RT primers with a tag sequence (green line) were used for cDNA synthesis from Aag2 total RNA. Following cDNA synthesis, PCR was performed using a combination of a H4 and a tag-specific primer. As control, a 195 bp region on H4 cDNA (red) was amplified using primer Fw (black arrow) and primer Tag (green arrow). The presence of an antisense transcript (light blue) was analyzed by PCR amplification with primer Rv (black arrow) and primer Tag (green arrow). (B) PCR was performed using the indicated primer combinations on sense and antisense RTs. +RT corresponds to the reverse transcribed sample. Reaction without reverse transcriptase (-RT) and PCR amplification without cDNA template (H2O) were performed to verify the absence of contaminating DNA in RNA preparations and PCR reagents, respectively. PCR amplicons were separated on a 2% agarose gel and stained with ethidium bromide. (*) and (**) indicate the expected amplicons.

Figure S7: Histone 2A, 2B and 3-derived piRNAs in Aag2 cells. (A) Size profile of small RNA reads derived from H2A, H2B and H3 histone genes in Aag2 cells. Black and grey bars indicate sense and antisense reads, respectively. (B) Nucleotide bias at each position of the 25-30 nt small RNA reads mapping to the sense (upper panels) and antisense histone sequence (lower panels). All reads of three independent experiments were combined to generate the sequence logo. n, number of reads; u, number of unique sequences. (C) Relative abundance of the 25-30 nt sense (black) and antisense (grey) histone reads in the indicated PIWI knockdown libraries. Bars are the means +/- SD of three independent small RNA libraries. Two-tailed Student’s t-test was used to determine statistical significance (*P < 0.05; **P < 0.01; ***P < 0.001; ****P < 0.0001).

Figure S8: Histone-derived piRNAs in adult A. aegypti mosquitoes. (A) Distribution of 25-30 nt sense (red) or antisense (blue) RNA reads across the H4 open reading frame (ORF). The counts of 5’ ends of small RNA reads at each nucleotide position are shown. (B) Size profile of small RNA reads derived from the H4 gene. Black and grey bars indicate sense and antisense reads, respectively. (C) Size profile of small RNA reads derived from H2A, H2B and H3 histone genes in A. aegypti mosquitoes. Black and grey bars indicate sense and antisense reads, respectively. (D) Nucleotide bias at each position of the 25-30 nt small RNA reads mapping to the sense (upper panels) and antisense histone sequence (lower panels). n, number of reads; u, number of unique sequences. The size distribution of histone piRNAs in adult mosquitoes is broader than in Aag2 cells, likely reflecting the accumulation of non-specific degradation products, which were also seen in the northern blot (Figure 3F). Possibly for this reason, the piRNA-sized reads do not show a strong nucleotide bias.

Figure S9: H2A, H2B and H3 mRNA accumulation during cell cycle progression and validation of stem-loop qPCR assay for H4piRNA quantification. (A) RT-qPCR analysis of H2A, H2B and H3 mRNA levels in synchronized cells. Expression is normalized to Lysosomal Aspartic Protease (LAP) levels and presented relative to asynchronous cells. Bars represent means +/- SD of three biological replicates. (B) Stem-loop (SL) RT-qPCR analyses of individual H4piRNAs (A-D, shown in Figure S5B) upon knockdown of the indicated PIWI/AGO transcripts. Expression is normalized to aae-bantam-3p levels and presented relative to dsLuc. Bars represent means +/- SD of two biological replicates.
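The ping-pong signature in Figure S5A is described as the probability of 5’ overlaps between piRNAs from opposite strands, with ping-pong partners expected to overlap by 10 nt. A minimal sketch of one common way to compute such a distribution follows. It assumes 5’ end positions are given in sense-strand coordinates (for an antisense read, the highest coordinate it covers) and that every sense/antisense pair is weighted by the product of its read counts. The legends do not specify the authors' exact procedure, so the function and its inputs are hypothetical.

```python
from collections import Counter


def overlap_distribution(sense_5p, antisense_5p, max_overlap=20):
    """Fraction of sense/antisense read pairs at each 5' overlap length.

    sense_5p, antisense_5p: dicts mapping a 5' end position (sense-strand
    coordinates) to the number of reads starting there.
    """
    pair_weight = Counter()
    for s_pos, s_count in sense_5p.items():
        for a_pos, a_count in antisense_5p.items():
            # Number of nucleotides shared by the two 5' ends.
            overlap = a_pos - s_pos + 1
            if 1 <= overlap <= max_overlap:
                pair_weight[overlap] += s_count * a_count
    total = sum(pair_weight.values())
    if total == 0:
        return {}
    return {k: pair_weight[k] / total for k in range(1, max_overlap + 1)}


# Toy example: most antisense reads start 9 nt downstream of the sense
# 5' end, which gives a 10 nt overlap.
dist = overlap_distribution({100: 5}, {109: 4, 104: 1})
print(max(dist, key=dist.get))
```

A peak at 10 nt in this distribution is what the legend refers to as the ping-pong signature.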


[Girardi et al. Figure S1. Panel A: normalized read counts (RPM) versus read length (nt), 19-31 nt. Panels B and C: piRNA enrichment for sense and antisense reads in the knockdown and IP libraries.]

[Girardi et al. Figure S2. Panel A: genome browser tracks of the 25-30 nt sense piRNA count for AAEL012272, AAEL007690, AAEL003743 and AAEL011197 (group IV) and AAEL014915 and AAEL006582 (group V). Panel B: normalized read counts (RPM) versus read length (nt) for sense and antisense reads of the same genes. Panel C: relative expression in mock and SINV-GFP infected cells treated with dsLUC, dsPiwi4, dsPiwi5, dsPiwi6 or dsAgo3.]

[Girardi et al. Figure S3. Panel A: phylogeny of H4 genes in classes I to IV with AAEL000492 (Histone H3) as outgroup, annotated for polyA signal, SL, HDE, cluster localization and piRNA production. Panel B: genome views of SuperContig 1.98: 727,922-885,801; SuperContig 1.9: 3,650,652-3,695,658; and SuperContig 1.94: 2,272,257-2,305,414, showing forward and reverse strands. Colour key: histone 4 genes; histone 2A, 2B and 3 genes; pseudogene; other genes.]

[Girardi et al. Figure S4. Panel A: normalized read count Haac et al. (log2) versus normalized read count this study (log2). Panel B: normalized read count versus read length (nt), 18-32 nt. Panel C: sequence logos (bits versus nucleotide position); n=84,085 (u=1,326) and n=1,228 (u=157).]

[Girardi et al. Figure S5. Panel A: probability (%) versus overlap (nt) for dsLUC, dsAgo3 and dsPiwi5 in mock and SINV-GFP infected cells; example ping-pong couples shown as sequence pairs: UUAUCUACGAGGAAACCCGUGGAG / CGCAUAGAGACCAGAAUAGAUGCU; UCUGGUCUUAUCUACGAGGAAACC / CUCAGUUCGCAUAGAGACCAGAAU; AUUCCUGGAAAAUGUCAUUCGUGA / CCUCACGACUUCCAUAAGGACCUU. Panel B: positions of H4piRNA-A to H4piRNA-D and northern blots of individual H4piRNAs. Panel C: northern blot of mock and SINV-GFP infected cells treated with dsLUC, dsAgo3, dsPiwi4, dsPiwi5, dsPiwi6, dsAgo1 or dsAgo2, with RNA ladder (10 to 150 nt); probes: Mix H4piRNAs, miR-2940-3p, U6 snRNA. Panel D: western blot (IB: Anti-GFP) of total lysates and IP anti-GFP for NT, GFP, GFP-Ago3 and GFP-Piwi5; size markers 25 to 250 kDa.]

[Girardi et al. Figure S6. Panel A: schematic of H4 mRNA and H4 antisense RNA with sense RT and antisense RT primers, primers Fw, Rw and Tag, and amplicons of 195 bp (*) and 145 bp (**). Panel B: agarose gel of sense RT and antisense RT reactions amplified with Fw/Tag or Rw/Tag; lanes +RT, -RT and H2O; size ladder 100 to 600 bp.]

[Girardi et al. Figure S7. Panel A: normalized read counts (RPM) versus read length (nt) for sense and antisense reads of H2A, H2B and H3. Panel B: sequence logos (bits versus nucleotide position) for H2A, H2B and H3. Panel C: piRNA enrichment of sense and antisense reads for H2A, H2B and H3 in dsLuc, dsPiwi4, dsPiwi5, dsPiwi6 and dsAgo3 libraries.]

[Girardi et al. Figure S8. Panel A: normalized read count (RPM) versus Histone H4 ORF position (nt). Panel B: normalized read count (RPM) versus read length (nt) for sense and antisense H4 reads. Panel C: normalized read counts (RPM) versus read length (nt) for sense and antisense reads of H2A, H2B and H3. Panel D: sequence logos (bits versus nucleotide position) for H2A, H2B, H3 and H4.]

[Girardi et al. Figure S9. Panel A: relative expression of H2A, H2B and H3 mRNA versus time post release (h), for asynchronous cells and 0, 2, 4, 6 and 8 hpr. Panel B: relative expression of H4piRNA-A, H4piRNA-B, H4piRNA-C and H4piRNA-D in dsLUC, dsAgo3, dsPiwi4, dsPiwi5 and dsPiwi6 samples.]

Table S1: Oligonucleotides used in this study

Oligonucleotide                     Sequence
T7FW-PIWI4                          TAATACGACTCACTATAGGGAGACGTGGAAGTCCTTCTTCTCG
T7RV-PIWI4                          TAATACGACTCACTATAGGGAGATGTCAGTTGATCGCTTCTCAA
T7FW-PIWI5                          TAATACGACTCACTATAGGGAGAGCCATACATCGGGTCAAAAT
T7RV-PIWI5                          TAATACGACTCACTATAGGGAGACTCTCCACCGAAGGATTGAA
T7FW-PIWI6                          TAATACGACTCACTATAGGGAGACAACGGAGGATCTTCACGAG
T7RV-PIWI6                          TAATACGACTCACTATAGGGAGAAATCGATGGCTTGATTTGGA
T7FW-AGO3                           TAATACGACTCACTATAGGGAGATGCTTACTCGTGTCGCGTAG
T7RV-AGO3                           TAATACGACTCACTATAGGGAGAGGCATGGCAGATCCAATACT
T7FW-AGO2                           TAATACGACTCACTATAGGGAGACTACGAGCAGGAGGTCAAGG
T7RV-AGO2                           TAATACGACTCACTATAGGGAGATCCATGCCTTTGAGGAAATC
T7FW-AGO1                           TAATACGACTCACTATAGGGAGACCGGTCATCGAGTTCATGT
T7RV-AGO1                           TAATACGACTCACTATAGGGAGACGTGGCTTTGATCATGGTT
T7FW-LUC                            TAATACGACTCACTATAGGGAGATATGAAGAGATACGCCCTGGTT
T7RV-LUC                            TAATACGACTCACTATAGGGAGATAAAACCGGGAGGTAGATGAGA
ANTISENSE PROBE AAEL-H4 FW          TAATACGACTCACTATAGGGAG G ATGACTGGCCGTGGTAAGGG
ANTISENSE PROBE AAEL-H4 RV          CCCTGACGCTTCAGAGCGTAGACAACAT
SENSE PROBE AAEL-H4 FW              CCC TCTAGATGAAAACCAAACAAACCAAAAACCGGACG
SENSE PROBE AAEL-H4 RV              TAATACGACTCACTATAGGGAG GCGGCCGCTTAACCTCCGAAACCGTACAG
pAG-PIWI5 FW                        CGCTCGAGGCGGATAGACAGCAAGGAGG
pAG-PIWI5 RV                        AGGCGGCCGCTTATTATTACAGATAATAGAGTTTCTTTTCC
pAG-AGO3 FW                         CGCTCGAGTCCTCGCGGTTGAATTTAGTTCG
pAG-AGO3 RV                         AGGCGGCCGCTTATTATCACAGGTAGAACAGTTTGTCG
H4 PIRNA PROBE A                    GCACTCCACGGGTTTCCTCGTAGATAA
H4 PIRNA PROBE B                    AGCATCACGAATGACATTTTCCAGGAAT
H4 PIRNA PROBE C                    ACGGTTTTACGCTTGGCGTGTTCAGTGT
H4 PIRNA PROBE D                    CCCTGACGCTTCAGAGCGTAGACAACAT
MIR2940-3P                          AGTGATTTATCTCCCTGTCGAC
U6 SNRNA                            TGGAACGCTTCACGATTTTG
QFW-PIWI4                           TCTTCTTCTCCACCACAGCC
QRV-PIWI4                           ATGGTGACCACCTCACAGTTAC
QFW-PIWI5                           ACGGCATCACATCGAGACTC
QRV-PIWI5                           CGACCTCCACGCTGTCCTC
QFW-PIWI6                           TTTTCTTCCACCCCGAGCAG
QRV-PIWI6                           AATACATTTGCGATGCGGCC
QFW-AGO3                            CTCCAGACGACGGTTTTGGA
QRV-AGO3                            GCAGGTACGAAATTGGCTGC
QFW-LAP                             GTGCTCATTCACCAACATCG
QRV-LAP                             AACTTGGCCGCAACAAATAC
QFW-AAEL012272                      TGAAGTCAGTCGATCACGTTTTGC
QRV-AAEL012272                      AGACTGAATCAGTTTGTCGTCC
QFW-AAEL007690                      CCTGTTCCAGGCTGGAATGCTG
QAAEL007690 RV                      ATTGCTTATAACGTGCGCGTGG
QFW-AAEL003743                      GGCATTCACACGATCGAGTAT
QRV-AAEL003743                      CGTTCTTCAGCACCATGTTCC
QFW-AAEL003827                      GACTGCTCGTAAGTCTACTGG
QRV-AAEL003827                      GTTCCTGGCCGATAACGGTGA
QFW-AAEL003820                      GCTGGATTGCAGTTCCCAGTC
QRV-AAEL003820                      CTGCCAATTCGAGCACTTCAGCG
QFW-AAEL014915                      ATACCTTCGTCAAGAATCCGGCAC
QRV-AAEL014915                      CTTGTCTGCAATGTGCGACACC
QFW-AAEL015681                      CCATGAGCATCATGAACAGCTTCG
QRV-AAEL015681                      CGGGATGTGATCGTCGAACG
QFW-AAEL006582                      GGTCTGCCAGAAGCTCTGATCC
QRV-AAEL006582                      ATCAGACCTTCATCAGCCTTACG
QFW-AAEL0011197                     GAACACCCAGTCCTGCTGACA
QRV-AAEL0011197                     TGCGTCATCTTCTCACGGTTAG
QFW-AAEL003814                      GTGGAGTCAAGCGTATCTCTGG
QRV-AAEL003814                      GCGTGTTCAGTGTAGGTAACAGC
TAG REVERSE PRIMER                  GCTAGCTTCAGCTAGGCATC
RT STRAND SPECIFIC RV H4-SENSE      GCTAGCTTCAGCTAGGCATCAAACCGTACAGGGTTCGTCC
RT STRAND SPECIFIC RV H4-ANTISENSE  GCTAGCTTCAGCTAGGCATCCGAGGAAACCCGTGGAGTGC
SL_Aae_H4piRNA_A                    GTCGTATCCAGTGCAGGGTCCGAGGTATTCGCACTGGATACGACGCACTC
Fw_Aae_H4piRNA_A                    GCCCGCTTATCTACGAGGAAACC
SL_Aae_H4piRNA_B                    GTCGTATCCAGTGCAGGGTCCGAGGTATTCGCACTGGATACGACAGCATC
Fw_Aae_H4piRNA_B                    GCCCGCATTCCTGGAAAATGTCA
SL_Aae_H4piRNA_C                    GTCGTATCCAGTGCAGGGTCCGAGGTATTCGCACTGGATACGACACGGTT
Fw_Aae_H4piRNA_C                    GCCCGCACACTGAACACGCCA
SL_Aae_H4piRNA_D                    GTCGTATCCAGTGCAGGGTCCGAGGTATTCGCACTGGATACGACCCCTGA
Fw_Aae_H4piRNA_D                    GCCCGCATGTTGTCTACGCTCT
SL-bantam-3p                        GTCGTATCCAGTGCAGGGTCCGAGGTATTCGCACTGGATACGACATCAGC
F-bantam-3p                         GCCCGCTGAGATCATTTTGAAAG
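The stem-loop (SL_) and forward (Fw_) primers in Table S1 appear to follow a regular pattern: a shared 44-nt stem-loop backbone followed by six nucleotides complementary to the 3’ end of the target small RNA, and a forward primer made of a short 5’ tail (GCCCGC) plus the 5’ portion of the small RNA. This pattern is inferred here from the listed sequences and is not stated by the authors. The sketch below reconstructs the H4piRNA A primers under that assumption, taking the piRNA sequence (as DNA) to be the reverse complement of H4 PIRNA PROBE A. The function names are hypothetical.

```python
# Shared prefix of all SL_ primers in Table S1 (everything except the
# last six nucleotides).
STEM_LOOP_BACKBONE = "GTCGTATCCAGTGCAGGGTCCGAGGTATTCGCACTGGATACGAC"
# Shared prefix of the Fw_ and F- primers in Table S1.
FORWARD_TAIL = "GCCCGC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def stem_loop_rt_primer(small_rna_dna, n_match=6):
    """Backbone plus the reverse complement of the small RNA 3' end."""
    return STEM_LOOP_BACKBONE + reverse_complement(small_rna_dna[-n_match:])


def forward_primer(small_rna_dna, n_match):
    """5' tail plus the first n_match nucleotides of the small RNA."""
    return FORWARD_TAIL + small_rna_dna[:n_match]


# Assumption: the piRNA is the reverse complement of its northern probe.
probe_a = "GCACTCCACGGGTTTCCTCGTAGATAA"
pirna_a = reverse_complement(probe_a)

print(stem_loop_rt_primer(pirna_a))
print(forward_primer(pirna_a, 17))
```

Under these assumptions both outputs match the SL_Aae_H4piRNA_A and Fw_Aae_H4piRNA_A entries in the table. The length of the piRNA-specific part of the forward primer differs between primers (for example 17 nt for A and 15 nt for C), so it is passed explicitly.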

Table S2: VectorBase accession numbers for annotated A. aegypti histone genes analyzed in this study

Histone H2A AAEL000494-RA AAEL000497-RA AAEL000518-RA AAEL000525-RA AAEL003669-RA AAEL003687-RA AAEL003706-RA AAEL003818-RA AAEL003820-RA AAEL003826-RA AAEL003851-RA AAEL003862-RA AAEL007005-RA AAEL007609-RA AAEL007925-RA AAEL012449-RA

Histone H2B AAEL015674-RA AAEL015675-RA AAEL015676-RA AAEL015677-RA AAEL015678-RA AAEL015679-RA AAEL015680-RA AAEL015681-RA AAEL015682-RA AAEL015683-RA

Histone H3 AAEL000482-RA AAEL000492-RA AAEL000506-RA AAEL003659-RA AAEL003685-RA AAEL003828-RA AAEL003836-RA AAEL003850-RA AAEL003852-RA AAEL003856-RA AAEL003872-RA AAEL011424-RA AAEL012072-RA

Histone H4 AAEL000490-RA AAEL000501-RA AAEL000513-RA AAEL000517-RA AAEL003673-RA AAEL003689-RA AAEL003814-RA AAEL003823-RA AAEL003833-RA AAEL003838-RA AAEL003846-RA AAEL003863-RA AAEL003866-RA AAEL011999-RA AAEL013709-RA