Supplementary Information:

Identification and expression profiles of neuropeptides and their G protein-coupled receptors in the rice stem borer Chilo suppressalis

Gang Xu1, Gui-Xiang Gu1, Zi-Wen Teng1, Shun-Fan Wu1,2, Jia Huang1, Qi-Sheng Song3, Gong-Yin Ye1, and Qi Fang1*

1 State Key Laboratory of Rice Biology & Key Laboratory of Agricultural Entomology of Ministry of Agriculture, Institute of Insect Sciences, Zhejiang University, Hangzhou 310058, China
2 College of Plant Protection, Nanjing Agricultural University, Nanjing 210095, China; State & Local Joint Engineering Research Center of Green Pesticide Invention and Application
3 Division of Plant Sciences, Missouri University, Columbia, MO 65211, USA

*Corresponding author: Qi Fang, Institute of Insect Sciences, Zhejiang University, Hangzhou 310058, China, Tel: 86-0571-88982696, E-mail: [email protected]

This file includes:
Tables S1-S3
Figures S1-S10

Table S1 G protein-coupled receptors with likely ligands in insects.

| Receptor | Close family | Likely ligands |
| --- | --- | --- |
| Neuropeptide receptor A1 | CG7285, CG13702, BmA1 | AstC |
| Neuropeptide receptor A2 | BmA2 | ITP |
| Neuropeptide receptor A3 | CG13229, BmA3 | Orphan |
| Neuropeptide receptor A4 | CG1147, BmA4 | NPF |
| Neuropeptide receptor A5 | BmA5 | Orphan |
| Neuropeptide receptor A6-A | CG5911-A, BmA6-A | ETH |
| Neuropeptide receptor A6-B | CG5911-B, BmA6-B | ETH |
| Neuropeptide receptor A7 | CG7395, BmA7 | Orphan |
| Neuropeptide receptor A8 | CG13229, BmA8 | Orphan |
| Neuropeptide receptor A9 | CG6881, CG6857, BmA9 | SK |
| Neuropeptide receptor A10 | CG7395, BmA10 | sNPF |
| Neuropeptide receptor A11 | CG7395, BmA11 | sNPF |
| Neuropeptide receptor A12 | BmA12 | Orphan |
| Neuropeptide receptor A13 | BmA13 | MS |
| Neuropeptide receptor A14 | CG14593, BmA14 | CCH2 |
| Neuropeptide receptor A15 | CG14484, BmA15 | CCH1 |
| Neuropeptide receptor A16 | BmA16 | AT |
| Neuropeptide receptor A17 | CG13995, BmA17 | Orphan |
| Neuropeptide receptor A18 | CG6986, BmA18 | CNMa |
| Neuropeptide receptor A19 | CG5811, BmA19 | RY |
| Neuropeptide receptor A20 | CG13229, BmA20 | Orphan |
| Neuropeptide receptor A21 | CG10698, BmA21 | Crz |
| Neuropeptide receptor A22 | CG5811, BmA22 | RY |
| Neuropeptide receptor A23 | CG10626, BmA23 | LK |
| Neuropeptide receptor A24 | CG7887, BmA24 | TK/ITPL |
| Neuropeptide receptor A25 | CG14575, BmA25 | CAPA-PVK |
| Neuropeptide receptor A26 | CG6111, BmA26 | CCAP |
| Neuropeptide receptor A27 | CG14575, BmA27 | CAPA-PVK |
| Neuropeptide receptor A28 | CG11325, BmA28 | ACP |
| Neuropeptide receptor A29 | CG11325, BmA29 | ACP |
| Neuropeptide receptor A30 | CG6111, BmA30 | CCAP |
| Neuropeptide receptor A31 | CG14003, BmA31 | TR |
| Neuropeptide receptor A32 | CG6515, BmA32 | NTL (FXXXRa) |
| Neuropeptide receptor A33 | CG6515, BmA33 | NTL (YXXXRa) |
| Neuropeptide receptor A34 | CG30340, BmA34 | ITP |
| Neuropeptide receptor A35 | BmA35 | Orphan |
| Adipokinetic hormone receptor | CG11325, BmAKHR | AKH1, AKH2 |
| Allatostatin A receptor | CG2872, CG10001, BmBAR | AstA |
| Diapause hormone receptor | CG9918, BmDpHR | CAPA-PK-1 |
| FMRFamide receptor | CG2114, BmRFaR | FMRF |
| Myosuppressin receptor | CG8985, CG13802, BmMSR | MS |
| Pheromone biosynthesis activating neuropeptide receptor | CG8784, CG8795, BmPBANR | PK-2 |
| Sex peptide receptor | CG16752, BmSPR | SP/AstB |
| SIFamide receptor | CG10823, BmSIFR | SIF |
| Neuropeptide receptor B1 | CG17415, BmB1 | DH31 |
| Neuropeptide receptor B2 | CG13758, BmB2 | PDF |
| Neuropeptide receptor B3 | BmB3 | Orphan |
| Neuropeptide receptor B4 | CG4395, BmB4 | Orphan |
| Diuretic hormone receptor | CG8422, CG12370, BmDHR | DH41 |
| Leucine-rich repeat G protein-coupled receptor 1 | CG8930 | GPA2/GPB5 |
| Leucine-rich repeat G protein-coupled receptor 2 | CG7665 | Bursicon |

Table S2 Primers used for the qRT-PCR analysis of neuropeptide genes.

| Primers | Forward primer (5’-3’) | Reverse primer (5’-3’) |
| --- | --- | --- |
| AKH1 | CGAGTCTCAACTTCAAACAGCA | AGCCGCTATAGCTTCGTCGT |
| AKH2 | GACGCCCAGTTGACCTTCAG | TTTTGCCACCATCTCTACACGA |
| AKH3 | GATCAGGTGTAGGGGCGTGA | TGTCGACAAAGCCTGGTGAA |
| AstA | CTGGCCCAGTTTACGAAGAG | AAATTGAAACGGTGCAGGTC |
| AstB | GCGAACTTCCACGGATCTTG | AGGGCTGACCATGCCTTCTT |
| AstC | CGGGCAACGCTTACTTTATC | ATCCGGACGAACTACAATCG |
| AstCC | CGCTCCCGTTACTGTTGGTG | GTCGACCTCCATCTGCTCGT |
| AT | ACCACCAGGGCTGAGCTGTA | ACTGACCGGGTGAAATCAGG |
| ITG | TACCGCCACTACGCTTACGG | CGCCCTCCGTCATCTGTATC |
| Burα | CTTCGTCATCCCGGGTGTAA | GGGGCAAAAGAGGACTACCG |
| Burβ | GCGGGGAAGTCAGTGTCAAC | TCTTCCAGCCTCAAGCCATC |
| CAPA | AGTCATCGCCAGCGCATATC | CTGACGTTTGGCTGTCATCG |
| CCH1 | GCCAGCTGCGAACAAAGAAC | TCCAACATCGGTTTGGCTTC |
| CCH2 | GGCAAACGATCTGGCGATAC | GCTTCCCTGGCGCTATCTTC |
| Crz | CGAACCTAAGCCTGCTGCTG | GCCTGCCTTCCAACAAGTTC |
| CCAP | CAGGTTGTGGTCGCAAGAGA | CTGCGCTTGCATCTTGAATG |
| DH/PBAN | CGAACAGTTGCCTGCGTATG | TGATCCGGTTGTGTGACTGC |
| DH31 | TGCTCGTCGTACCCACTTCA | CTGAGGCCAAGATCCAATGC |
| DH41 | CAACTACGGTGTGCCACCAA | CCTCTCCTTCTCCAGGCTGA |
| DH34 | CAACTACGGTGTGCCACCAA | CTCTGAAGCACGTCCACAGC |
| DH45 | CAACTACGGTGTGCCACCAA | GCACCTCCATTGGGTTGTTG |
| EH | TCGTGCTGTTCGCTGTCTTC | ACAGGATTCTGCGCATAGGG |
| ETH | GCGTTAGTCGCATATTGGCTGT | CTCCCCATTCGTGGGATGT |
| FMRF | CGCAATCGATCGCAGTATGA | GGGCTGTCGTATTCGTCGTC |
| GPA2 | CCTGGCTGTCATCGAATAGGTC | GTTGCAGCACTGACCCAATG |
| GPB5 | GCAGACGGACCAGAATGGAC | CACTACTCCTGGGTCGCAATG |
| ILP | CCAGATGCTGGGAGAGATCG | TGACGTAGCAAGCCTTGTGG |
| ITP | GCATCTGCGACGACTGCTAC | TTCCCGACGAAGTCGATCAT |
| LK | CATGGGGAGGCAAAAGATCA | GCTTGTCCAGCGAGGGTATG |
| MS | GAGCACCGAAACCCGTGTAG | GCTCGAGGAAGGTGTTGAGC |
| NTL | TGAGGGGCAGACGTGATTCT | TTCTTCTTCCCGCCTTCCTC |
| NP | GTCCCGTGCTGCAAAGAATG | CCAGCCACCGTCTACCCTTA |
| NPF1 | TGAACAAACGCATCGCAGTC | CGGTCTAGCAGCCTGTGTGT |
| NPF2 | GTTGCTGTCCGCCATCTTGT | CATCTGCCTCTTGCCGTACC |
| NPLP1 | AAAATGGCCAACTCCCGACT | CTACCGACGCGGTAACCATC |
| OKA | AATTTCGTGCGCAAGAGGAA | TGGAACCAATGGGAAACGAC |
| OKB | ACCGCTGATTCGCTACAGGA | GATGCCAGCGAGGTTTCTTC |
| PDF | TGCCCAAAGTCTGGCTTGAT | ATCACCATCCGTCCAGTTGC |
| Pro | CCAAAGCAAGGAACGACAGC | CGCCGCGAACTTTCTTAAAATC |
| PTTH | ACGGTCATCATGTGCTGTGG | TCCAATGCCATCTTCTGCAC |
| RY | GGCGTTGATGACGCTAGTGG | TGCCCATGAAGAACCTGTCA |
| sNPF | TCAGCGTCAGCGGACAGTAG | AGCTCGTCCATTTCGTCCAA |
| SIF | TGCGTGCAGTATTCGCCATA | ATGCCTGGCAGGTTTCTGAG |
| IMF | TCGCTGGTGCAAGTAGGAGA | TGGGAACCAAGCTTGACAGG |
| SK | CTATCTTCGCGTTGGCAGTG | AGATGGCCGTAGTCGTCGAA |
| TK | TGGGCAAGAGGTGCAGAAAC | GCATGCCTACGAAGCCCATA |
| TR | CCTGTGGCAGTGAGTGCTTG | TACCACGGTTCAGCCACAGA |
| EF-1 | AAATCGGCGGTATTGGTACG | AAGGGGAGGGAATTCTTGGA |
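
A primer set like the one in Table S2 can be sanity-checked by computing each oligonucleotide's length, GC content, and a rough melting temperature. The sketch below is illustrative only. The function names are hypothetical, and the Wallace rule, Tm = 2(A+T) + 4(G+C), is a coarse rule of thumb for short oligonucleotides; it is not stated to be the method the authors used for primer design. The EF-1 reference primers from the table serve as example input.

```python
def gc_content(primer: str) -> float:
    """Return the fraction of G and C bases in a primer (0.0 to 1.0)."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C from the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C). This is only an approximation and
    ignores salt concentration and nearest-neighbour effects.
    """
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# EF-1 reference gene primers as listed in Table S2.
EF1_PRIMERS = {
    "forward": "AAATCGGCGGTATTGGTACG",
    "reverse": "AAGGGGAGGGAATTCTTGGA",
}

if __name__ == "__main__":
    for direction, seq in EF1_PRIMERS.items():
        print(direction, len(seq), f"{gc_content(seq):.2f}", wallace_tm(seq))
```

A forward and reverse primer with similar estimated melting temperatures are generally easier to run under a single annealing condition, which is why such a check is often applied pairwise.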

Table S3 Primers used for the qRT-PCR analysis of the neuropeptide receptor genes.

| Primers | Forward primer (5’-3’) | Reverse primer (5’-3’) |
| --- | --- | --- |
| A1 | GCTTCGCTACTCCAAAATGC | GGTGGCAGACCGCTATGTAT |
| A2 | TTGGCGAAGTAGGATGTCGTC | AATGCCACAGCCCAGATCAC |
| A3 | TTAACCGTCGTGGGATACGTC | CGACGGTTGTGTGAATGTCG |
| A4 | TTCTCGCAGAACAGGAAGGT | GTAGAGGCCAGTGCTTCGTC |
| A5 | GTGGCGGATTTCATGGTGAT | GGAAGCAGATGGCGTACCAG |
| A6-A | AATGAGAAGCGAAGGGCACA | ATCAAAACGCTTCGCCTCAA |
| A6-B | ATCGCCAAGAACCTCATGGA | CGCTTTAAATGGCAGCAAGC |
| A7 | GCTGGAACAGCTAGGAGTGG | CCACAGTCAGCAAGTCCTCA |
| A8 | ATGTCGGTTCTACGGGAGCA | AGTAGCCATGGATGCGGTTG |
| A9 | GGGAATGCGCCATGAGATAC | CGGAGTCCAGCTTCTCCTCA |
| A10 | TTTGTTGCGCATTTCATAGC | AACGGTTTCATTACCGTTGC |
| A11 | CGTACGTCCAAGGAACCAGT | TGACCTATCTGAGGGCCAAC |
| A12 | GGATCTTGCTGCTGCTTGCT | TAGCGGAGTTTGCATGACCA |
| A13 | TGGACATGTTGGCGTTGTTC | TTGTGGGCTCCATCTTGGTC |
| A14 | CCTGGTTCCGGTAATGTTCG | CGGCACGCATATCAAGATCA |
| A15 | TGTGTTCGTGAGGCACAAGG | CACTGACACCCCAATGCTGA |
| A16 | GAGAAAAGCTGCCAAAATGC | TGATCGCTGAGTTGGCATAG |
| A17 | CCTCGGCAACGTCCTCATAG | CCAGTATGCCACTGCGTTCA |
| A18 | TCATCGTGTCGTCCGTGTTC | GCGCAAAGTTTATGCCGAAG |
| A19 | TCAGCCAGGGACAACAGGAT | TGACTCGCGAACCACACGTA |
| A20 | TTACGGCACACAACGAATGC | AAGTCTCGCCGAGTCAGCAC |
| A21 | GTGGTCTGCGGGAAACGTAG | GACCAATTTGCCGCTTCTTG |
| A22 | TGATCGTGGTGGCTGTATGG | AGCGCAAACTGGAGAGCAAG |
| A23 | CCACATCGAGACGAATGCAG | GGCAGAAAGGGCACATGAAG |
| A24 | TGCCCATAATGTCCATGTCG | ACCGCGAATATCACCACCAC |
| A25 | TGCTTAGCCTCCGTGCTAGA | ACTACAGCCCCATCCTGACG |
| A26 | ATAGCTGCCCTGCTCTGCAC | ATCACCACAGCCTGGAGGAA |
| A27 | CCGAGCGTTGAGGATAGTGG | CAAGCTGCTCAGCTCGCATA |
| A28 | GTTTGGCGATTGCGAGAAAG | CGCATAGGCAGAACACGTTG |
| A29 | TTGTAGCCTGCCACAGAGCA | TCTCGCAGAAGATGCACACG |
| A30 | CTCGGCAAGAGAGCCACTGT | TACGGCGACCAGCATAGGAC |
| A31 | CTCGTGGTGACGTTGTCGAG | GTAGCTGAGCGAGTGCACGA |
| A32 | ACGTACGCGGTTCCAATGTT | GCCAGCAGATGCCGAATATC |
| A33 | GAAATGGGCGCACCTTAACA | GAGCGTGTTTCCCAAGATGG |
| A34 | TTACTGGAGGGACCGGTTGA | AACAGCACCAAGAGCCGAAC |
| A35 | CACGCTGCTCATCTTCATCG | CCTCGCCGAAGTACCAGATG |
| AKHR | CGCCTGGATGTACTCCTCAT | GCTATGCAAATCAGCACGAA |
| AstAR | AGTATCAGCGGAAAGCAGGA | TCTGAAAGCGACCCTGAAGT |
| DpHR | GGCTTGGCATCAGAAACGTC | GCTTGTGGTAGTGCCAGTGC |
| RFaR | CGTGTACCTGACGCTCATCG | TCGACGCTCACGTACGTCTC |
| MSR | ACGGGATTAGCAGTCGCTGA | GGGTGATTGTCAACCAGATCG |
| PBANR | TTGTGGACATACCGCCAGTG | CGACGCAGGTGCTTATGTTG |
| SIFR | CTGGATCTTCGCCGTACTCG | CCGTGGGGAGGATGTAACAG |
| SPR | TCGTGGATCGGCAGTATGTG | ACAGGGGATCAAGTGGACGA |
| B1 | GCCCTGTCCAGACTTTGTGC | CAGCGACACGCTGTAACCTG |
| B2 | TGCCGTATCTTTGCGAACTC | CCCACCCCACAACGTAGTTT |
| B3 | GCAACGCTGTGTTCGTATGC | AGCACGAGTGGTCGTTGTGA |
| B4 | GTCATCGGGTTCGACTCCAG | AAGCAGTGCTGCCACAGACA |
| DHR | GCGATGAGGTGGAAGAATGC | GGGTGGCATTTTGTGTGTCA |
| LGR1 | GTGTACTCCGAAGCCGAACG | GAGGTGGCACATGAGGAAGC |
| LGR2 | AGGCGCATAACAACCCACAC | AACACGAGGTCGCTGAGTCC |
| EF-1 | AAATCGGCGGTATTGGTACG | AAGGGGAGGGAATTCTTGGA |

Figure S1 Alignment of natalisins (NTLs) predicted from C. suppressalis (Cs), B. mori (Bm), and D. melanogaster (Dm). Identities are highlighted in red, and similarities are indicated in gray. The red asterisks mark the characteristic C-terminal motifs of NTL (FXXXRa, YXXXRa).
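
The two NTL motifs named in this caption can be expressed as a simple pattern match on the C-terminus of a mature peptide. The following is a minimal sketch under stated assumptions: the lowercase "a" in FXXXRa is taken to denote C-terminal amidation and is therefore not part of the residue string, the function name `classify_ntl` is hypothetical, and the test peptides are made-up strings, not sequences from the alignment.

```python
import re
from typing import Optional

# F or Y, then any three residues, then a C-terminal R.
# The trailing "a" (amidation) is a modification, not a residue letter,
# so it is not matched against the sequence itself.
NTL_MOTIF = re.compile(r"([FY])[A-Z]{3}R$")


def classify_ntl(peptide: str) -> Optional[str]:
    """Return 'FXXXRa' or 'YXXXRa' if the peptide ends in that motif, else None."""
    match = NTL_MOTIF.search(peptide.upper())
    if match is None:
        return None
    return match.group(1) + "XXXRa"


if __name__ == "__main__":
    # Hypothetical peptides, for illustration only.
    for candidate in ("GSAFAAAR", "GSAYAAAR", "GSAFAAAK"):
        print(candidate, classify_ntl(candidate))
```

Anchoring the pattern with `$` restricts the match to the C-terminus, which is where the caption places the conserved residues.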

Figure S2 Sequence alignment of CAPA (PK1) (A) and DH/PBAN (PK2) (B) from C. suppressalis (Cs), B. mori (Bm), N. lugens (Nl), T. castaneum (Tc), and D. melanogaster (Dm). Identities are highlighted in dark red, and similarities are indicated in gray. The solid boxes indicate the mature peptides. PVK, periviscerokinin; CPPB, CAPA precursor peptide B; PK, pyrokinin.

Figure S3 Protein alignment of diapause hormone/PBAN (DH/PBAN) precursor sequences in moths, including C. suppressalis (Cs), B. mori (Bm), Plutella xylostella (Px), M. sexta (Ms), H. zea (Hz), H. armigera (Ha), S. exigua (Se), and Agrotis ipsilon (Ai). SGNP, subesophageal ganglion neuropeptide. Identities are highlighted in dark red, and similarities are indicated in gray.

Figure S4 Schematic representation of the highly conserved ITP gene structure, which contains exons arranged in tandem and gives rise to alternative mRNA splice forms that encode the common and the distinctive parts of the short (yellow, ITP) and long (red, ITPL) peptide isoforms. The red arrow marks the stop codon of ITPL. SP, signal peptide; IPRP, ITP precursor-related peptide.

Figure S5 Protein alignment of the ITP and ITPL precursor sequences from C. suppressalis (Cs), B. mori (Bm), M. sexta (Ms), T. castaneum (Tc), A. mellifera (Am), S. gregaria (Sg), A. aegypti (Aa), and D. melanogaster (Dm). Identities are highlighted in dark red, and similarities are highlighted in light color. The signal peptide, ITP precursor-related peptide (IPRP), and mature peptide are separated by dashed lines. The red asterisks mark the conserved cysteine residues, and three disulfide bridges are indicated.

Figure S6 Sequence alignment of the allatostatin double C (AstCC) from C. suppressalis (Cs), B. mori (Bm), N. vitripennis (Nv), A. mellifera (Am), N. lugens (Nl), A. pisum (Ap), T. castaneum (Tc), and D. melanogaster (Dm). Identities are highlighted in dark red, and similarities are indicated in gray. The solid box indicates the mature peptide, and the dashed boxes indicate Lys and Arg residues that may form convertase cleavage sites or be removed by carboxypeptidases.
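
Candidate convertase cleavage sites of the kind boxed in this figure are commonly located by scanning a precursor for pairs of adjacent basic residues (Lys or Arg). The sketch below illustrates that scan only; it is not the prediction procedure used for the figure. The function name `dibasic_sites` and the input sequence are hypothetical, and real convertase recognition depends on sequence context that this pattern ignores.

```python
import re
from typing import List, Tuple

# Lookahead so that overlapping pairs (e.g. "RRK" gives RR and RK) are all reported.
DIBASIC = re.compile(r"(?=([KR]{2}))")


def dibasic_sites(precursor: str) -> List[Tuple[int, str]]:
    """Return (0-based position, residue pair) for every KK, KR, RK, or RR pair."""
    return [(m.start(), m.group(1)) for m in DIBASIC.finditer(precursor.upper())]


if __name__ == "__main__":
    # Made-up precursor fragment, for illustration only.
    print(dibasic_sites("MAAKRGGSRRKAA"))
```

Positions are reported as 0-based string indices, so a downstream step that maps them onto an alignment would need to account for gaps and for 1-based residue numbering.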

Figure S7 Protein alignment of novel alternative splicing variants of three neuropeptide precursor genes from C. suppressalis. (A) AstCC; (B) CCH1; (C) sNPF. The dashed boxes indicate the signal peptides, the solid red boxes indicate the mature peptides, and the solid blue box indicates an additional mature peptide of sNPF. Red text marks the differences between the two splicing variants.

Figure S8 Alignment of CCHamides predicted from C. suppressalis (Cs), B. mori (Bm), N. lugens (Nl), A. mellifera (Am), T. castaneum (Tc), and D. melanogaster (Dm). Identities are highlighted in dark red, and similarities are indicated in gray. The red asterisks mark the conserved residues from which the name CCHamide derives.

Figure S9 Alignment of short neuropeptide F (sNPF) peptides from C. suppressalis (Cs), B. mori (Bm), A. mellifera (Am), and D. melanogaster (Dm). Identities are highlighted in dark red, and similarities are indicated in gray.

Figure S10 Protein alignment of proctolin (Pro) precursor sequences from C. suppressalis (Cs), B. mori (Bm), N. lugens (Nl), T. castaneum (Tc), and D. melanogaster (Dm). Identities are highlighted in dark red, and similarities are indicated in gray. The solid boxes indicate the mature peptides. The arrows indicate potential cleavage sites.