ISSN 0026-8933, Molecular Biology, 2015, Vol. 49, No. 5, pp. 755–761. © Pleiades Publishing, Inc., 2015. Original Russian Text © P.S. Vorozheykin, I.I. Titov, 2015, published in Molekulyarnaya Biologiya, 2015, Vol. 49, No. 5, pp. 846–853.

BIOINFORMATICS UDC 577.2

Web Server for Prediction of miRNAs and Their Precursors and Binding Sites

P. S. Vorozheykin (a) and I. I. Titov (a, b)

(a) Novosibirsk State University, Novosibirsk, 630090 Russia
(b) Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, 630090 Russia; email: [email protected]

Received December 27, 2014; in final form, February 27, 2015

Abstract: A microRNA (miRNA) is a small noncoding RNA molecule about 22 nucleotides in length. The paper describes a web server for predicting miRNAs and their precursors and binding sites. The predictions are based on either sequence similarity to known miRNAs of 223 organisms or context-structural hidden Markov models. The proposed methods of predicting human miRNAs and pre-miRNAs are shown to outperform the existing ones in accuracy. The average deviation of predicted 5'-ends of human miRNAs from actual positions is 3.13 nt when one pair of complementary miRNAs (miRNA–miRNA* duplex) is predicted. A useful option of our application is the prediction of an additional miRNA pair. In this mode, the pairs closest to the actual miRNAs deviate by 1.61 nt on average. The proposed method also performs well in predicting mouse miRNAs. Binding sites for miRNAs are predicted by two known approaches, based on complementarity and on thermodynamic stability of the miRNA–mRNA duplex, and by a new approach, which takes into account the competition of miRNAs for the site. The role of the secondary structure in miRNA processing is considered. The web server is available at http://wwwmgs.bionet.nsc.ru/mgs/programs/rnaanalys/.

DOI: 10.1134/S0026893315050192

Keywords: miRNA, pre-miRNA, binding site, secondary structure, hidden Markov model, partition function

MicroRNAs (abbreviated miRNAs) are small RNA molecules, about 22 nt in length, that regulate gene expression at the posttranscriptional step [1, 2]. The number of annotated miRNAs constantly increases, and new methods for predicting novel miRNAs, their precursors, and binding sites are being developed.

The simplest computational methods of predicting miRNAs and their precursors are based on the detection of similarity between candidates and known sequences [3, 4]. Additional criteria are secondary structure conservation in pre-miRNAs and features of the primary and secondary structures (MiRscan [5], miRseeker [6], miRAlign [7], Vmir [8], miRPara [9], miRNAFold [10], etc.).

The ab initio search for miRNAs and pre-miRNAs follows two major approaches. One of them is based on support vector machines and decision trees, which recognize miRNAs and pre-miRNAs using sets of features of the primary and secondary structures: miRabela [11], Triplet-SVM [12], RNAmicro [13], MiRFinder [14], microPred [15], Virgo [16], MaturePred [17], MiRmat [18], etc. The other group of methods involves construction of probabilistic miRNA or pre-miRNA models based on naive Bayes classifiers (BayesMiRNAfind [19], matureBayes [20], etc.), stochastic context-free grammars (CID-miRNA [21]), or hidden Markov models (HMMs) (ProMiR [22], miRRim [23], SSCprofiler [24], etc.). These methods use information on the location of a miRNA in a precursor and the statistics of sequences and secondary structures.

The simplest methods of predicting binding sites for miRNAs in mRNAs are based on the complementarity between the 5'-end of a miRNA and its target mRNA and on the calculation of the RNA–RNA duplex stability. They are implemented in DIANA-microT [26], RNAhybrid [27, 28], etc. The accuracy of predictions is improved by taking into account the conservation of candidate site sequences, the local secondary structure of the mRNA, and the nucleotide composition of the duplex, as in miRanda [29], TargetScan [30], etc. [31, 32]. Machine learning approaches use sequences of known miRNA sites from various species. The program packages miTarget [33], MirTarget2 [34], miREE [35], etc. are based on the SVM method and parameters of known miRNA–mRNA duplexes. A physicochemical approach to calculating the equilibrium degree of binding between a miRNA and its site is proposed in [36].

The above-listed methods enabled the prediction of many new pre-miRNAs, miRNAs, and their binding sites. The results are presented as freely accessible databases [34, 37–41].

We present a web server for prediction of pre-miRNAs, miRNAs, and their binding sites. The server consists of three applications. The first seeks sequences similar to known miRNAs from 223 species. The second predicts human miRNAs and pre-miRNAs ab initio based on stochastic models and regularities of the secondary structures of miRNA precursors. The third predicts miRNA binding sites based on the full complementarity of positions 2–8 at the 5'-end of a miRNA to the binding site and/or thermodynamic parameters of the miRNA–mRNA duplex, with regard to competition of miRNAs for the site.

Our method outperforms the current methods in the quality of miRNA and pre-miRNA prediction in humans. The mean deviation of the predicted 5'-ends of human miRNAs is 3.13 nt in the mode of predicting one miRNA–miRNA* duplex. Sometimes the predicted miRNA–miRNA* pair is far from the actual miRNAs; therefore, we provide the option of finding an additional pair. This option reduces the deviation of the predicted 5'-ends from the actual miRNAs to 1.61 nt for the best predicted pairs. The proposed method can be applied to miRNA prediction in mice. The role of the secondary structure of pre-miRNAs in miRNA recognition and maturation is discussed.

Abbreviations: SNP, single-nucleotide polymorphism; nt, nucleotide.

EXPERIMENTAL

Data. Secondary structures and energies of pre-miRNAs were calculated with GArna software [42]. For training and testing hidden Markov models and seeking miRNAs by similarity, sequences were extracted from the miRBase database, release 21.0 [43, 44]. The rate of false pre-miRNA rejection (type I error) was evaluated with the negative sample from [12]. Two test sequences [10] were used for comparison with the existing methods of pre-miRNA prediction. Comparison with existing methods of miRNA prediction was done with the test and training samples of human pre-miRNAs from [17].

Structure of the Web server.
The server includes three computational programs. One of them is designed for ab initio search for human pre-miRNAs and prediction of miRNA–miRNA* duplexes by hidden Markov models. Another program searches a base of experimental data for sequences similar to miRNAs. The third calculates miRNA binding sites in mRNAs by three methods. The server is available at http://wwwmgs.bionet.nsc.ru/mgs/programs/rnaanalys/. It also provides a detailed description of the implemented models and algorithms.

Input and output data. The input data must contain only characters A(a), C(c), G(g), and U(u)/T(t). To find homologs of a miRNA, the user enters its sequence or selects it from the list of known miRNAs from 223 species. When predicting miRNAs ab initio, the user chooses the number of variants of miRNA pairs (one or two) and additional parameters to filter pre-miRNAs: G+C content, HMM threshold, characterization of the nucleotide composition (E-score; the equation for its calculation is presented on the server), and free energy. In seeking binding sites, the user specifies the method and parameters of the search: the binding free energy threshold for the RNA–RNA duplex or the concentration parameter for constructing the probability profile.

The program for predicting miRNAs and their precursors outputs the secondary structure of a detected pre-miRNA and the sequences of predicted miRNA–miRNA* duplexes. The program for seeking miRNA homologs outputs the homolog sequence and the position of its start in the RNA sequence. The program for predicting binding sites constructs a plot of the probability of the binding of a chosen miRNA to a specified mRNA or outputs nucleotide sequences of binding sites with their starting positions.

RESULTS

The program for seeking homologs of human miRNAs is the same as we employed before [45, 46]. It does not differ from similar programs. Two of the three methods implemented in the program for binding site prediction are also similar to existing algorithms. Therefore, we dwell on the results of our ab initio prediction of human pre-miRNAs and miRNAs and compare them with current computational methods.

The quality of human pre-miRNA prediction was assessed by 5-fold cross-validation in samples of known human pre-miRNAs (miRBase, release 21.0) and pseudo-miRNAs [12]. The hidden Markov model threshold K was used as the criterion for pre-miRNA classification. The proposed model generates any nucleotide sequence with some probability P, which is calculated in the model. A candidate of length L is classified as a pre-miRNA at −log(P)/L < K.
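The classification rule above can be illustrated with a minimal Python sketch. It is not the server's code: the function names are hypothetical, the logarithm is assumed to be natural (the base is not stated in the text), and log(P) is assumed to be supplied by some external HMM implementation. Only the rule −log(P)/L < K and the default threshold K = 1.93 reported for the server come from the paper.

```python
DEFAULT_K = 1.93  # default HMM threshold reported for the server


def length_normalized_score(log_p: float, length: int) -> float:
    """Return -log(P)/L for a candidate sequence of the given length.

    log_p is the log-probability of the candidate under the model
    (assumed natural log; working in log space avoids underflow).
    """
    if length <= 0:
        raise ValueError("length must be positive")
    return -log_p / length


def is_pre_mirna(log_p: float, length: int, k: float = DEFAULT_K) -> bool:
    """Classify a candidate as a pre-miRNA when -log(P)/L < K."""
    return length_normalized_score(log_p, length) < k


if __name__ == "__main__":
    # Hypothetical candidates: (log P, length). The values are made up.
    for log_p, length in [(-150.0, 80), (-160.0, 80)]:
        score = length_normalized_score(log_p, length)
        print(f"score={score:.3f} accepted={is_pre_mirna(log_p, length)}")
```

With these made-up numbers the first candidate (score 1.875) passes the default threshold and the second (score 2.0) does not; raising K to 2.1 would accept both, which is the trade-off between false positives and false negatives controlled by K.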
An increase in K increases the number of false pre-miRNA predictions and decreases the number of false negatives. The default value of the threshold was chosen to be K = 1.93. It corresponds to 11% errors of types I and II. The curve with check results is shown on the server (http://goo.gl/tmjN11).

Most methods predict only the 5'-end of a miRNA and set the 3'-end at a certain distance from it [17, 20]. Similar to ProMiR [22], our program predicts both miRNA ends separately. However, to assess the quality of miRNA prediction, we calculate the mean deviation and the distribution of absolute deviations for the predicted 5'-ends of the miRNA in the precursor. The 5'-end of a miRNA affects the position of the binding site in mRNA and, thereby, miRNA functions. Therefore, accurate prediction of this end is more important than the prediction of the 3'-end.

Fig. 1. Errors of prediction of miRNA 5'-ends in two tests. Test 1: 5-fold cross-validation. Test 2: prediction of miRNAs in new sequences added to releases 19–21 of miRBase. X axis (Error, nt; 0–3) shows absolute errors of miRNA end prediction with reference to actual positions; Y axis (Percentage; 0–100) shows the percentage of ends predicted with errors not exceeding specified values. Curves: one miRNA–miRNA* pair (Test 1); two miRNA–miRNA* pairs (Test 1); one miRNA–miRNA* pair (Test 2); two miRNA–miRNA* pairs (Test 2).
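The quantities plotted in Fig. 1 (the share of ends predicted within a given error) and the mean deviation can be computed from lists of predicted and actual 5'-end positions. The Python sketch below is an illustration under stated assumptions, not the evaluation code used in the study: the function names and the toy coordinates are hypothetical, and in the two-pair mode the error for each end is assumed to be the smaller of the two deviations, i.e., the pair closest to the actual miRNA is counted.

```python
from typing import List, Sequence


def abs_errors_one_pair(predicted: Sequence[int], actual: Sequence[int]) -> List[int]:
    """Absolute deviation of each predicted 5'-end from the actual position."""
    return [abs(p - a) for p, a in zip(predicted, actual)]


def abs_errors_two_pairs(
    first: Sequence[int], second: Sequence[int], actual: Sequence[int]
) -> List[int]:
    """Two-pair mode: keep the smaller deviation of the two predictions."""
    return [min(abs(p1 - a), abs(p2 - a)) for p1, p2, a in zip(first, second, actual)]


def mean_error(errors: Sequence[int]) -> float:
    """Mean absolute deviation in nucleotides."""
    return sum(errors) / len(errors)


def cumulative_percentage(errors: Sequence[int], max_error: int = 3) -> List[float]:
    """Percentage of ends with error <= t for t = 0..max_error (Y axis of Fig. 1)."""
    n = len(errors)
    return [100.0 * sum(e <= t for e in errors) / n for t in range(max_error + 1)]


if __name__ == "__main__":
    # Toy positions, invented for illustration only.
    actual = [10, 25, 40, 7]
    first = [10, 27, 45, 6]
    second = [12, 25, 41, 7]
    one = abs_errors_one_pair(first, actual)
    two = abs_errors_two_pairs(first, second, actual)
    print(mean_error(one), cumulative_percentage(one))
    print(mean_error(two), cumulative_percentage(two))
```

On the toy data the one-pair errors are 0, 2, 5, and 1 nt (mean 2.0 nt) and the two-pair errors are 0, 0, 1, and 0 nt (mean 0.25 nt), which shows why an additional candidate pair can only lower the measured error.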

The method was validated by 5-fold cross-validation and by prediction of sequences of new miRNAs added to releases 19–21 of miRBase [43, 44]. In the mode of predicting two miRNA pairs, errors were calculated as the minimal errors over both pairs, separately for the 5'- and 3'-branches of the precursor. Figure 1 shows that the levels of prediction errors are similar in both tests. This indicates the absence of overfitting in the hidden Markov model with 5-fold cross-validation. The mean prediction error is 3.13 nt when one miRNA–miRNA* sequence pair is predicted and 1.61 nt when two pairs are predicted (Fig. 1, test 1). In the modes of seeking one and two miRNA–miRNA* sequence pairs, the prediction error does not exceed 3 nt for 71.2 and 89.98% of ends, respectively. Thus, the prediction of a suboptimal miRNA–miRNA* sequence pair refines the prediction of miRNAs. The option is recommended when a user needs better miRNA prediction quality without extending the range of candidate sequences considerably. Prediction errors are tabulated on the server (http://goo.gl/7avpcv).

We compared our method with the following programs for the ab initio prediction of pre-miRNAs in a genomic sequence: miRNAFold [10], miRPara [9], CID-miRNA [21], SSCprofiler [24], and Vmir [8]. For these programs, source codes and/or Web servers are available. The tests were carried out with the two test sequences proposed in [10], artificial and genomic. The parameters for classifying candidate precursors in our method were the threshold for the hidden Markov model and the secondary structure energies of candidates. Sensitivity and specificity were used to assess prediction quality. The numerical results of the comparison are shown on the server (http://goo.gl/wcFHIe). Of the aforementioned programs, miRNAFold showed the best results. Although SSCprofiler showed higher sensitivity in the genomic sequence, it was not sufficiently specific.
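For readers who want the two measures in explicit form, the following Python sketch uses the standard definitions, sensitivity = TP/(TP + FN) and specificity = TN/(TN + FP). The paper does not spell out which definitions were applied to the test sequences of [10], so the formulas, the function names, and the toy labels are assumptions made for illustration only.

```python
from typing import Sequence, Tuple


def confusion_counts(
    truth: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) for paired truth/prediction labels."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Share of true pre-miRNAs that were detected."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of non-pre-miRNA candidates that were correctly rejected."""
    return tn / (tn + fp)


if __name__ == "__main__":
    # Toy labels (True = pre-miRNA), invented for illustration only.
    truth = [True, True, True, False, False, False, False, False]
    predicted = [True, True, False, True, False, False, False, False]
    tp, fp, tn, fn = confusion_counts(truth, predicted)
    print(sensitivity(tp, fn), specificity(tn, fp))
```

A method can reach high sensitivity simply by accepting many candidates, which is why the two measures are reported together.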
The test shows that our method has better prediction quality in both test sequences than miRNAFold, which was the best of the methods with which the comparison was made. The introduction of the additional threshold of secondary structure energy increases the selectivity in the artificial and genomic sequences by factors of 2 and 1.5, respectively.

DISCUSSION

The canonical pathway that produces most animal miRNAs includes several steps [1, 2, 47–49]. At each step, the processing is influenced by the primary and secondary structures. Known examples of the influence of pre-miRNA structure on the processing are presented below in the order of miRNA maturation steps. We supplement these examples with our own data on the properties of pre-miRNAs that may influence maturation.

The first step is the transcription of the primary transcript (pri-miRNA), from which hairpin fragments of miRNA precursors (pre-miRNAs) are excised by the Drosha–Pasha/DGCR8 complex. The processing of pri-miRNA is regulated by both the primary and secondary structures. First, the interaction between a pri-miRNA near the hairpin of the precursor and the mature miRNA may affect the rate of pri-miRNA processing, as shown for the pairs pri-miR-15/16-1 + miR-709 and pri-let-7 + let-7 [50, 51]. Second, transcript processing is regulated by the secondary structure of pri-miRNA through binding of proteins to the pre-miRNA stem (proteins SMAD, SF2/AF, hnRNPA1 for miRNAs miR-7, miR-105, miR-199a, and others) or to the hairpin loop of the pre-miRNA (proteins hnRNPA1, KSRP, Lin28, MBNL1 for miRNAs miR-7a, miR-18a, miR-21, miR-105, and others) [52]. Moreover, about 14% of human pre-miRNAs contain conserved nucleotides in their hairpin loops [53]. This indicates the evolutionary conservation of regulation by the secondary structure. Third, it has been found for some human pri-miRNAs that the secondary structure in the vicinity of sites of cleavage by Drosha can also accelerate or arrest the processing, and alteration of the secondary structure in the precursor hairpin may even shift the positions of transcript cleavage and thereby change the final product [54].

Then the pre-miRNA is transported from the nucleus to the cytoplasm by the Exportin-5 protein. The hairpin structure of the precursor and the overhanging 3'-end of the hairpin are essential for Exportin-5 to recognize and transport the molecule [55].

At the third step of processing, the Dicer enzyme excises the miRNA–miRNA* duplex, about 22 nt in length, from the pre-miRNA. Precursor cleavage often results in duplexes with two overhanging 3'-ends. As a result, miRNA sequences in the duplex are shifted with reference to each other [1, 49, 56]. Such duplexes occur in 70% of our set of pre-miRNAs with two functional branches, and the mean length of the overhanging ends is about 2 nt. This observation is indicative of the conservation of miRNA lengths during maturation.

As in pri-miRNA processing, the binding of proteins to specific sites in the hairpin loop of a precursor may facilitate or arrest hairpin cleavage and even cause pre-miRNA degradation. Examples are proteins Lin28, KSRP, and TDP-43 for precursors let-7, mir-107, mir-143, mir-200c, and others [52, 57].

The stability of the pre-miRNA stem depends on the miRNA duplex position in the stem. Flanking miRNA regions (1–6 nt) are generally less stable than the corresponding regions inside the miRNA [58]. Boundaries of miRNAs in pre-miRNAs are most often detected in close proximity to loops in the secondary structure [58]. The frequency of loops in our set of human pre-miRNAs is elevated at positions close to miRNA ends (Fig. 2). Loops at boundaries probably mark sites for cleavage by Dicer. However, we found no association between the thermodynamic stability of miRNA duplexes and miRNA expression rates.

Fig. 2. Frequencies of loops in human miRNAs and adjacent regions. Positive numerals indicate positions within miRNAs. (a) Alignment of miRNAs with reference to the boundary at the hairpin base; (b) alignment of miRNAs with reference to the boundary near the loop. X axis: position no. (from −2 to 21 in panel a and from 21 to −2 in panel b); Y axis: frequency (0–0.4).
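A position-wise loop frequency of the kind shown in Fig. 2 can be sketched in Python as follows. This is a hedged illustration rather than the authors' procedure: it assumes that secondary structures are given in dot-bracket notation, that every unpaired base (".") counts as a loop position, and that the 0-based start of the miRNA in each precursor is known. The function name, the parameters, and the toy structures are hypothetical. Positions are numbered as on the figure axis: 1, 2, ... inside the miRNA and −1, −2, ... in the flank, with no position 0.

```python
from collections import Counter
from typing import Dict, Sequence


def loop_frequency(
    structures: Sequence[str],
    mirna_starts: Sequence[int],
    flank: int = 2,
    length: int = 21,
) -> Dict[int, float]:
    """Fraction of precursors with an unpaired base at each miRNA-relative position.

    structures   -- dot-bracket strings, '.' marks an unpaired (loop) base
    mirna_starts -- 0-based index of the first miRNA nucleotide in each precursor
    flank        -- number of flanking positions counted before the miRNA
    length       -- number of miRNA positions counted from its first nucleotide
    """
    unpaired: Counter = Counter()
    covered: Counter = Counter()
    for structure, start in zip(structures, mirna_starts):
        for offset in range(-flank, length):
            index = start + offset
            if not 0 <= index < len(structure):
                continue  # the precursor does not extend this far
            # 1-based inside the miRNA, negative in the flank, no position 0
            position = offset + 1 if offset >= 0 else offset
            covered[position] += 1
            if structure[index] == ".":
                unpaired[position] += 1
    return {pos: unpaired[pos] / covered[pos] for pos in sorted(covered)}


if __name__ == "__main__":
    # Two invented toy hairpins; real precursors are much longer.
    toy_structures = ["..((((....))))..", ".(((.(...).)))."]
    toy_starts = [2, 1]
    print(loop_frequency(toy_structures, toy_starts, flank=2, length=4))
```

The sketch corresponds to the alignment by one miRNA boundary; aligning by the opposite boundary, as in the second panel of the figure, would only change how the relative position is computed.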
At the fourth step, the miRNA–miRNA* duplex binds to one of the Argonaute proteins and is split into two branches. One of them is stabilized to participate in RNA interference, and the other is degraded. In Drosophila, the secondary structure in the middle of the miRNA–miRNA* duplex determines the Ago protein species and thereby affects miRNA processing at this step [59]. Our observations indicate that about one-half (50.5%) of all human pre-miRNAs have one functional branch each, whereas precursors of the other half can generate miRNAs from both the 5'- and 3'-branches. Which of the two branches is functional? It is accepted that RNA interference involves the duplex branch possessing the less stable 5'-end [60]. It is likely that the choice of the functional branch is also affected by loops at miRNA positions 9–13 (Fig. 2). As a result, the miRNA from the duplex branch less shielded by secondary structure is more prone to degradation. In addition, it is conjectured that loops in the secondary structure make the miRNA longer in accordance with loop sizes [61]. This conjecture is in agreement with our observations that functional 5'- and 3'-branches of human pre-miRNAs are approximately equivalent in both frequency (74.8 and 74.7% for 5'- and 3'-branches, respectively) and lengths of miRNAs in them (21.6 and 21.7 nt for 5'- and 3'-branches, respectively).

Some miRNAs are produced by noncanonical pathways. Most of them are mirtrons [56]. These miRNAs skip the step of primary transcript processing, and their precursors form from introns by splicing and truncation of RNA ends [62]. Another noncanonical pathway of miRNA biogenesis occurs in the miR-451 family, where the aforementioned third step is replaced by catalytic cleavage of pre-miRNAs by the Ago2 protein followed by involvement of the 5'-branch of the precursor in RNA interference [63]. The ratio of the canonical and noncanonical pathways in miR-451 depends on the length of the pre-miRNA hairpin [64]. This is an additional argument for the importance of the precursor secondary structure for miRNA biogenesis.
Mutations in pre-miRNAs can alter their secondary structures and thereby affect their processing. However, most replacements are found near miRNA ends and in the central portion [65]. This fact agrees with the location of miRNA loops observed in our study (Fig. 2). Single-nucleotide polymorphisms (SNPs) are of special interest among mutations. It is expected that, owing to their short range, they should exert a weak action on the secondary structures of pre-miRNAs, altering their shapes and free energy values rather than entirely disrupting them. SNPs are less frequently noted in pre-miRNAs than in flanking sequences [66]. They occur even less frequently at miRNA positions 2–8 (complementary to the 3'-untranslated region of the mRNA). In general, an SNP can alter the expression rate and function of the miRNA [67]. Thus, the range of variability of a miRNA precursor indicates that secondary structure conservation is essential for miRNA processing.

In addition to the influence on processing, mutations can modify miRNA functions [68]. It is conceivable that in the miR-548 family [46], mutations shift miRNA borders in the pre-miRNAs and change the targets. Positions 2–8 may be conserved because miRNAs and their functions should have been conserved in the course of evolution [66]. Just as mutations in miRNAs affect their functions, changes in the nucleotide sequence of the target may generate new binding sites or disrupt existing ones [67].

With regard to all these observations, it is no wonder that nearly all methods of miRNA prediction use information on the secondary structure and its parameters. Consideration of this information in search methods improves the quality of prediction of miRNAs and their precursors [22, 69].

We combine Web applications for prediction of distant and close homologs of human miRNAs, their precursors, and binding sites. Close homologs are predicted by alignment against known miRNAs of 223 species (miRBase, release 21.0).
Ab initio search for human pre-miRNAs and their distant miRNA homologs is done by using statistical classifiers, namely, context-structural hidden Markov models; information on the primary and secondary structures of pre-miRNAs; and detected regularities of the location of miRNAs in their precursors. The implemented methods allow highly precise detection of pre-miRNAs in an arbitrary nucleotide sequence and prediction of miRNA–miRNA* duplexes in candidates. The mean error of the prediction of miRNA boundaries is 3.13 nt in the mode of detection of one pair and 1.61 nt in the detection of two miRNA–miRNA* pairs.

The program predicting binding sites calculates candidate miRNA binding sites in the RNA sequence and constructs a plot that shows the probability of miRNA binding to each position in the RNA. The methods are based on the calculation of complete complementarity of positions 2–8 at the 5'-end of a miRNA to the site sequence and the thermodynamic stability of the miRNA–mRNA duplex. The construction of the probabilistic profile of miRNA–mRNA binding assumes competition of miRNAs for the binding site; it uses calculation of partition functions for miRNA–mRNA duplexes and adapted dynamic programming algorithms. The probability plot allows recognition of multiple binding sites, which are able to inhibit mRNA translation at significant miRNA concentrations [32].

ACKNOWLEDGMENTS

This work is supported by the Russian Scientific Foundation, project 14-24-00123 “Systems Computational Biology: Analysis and Modeling of the Structure–Functional Organization and Evolution of Gene Networks.”

REFERENCES

1. Bartel D.P. 2004. MicroRNAs: Genomics, biogenesis, mechanism, and function. Cell. 116, 281–297.
2. Lawrie C.H. 2014. MicroRNAs in Medicine. New Jersey: Wiley.
3. Griffiths-Jones S., Bateman A., Marshall M., Khanna A., Eddy S.R. 2003. Rfam: An RNA family database. Nucleic Acids Res. 31, 439–441.
4. Weber M.J. 2005. New human and mouse microRNA genes found by homology search. FEBS J. 272, 59–73.
5. Lim L.P., Lau N.C., Weinstein E.G., Abdelhakim A., Yekta S., Rhoades M.W., Burge C.B., Bartel D.P. 2003. The microRNAs of Caenorhabditis elegans. Genes Dev. 17, 991–1008.
6. Lai E.C., Tomancak P., Williams R.W., Rubin G.M. 2003. Computational identification of Drosophila microRNA genes. Genome Biol. 4, R42.
7. Wang X., Zhang J., Li F., Gu J., He T., Zhang X., Li Y. 2005. MicroRNA identification based on sequence and structure alignment. Bioinformatics. 21, 3610–3614.
8. Grundhoff A., Sullivan C.S., Ganem D. 2006. A combined computational and microarray-based approach identifies novel microRNAs encoded by human gamma-herpesviruses. RNA. 12, 733–750.
9. Wu Y., Wei B., Liu H., Li T., Rayner S. 2011. MiRPara: A SVM-based software tool for prediction of most probable microRNA coding regions in genome scale sequences. BMC Bioinform. 12, 107.
10. Tempel S., Tahi F. 2012. A fast ab-initio method for predicting miRNA precursors in genomes. Nucleic Acids Res. 40, e80.
11. Sewer A., Paul N., Landgraf P., Aravin A., Pfeffer S., Brownstein M., Tuschl T., Nimwegen E., Zavolan M. 2005. Identification of clustered microRNAs using an ab initio prediction method. BMC Bioinform. 6, 267.
12. Xue C., Li A., He E., Liu P., Li Y., Zhang X. 2005. Classification of real and pseudo microRNA precursors using local structure-sequence features and support vector machine. BMC Bioinform. 6, 310.
13. Hertel J., Stadler P.F. 2006. Hairpins in a Haystack: Recognizing microRNA precursors in comparative genomics data. Bioinformatics. 22, e197–e202.
14. Huang T.H., Fan B., Rothschild M.F., Hu Z.L., Li K. 2007. MiRFinder: An improved approach and software implementation for genome-wide fast microRNA precursor scans. BMC Bioinform. 8, 341.
15. Batuwita R., Palade V. 2009. MicroPred: Effective classification of pre-miRNAs for human miRNA gene prediction. Bioinformatics. 25, 989–995.
16. Kumar S., Ansari F.A., Scaria V. 2009. Prediction of viral microRNA precursors based on human microRNA precursor sequence and structural features. Virol. J. 6, 129.
17. Xuan P., Guo M., Huang Y., Li W., Huang Y. 2011. MaturePred: Efficient identification of microRNAs within novel plant pre-miRNAs. PLOS ONE. 6, e27422.
18. He C., Li Y.X., Zhang G., Gu Z., Yang R., Li J., Wang J. 2012. MiRmat: Mature microRNA sequence prediction. PLOS ONE. 7, e51673.
19. Yousef M., Nebozhyn M., Shatkay H., Kanterakis S., Showe L.C. 2006. Combining multi-species genomic data for microRNA identification using a Naive Bayes classifier. Bioinformatics. 22, 1325–1334.
20. Gkirtzou K., Tsamardinos I., Tsakalides P., Poirazi P. 2010. MatureBayes: A probabilistic algorithm for identifying the mature miRNA within novel precursors. PLOS ONE. 5, e11843.
21. Tyagi S., Vaz C., Gupta V., Bhatia R., Maheshwari S., Srinivasan A., Bhattacharya A. 2008. CID-miRNA: A web server for prediction of novel miRNA precursors in human genome. Biochem. Biophys. Res. Comm. 372, 831–834.
22. Nam J.W., Shin K.R., Han J., Lee Y., Kim V.N., Zhang B.T. 2005. Human microRNA prediction through a probabilistic co-learning model of sequence and structure. Nucleic Acids Res. 33, 3570–3581.
23. Terai G., Komori T., Asai K., Kin T. 2007. MiRRim: A novel system to find conserved miRNAs with high sensitivity and specificity. RNA. 13, 2081–2090.
24. Oulas A., Boutla A., Gkirtzou K., Reczko M., Kalantidis K., Poirazi P. 2009. Prediction of novel microRNA genes in cancer-associated genomic regions: A combined computational and experimental approach. Nucleic Acids Res. 37, 3276–3287.
25. Stark A., Brennecke J., Russell R.B., Cohen S.M. 2003. Identification of Drosophila microRNA targets. PLoS Biol. 1, E60.
26. Kiriakidou M., Nelson P.T., Kouranov A., Fitziev P., Bouyioukos C., Mourelatos Z., Hatzigeorgiou A. 2004. A combined computational-experimental approach predicts human microRNA targets. Genes Dev. 18, 1165–1178.
27. Rehmsmeier M., Steffen P., Hochsmann M., Giegerich R. 2004. Fast and effective prediction of microRNA/target duplexes. RNA. 10, 1507–1517.
28. Maziere P., Enright A.J. 2007. Prediction of microRNA targets. Drug Discov. Today. 12, 452–458.
29. Enright A.J., John B., Gaul U., Tuschl T., Sander C., Marks D.S. 2003. MicroRNA targets in Drosophila. Genome Biol. 5, R1.
30. Lewis B.P., Burge C.B., Bartel D.P. 2004. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell. 120, 15–20.
31. Brennecke J., Stark A., Russell R.B., Cohen S.M. 2005. Principles of microRNA-target recognition. PLoS Biol. 3, e85.
32. Bartel D.P. 2009. MicroRNAs: Target recognition and regulatory functions. Cell. 136, 215–233.
33. Kim S.K., Nam J.W., Rhee J.K., Lee W.J., Zhang B.T. 2006. MiTarget: MicroRNA target gene prediction using a support vector machine. BMC Bioinform. 7, 411.
34. Wang X., El Naqa I.M. 2008. Prediction of both conserved and nonconserved microRNA targets in animals. Bioinformatics. 24, 325–332.
35. Reyes-Herrera P.H., Ficarra E., Acquaviva A., Macii E. 2011. MiREE: miRNA recognition elements ensemble. BMC Bioinform. 12, 454.
36. Ragan C., Zuker M., Ragan M.A. 2011. Quantitative prediction of miRNA–mRNA interaction based on equilibrium concentrations. PLoS Comp. Biol. 7, e1001090.
37. Betel D., Wilson M., Gabow A., Marks D.S., Sander C. 2008. The microRNA.org resource: Targets and expression. Nucleic Acids Res. 36, D149–D153.
38. Maragkakis M., Reczko M., Simossis V.A., Alexiou P., Papadopoulos G.L., Dalamagas T., Hatzigeorgiou A.G. 2009. DIANA-microT web server: Elucidating microRNA functions through target prediction. Nucleic Acids Res. 37, W273–W276.
39. Dweep H., Sticht C., Pandey P., Gretz N. 2011. MiRWalk database: Prediction of possible miRNA binding sites by “walking” the genes of three genomes. J. Biomed. Inform. 44, 839–847.
40. Kozomara A., Griffiths-Jones S. 2011. MiRBase: Integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res. 39, D152–D157.
41. Yang J.H., Li J.H., Shao P., Zhou H., Chen Y.Q., Qu L.H. 2011. StarBase: A database for exploring microRNA–mRNA interaction maps from Argonaute CLIP-Seq and Degradome-Seq data. Nucleic Acids Res. 39, D202–D209.
42. Titov I.I., Vorob'ev D.G., Ivanisenko V.A., Kolchanov N.A. 2002. A rapid genetic algorithm for analyzing RNA secondary structure. Izv. Akad. Nauk, Ser. Khim. 7, 1047–1056.
43. Griffiths-Jones S., Grocock R.J., van Dongen S., Bateman A., Enright A.J. 2006. MiRBase: MicroRNA sequences, targets and gene nomenclature. Nucleic Acids Res. 34, D140–D144.
44. Kozomara A., Griffiths-Jones S. 2014. miRBase: Annotating high confidence microRNAs using deep sequencing data. Nucleic Acids Res. 42, D68–D73.
45. Titov I.I., Vorozheykin P.S. 2011. miRNA-containing human transposable elements. Vavilov. Zh. Genet. Selekts. 15, 323–326.
46. Titov I.I., Vorozheykin P.S. 2011. Analysis of miRNA gene duplication in the human genome and the role of transposable element evolution in this process. Vavilov. Zh. Genet. Selekts. 15, 139–147.
47. Lee Y., Jeon K., Lee J.T., Kim S., Kim V.N. 2002. MicroRNA maturation: Stepwise processing and subcellular localization. EMBO J. 21, 4663–4670.
48. Lee Y., Kim M., Han J., Yeom K.H., Lee S., Baek S.H., Kim V.N. 2004. MicroRNA genes are transcribed by RNA polymerase II. EMBO J. 23, 4051–4060.
49. Winter J., Jung S., Keller S., Gregory R.I., Diederichs S. 2009. Many roads to maturity: MicroRNA biogenesis pathways and their regulation. Nat. Cell Biol. 11, 228–234.
50. Zisoulis D.G., Kai Z.S., Chang R.K., Pasquinelli A.E. 2012. Autoregulation of microRNA biogenesis by let-7 and Argonaute. Nature. 486, 541–544.
51. Tang R., Li L., Zhu D., Hou D., Cao T., Gu H., Zen K. 2012. Mouse miRNA-709 directly regulates miRNA-15a/16-1 biogenesis at the posttranscriptional level in the nucleus: Evidence for a microRNA hierarchy system. Cell Res. 22, 504–515.
52. Libri V., Miesen P., van Rij R.P., Buck A.H. 2013. Regulation of microRNA biogenesis and turnover by animals and their viruses. Cell. Mol. Life Sci. 70, 3525–3544.
53. Michlewski G., Guil S., Semple C.A., Cáceres J.F. 2008. Posttranscriptional regulation of miRNAs harboring conserved terminal loops. Mol. Cell. 32, 383–393.
54. Lee Y., Ahn C., Han J., Choi H., Kim J., Yim J., Kim V.N. 2003. The nuclear RNase III Drosha initiates microRNA processing. Nature. 425, 415–419.
55. Okada C., Yamashita E., Lee S.J., Shibata S., Katahira J., Nakagawa A., Tsukihara T. 2009. A high-resolution structure of the pre-microRNA nuclear export machinery. Science. 326, 1275–1279.
56. Ruby J.G., Jan C.H., Bartel D.P. 2007. Intronic microRNA precursors that bypass Drosha processing. Nature. 448, 83–86.
57. Newman M.A., Mani V., Hammond S.M. 2011. Deep sequencing of microRNA precursors reveals extensive 3' end modification. RNA. 17, 1795–1803.
58. Xia H., Li F., He T., Li Y. 2005. Distribution of mature microRNA on its precursor: A new character for microRNA prediction. Int. J. Inform. Tech. 11, 1–8.
59. Förstemann K., Horwich M.D., Wee L., Tomari Y., Zamore P.D. 2007. Drosophila microRNAs are sorted into functionally distinct Argonaute complexes after production by Dicer-1. Cell. 130, 287–297.
60. Khvorova A., Reynolds A., Jayasena S.D. 2003. Functional siRNAs and miRNAs exhibit strand bias. Cell. 115, 209–216.
61. Starega-Roslan J., Krol J., Koscianska E., Kozlowski P., Szlachcic W.J., Sobczak K., Krzyzosiak W.J. 2011. Structural basis of microRNA length variety. Nucleic Acids Res. 39, 257–268.
62. Czech B., Hannon G.J. 2011. Small RNA sorting: Matchmaking for Argonautes. Nat. Rev. Genet. 12, 19–31.
63. Cifuentes D., Xue H., Taylor D.W., Patnode H., Mishima Y., Cheloufi S., Giraldez A.J. 2010. A novel miRNA processing pathway independent of Dicer requires Argonaute 2 catalytic activity. Science. 328, 1694–1698.
64. Yang J.S., Maurin T., Lai E.C. 2012. Functional parameters of Dicer-independent microRNA biogenesis. RNA. 18, 945–957.
65. Wheeler B.M., Heimberg A.M., Moy V.N., Sperling E.A., Holstein T.W., Heber S., Peterson K.J. 2009. The deep evolution of metazoan microRNAs. Evol. Dev. 11, 50–68.
66. Gong J., Tong Y., Zhang H.M., Wang K., Hu T., Shan G., Guo A.Y. 2012. Genome-wide identification of SNPs in microRNA genes and the SNP effects on microRNA target binding and biogenesis. Hum. Mutat. 33, 254–263.
67. Jin Y., Lee C.G. 2013. Single nucleotide polymorphisms associated with microRNA regulation. Biomolecules. 3, 287–302.
68. Berezikov E. 2011. Evolution of microRNA diversity and regulation in animals. Nat. Rev. Genet. 12, 846–860.
69. Krol J., Sobczak K. 2004. Structural features of microRNA (miRNA) precursors and their relevance to miRNA biogenesis and small interfering RNA/short hairpin RNA design. J. Biol. Chem. 279, 42230–42239.

Translated by V. Gulevich