Guest Editors' Introduction to the Special Section: Computational ...

IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, VOL. 4, NO. 2, APRIL-JUNE 2007, p. 161

Guest Editors’ Introduction to the Special Section: Computational Intelligence Approaches in Computational Biology and Bioinformatics

Jagath C. Rajapakse, Yan-Qing Zhang, and Gary B. Fogel

J.C. Rajapakse is with the School of Computer Engineering, Nanyang Technological University, Blk N4-2a05, 50 Nanyang Avenue, Singapore 639798. E-mail: [email protected].
Y.-Q. Zhang is with the Department of Computer Science, Georgia State University, PO Box 3994, Atlanta, GA 30302-3994. E-mail: [email protected].
G.B. Fogel is with Natural Selection, Inc., 3333 North Torrey Pines Ct., Suite 200, La Jolla, CA 92037. E-mail: [email protected].
For information on obtaining reprints of this article, please send e-mail to: [email protected]. 1545-5963/07/$25.00 © 2007 IEEE

In recent years, the application of neural networks, fuzzy systems, and evolutionary algorithms to problems in the biosciences has increased. These methods, collectively considered “computational intelligence” paradigms, offer the advantage of being able to handle nonlinearity in data, large search spaces, and conditions where data classification is continuous rather than discrete. Such complex mappings and imprecise information are common in biological data sets and, thus, computational intelligence methods can assist in producing novel solutions to difficult problems without expert assistance. This special section highlights not only the application of these algorithms to biological problems, but also their possible combinations, such as optimization of neural networks with evolutionary computation. The papers included in this special section provide a broad perspective on possible applications, and it is hoped that they will serve as the basis for future application and testing of these approaches.

Fifty-one papers were submitted to the special section. The guest editors would like to thank all of the authors and reviewers for their great effort in this regard. After an extensive review and revision process, and owing to space limitations, 10 papers were selected for inclusion in this special section. Below, we provide a summary of the papers contained in the special section.

The paper “Poisson-Based Self-Organizing Feature Maps and Hierarchical Clustering for Serial Analysis of Gene Expression Data” by Haiying Wang, Huiru Zheng, and Francisco Azuaje provides the results of two new clustering approaches for serial analysis of gene expression (SAGE) methods that include the hybridization of self-organizing maps and hierarchical clustering, two approaches that are typically treated independently. This new approach improves pattern discovery and visualization when working with SAGE data.

“Relational Analysis of CpG Islands Methylation and Gene Expression in Human Lymphomas Using Possibilistic C-Means Clustering and Modified Cluster Fuzzy Density” by Ozy Sjahputera, James M. Keller, J. Wade Davis, Kristen H. Taylor, Farahnaz Rahmatpanah, Huidong Shi, Derek T. Anderson, Samual N. Blisard, Robert H. Luke III, Mihail Popescu, Gerald C. Arthur, and Charles W. Caldwell presents an important contribution in the application of fuzzy logic to gene expression analysis. In particular, possibilistic c-means and cluster fuzzy density are used to calculate measures of confidence regarding methylation-expression relationships in human non-Hodgkin's lymphoma subclasses. Such data mining tools are found to be appropriate for exploration of large biological data sets.

In “Interactive Semisupervised Learning for Microarray Analysis” by Yijuan Lu, Qi Tian, Feng Liu, Maribel Sanchez, and Yufeng Wang, the concept of relevance feedback is proposed for an interactive machine learning approach to incorporate expert knowledge in microarray analysis. A kernel discriminant approach is also introduced to account for nonlinearity in the data. Analysis of yeast and Plasmodium gene expression patterns with these methods indicates promise for this combined approach.

“On the Classification of a Small Imbalanced Cytogenetic Image Database” by Boaz Lerner, Josepha Yeshaya, and Lev Koushnir utilizes naive Bayesian classifiers and multilayer perceptron neural networks for the classification of genetic abnormalities using in situ hybridization fluorescence signals. The approach also makes use of a technique to “balance” the data by upsampling minority classes followed by additional dimensionality reduction.
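As a rough illustration of what “balancing” by upsampling can mean, the sketch below duplicates randomly chosen minority-class samples until every class is as large as the largest one. This is a generic random-oversampling example, not the procedure used by Lerner et al.; the function name `upsample_minority`, the toy feature values, and the class labels are all hypothetical.

```python
import random
from collections import Counter


def upsample_minority(samples, labels, seed=0):
    """Randomly duplicate samples of smaller classes until all classes
    have as many samples as the largest class (illustrative only)."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    # Every class is grown to the size of the largest class.
    target = max(len(xs) for xs in by_class.values())

    out_samples, out_labels = [], []
    for y, xs in by_class.items():
        # Draw duplicates with replacement from the class's own samples.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_samples.append(x)
            out_labels.append(y)
    return out_samples, out_labels


# Hypothetical toy data: four "normal" samples and one "abnormal" sample.
samples = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = ["normal", "normal", "normal", "normal", "abnormal"]

balanced_x, balanced_y = upsample_minority(samples, labels)
print(Counter(balanced_y))
```

Because the duplicates are exact copies, oversampling of this kind is normally applied to the training split only, so that copies of one sample do not appear in both training and test data.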
The paper “Gradient-Based Optimization of Kernel-Target Alignment for Sequence Kernels Applied to Bacterial Gene Start Detection” by Christian Igel, Tobias Glasmachers, Britta Mersch, Nico Pfeifer, and Peter Meinicke introduces a gradient-based approach for the optimization of kernel-target alignment. The method is evaluated by adapting oligonucleotide kernels for the detection of transcription start sites in bacteria.

In “Subcellular Localization Prediction with New Protein Encoding Schemes,” Hasan Oğul and Erkan Ü. Mumcuoğlu introduce two methods for the encoding of protein sequence information for use with support vector machines for subcellular localization prediction for cytoplasmic, extracellular, mitochondrial, and nuclear proteins.

“Dynamical Systems for Discovering Protein Complexes and Functional Modules from Biological Networks” by Wenyuan Li, Ying Liu, Hung-Chung Huang, Yanxiong Peng, Yongjing Lin, Wee-Keong Ng, and Kok-Leong Ong utilizes both evolutionary algorithms and artificial neural networks for the heaviest k-subgraph problem in relation to the analysis of biomolecular networks and demonstrates utility on large-scale networks.

The paper “Data Mining and Predictive Modeling of Biomolecular Network from Biomedical Literature Databases” by Xiaohua Hu and Daniel D. Wu introduces a new method for the data mining of biomolecular networks from literature databases that integrates text mining and predictive modeling. The scale-free networks that result from text mining are automatically clustered into meaningful subnetworks for ease of user interpretation.

“An Adaptive Multimeme Algorithm for Designing HIV Multidrug Therapies” by Ferrante Neri, Jari Toivanen, Giuseppe L. Cascella, and Yew-Soon Ong develops a period representation for viral therapy regimen modeling and couples this with a memetic algorithm for the design of optimal therapies. The method shows great promise relative to three popular metaheuristics.

In “Multiobjective Optimization in Bioinformatics and Computational Biology,” Julia Handl, Douglas B. Kell, and Joshua Knowles provide a broad survey of applications of multiobjective optimization in the above areas. This primer outlines five contexts for the use of multiobjective optimization and highlights this useful computational intelligence approach.

The editors would like to thank Suzanne Wagner, Mari Padilla, Susan Miller, and Dan Gusfield for their assistance with this special section.

Jagath C. Rajapakse
Yan-Qing Zhang
Gary B. Fogel
Guest Editors

Published by the IEEE CS, CI, and EMB Societies & the ACM

Jagath C. Rajapakse received the BSc (engineering) degree with First Class Honors in electronic and telecommunication engineering from the University of Moratuwa, Sri Lanka, and the MSc and PhD degrees in electrical and computer engineering from the State University of New York at Buffalo. He is an associate professor in the School of Computer Engineering (SCE) and the Deputy Director of the BioInformatics Research Centre (BIRC) at the Nanyang Technological University (NTU), Singapore. He was a visiting professor at the Biological Engineering Division, Massachusetts Institute of Technology, from 2005 to 2006. Before joining NTU, he was a visiting fellow at the National Institute of Mental Health, Bethesda, Maryland, and a visiting scientist at the Max-Planck-Institute of Cognitive and Brain Sciences, Leipzig, Germany. Dr. Rajapakse’s research investigates human brain function using imaging and bioinformatics techniques, leading to new drugs and behavioral or stem cell therapies for brain disease. His current research interests are in gene networks, protein interactions, neural systems, and pathways. He has authored more than 200 research publications in refereed journals, books, and conference proceedings in the fields of brain imaging, computational biology, and machine learning. He is an associate editor of the IEEE/ACM Transactions on Computational Biology and Bioinformatics and the chair of the IAPR Technical Committee in Bioinformatics. In 2007, he will serve as cochair of the IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB ’07) in Hawaii, program cochair of the European Conference on Evolutionary Computation, Machine Learning, and Data Mining in Bioinformatics (EvoBio ’07) in Valencia, and general chair of the Second IAPR Workshop on Pattern Recognition in Bioinformatics (PRIB ’07), Singapore.

Yan-Qing Zhang received the BS and MS degrees in computer science and engineering from Tianjin University, China, in 1983 and 1986, respectively, and the PhD degree in computer science and engineering from the University of South Florida, Tampa, in 1997. He is currently an associate professor in the Computer Science Department at Georgia State University, Atlanta. His research interests include hybrid intelligent systems, data mining, bioinformatics, medical informatics, computational Web intelligence, computational intelligence, granular computing, and statistical learning. He has coauthored two books and coedited two other books. He has published 12 book chapters, more than 50 journal papers, and more than 100 conference papers. He is an associate editor of the Journal of Computational Intelligence in Bioinformatics and a member of the editorial board of the International Journal of Data Mining and Bioinformatics. He was a program cochair of the 2006 IEEE International Conference on Granular Computing and the 2005 IEEE-ICDM Workshop on MultiAgent Data Warehousing and MultiAgent Data Mining. He is a member of the Bioinformatics and Bioengineering Technical Committee of the Computational Intelligence Society of the IEEE and the Technical Committee on Pattern Recognition for Bioinformatics of the International Association of Pattern Recognition.

Gary B. Fogel received the BA degree in biology from the University of California, Santa Cruz, in 1991 and the PhD degree in biology from the University of California, Los Angeles, in 1998. He is currently vice president of Natural Selection, Inc., in La Jolla, California. His experience includes more than 13 years of applying computational intelligence methods to bioinformatics problems. He has more than 40 publications in the technical literature, the majority treating the science and application of evolutionary computation, and he is the coeditor of the book Evolutionary Computation in Bioinformatics (Morgan Kaufmann). He also serves as an associate editor for the IEEE/ACM Transactions on Computational Biology and Bioinformatics, IEEE Computational Intelligence Magazine, and IEEE Transactions on Evolutionary Computation, and is a member of the editorial boards of four other technical journals. Dr. Fogel was chair of the IEEE Computational Intelligence Society Bioinformatics and Bioengineering Technical Committee (2004-2005). He is a senior member of the IEEE.