IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, VOL. 9, NO. 3, MAY/JUNE 2012, p. 641
High Performance Computational Systems Biology
Tommaso Mazza
The understanding of living systems is attracting the attention of researchers from different areas, a phenomenon witnessed by the impressive amount of data that new-generation instruments have been relentlessly producing. Computational needs have grown along the same trend, and with them the costs of storage and computation and the demand for targeted technical skills. For these reasons, IT has a fundamental responsibility in supporting the activity of life scientists by providing them with efficient computational means to advance their goals.

Since the study of living systems is a very heterogeneous venture, research efforts in the life sciences have so far resulted in a multitude of software tools for biology, roughly organized around four elements: modeling languages, simulation, analysis, and data management. Many of these tools were then designed to work on high performance architectures, namely multicore workstations or computer clusters. The migration away from classic standalone computational paradigms allowed researchers to meet important goals in terms of both computational scalability and algorithmic practicability.

This special section includes a selection of papers presented at the Second International Workshop on High Performance Computational Systems Biology (HiBi 2010), which was colocated with the joint ICGT/SPIN conference held at the University of Twente, Enschede, The Netherlands, on 27-29 September 2010. The HiBi workshop establishes a forum to link researchers in the areas of parallel computing and computational systems biology. It is a unique opportunity for experts from around the world to present their current work and to discuss profound challenges, new ideas, results, applications, and their experience relating to key aspects of high performance computing in biology.
In 2010, 42 papers were submitted in response to the call for papers, out of which nine papers appeared in the proceedings published by the IEEE Computer Society. The authors of the best workshop papers were invited to submit an extended version to this special section. Following a rigorous review process, five papers were selected for publication.
T. Mazza is with the Bioinformatics Unit, IRCCS Casa Sollievo della Sofferenza, Viale Cappuccini, 1, 71013 San Giovanni Rotondo (FG), Italy. E-mail: [email protected].
For information on obtaining reprints of this article, please send e-mail to: [email protected].
1545-5963/12/$31.00 © 2012 IEEE
The selected papers cover a broad range of bioinformatics topics, including medical imaging, Markov clustering, simulation, model checking, and gene networks.

The first paper, “Constructing Complex 3D Biological Environments from Medical Imaging Using High Performance Computing” by Mark Burkitt, Dawn Walker, Daniela M. Romano, and Alireza Fazeli, makes use of multicore CPUs and GPUs to extract information about shape, size, and path from histology images of the human fallopian tube. The authors create, render, and simulate a unique but realistic 3D virtual organ, and check its dynamics against real evidence.

In the second paper, “Smoldyn on Graphics Processing Units: Massively Parallel Brownian Dynamics Simulation,” Lorenzo Dematté proposes an improved method to simulate the Brownian dynamics of particle systems. He capitalizes on the concept of space, a feature of most interactive systems that is as important as it is neglected. To tackle the intrinsic complexity of this task, he enriches Smoldyn, a widely used algorithm for stochastic simulation of chemical reactions with spatial resolution and single-molecule detail, with a new implementation that exploits the computing power of GPUs.

The third paper, entitled “Reverse Engineering and Analysis of Genome-Wide Gene Regulatory Networks from Gene Expression Profiles Using High-Performance Computing” by Vincenzo Belcastro, Francesco Gregoretti, Velia Siciliano, Michele Santoro, Giovanni D’Angelo, Gennaro Oliva, and Diego di Bernardo, deals with the regulation of gene expression. In particular, it tackles the problem of reconstructing the regulatory interactions among genes from genome-scale microarray measurements by “reverse-engineering.” The authors design and develop two parallel computing algorithms, one to compute the pairwise mutual information between each gene pair and the other to discover “communities” within the gene network.
“Fast Parallel Markov Clustering in Bioinformatics Using Massively Parallel Computing on GPU with CUDA and ELLPACK-R Sparse Format” by Alhadi Bustaman, Kevin Burrage, and Nicholas A. Hamilton presents an efficient way to determine clusters in networks through Markov clustering. To this end, the authors identify the core bottlenecks in the matrix-matrix computations and in the Markov matrix normalizations, and then offer a CUDA-based software solution. By using the ELLPACK-R sparse format, they cope effectively with the sparse nature of the interaction network data sets common in bioinformatics applications.

Published by the IEEE CS, CI, and EMB Societies & the ACM
Finally, Jiří Barnat, Luboš Brim, Adam Krejčí, Adam Streck, David Šafránek, Martin Vejnár, and Tomáš Vejpustek present “On Parameter Synthesis by Parallel Model Checking.” They deal with the important problem of analyzing the dynamics of abstract models of biological systems under parameter uncertainty. They approach this problem with a new algorithm for parameter synthesis based on parallel model checking. The algorithm is conceptually universal, in the sense that it is not constrained by any modeling language. Its efficiency is demonstrated by its impressive scalability, while its soundness is shown on several biological models.

I would like to thank the program committee members and external reviewers for volunteering their time to review the submissions to the workshop. I would also like to thank the Editor-in-Chief, Dr. Marie-France Sagot, for having given me the possibility to disseminate the exciting research presented at HiBi in this esteemed journal. I cannot forget to thank Paolo Ballarini and Davide Prandi for having organized with me all the past editions of HiBi and, especially, for their renewed friendship.

Tommaso Mazza
Guest Editor
ACKNOWLEDGMENTS
The work of Tommaso Mazza is fully supported by the “Ricerca Corrente 2012” funding granted by the Italian Ministry of Health and by the “5x1000” voluntary contributions.
Tommaso Mazza studied computer science engineering at the University of Calabria and received the PhD degree in computer science and biomedical engineering from the “Magna Graecia” University of Catanzaro in November 2007. He was a visiting scientist at COSBI in March 2006 and at Microsoft Research Cambridge in the summer of 2006. In 2007, he joined the Bioinformatics Italian Society (B.IT.S.). He joined COSBI in January 2008, where he worked on parallel stochastic simulation of biological systems, biological data storage and handling, and the design of high performance software solutions for systems biology. From October 2010 until July 2011, he worked in the Laboratory of Translational Genomics at CIBIO, where he was involved in structuring AURA, the Atlas of UTR Regulatory Activity. Since July 2011, he has held the position of vice-chair for Europe of the IEEE Technical Committee for Simulation (TC-SIM) and of researcher at Casa Sollievo della Sofferenza (FG) and CSS-Mendel (Rome), Italy, where he leads the Bioinformatics unit.