Federated Clouds for Biomedical Research: Integrating OpenStack for ICTBioMed

Cezary Mazurek, Juliusz Pukacki, Michal Kosiedowski, Szymon Trocha
PSNC, Poznan, Poland

Hemant Darbari, Amit Saxena, Rajendra Joshi
C-DAC, Pune, India

Paul Brenner, Sandra Gesing, Jarek Nabrzyski
CRC, Notre Dame, USA

Michael Sullivan
Internet2, Washington D.C., USA

Devdatt Dubhashi, Subazini Thankaswamy
Chalmers, Gothenburg, Sweden

Anil Srivastava
Open Health Systems Laboratory, Rockville, USA

Abstract—Increasingly complex biomedical data from diverse sources demands large-scale storage, efficient software, and high-performance computing for computationally intensive analysis. Cloud technology provides flexible storage and data processing capacity to aggregate and analyze complex data, facilitating knowledge sharing and integration across disciplines in a collaborative research environment. The ICTBioMed collaborative is a team of internationally renowned academic and medical research institutions committed to advancing discovery in biomedicine. In this work we describe the design and development of the cloud framework and the associated software platform and tools that we are developing, federating, and deploying in a coordinated and evolving manner to accelerate research developments in the biomedical field. Further, we highlight some of the essential considerations and challenges of deploying a complex, open-architecture, cloud-based research infrastructure with numerous software components, internationally distributed infrastructure, and a diverse user base.

Keywords— cloud, openstack, biomedicine, federation, network

I. INTRODUCTION
The Open Health Systems Laboratory (OHSL) established a consortium of institutions and practitioners around the world to participate in ICTBioMed, the International Consortium for Technology in Biomedicine. Initial institutional members include the Centre for Development of Advanced Computing (C-DAC), Pune, India; the Chalmers University Life Sciences Supercomputing Networking Center, Gothenburg, Sweden; the Poznań Supercomputing and Networking Center, Poznań, Poland; the University of Notre Dame Center for Research Computing, Notre Dame, USA; and Internet2, USA. Additionally, subject area and field experts include staff from Arizona State University's Computational Sciences and Complex Adaptive Systems Initiative, the Duke Comprehensive Cancer Center, and the Tata Memorial Centre in India. The ICTBioMed collaborative is united in a broad set of objectives to produce an open, portable and largely system-independent Platform as a Service (PaaS) for biomedical research. The goal of the collaboration is to specify, provide and, where appropriate, evolve the necessary tools in the areas of knowledge integration, network analytics, and systems interoperability for next generation biomedical inquiry. These tools will focus and provide utility across all areas of biomedical research, including but not limited to genomics, bioinformatics, and domain-specific medical communities such as disease prevention and cancer research.


The platform will have the ability to interact with a large number of diverse databases, from molecular experiments to tissue records coupled with video and audio datasets. Further, software will be developed for analysis and knowledge extraction by scientists and physicians, to accelerate lab research, facilitate clinical trials, and manage patients' treatments. In the initial phases, the developed infrastructure, platforms, software and tools are being provided to a subset of practicing scientists, physicians, and surgeons to gather their feedback on navigating the system, with recurring system evolution based on their feedback and requirements. In this project, the consortium aims to integrate, adapt, and deploy a wide spectrum of tested and successful informatics and high-performance computing technologies for biomedical researchers and clinicians. Within such a networking laboratory these technologies and tools will be rapidly (e.g., in an agile methodology) applied and verified in the biomedical domains through joint mini-projects, which we call Proof-of-Concept demos. By also developing new communication strategies, we hope to reach the clinicians who will use these tools for the diagnosis and treatment of their patients, and because the collaboration of consortium partners is global in nature, these applications will have a wider reach. We recognize that one of the challenges facing multiple data banks is the integration of data as well as of the software tools used for their analysis. Earlier attempts at making software tools and databases compatible did not produce the desired results: flexibility in using the tools was lost, and users found it difficult to learn new and unfamiliar software. One of the objectives of our collaboration is to approach the issue of tool and data integration with a new, end-user-centric approach.

II. ICTBIOMED PLATFORM
The platform architecture presented in Figure 1 comprises five horizontal layers. They represent different levels of abstraction of the resources deployed on the platform. The lowest layer, Global Network Connectivity, is responsible for providing reliable, high-performance network links between the distributed components of the system.

Figure 1: Core Platform Components and Functions

The next layer is made up of hardware resources and the interfaces used for accessing them. The Biomedical Data layer holds the databases and repositories that are built and served on the platform. It is covered by a security infrastructure responsible for controlling access to the underlying data and services. The top layer of the architecture consists of applications and end-user tools (such as workflow and scripting interfaces) providing access to the different functionalities available at the lower levels. Vertical ovals on the architecture diagram represent the different domain-level scenarios supported by the system. Such scenarios are built by gathering components from different layers. Some components can take part in many different scenarios (e.g., services on the hardware or networking layer), while others are dedicated to specific research purposes (e.g., applications on the User Access Layer). To create new scenarios, it is possible to reuse components already deployed on the platform.

A. Global Network Connectivity
It is assumed that the platform will be used by an internationally distributed community, which implies a very broad distribution of the resource components. Thus, the platform must be built on high-performance, reliable network connections that fulfill the distributed infrastructure and user requirements.

The platform should also provide mechanisms for dynamic network management (Software Defined Networks) and monitoring. In the ideal case, network links are reserved automatically by a system request based on the user's activity (e.g., increasing bandwidth for a file transfer before computation). In the initial scope of the project we are provisioning the best available static network links and will investigate the possibilities of dynamic link reservation for planned big data transfers (computation, data replication, data addition/download).

B. Hardware and Software Infrastructure
This layer represents the computational and data storage resources available via the platform. It covers not only the machines themselves but also all of the middleware providing access to computers and storage, as well as the tools available on the infrastructure. The role of this service is to support scenarios that require computation, such as organ or disease simulation, genomics, data mining, or 3D modeling. Storage interfaces are dedicated to data that is shared inside the platform and used by different scenarios for computation or visualization. Key components include: HPC hardware infrastructure, and cloud/grid interfaces to computational and storage resources. Biomedical tools and generic workflow engines will also be integrated on this layer. In addition to the submission of single jobs, support for biomedical workflows will enable the leveraging of computational and storage resources.


C. Biomedical Data
The main purpose of this layer is to provide access to biomedical databases and repositories. It is assumed that data available on this level is not personal: it is not possible to identify patients or sources of information. It will be possible to replicate existing open-access resources, or to provide wrappers for the applications and tools deployed on the platform as needed. The other important role of this layer is to provide mechanisms for extracting data from hospital systems and (after anonymization procedures) making it available on the platform. A more advanced feature will be semantic integration of the available information resources based on a proposed domain ontology, so that users can efficiently search the available data. There are two main possibilities for the integration mechanism. The first is based on mediation services that try to integrate information at query time. The second is a warehouse model, which builds a common repository by fetching data from external sources and translating it according to a proposed ontological model. The data available on this layer can be presented by end-user applications and tools, but it can also serve as input for the computations taking place on the Hardware and Software Infrastructure Layer. To make this possible, proper interfaces integrating data and computation must be provided. A minimal sketch of the warehouse-style flow is given below.

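The following Python sketch illustrates the warehouse-style integration path described above: records are fetched from external sources, direct identifiers are stripped in a simple anonymization step, and the remaining fields are translated to a common vocabulary. The source URLs, field names, and mapping table are illustrative assumptions, not components of the ICTBioMed platform.

```python
"""Hedged sketch: warehouse-style integration with a simple
anonymization step. All endpoints and field names are hypothetical."""
import urllib.request
import json

# Hypothetical source endpoints returning JSON lists of records.
SOURCES = {
    "hospital_a": "https://example.org/hospital-a/export.json",
    "hospital_b": "https://example.org/hospital-b/export.json",
}

# Mapping from per-source field names to a common (assumed) vocabulary.
FIELD_MAP = {
    "hospital_a": {"dx_code": "diagnosis", "age_yrs": "age"},
    "hospital_b": {"icd10": "diagnosis", "patient_age": "age"},
}

# Direct identifiers that must never reach the shared repository.
IDENTIFIERS = {"name", "address", "national_id", "date_of_birth"}

def fetch(url):
    """Fetch one source export; returns a list of dict records."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def anonymize(record):
    """Drop direct identifiers (a stand-in for a real anonymization
    procedure, which would also need to handle quasi-identifiers)."""
    return {k: v for k, v in record.items() if k not in IDENTIFIERS}

def translate(record, mapping):
    """Rename source-specific fields to the common vocabulary."""
    return {mapping.get(k, k): v for k, v in record.items()}

def build_warehouse():
    """Fetch, anonymize, and translate all sources into one repository."""
    warehouse = []
    for source, url in SOURCES.items():
        for record in fetch(url):
            warehouse.append(translate(anonymize(record), FIELD_MAP[source]))
    return warehouse
```

The mediation alternative mentioned above would perform the same translation on the fly at query time instead of materializing a repository.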
D. Security
This layer represents all infrastructure mechanisms that control access to the services of the underlying layers. It includes frameworks for user authentication and authorization. On the one hand, the chosen solution should provide a common access control layer for the user-level applications and tools; on the other hand, it should also support the security protocols implemented by infrastructure components in the Hardware and Software Infrastructure Layer. This layer also provides data protection mechanisms compliant with legal regulations concerning data sharing. Key components include: platform access control, authentication and authorization mechanisms, a data protection framework and policies, and anonymization procedures.

E. Science Gateways
Science gateways offer an intuitive user interface and a single point of entry to a set of applications and data sources across organizational boundaries. In most cases they are dedicated to solving domain-level problems or to presenting information available on the platform. Science gateways might be web-based portal environments, or they can be implemented as standalone applications. Key components include: portals, interactive applications, visualization environments, videoconferencing applications, and collaborative environments. Grid computing can provide researchers with a high-performance platform that combines the power of geographically distributed supercomputers with science gateways. Applications running over grids, such as MoSGrid [20] and TaxoGrid [18], offer easy access to bioinformatics tools for the scientific community. With the advent of cloud computing, science gateways have begun to harness the power of virtualization. There is a strong need for biomedical science gateways to support collaborative cloud computing platforms in order to solve complex biological problems.

III. MAJOR SOFTWARE AND TOOLSET COMPONENTS
ICTBioMed partners bring a number of key established software components and workflow competencies, which we are integrating on the unified biomedical research platform. The major initial areas of concentration are knowledge integration and the semantic web, multi-dimensional big data analytics, next generation sequencing and analytics, and visualization of large multi-dimensional, multi-format data.

A. Knowledge Integration and Semantic Web
The area of knowledge integration is concerned with establishing a set of repositories of biomedical knowledge available through specialized data exploration platforms. This includes enabling access to such known and commonly used repositories as TCGA, The Cancer Genome Atlas managed by the National Cancer Institute [1,2], and the Human Protein Atlas (HPA) created in Sweden within a project funded by the Knut and Alice Wallenberg Foundation [3]. TCGA has to date collected over 10,000 cases representing more than 30 cancers. HPA enables the systematic exploration of the human proteome using antibody-based proteomics, accomplished by combining high-throughput generation of affinity-purified antibodies with protein profiling in a multitude of tissues and cells assembled in tissue microarrays. The ICTBioMed consortium adds the Human Metabolic Atlas (HMA) enabled by Chalmers University (http://www.metabolicatlas.org). HMA provides genome-scale metabolic models for different human cell types and allows using these models for the identification of novel prognostic biomarkers for human diseases such as type 2 diabetes, cardiovascular diseases, and cancer. Integration of biomedical information is not possible without effective tools for the exchange of data. Gaggle [5] provides a framework for exchanging data between independently developed software tools and databases to enable interactive exploration of systems biology data. The next level of information integration is concerned with semantic data integration. To this end, PSNC has created a set of tools for building and exploring semantic knowledge bases. This set includes an agent-based system for gathering information from distributed, heterogeneous data sources, a metadata management system for storing intermediate results, a semantic knowledge base builder framework, a query processing module, a portal middleware layer for semantic data presentation, and the ontology mapper jMet2Ont [6]. The latter tool transforms XML-based metadata to ontology-based formats; the source metadata format may be flat (e.g., Dublin Core) or hierarchical (e.g., MARC/XML).

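To make the flat-to-ontology mapping concrete, the sketch below converts a small Dublin Core XML record into RDF triples with the rdflib library. This is an illustrative reimplementation of the general idea, not jMet2Ont's actual API or mapping language; the sample record and subject URI are invented for the example.

```python
"""Hedged sketch: mapping flat Dublin Core XML metadata to an
ontology-based (RDF) representation, in the spirit of jMet2Ont.
Illustrative only; does not use jMet2Ont's API."""
import xml.etree.ElementTree as ET
from rdflib import Graph, Literal, Namespace, URIRef

DC = Namespace("http://purl.org/dc/elements/1.1/")

# A minimal, invented Dublin Core record.
SAMPLE = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Glioblastoma expression dataset</dc:title>
  <dc:creator>Example Lab</dc:creator>
  <dc:subject>cancer genomics</dc:subject>
</record>"""

def record_to_graph(xml_text, subject_uri):
    """Turn each dc:* element into an RDF triple about subject_uri."""
    graph = Graph()
    graph.bind("dc", DC)
    subject = URIRef(subject_uri)
    for elem in ET.fromstring(xml_text):
        # ElementTree tags look like '{namespace}localname'.
        if elem.tag.startswith("{" + str(DC) + "}"):
            local = elem.tag.split("}", 1)[1]
            graph.add((subject, DC[local], Literal(elem.text)))
    return graph

if __name__ == "__main__":
    g = record_to_graph(SAMPLE, "http://example.org/dataset/1")
    print(g.serialize(format="turtle"))
```

A hierarchical source such as MARC/XML would need a recursive walk and a richer mapping table, but the triple-emission pattern stays the same.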
Knowledge processing in such an environment requires adequate capabilities for the analysis of huge amounts of data. Such capabilities are provided, among others, by the BioMet toolbox developed at Chalmers [7]. It is a web-based resource for the analysis of high-throughput data, together with methods for flux analysis (fluxomics) and for the integration of transcriptome data exploiting the capabilities of metabolic networks described in genome-scale models. Access to the underlying computational and storage infrastructure can be provided through a science gateway. For example, the Vine toolkit created by PSNC [8] offers an API to implement an interface to bioinformatics tools and databases organized into workflows by a system such as C-DAC's Anvaya [9]. Another example is the workflow-enabled MoSGrid (Molecular Simulation Grid) [10] science gateway, further developed by the CRC at the University of Notre Dame. It provides users with workflow, data, and metadata capabilities for research in the areas of quantum chemistry, molecular dynamics, and docking. In addition to these science gateways, Galaxy [11-13] is a widely used solution in the biomedical community. Galaxy CloudMan supports the integration and deployment of independently developed tools and workflows in the cloud. An ICTBioMed Galaxy instance in the cloud will provide research communities with workflows that exploit ICTBioMed computing and data resources; a sketch of programmatic access follows below.

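Programmatic access to a Galaxy instance is commonly done through bioblend, a Python client for Galaxy's REST API. The sketch below assumes a reachable ICTBioMed Galaxy instance and a valid API key (both placeholders), lists the available workflows, and invokes one; the input label, dataset ID, and method signatures follow recent bioblend releases and should be checked against the installed version.

```python
"""Hedged sketch: driving a (hypothetical) ICTBioMed Galaxy instance
with bioblend. URL, API key, and dataset IDs are placeholders."""
from bioblend.galaxy import GalaxyInstance

GALAXY_URL = "https://galaxy.ictbiomed.example.org"  # placeholder
API_KEY = "REPLACE_WITH_YOUR_KEY"                    # placeholder

gi = GalaxyInstance(url=GALAXY_URL, key=API_KEY)

# List the workflows published on the instance.
workflows = gi.workflows.get_workflows()
for wf in workflows:
    print(wf["id"], wf["name"])

# Invoke the first workflow in a fresh history. The input key and
# dataset id are assumptions about a concrete workflow's inputs.
history = gi.histories.create_history(name="ngs-analysis-run")
invocation = gi.workflows.invoke_workflow(
    workflows[0]["id"],
    inputs={"0": {"src": "hda", "id": "SOME_DATASET_ID"}},
    history_id=history["id"],
)
print("invocation state:", invocation["state"])
```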
B. Complex (Multi-Dimensional) Big Data Analytics
Besides providing the BioMet toolbox mentioned in Section III.A above, Chalmers University is very active overall in the analysis of complex, multi-dimensional big data. Chalmers hosts the Stochastic Centre, a centre of excellence supported by the Swedish National Science Foundation, the Swedish Foundation for Strategic Research (SSF), and the Wallenberg Foundation. The Stochastic Centre specialists have created a number of mathematical models for the analysis of bioinformatics big data. Examples of their work include high-resolution, data-derived models of the glioblastoma subtypes, which aim to define important biological differences between subtypes, identify candidate disease-driving genes, and identify drug targets [14]. Chalmers is also involved in a large big data analytics project supported by SSF and is part of a national big data analytics network involving leading companies such as IBM, Ericsson, Volvo and AstraZeneca as partners.

C. Next Generation Sequencing and Analytics
Chalmers cooperates closely with the Sahlgrenska Medical Hospital, where a Genomics Core Facility with new Illumina HiScanSQ and MiSeq instruments has been created. The facility allows studying the diversity of cancer cell types using metagenomic analysis of NGS data, and it performs GWAS studies. Gothenburg hosts a number of groups with expertise in GWAS studies. Integration of such expert teams and next generation bio-facilities within a global network of biomedical research is essential to successful research collaborations.

D. Visualization of Big and Complex Data
An important element of bioinformatics analysis is data visualization. In Europe, a Marie Curie EU project on Data Intensive Visualization and Analysis methodologies in data-driven science and technology application domains has been established (http://diva-itn.ifi.uzh.ch/). PSNC is an ICTBioMed partner active in the development of advanced multi-dimensional visualization tools. It has created the Vitrall system [15], which enables distributed web-based visualization using multiple GPUs for scalable rendering. It allows the creation of distributed collaborative environments with access enabled via a number of user interaction tools, including mobile devices equipped with sensors for detecting movement of the device. Experiments performed using Vitrall include the processing of medical imaging data, such as CT and MRI scans, to create stereoscopic models of human organs [16].

E. Large Scale Simulation Frameworks
The creation and visualization of models in biomedicine often requires simulation systems enabling large-scale processing of data and information. These simulations are aimed at the development of virtual physiological human (VPH) models. In the cancer research domain, a prominent VPH simulation application, the Oncosimulator, has been developed by a team of bio-scientists from the University of Athens (Greece), initially within the ACGT project. This tool simulates cancer tumor growth in response to treatment. The model was designed for two kinds of cancer, Wilms tumor and breast cancer, and a new model for leukemia is under way. The role of PSNC, also an ACGT participant, has been two-fold. On the one hand, the PSNC team was responsible for optimizing the code and adapting it to HPC architectures, including parallelization and GPU adaptation. On the other hand, PSNC has been involved in the provision of an integrated environment for application execution on HPC resources with a user-friendly web interface and real-time visualization.

IV. PLATFORM DEPLOYMENT AND INTEGRATION
A. International Network Connectivity
All of the partner institutions providing significant physical resources (hardware infrastructure) to the ICTBioMed collaboration are connected via world-wide networking resources. The global connectivity between sites is provided by high-speed National Research and Education Networks (NRENs) and by regional and local research networks. This includes the GÉANT network in Europe, the Internet2 network in North America, and TEIN3 in Asia, providing multi-gigabit connectivity for researchers seeking collaboration between remote institutions. India has taken steps to connect with Europe and the USA with multiple 10 Gbps circuits, and Notre Dame has multiple 10 Gbps paths to Internet2, which coordinates peering to the ICTBioMed partners. Poznan is connected to international research networks through multiple 10 Gbps channels and peerings. Chalmers is connected via a 40 Gbps link. OHSL is investigating 100 Gbps connectivity through Internet2. The ICTBioMed partners have all deployed instances of perfSONAR [17] in order to establish a network baseline and monitor network performance. perfSONAR reports on the paths that traffic traverses and helps identify bottlenecks, to make sure the network connectivity is capable of supporting the desired target rates for the project applications. perfSONAR contains a set of services delivering bandwidth, latency, and packet loss measurements in a federated environment. Scheduled throughput, one-way latency, and ping tests were configured between participating hosts connected to the partners' networking infrastructure, as close as possible to the same subnet as the end systems; an example of scheduling such a test is sketched below. In order to easily recognize ICTBioMed test instances, a new perfSONAR community called "ICTBioMed" was created to form affiliations of collaborating monitoring instances within the perfSONAR world. All partners joined this community, which is now publicly visible.

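As an illustration, the snippet below runs a one-off throughput test between two measurement hosts using pscheduler, the task scheduler shipped with current perfSONAR toolkits (older deployments used BWCTL instead). The host names are placeholders, and the exact CLI options should be verified against the deployed perfSONAR version.

```python
"""Hedged sketch: running an ad-hoc perfSONAR throughput test via the
pscheduler CLI from Python. Host names are placeholders; flags should
be checked against the installed perfSONAR version."""
import subprocess

SOURCE = "ps.psnc.example.net"  # placeholder measurement host
DEST = "ps.nd.example.net"      # placeholder measurement host

# 'pscheduler task throughput' schedules a single throughput
# measurement between the two hosts and prints the result.
result = subprocess.run(
    ["pscheduler", "task", "throughput",
     "--source", SOURCE, "--dest", DEST],
    capture_output=True, text=True, check=False,
)
print(result.stdout)
if result.returncode != 0:
    print("test failed:", result.stderr)
```

Regular (rather than ad-hoc) tests of this kind, stored centrally, are what produce the baselines discussed next.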
The results of multiple rounds of network bandwidth and latency tests have been analyzed to identify transit bottlenecks such as firewalls, NAT limitations, and errant routing paths. The data will be collected throughout the project lifetime to assess and improve path performance. ICTBioMed's network now functions as a reliable unified resource that the partners continuously monitor to assure its availability.

B. OpenStack Deployment and Federation
The emergence of cloud computing has changed every aspect of the collaboration and provided a ground-breaking opportunity for research. The ICTBioMed consortium is leveraging its capabilities and its experience with private clouds [19] to create a pilot cloud with the capabilities needed to meet the basic research requirements of biomedical researchers across the globe. While cloud standards and technologies are still in a high state of flux, our consortium has selected the OpenStack platform as the basis for our federated infrastructure. The objectives of the Global ICTBioMed Cloud Pilot are:

- A unified interface to manage cloud resources distributed in different locations (PSNC, Chalmers, Notre Dame and C-DAC).
- A common cloud architecture agreement among peers.
- VM migration between clouds.
- Testing of multi-zone cloud infrastructures.
- Testing of resource sharing.
- Replication and sharing of data sets.
- Scientific use cases for shared cloud projects.

The proposed reference architecture consists of common cloud middleware software installed at all locations. All locations will run compatible OpenStack deployments with site-specific entry gateways, which implement additional site-specific security and authentication mechanisms for users accessing the cloud. The model employs a resource sharing mechanism in which all resources are transparent to the user: the user can access a resource at any site without knowing its actual location. Alternatively, the partner locations can be represented as zones, and the user can access the desired zone according to proximity. Specific to this project, we are working to exploit various components of OpenStack via their APIs and the granular control and resource partitioning available through regions, availability zones, host aggregates, and projects. Additionally, we enable the GUI dashboard ("Horizon"), providing a web front end to the other OpenStack services such as Nova, Neutron, Cinder, Swift, Glance and Keystone. We allocate Regions, and each Region has its own full OpenStack deployment, including its own API endpoints, networks and compute resources. Inside a Region, compute nodes can be logically grouped into Availability Zones for selection when launching a new VM instance. Besides Availability Zones, compute nodes can also be logically grouped into Host Aggregates. Host Aggregates carry metadata to tag groups of compute nodes; for example, we can group nodes with SSD disks into one Host Aggregate and nodes with 10 Gb NICs into another. Different Regions share one set of federated Keystone and Horizon components to provide access control. An additional and related challenge to federated access is prioritized and dedicated resource access. Our goal is to share the resources in a way that external ICTBioMed users and local campus users at each location are able to employ the resources in a manner that reflects appropriate ownership and sharing intents. A minimal sketch of this region and aggregate setup appears below.

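The following sketch shows the kind of region, availability zone, and host aggregate handling described above, using the openstacksdk Python client. The cloud profile name, host name, and image, flavor, and network IDs are placeholders, and the method names follow recent openstacksdk releases; they should be verified against the deployed OpenStack version.

```python
"""Hedged sketch: grouping compute nodes with host aggregates and
launching a VM into an availability zone via openstacksdk.
Cloud name, host, and image/flavor/network IDs are placeholders."""
import openstack

# Connect to one Region; credentials come from clouds.yaml under a
# hypothetical profile name such as 'ictbiomed-psnc'.
conn = openstack.connect(cloud="ictbiomed-psnc")

# Create a host aggregate for SSD-equipped nodes and expose it as an
# availability zone that users can target at launch time.
ssd_agg = conn.compute.create_aggregate(
    name="ssd-nodes", availability_zone="ssd-zone")
conn.compute.add_host_to_aggregate(ssd_agg, "compute-01")  # placeholder
conn.compute.set_aggregate_metadata(ssd_agg, ssd="true")

# Launch a VM pinned to that availability zone. The IDs are
# placeholders for site-specific values.
server = conn.compute.create_server(
    name="biomed-analysis-vm",
    image_id="IMAGE_UUID",
    flavor_id="FLAVOR_UUID",
    networks=[{"uuid": "NETWORK_UUID"}],
    availability_zone="ssd-zone",
)
server = conn.compute.wait_for_server(server)
print(server.status)
```

Because each Region is a full OpenStack deployment sharing federated Keystone, the same pattern repeats per site, with only the cloud profile changing.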
C. System Evolution in Collaboration
Because ICTBioMed is a collaborative infrastructure, we must maintain an agreed model for system deployment and evolution. Figure 2 provides an organizational framework for the biomedical and ICT researchers, engineers, and technicians to simultaneously use and develop the systems. More importantly, it specifies a communication framework between users and developers, with a continuous roadmap generated in conjunction with public authorities such as the US National Institutes of Health and with academic, medical hospital, and industry advisors. Additionally, we will work to share best practices and identify areas of collaboration with related peer projects such as the ARES [21,22] and Bionimbus [23] projects.


Figure 2: ICTBioMed collaborative model

V. CONCLUSION
For many years, scientific innovation has been a driving force for real progress in the medical sciences. At the same time, scientific research, which is at the forefront of products and technologies available on the market (and provides proof of concept for new ideas), requires an advanced ICT infrastructure (e-Infrastructure). Examples include the US NSF investments in cyberinfrastructure, European research infrastructure via the ICT Programme in e-Infrastructure, eScience in Great Britain, the Polish PIONIER Programme, and the Nordic eScience Globalisation Initiative (NeGI). One of the most important aspects of advancing biomedical research is to leverage the results of R&D in computer science and to make more effective use of existing informatics tools, databases, and advanced infrastructures worldwide for the testing and experimentation of novel methods in medical research. The informatics world has the potential to deliver this critical capacity by ensuring that high-speed broadband connectivity (e.g., Internet2, GÉANT, NRENs), high-performance computing infrastructures,

dynamic and archival data storage, test beds, and experimental facilities are always available and increasingly intuitive for end-user scientists. Advanced software tools and technologies to improve data acquisition, management, analysis, and dissemination are required to make this possible. A collaborative effort is needed to support, with an agile methodology, software and technology development in both the biomedical and informatics domains. Progress in biomedical research requires integration using ICT methods and resources. The complexity of biomedicine, market forces, and social expectations make broad integration across diverse organizations difficult. Despite many advances in research, the systems and solutions available on the market are neither comprehensive nor satisfactory. Grand challenges in biomedical research remain unsolved not simply because of a lack of intellectual talent but also due to many barriers: technical, social, market, and legal. It is our collective hope to break down these barriers through collaboration within the ICTBioMed consortium and with its external partners, developing a robust and open IT infrastructure to meet the needs of biomedical research.

ACKNOWLEDGEMENTS
All of the authors would like to thank their parent institutions for support of this collaboration. The Notre Dame Center for Research Computing would like to thank engineers Steve Bogol and Rich Sudlow, who were instrumental in configuring OpenStack and diagnosing perfSONAR network tests between ND and our partner institutions. We would like to thank Internet2 for consultation and collaboration on this and related network research projects.

REFERENCES
[1] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia, "A view of cloud computing," Commun. ACM, 53(4):50-58, April 2010.
[2] L. Chin, W. C. Hahn, G. Getz, and M. Meyerson, "Making sense of cancer genomic data," Genes and Development, 25(6):534-555, 2011.
[3] L. Chin, J. N. Andersen, and P. A. Futreal, "Cancer genomics: from discovery science to personalized medicine," Nature Medicine, 17(3):297-303, 2011.
[4] M. Uhlen, P. Oksvold, L. Fagerberg, E. Lundberg, K. Jonasson, M. Forsberg, M. Zwahlen, C. Kampf, K. Wester, S. Hober, H. Wernerus, L. Björling, and F. Ponten, "Towards a knowledge-based Human Protein Atlas," Nat Biotechnol, 28(12):1248-1250, 2010.
[5] P. T. Shannon, D. J. Reiss, R. Bonneau, and N. S. Baliga, "The Gaggle: An open-source software system for integrating bioinformatics software and data sources," BMC Bioinformatics, 7:176, 2006.
[6] J. Walkowska and M. Werla, "Advanced Automatic Mapping from Flat or Hierarchical Metadata Schemas to a Semantic Web Ontology," Lecture Notes in Computer Science, vol. 7489, pp. 260-272, 2012.
[7] M. Garcia-Albornoz, S. Thankaswamy-Kosalai, A. Nilsson, L. Vämo, I. Nookaew, and J. Nielsen, "BioMet Toolbox 2.0: genome-wide analysis of metabolism and omics data," Nucleic Acids Res, doi:10.1093/nar/gku371, 2014.
[8] M. Russell, P. Dziubecki, P. Grabowski, M. Krysinśki, T. Kuczyński, D. Szjenfeld, D. Tarnawczyk, G. Wolniewicz, and J. Nabrzyski, "The Vine Toolkit: A Java Framework for Developing Grid Applications," Lecture Notes in Computer Science, vol. 4967, pp. 331-340, 2008.
[9] B. Limaye et al., "Anvaya: A workflows environment for automated genome analysis," J. Bioinform. Comput. Biol., 10:1250006, 2012, doi:10.1142/S0219720012500060.
[10] J. Krüger, R. Grunzke, S. Gesing, S. Breuers, A. Brinkmann, L. de la Garza, O. Kohlbacher, M. Kruse, W. Nagel, L. Packschies, R. Müller-Pfefferkorn, P. Schäfer, C. Schärfe, T. Steinke, T. Schlemmer, K. Warzecha, A. Zink, and S. Herres-Pawlis, "The MoSGrid Science Gateway - A Complete Solution for Molecular Simulations," Journal of Chemical Theory and Computation, 10(6):2232-2245, 2014.
[11] J. Goecks, A. Nekrutenko, J. Taylor, and The Galaxy Team, "Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences," Genome Biol., 11(8):R86, 2010.
[12] D. Blankenberg, G. Von Kuster, N. Coraor, G. Ananda, R. Lazarus, M. Mangan, A. Nekrutenko, and J. Taylor, "Galaxy: a web-based genome analysis tool for experimentalists," Current Protocols in Molecular Biology, Chapter 19: Unit 19.10.1-21, 2010.
[13] B. Giardine, C. Riemer, R. C. Hardison, R. Burhans, L. Elnitski, P. Shah, Y. Zhang, D. Blankenberg, I. Albert, J. Taylor, W. Miller, W. J. Kent, and A. Nekrutenko, "Galaxy: a platform for interactive large-scale genome analysis," Genome Research, 15(10):1451-1455, 2005.
[14] R. Jörnsten et al., "Network modeling of the transcriptional effects of copy number aberrations in glioblastoma," Mol Syst Biol, 7:468, 2011.
[15] P. Śniegowski, M. Błażewicz, G. Grzelachowski, T. Kuczyński, K. Kurowski, and B. Ludwiczak, "Vitrall: Web-Based Distributed Visualization System for Creation of Collaborative Working Environments," Lecture Notes in Computer Science, vol. 7203, pp. 337-346, 2012.
[16] M. Ciżnicki, M. Kierzynka, K. Kurowski, B. Ludwiczak, K. Napierała, and J. Palczyński, "Efficient Isosurface Extraction Using Marching Tetrahedra and Histogram Pyramids on Multiple GPUs," Lecture Notes in Computer Science, vol. 7204, pp. 343-352, 2012.
[17] perfSONAR, available at http://psps.perfsonar.net/
[18] A. Saxena, S. Dahale, S. Jain, E. Ramakrishnan, V. Gavane, R. Gadhari, P. Vats, and K. Sunitha Manjari, "TaxoGrid: Molecular Phylogeny on Garuda Grid," International Journal of Computer Science Issues (IJCSI), 9(6), 2012.
[19] B. Pathak, S. Rajesh, S. Jain, A. Saxena, R. Mahajan, and R. Joshi, "Private Cloud Initiatives Using Bioinformatics Resources and Applications Facility (BRAF)," International Journal on Cloud Computing: Services and Architecture (IJCCSA), 2(6):25-34, 2012.
[20] S. Gesing, R. Grunzke, J. Krüger, G. Birkenheuer, M. Wewior, P. Schäfer, B. Schuller, J. Schuster, S. Herres-Pawlis, S. Breuers, A. Balasko, M. Kozlovszky, A. Szikszay Fabri, L. Packschies, P. Kacsuk, D. Blunk, T. Steinke, A. Brinkmann, G. Fels, R. Müller-Pfefferkorn, R. Jäkel, and O. Kohlbacher, "A Single Sign-On Infrastructure for Science Gateways on a Use Case for Structural Bioinformatics," Journal of Grid Computing, 10(4):769-790, 2012.
[21] D. Valocchi, G. Reali, M. Femminella, and E. Nunzi, "The ARES Project: Network Architecture for Delivering and Processing Genomics Data," 2014 IEEE 3rd Symposium on Network Cloud Computing and Applications (NCCA 2014), pp. 23-30.
[22] M. Femminella, E. Nunzi, G. Reali, and D. Valocchi, "Networking issues related to delivering and processing genomic big data," International Journal of Parallel, Emergent and Distributed Systems, 2014.
[23] A. P. Heath, M. Greenway, R. Powell, J. Spring, R. Suarez, D. Hanley, C. Bandlamudi, M. E. McNerney, K. P. White, and R. L. Grossman, "Bionimbus: a cloud for managing, analyzing and sharing large genomics datasets," J Am Med Inform Assoc, published online first 24 January 2014, doi:10.1136/amiajnl-2013-002155.
