
Semantically-guided Workflow Construction in Taverna: The SADI and BioMoby Plug-ins

David Withers1, Edward Kawas1, Luke McCarthy1, Benjamin Vandervalk1, and Mark Wilkinson1

1 Heart + Lung Institute at St. Paul’s Hospital, University of British Columbia, Vancouver, BC, Canada [email protected]

Abstract. In the Taverna workflow design and enactment tool, users often find it difficult both to manually discover a service or workflow fragment that executes a desired operation on a piece of data (both semantically and syntactically), and to connect that service into the workflow such that appropriate connections are made between input and output data elements. The BioMoby project, and its successor the SADI project, embed semantics into their data structures in an attempt to make the purpose and functionality of a Web Service more computable, and thereby facilitate service discovery during workflow construction. In this article, we compare and contrast the functionality of the BioMoby and SADI plug-ins to Taverna, with a particular focus on how they attempt to simplify workflow synthesis by end-users. We then compare these functionalities with other workflow-like clients we (and others) have created for the BioMoby and SADI systems, discuss the limitations of manual workflow synthesis, and contrast these with the opportunities we have found for fully automated workflow synthesis using the semantics of SADI.

Keywords: Semantic Web, Web Services, SADI, BioMoby, Taverna, workflow

1 Introduction

Biology is increasingly becoming an in silico science, largely as a result of the rapid advance of high-throughput technologies for DNA sequencing, protein analysis, gene expression, metabolic profiling, and genotyping. Many studies currently being undertaken in bio/medicine require access to, integration of, and analysis of some or all of these data types. The scale, scope, and complexity of these analyses make the traditional approach of copy/pasting data into Web pages untenable, and this has led to the emergence of workflows as a primary “object” in modern biology [1]. This shift from manual Web-based analysis to semi- or fully-automated analytical pipelines has been mirrored by the concomitant emergence of software and community support for this new in silico paradigm.

For non-coders, workflow design and enactment software makes it possible to conduct the kinds of high-throughput analyses that were formerly the exclusive domain of bioinformatics professionals. Moreover, emergent public workflow repositories like myExperiment [2] will increasingly play a role in supporting workflow construction by end-users, through reuse, extension, or re-purposing of other researchers’ workflows [3]. Nevertheless, recent studies [4, 5, 6] have demonstrated that end-user biologists continue to have problems constructing useful or functional workflows, even when presented with existing scaffolds or templates. This is, at least in part, due to the difficulty of manually discovering a service or workflow fragment that does what is necessary, and correctly integrating that service into the workflow, both syntactically and semantically.

Taverna is a general-purpose workflow design tool designed to manage most “flavours” of Web Service (CGI, SOAP, BioMoby, etc.), and handle data flow related to any domain of investigation [7]. This broad support is important given that many biomedical investigations require access to various types of biological and chemical data, pathway data, and statistical algorithms; however, it comes at a cost of complexity – that is, the number of Web Services available in the default Taverna interface is already on the order of 3000 or more, and adding additional service endpoints is straightforward. This makes Taverna distinct from some other workflow tools in the bioinformatics space (for example, GenePattern [8]), where the available services are pre-screened and pre-selected by the software designers to solve specific problems, and therefore number only in the tens or hundreds. Consequently, however, Taverna suffers an embarrassment of riches, or more precisely, the end-users of Taverna “suffer” when they must identify and choose the service they require from a vast array of options at any given point in workflow construction.

One of the earliest examples of semantically-supported Web Service discovery was exhibited by the TAMBIS (Transparent Access to Multiple Bioinformatics Information Sources) project [9]. Since then, a wide variety of initiatives have taken similar approaches to improving service discovery (three early high-profile projects are reviewed in Lord et al. [10]). The common thread between all of them is that their individual service registries contain semantic information beyond that provided by the Service’s WSDL file. Here we compare and contrast only two of these – BioMoby [11] and the SADI Framework [12] – since both of these Semantic Web Service frameworks have created plug-ins to Taverna that leverage the additional semantic search power of their respective registries.

In the remainder of this article we will first briefly describe the BioMoby and SADI Semantic Web Service frameworks. We will then describe the functionality of their respective plug-ins to Taverna, and discuss how the nature of their support for Web Service discovery differs. Finally, we compare these functionalities with other workflow-like clients we (and others) have created for the BioMoby and SADI systems, discuss the limitations of manual workflow synthesis, and contrast these with the opportunities we have found for fully automated workflow synthesis using the semantics of SADI.

2 BioMoby Semantic Web Services

BioMoby was initiated in 2001 from within the model organism database community. It aimed to standardize methodologies to facilitate interoperable information exchange and access to analytical resources by creating a community-built and community-curated ontology of bioinformatics data-types and analytical operations. The key to BioMoby’s interoperable behaviours was its invention of a ‘boutique’ syntax in the form of an ontology-based XML schema – an instance of any ontological node had a specific and predictable XML representation that could be automatically determined by traversing the ontology. Thus machines could receive information of an unknown type, and determine not only its XML structure, but also the “meaning” of every sub-structure, by referring to the BioMoby data-type ontology. All BioMoby-compliant Web Services consumed and produced data in this boutique XML schema; thus services consuming ontologically-compatible data-types were, by definition, interoperable (at least syntactically, and to a large degree semantically).

Service discovery in BioMoby is accomplished by querying a centralized registry (“Moby Central”) where Services are indexed by ontology-based input data-type, ontology-based output data-type, a controlled vocabulary of service functionality types, and service provider identification. It is important to emphasize that Moby Central, like traditional Web Services registries, indexes the input and output data-types as “globs” of data (effectively, a reference to an XML schema) and therefore does not explicitly expose the finer sub-structure of the data (see footnote 1). This observation is critical to understanding the contrast between BioMoby and the SADI framework described in section 3.

Despite being key to interoperability, the BioMoby data-type ontology was its primary weakness. Not only was the XML representation of the ontology project-specific, the ontology itself took the form of a large centralized resource that, while being openly community-editable, still required community agreement and consensus buy-in to ensure interoperability. In practice, this buy-in was only tentative, and many providers created duplicate data-types with different names, and in some cases duplicated entire sub-branches of the ontology to be specific to their needs/terminologies/projects. Thus, as new standards for representing and publishing ontologies became available from the W3C, we undertook to invent a new semantic web service framework – SADI (Semantic Automated Discovery and Integration) – guided by the successes and failures of the BioMoby project.

1 In principle, such sub-structures could be determined by exploring the data-type ontology; however, in practice no client application ever did this.

3 SADI Semantic Web Services

SADI is a set of guidelines for Semantic Web Service provision that aim to maximize interoperability between Web Services while minimizing the complexity of service provision by the resource providers. While SADI does not invent any new technologies or standards, project-relevant codebases in Java and Perl have been developed to support Web Services compliant with these guidelines and best practices.

Like the BioMoby project, SADI embeds semantics into its data structures in an attempt to make the purpose and functionality of a Web Service more computable. Where BioMoby used a boutique XML serialization of an ontology in order to represent semantically-grounded messages, SADI utilizes the W3C standards of Resource Description Framework (RDF) and Web Ontology Language (OWL) and their respective XML serializations. Service input and output are defined as OWL-DL classes, and OWL Individuals of these classes (in RDF) are consumed and produced during service invocation.

The key novelty of the SADI project lies in one very simple best-practice guideline: the identity of the input RDF node and the identity of the output RDF node must be identical. As a consequence, every service becomes an annotation service, where the input node is “decorated” with labeled RDF relationships to new data nodes derived from the execution of the service. As an extension to this, since input and output Classes are defined in OWL-DL (where property and value restrictions are used to describe the features of input and output class membership), it becomes possible to determine what features are added to an input node simply by examining the difference between the input and output OWL Class definitions, since these describe, by definition, the same “entity” (URI). These features are indexed in a registry and used for Service discovery. Note that this is subtly, but critically, different from the BioMoby registry index: where BioMoby indexes only the data-type “glob”, in SADI the properties added to the input by the service are indexed in the registry, since these are part of the output data-type definition.
Thus, in SADI, we can support much finer-grained searches of the data that is output from participating services. Service discovery in SADI is based on searches for services that consume a particular set of data properties, and produce one or more new properties of interest based on those properties. Note again that searching is rarely, if ever, done for a particular output based on its Class, but rather for sets of specific data properties in relation to the input data. This approach was designed to mimic the linguistics of scientific query, where researchers frequently ask questions about the relationship between two pieces of data (e.g. “what is the coding sequence of the BRCA1 gene?”). While in BioMoby one could search for a service that consumed the data-type “gene name” and produced the data-type “nucleotide sequence”, there would be ambiguity about the exact relationship between that gene and that sequence (e.g. is it the gene sequence, the coding sequence, the promoter sequence, or a contig that contains that gene?). Conversely, in SADI, one might search for services that provide the “hasCodingSequence” property on gene names, thus leaving little ambiguity about what the service does.
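The indexing idea described in this section can be illustrated with a small sketch. It is not the SADI registry implementation: each OWL class is reduced here to a plain set of property names, and the service "getCodingSequence" and the properties "hasKeggId" and "hasGeneName" are hypothetical names introduced only for illustration. The sketch shows how the properties a service adds can be derived as the difference between its output and input definitions, and how a search by property might then proceed.

```python
from collections import defaultdict

# Assumption: each class definition is simplified to the set of properties
# it restricts. Real SADI classes are OWL-DL definitions, not plain sets.
# "hasKeggId", "hasGeneName" and "getCodingSequence" are hypothetical names.
SERVICES = {
    "getUniProtByKeggGene": {
        "input": {"hasKeggId"},
        "output": {"hasKeggId", "encodes"},
    },
    "getCodingSequence": {
        "input": {"hasGeneName"},
        "output": {"hasGeneName", "hasCodingSequence"},
    },
}


def added_properties(service):
    """Properties a service attaches: output restrictions minus input ones."""
    return service["output"] - service["input"]


def build_index(services):
    """Map each added property to the names of the services that provide it."""
    index = defaultdict(set)
    for name, service in services.items():
        for prop in added_properties(service):
            index[prop].add(name)
    return index


def find_services(index, services, wanted_property, available_properties):
    """Services that add wanted_property and whose input needs are satisfied."""
    return sorted(
        name
        for name in index.get(wanted_property, set())
        if services[name]["input"] <= available_properties
    )


index = build_index(SERVICES)
print(find_services(index, SERVICES, "hasCodingSequence", {"hasGeneName"}))
```

Because the input and output describe the same node, the set difference is meaningful: whatever appears only in the output definition is, under these simplifying assumptions, what the service contributes.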

4 The Taverna BioMoby and SADI Plugins

BioMoby and SADI are both products, in part, of the Genome Canada Bioinformatics Platform, where Web Service Workflows are the means by which we provide bioinformatics support to Platform end-users. The Platform has chosen the Taverna client application as one of its primary end-user tools due to its powerful, yet straightforward interface. In total, >1500 BioMoby Services, and an increasing number of SADI services (currently near 100) are available. However, as discussed earlier, the large number of Web Services available in the Taverna interface makes workflow construction a challenge even for expert end-users. As such, we have undertaken to create plug-ins to Taverna that make it easier for our biologist end-users to work with BioMoby and SADI Services, both at the level of Service discovery and at the level of correct “wiring” of Services into workflows.

4.1 The BioMoby Plugin to Taverna

The BioMoby plugin is described in detail elsewhere [13], but we will recap the salient features here. When searching for a Service there are two common scenarios: either the user has a particular data-type that they want to submit to a Service for analysis, or the user has already created a fragment of a workflow and now wishes to pipeline the output from that workflow into a new Service. In BioMoby, these two cases are identical in that both a standalone piece of data and a Service output are strictly typed by the BioMoby data-type ontology. Thus the registry query executed by the Taverna plugin is for services that consume data-type “X” as their input. This search can be enhanced by asking the registry to search for services that consume “X”, or an ontological parent-type of “X”, as their input (or conversely, for services that produce “X” as their output when constructing workflows in the reverse direction), as shown in Figure 1.

The resulting matches can be ordered by service name (a human-readable and often semi-informative string), by service type (a controlled vocabulary describing kinds of bioinformatics operations), or by service provider (by their unique provider URI string), as shown in Figure 2. In addition, a discovered service can be further examined to determine what BioMoby data-type it will output if invoked. When the desired Service is selected, the Taverna plugin automatically connects it to the workflow; this connection is guaranteed to be correct since the strict data-typing of BioMoby, and the ontological regulation of its data structures, ensure that data can be passed verbatim from one service to another so long as the data-types are ontologically related.
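The registry query described above can be sketched as follows. This is a toy model, not the Moby Central API: the parent relationships in PARENT are assumed for illustration, and only the service names and data-type names are taken from Figures 1 and 2. It shows a forward search (services consuming a type, optionally including its ontological parents) and a reverse search (services producing a type).

```python
# Assumed fragment of a data-type hierarchy (child -> parent); the real
# BioMoby ontology is larger and these particular links are illustrative.
PARENT = {"FASTA": "Object", "NCBI_Blast_Text": "Object"}

# Service name -> (input data-type, output data-type), as shown in the figures.
SERVICES = {
    "getGenBankFASTA": ("Object", "FASTA"),
    "getDragonBlastText": ("FASTA", "NCBI_Blast_Text"),
}


def ancestors(data_type):
    """Return data_type followed by its parents, up to the root."""
    chain = [data_type]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def consumers(data_type, include_parents=False):
    """Forward search: services whose input is data_type (or a parent of it)."""
    accepted = ancestors(data_type) if include_parents else [data_type]
    return sorted(name for name, (inp, _) in SERVICES.items() if inp in accepted)


def producers(data_type):
    """Reverse search: services whose output is exactly data_type."""
    return sorted(name for name, (_, out) in SERVICES.items() if out == data_type)


print(consumers("FASTA"))
print(consumers("FASTA", include_parents=True))
print(producers("FASTA"))
```

Note that the match is made on the data-type name as a whole; nothing in this model looks inside the data structure, which mirrors the “glob” indexing of the BioMoby registry.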


Fig. 1. The BioMoby Web Service search interface. As shown in A, a Web Service (getGenBankFASTA) can be examined to show the BioMoby datatypes consumed and produced by its input and output ports. In this case, the input port consumes the BioMoby datatype “Object” (which is used to pass database identifiers) and its output port produces the BioMoby datatype “FASTA”. The words in brackets following the data-type names are human readable annotations of the data-type added by the service provider to help explain the purpose of each input and output parameter. This data-type information can be used, by right-clicking on the data-type, to discover BioMoby services capable of consuming (as shown in B) or producing, that datatype. This allows workflow construction to be achieved either in the forward or reverse direction.

Fig. 2. The results of searching for services that consume FASTA (see Figure 1B). In this view, the results are sorted by service provider; the provider “antirrhinum.net” is expanded to reveal the service – getDragonBlastText – that matches those criteria. In addition, the data-type output from that service, NCBI_Blast_Text, is provided in the expanded view to assist the user in determining if this service is likely to be appropriate for their needs.

A typical workflow resulting from this iterative discover/connect process is shown in Figure 3. Notice that the widgets representing each service show the transformative nature of that service – i.e., the data-type that goes in, and the datatype that comes out, are displayed on the widget. The annotations of those data-types - effectively, a human-readable single-word name given to the input or output – are also displayed, but may or may not be informative.

Fig. 3. A BioMoby workflow that extracts the gene names, protein names, and protein sequences that participate in a given biochemical pathway in the KEGG database. Orange nodes are BioMoby services; white nodes are parsers specific to BioMoby data types to enable extraction of data from BioMoby’s “boutique” XML schema; blue pentagons are output data buckets. In each node, the top layer describes the input ‘ports’ by their data-type name (e.g. “Object” – a BioMoby datatype for record identifiers), the middle layer is the service name, and the lower layer describes the output ‘ports’ by their data type name (e.g. “SwissProt_Text”). For both input and output ports, the human readable parameter name of the port is in brackets (e.g. “keggId”).

4.2 The SADI Plugin to Taverna

Superficially, the SADI plugin to Taverna has near-identical functionality to that of the BioMoby plugin. Generally speaking, the starting point of service discovery is the same – either a specific data-type, or an existing workflow fragment. The search against the SADI registry is, therefore, for services that consume a particular type of data, as shown in Figure 4. What differs from BioMoby, however, is how that data-type is described. In a SADI search, the information sent to the registry query is an OWL Class-name, rather than a (relatively) opaque BioMoby data-type name.

Fig. 4. The search interface of the SADI plugin. Search can now be conducted by right-clicking on the service widget itself in the workflow display, which is likely to be more intuitive for the biologist end-user who expects that workflow diagram to be interactive [4]. Search is simply a matter of clicking the “Find services” option of the pop-up menu.

Fig. 5. The results of searching for services that consume KEGG_Record (see figure 4). All valid services are displayed, together with some metadata about the service (service name, and human-readable description of service functionality), and the properties that the service will attach to its input data. Note the subtle distinction that, unlike the BioMoby plug-in, it is the name of the property, rather than its data-type, which is displayed to the end-user.

As such, the search API that accesses the SADI registry can use the property restrictions within that OWL Class definition to search for SADI services that consume sub-features of the output data. Thus, the SADI search is more semantically rich than a BioMoby search, since arbitrary sub-sets of data properties will discover services capable of consuming those subsets, rather than strictly matching based on a hierarchical data-type ontology. This subtle distinction is perhaps best described with an example. Some service Foo attaches the properties of “protein identifier”, “molecular weight”, and “Gene Ontology annotation” to its input. Foo can then be used to discover a downstream service Bar that consumes only “molecular weight” as its input, or combinations of “protein identifiers” and “Gene Ontology annotations” as its input. The matching is accomplished by a DL reasoner, which compares the properties in-hand to the input OWL property restrictions of each service in the registry. Once discovered, Foo and Bar can be connected accurately, both syntactically and semantically, without human intervention.
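The Foo/Bar example can be made concrete with a short sketch. A plain subset test stands in here for the DL reasoner, which is a considerable simplification; the services "Baz" and "Qux" and the property "3D structure" are hypothetical additions used only to show a second match and a non-match.

```python
# Properties that service Foo attaches to its input node (from the example).
FOO_OUTPUT = {"protein identifier", "molecular weight", "Gene Ontology annotation"}

# Hypothetical registry: service name -> properties its input class requires.
# Only "Bar" comes from the text; "Baz" and "Qux" are illustrative.
REGISTRY = {
    "Bar": {"molecular weight"},
    "Baz": {"protein identifier", "Gene Ontology annotation"},
    "Qux": {"protein identifier", "3D structure"},
}


def discover(available, registry):
    """Services whose required input properties are all in hand.

    A subset check approximates what a DL reasoner would decide by
    classifying the data against each service's input OWL class.
    """
    return sorted(
        name for name, required in registry.items() if required <= available
    )


print(discover(FOO_OUTPUT, REGISTRY))  # ['Bar', 'Baz']
```

"Qux" is excluded because Foo's output carries no "3D structure" property, whereas both "Bar" and "Baz" need only a subset of what is available. A hierarchy-based match on a single data-type name could not express either of these partial matches.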

Fig. 6. A SADI workflow that extracts the gene names, protein names, and protein sequences that participate in a given biochemical pathway in the KEGG database. Square blue nodes are data inputs, green nodes are SADI Services; blue pentagons are data buckets. For each SADI service, the top layer is the input data-type (the OWL Class Name), the middle layer is the service name, and the bottom layer is the list of properties and value restrictions provided by that service. For example, the getUniProtByKeggGene Service consumes individuals of class KEGG_Record and attaches the “encodes” property which will have a value that is of type UniProt_Record.

A SADI workflow that is functionally identical to the BioMoby workflow in Figure 3 is shown in Figure 6. The salient features to note in this diagram are: (1) The output ports describe the properties being generated by the service, and the data-types of those properties. This makes the service functionality extremely transparent; for example, the upper green box would read “A KEGG pathway has a pathway gene that is represented as a KEGG Record”; (2) There is visual confirmation to the end-user that their workflow is “correct” because the naming of record identifier data-types in OWL is typically more human-readable than the equivalent BioMoby data-type ontology term. For example, the KEGG_Record OWL class would have simply been “Object” in the BioMoby data-typing ontology (with an additional attribute “namespace=KEGG_Record”; however, this attribute/value pair is not amenable to logical reasoning in any strict sense). Thus by visual inspection the user can see that the KEGG_Record from the first service is being fed as input to the KEGG_Record slot of the subsequent service. We have not yet tested the effectiveness of these small interface changes on end-user utility; however, we suspect that we will see an improvement.

5 Semantic Service Discovery in Workflow Construction

Most workflow composition systems support some form of assisted service discovery. Kepler allows the use of ontologies to describe the input and output of workflow components [14], and similarly Galaxy [15] supports both semantic service matching (using Lumina [16]), as well as ontologically-grounded lifting and lowering XML Schema, in a manner that mirrors the SAWSDL standard from the W3C [17]. The semantics of the service operation are also searchable in these systems through OWL-S/WSDL-S-like standards, where the pre-state, post-state, and functional operations of a service are semantically described.

BioMoby and SADI took simpler, but mutually distinct, approaches to describing the semantics of a Web Service. In BioMoby, the semantics of the input and output messages are embedded in the message itself through its boutique XML serialization of the BioMoby data type ontology. The semantics of a BioMoby service operation are described in a simple hierarchy of bioinformatics service types akin to the ontological hierarchy created by the myGrid project [18] for their FETA Semantic Web Service annotation system [19]. In SADI, the semantics of the input and output messages, and the semantics of the service operation itself, are both embedded in the message. This is because the SADI Semantic Web Service framework requires that service input and output RDF graphs have the same subject URI. Thus, the OWL classes that describe the input and output data-types also, implicitly, describe the difference between those data-types. This becomes an important descriptor of the functionality of the service during search, since it explicitly describes how the input and output data relate to one another. SADI does not rely on a centralized ontology to define these relationships, but rather allows the service provider to choose, or publish, any ontological predicate that semantically describes this relationship.

The embedding of semantics directly in the message in both cases has significant consequences for how services are discovered. For the BioMoby and SADI Taverna plug-ins, semantically-guided discovery of services differs in two primary ways:

1. As just described, SADI searches lack a distinct searchable feature regarding service functionality. While nothing about the SADI framework precludes the addition of detailed annotations of SADI service operations, we have not yet found a need for this; the semantic relationship between input and output is highly descriptive of what the functionality of the service must be, and has been sufficient to resolve our use-cases to date, including that of the Taverna plug-in. Moreover, while it is unlikely that this simplistic annotation will ever support fully automated workflow synthesis, there is evidence that such automation is not needed, or even wanted, by our target end-users [10]. Thus the simplification of use we gain is not, so far, detrimentally offset by a lack of support for full automation. Further, several studies [4, 6] have shown that, for our target end-users, the concept of “data-typing” is extremely foreign (described as “incomprehensible” by Gordon and Sensen) and we are hopeful that indexing services based on data-type relationships, rather than (or in addition to) data-types, will facilitate their use of the discovery tools.

2. Service discovery with the SADI plug-in allows richer semantic matching because DL reasoners can match subsets of properties within complex data objects with services capable of consuming those sub-properties. Planned user-studies which mirror those of Gordon and Sensen will determine if this additional semantic discovery-power simplifies, or complicates, the process of rational workflow construction by our target audience.

6 Other BioMoby/SADI Web Service Composition Systems

Limiting the discussion only to BioMoby and SADI Semantic Web Services, a wide array of client applications support semi- or fully-automated workflow synthesis with one or both of these frameworks. In addition to Taverna, BioMoby workflow clients include other standalone clients such as Seahawk [20], jORCA [21], and Remora [22]; web-based clients such as Gbrowse Moby [23], MOWserv [24], and Mobyle [25]; “aggregators” such as Jabba [26] and DataBiNS [27], where pre-determined Service workflows are called to create Web-page content; and finally automated workflow construction systems such as Magallanes [28], where the user specifies their starting and ending data-type, and the system determines paths through the analytical service-space that will derive that output from that input.

The newer SADI Semantic Web Service system has fewer clients so far, but both web-based and standalone tools are already available. These include SHARE [29] and the Taverna plug-in described here. Of particular interest in the context of this manuscript is the SHARE client, because of its novel approach to automated workflow synthesis. SHARE automates workflow composition using the property constraints within OWL-DL Class definitions as a guide. While in Magallanes the user is presented with a selection of reasonable pre-constructed workflows, SHARE can (often) precisely determine the path by which a complex data-type can be synthesized simply by examining the property and property-value constraints within the data-type’s definition. This is possible specifically because the SADI framework, unlike BioMoby, utilizes both data-type and relationship information in its service annotation.

While the required usability studies on the new Taverna plug-in have not yet been undertaken, we have observed notable added service-discovery power in other SADI client applications using a similar property-based searching paradigm. Moreover, there is mounting evidence that our target end-users find it difficult to comprehend and work with data-types. As such, we feel confident that these same end-users will find this new paradigm of working with the properties of data, rather than with explicit data-types, to be a much more natural way of approaching workflow construction.

Acknowledgments. The BioMoby project was funded in part by Genome Canada and Genome Prairie through the Genome Canada Bioinformatics Platform. The SADI project was funded by the Heart and Stroke Foundation of BC and Yukon, Microsoft Research, and the CIHR. Development of the BioMoby and SADI plugins to Taverna has been funded in part by Genome Canada and Genome Prairie, by expertise donated from the myGrid project, and by CANARIE through its funding of the CBRASS Project. Core laboratory funding is derived from an award from NSERC.

References

1. Goderis, A., Li, P., Goble, C.: Workflow Discovery: Requirements from E-science and a Graph-based Solution. International Journal of Web Services Research 5, vol 4. (2008)
2. Goble, C., DeRoure, D.C.: myExperiment: social networking for workflow-using e-scientists. In: Proceedings of the 2nd workshop on workflows in support of large-scale science. pp. 1-2. (2007)
3. Wroe, C., Goble, C., Goderis, A., Lord, P., Miles, S., Papay, J., Alper, P., Moreau, L.: Recycling workflows and services through discovery and reuse. Concurrency Computat: Pract. Exper. Vol 19(2), pp. 1-7. (2006)
4. Gordon, P.M.K., Sensen, C.: A Pilot Study into the Usability of a Scientific Workflow Construction Tool. Technical Report #2007-874-26. Department of Computer Science, The University of Calgary. (2007)
5. Goderis, A., Sattler, U., Lord, P., Goble, C.: Seven Bottlenecks to Workflow Reuse and Repurposing. In: Y. Gil et al. (Eds). ISWC 2005, LNCS 3729, pp. 323-337. (2005)
6. Gordon, P.M.K., Barker, K., Sensen, C.W.: Helping Molecular Biologists Effectively Build Workflows, Without Programming. In: P. Lambrix and G. Kemp (Eds.): Proceedings of 7th International Conference on Data Integration in the Life Sciences (DILS 2010) (Gothenburg, Sweden), August 25-27, 2010: pp. 74-89. (2010)
7. Oinn, T., Greenwood, M., Addis, M., Alpdemir, N., Ferris, J., Glover, K., Goble, C., Goderis, A., Hull, D., Marvin, D., Li, P., Lord, P., Pocock, M., Senger, M., Stevens, R., Wipat, A., Wroe, C.: Taverna: lessons in creating a workflow environment for the life sciences. Concurrency Computat: Pract. Exper. Vol 18(10), pp. 1067-1100. (2006)
8. Reich, M., Liefeld, T., Gould, J., Lerner, J., Tamayo, P., Mesirov, J.P.: GenePattern 2.0. Nat Genet Vol 38(5), pp. 500-501. (2006)
9. Stevens, R., Baker, P., Bechhofer, S., Ng, G., Jacoby, A., Paton, N.W., Goble, C.A., Brass, A.: Tambis: transparent access to multiple bioinformatics information sources. Bioinformatics Vol 16(2), pp. 184-5. (2000)
10. Lord, P., Bechhofer, S., Wilkinson, M.D., Schiltz, G., Gessler, D., Hull, D., Goble, C., Stein, L.: Applying semantic web services to Bioinformatics: Experiences gained, lessons learnt. In: ISWC 2004. Springer-Verlag Berlin Heidelberg, pp. 350-364. (2004)
11. The BioMoby Consortium: Interoperability with Moby 1.0 - It's better than sharing your toothbrush! Briefings in Bioinformatics. Vol 9(3), pp. 220-231. (2008)
12. Wilkinson, M.D., Vandervalk, B., McCarthy, L.: SADI Semantic Web Services - 'cause you can't always GET what you want! Services Computing Conference, 2009. APSCC 2009. IEEE Asia-Pacific, pp. 13-18. (2009)
13. Kawas, E., Senger, M., Wilkinson, M.D.: BioMoby extensions to the Taverna workflow management and enactment software. BMC Bioinformatics, Vol. 7(523). (2006)
14. Pignotti, E., Edwards, P., Preece, A., Gotts, N., Polhill, G.: Enhancing Workflow with a Semantic Description of Scientific Intent. In: 5th European Semantic Web Conference ESWC 2008, Springer, Vol 5021, pp. 644-658. (2008)
15. Taylor, J., Schenck, I., Blankenberg, D., Nekrutenko, A.: Using galaxy to perform large-scale interactive data analyses. Curr Protoc Bioinformatics. Sep; Chapter 10: Unit 10.5. (2007)
16. http://lsdis.cs.uga.edu/projects/meteor-s/downloads/Lumina/files/thesis.pdf. Downloaded May 15, 2010.
17. http://www.w3.org/2002/ws/sawsdl/ Downloaded May 15, 2010.
18. Wolstencroft, K., Alper, P., Hull, D., Wroe, C., Lord, P.W., Stevens, R.D., Goble, C.A.: The myGrid ontology: bioinformatics service discovery. International Journal of Bioinformatics Research and Applications, Vol. 3(3), pp. 303-325. (2007)
19. Lord, P., Alper, P., Wroe, C., Goble, C.: Feta: A light-weight architecture for user oriented semantic service discovery. In: Proceedings of the European Semantic Web Conference 2005, Vol. 3532, pp. 17-31. (2005)
20. Gordon, P.M.K., Sensen, C.: Seahawk: moving beyond HTML in Web-based bioinformatics analysis. BMC Bioinformatics, Vol. 8(208). (2007)
21. Martín-Requena, V., Ríos, J., García, M., Ramírez, S., Trelles, O.: jORCA: easily integrating bioinformatics Web Services. Bioinformatics, Vol. 26(4), pp. 553-559. (2010)
22. Carrere, S., Gouzy, J.: REMORA: a pilot in the ocean of BioMoby web-services. Bioinformatics, Vol. 22(7), pp. 900-901. (2006)
23. Wilkinson, M.: Gbrowse Moby: A Web-based browser for BioMOBY Services. Source Code for Biology and Medicine, Vol. 1(4). (2006)
24. Navas, I., Rojano, M., Ramirez, S., Pérez, A.J., Aldana, J.F., Trelles, O.: Intelligent client for integrating bioinformatics services. Bioinformatics, Vol. 22, pp. 106-111. (2006)
25. Néron, B., Ménager, H., Maufrais, C., Joly, N., Maupetit, J., Letort, S., Carrere, S., Tuffery, P., Letondal, C.: Mobyle: a new full web bioinformatics framework. Bioinformatics, Vol. 25(22), pp. 3005-3011. (2009)
26. http://bioinfo.mpiz-koeln.mpg.de/jabba/help.html Downloaded May 15, 2010.
27. Song, Y.C., Kawas, E., Good, B.M., Wilkinson, M.D., Tebbutt, S.: DataBiNS: a BioMoby-based data-mining workflow for biological pathways and non-synonymous SNPs. Bioinformatics, Vol. 23(6), pp. 780-782. (2007)
28. Ríos, J., Karlsson, J., Trelles, O.: Magallanes: a web services discovery and automatic workflow composition tool. BMC Bioinformatics, Vol. 10(334). (2009)
29. Vandervalk, B.P., McCarthy, E.L., Wilkinson, M.D.: SHARE: A Semantic Web Query Engine for Bioinformatics. In: The Semantic Web, LNCS Vol. 5926/2009, pp. 367-369. (2009)