Database-driven grid computing with GridBASE
Hans De Sterck (1), Chen Zhang (2), Aleks Papo (1)
(1) Department of Applied Mathematics, (2) David R. Cheriton School of Computer Science
University of Waterloo, 200 University Avenue West, Waterloo, Ontario N2L 3G1, Canada
Abstract
The GridBASE framework for database-driven grid computing is presented. The design and a prototype implementation of the framework are discussed. Industry-strength database technology plays a key role in the design of the framework. The database is used as a scalable, reliable and remotely accessible component, both for storing and organizing the configuration information of the grid, and for managing information related to the grid users and the jobs and tasks they submit for execution. Other system components are worker nodes, a simple resource broker, a grid operator console, and application clients. In analogy with electrical power grids, our design makes a clear distinction between the role played by grid users, who develop and submit application code but are otherwise mostly isolated from resource deployment and selection, and the role played by the grid operator, who is responsible for providing computing resources and assuring system availability and maintenance. Application code can be written in any language, and simple workflow support is provided. In our prototype implementation we experiment with delivery of code, input files and output files via the database component. Our approach is based on decentralization and is implemented in Java, leading to a lightweight, portable and scalable grid computing solution that is especially suited for parallel bioinformatics. Deployment of GridBASE on Ontario’s SHARCNET and application to virtual experiments in RNA folding statistics are described.
1 Introduction
There is a clear need for transparent, user-friendly and efficient distributed computing systems for bioinformatics. Many bioinformatics problems require extensive computational resources, due to the large size of the datasets to be analyzed, the high computational complexity of the analysis algorithms, or both. Parallel and distributed computing approaches are therefore steadily gaining interest in the bioinformatics field. Especially for loosely coupled problems, grid computing approaches [1] may be attractive. In grid computing, computing power is treated as an exchangeable commodity, in analogy with the electrical power distributed on a power grid.

While the grid computing idea is attractive for certain applications, and research groups around the world have dedicated substantial effort to working out the concept, it has turned out that the idea is not easy to realize in practice. Stumbling blocks include security and privacy concerns, lack of hardware or software compatibility between heterogeneous computing equipment, the complexity of the new solution environments that have been proposed, and the general inertia of the legacy computing environments that the grid may attempt to use. Grid solutions are now being used for applications in many forms, both in research and in industry, but because of difficulties like these many issues remain unresolved, and no satisfactory grid computing solution has yet emerged as a standard. Driven by the ever-increasing need for computational power, research on grid computing concepts is therefore still thriving.

This paper presents the GridBASE framework for grid computing [2]. The purpose of GridBASE is to make it easy to grid-enable a certain class of (task-farmable) applications. GridBASE is based on a few simple ideas (Fig. 1):
Figure 1. Conceptual diagram for a database-driven grid computing system, with broker, database, application, worker and operator components. The thick lines represent information transfer through database access. The thin lines represent direct control interactions between system components.
All exchange of information in the grid occurs through an SQL database, which may be distributed. The only central component of the system is this passive database. All grid activity originates from the geographically distributed grid components, which include worker, broker and application components. The workers pull jobs from the database superqueue. The use of industry-strength databases makes it possible to leverage the tremendous developments in database technology of recent years, especially with respect to scalability, performance and reliability.
Figure 2. The bag-of-tasks paradigm for distributed computing. (Diagram labels: application, task bag, result bag, task, result, worker.)

Just like its predecessor TaskSpaces [3, 4], GridBASE employs the ‘bag-of-tasks’ paradigm for distributed computing [5] (Fig. 2), in which applications deposit tasks in a ‘task bag’, which fulfills the role of a time- and space-decoupled passive queueing system. Available workers pull
tasks from the task bag, process them, and place the results in a ‘result bag’, from which they are collected by the application. This mechanism offers decoupling in space, as components may be spread over geographically distributed machines, and decoupling in time, as tasks may wait in the task bag until workers become available. Note that there is no centralized component that actively pushes jobs onto the compute nodes or clusters; all activity originates from the independent, geographically distributed grid components. TaskSpaces is a lightweight, platform-independent realization of this concept [3, 4].

In the GridBASE framework presented in this paper, the task and result bags are replaced by a database component. This has two main advantages. First, the database provides a non-volatile, state-based storage mechanism, which makes the system much more stable and reliable. Second, the use of industry-strength databases makes it possible to leverage the tremendous developments in database technology of recent years, especially with respect to scalability, performance and fault tolerance.

In the simple prototype implementation of GridBASE presented in this paper, we took the database idea to its limit and decided to implement not only job management but also all other aspects of the grid computing system via the database, including code delivery and job input and output. While this approach may be too radical for some applications (for instance, applications requiring very large input or output files), it turns out to work very well for others. When files are large, it would be easy to provide alternative file serving mechanisms, for instance using HTTP servers.
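The bag-of-tasks paradigm of Fig. 2 can be illustrated with a minimal in-memory sketch. It is written in Python rather than the Java used by TaskSpaces and GridBASE, uses threads in one process as stand-ins for distributed workers, and all names (`run_bag_of_tasks`, `task_fn`, `square`) are hypothetical. It shows only the pull model: the application deposits tasks, and workers take them on their own initiative.

```python
import queue
import threading


def run_bag_of_tasks(inputs, task_fn, num_workers=3):
    """Process inputs with workers that pull from a shared task bag."""
    task_bag = queue.Queue()    # passive queue: tasks wait here until pulled
    result_bag = queue.Queue()  # workers deposit (rank, result) pairs here

    # The application deposits all tasks; it never contacts a worker directly.
    for rank, value in enumerate(inputs):
        task_bag.put((rank, value))

    def worker():
        # Each worker pulls tasks until the bag is empty.
        while True:
            try:
                rank, value = task_bag.get_nowait()
            except queue.Empty:
                return
            result_bag.put((rank, task_fn(value)))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # The application collects the results. Arrival order is not guaranteed,
    # so results are sorted by task rank before being returned.
    collected = []
    while not result_bag.empty():
        collected.append(result_bag.get())
    return [result for _, result in sorted(collected)]


def square(x):
    """Stand-in for a real task program."""
    return x * x
```

Calling `run_bag_of_tasks([1, 2, 3, 4], square)` returns the results in task order regardless of which worker processed which task. In a database-driven design the two queues would be replaced by persistent tables, so that tasks and results survive a restart of any component.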
In fact, the idea of having all data-related entities in the grid computing system pass through the database may not be that radical, given the fact that in the internet economy entire multi-billion dollar enterprises

The structure of this paper is as follows. In the next section, we give a brief overview of related work. In subsequent sections, the design, implementation and deployment of GridBASE are described. The paper closes with sections on future work and conclusions.
2 Related work
The field of grid computing [1] is enjoying increasing attention, in research projects all around the world [6], and in recent years also more and more in commercial and corporate settings [7]. Some of the goals of grid computing systems are also related to high throughput computing systems like Condor [8], and volunteer computing systems like BOINC [9]. Worker-driven grid computing systems where workers pull jobs from passive ‘bag-of-tasks’ superqueues have been considered in experimental grid computing systems [10],
including our TaskSpaces system [3, 4]. Most grid computing projects that target large-scale scientific computing, however, employ the orthogonal approach of trying to integrate legacy queueing systems using sophisticated resource managers and cluster schedulers that push tasks to suitable clusters. See for example the open source TORQUE resource manager and Maui cluster scheduler projects [11]. Databases are being used as central components of grid-like computing systems in the particular context of several volunteer computing projects, for instance BOINC [12]. An overview of many workflow solutions for grid computing that have been proposed is given in [13], and [14] and [15] contain online discussions on grid workflows.
3 GridBASE design
3.1 Target applications and design goals
GridBASE in its present form targets compute jobs that can be divided easily into many independent tasks (or sets of many independent jobs). A second assumption in the present implementation of GridBASE is that each task has small I/O requirements. GridBASE can thus also be described as a task-farming system. It aims at improving throughput for these types of applications, by giving users transparent access to heterogeneous computing networks. The design goal is a lightweight, platform-independent grid computing framework for task-farming applications that is easy to deploy and install.
3.2 System components and roles
The major GridBASE design concepts are reflected in the conceptual diagram of Fig. 1. The GridBASE system has five major types of components: a database, an operator, a broker, and one or more worker and application processes. Application processes submit jobs to the database. Each job consists of a number of tasks. Workers register with the database when they are available. The broker periodically queries the database and matches available tasks with available workers. After matching, the broker notifies the workers that have been assigned to tasks. Workers then download tasks from the database, execute them, and upon task completion place the results back into the database.

The conceptual role of ‘operator’ is responsible for starting and maintaining the workers, the broker, and the database. Maintenance operations may include adding new users of the grid system, adding new workers or worker clusters, and cleaning up residual information in the database when needed. During normal grid operation, the operator is idle.

An important observation is that the GridBASE design makes a clear distinction between the role played by grid users, who develop and submit application code but are otherwise mostly isolated from resource deployment and selection, and the conceptual role played by the grid operator, who is responsible for providing computing resources and assuring system availability and maintenance. Some of the duties of the conceptual ‘operator’ have been implemented in an operator console program (see below), but other tasks still have to be carried out manually by the physical operator in the present implementation of GridBASE.
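The task life cycle implied by this description (submitted, matched by the broker, executed by a worker, results stored) can be sketched as a small state machine. The status names are the taskStatus values listed in Section 4.2; the strictly linear order of transitions and the names `ALLOWED_TRANSITIONS` and `advance_task` are assumptions made for illustration, and a real system might also need transitions back to an earlier state for fault recovery.

```python
# Assumed transition table: each status maps to the statuses that may follow it.
ALLOWED_TRANSITIONS = {
    "unassigned": {"assigned"},        # broker matches the task with a worker
    "assigned": {"in execution"},      # worker downloads the task and starts it
    "in execution": {"completed"},     # worker stores the results
    "completed": set(),                # terminal state
}


def advance_task(current_status, new_status):
    """Return new_status if the transition is allowed, else raise ValueError."""
    if new_status not in ALLOWED_TRANSITIONS[current_status]:
        raise ValueError(
            f"illegal transition: {current_status!r} -> {new_status!r}"
        )
    return new_status
```

Keeping the allowed transitions in one table makes it easy to check every status update before it is written to the database.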
4 GridBASE implementation
4.1 System components
In our prototype implementation, the components of the conceptual diagram in Fig. 1 are implemented as follows. The operator, worker, broker and application components are four stand-alone Java programs. The use of Java allows for seamless platform independence and easy installation. We use the standard Oracle database, accessed from Java through the Java Database Connectivity (JDBC) API. JDBC is a standard SQL database access interface, so in principle the Oracle database can be replaced by any other SQL database. Files are stored within the database as binary large objects. All data transfers between system components occur via the database. System components interact with the database in a classic client-server way. As will be detailed further below, the database stores all information that is related to the operation of the computational grid. This includes user information, grid configuration information, job and task information, task code files and task commands, task input files, and task output files.

In our simple prototype implementation of the GridBASE design, process execution by tasks is command-line based. For example, C code can be uploaded to the database as task input, and task commands may then compile the code and execute it. This approach is, of course, rudimentary, and only as platform-independent as the heterogeneity of the worker systems allows. Alternatively, Java bytecode can be uploaded, which allows for more platform independence. This initial approach could be complemented by the scripting approaches used in other grid computing systems to provide more platform independence, but for our purpose of demonstration and testing it has proven sufficient.

The application component is a simple interactive program that allows users to define jobs by specifying tasks and associated input files and command lines. In addition to job specification, the application program allows the user to interactively submit jobs, monitor job progress, and retrieve job results from the database.
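Storing files in the database as binary large objects can be sketched as follows. This is an illustration only: it uses Python's built-in `sqlite3` module with an in-memory database instead of Oracle and JDBC, and the `fileName` and `content` columns and all function names are hypothetical (the paper names only the `inputfileID` attribute).

```python
import sqlite3


def open_file_store():
    """Create an in-memory database holding one table for input files."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inputfile ("
        " inputfileID INTEGER PRIMARY KEY,"
        " fileName TEXT,"     # assumed column
        " content BLOB)"      # assumed column: raw file bytes
    )
    return conn


def store_input_file(conn, name, data):
    """Insert the file bytes as a BLOB and return the generated inputfileID."""
    cursor = conn.execute(
        "INSERT INTO inputfile (fileName, content) VALUES (?, ?)",
        (name, data),
    )
    conn.commit()
    return cursor.lastrowid


def load_input_file(conn, inputfile_id):
    """Return (fileName, bytes) for a stored file, as a worker would fetch it."""
    row = conn.execute(
        "SELECT fileName, content FROM inputfile WHERE inputfileID = ?",
        (inputfile_id,),
    ).fetchone()
    return row[0], bytes(row[1])
```

An application client would call `store_input_file` when a job is defined, and a worker would call `load_input_file` after being assigned a task that references the file.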
user: userID, userName, userPassword
job: jobID, jobName
worker: workerID, workerCluster, workerIP, workerPort, workerStatus
task: taskID, taskRank, taskStatus
command: commandRank
inputfile: inputfileID
outputfile: outputfileID
Figure 3. Simplified Logical Data Structure for GridBASE.
The broker component queries the database at short intervals of, for instance, five seconds, and matches available tasks with available workers. Workers are notified by the broker when they are assigned to a task. The broker uses simple socket connections to notify a listening server that runs in a separate thread in each worker.

Upon startup, the workers register their availability with the database. Upon notification by the broker, they download the task code and input files for the task they are assigned to. The task commands are then executed on the command line. After completion of the task commands, the output files are compressed and stored in the database as a single output file per task.

The operator component is another interactive Java program that accesses the database for various tasks such as adding users, clearing residual information in the database, and resetting tables. In addition, the physical operator needs direct ssh access to the machines on which the workers, the database and the broker reside, in order to start, restart or shut down these components.
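The notification step can be sketched with plain TCP sockets. This is a minimal Python illustration, not the Java code of the prototype: the one-line message format `TASK <id>`, the function names, and the choice to handle a single notification and then stop are all assumptions.

```python
import socket
import threading


def start_listener(on_notify, host="127.0.0.1"):
    """Start a worker-side listener thread; return (port, thread).

    The listener accepts one connection, reads one line, and passes it
    to the on_notify callback.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))      # port 0 lets the operating system pick a port
    server.listen(1)
    port = server.getsockname()[1]

    def serve_once():
        conn, _ = server.accept()
        with conn, conn.makefile("r") as reader:
            message = reader.readline().strip()
        server.close()
        on_notify(message)

    thread = threading.Thread(target=serve_once, daemon=True)
    thread.start()
    return port, thread


def notify_worker(worker_ip, worker_port, task_id):
    """Broker side: tell the worker at (worker_ip, worker_port) about a task."""
    with socket.create_connection((worker_ip, worker_port), timeout=5) as sock:
        sock.sendall(f"TASK {task_id}\n".encode("ascii"))
```

In this sketch the worker would register the returned port in the database (the workerPort attribute) so that the broker knows where to connect. The notification carries only a task identifier; the task itself is still fetched from the database.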
4.2 Data model
Fig. 3 shows a simplified Logical Data Structure (LDS) for our prototype implementation of the GridBASE design. Our LDS diagram convention is based on [16]. All the data presented in the LDS is stored in the database. There are seven main data entities in our system. Each entity has one or more attributes. For example, every user has a userID, userName and userPassword, and is associated with one or more jobs. Every job is associated with one user, and is composed of one or more tasks. Similarly, workers can be associated with one or more tasks (all the tasks they execute over time), while every task may be associated with just one worker. WorkerStatus can be available or busy. WorkerIP and workerPort are used by the broker to notify workers when tasks are assigned to them. TaskStatus can be any of unassigned, assigned, in execution, or completed.

Every task has one or more task commands. The order of commands within a given task is determined by the commandRank attribute. This enables rudimentary workflow control. The taskRank attribute gives the rank of the task within the job, and is available on the command line so that a task program shared by all tasks within a job can act according to its rank. Every task can have multiple input files, and every input file can be associated with multiple tasks. This allows files to be used by several tasks while being stored in the database only once. Finally, for simplicity we assign to every task at most one outputfile (a compressed file that holds the actual task output files). Users only have access to the jobs they own, and do not have access to worker information. Some inessential details of the actual implementation have been omitted from this description for simplicity.

Figure 4. GridBASE deployment diagram. Rectangular boxes represent different machines. The thick solid lines represent connections to the database. The thin solid lines represent direct control interactions initiated by the operator component. The dashed lines represent notification of workers by their associated brokers.
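One possible relational rendering of the data model of Fig. 3 is sketched below, using Python's built-in `sqlite3` module in place of Oracle. Entity and attribute names follow Fig. 3 where the figure names them. The foreign keys, the `task_inputfile` link table, the `commandLine` and `content` columns, and the table name `grid_user` are assumptions inferred from the relationships described in the text, not the actual GridBASE schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE grid_user (          -- 'user' is a reserved word in many dialects
    userID INTEGER PRIMARY KEY, userName TEXT, userPassword TEXT);
CREATE TABLE job (                -- every job belongs to exactly one user
    jobID INTEGER PRIMARY KEY, jobName TEXT,
    userID INTEGER NOT NULL REFERENCES grid_user(userID));
CREATE TABLE worker (
    workerID INTEGER PRIMARY KEY, workerCluster TEXT,
    workerIP TEXT, workerPort INTEGER,
    workerStatus TEXT CHECK (workerStatus IN ('available', 'busy')));
CREATE TABLE task (               -- at most one worker per task
    taskID INTEGER PRIMARY KEY, taskRank INTEGER,
    taskStatus TEXT CHECK (taskStatus IN
        ('unassigned', 'assigned', 'in execution', 'completed')),
    jobID INTEGER NOT NULL REFERENCES job(jobID),
    workerID INTEGER REFERENCES worker(workerID));
CREATE TABLE command (            -- commandLine is an assumed column
    taskID INTEGER NOT NULL REFERENCES task(taskID),
    commandRank INTEGER NOT NULL, commandLine TEXT,
    PRIMARY KEY (taskID, commandRank));
CREATE TABLE inputfile (inputfileID INTEGER PRIMARY KEY, content BLOB);
CREATE TABLE task_inputfile (     -- assumed many-to-many link table
    taskID INTEGER REFERENCES task(taskID),
    inputfileID INTEGER REFERENCES inputfile(inputfileID),
    PRIMARY KEY (taskID, inputfileID));
CREATE TABLE outputfile (         -- UNIQUE: at most one output file per task
    outputfileID INTEGER PRIMARY KEY,
    taskID INTEGER UNIQUE REFERENCES task(taskID), content BLOB);
"""


def create_grid_database():
    """Return an in-memory database initialised with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def commands_in_order(conn, task_id):
    """Return a task's command lines sorted by commandRank."""
    rows = conn.execute(
        "SELECT commandLine FROM command WHERE taskID = ? ORDER BY commandRank",
        (task_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

The `ORDER BY commandRank` query is what gives the rudimentary workflow control: a worker executing the returned list front to back would, for example, compile a program before running it. The link table stores each input file once while letting several tasks refer to it.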
5 GridBASE deployment
Fig. 4 shows a typical deployment diagram for GridBASE, in which several workers on individual workstations are combined with workers on a cluster.
5.1 Access requirements
Deployment of the prototype implementation of GridBASE places several requirements on the accessibility of some of the system components, which may be limited in some environments by firewalls or other security mechanisms. All workers need to be able to contact the database machine, and need to be accessible by their associated broker. This may be difficult to achieve on certain intranets or clusters. In many cases, clusters or intranets do allow node-initiated connections to the outside world, but nodes may not be visible and reachable from the outside world. The prototype GridBASE implementation may still be used in this kind of environment by simply adding a broker within the security perimeter of the machine or network. The attribute workerCluster in Fig. 3 holds the name of the logical cluster on which the worker resides. Each such logical cluster then has its own broker. Brokers take turns in accessing the database, and each broker only assigns tasks to workers it can access. If node-initiated outside connections are not allowed, it may be necessary to place the database within the security perimeter. Thus, while security restrictions may obviously hamper GridBASE deployment on large numbers of geographically distributed machines across organizations, GridBASE can often be deployed easily within an organization, combining the power of several desktop workstations and medium-sized clusters, keeping track of job progress, and, for instance, removing the need for manual or script-based file transmissions.
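The per-cluster broker arrangement can be sketched as follows. Tasks and workers are represented here as plain Python dictionaries keyed by the attribute names of Fig. 3. The first-come pairing policy and the function names `broker_cycle` and `run_brokers_in_turn` are assumptions for illustration; the prototype performs the equivalent steps through database queries.

```python
def broker_cycle(cluster, tasks, workers):
    """One matching pass by the broker responsible for one logical cluster.

    Only available workers whose workerCluster equals `cluster` are used.
    Statuses are updated in place; the list of (taskID, workerID) pairs
    assigned in this pass is returned.
    """
    reachable = [
        w for w in workers
        if w["workerCluster"] == cluster and w["workerStatus"] == "available"
    ]
    waiting = [t for t in tasks if t["taskStatus"] == "unassigned"]

    assignments = []
    for task, worker in zip(waiting, reachable):  # assumed first-come pairing
        task["taskStatus"] = "assigned"
        worker["workerStatus"] = "busy"
        assignments.append((task["taskID"], worker["workerID"]))
    return assignments


def run_brokers_in_turn(clusters, tasks, workers):
    """Let the broker of each cluster take its turn on the shared task list."""
    return {c: broker_cycle(c, tasks, workers) for c in clusters}
```

Because each pass marks the tasks it hands out as assigned before the next broker takes its turn, no task is given to workers in two clusters.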
5.2 Test runs
We have deployed GridBASE using machines of various types, including Apple PCs, Linux PCs, and clusters from Ontario’s SHARCNET [17]. Test problems included Java applications, Python scripts, and C applications with compilation on the worker machine. As expected, due to its simple and uniform design, GridBASE performed reliably and efficiently. The reader can experiment with GridBASE for his or her own applications: the prototype implementation described in this paper can be downloaded from the GridBASE project webpage [2].
6 Usage scenarios for bioinformatics
GridBASE can be tailored to many bioinformatics application scenarios for high-throughput purposes. For example, GridBASE can readily be applied to:

Divisible workload problems
A typical example in this category arises when an application has to be run many times with different inputs, and the final result is obtained from the outputs corresponding to each input. Consider the following RNA folding problem as an example [18]. Given a pool of random RNA molecules of a specified length (typically 50-200 bases), what is the probability that the random pool contains molecules that have the right sequence and are folded into the right structure needed for a particular chemical function? The answer to this question can be obtained by running a folding program on many inputs, each of which contains a single random molecule sequence, and by calculating the overall probability from all the outputs. If we set out to investigate whether specific kinds of chemical function arise more often in pools with overall composition biases in particular directions, this requires the computational folding of many samples in A,C,G,U composition space. If we use 5% steps in composition space, leading to 969 different compositions to be tested, then varying the length of the random molecules (for example, 50, 100, and 150 nucleotides) further increases the number of foldings required, leading to simulations with a typical size of about a hundred million computational foldings. This constitutes a computational problem of moderately large size, which would require weeks to months on a single fast workstation. With GridBASE, however, all foldings can be parallelized easily on computational clusters, yielding much higher throughput.

Multi-tier web server hosting bioinformatics programs
Many web-based bioinformatics programs are provided by large institutions as web services, such as those for multiple sequence alignment and secondary structure prediction. Each user simply inputs his or her own data set and waits for the corresponding web-based program to finish. At the back end of the web servers hosting these programs, many instances of different programs run simultaneously, consuming huge resources, especially when thousands of users are running programs at the same time.
To alleviate the front-end server workload, so that the front end can offer higher throughput and each individual program instance can achieve higher performance, a large back-end cluster is usually employed to run all the requested program instances, leaving the front end responsible only for answering and dispatching users' computation requests. GridBASE fits naturally into this multi-tier server computation model by sitting between the front end and the back end, offering itself as a convenient and reliable resource management layer.
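Returning to the RNA folding example above, the figure of 969 compositions can be checked with a short calculation. The sketch below assumes a grid with 5% steps on which each of the four nucleotides has a fraction of at least 5%; this reading is an assumption, chosen because it reproduces the count quoted in the text. The function name is hypothetical.

```python
from itertools import product
from math import comb


def count_compositions(step_percent=5):
    """Count (A, C, G, U) percentage tuples on the grid that sum to 100.

    Every nucleotide is required to have at least one step (assumption).
    """
    levels = range(step_percent, 100, step_percent)
    return sum(
        1
        for a, c, g in product(levels, repeat=3)
        if 100 - a - c - g >= step_percent   # U takes the remainder
    )


n_compositions = count_compositions()         # brute-force enumeration
n_closed_form = comb(19, 3)                   # 20 steps split into 4 positive parts
n_combinations = n_compositions * 3           # three example lengths: 50, 100, 150
```

Under this assumption the enumeration and the closed form agree on 969 compositions, which with three molecule lengths gives 2907 composition and length combinations. If the roughly hundred million foldings were spread evenly over them, each combination would receive on the order of 34,000 foldings, and each folding is independent of the others, which is what makes the problem a natural fit for task farming.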
7 Future work
Integration of GridBASE with existing grid computing tools is needed with respect to Quality of Service (QoS) support, user accounting, fault tolerance, and general grid organization and security procedures. Platform independence can be increased by using existing approaches for harmonizing scripting and compilation specifications across platforms. Advanced policy-based brokering can be developed based on task requirements and machine properties. The database has to be controlled through an access layer, for both performance and security reasons. Application code has to be authenticated through a digital certificate mechanism. Fault tolerance has to be enhanced using database mechanisms, and fault recovery has to be automated.

It also seems a good idea to have users submit jobs from their own user database, in which they would store job descriptions and job results. Each user's personal project database would act as a centre for further result processing and for long-term storage of results. In the present implementation of GridBASE we chose to put the burden of setting up and maintaining an SQL database only on the shoulders of the grid operator, but it is clear that storing some types of results in a database may also have substantial advantages for end users.

GridBASE is currently deployed on SHARCNET [17] and on a departmental cluster of the David R. Cheriton School of Computer Science at the University of Waterloo. We will conduct more experiments using GridBASE, for example the RNA folding statistics experiment, and will report our experience in follow-up papers.

Acknowledgments
This work was made possible in part by the facilities of the Shared Hierarchical Academic Research Computing Network [17].
References
[1] Ian Foster, Carl Kesselman, and Steven Tuecke. The Anatomy of the Grid: Enabling Scalable Virtual Organizations. Lecture Notes in Computer Science, 2150, 2001.
[2] The GridBASE prototype implementation discussed in this chapter can be downloaded from www.math.uwaterloo.ca/~hdesterc/GridBASE.
[3] H. De Sterck, R. Markel, T. Pohl, and U. Rüde. A lightweight Java TaskSpaces framework for scientific computing on computational grids. In Proceedings of the ACM Symposium on Applied Computing, Track on Parallel and Distributed Systems and Networking, pages 1024–1030, 2003.
[4] Hans De Sterck, Rob Markel, and Rob Knight. TaskSpaces: A Software Framework for Parallel Bioinformatics on Computational Grids. In A. Zomaya, editor, Parallel Computing for Bioinformatics and Computational Biology, John Wiley and Sons, pages 651–669, 2006.
[5] G. R. Andrews. Foundations of Multithreaded, Parallel, and Distributed Programming. Addison Wesley, Boston, 2000.
[6] International scientific projects related to grid workflows as described on www.gridworkflow.org. http://www.gridworkflow.org/snips/gridworkflow/space/Projects.
[7] Grid Computing Resources. Xgrid. www.apple.com/server/macosx/features/xgrid.html. Sun Grid Engine. www.gridengine.sunsource.net. Globus Alliance. www.globus.org. Cluster Resources Inc. www.clusterresources.com.
[8] T. Tannenbaum, D. Wright, K. Miller, and M. Livny. Condor - A Distributed Job Scheduler. In Beowulf Cluster Computing with Linux. MIT Press, 2002. Condor project homepage. http://www.cs.wisc.edu/condor.
[9] David P. Anderson. BOINC: A System for Public-Resource Computing and Storage. 5th IEEE/ACM International Workshop on Grid Computing, 2004.
[10] M. Noble and S. Slateva. Scientific computation with JavaSpaces. Technical report, Harvard-Smithsonian Center for Astrophysics, Boston University, Boston, 2001.
[11] TORQUE Resource Manager and Maui Cluster Scheduler. http://www.clusterresources.com.
[12] David P. Anderson, Eric Korpela, and Rom Walton. High-Performance Task Distribution for Volunteer Computing. First IEEE International Conference on e-Science and Grid Technologies, 2005.
[13] Jia Yu and Rajkumar Buyya. A Taxonomy of Workflow Management Systems for Grid Computing. SIGMOD Record 34(3): 44–49, 2005.
[14] Grid Workflow Forum. http://www.gridworkflow.org.
[15] Scientific Workflows Survey. http://www.extreme.indiana.edu/swf-survey.
[16] J. Carliss and J. Maguire. Mastering Data Modeling. Addison-Wesley, Boston, 2001.
[17] Shared Hierarchical Academic Research Computing Network, Ontario, Canada. http://www.sharcnet.ca.
[18] R. Knight, H. De Sterck, R. S. Markel, S. Smit, A. Oshmyansky, and M. Yarus. Abundance of correctly folded RNA motifs in sequence space, calculated on computational grids. Nucleic Acids Research 33, 5924–5935, 2005.