
2006 International Workshop on Grid Computing Environments (GCE06), in Conjunction with SC06

Chair: Gregor von Laszewski, [email protected]
http://sc06.supercomputing.org/schedule/event_detail.php?evid=9233

Table of Contents

Grid Portal Development for Sensing Data Retrieval and Processing - Diego Arias, Mariana Mendoza, Fernando Cintron, Kennie Cruz, Wilson Rivera
Grid Portals for Bioinformatics - Lavanya Ramakrishnan, Mark S.C. Reed, Jeffrey L. Tilson, Daniel A. Reed
Science Gateways on the TeraGrid - Charlie Catlett, Sebastien Goasguen, Jim Marsteller, Stuart Martin, Don Middleton, Kevin Price, Anurag Shankar, Von Welch, Nancy Wilkins-Diehr
CoaxSim Grid: Building an Application Portal for a CFD Model - Byoung-Do Kim, Nam-gyu Kim, Jung-hyun Cho, Eun-kyung Kim, Joshi Fullop
Secure Federated Light-weight Web Portals for FusionGrid - D. Aswath, M. Thompson, M. Goode, X. Lee, N. Y. Kim
Portal-based Support for Mental Health Research - David Paul, Frans Henskens, Patrick Johnston, Michael Hannaford
WebGRelC: Towards Ubiquitous Grid Data Management Services - Giovanni Aloisio, Massimo Cafaro, Sandro Fiore, Maria Mirto
Workflow Management Through Cobalt - Gregor von Laszewski, Christopher Grubbs, Matthew Bone, David Angulo
Workflow-level Parameter Study Management in Multi-grid Environments by the P-GRADE Grid Portal - Peter Kacsuk, Zoltan Farkas, Gergely Sipos, Adrian Toth, Gabor Hermann
The Java CoG Kit Experiment Manager - Gregor von Laszewski, Phillip Zinny, Tan Trieu, David Angulo
A Deep Look at Web Services for Remote Portlets (WSRP) and WSRP4J - Xiaobo Yang, Rob Allen
Connected in a Small World: Rapid Integration of Heterogenous Biology Resources - Umut Topkara, Carol X. Song, Jungha Woo, Sang P. Park
My WorkSphere: Integrative Work Environment for Grid-unaware Biomedical Researchers and Applications - Zhaohui Ding, Yuan Luo, Xiaohui Wei, Chris Misleh, Wilfred W. Li, Peter W. Arzberger, Osamu Tatebe
A PERMIS-based Authorization Solution between Portlets and Back-end Web Services - Hao Yin, Sofia Brenes Barahona, Donald F. McMullen, Marlon Pierce, Kianosh Huffman, Geoffrey Fox
Kickstarting Remote Applications - Jens S. Vockler, Gaurang Mehta, Yong Zhao, Ewa Deelman, Mike Wilde
Real-time Storm Surge Ensemble Modeling in a Grid Environment - Lavanya Ramakrishnan, Brian O. Blanton, Howard M. Lander, Richard A. Luettich, Jr., Daniel A. Reed, Steven R. Thorpe
TeraGrid User Portal v1.0: Architecture, Design, and Technologies - Maytal Dahan, Eric Roberts, Jay Boisseau
Extending Grid Protocols onto the Desktop using the Mozilla Framework - Karan Bhatia, Brent Stearn, Michela Taufer, Richard Zamudio, Daniel Catarino
Mehmet A. Nacar, Marlon Pierce, Gordon Erlebacher, Geoffrey C. Fox

Abstract: This workshop will focus on projects and technologies that are adopting scientific portals and gateways. These technologies are characterized by delivering well-established mechanisms for providing familiar interfaces to secure Grid resources, services, applications, tools, and collaboration services for communities of scientists. In most cases access is enabled through a web browser without the need to download or install any specialized software or worry about networks and ports. As a result, the science application user is isolated from the complex details and infrastructure needed to operate an application on the Grid. Additional information about this workshop is available at http://www.cogkit.org/GCE06


Grid Portal Development for Sensing Data Retrieval and Processing Diego Arias, Mariana Mendoza, Fernando Cintron, Kennie Cruz, and Wilson Rivera Parallel and Distributed Computing Laboratory University of Puerto Rico at Mayaguez P.O.Box 9042, Mayaguez, Puerto Rico 00681, USA

Abstract

This paper presents our experiences developing grid portals for radar and sensor based applications. Underlying these gateways are existing grid technologies such as the Globus Toolkit 4.0.1 and Gridsphere. The grid portals provide secure and transparent access to applications dealing with data acquired from a network of radars and sensors deployed in Puerto Rico, while implementing useful functionalities for data management and analysis.

1 Introduction

Grid computing [1] involves coordination, storage and networking of resources across dynamic and geographically dispersed organizations in a way that is transparent to users. The Open Grid Services Architecture (OGSA) [2], based upon standard Internet protocols such as SOAP (Simple Object Access Protocol) and WSDL (Web Services Description Language), is becoming a standard platform for grid services development. Operational grids based on these technologies are feasible now, and a large number of grid prototypes are already in place (e.g. the Grid Physics Network (GriPhyN) and TeraGrid, among many others). Although applications can be built using basic grid services, this low-level activity requires detailed knowledge of protocols and component interactions. In contrast, grid portals hide this complexity via easy-to-use interfaces, creating gateways to computing resources. An effective grid portal provides tools for user authentication and authorization, application deployment, configuration and execution, and management of distributed data sets.

The Open Grid Computing Environments (OGCE) portal software is the most widely used toolkit for building reusable portal components that can be integrated in a common portal container system. The OGCE portal toolkit includes X.509 Grid security services, remote file and job management, information and collaboration services, and application interfaces. The OGCE portal toolkit is based on the notion of a "portlet," a portal server component that controls a user-configurable panel. A portal server supports a set of web browser frames, each containing one or more portlets that provide a service. This portlet component model allows one to construct portals merely by instantiating a portal server with a domain-specific set of portlets, complemented by domain-independent portlets for collaboration and discussion. Using the toolkit, one wraps each grid service with a portlet interface, creating a "mix and match" palette of portlets for portal creation and customization. Recently, there have been significant advances in grid portal technologies and in the development of scientific grid interfaces [3, 4]. This paper presents our experiences developing grid portals for radar and sensor based applications. The organization of the paper is as follows. Section 2 briefly discusses the grid testbed infrastructure deployed at the University of Puerto Rico to investigate issues related to grid computing. Sections 3 and 4 describe the applications and the grid portal developments. Section 5 discusses related work.
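The portlet model described above maps directly onto the JSR 168 portlet API that containers such as Gridsphere and the OGCE portal support. The sketch below is a minimal, hypothetical grid-service portlet; the class name and the rendered content are illustrative only and are not taken from the OGCE codebase.

```java
import java.io.IOException;
import java.io.PrintWriter;

import javax.portlet.GenericPortlet;
import javax.portlet.PortletException;
import javax.portlet.RenderRequest;
import javax.portlet.RenderResponse;

/**
 * Minimal JSR 168 portlet sketch: one user-configurable panel that a
 * portal container (e.g. Gridsphere) can place on a page alongside
 * other domain-specific or collaboration portlets.
 */
public class GridServiceStatusPortlet extends GenericPortlet {

    @Override
    protected void doView(RenderRequest request, RenderResponse response)
            throws PortletException, IOException {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();

        // A real grid portlet would call a backing grid service here
        // (job submission, file transfer, ...) and render its results.
        String user = request.getRemoteUser();
        out.println("<p>Grid service panel for "
                + (user == null ? "guest" : user) + "</p>");
    }
}
```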

2 Grid Test-bed Infrastructure

The PDCLab Grid Testbed, deployed at the University of Puerto Rico-Mayaguez, is an experimental grid designed to address research issues such as the effective integration of sensor and radar networks into grid infrastructures. The PDCLab grid test-bed components run CentOS 4.2 and the Globus Toolkit 4.0.1. The Globus Toolkit includes, among other components, services such as a security infrastructure (GSI), a data transport service (GridFTP), execution services (GRAM), and information services (MDS). The Grid Security Infrastructure is used by the Globus Toolkit for authentication and secure communication. GSI is implemented using public key encryption, X.509 certificates, and the Secure Sockets Layer (SSL) communication protocol, and incorporates single sign-on and delegation.

The Monitoring and Discovery Service (MDS) is used to discover, publish and access both static and dynamic information from the different resources in a computational grid. MDS uses the Lightweight Directory Access Protocol (LDAP) to access such information on the different grid components and provides a unified view of the disparate grid resources. The Globus Resource Allocation Manager (GRAM) is used for allocation and management of resources on the computational grid, using a Resource Specification Language (RSL) to request resources. GRAM also updates the MDS with information about the availability of grid resources. The GRAM API can be used to submit a job, query the status of a job, and cancel a job. A GRAM service runs on each resource that is part of the grid and is responsible for interfacing with the local site resource management system (e.g. OpenPBS, Condor). GridFTP is a secure, high-performance and robust data transfer mechanism used to access remote data. In addition to GridFTP, Globus provides the Globus Replica Catalog to maintain a catalog of dataset replicas so that, instead of duplicating large datasets, only the necessary pieces of the datasets are stored on local hosts. The Globus Replica Management software provides replica management capabilities for data grids by integrating the replica catalog and GridFTP.

The computational resources available on the grid testbed (see Figure 1) include:

- An IBM xSeries Linux cluster with 64 nodes, dual-processor at 1.2 GHz, 53GB of memory and 1TB of storage;
- Eight (8) IA-64 Itanium servers, dual-processor at 900 MHz, each with 8GB of memory and 140GB of SCSI Ultra 320 storage;
- Two (2) IA-32 Pentium IV servers, dual-processor at 3 GHz, each with 1GB of memory and 120GB of ATA-100 storage;
- One (1) IA-32 Pentium III server, dual-processor at 1.2 GHz, with 2GB of memory and 140GB of SCSI Ultra 160 storage;
- One (1) IA-32 Xeon server, dual-processor at 2.8 GHz, 1MB L2 cache, with 1GB of memory and one 230GB RAID storage array (STB Server); and
- Two (2) PowerVault storage arrays with 8TB.

Figure 1: Grid-service based Infrastructure
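Because the pre-web-service MDS exposes resource information over LDAP, it can be queried with any LDAP client. The sketch below uses standard JNDI; the host name, port and base DN are assumptions (2135 and "mds-vo-name=local, o=grid" were common GRIS defaults) and would need to match the actual testbed configuration.

```java
import java.util.Hashtable;

import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

/** Query a (pre-WS) MDS information server over LDAP using plain JNDI. */
public class MdsQuery {
    public static void main(String[] args) throws Exception {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        // Assumed GRIS endpoint; replace with the testbed's actual MDS host and port.
        env.put(Context.PROVIDER_URL, "ldap://grid.example.edu:2135");

        DirContext ctx = new InitialDirContext(env);
        SearchControls controls = new SearchControls();
        controls.setSearchScope(SearchControls.SUBTREE_SCOPE);

        // Assumed default base DN for a local GRIS.
        NamingEnumeration<SearchResult> results =
                ctx.search("mds-vo-name=local, o=grid", "(objectclass=*)", controls);
        while (results.hasMore()) {
            SearchResult entry = results.next();
            System.out.println(entry.getNameInNamespace());
        }
        ctx.close();
    }
}
```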

3 The Student Test-Bed (STB) Grid Portal

The CASA1 project is an NSF Engineering Research Center investigating the design and implementation of a dense network of low-power meteorological radars whose goal is to collaboratively and adaptively sense the lowest few kilometers of the earth's atmosphere. We have deployed a grid-service based tool to access and manipulate radar data from a radar network. Access to this infrastructure is provided via a grid portal interface. The developed grid portlets provide a presentation layer for manipulating both processed data and raw-data from the radars, and for exposing services to end-users. Additionally, the visualization of weather information is also implemented via portlets.

1 http://casa.umass.edu/

The portal presentation layer and the core portlets included in the basic installation are provided by Gridsphere. Figure 2 shows the customized STB portal. Gridsphere provides portlets for managing user accounts inside the portal framework. This set of portlets is integrated into the STB portal design to control access to certain resources and services, as explained further on.

Figure 3: Data management portlet

Figure 2: STB Grid portal interface

Users can access raw-data from the radars through the grid portal. Files containing the raw-data are stored using the NetCDF2 format. The data management portlets allow end-users to download the data in such a way that they obtain an exact copy of a file or a set of files. To avoid server overload, raw data requests are restricted to registered users only. Raw-data does not provide comprehensible information; it requires additional tools for extraction and processing. As a result, this feature is designed for advanced users (students, teachers and researchers) who have the adequate software and prior knowledge of radar data. These users can request an account from the portal administrator.

Figure 3 shows the data management portlets. Once the user has logged into the portal, the raw data request portlets become available. The initial portlet shows a selection form that permits choosing the date of interest; all available data for that date is then listed. The selected data set can be downloaded as a compressed file. The grid portal also provides current rainfall estimates over the western area of Puerto Rico through reflectivity displays. This information is unrestricted and available to anyone who accesses the portal. Figure 4 shows how the base reflectivity information corresponding to a sweep is plotted over the Puerto Rico west coast area.

Figure 4: Base Reflectivity portlet.

2 http://www.unidata.ucar.edu/software/netcdf/

Figure 5: Base Reflectivity animation portlet

Figure 5 shows a portlet used to display a set of base reflectivity sweeps over the Puerto Rico west coast area. This portlet animates the data set and includes loop controls and zooming. The base reflectivity loop is useful for tracking meteorological phenomena. NetCDF data is written as binary files; it cannot be read by users as plain text, and specialized software is required for its interpretation. There are several libraries, plugins, programs and a variety of tools to manipulate NetCDF files. However, installation, configuration and usage of these tools can be very complex for inexperienced users. Additionally, due to the flexibility of the format, the structure of the files varies depending on the implementation procedure. To perform a specific task, one or more software tools are needed. For example, there is no available software to generate reflectivity plots from the radar raw-data, so a Java class was developed using more basic classes and libraries for NetCDF manipulation. Additionally, a similar class was developed to convert NetCDF to ASCII. To facilitate the manipulation of raw-data from DCAS network nodes, two very useful services were implemented. These services allow end-users to execute processes over the raw-data available in the storage system. Thus, users can upload their data sets from a local machine to the server and process them. The available processes are:

NCtoJPG: Rainfall rate plots are available in the grid portal, but older plots are not kept in safe storage. Using the grid portal, users can send out-of-date data sets to the grid and then receive the corresponding reflectivity plots.

NCtoASCII: Through the grid portal, users can convert NetCDF files to text files. This tool eliminates the need for extra software for data manipulation (a minimal conversion sketch appears after this list).
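The NCtoASCII class itself is not reproduced in the paper, so the following is only a rough sketch of the idea, written against the Unidata NetCDF-Java (ucar.nc2) API: open a file, walk its variables and dump their values as text. The output layout is illustrative.

```java
import java.io.PrintWriter;
import java.util.Arrays;

import ucar.ma2.Array;
import ucar.ma2.IndexIterator;
import ucar.nc2.NetcdfFile;
import ucar.nc2.Variable;

/** Rough NetCDF-to-ASCII sketch using the Unidata NetCDF-Java (ucar.nc2) library. */
public class NcToAscii {
    public static void main(String[] args) throws Exception {
        NetcdfFile nc = NetcdfFile.open(args[0]);        // input radar sweep file
        PrintWriter out = new PrintWriter(args[1]);      // plain-text output file
        try {
            for (Variable v : nc.getVariables()) {
                out.println("# " + v.getFullName() + " shape="
                        + Arrays.toString(v.getShape()));
                Array data = v.read();                   // read whole variable into memory
                IndexIterator it = data.getIndexIterator();
                while (it.hasNext()) {                   // dump values one per token
                    out.print(it.getObjectNext());
                    out.print(' ');
                }
                out.println();
            }
        } finally {
            out.close();
            nc.close();
        }
    }
}
```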

Services for end-users involve executing a process over a single file or a set of files. For instance, a set of NetCDF files can be uploaded with an NCtoJPG request. The data is processed and the output files are made available for downloading through the grid portal. This entire procedure is transparent to users. The server could process each file and then return the output files; however, the server may also be receiving data from the radar network or serving other user requests at the same time. To avoid a crash due to an overload of simultaneous tasks, remote job execution is introduced. The server can submit a simple job or a multi-job to the grid testbed instead of performing only local jobs. Job submission is supported by Globus through GRAM. Additionally, PBS (Portable Batch System) is used as the job scheduler. Figure 6 shows the job submission functionality.

Figure 6: Job submission architecture

As shown in Figure 7, an important observation when submitting multiple jobs is that CPU consumption on the local server is very high (around 97%) when a job is executed locally, and close to 1% when the job executes on the STB grid infrastructure.
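The submission code used by the STB server is not shown in the paper; the sketch below only illustrates the general pattern of describing such a job with a pre-WS GRAM RSL string and handing it to a small helper interface. The RSL attributes, paths and the GridJobSubmitter interface are assumptions for illustration; in practice the submission would go through a GRAM client such as the Java CoG Kit or the globusrun command, with PBS as the local scheduler behind the gatekeeper.

```java
/**
 * Illustrative sketch of offloading an NCtoJPG conversion to the grid
 * testbed through GRAM. The RSL string and the submitter interface are
 * hypothetical stand-ins for the actual GRAM client code.
 */
public class NcToJpgJob {

    /** Hypothetical wrapper around a GRAM client (e.g. the Java CoG Kit). */
    interface GridJobSubmitter {
        String submit(String gatekeeperContact, String rsl) throws Exception;
    }

    static String buildRsl(String inputFile) {
        // Classic pre-WS RSL; attribute values are illustrative and should be
        // checked against the local gatekeeper/PBS configuration.
        return "&(executable=/opt/stb/bin/nctojpg)"
             + "(arguments=" + inputFile + ")"
             + "(count=1)"
             + "(jobtype=single)"
             + "(stdout=nctojpg.out)(stderr=nctojpg.err)";
    }

    public static void main(String[] args) throws Exception {
        String input = args.length > 0 ? args[0] : "radar_sweep.nc";

        // Dummy submitter so the sketch runs standalone; a real one would call GRAM.
        GridJobSubmitter submitter = (contact, rsl) -> {
            System.out.println("submit to " + contact + ":\n" + rsl);
            return "job-0001"; // placeholder job handle
        };

        String handle = submitter.submit("headnode.example.edu/jobmanager-pbs",
                                         buildRsl(input));
        System.out.println("submitted, handle = " + handle);
    }
}
```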

Figure 7: Percentage of CPU usage (NetCDF to JPG) for jobs run on the local server versus the STB grid

4 The WALSAIP Grid Portal

The NSF WALSAIP3 project is developing a new conceptual framework for the automated processing of information arriving from physical sensors in a generalized wide-area, large-scale distributed network infrastructure. The project focuses on water-related ecological and environmental applications, and it addresses issues such as scalability, modularity, signal representation, data coherence, data integration, distributed query processing, scheduling, computer performance, network performance, and usability. A distributed sensor network testbed is being developed at Puerto Rico's Jobos Bay National Estuarine Research Reserve (JBNERR)4. The reserve covers more than 2800 acres on the southern coast of Puerto Rico, between the municipalities of Guayama and Salinas. It is administered by the National Oceanic and Atmospheric Administration and managed locally by the Department of Natural and Environmental Resources.

One of the components of this project is a grid-based tool to define workflow compositions of signal processing operators as an application service. This tool allows the composition of operators that may be geographically distributed and provided by diverse administrative domains. Again, underlying this tool are existing grid technologies such as Globus Toolkit 4.0 and Gridsphere. The design of the methodology for composing distributed signal operators follows two major requirements. First, it is desirable to optimize resource management according to the complexity of the operators to be processed. Second, the composition of distributed resources requires metadata distribution and management mechanisms.

3 http://walsaip.uprm.edu/
4 http://nerrs.noaa.gov/JobosBay/

The Grid Portal Interface provides transparent and secure access to end-users. This portal allows end-users to define signal processing workflows using drag-and-drop functionality. GridFTP is used to improve data transport from the data server (WALSAIP Server) to the grid portal server (PDC Server). Signal processing operators are deployed as grid services. These grid services may be geographically distributed and provided by different administrative domains. Figure 8 depicts the components of the application and Figure 9 illustrates the grid portal interface.
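The operator interface itself is not given in the paper, so the following is a hypothetical sketch of how distributed signal-processing operators could be modelled and chained into a workflow. The names are invented for illustration; in the WALSAIP tool each operator would actually be a remote grid service invocation rather than a local lambda.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of composing distributed signal-processing operators. */
public class OperatorPipeline {

    /** One operator; in the real system this would wrap a grid service call. */
    interface SignalOperator {
        double[] apply(double[] signal);
    }

    private final List<SignalOperator> stages;

    OperatorPipeline(SignalOperator... stages) {
        this.stages = Arrays.asList(stages);
    }

    double[] run(double[] signal) {
        double[] current = signal;
        for (SignalOperator op : stages) {
            current = op.apply(current);  // each stage may execute in a different domain
        }
        return current;
    }

    public static void main(String[] args) {
        SignalOperator detrend = s -> {
            double mean = Arrays.stream(s).average().orElse(0.0);
            return Arrays.stream(s).map(x -> x - mean).toArray();
        };
        SignalOperator rectify = s -> Arrays.stream(s).map(Math::abs).toArray();

        OperatorPipeline pipeline = new OperatorPipeline(detrend, rectify);
        System.out.println(Arrays.toString(pipeline.run(new double[]{1.0, 2.0, 3.0})));
    }
}
```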

Figure 8: Signal processing application services over a grid environment

Figure 9: WALSAIP grid portal

5 Related Work

The Linked Environments for Atmospheric Discovery (LEAD) project [5] proposes an information technology framework for assimilating, forecasting, managing, analyzing, mining and visualizing a broad array of meteorological data and model output independent of format and physical location. LEAD is currently led by nine institutions. The LEAD system is dynamically adaptable in terms of time, space, forecasting and processing. The LEAD infrastructure includes technologies and tools such as the Globus toolkit, Unidata's Local Data Manager (LDM), the Open-source Project for a Network Data Access Protocol (OPeNDAP) and the OGSA Data Access and Integration (OGSA-DAI) service. The LEAD portal is based on OGCE.

Majithia et al. [6] proposed Triana, a framework that allows users to graphically create complex service compositions based on BPEL4WS (Business Process Execution Language for Web Services). It also allows users to easily carry out "what-if" analyses by altering existing workflows. Using this framework, it is possible to execute the composed service graph on a Grid network. Gao et al. [7] developed a service composition architecture that optimizes the aggregate bandwidth utilization within operator networks. A general service composition model is proposed to capture the loosely coupled interaction among service components as well as the estimated traffic that flows among them. Glatard et al. [8] discussed how to build complex applications by reusing and assembling scientific codes on a production grid infrastructure. The authors identify two paradigms for executing application code on a grid: a task-based approach, associated with global computing and characterized by its efficiency, and a service-based approach, developed in the metacomputing and Internet communities and characterized by its flexibility.

References

1. I. Foster and C. Kesselman (1998), "The Grid: Blueprint for a Future Computing Infrastructure," Morgan Kaufmann Publishers.
2. I. Foster, C. Kesselman, J. Nick, and S. Tuecke (2002), "The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration," Technical report, Open Grid Service Infrastructure WG, Global Grid Forum.
3. D. Gannon, G. Fox, M. Pierce, B. Plale, G. von Laszewski, C. Severance, J. Hardin, J. Alameda, M. Thomas, J. Boisseau, "Grid Portals: A Scientist's Access Point for Grid Services," GGF Community Practice document, working draft 1, September 2003.
4. G. von Laszewski, J. Gawor, S. Krishnan, and K. Jackson, "Commodity Grid Kits - Middleware for Building Grid Computing Environments," in Grid Computing: Making the Global Infrastructure a Reality, pp. 639-656, Wiley, 2003.
5. K. K. Droegemeier, D. Gannon, D. Reed, B. Plale, J. Alameda, T. Baltzer, K. Brewster, R. Clark, B. Domenico, S. Graves, E. Joseph, D. Murray, R. Ramachandran, M. Ramamurthy, L. Ramakrishnan, J. A. Rushing, D. Weber, R. Wilhelmson, A. Wilson, M. Xue, and S. Yalda, "Service-Oriented Environments for Dynamically Interacting with Mesoscale Weather," Computing in Science & Engineering, 7(6):12-29, Nov.-Dec. 2005.
6. S. Majithia, M. Shields, I. Taylor, I. Wang, "Triana: A Graphical Web Service Composition and Execution Toolkit," IEEE International Conference on Web Services (ICWS 2004), San Diego, California, USA, 2004.
7. X. Gao, R. Jain, Z. Ramzan, U. Kozat, "Resource Optimization for Web Service Composition," Proceedings of IEEE SCC 2005, 2005.
8. T. Glatard, J. Montagnat, X. Pennec, "Efficient Services Composition for Grid-enabled Data-intensive Applications," Proceedings of the IEEE International Symposium on High Performance Distributed Computing (HPDC'06), Paris, France, 2006.

Real-time Storm Surge Ensemble Modeling in a Grid Environment

Lavanya Ramakrishnan1, Brian O. Blanton4, Howard M. Lander1, Richard A. Luettich, Jr.3, Daniel A. Reed1, Steven R. Thorpe2
1 Renaissance Computing Institute, 2 MCNC, 3 UNC Chapel Hill Institute of Marine Sciences, 4 Science Applications International Corporation
{lavanya, howard, dan_reed}@renci.org, [email protected], [email protected], [email protected]

Abstract

Natural disasters such as hurricanes heavily impact the US East and Gulf coasts. This creates the need for large-scale modeling in the areas of meteorology and ocean sciences, coupled with an integrated environment for analysis and information dissemination. In turn, this means there is an increased need for large-scale distributed high performance resources and data environments. In this paper, we describe a framework that allows a storm surge model, ADCIRC, to be run in a distributed Grid environment. This framework was developed as a component of the Southeastern Universities Research Association's (SURA) Southeastern Coastal Ocean Observing and Prediction (SCOOP) program. SCOOP is creating an open-access grid environment for the southeastern coastal zone to help integrate regional coastal observing and modeling systems. Specifically, this paper describes a set of techniques used for resource selection and fault tolerance in a highly variable ad-hoc Grid environment. The framework integrates domain-specific tools and standard Grid and portal tools to provide an integrated environment for forecasting and information dissemination.

1. Introduction

Year after year, the US East and Gulf coasts are heavily impacted by hurricane activity causing a large number of deaths and billions of dollars in economic losses. For example, in 2005 there were 14 hurricanes, exceeding the record of 12 in 1969, out of which 7 were considered major hurricanes [9]. To help reduce the impact of hurricanes, there is a need for an integrated response system that enables virtual communities [1] to evaluate, plan and react to such natural phenomena. The integrated system needs to handle real-time data feeds, schedule and execute a set of model runs, manage the model input and output data, and make results and status available to the larger audience. In addition, to enhance the scientific validity of the models there is a need to be able to recreate scenarios and re-run the models for retrospective analysis[19]. The large-scale modeling and analysis has driven the use of high performance resources and Grid environments for such problems.

In this paper, we describe the distributed software infrastructure used to run a storm surge model in a Grid environment. The sensitivity to timely model completion drives the need for specific techniques for resource management and increased fault tolerance when the models run in a distributed Grid environment. This framework was developed as a component of the Southeastern Universities Research Association's (SURA) Southeastern Coastal Ocean Observing and Prediction (SCOOP) program[20]. The SCOOP program is a distributed project that includes the Gulf of Maine Ocean Observing System, Bedford Institute of Oceanography, Louisiana State University, Texas A&M, University of Miami, University of Alabama in Huntsville, University of North Carolina, University of Florida and Virginia Institute of Marine Science. SCOOP is creating an open-access grid environment for the southeastern coastal zone to help integrate regional coastal observing and modeling systems. Specifically, our effort in this program is focused on two main areas: 1) storm surge modeling for the southeast coast; and 2) experimenting with novel techniques to use grid resources to meet the real-time constraints of the application. The storm surge component uses the Advanced Circulation (ADCIRC)[12] model that computes tidal and storm surge water levels and currents, forced by tides and winds. While our framework was developed in the context of ADCIRC, the solution is more general and is applicable for running other models and applications in grid environments. In fact, the framework is currently being applied to other models in the context of the North Carolina Forecasting System[22].

Our solution builds on existing standard grid and portal technologies, including the Globus toolkit [2] and the Open Grid Computing Environment (OGCE)[4], and lessons learned from grid computing efforts in other science domains, such as bioinformatics[21], astronomy[5] and other projects.

A portal provides the front-end interface for users to interact with the ocean observing and modeling system. Users can conduct retrospective analyses, access historical data from previous model runs and observe the status of daily forecast runs from the portal. The real-time data for the ensemble forecast arrives through Unidata's Local Data Manager (LDM)[15], an event-driven data distribution system that selects, captures, manages and distributes meteorological data products. Once all the data for a given ensemble member has been received, available and suitable grid resources are discovered using a simple resource selection algorithm. The model run is then executed and the output data is staged back to the originating site. The final ensemble result of the surge computations is inserted back into the SCOOP LDM stream for subsequent analysis and visualization by other SCOOP partners [18].

In this paper, we describe the interaction of the Grid components and the specific techniques used for resource selection and fault tolerance during model execution. The rest of the paper is organized as follows. In §2 the science drivers are described in greater detail. We describe our design philosophy in §3. The architecture and technology components are presented in §4 and §5, experiences from our system and related work in §6 and §7, and we present our conclusions and future work in §8.

2. Science Drivers

Before we detail our design and techniques, we present a brief description of the science elements that motivate our decisions. As mentioned earlier, for the storm-surge forecasts we use the tidal and storm-surge model ADCIRC[12]. ADCIRC is a finite element model that solves the shallow-water generalized wave-continuity equations for a thin fluid layer on a rotating platform. The ADCIRC model is parallelized using the Message Passing Interface (MPI). In the current implementation, we use a relatively coarse representation of the western North Atlantic Ocean. Figure 1 shows a 32-processing-element decomposition of this ADCIRC grid.

Figure 1. Domain decomposition of a high-resolution ADCIRC grid used in the SCOOP computational system.

Storm surge modeling requires assembling input meteorological and other data sets, running models, processing the output and distributing the resulting information. In terms of modes of operation, most meteorological and ocean models can be run in "hindcast" mode, after the fact of a major storm or hurricane, for post-analysis or risk assessment, or in "forecast" mode for prediction to guide evacuation or operational decisions[19]. The forecast mode is driven by real-time data streams while the hindcast mode is initiated by a user. Our framework is designed to support both these usage models for running ADCIRC and other models in a Grid environment.

Further, it is often necessary to run the ADCIRC model with different forcing conditions to analyze forecast accuracy. This results in a large number of parallel model runs, creating an ensemble of forecasts. The meteorological modeling community has long recognized that a consensus forecast, based on an ensemble of forecasts, generally has better statistical forecast skill than any one of the ensemble members[14, 11]. Thus, we have taken an ensemble approach to storm-surge forecasting that requires access to a large number of computational clusters, coordinated access to data and computational resources, and the ability to leverage additional resources that may become available over time.

Our operational cycle is tied to the typical 6-hour synoptic forecast cycle used by the National Weather Service and the National Centers for Environmental Prediction (NCEP). NCEP computes an atmospheric analysis and forecast four times per day, for which the forecast initialization times are 00Z, 06Z, 12Z, and 18Z. As ADCIRC solves discrete versions of partial differential equations, both initial and boundary conditions are required for each simulation. Boundary conditions include the wind stress on the ocean surface (an ensemble member, described below) and tidal elevations. The initial conditions for each simulation are taken from a previously computed "hindcast" that is designed to keep the dynamic model up to date with respect to the analyzed atmospheric model state. This is called hot-starting the model. For each synoptic cycle, a hot-start file is computed that brings the model state forward in time from the beginning of the previous cycle to the start of the current forecast cycle (Figure 2).

Figure 2. Timeline showing the computation of a hotstart file and a subsequent forecast. On Day K, the hotstart computed "yesterday" (Day K-1) is used to bring the hotstart sequence up to date, and an 84-hour forecast is subsequently computed. This same hotstart file is used "tomorrow" (Day K+1) to start the sequence over again.

The wind field boundary conditions for each simulation are taken from a variety of sources, each of which constitutes one member of the ensemble. In addition to the atmospheric model forecasts provided by NCEP, the SCOOP project also uses tropical storm forecast tracks from the National Hurricane Center to synthesize "analytic" wind fields. Each forecast track is statistically perturbed and an analytic vortex model[13] is used to compute the wind and pressure fields for each track. In the SCOOP project, this service is provided by the University of Florida and the wind files arrive through LDM. We are currently investigating the skill of this ensemble approach, and results will appear in a separate communication.
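As a concrete illustration of the hot-start bookkeeping, the small sketch below maps a synoptic forecast cycle (00Z, 06Z, 12Z or 18Z) to the previous cycle whose hotstart file initializes it. The file-naming convention is an assumption for illustration; the actual SCOOP conventions are not given in the paper.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

/** Maps a 6-hour synoptic forecast cycle to the hotstart cycle that precedes it. */
public class HotstartCycle {

    private static final DateTimeFormatter TAG =
            DateTimeFormatter.ofPattern("yyyyMMddHH'Z'");

    /** The hotstart that initializes a forecast is computed for the previous 6-hour cycle. */
    static ZonedDateTime hotstartCycleFor(ZonedDateTime forecastCycle) {
        return forecastCycle.minusHours(6);
    }

    /** Hypothetical naming convention for the hotstart file of a given cycle. */
    static String hotstartFileName(ZonedDateTime cycle) {
        return "adcirc_hotstart_" + TAG.format(cycle) + ".dat";
    }

    public static void main(String[] args) {
        ZonedDateTime cycle = ZonedDateTime.of(2006, 9, 1, 12, 0, 0, 0, ZoneOffset.UTC);
        System.out.println("forecast cycle : " + TAG.format(cycle));
        System.out.println("initialized by : "
                + hotstartFileName(hotstartCycleFor(cycle)));
    }
}
```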

3. Design Philosophy

The need for timely access to high performance resources for the large suite of ensemble runs makes it important to have a distributed, fault-tolerant Grid environment for these model runs. Based on earlier experience in storm surge modeling and the lessons learned from other interdisciplinary Grid efforts, we identified a set of higher-level design principles that guided the architecture and implementation of the system.

Scalable real-time system: As discussed earlier, ensemble modeling can increase forecast accuracy. Running multiple high-resolution, large-scale simulations requires a scalable and distributed real-time system. Thus, our system is based on Grid technologies and standards, allowing us to leverage access to ad-hoc resources that may become available.

Extensible: While this effort has been largely focused on the SCOOP ADCIRC model, our goal is to build a modular architecture able to support other applications and add additional resources as they become available.

Adaptable: The criticality and timeliness aspects of the science and the variability of grid environments require the infrastructure to be adaptable at various levels. The infrastructure needs active monitoring and adaptation components that can react to these changes and ensure successful completion of the models using fault tolerance and failure recovery techniques.

Based on these underlying design principles, we are focused on building a framework that can be used for real-time storm surge ensemble modeling on the Grid, triggered by the arrival of wind data. The required timeliness of the model runs makes it important to address the following issues on the Grid: a) real-time discovery of available resources; b) managing the model run on an ad-hoc set of resources; and c) continuous monitoring and adaptation to allow the system to be resilient to the variability in Grid environments.

4. Data and Control Flow of the NC SCOOP System

The ADCIRC storm surge model can be run in two modes. The "forecast" mode is triggered by the real-time arrival of wind data from different sites through the Local Data Manager[15]. In the "hindcast" mode, the modeler can either use a portal or a shell interface to launch the jobs to investigate prior data sets (post-hurricane).

Figure 3 shows the architectural components and the control flow for the NC SCOOP system:

1. In the forecast run the wind data arrives at the local data manager (Step 1.F in Figure 3). In our current setup, the system receives wind files from the University of Florida and Texas A&M. Alternatively, a scientist might log into the portal and choose the date and the corresponding data to re-run a model (Step 1.H in Figure 3).
2. In the hindcast run, the application coordinator locates relevant files using the SCOOP catalog at UAH[23] and retrieves them from the SCOOP archives located at TAMU and LSU[17]. In the forecast runs, once the wind data arrives, the application coordinator checks to see if the hotstart files are available locally or at the remote archive. If they are not available and not currently being generated (through a model run), a run is launched to generate the corresponding hotstart files to initialize the model for the current forecast cycle.
3. Once the model is ready to run (i.e. all the data is available), the application coordinator uses the resource selection component to select the best resource for this model run.
4. The resource selection component queries the status at each site and ranks the resources, accounting for queue delays and network connectivity between the resources.
5. The application coordinator then calls an application-specific component that prepares an application package that can be shipped to remote resources. The application package is customized with specific properties for the application on a particular resource and includes the binary, the input files and other initialization files required for the model run.
6. The self-extracting application package is transferred to the remote resource and the job is launched using standard grid mechanisms.
7. Once the application coordinator receives the "job finished" status message, it retrieves the output files from the remote sites.
8. In the hindcast mode, the results are then made available through the portal (Step 8.H in Figure 3). Additionally, in the forecast mode, we push the data back through LDM (Step 8.F in Figure 3). The data is then archived and visualized by other SCOOP partners downstream.
9. The application coordinator publishes status messages at each of the above steps to a centralized messaging broker. Interested components such as the portal can subscribe to relevant messages to receive real-time status notification of the job run.
10. In addition, the resource status information is collected across all the sites and can be observed through the portal as well as used for more sophisticated resource selection algorithms.

Figure 3. The control flow through the various components of the architecture
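To make the ten-step control flow above concrete, here is a compact, hypothetical sketch of the Application Coordinator's main sequence. The component interfaces are invented for illustration; the real coordinator uses GRAM, GridFTP, MyProxy and WS-Messenger as described in the next section.

```java
/**
 * Hypothetical outline of the Application Coordinator sequence described
 * in steps 1-10 above: gather inputs, pick a resource, stage and run the
 * packaged model, retrieve output and publish status along the way.
 */
public class ApplicationCoordinator {

    interface ResourceSelector { String selectBestSite(String modelId); }
    interface Packager        { String buildPackage(String modelId, String site); }
    interface GridClient {
        void stageIn(String site, String packagePath);
        void runJob(String site, String packagePath);   // blocks until "job finished"
        void stageOut(String site, String outputDir);
    }
    interface StatusPublisher { void publish(String event); }

    private final ResourceSelector selector;
    private final Packager packager;
    private final GridClient grid;
    private final StatusPublisher status;

    ApplicationCoordinator(ResourceSelector s, Packager p, GridClient g, StatusPublisher st) {
        this.selector = s; this.packager = p; this.grid = g; this.status = st;
    }

    /** Triggered either by LDM data arrival (forecast) or by the portal (hindcast). */
    void runEnsembleMember(String modelId, String outputDir) {
        status.publish("input data arrived: " + modelId);

        String site = selector.selectBestSite(modelId);        // steps 3-4
        String bundle = packager.buildPackage(modelId, site);  // step 5

        grid.stageIn(site, bundle);                            // step 6
        status.publish("task started: " + modelId + " on " + site);
        grid.runJob(site, bundle);                             // steps 6-7
        grid.stageOut(site, outputDir);                        // step 7

        status.publish("task finished: " + modelId);           // steps 8-9
    }
}
```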

5. Technology Components

We have described the flow through the control system and identified the key components of the architecture. In this section we discuss in greater detail the design issues, technology choices and implementation of the architecture components. As noted earlier, our architecture is based on existing open source grid middleware and web services tools such as Globus[2], the Open Grid Computing Environment (OGCE)[4] and WS-Messenger[10]. We describe each of the components in detail below.

5.1. Data Management

The data transport system in SCOOP is based on Unidata's Local Data Manager (LDM). LDM allows us to select, capture, manage, and distribute arbitrary data products over a networked set of computers. LDM is designed for event-driven data distribution where a client may ingest data. In addition, an LDM server can communicate with other LDM servers to either receive or send data. LDM is flexible and allows for site-specific configuration and processing actions on the data. The ADCIRC model receives its upstream wind and meteorological data through LDM, and the model results are sent downstream to other SCOOP partners through LDM for archiving and visualization. LDM allows us to associate triggers with arriving data that can be used for launching automated model runs. In the long term we anticipate that there might be multiple ways that the data might arrive; in this case, the model runs may need to be triggered by a higher-level component.

We also use GridFTP to manage data movement during model execution. In addition, we use the SCOOP catalog[23] to locate data files that may have been generated previously. If available, the files are retrieved from the SCOOP archives[17]. The two types of files retrieved from the archive are the hotstart files to initialize the model run and the netCDF wind files. The wind files arrive through LDM for the forecast runs but may need to be retrieved from the archive for the hindcast runs. In addition, this gives us the ability to use the wind files from the archive to reduce data movement costs during forecast model execution.

5.2. Grid Middleware

In the last few years, there has been increased deployment of Grid technologies on commodity clusters. These clusters are used to run scientific applications and are shared across different organizations, forming large interdisciplinary virtual communities. For our system we assume a minimal software stack composed of existing grid technologies and protocols to manage jobs and files, namely Grid Resource Allocation and Management (GRAM)[2] and GridFTP[6] from the Globus toolkit. Additionally, the Globus Monitoring and Discovery System (MDS)[28] and the Network Weather Service (NWS)[27] configured at a site are used to make a more informed resource selection. During the resource selection process, each of the sites is queried for its queue status and the bandwidth to each site. The resource selection process is described in greater detail in §5.5.

Once a resource is selected, a credential to be used at this site is obtained from a MyProxy[3] server. MyProxy is a credential management service that stores Globus X.509 certificates. MyProxy allows users to store their certificates and private keys in the repository, making them accessible from different distributed resources. MyProxy issues a short-lifetime certificate to the system that can then be used to authenticate to the remote system.

5.3. Application Coordinator

The Application Coordinator acts as a central component for each of the model runs, whether initiated by the user through the portal or triggered by the arrival of data through LDM. It uses the resource selection component to select a grid site. After the user proxy is obtained, the Application Coordinator is able to perform Grid operations on behalf of the user (in the case of the hindcast) or a preconfigured user (for the forecast). The application manager invokes a specified script to generate a self-extracting package of the application for the particular remote site. This self-extracting package is transferred to the remote site using GridFTP. Once the file is transferred, the job is submitted to the Globus gatekeeper using the GRAM protocol. The GRAM protocol also allows users to poll for the status of the jobs or associate listeners that get invoked when the job status changes. Additionally, when the job completes, we use GridFTP to retrieve the compressed set of output files.

The Application Coordinator has been designed to take configuration parameters about the application, its requirements and environment. This module supports running ADCIRC with different grids for different geographical regions and configurations. More recently, the module is being customized to be used with different meteorological models.

5.4. Application Preparation

In this work, we assume that the need for urgent computing may necessitate situations that result in ad-hoc, quick social arrangements to make resources available during a major storm or weather event. This has implications on how much and what we can expect a site to have installed and/or preconfigured. It is possible that the binaries may not be installed on the target resource. Once a resource for a particular ensemble member is selected, we need to create the application package that will be needed for that particular resource. We create a self-extracting archive file using an open source product called makeself. The self-extracting archive file contains everything that is needed for a model run and is the only file that is transferred to the selected grid resource. While in this particular work the module contains the binary as well, it is possible to use this approach for applications which might be preinstalled at sites. The specific steps involved in creating this bundle (sketched in code below) include:

- Running a program that converts the netCDF version of the input wind file to a version compatible with the ADCIRC model.
- Selecting the correct set of ADCIRC executables for the given resource architecture and model run.
- Identifying specific arguments that are required at the remote end when the bundle is extracted, e.g. the actual MPI command for ADCIRC might vary slightly on different resources.
- Finally, creating the compressed file containing the binary and all the input data.

As described previously in §2, these model runs are usually hotstarted with the previous day's model results. The Application Preparation module checks a "correspondence" description file to identify the type of hotstart file required for a particular wind type and grid. It checks to see if this file has been generated previously and is available either locally or remotely in the archives. If the file does not already exist, it checks to see if another process is currently running that might generate it. If the file is being generated by another process it waits for that process to complete; otherwise a process is launched to generate the hotstart file.
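The bundle-creation steps listed above can be summarized in code. The sketch below is hypothetical: the helper methods stand in for the wind-file conversion, executable selection and makeself invocation, and the paths, file names and architecture layout are illustrative only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/**
 * Hypothetical sketch of the application-preparation step: assemble a
 * staging directory with the converted wind file, the right executables
 * and run arguments, then wrap it into one self-extracting archive
 * (the real system uses the makeself tool for the last step).
 */
public class ApplicationPackager {

    Path prepareBundle(Path windFileNc, String targetArch, String mpiCommand)
            throws IOException, InterruptedException {
        Path staging = Files.createTempDirectory("adcirc-bundle");

        convertWindFile(windFileNc, staging.resolve("wind.input"));  // netCDF -> model input
        copyExecutables(targetArch, staging);                        // per-architecture binaries
        Files.write(staging.resolve("run.conf"),                     // remote-side arguments
                ("MPI_COMMAND=" + mpiCommand + "\n").getBytes());

        return makeSelfExtracting(staging);                          // compressed bundle
    }

    // The three helpers below are placeholders for the real conversion,
    // selection and makeself steps; they are not part of any published API.
    void convertWindFile(Path in, Path out) throws IOException {
        Files.copy(in, out);                                         // stand-in for the converter
    }

    void copyExecutables(String arch, Path staging) throws IOException {
        Path binary = Paths.get("/opt/adcirc", arch, "padcirc");     // assumed install layout
        Files.copy(binary, staging.resolve("padcirc"));
    }

    Path makeSelfExtracting(Path staging) throws IOException, InterruptedException {
        Path bundle = staging.resolveSibling(staging.getFileName() + ".run");
        // makeself <dir> <file> <label> <startup script>; check against the local install.
        new ProcessBuilder("makeself.sh", staging.toString(), bundle.toString(),
                "ADCIRC run", "./run_adcirc.sh").inheritIO().start().waitFor();
        return bundle;
    }
}
```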

5.5. Resource Selection

The Grid sites vary greatly in performance and availability. Even with pre-established arrangements for exclusive access, resources and/or services may be down or unavailable. Hence, given the criticality of the model run completion, we choose to use a dynamic resource selection algorithm to select an appropriate site for the job submission.

During the resource selection process, each of the sites is queried for its queue status and bandwidth. Globus MDS[28] is an information service that aggregates information about resources and services that are available at a site. The Network Weather Service (NWS)[27] is a sensor-based distributed system that periodically monitors and dynamically forecasts performance measurements such as CPU and bandwidth. We have developed a simple plug-in based resource-ranking library. While currently we use only real-time information, the library is flexible in allowing us to collect historical information to make better and more accurate predictions. The question we try to answer in our resource selection is "Where should I run this job right now?" The library is built on top of the Java CoG Kit[24] and uses the standard libraries for querying resources. The framework is completely extensible and can easily accommodate more sophisticated algorithms in the future.

The resource selection first searches a list of remote resources to confirm availability in terms of appropriate authentication and authorization access to the resource and to ascertain that the basic Globus services such as GridFTP and GRAM are running. All remote resources meeting the above requirements are then ranked according to real-time information including queue status and bandwidth. This allows us to balance the implications of data movement costs against computational running time. Based on the queue and the bandwidth, a total time estimate for each resource is calculated to rank the resources. The algorithm takes approximate running times for the model and the data sizes as input to perform this calculation.
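A minimal version of the ranking calculation described above might look like the following. The cost model (transfer time from bandwidth, plus queue wait, plus an estimated run time) follows the text, but the class and field names are invented and the real library also checks authentication and service availability first.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

/** Minimal sketch of the "where should I run this job right now?" ranking. */
public class ResourceRanker {

    /** Real-time measurements for one candidate site (from MDS/NWS queries). */
    static class SiteStatus {
        final String contact;
        final double queueWaitSeconds;      // estimated batch-queue delay
        final double bandwidthMBps;         // measured bandwidth to the site

        SiteStatus(String contact, double queueWaitSeconds, double bandwidthMBps) {
            this.contact = contact;
            this.queueWaitSeconds = queueWaitSeconds;
            this.bandwidthMBps = bandwidthMBps;
        }
    }

    /** Total-time estimate: data movement + queue wait + approximate run time. */
    static double estimateSeconds(SiteStatus s, double dataSizeMB, double runTimeSeconds) {
        double transfer = dataSizeMB / Math.max(s.bandwidthMBps, 0.01);
        return transfer + s.queueWaitSeconds + runTimeSeconds;
    }

    /** Ranks candidate sites; the best site is the first element of the result. */
    static List<SiteStatus> rank(List<SiteStatus> sites, double dataSizeMB, double runTime) {
        return sites.stream()
                .sorted(Comparator.comparingDouble(
                        (SiteStatus s) -> estimateSeconds(s, dataSizeMB, runTime)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<SiteStatus> sites = Arrays.asList(
                new SiteStatus("siteA.example.edu", 600.0, 10.0),
                new SiteStatus("siteB.example.edu", 60.0, 2.0));
        // 500 MB of input data, ~30 minutes of model run time.
        System.out.println("best site: " + rank(sites, 500.0, 1800.0).get(0).contact);
    }
}
```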

5.6. Portal

In addition to timely execution of the model, it is important to be able to share the data with the community at large while shielding the consumers of the information from the complexity of the underlying system. We use an Open Grid Computing Environment (OGCE) based portal interface to make available the status of the runs and the output files from the daily forecast runs. Figure 4 shows the status of the model runs and Figure 5 shows the results available from the portal. A color marker shows the current state of the run (i.e. "data arrived", "running", etc.). In addition, end users can use the portal to launch hindcast model execution (Figure 6) in a grid computing environment using the files from the SCOOP archives.

Figure 4. Job status and resource status from the portal
Figure 5. Job history and result files from the portal
Figure 6. Hindcast mode from the portal

5.7. Fault Tolerance and Recovery

We apply a number of techniques to diagnose and repair errors that might occur during run-time, using a two-phase approach in the ADCIRC Application Manager. The first phase uses retries in the event of a failure or a timeout, and the step is retried a specified number of times. If the retries do not resolve the failure, a "persistent" error has occurred. The execution of the application coordinator has distinct phases (move files, run job, etc.), and persistent errors may occur in one of these labeled phases. A persistent error causes the decoder to retry beginning at an appropriate earlier phase. In addition, certain kinds of persistent errors, such as a failure to successfully transfer a file to a selected resource, cause that resource to be omitted from consideration during the resource selection phase of the retry. This error handling allows the complete execution of model runs under many different adverse circumstances, taking advantage of the inherent redundancy in a grid-enabled environment. The application manager can easily detect errors and take appropriate rectification action. However, sometimes errors might occur at the model level, producing garbled data, or a process might run longer than expected and not produce the output. In future implementations, we anticipate that we will need additional error checking to detect these scenarios and decrease the probabilities of failures.
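The two-phase retry logic can be summarized as in the sketch below. The phase names and retry limits are illustrative, and the real Application Manager also blacklists a resource after persistent transfer failures, as described above.

```java
/** Illustrative two-phase retry: per-step retries, then restart from an earlier phase. */
public class RetryPolicy {

    interface Step { void run() throws Exception; }

    /** Phase 1: retry a single step a bounded number of times. */
    static void runWithRetries(String phase, Step step, int maxRetries) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                step.run();
                return;                            // success, no persistent error
            } catch (Exception e) {
                last = e;                          // transient failure, try again
            }
        }
        throw new Exception("persistent error in phase " + phase, last);
    }

    /** Phase 2: on a persistent error, restart from an earlier phase (e.g. reselect resource). */
    static void execute(Step selectResource, Step moveFiles, Step runJob) throws Exception {
        int restarts = 0;
        while (true) {
            try {
                runWithRetries("select-resource", selectResource, 2);
                runWithRetries("move-files", moveFiles, 3);
                runWithRetries("run-job", runJob, 1);
                return;
            } catch (Exception persistent) {
                if (++restarts > 3) throw persistent;   // give up after a few full restarts
                // A real implementation would also drop the failing site from the pool here.
            }
        }
    }
}
```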

5.8. Monitoring and Notification

A central component of our design is proactive monitoring of the status of the application and data. This monitoring system is based on standard tools and techniques such as the Network Weather Service[27] and instrumentation points at various points of the data flow. The key to managing a distributed adaptation framework is a standard messaging interface. Our messaging interface is based on the workflow tracking tools and eventing system (WS-Messenger)[10] being built as part of another NSF ITR project, LEAD (Linked Environments for Atmospheric Discovery)[26]. Every component in our system publishes status information such as "input data arrived", "task started", "task finished", etc. This status information is available through the portal interface (Figure 5). In addition, the resource monitoring portlet reads a web service we created that serves CPU availability and network bandwidth data. The data itself is currently collected using MDS, NWS and the LEAD eventing system and then stored in a MySQL database.
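Status publication is the glue between the coordinator, the portal and the monitoring portlets. The sketch below shows the idea with a hypothetical in-process publisher; the real system publishes these events through WS-Messenger and stores the resource measurements in MySQL, as described above.

```java
import java.time.Instant;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical publish/subscribe sketch for run-status events (stand-in for WS-Messenger). */
public class StatusEventBus {

    /** One status event, e.g. "input data arrived", "task started", "task finished". */
    static class StatusEvent {
        final String runId;
        final String message;
        final Instant timestamp;

        StatusEvent(String runId, String message, Instant timestamp) {
            this.runId = runId;
            this.message = message;
            this.timestamp = timestamp;
        }
    }

    private final List<Consumer<StatusEvent>> subscribers = new CopyOnWriteArrayList<>();

    /** Interested components (e.g. the portal's status portlet) register a callback. */
    void subscribe(Consumer<StatusEvent> subscriber) {
        subscribers.add(subscriber);
    }

    /** Components publish status at each step; all subscribers are notified. */
    void publish(String runId, String message) {
        StatusEvent event = new StatusEvent(runId, message, Instant.now());
        for (Consumer<StatusEvent> s : subscribers) {
            s.accept(event);
        }
    }

    public static void main(String[] args) {
        StatusEventBus bus = new StatusEventBus();
        bus.subscribe(e -> System.out.println(e.timestamp + " [" + e.runId + "] " + e.message));
        bus.publish("ens-member-07", "input data arrived");
        bus.publish("ens-member-07", "task started");
        bus.publish("ens-member-07", "task finished");
    }
}
```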

6. Deployment Experiences

Various components of the framework have been tested and deployed in the context of hurricane storm surge over the past two years. In this section we briefly describe our evaluation and experiences.

6.1. Resource Pool Management

The following SCOOP partner and SURAGrid sites have been tested and added to the resource pool for ADCIRC: local resources at the Renaissance Computing Institute (RENCI), Texas A&M University (TAMU), University of Florida (UFL), University of Alabama in Huntsville (UAH), and University of Louisiana at Lafayette (ULL). Each of the sites runs basic Globus grid services such as the Gatekeeper for job submission, GridFTP for file transfer, an information service and the Network Weather Service. Our current infrastructure is based on the pre-web-service protocol stack available in Globus versions 2.x through 4.x. It is important that the basic Globus services are configured correctly at all sites that might be used for the model runs. We have a test suite that is used to verify that the basic services are running and configured correctly at all sites. The test suite verifies the access rights, firewall, configuration of the Globus services and the batch scheduler that might be configured at the site. The sites are tested periodically to verify correct operation; the test suite helps detect and diagnose errors more proactively.

To easily add resources to the pool, we use configuration properties. This allows us to add other resources to the pool without any programmatic changes. The properties include the addresses for the Globus services, firewall port information and security credentials that can be used for a resource.

6.2. Application Coordinator

The application coordinator is configured using a property file, allowing easy addition of model configuration parameters. An application can use the framework by supplying application-specific properties and scripts for creating the packaging. As mentioned, the framework is being applied to the North Carolina Forecasting System to run ADCIRC with different grids and other meteorological models. Our early experiences showed the need for higher resilience and fault tolerance in the Application Coordinator to recover from various errors that might occur during execution. This was then built into the more recent version of the Coordinator. We are currently planning on wrapping these capabilities as web services, allowing for more widespread use in the Grid framework and workflow tools. Our resource selection algorithm is simplistic, but more generally the framework we have developed allows us to easily integrate other, more sophisticated algorithms that are being researched in the Grid community.
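A very small piece of such a site test suite is sketched below: a TCP-level reachability check against the standard pre-WS Globus ports (2119 for the GRAM gatekeeper, 2811 for GridFTP). This is only connectivity probing; the actual suite also exercises authentication, the batch scheduler and real job submission, and the host name here is a placeholder.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Minimal connectivity probe for the pre-WS Globus services at a grid site. */
public class SiteProbe {

    static boolean reachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;                       // firewalled, down or misconfigured
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "gridnode.example.edu";
        // Default ports for the pre-WS stack; sites may remap them in their firewall policy.
        System.out.println("GRAM gatekeeper (2119): " + reachable(host, 2119, 3000));
        System.out.println("GridFTP (2811): " + reachable(host, 2811, 3000));
    }
}
```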

7. Related Work

Grid computing has been increasingly used to run scientific applications from different domains including earthquake engineering, bioinformatics, astronomy, meteorology, etc. Our framework specifically addresses the need for increased reliability, fault tolerance and recovery in the context of time-sensitive applications such as storm surge prediction. Grid scheduling and adaptation techniques based on evaluating system and application performance have been used to make scheduling and/or rescheduling decisions, and heuristic techniques are often used to qualitatively select and map tasks to available resource pools[25]. Our resource selection algorithm is fairly simplistic and only considers queue status and bandwidth measurements to make a decision. While simplistic, it works effectively in our current resource environment. The API has been designed to be flexible to allow easy addition of other, more sophisticated algorithms in the future.

8. Conclusions and Future Work

This framework provides a solid foundation on which to build a highly reliable Grid environment for applications that might be time sensitive and/or critical. An enhancement to the computational system currently being developed is the selection of the ADCIRC model grid based on the predicted storm landfall location. We envision a suite of ADCIRC domains with the same basic open-ocean detail, but with different grids resolving different parts of the coastal region and supporting flooding and surge inundation.

Grid and portal standards have been a moving target for a few years now. Our software stack for this work was guided by the state of the art at the time of the project's inception. More recently, implementations of the portlet standard (JSR 168) and grid standards (WSRF) have stabilized, and we will transition to support Globus 4.0 web services and OGCE-2 for our portlets. Our experiences with building and deploying the framework emphasize the need for increased fault tolerance and recovery techniques to be implemented in real Grid environments. We are investigating standardized web services interfaces that will allow applications to be easily run in a Grid environment with capabilities such as resource selection and fault tolerance. In addition, user-friendly modules that allow scientists to specify the properties needed by the Application Coordinator are being investigated. Data collected from the operation of the framework during the hurricane season will drive further evolution of the framework.

9. Acknowledgements

This study was carried out as a component of the "SURA Coastal Ocean Observing and Prediction (SCOOP) Program", an initiative of the Southeastern Universities Research Association (SURA). Funding support for SCOOP has been provided by the Office of Naval Research, Award N00014-04-1-0721, and by the National Oceanic and Atmospheric Administration's NOAA Ocean Service, Award NA04NOS4730254. We would also like to thank the various SCOOP partners for discussion on the use cases: Philip Bogden (SURA and GoMOOS); Will Perrie, Bash Toulany (BIO); Charlton Purvis, Eric Bridger (GoMOOS); Greg Stone, Gabrielle Allen, Jon MacLaren, Bret Estrada, Chirag Dekate (LSU, Center for Computation and Technology); Gerald Creager, Larry Flournoy, Wei Zhao, Donna Cote and Matt Howard (TAMU); Sara Graves, Helen Conover, Ken Keiser, Matt Smith, and Marilyn Drewry (UAH); Peter Sheng, Justin Davis, Renato Figueiredo, and Vladimir Paramygin (UFL); Harry Wang, Jian Shen and David Forrest (VIMS); Hans Graber, Neil Williams and Geoff Samuels (UMiami); and Mary Fran Yafchak, Don Riley, Don Wright and Joanne Bintz (SURA). We would like to thank various SCOOP and SURAGrid partners for making resources available, and special thanks to Steven Johnson (TAMU), Renato J. Figueiredo (UFL), Michael McEniry (UAH), Ian Chang-Yen (ULL), and Brad Viviano (RENCI) for providing valuable system administrator support.

10. References

1. I. Foster, C. Kesselman and S. Tuecke, "The Anatomy of the Grid: Enabling Scalable Virtual Organizations," International Journal of Supercomputer Applications, 15(3), 2001.
2. I. Foster and C. Kesselman, "Globus: A Metacomputing Infrastructure Toolkit," International Journal of Supercomputer Applications, 11(2):115-128, 1997.
3. J. Novotny, S. Tuecke and V. Welch, "An Online Credential Repository for the Grid: MyProxy," Proceedings of the Tenth International Symposium on High Performance Distributed Computing (HPDC-10), August 2001.
4. Open Grid Computing Environment, http://www.collab-ogce.org/nmi/index.jsp
5. M. Russell, G. Allen, I. Foster, E. Seidel, J. Novotny, J. Shalf, G. von Laszewski and G. Daues, "The Astrophysics Simulation Collaboratory: A Science Portal Enabling Community Software Development," Proceedings of the Tenth International Symposium on High Performance Distributed Computing (HPDC-10), pp. 207-215, 2001.
6. W. Allcock, J. Bester, J. Bresnahan, A. L. Chervenak, I. Foster, C. Kesselman, S. Meder, V. Nefedova, D. Quesnal and S. Tuecke, "Data Management and Transfer in High Performance Computational Grid Environments," Parallel Computing, 28(5), pp. 749-771, May 2002.
7. I. Foster, C. Kesselman, G. Tsudik and S. Tuecke, "A Security Architecture for Computational Grids," Fifth ACM Conference on Computer and Communications Security, pp. 83-92, 1998.
8. L. Pearlman, C. Kesselman, S. Gullapalli, B. F. Spencer Jr., J. Futrelle, K. Ricker, I. Foster, P. Hubbard and C. Severance, "Distributed Hybrid Earthquake Engineering Experiments: Experiences with a Ground Shaking Grid Application," NEESGrid Technical Report-2004-42, 2004.
9. Climate of 2005: Atlantic Hurricane Season, http://www.ncdc.noaa.gov/oa/climate/research/2005/hurricanes05.html, 2006.
10. Y. Huang, A. Slominski, C. Herath, and D. Gannon, "WS-Messenger: A Web Services-based Messaging System for Service-Oriented Grid Computing," 6th IEEE International Symposium on Cluster Computing and the Grid (CCGrid06).

9

11. E. Kalnay, “Atmospheric Modeling, Data Assimilation and Predictability,” Cambridge University Press, 2003. 12. R.A. Luettich, J. J. Westerink, and N. W. Scheffner, ADCIRC: An advanced threedimensional circulation model for shelves, coasts and estuaries; Report 1: theory and methodology of ADCIRC- 2DDI and ADCIRC-3DL, Technical Report DRP-92-6, Coastal Engineering Research Center, U.S. Army Engineer Waterways Experiment Station, Vicksburg, MS, 1992. 13. G. Holland. An Analytic Model of the Wind and Pressure Profiles in Hurricanes. Monthly Weather Review, Vol. 108, No. 8, pp. 1212–1218, 1980. 14. J. Sivillo, J. Ahlquist, and Z. Toth. An Ensemble Forecasting Primer, Weather and Forecasting, Vol. 12, pp. 809-818, 1997. 15. Unidata Local Data Manager. http://www.unidata.ucar.edu/software/ldm/, 2006 16. P. Bogden, et al, The Southeastern University Research Association Coastal Ocean Observing and Prediction Program: Integrating Marine Science and Information Technology," Proceedings of the OCEANS 2005 MTS/IEEE Conference. Sept. 18-23, 2005, Washington, D.C. 17. D. Huang, G. Allen, C. Dekate, H. Kaiser, Z. Lei and J. MacLaren "getdata: A Grid Enabled Data Client for Coastal Modeling," Published in HPC06. 18. P. Bogden et al., "The SURA Coastal Ocean Observing and Prediction Program (SCOOP) Service-Oriented Architecture," Proceedings of MTS/IEEE 06 Conference in Boston, September 1821, 2006 Boston, MA, Session 3.4 on Ocean Observing Systems. 19. J. Bintz et al., "SCOOP: Enabling a Network of Ocean Observations for Mitigating Coastal Hazards," Proceedings of the Coastal Society 20th International Conference, May 14-17, 2006; St. Pete Beach, FL. 20. SCOOP Website http://scoop.sura.org/, 2006. 21. D. A. Reed, et al., "Building the Bioscience Gateway," Global Grid Forum Technical Paper, June 2005. 22. North Carolina Forecasting System. http://www.renci.org/projects/indexdr.php 23. S. Graves, K. Keiser, H. Conover, M. Smith. “Enabling Coastal Research and Management with Advanced Information

Technology,” 17th Federation Assembly Virtual Poster Session, July 2006. 24. G. von Laszewski, I. Foster, J. Gawor, and P. Lane, "A Java Commodity Grid Kit," Concurrency and Computation: Practice and Experience, vol. 13, no. 8-9, pp. 643-662, 2001, http://www.cogkit.org/. 25. D. Angulo, R. Aydt, F. Berman, A. Chien, K. Cooper,H. Dail, J. Dongarra, I. Foster, D. Gannon, L. Johnsson, K. Kennedy, C. Kesselman, M. Mazina, J. Mellor-Crummey, D. Reed, O. Sievert, L. Torczon, S. Vadhiyar, and R. Wolski. Toward a framework for preparing and executing adaptive grid programs. In Proceedings of International Parallel and Distributed Processing Symposium (IPDPS), 2002(41). 26. K. K. Droegemeier et al, “Service-Oriented Environments In Research And Education For Dynamically Interacting With Mesoscale Weather,” IEEE Computing in Science and Engineering, November-December 2005. 27. R. Wolski, N.T. Spring, J. Hayes, “The Network Weather Service: A Distributed Resource Performance Forecasting Service for Metacomputing,” Future Generation Computer Systems, 1998. 28. K. Czajkowski, S. Fitzgerald, I. Foster, C. Kesselman, “Grid Information Services for Distributed Resource Sharing,” Proceedings of the Tenth IEEE International Symposium on HighPerformance Distributed Computing (HPDC-10), IEEE Press, August 2001.


Science Gateways on the TeraGrid
A survey of issues for deployment of community gateway interfaces to shared high-end computing resources

Charlie Catlett, Sebastien Goasguen, Jim Marsteller, Stuart Martin, Don Middleton, Kevin J. Price, Anurag Shankar, Von Welch, Nancy Wilkins-Diehr

November 2006

Abstract
Increasingly, the scientific community has been using web portals and desktop applications to organize their work. The TeraGrid team determined that it would be important to create a set of capabilities that would allow TeraGrid services and resources to be integrated, potentially in a transparent way, with these scientific computing environments. This paper outlines the "Science Gateways" program and provides an overview of key lessons learned in developing mechanisms to allow for such integration.

Contents
Background
Accounting
Security
Risk Mitigation
Federated Identity Management
Metrics and Successful Peer Review
Conclusions and Further Work: Science Gateway Primer
References

1 Background
In 2004, the National Science Foundation's TeraGrid facility [i] was made available to the national academic community after a 2-year period of construction and early access. The initial facility consisted of a homogeneous environment of four Itanium-1 based clusters deployed at four high-performance computing sites linked by a dedicated 40 Gb/s network, with an aggregate of roughly 15 Teraflops of computational power. Today the facility includes over 20 computational resources, of a wide variety of architectures, at nine sites. In aggregate, TeraGrid provides over 140 Teraflops of computational power and will grow to over 560 Teraflops in 2007. Beyond computational resources, TeraGrid includes high-performance data archives, a growing number of public data collections, and a suite of remote visualization services. Initially, TeraGrid use consisted primarily of traditional client-server interactions, with grid functionality such as single sign-on, parallel data transfer, and remote job submission, as well as a new allocation scheme that allowed users to obtain allocations "redeemable" on any TeraGrid platform rather than tied to a particular system as in the past. In 2006 the TeraGrid software architecture was augmented with web services, moving to a service-oriented architecture to support new usage paradigms such as workflow and access from within applications or web portals.

Concurrent with the TeraGrid effort, many communities have developed customized interfaces to cyberinfrastructure – using emerging technologies such as web portal platforms and web services. Web access to data collections and compute capabilities are increasingly commonplace, though there are a variety of approaches and technologies used within the community, with a resulting variety of architectures and approaches to building such ‘science gateways.’ For example, the Protein Data Bank [ii] provides an electronic collection of structures of biological macromolecules as well as tools and resources for studying these structures. Similarly, the Network for Earthquake Engineering Simulation Cyberinfrastructure Center [iii] serves those engaged in earthquake engineering research by providing data sharing and simulation capabilities. The Network for Computational Nanotechnology [iv] provides course material, collaboration and simulation capabilities and more to those studying nanotechnology. The National Virtual Observatory [v]provides access to very large digital sky surveys and includes analysis capabilities. There are many such examples. In 2005 the TeraGrid team initiated a “Science Gateways program” to broaden the availability of TeraGrid resources and services by providing access through community-designed interfaces. Through this program we are effectively

adapting the TeraGrid to the work environment chosen by the user rather than requiring the user to learn and adapt to the TeraGrid environment. Working with developers of community infrastructure, we couple powerful TeraGrid resources to the “back end” of familiar front-end interfaces developed by research communities. By providing a rich set of services to gateway developers, they are able to provide greater capabilities to individual scientists without requiring them to invest in learning how to adapt their work to the TeraGrid environment. Traditionally, access to highperformance computing resources is granted to individuals, each of whom is provided with their own username and password. These individuals generally log on to a particular computing platform and submit compute jobs to a queuing system. Providing access to high end resources for a community through a shared interface involves a different, indirect, relationship between the resource provider and the end-user, and this impacts many aspects of the system ranging from authorization to usage accounting. Developing mechanisms to support this new model requires adaptation both for the resource providers and for the community interface developers. In this overview we outline some of these issues. Perhaps most importantly, the Science Gateway model provides an effective and efficient mechanism for providing specialized services (such as highperformance computing) to much larger user communities than was possible in

the past. Individual community proxies or organizations can aggregate resources for the larger group while identifying the common community services that will deliver the highest impact. This leads to a collaborative endeavor between those who provide general cyberinfrastructure services (such as TeraGrid), those who provide discipline-specific cyberinfrastructure (such as a science gateway team), and the scientists themselves. Such interactions are critical to developing and evolving a national cyberinfrastructure that is able to improve the productivity of individual scientists. We will discuss several areas that the TeraGrid has found critical to supporting gateway interfaces to shared resources.

2 Accounting
Some gateways plan to support tens of thousands of users. Creating individual logins for all of these users on all TeraGrid resources presents significant scaling challenges. One simplification that can streamline access to resources is a shared community account. Shared community accounts are not accounts where many users access resources by entering the same username and password. Rather, the accounts are managed by the gateway developer and used to run codes on behalf of the scientist through the gateway interface. Individual users may register with a domain-specific front end, and access common community computational services that are provided

via the TeraGrid using a single gateway account/username. Each individual using the gateway need not have an individual account on the TeraGrid; the gateway provides a collective account. While this level of anonymity can simplify access for end users, there are a number of additional tools required to provide the accountability and security necessary for NSF-funded resources. Developers need additional tools to trace community account resource usage back to individual gateway users and security staff need additional tools to restrict logins that provide web interfaces to supercomputing resources. We will describe these as well as additional capabilities necessary for developers in a complex multi-site environment. In a shared environment, gateway developers will need to track use of TeraGrid resources and attribute this use to individuals logged on to the gateway. In order to correctly attribute usage to an individual gateway user, a developer must be able to determine how many CPU hours were consumed by a job launched by that user on the TeraGrid. While seemingly straightforward in nature, we found that additional capabilities were necessary to facilitate this tracking. Many gateway developers use capabilities provided by the Globus Toolkit [vi]to access TeraGrid resources. For example, Grid Resource Allocation and Management (GRAM) [vii] is typically used to support remote job submission and monitoring., However, when a job finishes there is no

straightforward way for the gateway to determine how many CPU hours the job consumed. That information is critical to attributing usage to individual users using a Science Gateway account on the TeraGrid. To enable this functionality, the Globus team defined and created a Web Services (WS) interface and associated mechanisms to provide access to audit and accounting information associated with Grid services. This auditing system was designed to be scalable, secure, and open so that any grid service that TeraGrid deploys can follow this design. First, the Globus Toolkit’s GRAM2 (PreWS GRAM) and GRAM4 (WS-GRAM) services were enhanced to create audit records that are written to a database local to the GRAM services. So there will be many audit databases created everywhere TeraGrid’s grid services are deployed. These GRAM audit databases and records provide a persistent link between the grid service’s job id and the local resource manager’s (LRM) job id. Next, based on use cases and requirements, Open Grid Services Architecture-Data Access and Integration (OGSA-DAI) was selected to provide a service interface for TeraGrid’s audit and accounting information. OGSA-DAI is a Globus Toolkit Web Services Resource Framework (WSRF) service that can create a single virtual database from two or more remote databases. A new OGSA-DAI perform document was written which defines the WS operation for returning TeraGrid Service Units (SUs) given a grid job id.

Gateway developers can then use this new interface to manage and account for their allocation on a per job basis. There are some tricky details to accurately retrieving the correct usage for a job from TeraGrid’s central database that are best codified in a service/configuration and not exposed to gateway developers. OGSA-DAI can be deployed either centrally providing a TeraGrid-wide usage query service for all jobs, or deployed locally along with each GRAM4 service deployment, providing a local usage query service for local jobs. TeraGrid can decide (and change) which is most strategic. Gateways will now be able to remotely submit jobs to TeraGrid and account for usage on a per job basis without needing to understand the details of the various local resource managers chosen by TeraGrid resource providers. We feel this accounting capability will be very useful for other projects where per-job usage information is needed. These types of enhancements are essential toward reducing the complexity for gateways to interface with TeraGrid’s computational resources, as well as, allowing TeraGrid to simultaneously support an increasing number of gateways.
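As an illustration of the accounting flow described above, the following minimal Java sketch shows the kind of per-job lookup a gateway might perform once the audit link exists: join the GRAM audit record for a grid job ID to the local accounting record and return the charged service units. The JDBC connection string and the table and column names (gram_audit, job_usage, grid_job_id, local_job_id, charged_sus) are hypothetical placeholders; in the deployed system this query is hidden behind the OGSA-DAI service interface rather than issued directly by the gateway.

// Sketch only: per-job usage lookup, assuming a hypothetical schema that mirrors
// the GRAM audit record (grid job ID <-> local job ID) and an accounting table.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class UsageLookup {
    public static double susForGridJob(String jdbcUrl, String gridJobId) throws Exception {
        String sql = "SELECT u.charged_sus FROM gram_audit a "
                   + "JOIN job_usage u ON u.local_job_id = a.local_job_id "
                   + "WHERE a.grid_job_id = ?";
        try (Connection c = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = c.prepareStatement(sql)) {
            ps.setString(1, gridJobId);
            try (ResultSet rs = ps.executeQuery()) {
                // If accounting has not caught up with the job yet, report zero SUs.
                return rs.next() ? rs.getDouble(1) : 0.0;
            }
        }
    }
}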

3 Security
3.1 Risk Mitigation
Additional risks can also arise when providing community accounts and web interfaces to high-performance resources. The TeraGrid security working group has analyzed these risks and is developing approaches to mitigate them. Security officers at each site are alerted when a community account is requested, and these accounts are uniquely identified in the TeraGrid central database. TeraGrid resource sites may take independent approaches to account restriction. The current approach being suggested is a command-based restriction approach, where a community account may only run certain commands, e.g. only commands in a specific directory. At least one TeraGrid site is using this approach. We believe it provides the necessary security and gateway flexibility when deploying many applications on TeraGrid. One software package currently being developed at NCSA to suit this purpose is the Community Shell, or Commsh [ix]. Commsh allows for two methods of account restriction. The first method is an implementation of the command-based restriction described in the previous paragraph. Under this method, a configuration file is created that defines which commands (or sets of commands) a given account can execute. These commands can be specified using wildcards and regular expressions to create a flexible command restriction framework. The second method is change-root (or chroot) jailing. Change-root jailing effectively creates a filesystem-based "sandbox" for the account, only allowing commands to be executed from within this sandbox. An additional utility, Chroot_jail, can be used to help construct and manage these change-root jails. An adapter exists that allows Commsh to operate in command-based restriction mode for GRAM job submissions. Unfortunately, this adapter does not support change-root jailing at this time. Gateways may also want to implement shut-off mechanisms so that, in the event of a problem, jobs sent to TeraGrid can be restricted for a given gateway user without shutting down the entire community account. As an intermediate step, TeraGrid will be providing a web interface for system administrators to contact developers with information regarding a problematic local job ID. In the long term, TeraGrid would like to provide a service to enable automatic gateway-level account shut-off.
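The command-based restriction idea can be illustrated with a short, hypothetical Java sketch (it is not the Commsh implementation): allowed commands are expressed as regular-expression patterns in a configuration, and any submitted command line that matches none of them is rejected before execution.

// Sketch of command-based restriction; the policy pattern and paths are invented examples.
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class CommandFilter {
    private final List<Pattern> allowed;

    public CommandFilter(List<String> patterns) {
        this.allowed = patterns.stream().map(Pattern::compile).collect(Collectors.toList());
    }

    public boolean isAllowed(String commandLine) {
        return allowed.stream().anyMatch(p -> p.matcher(commandLine).matches());
    }

    public static void main(String[] args) {
        // Hypothetical policy: only executables under the gateway's application directory.
        CommandFilter filter = new CommandFilter(List.of("/usr/local/gateway/apps/.*"));
        System.out.println(filter.isAllowed("/usr/local/gateway/apps/model_run --input case1")); // true
        System.out.println(filter.isAllowed("/bin/bash"));                                        // false
    }
}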

3.2 Federated Identity Management
In the traditional mode of operation, prior to the Science Gateway model, each resource or resource-providing site was responsible for the management of its users' identities (a term we use to encompass a user's username, password, attributes and privileges). The Science Gateway model brought an out-sourcing of identity management from the resource to the gateway, shifting the responsibility for authentication and authorization from the resource to the gateway [xi].

To achieve maximum scalability, the goal is to shift identity management all the way back to the user's home institution (i.e. their campus or place of employment) and leverage the existing identity management infrastructure. Mechanisms to achieve this based on Shibboleth, GridShib, myVocs and other technologies are currently being evaluated by TeraGrid [x].

4 Metrics and Successful Peer Review
Metrics of success are commonly requested for government-funded programs. Successful gateway design will allow principal investigators to highlight gateway usage as well as science accomplishments due to the gateway. In the long term, gateways may set up a mechanism for researchers to cite the use of the gateway in publications. Success both in funding the gateway and in requesting TeraGrid resources can be traced to scientific accomplishments and a history of publications. For example, the DOE-sponsored Earth System Grid (ESG) project [viii] includes a Metrics Service that tracks logins, file and aggregation downloads, browse and search requests, and the total volume of activity conducted via its portal. Similar to many other funded projects, this information is very useful to principal investigators and sponsors in terms of determining the overall impact of the project. As ESG begins to utilize TeraGrid resources, it will need to track

computational and data services that are delivered to it as a Science Gateway.

At the same time, we need to understand the degree to which the TeraGrid contributes to the success of a given project. In the large, this is a fairly straightforward thing to quantify, but there will need to be an interplay between the Science Gateway projects and the TeraGrid where utilization of resources is associated with impact on the science community.

5 Conclusions and Further Work: Science Gateway Primer
Gateway and portal deployment is an extremely active area, and one objective of the TeraGrid Science Gateway program is to develop the standard processes and approaches necessary so that gateways can be enabled in a routine fashion. We have pursued this approach in such a way as to minimize TeraGrid-specific requirements for gateways, thus making it straightforward for a gateway project to use resources from TeraGrid as well as other grid facilities. To this end, an important aspect of the program has involved capturing and codifying lessons and approaches in the form of a "primer." The initial version of the primer, based on initial experiences integrating TeraGrid resources with gateways, describes available resources and services for gateways as well as the requirements necessary for making use of those resources.

To facilitate accurate and timely information in such a dynamic area, the primer has been made available as a Wiki (http://www.teragridforum.org/mediawiki) and will serve as the basis for Science Gateway documentation for TeraGrid resource integration. We expect contributions from a number of active gateway developers to ensure both accurate and timely information on TeraGrid resources and services, and we also see the Wiki as an active repository for the many tools used in gateway development. We believe this type of community involvement will provide a rich collection of information for others and will facilitate use of high-end resources in an increasing number of gateways.

The primer describes TeraGrid resources and services available to Science Gateways, requirements for using TeraGrid resources, best practices when designing a gateway, and includes a software contribution area. TeraGrid provides a variety of services to gateway developers in addition to hardware and software resources. Developers also have additional responsibilities for securing community accounts and tracking usage. Accounting services include a variety of accounts made available to gateway developers, including single-user accounts, community accounts and, in the future, dynamic accounts. The primer includes links to all computing, data and visualization resources available through TeraGrid. It describes the types of software available – the Common TeraGrid Software Stack, third-party packages installed on the TeraGrid and maintained by TeraGrid staff, and community software areas available to gateway developers for their own software deployment efforts. In addition, TeraGrid external relations staff are available to assist in publicizing gateway successes. Requirements outlined in the primer include additional information to be provided when requesting community accounts, recommended audit trails for usage tracking and mechanisms to restrict problem jobs. Best practices described cover gateway planning, design, implementation, operation and metrics collection, as well as desirable gateway characteristics.

The goal of the TeraGrid Science Gateway program is to provide streamlined access to developers wishing to integrate high-end resources into their portals and desktop applications. The Science Gateway team will continue to develop the processes and software functionality necessary to make this possible. Near-term future work will address generalized Web Service interfaces to TeraGrid resources and an attribute-based authentication testbed to investigate scaling and accounting issues faced by large communities.

6 References

[i] More information about TeraGrid can be found at http://www.teragrid.org
[ii] Protein Data Bank – http://www.pdb.org
[iii] Network for Earthquake Engineering Simulation – http://it.nees.org
[iv] Nanohub – http://www.nanohub.org
[v] National Virtual Observatory – http://www.us-nvo.org
[vi] Globus Toolkit – http://www.globus.org
[vii] Grid Resource Allocation and Management – http://www.globus.org/alliance/publications/papers.php#GRAM97
[viii] Earth System Grid – http://www.earthsystemgrid.org
[ix] Community Shell (Commsh) – http://security.ncsa.uiuc.edu/research/commaccts/
[x] Von Welch, Ian Foster, Tom Scavo, Frank Siebenlist, Charlie Catlett, "Scaling TeraGrid Access: A Roadmap for Attribute-based Authorization for a Large Cyberinfrastructure" (draft), http://gridshib.globus.org/tg-paper.html
[xi] Von Welch, Jim Barlow, James Basney, Doru Marcusiu, and Nancy Wilkins-Diehr, "A AAAA Model to Support Science Gateways with Community Accounts," Concurrency and Computation: Practice and Experience, October 2006.

CoaxSim Grid: Building an Application Portal for a CFD Model
Byoung-Do Kim1, Nam-gyu Kim2, Jung-hyun Cho3, Eun-kyung Kim3, Joshi Fullop1

1 National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
2 Korea Institute of Science and Technology Information, Tae-jun, Korea
3 Sookmyung Women's University, Seoul, Korea

Abstract
An application portal has been developed to support the execution of a computational fluid dynamics (CFD) numerical model that simulates hydrodynamic flow instability of the coaxial injector in the liquid rocket engine. The CFD application has been integrated into a web-based portal framework so that users can utilize resources available in a grid computing environment. The portal provides users with a single sign-on access point that connects the grid computing resources available in the user's virtual organization. The portal developed in this project is built in the framework of GridSphere and the Grid Portlet package. In addition to the default services offered by GridSphere, extra service modules have been developed in order to provide customized features for the specific needs of the numerical model. The portal development was conducted as an international collaborative research effort between NCSA and the Korea Institute of Science and Technology Information (KISTI).

1. Introduction
Three important characteristics of a grid computing environment are scalability, accessibility and portability. Scalability represents the ability to offer a diverse set of resources including high-performance computers (HPC), large storage systems, visualization systems, and even special experimental instruments for certain research communities. In order to take advantage of these various resources, the computational application itself needs to be scalable beyond the normal utilization of local resources. The accessibility of the grid environment is realized by the software layer; a central topic in current cyberenvironment research is how to provide better solutions for scientists and engineers to access and utilize grid resources for their research. Another important factor is portability, which guarantees ease of moving a developed solution between heterogeneous computing environments. Achieving the three features mentioned above serves as an effective measure for developing a good problem solving environment (PSE) for a grid computing application. A web-based portal acting as a science gateway may be the most appropriate approach for an existing scientific application to take advantage of available grid technology. A web browser-based portal user interface is a strong candidate for a single sign-on access point that provides access to heterogeneous, geographically distributed resources, services, applications, and tools. Through such an interface, a grid-enabled portal can offer various grid solutions to users as long as the user has access to the Internet. Integrating various grid toolkits into a web-based portal framework has proven to be an effective mechanism for scientific research communities to use grid resources in the form of a science gateway [1, 2]. With a grid portal framework and proper toolkits, it is possible to develop a scientific portal in a shorter time period than before. The main effort is in developing customized services for the numerical application that is implemented in the portal. Since each application has different requirements, developing customized modular services and plugging them into the basic portal framework is the best approach, as it gives the portal a high level of flexibility and portability.

This year, NCSA initiated an international collaborative research project with KISTI in the area of cyberenvironments: developing problem solving environments for scientific applications. KISTI has put great effort into cyberenvironment research through the national e-Science project over the last couple of years, and one of the outcomes is the e-Science Aerospace Integrated Research Systems (eAIRS) [3]. The eAIRS is an application portal that offers specialized services for a CFD application developed by Seoul National University, Korea. The eAIRS became a target item for the collaboration project because its characteristics are similar to the requirements to be developed for NCSA's application. While the eAIRS successfully operates for its own CFD solver, its limited portability and other technical issues led the NCSA-KISTI development team to create a new portal framework. The most noticeable difference is that eAIRS uses its own middleware and portal framework based on GT2, while the application portal developed in this project utilizes GridSphere and Grid Portlets to ensure portability. Though the mechanisms underneath the two portal interfaces differ, both portals still share some service modules, which is a great aspect of the collaboration. The TeraGrid community was assumed to be the virtual organization for this portal project. This paper presents an overview of the application portal as well as details of the newly developed service modules for the CFD application.

2. The CFD Application The application implemented into the portal is a computational fluid dynamics (CFD) numerical model for the coaxial injector flow in the liquid rocket engine. This model has been developed through collaboration research work between NCSA and School of Aeronautics and Astronautics Engineering at Purdue University, West Lafayette, IN. The purpose of this numerical model is to investigate hydrodynamic instability of the two-phase jet flow (gas and liquid) in the coaxial type of the injector that is normally used in the Space Shuttle Main Engine (SSME) and other large scale spacelaunching rockets. In the fuel injection systems of the liquid rocket engine, gaseous fuel and liquid oxidizer are injected through the nozzle into the combustion chamber of the rocket engine. Due to many disturbance factors such as acoustic pressure variation from the chamber or intrinsic hydrodynamic instability in the flow itself, the atomization mechanism of the fuel injection promotes combustion instability during rocket engine operation.

Figure 1. Cross-sectional view of liquid jet density contour in a recessed region of the coaxial injector and frequency analysis of the jet pulsation.


The numerical model in this paper simulates the coaxial jet flow and investigates the atomization mechanism of the coaxial jet in order to obtain a better understanding of combustion phenomena in liquid rockets [4, 5, 6]. Figure 1 shows flow instability inside the injector nozzle and jet pulsation at the exit. The numerical model consists of three major parts: pre-processing, main solver, and post-processing, which is a common structure for engineering numerical models. Mesh generation, input parameter setup, and code-run configuration compose the pre-processing step. The main solver is the core of the model, and solves the flow physics in the computational domain. The solver produces a large amount of data in multiple formats. These data are visualized in the post-processing step. The workflow described here is a widely employed procedure, especially in CFD numerical models. Figure 2 shows the workflow explained in this section. Due to the short time length of the collaboration project, the portal development focused on integrating the CFD solver and automating post-processing. The pre-processing procedure will be added to the current product later on. The code is parallelized using the Message Passing Interface (MPI), and runs on a large number of processors. It is also possible to control the size and frequency of the output data files as well as the number of parameters to be printed. For production runs, it normally produces data for density, pressure, and velocities in three-dimensional Cartesian coordinates with respect to time and space. The model is also capable of check-pointing, allowing users to check for numerical divergence or other numerical problems. Typical runs of the application take 3 to 5 days with 16 to 32 processors depending on simulation requirements. The data size of the output also varies, but a typical execution produces 10 to 20 GB for each run. The application portal that we have developed integrates the numerical model into the GridSphere portal framework and enables users to control all the features mentioned above. In addition to the basic functions that are provided by GridSphere and Grid Portlets, extra service modules for the specific needs of the model have been developed by means of portlet programming. As the portal puts complex service layers and resources behind the web-based interface, an experienced user can run the code on various resources available in his/her virtual organization. The next section will discuss the architecture of the application portal.

Figure 2. Workflow diagram of the coaxial injector flow modeling application

3. Portal Architecture
The application portal employs a standard three-tiered architecture: resource layer, grid service layer, and portal interface. The architecture consists of several major elements such as a standardized portlet container based on the JSR-168 standard, grid security, file transfer, and remote execution based on the Globus Toolkit. The GridSphere framework with the Grid Portlets also employs this three-tiered architecture and provides the basic functionality mentioned above. The application portal developed in this project is based on the GridSphere framework 2.01 and Grid Portlets 1.3.0. GridSphere, a portlet container with a collection of core services and an advanced user interface library, makes developing a portal easier by employing a portlet programming approach [7, 8]. Pre-WS Globus Toolkit 4 serves as the base middleware in this architecture while still maintaining GT2 compatibility. Within the GridSphere framework, a series of customized service portlets have been developed for the specific needs of the coaxial injector application. In addition to the new services, some of the basic features have been improved as well in order to increase efficiency and ease of use. Figure 3 shows a diagram of the application portal architecture. The middle section illustrates relations between grid services and third-party service modules that are required to operate the services. Further explanation of service development and its mechanisms will be given in the next sections.

Figure 3. Three-tiered architecture of the CoaxSim Grid Portal
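To give a sense of the portlet programming approach mentioned above, the following is a minimal sketch of a JSR-168 portlet of the kind that can be added to the GridSphere container; the class name and rendered content are illustrative only and do not reproduce the actual CoaxSim service portlets.

// Minimal JSR-168 portlet skeleton; a real service portlet would gather job or
// solver information here instead of printing a placeholder.
import java.io.IOException;
import javax.portlet.GenericPortlet;
import javax.portlet.PortletException;
import javax.portlet.RenderRequest;
import javax.portlet.RenderResponse;

public class CodeCheckPortlet extends GenericPortlet {
    @Override
    protected void doView(RenderRequest request, RenderResponse response)
            throws PortletException, IOException {
        response.setContentType("text/html");
        // Entry point of the portlet render life cycle; customized services plug in here.
        response.getWriter().println("<p>Real Time Code Check placeholder</p>");
    }
}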

4. CoaxSim Grid Portal
Figure 4 shows the front page of the CoaxSim Grid portal, which gives a brief explanation of the portal development project. The basic features such as user account management, grid credential retrieval, resource registration, file management, and job submission are provided by GridSphere. Once a user logs into the portal, the first service tab is 'Profile', which includes account and service management. In the next tab, 'Grid Services', the user has a choice of proxy server for credential retrieval; myproxy.ncsa.uiuc.edu or myproxy.teragrid.org can be used as long as the resources are registered to the portal. The time length of the credential activation can be configured through the credential management portlet. If the user is an administrator of the portal, then he or she can register any resources available in the virtual organization. The resource management page under the grid service tab provides a feature with which users can examine the registered resources. The MDS 2 (Monitoring and Discovery System) service offers information regarding available services and jobs on the machine; however, MDS 2 is not at production level and the information offered is very limited at this point. Once MDS 4 is deployed onto the TeraGrid systems, it is expected to provide more useful information about the resources. Another basic feature in the grid service tab is file management. It allows users to upload and download files and move data around between the local workstation and the grid resources. The user interface for the file management service is mediocre at best, but the effort of developing another file management module was not feasible in the given time frame.

The next tab, named 'Simulations', handles job submission and status check-up for the jobs that have been submitted through the portal. Jobs are submitted by the Globus Resource Allocation Manager (GRAM) service, and users are asked to input the information required for running the job. Once the job is submitted by GRAM, users can review the job status in the job monitoring page. The GRAM service simply shows submitted, pending, active, and done status. In the CoaxSim Grid portal, an extra feature displays detailed job status information from the local scheduler. The implementation of this feature will be explained in the next section.

The functions explained so far in this section are mainly features given by GridSphere or modified versions of them. In the 'Data' tab, newly developed features that are specifically customized for the coaxial injector model have been implemented. First, the 'Real Time Code Check' page provides users with a function that displays calculated results of critical parameters in the middle of job execution. Once the user selects a job from the list, the page displays plots of the residual value from the matrix solver and total computation/communication time up to that point in the code execution. This feature is extremely useful when the code runs over a long period of time, since the residual value provides users with information on the numerical behavior of the code. The time plot also gives the user an idea of how long the code will run, and how far the code has progressed. The second function is data management. While GridSphere provides a basic file management portlet, it still requires manual labor from the user if he/she wants to move the output files to a local workstation. The data management function developed for this portal offers an automatic file transfer capability between the computational resources and a long-term storage unit (the Mass Storage System at NCSA in this case).

Figure 4. Front page of the CoaxSim Grid Portal


It also provides direct file download from the computational resources to a local workstation. Both single-file and multiple-file downloads are possible, allowing users flexible file management. Copies of saved files are extracted from the mass storage system, but the original data files remain available in the storage system after the download.

5. Service Modules Development
5.1 Grid credential retrieval module
The credential and resource management function is provided by Grid Portlets in the GridSphere framework. Once the MyProxy server is registered in resources.xml by the administrator, the user can create a session credential using an X.509 proxy certificate. Since the typical run time of the application is usually longer than the normal credential lifetime, users had to reissue the credential once the original one expired. The MyProxy team recently developed a solution to this problem while working on the EU DataGrid project. At the start of a session, users store their long-lived credentials in a dedicated MyProxy server and delegate short-lived credentials to their jobs. When a job's credential nears expiration, the Workload Management System retrieves a new short-lived credential from the MyProxy server on the user's behalf and uses it to refresh the job's credential [9]. The CoaxSim Grid portal's credential module uses this feature in order to avoid issues due to credential expiration.

5.2 Job submission and status check module
The features for job submission from GridSphere satisfy the basic needs for running this application, but the user interface was found to be not efficient enough. We have integrated the GRAM service selection page into the resource selection and job submission page, and redesigned the RSL scripting page for a better user interface.

Figure 5. Job Monitoring page with a link to CluMon and information from the PBS scheduler.


In the Job Monitoring page, we added a couple of new features in order to give more job-specific information to the users. When a job is submitted by the GRAM service, it is sent to the local scheduler, such as PBS, and GRAM generates a job ID for its own use. Unfortunately, relating globally scheduled jobs to jobs scheduled to run on a computational resource with its own scheduler has been a problem since the inception of global grids. Since the GRAM service only returns a very simple job status message, users are not able to get any information regarding how the job is processed on the local machine. The indirect method of solving this problem that we have employed is to assign a key (a unique job ID in the RSL script in this case) to the global job at submission time, and pass that key along to the local scheduler as a job attribute. Then, on the information query side, that key can be used to query and ascertain the local job name. CluMon, a cluster monitoring system developed by NCSA [10], has been utilized in this mechanism. CluMon was developed to give an overview of the current state of the computational resources at a glance. It delivers information from both the scheduler and the hosts to users through a web interface. To convey this information to the portal user, CluMon was slightly modified to recognize a global ID request and use that key to cross-reference the local job name and return the job info to the requester. The only remaining technical issue is that the local scheduler (PBS-based) currently has a bug where it does not return all of the job information on a remote query, as opposed to a query from the local machine. Once this is fixed, the mechanism should work as designed. Figure 5 shows the job monitoring page that displays job-related information extracted from CluMon.
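The key-passing scheme can be sketched as follows. The environment variable name (COAXSIM_JOB_ID) and the executable path are hypothetical examples, and the actual cross-referencing against the local PBS job is performed by the modified CluMon, not by this fragment.

// Sketch: generate a correlation key at submission time and embed it in the GT2 RSL
// as an environment variable, so the monitoring page can later look the job up.
import java.util.UUID;

public class JobSubmissionSketch {
    public static String buildRsl(String executable, int processors, String correlationKey) {
        return String.format(
            "&(executable=\"%s\")(count=%d)(jobtype=mpi)"
          + "(environment=(COAXSIM_JOB_ID \"%s\"))",
            executable, processors, correlationKey);
    }

    public static void main(String[] args) {
        String key = UUID.randomUUID().toString();  // recorded by the portal for later CluMon queries
        System.out.println(buildRsl("/u/ac/coaxsim/bin/solver", 32, key));
    }
}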

5.3 Real-time code check-up module
This module has been developed in order to allow users to check the behavior of the code during run time. At every time step, the code writes to a single file the values of physical parameters that are critical for determining the convergence of the code. The portal can pull this file whenever users want, and display the plots that are shown in Figure 6.
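A minimal sketch of how such a page can turn the per-time-step residual file into a plot is given below, assuming a simple two-column (time step, residual) file format and the classic JFreeChart 1.0 API named later in this section; file paths and the omitted error handling are illustrative only.

// Sketch: read the solver's residual file and render it as a PNG for the portal page.
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.jfree.chart.ChartFactory;
import org.jfree.chart.ChartUtilities;
import org.jfree.chart.JFreeChart;
import org.jfree.chart.plot.PlotOrientation;
import org.jfree.data.xy.XYSeries;
import org.jfree.data.xy.XYSeriesCollection;

public class ResidualPlot {
    public static void render(String residualFile, String pngPath) throws Exception {
        XYSeries series = new XYSeries("residual");
        for (String line : Files.readAllLines(Paths.get(residualFile))) {
            String[] cols = line.trim().split("\\s+");   // assumed format: "<time step> <residual>"
            if (cols.length >= 2) {
                series.add(Double.parseDouble(cols[0]), Double.parseDouble(cols[1]));
            }
        }
        JFreeChart chart = ChartFactory.createXYLineChart(
            "Matrix residual", "time step", "residual",
            new XYSeriesCollection(series), PlotOrientation.VERTICAL, false, false, false);
        ChartUtilities.saveChartAsPNG(new File(pngPath), chart, 640, 480);
    }
}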

Figure 6. Real Time Code Check-up page with residual value plot and computational time per time step plot.


The right plot is the matrix residual value with respect to time, and the left plot shows computational or communication time with respect to accumulated time or each time step. The graph generation is done by utilizing a freely available Java graphics API, JFreeChart. This module was previously developed for the e-AIRS project at KISTI, and was implemented here with a slight modification. The displayed parameters can always be replaced with other choices depending on the users' requirements. This service is particularly useful to most numerical application developers, especially where the application has a check-pointing capability, because it shows the numerical behavior of the code in real time during execution.

5.4 Post-processing data management module
GridSphere provides a default file management portlet. Once the user downloads and installs the GridSphere and Grid Portlets packages, the user can use this feature out of the box. However, users of a specific numerical model usually want to have customized services for output file management, because data management for the post-processing requires specific procedures that are suitable for an automated process. The coaxial injector model produces hundreds of output files from each run, and the size of the data easily goes beyond the normal disk quota of users. The CoaxSim portal utilizes a third-party file transfer mechanism that was developed within KISTI's eAIRS project. Once the eAIRS file transfer module using the GridFTP protocol is activated, the eAIRS server keeps remotely checking the data production from the CFD solver on the computational machine. The files are transferred to long-term storage units that are already registered to the portal. At the same time, metadata for the transferred files are saved in a separate DB server. When the portal server inquires about the information on the output files on behalf of the user, the DB server passes the corresponding metadata to the portal server for display.
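The metadata-driven listing described above can be sketched as a simple query against the metadata database; the table and column names below are hypothetical, and the actual GridFTP transfer is triggered separately, only for the files the user selects.

// Sketch: ask the metadata DB which output files a run produced and where they were archived.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;

public class OutputFileCatalog {
    public record FileEntry(String name, long sizeBytes, String archiveUrl) {}

    public static List<FileEntry> listRunOutputs(String jdbcUrl, String runId) throws Exception {
        String sql = "SELECT file_name, size_bytes, archive_url "
                   + "FROM output_files WHERE run_id = ? ORDER BY file_name";
        List<FileEntry> entries = new ArrayList<>();
        try (Connection c = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = c.prepareStatement(sql)) {
            ps.setString(1, runId);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    entries.add(new FileEntry(rs.getString(1), rs.getLong(2), rs.getString(3)));
                }
            }
        }
        return entries;  // the GridFTP download itself happens only for files the user selects
    }
}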

Figure 7. Data Management page for output file browsing and single/multiple file download


The actual file transfer from the storage system to the local workstation occurs only when the user decides to download a file to the local workstation. This mechanism saves time and avoids network performance bottlenecks, since only the metadata are needed before the actual file transfer. The eAIRS server enables third-party file transfer activity between the computational resource and the storage system, and also between the storage systems and the local workstation, by using the GridFTP protocol.

6. Summary and Discussion
CoaxSim Grid, an application portal for a CFD numerical solver, has been developed through a collaborative project between NCSA and KISTI. The CoaxSim Grid portal developed in this project utilizes the basic portal features of the GridSphere framework and the Grid Portlets package from GridLab. The main focus of the development of CoaxSim Grid was to provide customized services for the coaxial injector modeling application. Rather than asking users to get accustomed to what is given with current grid technology, proactively developing services around users' needs and the requirements of the application in the context of the grid portal can bridge the gap between engineering researchers and computer science technology. The architecture presented in this paper is based on the concept that the portal server is a container of service clients that are designed according to the portlet component model. The use of a standardized portlet component model enables fast development of a customized portal for a given application. It also allows plug-in-style service module development, which guarantees flexibility and portability of the portal contents. By having a unified user interface for using all types of grid computing resources in the TeraGrid community, the CoaxSim portal could accelerate the pace of research production of user groups when it is positioned as a research-community-oriented problem solving environment. Challenges remain, however, because the portal still requires a high level of understanding of grid technology and a lot of programming work for service development. Keeping up with the rapidly changing standards in grid computing technology is another barrier for engineering and science research groups. The CoaxSim Grid portal in this project is an example of overcoming those obstacles through collaboration between a computer science group and an engineering application group. Though there are still many areas to work on in order to create more sophisticated grid computing solutions for generic scientific applications, the CoaxSim Grid project has shown a possible way of developing reusable grid services using currently available grid technologies.

Acknowledgement The authors would like to acknowledge supercomputer time provided by the National Center for Supercomputing Applications and financial support from Korea Institute of Science and Technology Information.

References
[1] M.P. Thomas, J. Burruss, L. Cinquini, G. Fox, D. Gannon, L. Gilbert, G. von Laszewski, K. Jackson, D. Middleton, R. Moore, M. Pierce, B. Plale, A. Rajasekar, R. Regno, E. Roberts, D. Schissel, A. Seth and W. Schroeder, "Grid Portal Architecture for Scientific Applications," Journal of Physics: Conference Series 5. Accepted for publication.
[2] M.P. Thomas, M. Dahan, K. Mueller, S. Mock, C. Mills and R. Regno, "Application Portals: Practice and Experience," Grid Computing Environments: Special Issue of Concurrency and Computation: Practice and Experience, 2002, V.14:1427-1444.
[3] e-AIRS project, http://obiwan.kisti.re.kr/escience/eairs
[4] B. Kim, S. D. Heister, "Numerical Modeling of Hydrodynamic Instability of Swirl Coaxial Injectors in a Recessed Region," 42nd AIAA/ASME/SAE/ASEE Joint Propulsion Conference, Sacramento, CA, 2006.
[5] B. Kim, S. D. Heister, S. H. Collicott, "Three-Dimensional Flow Simulations in the Recessed Region of a Coaxial Injector," Journal of Propulsion and Power, Vol. 21, No. 4, pp. 728-742, 2005.
[6] B. Kim, S. D. Heister, "Effect of Chamber Pressure Variation on High-Frequency Hydrodynamic Instability of Shear Coaxial Injector," 40th AIAA/ASME/SAE/ASEE Joint Propulsion Conference, Fort Lauderdale, FL, 2004.
[7] J. Novotny, M. Russell and O. Wehrens, "GridSphere: An Advanced Portal Framework," http://www.gridsphere.org/gridsphere/wp4/Documents/France/gridsphere.pdf
[8] J. Novotny, M. Russell and O. Wehrens, "GridSphere: A Portal Framework For Building Collaborations," http://www.gridsphere.org/gridsphere/wp4/Documents/RioBabyRio/gridsphere.pdf
[9] D. Kouril and J. Basney, "A Credential Renewal Service for Long-Running Jobs."
[10] CluMon, http://clumon.ncsa.uiuc.edu/


Secure Federated Light-weight Web Portals for FusionGrid
D. Aswath,1 M. Thompson,2 M. Goode,2 X. Lee,1 N.Y. Kim1

1 General Atomics, San Diego, California
2 Lawrence Berkeley National Laboratory, San Francisco, California

Abstract The FusionGrid infrastructure provides a collaborative virtual environment for secure sharing of computation, visualization and data resources over the Internet to support the scientific needs of the US magnetic fusion community. Invoking FusionGrid computational services is typically done through client software written in, for historical reasons, the commercial language IDL. Scientists use these clients to prepare input data and launch FusionGrid computational services. There are also numerous web sites throughout the US dedicated to fusion research, functioning as light-weight single purpose portals. Within the FusionGrid alone, there are web sites associated with authentication, authorization, and monitoring of services. Pubcookie and MyProxy technology were used to federate these disparate web sites by enabling them to authenticate a user by their FusionGrid ID and then to securely invoke FusionGrid computational services. As a result of this drop-in authentication mechanism, portals were created that allow easier usage of FusionGrid services by the US fusion community. The shared authentication mechanism was accomplished by the integration of Pubcookie’s single sign-on mechanism with the MyProxy credential repository that was already in use by the FusionGrid. This paper will outline the implementation of the FusionGrid portal technology, discuss specific use cases for both invoking secure services and unifying disparate web sites, present lessons learned from this activity, and discuss future work.

I. Introduction
The National Fusion Collaboratory Project [1] [2] teams the three major U.S. fusion physics research centers: the Princeton Plasma Physics Lab, General Atomics (GA), and the MIT Plasma and Fusion Science Center, with collaborators from the computer science groups at Princeton, Argonne National Labs (ANL), Lawrence Berkeley National Labs (LBNL) and the University of Utah. This project created a national Fusion Energy Sciences Grid (FusionGrid) [3] to provide new capabilities to fusion scientists to advance fusion research. FusionGrid is a system for secure sharing of computation, visualization, and data resources over the Internet. The FusionGrid goal is to allow scientists at remote sites to fully participate in experimental and computational activities as if they were working at a common site, thereby creating a virtual organization (VO) of the US fusion community. The Grid's resources are protected by a shared security infrastructure including strong authentication to identify users and fine-grain authorization to allow stakeholders to control their resources. FusionGrid uses the X.509 certificate standard and the FusionGrid Certificate Authority (CA) to implement a Public Key Infrastructure (PKI) for secure communication.

Fundamental to the deployment of FusionGrid into the everyday working environment of US scientists is the usage of the web browser client to deliver some of FusionGrid's capabilities. Such web browser functions include a Fusion Grid Monitor (FGM) [4], hosted at General Atomics, for monitoring the execution of FusionGrid jobs, and a preliminary site hosted at LBNL [5] for user registration and management. Combining these new capabilities with the numerous existing US fusion web sites that contain documentation and other information relevant to performing science on FusionGrid has resulted in a large number of web servers spread across the US that serve some aspect of FusionGrid functionality. A separate project investigated the usage of a Java portal, but having a single general-purpose portal did not correspond to the realities of the highly distributed VO with a significant number of legacy web sites.

Access to FusionGrid's computational services is done through client programs that depend on the Globus Secure Infrastructure (GSI) [6] to do secure data access and secure job submission.


These client programs have been written in the Interactive Data Language (IDL) [7], a commercial software analysis and programming language that is very commonly used within the experimental US fusion community. There are two problems posed by this solution: the Globus Toolkit [8] is not available on Windows and requires a fairly complicated installation procedure on UNIX, and IDL is not available on every potential client machine, since it requires buying a license for each host. Thus a simple web interface that would allow data marshalling and job submission is desirable, as it would allow easy client usage from any web-browser-capable computer. This web interface also has to be able to leverage a single sign-on authentication scheme to get a proxy certificate for the scientist, which is required by the grid middleware for remote job submissions and data access.

II. Related Work
The majority of the community's work on creating scientific portals has been done in Java, leveraging some popular containers for Java servlet code such as Apache Tomcat [9], Jetty [10], JBoss [11], WebSphere [12], and the Java portlet specification released in October 2003 [13]. The goal of such a portal is to provide a single point of entry to all the functions of a VO, and some of the commonly provided functions include shared spaces such as chat, calendar and newsgroups, whiteboards, shared applications and group authorizations. Grid portal containers such as OGCE [14] and GridSphere [15] also provide X.509 authentication, grid-style job submission and grid data transfers. To provide a GUI interface for marshalling data and setting parameters for running a particular code, and to further encapsulate the code within such a portal, the developers must be knowledgeable about the scientific code being called and the tools and libraries that are provided by the portal. Typically, portals are designed to run centrally at one site, providing access to all of the VO services. Installing and maintaining an integrated portal is a non-trivial undertaking, complicated by the fact that state-of-the-art portals are large, rapidly evolving software projects, based on frequently changing third-party software, e.g. portlet containers, portlet standards and authentication approaches. The National Fusion Collaboratory Project worked with the developers of an OGCE portal to deploy a full-service grid portal for FusionGrid. While deployment of the provided portal framework was accomplished, the handoff of maintenance and further development to fusion scientists, who were neither Java programmers nor portal experts, was not successful. Addition of a new code or service could only be done by users who understood the code to be executed and its portal environment, and who had administrative access to the site at which the portal was run. In practice, the fusion scientists who understood how to execute a code lacked the portal expertise to integrate their interface into the portal.

Another common approach to hiding the complexities of running a scientific code is to first wrap the code with a simple command-line or GUI interface, prompting for the required parameters, and then run the command as a Common Gateway Interface (CGI) script behind a web page, thereby creating a single-purpose portal. To launch the code as a grid service, this approach must authenticate the user as a member of the VO permitted to run the service and subsequently retrieve a grid proxy credential on behalf of the user, for use in the Globus job submissions. Pubcookie [16] is an open-source software package that uses cookies and a central secure login server to enable a set of trusted web sites to effect a single, authenticated sign-on to all the web sites. The login server is the only site that needs to see the user's password for authentication. The authenticated user's ID is conveyed to the other web sites in encrypted cookies that can only be decrypted by the login server and the targeted sites. Notably, the user or anyone who gains possession of the cookie cannot alter its contents without invalidating it. Pubcookie is implemented on each of the trusted web sites as an Apache module. The login server allows the authentication mechanism to be provided as a plug-in module, enabling the deployer to decide if user names and passwords are to be kept in a simple database, an LDAP server, Kerberos or some other means. Additionally, MyProxy [17] is a standard open-source grid server that provides X.509 proxy credentials, suitable for use in GSI transactions, when provided with a user name and either the user's password or a trusted credential. As FusionGrid was already running a MyProxy server as part of its centralized certificate management service, combining it with Pubcookie was an obvious approach to provide an authentication and delegation service that allows existing single-purpose web sites to authenticate FusionGrid members and securely access remote data and submit jobs through the GSI infrastructure.

2

single sign-on across all portals a user might require; to get the necessary grid credentials that enable the client-side software to make a GSIenabled call to a FusionGrid service, and to provide access to the Globus software from within the portal.

authenticate FusionGrid members and securely access remote data and submit jobs, through the GSI infrastructure. Other work has also been done to integrate these two packages. The MyProxy server has an authentication plug-in module, which allows it to authenticate a user via a Pubcookie cookie. The National Virtual Observatory has done a similar integration of MyProxy, Pubcookie and PURSe (Portal-based User Registration System) [18] to provide proxy certificates to its portals.
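As an illustration of the password-based delegation path described above, the sketch below shows how a client-side script might obtain a short-lived proxy from a MyProxy server using the standard myproxy-logon command-line client. It is a minimal sketch only: the host name, lifetime and output path are placeholder assumptions, not values taken from the FusionGrid deployment.

```python
import subprocess

def fetch_short_term_proxy(gridid, myproxy_host="myproxy.example.org",
                           lifetime_hours=12, out_path="/tmp/x509up_portal"):
    """Delegate a short-lived proxy from a MyProxy server.

    Runs the standard `myproxy-logon` client, which prompts for the user's
    MyProxy pass-phrase on the terminal. Host name, lifetime and output
    path here are illustrative placeholders.
    """
    cmd = [
        "myproxy-logon",
        "-s", myproxy_host,          # MyProxy server to contact
        "-l", gridid,                # account (gridId) holding the credential
        "-t", str(lifetime_hours),   # requested proxy lifetime in hours
        "-o", out_path,              # where to write the delegated proxy
    ]
    subprocess.run(cmd, check=True)
    return out_path

if __name__ == "__main__":
    proxy_file = fetch_short_term_proxy("jsmith")
    print("Delegated proxy written to", proxy_file)
```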

III. Federated Web Portals

A. Approach

As described in the previous two sections, the National Fusion Collaboratory Project aimed to provide browser-accessible GUIs for job submission on the FusionGrid. These single-purpose portals would lead the user through the data preparation stages, explain and set parameters, record input for future reference or reuse, invoke the service, monitor the process and make results available to the user. To succeed within the FusionGrid environment, such interfaces must be written by the service provider in a language of their choice, requiring minimal additions to a standard Apache web server installation. The major challenges in providing such single-purpose portals are the ability to provide single sign-on across all portals a user might require, to obtain the grid credentials that enable the client-side software to make a GSI-enabled call to a FusionGrid service, and to provide access to the Globus software from within the portal.

B. Overview

We combined several existing software modules to provide a Federated Portal Framework: a Pubcookie module providing single sign-on for the set of trusted web servers; a MyProxy server handling the storage of long-term credentials and the delegation and storage of short-term proxies needed for GSI [4]; and the Credential Manager handling the registration of users and the management of long-term credentials. The combination of the four components (Credential Manager, Pubcookie login server, and two instances of MyProxy, one for long-term and another for short-term credentials) is referred to as the Authentication and Delegation Service (ADS). These servers are co-located on the same host, so that the connections between them are automatically secure. The FusionGrid authorization server, ROAM [19], is used by a FusionGrid service to check a user's access to a specific grid resource based on the Common Name (CN) in the user's X.509 certificate. Figure 1 presents an overview of the architecture and how it is used.

Fig. 1. Federated Portal Architecture.


The infrastructure for the portal architecture consists of:
• a set of servers running on a secure and trusted host (the ADS);
• a set of trusted web interfaces that support HTTPS, cookies and an Apache Pubcookie module;
• the FusionGrid authorization server, ROAM.

C. Credential Manager

A user registering with the FusionGrid chooses a login ID, called the gridId, used for subsequent logins. The first and last name provided as part of the registration process define the Common Name (CN) that forms part of the X.509 credential issued to the user. The Credential Manager enters these long-term credentials into a MyProxy server, indexed by gridId and encrypted with the user's pass-phrase.

D. Pubcookie

The Pubcookie framework consists of an Apache module deployed on each of the trusted web portals and a central login server providing the basic multi-site single sign-on capability for web sites in the same domain, e.g. fusiongrid.org. When a user connects, the Pubcookie module looks for a session cookie; if a signed cookie containing the user's gridId is present, it knows that the user has been authenticated. In the absence of such a cookie, it redirects the request to the Pubcookie login server. The first time a user is redirected to the login server, they are presented with a login form prompting for a gridId and password. The login server authenticates the user with the gridId and password provided and returns two cookies: a granting cookie scoped to reach the web portal that was originally contacted, and a login cookie scoped to be returned to the login server on access to any other web portal.

When a different portal is first contacted, the redirection to the login server carries the login cookie. The login server uses this cookie to authenticate the user without prompting for a password again. It then generates a new granting cookie, which is returned to the second web portal.

E. MyProxy server with Pubcookie

FusionGrid runs two MyProxy servers: a MyProxy CredentialStore and a MyProxy ProxyStore. The first MyProxy server stores long-term credentials in the CredentialStore. For FusionGrid jobs that are submitted directly by a user, a short-term proxy is delegated from the CredentialStore using the gridId and password provided by the user.

A second MyProxy server stores short-term proxies in the ProxyStore to support proxy renewal by long-running FusionGrid services. These proxies can be used for delegation by a trusted server presenting its own X.509 credential. This style of delegation can be used by portals to obtain the proxy certificate of an authenticated user and submit Globus jobs on their behalf. Since Pubcookie uses MyProxy to authenticate the gridId and password via a myproxy-login interface, it is natural to store the resulting proxy in the short-term ProxyStore. These proxies are set to allow delegation only by the trusted web portals. Thus, when a web portal needs a proxy certificate for a Globus request, it can contact the short-term MyProxy server to get one. This requires that each web portal have its own X.509 service certificate registered with the ADS. To coordinate the Pubcookie password authentication with that of MyProxy, an authentication plug-in was added to Pubcookie that calls MyProxy to check the password. A side effect of this call is the issuing of a proxy credential.

The process of securely launching FusionGrid computational codes with the Federated Portal Architecture has the following steps, as shown in Fig. 2:

1. The user connects to the web portal to launch a specific FusionGrid service.
2. If the user has not previously authenticated, the Pubcookie module redirects the request to the Pubcookie login server.
3. The Pubcookie login server requires the user to sign on with FusionGrid credentials.


Fig. 2. Details of Authentication and Delegation.

4. The user is authenticated with the ADS server and a short-term proxy is delegated from the stored long-term credentials.
5. The short-term proxy is placed in the secondary MyProxy server for subsequent portal retrieval.
6. Upon successful authentication, the user is sent a 'redirect' page and is granted a login cookie. The login cookie is used on any subsequent visits by the user to the login server (single sign-on capability).
7. The 'redirect' page causes the user's browser to re-connect to the original web portal, this time with the granting cookie.

IV. Web Portal Use Case

ONETWO [20] is a time-dependent magnetic fusion analysis and simulation code that is available as a computational service on the FusionGrid. This grid-enabled code, running on a cluster of Linux machines, can be invoked directly at General Atomics (GA) locally or remotely via FusionGrid, as shown in Fig. 3.

Fig. 3. ONETWO as a FusionGrid service.


AUTOONETWO and PREONETWO are IDL-based client-side GUI tools hosted at GA that help scientists prepare ONETWO runs on the FusionGrid. Though they differ in the way inputs are gathered and processed and in the specific scientific problem they address, these GUI tools manage code runs with a code run database and upload the prepared inputs to the ONETWO computational service when requesting a new code run, thereby reducing the work required by the user to launch FusionGrid code runs. However, these client programs depend on both a Globus infrastructure [8] and a commercial IDL license being available on every host machine. With the ongoing effort to provide a simple web interface allowing easy client usage from any machine with a web browser, we have developed a web portal (Fig. 4) based on the Federated Portal Architecture described above to enable authenticated FusionGrid users to securely invoke the ONETWO computational service on the FusionGrid.

When clients attempt to access a Pubcookie-protected web page hosted by the web portal through their browsers, they are prompted for their gridId and password by the login server at cert.fusiongrid.org. Upon successful authentication, a proxy is delegated from the MyProxy server for later retrieval by the portal and the user is redirected to the originally requested page. The portal uses its host credentials to retrieve the proxy certificate for the authenticated user to start the ONETWO run. The portal queries the ROAM authorization server to check whether the authenticated user has permission to access the ONETWO resource. Authorized users are presented with the option to gather and process inputs as shown in Fig. 5. The ONETWO code has hundreds of different input settings, but this initial version of the portal interface exposes only the most commonly changed ones for user adjustment. As use of the portal grows, more inputs will be added. The general inputs include which fusion plasma shot to analyze, what time range within that shot, and an optional text comment string. The advanced input section includes: where to get the plasma shape (EFIT ID), the plasma temperature and density profiles (ZIPFITS and Profile directory), the input template file that specifies all possible inputs (INONE Template), and specifics about the auxiliary heating of the plasma (NBI and ECH). The inputs thus prepared for a desired shot are inserted into a database and subsequently retrieved by the ONETWO computational code during its run. The run can be monitored through the FusionGrid Monitoring system (FGM) as shown in Fig. 6. The user can then access the results of the run, stored in an MDSPlus [21] data repository identified by the FGM logs.

Fig. 4. Web portal hosting the web page to launch the ONETWO service on the FusionGrid.


Fig. 5. ONETWO Input Preparation for FusionGrid users authorized to access ONETWO.

Fig. 6. FusionGrid Monitor (FGM) logs monitoring a ONETWO run with a run id of 1190. Results of this run are stored in the MDSPlus tree AOT06.
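Putting the pieces described in Sections III and IV together, the portal-side launch logic can be summarized roughly as follows. This is a sketch only: the `myproxy`, `roam`, `rundb` and `onetwo` adapter objects are hypothetical stand-ins for the MyProxy retrieval, ROAM query, run-database insert and GSI job submission; none of them is part of a real FusionGrid API.

```python
def launch_onetwo(gridid, inputs, myproxy, roam, rundb, onetwo):
    """Sketch of the portal-side launch sequence; all adapters are hypothetical."""
    # Use the portal's host credential to retrieve the short-term proxy
    # that was delegated for this user at login time.
    proxy = myproxy.retrieve(gridid)

    # Ask ROAM whether the certificate's Common Name may use ONETWO.
    if not roam.permits(proxy.common_name, resource="ONETWO"):
        raise PermissionError(f"{gridid} is not authorized for ONETWO")

    # Store the prepared inputs (shot, time range, EFIT ID, ...) so the
    # computational code can pick them up during its run.
    run_id = rundb.insert(inputs)

    # Submit the grid job on the user's behalf; progress is then visible
    # through the FusionGrid Monitor (FGM).
    onetwo.submit(run_id, proxy)
    return run_id
```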


V. Discussion and Concluding Remarks

With a straightforward implementation, the Federated Web Portal worked as expected to authenticate, authorize and provide a proxy credential for the user, and we were successful in launching the ONETWO computational code as a secure grid service on the FusionGrid via the portal. With the MDSPlus repository storing the outputs of the code runs, output visualizations are currently presented to the fusion scientists with existing GA software tools, such as ReviewPlus, run locally rather than through the portal. As the scientists have only recently begun using the portals, we have yet to determine the types of physics codes that can be ported to our portal framework. Our preliminary experience shows that codes requiring intensive visualization during input preparation are not well suited to access via our web portals, whereas codes that do not require extended graphics, such as ONETWO, are well suited to the proposed portal architecture. Roughly six months of use should allow us to make a firmer judgment on the best use of this technology. It is expected that, with a straightforward procedure for a portal site to authenticate, authorize, and obtain and use a proxy, the significant work in creating a code portal will be in presenting a convenient and intuitive interface for input and output to FusionGrid services.

For our future work, to further enable visualization of code run outputs, we plan to integrate Elvis [22], a scientific visualization package that allows users to view graphs in a browser window. As the Federated Portal approach requires each of the web portals to have a fusiongrid.org alias in addition to its primary name, we would like to eliminate this additional requirement. Having migrated to a Wiki-based web site for the DIII-D fusion facility and a Bugzilla system to track requests from users on software updates and possible bugs, we will examine the use of FusionGrid credentials for the login scheme. As the DIII-D wiki site requires the user's login ID to be tracked to monitor edits on the web pages, this Pubcookie model of authentication with the X.509 FusionGrid credentials would not only secure the DIII-D wiki pages, but would also eliminate the need for users to remember a separate set of login IDs and passwords to be able to access and edit such pages, and make requests via Bugzilla.

Acknowledgment

This work was funded by the SciDAC project, U.S. Department of Energy, under contract DE-FG02-01ER25455 and by the Director, Office of Science, Office of Advanced Science, Mathematical, Information and Computation Sciences of the U.S. Department of Energy under contract number DE-AC02-05CH11231. The authors wish to thank David Schissel for his valuable suggestions on this paper.

References

[1] D.P. Schissel, et al., "Building the US National Fusion Grid: Results from the National Fusion Collaboratory project," Fusion Eng. and Design 71, 245 (2004).
[2] D.P. Schissel, et al., "The National Fusion Collaboratory Project: Applying Grid Technology for Magnetic Fusion Research," Proceedings of the Workshop on Case Studies on Grid Applications at GGF10 (2004).
[3] The National Fusion Collaboratory, http://www.fusiongrid.org.
[4] S.M. Flanagan, J.R. Burruss, C. Ludescher, D.C. McCune, Q. Peng, L. Randerson, D.P. Schissel, "A General Purpose Data Analysis System with Case Studies from the National FusionGrid and the DIII-D MDSPlus between pulse analysis system."
[5] J.R. Burruss, T.W. Fredian, M.R. Thompson, "Simplifying FusionGrid Security," Challenges of Large Applications in Distributed Environments (CLADE) Workshop at HPDC-14, July 2005, Research Triangle Park, NC.
[6] V. Welch, F. Siebenlist, I. Foster, J. Bresnahan, K. Czajkowski, J. Gawor, C. Kesselman, S. Meder, L. Pearlman, S. Tuecke, "Security for Grid Services," Twelfth International Symposium on High Performance Distributed Computing (HPDC-12), IEEE Press, June 2003.
[7] The Data Visualization and Analysis Platform (IDL), http://www.ittvis.com/idl/.
[8] Globus Toolkit, http://www.globus.org/toolkit/docs/2.4/.
[9] Tomcat, http://tomcat.apache.org/.
[10] Jetty, http://www.mortbay.org/.
[11] JBoss, http://labs.jboss.com/portal/jbossportal.
[12] WebSphere, http://www-306.ibm.com/software/websphere/.
[13] Java Portlets, final release October 2003, http://jcp.org/aboutJava/communityprocess/final/jsr168/.
[14] OGCE, http://www.collab-ogce.org/ogce2/.
[15] GridSphere, http://www.gridsphere.org/gridsphere/gridsphere.
[16] Pubcookie, http://www.pubcookie.org/.
[17] J. Basney, M. Humphrey, and V. Welch, "The MyProxy Online Credential Repository," Software: Practice and Experience, Volume 35, Issue 9, July 2005, pages 801-816; also http://grid.ncsa.uiuc.edu/myproxy/.
[18] M. Freemon, http://grid.ncsa.uiuc.edu/myproxy/talks.html.
[19] J.R. Burruss, T.W. Fredian, M.R. Thompson, "ROAM: An Authorization Manager for Grids," to appear in fall 2006 in the Journal of Grid Computing.
[20] W. Pfeifer, R.H. Davidson, R.L. Miller, and R.E. Waltz, General Atomics Report GA-A16178 (1980).
[21] J.A. Stillerman et al., "MDSPlus," Rev. Sci. Instrum. 68, 939 (1997).
[22] Elvis, http://w3.pppl.gov/elvis/.


Portal-based Support for Mental Health Research

David Paul 1, Frans Henskens 1, Patrick Johnston 2 and Michael Hannaford 1

1 School of Electrical Engineering & Computer Science, The University of Newcastle, N.S.W. 2308, Australia
2 Centre for Mental Health Studies, The University of Newcastle, N.S.W. 2308, Australia

Abstract. This paper describes experiences with the use of the Globus toolkit and related technologies for development of a secure portal that allows nationally-distributed Australian researchers to share data and application programs. The portal allows researchers to access infrastructure that will be used to enhance understanding of the causes of schizophrenia and advance its treatment, and aims to provide access to a resource that can expand into the world’s largest on-line collaborative mental health research facility. Since access to patient data is controlled by local ethics approvals, the portal must transparently both provide and deny access to patient data in accordance with the fine-grained access permissions afforded individual researchers. Interestingly, the access protocols are able to provide researchers with hints about currently inaccessible data that may be of interest to them, providing them the impetus to gain further access permissions.

1 Introduction

Schizophrenia is a brain disease that affects approximately 0.6-1.5% of the population, with an incidence of 18-20 cases per 100,000 per year [9]. Although prevalence is low, the burden of the illness upon society and upon sufferers and their families is extremely high. The World Health Organisation, for example, rates schizophrenia amongst the ten leading causes of disease burden. The disorder involves severe cognitive, affective and perceptual dysfunctions which, at an overt behavioural level, manifest themselves in terms of delusional beliefs and disorganised behaviours; perceptual disturbances including, particularly, auditory hallucinations; and lack of motivation and a general decline in personal and social functioning. Consequently, it is a disease associated with very high costs to government (AUD35,000 per patient per year) [1] and extremes of social impoverishment and economic disadvantage [10].

Recent scientific advances have led to a model of schizophrenia that recognises the role of abnormal neuro-developmental and/or neurodegenerative processes in altering the structure and function of the brain. Until relatively recently, detailed images of cerebral morphology could only be obtained from post-mortem tissue. The limitations of the traditional tissue-based approach to neuropathology can potentially be overcome through the use of neuroimaging technologies. Neuroimaging techniques offer the potential for in vivo studies of brain structure as well as function, thus overcoming problems relating to tissue degeneration post-mortem, the invariably small samples of post-mortem brains and, of course, the obvious fact that the tissue is derived from deceased persons. Moreover, techniques such as magnetic resonance imaging (MRI) allow for repeated testing of the same individuals, and thus longitudinal studies may be undertaken. A further advantage of MRI is that it may be employed to produce high-resolution three-dimensional digital representations of brain structure. This approach lends itself more easily to the sharing and distribution of the primary source data (i.e. digital images) among research teams than do traditional approaches in neuropathology (where the brain tissue itself is the primary source data). It also supports the application of computational image processing techniques for the precise definition, localisation and measurement of brain structures.

The heritability of schizophrenia is of the order of 70-80%. However, the inheritance pattern is not the classical Mendelian type. As with other complex diseases (e.g. diabetes, cardiovascular disease), it is believed to involve a number of contributing genes, each of small effect, interacting with each other and with environmental factors. With this in mind, traditional genetic research approaches based on the diagnostic category of schizophrenia need to be modified if we are to further our understanding of the genetic basis of this disease. A more recent approach in schizophrenia research has been to investigate discrete neurobiological or neurocognitive characteristics that may be more closely linked to a particular gene [8, 12] rather than the clinical syndrome diagnosed as schizophrenia. These characteristics, known as endophenotypes, can assist researchers in unravelling the complex genetic causality of schizophrenia and help to identify individuals who carry the genetic trait for these discrete deficits [20].

The NISAD/LONI Virtual Brain Bank [14] primarily consists of a large distributed database of high-resolution 3D computer representations of the brains of approximately 250 schizophrenia patients and age/gender-matched healthy control subjects, derived from structural MRI images and transformed into a standardised spatial coordinate system. The purpose of this bank is to provide a resource for the analysis of subtle structural variations between the brains of schizophrenia patients and healthy controls, and to map brain changes that occur as a result of variables such as age, gender, duration of illness and duration of untreated psychosis. The brain bank also provides the opportunity to explore associations between brain structure and clinical or neurocognitive measures, gene expression or genetic linkage data, and functional measures of brain activity such as functional MRI (fMRI) or event-related potentials (ERPs).

A further example of the significant impact of data-access-enabling infrastructure on research was the National Institute for Schizophrenia and Allied Disorders (NISAD) [15] Schizophrenia Research Register. It was intended that the Virtual Brain Bank would act as the foundation to which could later be added putative endophenotype measurements derived from the Schizophrenia Register participants and other neurocognitive studies of schizophrenia, as well as genetic information derived from the DNA Bank and the Laboratory of Neuro Imaging (LONI) [13]. Such integrative strategies, combining various methodological approaches, have been shown to considerably further the understanding of the pathology of schizophrenia. The recently established Australian Schizophrenia Research Bank (ASRB) builds on and extends the ideas of such previous facilities to create a nationally accessible resource for schizophrenia researchers in Australia and beyond.

In this paper we describe and discuss issues in the use of primarily Globus-based [4] technology to build a grid [3] that allows geographically distributed researchers to contribute to, initially, the NISAD/LONI Virtual Brain Bank, and now the encompassing ASRB collection of schizophrenia-related data and software resources, in the quest for knowledge on the causes of and treatments for schizophrenia.

This research is supported by The Australian Research Council (ARC) grant SR0566756 (2005-2006). On-going work is supported by the National Health & Medical Research Council (NHMRC) grant AIP/ERP #1679 (2006-2010), and by a grant from the Pratt Foundation (2007-2011).

2 The ASRB Grid

A major issue for schizophrenia research is the expense of the collection of patient data (e.g. MRI brain scans, tissue samples) needed for analysis. The ASRB will have a major impact on schizophrenia research in Australia because it will amortise the high cost and the significant time involved in obtaining data across the national body of researchers. As schizophrenia is likely to involve multiple genes of small effect, access to large sample sizes is key to undertaking studies of sufficient statistical power. With its cross-referenced data in clinical, cognitive, neuroanatomical and genetic domains, the ASRB will make a huge contribution to schizophrenia research on a national scale, enabling multiple research questions to be addressed relatively easily in a large sample that would otherwise be inaccessible or prohibitively expensive for independent investigators to acquire. This large data set will be formed by merging existing data held by groups around the country, and supplementing it with data obtained by a concerted recruitment and collection process.

As the ASRB Grid contains personal patient information, security is of vital importance. Typical Grids require strong security to determine whether a user should have access to a given system, or set of systems, without the need for any fine-grained security: a user is either allowed to access the system or they are not. The ASRB Grid is different because users have different access rights to the resources provided by the Grid, even those on an individual component system. Further, a researcher should be able to perform a preliminary query on data for which they are not currently authorised, allowing them to identify data of interest as a precursor to a request for access to it. For example, it should be possible for the researcher to search for scans exhibiting particular features to determine whether there are sufficient samples to justify requesting access to them. If there were insufficient data items matching their query, it would be a waste of time and resources to request access to the data. If, on the other hand, a sufficiently large extant data set were found (albeit currently unavailable to the individual researcher), a request for access to that existing data would likely be significantly easier (and less expensive) to achieve than collection of new data. Notwithstanding, it is essential that certain aspects of the data, especially information that can identify patients, be inaccessible to any user who has not been given specific rights to access it.

Ethics approvals are necessarily associated with the collection of data and samples from live patients. Such ethics approvals typically specify the project for which data is to be used, and limit the group of researchers who can access the data to, for example, those at a particular institution or in a particular research group. It is also common that most researchers permitted to use and analyse patient data are prevented from being able to identify patients from their data (i.e. the data is de-identified). The extant data collections currently held at the disparate Australian member sites are all subject to existing ethics approvals. Access to the new patient data, for which collection has been funded by the NHMRC, is similarly controlled. Thus, a major and important aim of the ASRB Grid is to provide controlled access to the data available to each particular user of the Grid. The most obvious need is to allow all authorized users to access the newly collected data, but it is also important to allow access to any other data collections for which the user has approval, either through their institution, research group, or personally. A further consideration is that it should be possible for selected personnel to identify patients from their data in the circumstance that analysis has discovered potentially beneficial treatments for those patients.
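The preliminary-query behaviour described above can be illustrated with a small sketch: an unauthorised researcher sees only an aggregate count of matching records, while an authorised one sees the (de-identified) records themselves. The toy data model and field names are invented for illustration and do not reflect the actual ASRB schema.

```python
# Illustrative only: a toy in-memory catalogue, not the ASRB data model.
SCANS = [
    {"scan_id": 101, "collection": "siteA", "diagnosis": "schizophrenia", "age": 24},
    {"scan_id": 102, "collection": "siteA", "diagnosis": "control",       "age": 31},
    {"scan_id": 103, "collection": "siteB", "diagnosis": "schizophrenia", "age": 40},
]

def query_scans(researcher_collections, predicate):
    """Return full records for collections the researcher may access,
    but only a count of matches for everything else."""
    visible, hidden_matches = [], 0
    for scan in SCANS:
        if not predicate(scan):
            continue
        if scan["collection"] in researcher_collections:
            visible.append(scan)      # authorised: full (de-identified) record
        else:
            hidden_matches += 1       # unauthorised: contributes to a count only
    return visible, hidden_matches

records, hint = query_scans({"siteA"}, lambda s: s["diagnosis"] == "schizophrenia")
print(len(records), "accessible records;", hint, "further matches exist elsewhere")
```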

Once the researcher has the data needed for their experiment, they typically would execute computer programs to analyse this data. At present this can involve manually collecting the data into a compressed archive, sending it to, for example, Los Angeles via FTP, and waiting for the results to be returned. At the remote processing site, a user must extract the data, schedule it for analysis, collect the results and then return them to the initiating researcher. Other less compute-intensive tasks can be controlled by a single user, though these still require manual scheduling on computers in Australia, which can be time consuming, increasing the time needed by the researcher to do their job. It is intended that by utilizing compute servers in the ASRB Grid, this hands-on approach to computer-based analysis can be reduced, with researchers simply submitting the job to the Grid, after which the Grid automatically schedules and runs the job, collects the results, and returns them to the researcher, with no further human interaction required.

A final and important requirement of the ASRB Grid is that it should be easy to use, and provide reasonable performance and feedback. If the user interface to the new infrastructure is too complex, or if the performance is pedestrian, users will prefer to continue using the familiar old methods, with all their problems. Thus, use of the new system must be as intuitive as possible, and should hide or abstract over all unnecessary complexity. This means that sensible defaults should be chosen for all options, and a consistent interface should be provided to enable the researchers to concentrate on their research rather than being caught up dealing with the vagaries of the computer support.

3 Support for Fine-grained Security

To make the ASRB Grid as accessible as possible, it was decided at an early stage that Web services should be used wherever possible. It was also a preference of the Australian Research Council that the Globus Toolkit 4 [4] be used. Thus Globus was chosen as the software to provide the grid framework. Version 4 of the Globus Toolkit is mainly built on the Web Service Resource Framework (WSRF), which allows Web services to have state, so that after a request has been made the service can later be queried to obtain updated information about the task.

As data access is a very important part of the ASRB Grid, two important components of the Globus Toolkit for this project are GridFTP [5] and OGSA-DAI (Open Grid Services Architecture Data Access and Integration) [17]. GridFTP is an extension to regular FTP that supports using Globus credentials for authorization and authentication. It has been extended in Globus Toolkit 4 with the Reliable File Transfer service, a Web service for managing secure third-party GridFTP transfers. OGSA-DAI is middleware designed to give secure access to data stores such as relational databases, and to integrate data from different sources via the Grid. It allows relational databases to be accessed using WSRF, giving the ability to access them securely via Web services.

It was decided that a Web portal should be used to access the Grid systems, as this eliminates the need for researchers to install special software on their machines, providing flexibility with respect to client location and host computer. The portal framework chosen is GridSphere [16], with GridPortlets [19] used to access the Grid. GridSphere is an open-source portal framework fully compliant with the JSR 168 specification, so any standards-compliant portlet can be used with GridSphere. GridPortlets is a set of portlets for GridSphere that provide access to Grid resource and user credential management, as well as GridFTP operations and many other useful Grid activities. The GT4Portlets extension allows the execution of jobs on remote Globus Toolkit 4 systems, and further enhances GridPortlets' compatibility with the newest version of Globus.

In order to supply users with credentials to access ASRB Grid resources, a SimpleCA certificate authority is being established. To further facilitate the researcher’s use of the system, PURSe portlets [2] are used to eliminate the user’s need to knowingly interact with this system. Using these portlets, a user fills in a Web-based form to request an account. The user is then sent an email to verify their request and an administrator is informed of the request. The administrator can accept or reject the user, and has the capability to provide the user with access to an account on the Grid; ultimately the user is informed by email of the result. Provided the user is accepted, appropriate Grid credentials are automatically created for the user and a proxy certificate stored for them in a MyProxy server. The user can then log in to the Web portal, using a password supplied by them in their initial request, and a proxy certificate is automatically retrieved from the MyProxy server. This proxy certificate will then be available for access by the portlets in the Web portal. The portlets use these credentials to authenticate with any Grid resources in a manner that is completely transparent to the user.
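The registration workflow described above can be summarised as a simple state progression. The states and transitions below are an illustrative reading of that description, not the actual PURSe portlets implementation.

```python
# Illustrative account-request state machine; not the PURSe portlets code.
TRANSITIONS = {
    ("requested", "email_verified"): "verified",
    ("verified", "admin_approved"): "approved",
    ("approved", "credentials_created"): "active",   # proxy then stored in MyProxy
    ("verified", "admin_rejected"): "rejected",
}

def advance(state, event):
    """Move an account request to its next state, or raise if the event is invalid."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state!r}") from None

state = "requested"
for event in ("email_verified", "admin_approved", "credentials_created"):
    state = advance(state, event)
print(state)  # active
```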

Since identified patient data will be stored on the ASRB Grid, it is vitally important that researchers can access only the data for which they are approved (resulting from ethics approval or otherwise). As a result, users must be given different levels of access to resources based on both their own identity and the groups to which they belong. The Globus Toolkit includes a component that can be used for this purpose: the Community Authorization Service (CAS) [6] (not to be confused with JA-SIG's Central Authentication Service [11]). CAS allows resource providers to give coarse-grained access to various systems, handing finer-grained access control management to the community of users. This is important for the ASRB Grid because there are very complex levels of access for different data resources, so fine-grained control is needed, and the complexities of these relationships can best be handled by the users themselves. GridFTP is the only component of the Globus Toolkit that supports CAS out of the box, though OGSA-DAI can be extended to support CAS with very little impact on performance [18].

Much of the Globus Toolkit is currently accessible only through the use of command-line statements. Technologies such as the CoG Kits [22] and GridPortlets make access to Globus Grids much easier, but the CAS technologies that we have chosen to use have really only been usable from the command line. Thus, one of the first things needed by this project was a set of portlets for accessing CAS. A portlet that allows authorized users to manage CAS entities has been created. With this facility, users with the correct CAS permissions are able to view, create, and delete CAS entities, such as groups or service actions. In addition, the portlet provides the ability to grant and revoke rights to groups and services. CAS will thus also be used by administrators to grant access to various database tables through OGSA-DAI.
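The division of labour CAS provides (coarse-grained grants by the resource owner, finer-grained rights managed by the community) can be sketched conceptually as follows. This is not the CAS API; the class and names are illustrative only.

```python
# Conceptual sketch of community-managed authorization, not the CAS API.
class CommunityAuthz:
    def __init__(self):
        self.groups = {}      # group name -> set of member identities
        self.rights = set()   # (group, action, resource) tuples granted by the community

    def add_member(self, group, identity):
        self.groups.setdefault(group, set()).add(identity)

    def grant(self, group, action, resource):
        self.rights.add((group, action, resource))

    def revoke(self, group, action, resource):
        self.rights.discard((group, action, resource))

    def permits(self, identity, action, resource):
        return any(identity in self.groups.get(g, set())
                   for (g, a, r) in self.rights
                   if a == action and r == resource)

authz = CommunityAuthz()
authz.add_member("newcastle-imaging", "/C=AU/O=ASRB/CN=Jane Researcher")
authz.grant("newcastle-imaging", "read", "gridftp://store.example.org/scans/")
print(authz.permits("/C=AU/O=ASRB/CN=Jane Researcher", "read",
                    "gridftp://store.example.org/scans/"))   # True
```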

4 Future Work

Development of the ASRB Grid is very much an on-going project, and there are a number of parallel development tasks in progress, as described in the following sub-sections.

4.1 Description of Patient Data

The above security framework is designed to provide tightly controlled access to resources such as data and computation. To date, much of the extant patient data has not been available on-line; rather, the data are stored on CDs or DVDs in researchers' offices, and these must be moved to on-line storage subsystems so they can be accessed using the Grid. A further issue is the existence of aggressive firewalls that have been used to protect confidentiality of patient data at some of the host sites. The recently-funded collection of substantial quantities of new patient data has not yet begun but is imminent, so provision of infrastructure for storage and processing of that new data is a priority. In parallel, it is necessary to finalise the meta-data description of the heterogeneous extant (and the homogeneous future) data that will be accessible through the ASRB Grid. Until this significant task is completed, no specific tools development can take place.

4.2 Extension of Portlet Support

To date, there has been a paucity of reported development of portlets to access OGSA-DAI resources, especially for OGSA-DAI secured by CAS. While some OGSA-DAI portlets have been developed, they currently do not provide the level of support for security required by this implementation, and so must be extended to provide the necessary security. It will also be necessary to create or modify some GridFTP portlets to include CAS functionality so that researchers are able to easily share their data with the groups to which they wish to provide such access. It is also planned to create a new PURSe Portlets registration module to automatically enrol users in various CAS groups when their account is created. This will include placing them in a group over which they have complete control, as well as giving them exclusive access to space on a GridFTP server. Users will then be able to create their own self-controlled groups, allowing them to share their data with authorised users while asserting as much fine-grained control as is necessary. There would be no requirement for administrator intervention in the establishment and control of such groups.

4.3 Abstraction over Distributed File Storage

A service that allows users to create logical folders, providing a window onto data on all the different GridFTP servers to which they have access, will also be integrated into the system. Users will simply see a familiar folder-like structure containing sub-folders and files. This is achieved using the Globus Replica Location Service (RLS) [7] and a system that maps a set of logical files to a set of logical folders; the actual files in the folders may be stored on any of the GridFTP servers available to the user of the Grid. The various locations of the data available to the user are abstracted away by this service, allowing users to simply see their data without regard for the location at which it is stored.
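A minimal sketch of the kind of logical-to-physical mapping such a service maintains is given below. The folder layout, host names and file names are invented for illustration, and a real deployment would obtain the physical replicas from RLS rather than from a hard-coded table.

```python
# Illustrative mapping only; in the real system the physical locations
# would come from the Globus Replica Location Service (RLS).
LOGICAL_FOLDERS = {
    "/home/jane/scans": ["lfn:scan-101.nii", "lfn:scan-103.nii"],
}
REPLICAS = {
    "lfn:scan-101.nii": ["gsiftp://storeA.example.org/data/scan-101.nii"],
    "lfn:scan-103.nii": ["gsiftp://storeB.example.org/mri/scan-103.nii"],
}

def list_folder(path):
    """Show a folder the way the user sees it: logical names only."""
    return LOGICAL_FOLDERS.get(path, [])

def resolve(logical_name):
    """Pick one physical replica for a logical file (location hidden from the user)."""
    replicas = REPLICAS.get(logical_name, [])
    return replicas[0] if replicas else None

for lfn in list_folder("/home/jane/scans"):
    print(lfn, "->", resolve(lfn))
```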

4.4 Access to Data Processing and Analysis Facilities

The ultimate aim of the ASRB Grid infrastructure is to provide researchers with the ability to analyse (subsets of) the data collection, leading to advances in the understanding and treatment of schizophrenia. While it will be possible (subject to access rights) for researchers to download data to their own machines to perform analysis, there will be tasks that will benefit from access to the parallel resources of the Grid. For example, the data associated with a single MRI scan can exceed one gigabyte, and transfer of such quantities of data across the Internet is expensive with respect to time (noting that some of the member sites are up to 4,000 kilometres apart). Analysis of such data is more efficiently performed by positioning the computation close to the data source, with high-bandwidth data path(s) joining them. Unfortunately, automatically executing a task on a set of remote machines is difficult. Projects such as GT4Portlets allow the execution of jobs on a single remote machine, and projects such as the Gridbus Broker [21] automatically allocate tasks to servers, but the interfaces to these are very general. Thus a further task for this project is to create a portlet wizard that allows the easy creation of a portlet to execute a particular application. It is envisaged that these portlets will be based on the Gridbus Broker, but will enable researchers to choose input files and set parameters using a simple, easily understandable Web form, as sketched below. The provision of a wizard will make it easy for developers to create portlets for many different programs. If a specific program has special needs, however, developers will still have access to the full source code so that the portlet can be modified as needed. This will enable researchers and developers to use the processing capabilities of the distributed compute servers much more easily than is currently possible.
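The following is a small sketch of the idea: a web-form submission (represented here as a dictionary) is turned into a job description that a broker could schedule. The field names and the job-description format are assumptions made for illustration; they are not the Gridbus Broker or GT4Portlets interfaces.

```python
# Illustrative translation of a simple web form into a job description.
def form_to_job(form):
    return {
        "executable": form["application"],        # analysis program chosen via the wizard
        "arguments": ["--smoothing", form["smoothing_mm"]],
        "input_files": form["input_files"],       # logical names resolved by the data service
        "output_folder": form["output_folder"],
        "stage_to_data": True,                    # prefer running close to the data
    }

form = {
    "application": "segment_brain",
    "smoothing_mm": "8",
    "input_files": ["lfn:scan-101.nii", "lfn:scan-103.nii"],
    "output_folder": "/home/jane/results",
}
print(form_to_job(form))
```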

5 Conclusion

This paper introduces a project that uses the Globus toolkit and related technologies to allow Australian mental health researchers to share data and application programs in their quest for an understanding of schizophrenia and, ultimately, improvements in its treatment. A web services portal that provides fine-grained control over user access to resources is described. This portal simultaneously provides simple authentication-based access for users and certificate-based access to subsets of the entire resource collection. Users are unaware of host network boundaries and the need for separate authentication at the disparate sites and servers; these requirements are abstracted away by the portal. The ASRB Grid is very much a work in progress. On-going development of abstractions over distributed data storage, remote compute services and the portal is also presented. These facilities will result in: nested folders that provide consistent access to locally and remotely stored data; intuitive wizard-based access to distributed compute servers and application programs; and the ability for users to provide individuals and/or groups with controlled access to their personal data store.

6 References

1. Carr, V., Lewin, T., Neil, A., Halpin, S., and Holmes, S., Premorbid, psychosocial and clinical predictors of the costs of schizophrenia and other psychoses. British Journal of Psychiatry, 2004. 184: p. 517-525.
2. Christie, M., PURSe Portlets Website, http://www.extreme.indiana.edu/portals/purse-portlets.
3. Foster, I. and Kesselman, C., The Grid: Blueprint for a New Computing Infrastructure. 1999: Morgan Kaufmann.
4. Foster, I., Globus Toolkit Version 4: Software for Service-Oriented Systems. In IFIP International Conference on Network and Parallel Computing. 2005: Springer-Verlag.
5. Globus, GT 4.0 GridFTP, http://www.globus.org/toolkit/docs/4.0/data/gridftp/.
6. Globus, GT 4.0: Security, http://www.globus.org/toolkit/docs/4.0/security/.
7. Globus, RLS: Replica Location Service, http://www.globus.org/rls/.
8. Gottesman, I.I., McGuffin, P., and Farmer, A.E., Clinical genetics as clues to the real genetics of schizophrenia (a decade of modest gains whilst playing for time). Schizophrenia Bulletin, 1987. 13(1): p. 23-47.
9. Gureje, O. and Bamidele, R.W., Gender and schizophrenia: association of age at onset with antecedent, clinical and outcome features. Australia and New Zealand Journal of Psychiatry, 1998. 32(3): p. 415-423.
10. Jablensky, A., Epidemiology of schizophrenia: the global burden of disease and disability. European Archives of Psychiatry and Clinical Neuroscience, 2000. 250(6): p. 274-285.
11. JA-SIG, JA-SIG Central Authentication Service, http://www.ja-sig.org/products/cas.
12. Kremen, W.S., Faraone, S.V., and Seidman, L.J., Neuropsychological risk indicators for schizophrenia: a preliminary study of female relatives of schizophrenic and bipolar probands. Psychiatric Research, 1998. 79(3): p. 227-240.
13. LONI, Laboratory of Neuro Imaging, http://www.loni.ucla.edu/.

14. NISAD, The NISAD/LONI Virtual Brain Bank, http://www.nisad.org.au/newsEvents/resNews/wwwsczres.asp.
15. NISAD, http://www.nisad.org.au/.
16. Novotny, J., Russell, M., and Wehrens, O., Gridsphere: A Portal Framework for Building Collaborations, Gridsphere Project Website.
17. OGSA-DAI, OGSA-DAI Software, http://www.ogsadai.org.uk/index.php.
18. Pereira, A., Muppavarapu, V., and Chung, C., Role-Based Access Control for Grid Database Services Using the Community Authorization Service. IEEE Trans. on Dependable and Secure Computing, 2006. 3(2): p. 156-166.
19. Russell, M., Novotny, J., and Wehrens, O., The Grid Portlets Web Application: A Grid Portal Framework, Gridsphere Project Website.
20. Trillenberg, P., Lencer, R., and Heide, W., Eye movements and psychiatric disease. Current Opinion in Neurology, 2004. 17(1): p. 43-47.
21. Venugopal, S., Buyya, R., and Winton, L., A Grid Service Broker for Scheduling e-Science Applications on Global Data Grids. Concurrency and Computation: Practice and Experience, (accepted Jan 2005).
22. von Laszewski, G., Gawor, J., Lane, P., Rehn, N., Russell, M., and Jackson, K., Features of the Java Commodity Grid Kit. Concurrency and Computation: Practice and Experience, 2002. 14: p. 1045-1055.


WebGRelC: Towards Ubiquitous Grid Data Management Services Giovanni Aloisio, Massimo Cafaro, Sandro Fiore, Maria Mirto Center for Advanced Computational Technologies, University of Lecce, Italy Center for Euro Mediterranean Climate Changes, Italy {giovanni.aloisio, massimo.cafaro, sandro.fiore, maria.mirto}@unile.it

Abstract Nowadays, data grid management systems are becoming increasingly important in the context of the recently adopted service oriented science paradigm. The Grid Relational Catalog (GRelC) project is working towards an integrated, comprehensive data grid management solution. This paper describes WebGRelC, which is a dedicated grid portal allowing data handling, publishing, discovery, sharing and organization, and its underlying data grid services.

1. INTRODUCTION

As pointed out by Ian Foster, we have in the last few years moved towards a new, service oriented science [1] in which software is envisioned as services, and services as platforms. Increasingly, services are not only computation rich but also data rich, producing a huge amount of data distributed across multiple data servers. There is a growing need for a grid infrastructure allowing scientific communities to share data securely, efficiently and transparently. Datasets, once created, need to be visualized, published, downloaded, annotated, etc. Discovery mechanisms, such as searchable metadata directories, must be provided to find relevant data collections. Integration and federation services need to cope with independently managed legacy datasets, to infer new knowledge from existing distributed data. Although fundamental building blocks such as distributed file systems and semantic storage already exist, data grid management systems are still in the pioneering phase. The main challenge is to design and implement reliable storage, search and transfer capabilities for numerous and/or large files over geographically dispersed heterogeneous platforms.

The Grid Relational Catalog (GRelC) project [2], a data grid research project developed at the Center for Advanced Computational Technologies (CACT) at the University of Lecce, is working towards an integrated, comprehensive data grid management solution. It provides, besides traditional command line and graphical interfaces, a dedicated grid portal allowing data handling, publishing, discovery, sharing and organization. Grid portals are web gateways to grid resources, tools and data. They hide the underlying grid technologies and provide advanced problem solving capabilities to solve modern, large scale scientific and engineering problems. This paper describes WebGRelC, which is the GRelC grid portal, and its underlying data grid services.

The outline of the paper is as follows. In Section 2, we present the GRelC data grid services underlying the portal, whereas in Section 3 we describe the portal architecture. In Section 4, we discuss the current implementation, technologies and related issues; in Section 5 we present a use case related to the use of the portal for a bioinformatics experiment, whilst Section 6 recalls related work. We draw our conclusions in Section 7.

2. GRELC DATA GRID SERVICES

The main goal of the GRelC project is to provide a set of data grid services to access, manage and integrate data (i.e. databases and files) in grid environments. The GRelC data grid services already implemented are the Data Access Service (DAS), the Data Storage Service (DSS) and the Data Gather Service (DGS).

The Data Access Service (DAS) has been designed to provide a uniform, standard interface to relational and non-relational (i.e. textual) data sources. It is an intermediate layer which lies between grid applications and Database Management Systems.

The Data Storage Service (DSS) provides a comfortable, lightweight solution for disk storage management. It efficiently and transparently manages huge collections of data spread across grid environments, promoting flexible, secure and coordinated storage resource sharing and publication across virtual organizations. Besides data handling and remote processing operations, the DSS also provides publication and information discovery capabilities, as needed due to the large number of stored objects. The DSS represents a high-performance implementation of the grid workspace concept, which is a virtualized and grid-enabled storage space that a community of users can use to share and manage their files and folders, taking into account fine-grained data access policies. Grid workspaces represent grid storage spaces accessible by authorized users sharing common interests. Within a DSS, data is fully organized into workspaces and, for each one of them, the DSS admin must define a set of authorized users, groups and VOs, the workspace administrators and the physical mounting point.

Finally, the Data Gather Service (DGS) offers data federation capabilities, providing a second level of virtualization (data integration). This service, which lies on top of DASs or DSSs, allows the user to look at a set of distributed data sources as a single logical entity, thus implementing a data grid federated approach.

The proposed WebGRelC architecture provides ubiquitous web access to widespread grid-enabled storage resources and metadata, so it is currently concerned with both the Data Storage and Data Access Services. Clients are also available both as a command line and a graphical interface to manage collections of files, but the Grid Portal interface better addresses and increases both transparency and pervasiveness.
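As an illustration, the per-workspace configuration a DSS administrator defines might look something like the following. The field names and values are invented for this sketch and do not correspond to the actual GRelC configuration syntax.

```python
# Illustrative workspace definition only; not the actual GRelC DSS
# configuration format.
workspace = {
    "name": "climate-sim",
    "mount_point": "/data/dss/climate-sim",          # physical mounting point
    "authorized_users": ["/O=GRelC/CN=Maria Mirto"],
    "authorized_groups": ["cact-staff"],
    "authorized_vos": ["euromed-climate"],
    "administrators": ["/O=GRelC/CN=Sandro Fiore"],
}
```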

providing mutual authentication among grid services, users and machines, data encryption and delegation support. Moreover, security concerning HTTP Internet connections needs to be properly addressed. Within our system, we basically chose to adopt the Globus Grid Security Infrastructure [3]; b) user-friendliness: the portal must provide user-friendly web pages to simplify the interaction between the users and the data grid environment. Our choice leverages current web technologies, including XHTML, CSS etc. but we also plan to switch to recent developments, including the use of portlets, Java Server Faces etc.; c) pervasiveness: the proposed solution leverages the pervasiveness of Web technology to provide users with ubiquitous grid data management facilities. It is worth noting here that client requirements consist just of a standard web browser; d) transparency: the proposed grid portal must conceal a lot of details about grid storage components, data transfer protocol details, heterogeneous storage resources, technological issues, command line parameters/options, and so on. This requirement is multifaceted and in a data grid environment it concerns, among the others, access, location, namespace, concurrency and failure transparency. Several design and technical choices highlighted within this work address all of these transparency issues. 3.2 Portal Architecture The grid portal architecture (Fig. 1) follows a standard three-tier model. The first tier is a client browser that can securely communicate to a web server over an HTTPS connection (no other specific requirements are imposed). On the second tier, the Web Server implements the WebGRelC portal (WGP) which consists of several components (see Section 3.2.1) leveraging (i) GRB [4], and (ii) GRelC DSS, DAS and SDAI client libraries. WebGRelC interacts with a MyProxy server [5] for secure user’s credentials (proxy) storage and retrieval and with the Portal Metadata Catalogue (PMC) to manage user’s profiles. Finally, GRelC DSSs are deployed on the third tier, the data grid infrastructure, providing a lightweight and grid enabled solution for disk storage management. 3.2.1

WebGRelC Portal

WebGRelC represents the core of the proposed architecture, implementing a grid portal able to

retrieve data and present them (via HTTP protocol), within HTML pages. As can be seen in Fig. 2 the WebGRelC portal consists of the following: - Profile Manager: it handles the user’s profile managing metadata stored within the Portal Metadata Catalog (PMC relational database). It allows (i) inserting, updating and deleting personal information as well as (ii) managing a list of available grid enabled storage resources, workspaces, etc. Currently, the PMC runs on different DBMSs by means of the GRelC Standard Database Access Interface (SDAI, see Section 4.3); - Credential Manager: it allows configuring the credentials to be used for a given set of resources, retrieving them from a MyProxy server. After this initial configuration step, the WebGRelC grid portal transparently retrieves the credentials needed to access specific data sources; - Remote Administration: it provides basic functionalities to access and manage metadata information stored within GRelC DSS Metadata Catalog. Through the portal, the user can remotely manage administration information about (i) users, groups and VOs, (ii) internal workspaces configurations, (iii) data access control policies, etc. Moreover, the proposed grid portal provides admin sections for logging (to display information related to all of the operations carried out at the DSS side), and check-coherence (to report system coherence problems between data (content) and metadata (context));

(iv) data access, creation and deletion capabilities (Posix-like oriented). File Manager also supports parallel and partial file transfer as well as file copy between two GRelC DSSs (both push and pull mode are currently available). Along with synchronous functionalities, the proposed grid portal supports the GRelC Reliable File Transfer (G-RFT) mechanism, an asynchronous service primarily intended to be used to reliably copy files from one DSS to another one. Users can submit G-RFT requests simply by filling out a web form containing information about (i) DSS source and destinations, (ii) myproxy access parameters, (iii) file transfer options related to data transfer protocol (HTTPG, GridFTP) and data transfer mode (push or pull), (iv) request options connected with priority level of the G-RFT request, the maximum number of retries to be used in case of failed data transfer, related delay and backoff (linear, exponential). Moreover, within the WGP, users can display status and options about submitted G-RFT requests, abort a data transfer, as well as resubmit again G-RFT requests; - Metadata Manager: it provides basic functionalities to (i) annotate files, that is to publish and manage metadata at the GRelC DAS side, (ii) display metadata information about files, folders, workspaces and schema, (iii) manage metadata schema modifying the list of elements associated to the stored objects and (iv) query metadata information in order to retrieve a list of objects satisfying conjunctive search conditions (basic digital libraries capabilities).

Figure 1. WebGRelC architecture

Figure 2. Detailed view of WebGRelC architecture

3.3 Security Issues

To log in to the WGP the user must supply the correct username and password. The portal security model includes the use of the HTTPS protocol for secure communication with the client browser and secure cookies to establish and maintain user sessions. Moreover, we decided to adopt the Globus Toolkit Grid Security Infrastructure (GSI), a solution widely accepted and used in several grid projects, in order to perform the following security tasks within the data grid environment:

- mutual authentication between the WGP and the MyProxy server, DAS or DSS;
- communication protection (by means of data cryptography) for the data exchanged between the WGP and the MyProxy server, DAS or DSS;
- delegation mechanisms to perform data management tasks on the grid.
Finally, SSL support is provided when the WGP Profile Manager interacts with the PMC.

3.4 Metadata

Metadata management is crucial within such a data grid system. To aid scientists in discovering interesting files within collections of thousands or millions of files, the proposed architecture must provide, besides basic data management facilities, metadata publication and semantic search capabilities. Currently, metadata management concerns two different types of metadata: internal (or low-level) and application-oriented (or high-level). In the former case, metadata is related to the physically stored object (for example creation date, file owner, size, etc.) and is system defined, so users can modify neither the metadata attributes (schema) nor their values (instances). This kind of metadata is managed by the GRelC DSS. In the latter case, metadata is application specific and the related schema can vary depending on the particular context; in this case, through the portal, authorized users can annotate both files and workspaces, adding, deleting and updating application-level metadata and the related metadata schema. Application-oriented metadata are stored within the GRelC DAS. Finally, to provide a basic semantic search capability, a simple search form has been provided within the WGP.

4. WEBGRELC IMPLEMENTATION

In the following subsections we describe the WebGRelC implementation, discussing the grid portal technologies, grid middleware, and GRelC and GRB libraries involved.

4.1 Grid Portal Technologies

The WebGRelC grid portal has been developed using the Model View Controller design pattern. To efficiently address performance and modularity we adopted the FastCGI technology leveraging the Apache Web Server.

Taking into account the main subcomponents of the WGP, the portal provides XHTML-based web pages to:
- support credential delegation and single sign-on to the Grid;
- configure grid storage resources and services as well as manage the grid portal user profile;
- manage DSS users, groups and VO authorizations;
- upload and download files as well as transfer data from one storage resource to another;
- perform activities related to digital libraries (metadata-based search engine) by accessing the GRelC DAS;
- manage metadata and metadata schema related to the objects stored within the storage resources;
- submit and manage G-RFT requests.

4.2 Middleware

The current version of WebGRelC is based on the Globus Toolkit 4.0.3 (the latest stable release as of November 2006) as grid middleware; basically, we exploited the Globus GSI libraries. Web service components such as the GRelC DSS and DAS strongly exploit the gSOAP Web Services development toolkit [6]. It offers an XML to C/C++ language binding to ease the development of SOAP/XML Web services in C and C/C++. gSOAP provides a transparent SOAP API using proven compiler technology. These technologies leverage strong typing to map XML schemas to C/C++ definitions. Strong typing provides greater assurance on content validation of both WSDL schemas and SOAP/XML messages. As a result, SOAP/XML interoperability is achieved with a simple API, relieving the user from the burden of WSDL and SOAP details and thus enabling her to concentrate on the application-essential logic. The compiler enables the integration of (legacy) C/C++ software in SOAP applications that share computational resources and information with other Web services, possibly across different platforms, language environments, and disparate organizations located behind firewalls. Finally, to guarantee a secure data communication channel between the WGP and the DSS or DAS, we utilized the GSI support available as a gSOAP plug-in [7]. We did not use the Globus Toolkit 4.0.3 C WS Core, which is a C implementation of WSRF (Web Services Resource Framework), because it lacks a usable authorization framework. Even though it is possible to develop WSRF grid services in C (we are already migrating our software to the Globus Toolkit implementation of WSRF), it is not possible to deploy production-level services.

Indeed, grid services deployed in the C WS Core container can only use the default SELF authorization scheme (a client is allowed to use a grid service only if the client's identity is the same as the service's identity), which is useless for a production service. Unfortunately, the globus-wsc-container program does not have options to handle different authorization schemes. A possible solution could be the development of a customized service that uses the globus_service_engine API functions to run an embedded container, setting the GLOBUS_SOAP_MESSAGE_AUTHZ_METHOD_KEY attribute on the engine to GLOBUS_SOAP_MESSAGE_AUTHZ_NONE to omit the authorization step and then using the client's distinguished name to perform authorization. However, with the Globus Toolkit 4.0.3 it is not possible for a C grid service to retrieve the distinguished name of a client contacting it, so this is not a viable option and we have to wait for the next stable release of the Globus Toolkit (4.2), which should provide major enhancements to the C WS Core, including a usable authorization framework.

4.3 GRelC Libraries

The GRelC libraries mainly address data management activities. Basically, the WebGRelC grid portal exploits the GRelC SDAI, SDTI, DAS and DSS libraries. The GRelC SDAI library (used within the WGP Profile Manager component) provides transparent and uniform access to the PMC relational database. It exploits a plug-in based architecture leveraging dynamic libraries. Currently, SDAI wrappers are available for the PostgreSQL, MySQL, SQLite, Unix ODBC and Oracle DBMSs. An SQLite-based PMC has a twofold benefit: it provides very good performance (due to the embedded database management) and it increases service robustness and reliability because it does not depend on an external DBMS server. The GRelC SDTI provides a data management library to transfer files between the WGP and a DSS (get/put) or between pairs of DSSs (copy). Basically, it is a C library (leveraging a plug-in based approach) which virtualizes the data transfer operations, providing high-level interfaces for get/put/copy (parallel and partial). Two basic modules related to GridFTP and HTTPG (HTTP over GSI) are currently available. Further drivers covering additional protocols (such as FTP, SFTP or SCP) are actively being developed and will be easily added to the system thanks to the modular design and implementation of this library.

The SDTI library is used within the WGP File Manager component. The GRelC DAS client library provides many functionalities related to the DAS component. Among other things, it allows (i) managing metadata, (ii) submitting semantic queries, (iii) browsing metadata, etc. This library is used within the Metadata Manager. Finally, the GRelC DSS client library provides many functionalities related to the interactions with the DSS components. Among other things, it allows (i) managing file transfers and workspaces, (ii) submitting, monitoring and deleting G-RFT requests, (iii) managing user, group and VO membership, etc. This library is extensively used within the File Manager and Remote Administration components of the WGP.

4.4 GRB Libraries

The GRB software is developed within the Grid Resource Broker project at the CACT of the University of Lecce. Currently, the GRB team supplies users with several production libraries mainly connected with (i) job submission, (ii) resource discovery, (iii) credential management and (iv) grid file transfer. For WebGRelC development, we exploited the grb_gridftp [8] and grb_myproxy libraries to transfer data among grid nodes and to manage and retrieve user credentials, respectively.

5. USE CASE: WEBGRELC FOR BIOINFORMATICS DATA

The WebGRelC Grid Portal is part of the SPACI [9] middleware (with regard to data management operations) and it is also actively used within the SEPAC [10] production grid. Several GRelC DSSs are now deployed in Europe, in Lecce, Naples, Cosenza, Milan and Zurich, managing different storage resources and several tens of thousands of files primarily related to biology experiments. Through the proposed grid portal, bioinformaticians can manage and share their workspace areas and annotate files connected with their experiments. Application-level metadata can be published and stored within the system simply by filling out the web forms provided by the grid portal. Search and retrieval operations on metadata allow users to find the desired objects within the system, displaying query results in several formats (HTML tables, plain text or XML).

More in detail, we defined several workspace areas to store experimental results, protein structures, etc. The bioinformatics workspace contains the files produced by the experiments and the files containing the protein structures, retrieved from the UniProt KnowledgeBase (UniProtKB) data bank [11]. The experiment carries out a multiple sequence alignment (MSA) between each of the human proteins available in the UniProtKB database (about 70,845 sequences, retrieved from the uniprot_sprot_human.dat and uniprot_trembl_human.dat flat files) and those stored in the UniProt NREF data bank. Homologous sequences are then matched to identify functional domains. Indeed, multiple alignment is important for studying regions that are conserved during evolution and that characterize, with good probability, the biological functionality of the sequence. PSI-BLAST (Position Specific Iterative BLAST) [12], available in the NCBI toolkit, has been used for the MSA. After running one experiment we produced about 70 thousand alignment files (in XML format), storing them within the bioinformatics DSS workspace. The XML Schema Definition of the PSI-BLAST output is shown in Fig. 3. Each resulting XML file contains all of the sequences producing significant alignments for a specified number of iterations (in this experiment we considered 2 iterations).

The metadata set describing the experiment includes, among other things, the protein identifier, e-value, score, accession number, etc. Figure 4 illustrates an example query related to the bioinformatics domain.

Figure 3. XSD of the PSI-BLAST output

The user can choose a data workspace and submit a query, choosing the output format. The files matching the search criteria are returned. The user can then select a file of interest in order to display all of the relevant file metadata. She can also copy, annotate or download these files, as needed.

Figure 4. Semantic Search Engine: example query and results

6. RELATED WORK

MySRB [13] is a web-based resource sharing system that allows users to share their scientific data collections with their colleagues. It provides a system where users can organize their files according to logical cataloguing schemes independent of the physical location of the files and associate metadata with these files. MySRB uses the Storage Resource Broker (SRB) [14] and the Metadata Catalog (MCAT) developed at SDSC as its underlying infrastructure. MySRB and the WGP share the same goals. The aim of the DataPortal project [15] is to develop the means for a scientist to explore data resources, discover the data they need and retrieve the relevant datasets through one interface, independently of the data location. Separate instances of the DataPortal are currently being installed as part of the grid environments of the eMinerals and eMaterials projects. The new web services architecture of the portal has allowed easy integration with other services such as the CCLRC HPCPortal. In the current system, these are integrated using standard Web protocols, such as web services, HTTP and SOAP, since support for emerging grid technologies, such as OGSA and grid services, is not available.

7. CONCLUSIONS AND FUTURE WORK

The paper presented an overview of the WebGRelC Grid Portal. We presented the portal architecture and discussed its implementation. WebGRelC bridges the gap between scientists and their data located in grid environments, providing an effective, production-level data grid management system. The portal is currently in production in the European SPACI and SEPAC grids. The recently released WSRF-based Globus Toolkit will be supported in the near future, as soon as the provided tools are stable and mature enough for production usage. Future work related to data grid management concerns a complex semantic search engine (based on a P2P federated approach leveraging GRelC DGSs) developed and tested within the SEPAC production Grid.

References

[1] I. Foster, "Service-Oriented Science", Science, 308, pp. 814-817, 2005.
[2] G. Aloisio, M. Cafaro, S. Fiore, M. Mirto, "The Grid Relational Catalog Project", Advances in Parallel Computing, "Grid Computing: The New Frontiers of High Performance Computing", L. Grandinetti (Ed), pp. 129-155, Elsevier, 2005.
[3] I. Foster, C. Kesselman, G. Tsudik, S. Tuecke, "A Security Architecture for Computational Grids", Proceedings of the 5th ACM Conference on Computer and Communications Security, pp. 83-92, 1998.
[4] G. Aloisio, M. Cafaro, G. Carteni, I. Epicoco, S. Fiore, D. Lezzi, M. Mirto, S. Mocavero, "The Grid Resource Broker Portal", to appear in Concurrency and Computation: Practice and Experience, Special Issue on Grid Computing Environments.
[5] J. Basney, M. Humphrey, V. Welch, "The MyProxy Online Credential Repository", Software: Practice and Experience, 35(9):801-816, 2005.
[6] R.A. van Engelen, K.A. Gallivan, "The gSOAP Toolkit for Web Services and Peer-To-Peer Computing Networks", Proceedings of the IEEE CCGrid Conference, May 2002, Berlin, pp. 128-135.
[7] G. Aloisio, M. Cafaro, I. Epicoco, D. Lezzi, "The GSI plug-in for gSOAP: Enhanced Security, Performance, and Reliability", Proceedings of Information Technology Coding and Computing (ITCC 2005), IEEE Press, Volume I, pp. 304-309.
[8] G. Aloisio, M. Cafaro, I. Epicoco, "Early experiences with the GridFTP protocol using the GRB-GSIFTP library", Future Generation Computer Systems, Volume 18, Number 8 (2002), pp. 1053-1059, Special Issue on Grid Computing: Towards a New Computing Infrastructure, North-Holland.
[9] The Italian Southern Partnership for Advanced Computational Infrastructures. http://www.spaci.it
[10] The Southern European Partnership for Advanced Computing. http://www.sepac-grid.org
[11] A. Bairoch, R. Apweiler, C.H. Wu, W.C. Barker, B. Boeckmann, S. Ferro, et al., "The Universal Protein Resource (UniProt)", Nucleic Acids Res., (33):154-159, 2005. http://www.uniprot.org
[12] S.F. Altschul, T.L. Madden, A.A. Schäffer, J. Zhang, Z. Zhang, W. Miller, D.J. Lipman, "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res., 25(17):3389-3402, 1997. http://www.ncbi.nlm.nih.gov/cgi-bin/BLAST/nphpsi
[13] M. Wan, R. Moore, A. Rajasekar, "MySRB & SRB - Components of a Data Grid", The 11th International Symposium on High Performance Distributed Computing (HPDC-11), Edinburgh, Scotland, July 24-26, 2002.
[14] C. Baru, R. Moore, A. Rajasekar, M. Wan, "The SDSC Storage Resource Broker", Proc. CASCON'98 Conference, Nov. 30 - Dec. 3, 1998, Toronto, Canada.
[15] DataPortal. http://www.e-science.clrc.ac.uk/web/projects/dataportal

Workflow Management Through Cobalt

Gregor von Laszewski (1,2), Christopher Grubbs (3), Matthew Bone (3), and David Angulo (4)

(1) University of Chicago, Computation Institute, Research Institutes Building #402, 5640 S. Ellis Ave., Chicago, IL 60637
(2) Argonne National Laboratory, 9700 S. Cass Ave., Argonne, IL 60439
(3) Loyola University Chicago, Department of Computer Science, Lewis Towers, Suite 416, Water Tower Campus, Chicago, Illinois 60611, USA
(4) DePaul University, School of Computer Science, Telecommunications and Information Systems, 243 South Wabash Ave, Chicago, Illinois 60604, USA

Contents

1 Introduction: Queueing systems and workflows
2 Cobalt: System software for parallel machines
3 Karajan: A workflow scripting language
  3.1 Java2Karajan
4 Qstat Monitor: observing the workflow
5 Implications and possibilities
  5.1 Extensibility: adding PBS system support
  5.2 Further research: graphical workflows in the Qstat Monitor
6 Conclusion
7 Availability

Abstract

Workflow management is an important part of scientific experiments. A common pattern that scientists are using is based on repetitive job execution on a variety of different systems, and managing such job execution is necessary for large-scale scientific workflows. The workflow system should also be client-based and able to handle multiple security contexts to allow researchers to take advantage of a diverse array of systems. We have developed, based on the Java Commodity Grid Kit (CoG Kit), a sophisticated and extensible workflow system that seamlessly integrates with the queueing system Cobalt through the advanced features provided by the CoG Kit.

1 Introduction: Queueing systems and workflows

Parallel computing systems offer scientists incredibly powerful tools for analyzing and processing information. These supercomputers benefit researchers in many diverse fields, from medicine to physics to social sciences.

In order to take advantage of parallel system capabilities, scientists must be able to schedule their jobs with a high degree of confidence, knowing that the jobs will run, the results will be saved, and any errors will be handled. Use of these systems is not, of course, limited to one user; therefore the systems must be able to reliably handle large numbers of tasks from many different users. This is where queueing system software comes in. Queueing software handles job execution on parallel systems, scheduling the jobs and allocating resources accordingly. While the simplest queueing systems would simply execute jobs on a first-come-first-served basis, other factors, like the projected running time or number of nodes required, may come into play in more complex systems. By maintaining a user job queue, the queueing software automates the scheduling process such that the system's resources are utilized to their fullest extent. It also provides an interface for monitoring the current system queue and retrieving information about a job, such as its location in the cluster or grid, how many nodes it is using, how long it has been running, and so on. Some queueing systems also allow server-side specification of dependencies, or workflows, so that users can string together tasks which depend on prior tasks. There are many queueing systems available, in both commercial and open-source implementations. Commercial queueing software packages include Platform's Load Sharing Facility (LSF) [9], IBM's Tivoli Workload Scheduler [17], and Altair Engineering's PBS Pro [11]. An open-source implementation of PBS, called OpenPBS, also exists [10]. Other open-source packages providing job scheduling functionality include the Globus Toolkit [6], Condor [5], Sun's GridEngine [12], and Cobalt [2], used on Argonne National Laboratory's BGL cluster system. Furthermore, many portals exist which streamline access to queueing software. TeraGrid's portal is one example [16]. The TeraGrid Portal is a web-based system, allowing the grid's users to manage their projects, receive important system-related information, and access documentation, all from within their browsers. As was mentioned before, some queueing systems such as LSF provide workflow functionality; however, such workflows are necessarily tied to one resource. A workflow submitted to one grid system will only be able to utilize the resources provided by that grid system. The addition of a client-side workflow implementation, however, allows access to numerous resources. Through a client-side workflow, a user may access both grid and non-grid systems, working across several different security contexts. A particularly flexible client-based workflow engine called Karajan is provided in the Java CoG Kit [7], an open-source set of grid tools. This paper will focus on the integration of Karajan with the aforementioned queueing system Cobalt. We chose Cobalt primarily because it is used on BGL at Argonne National Laboratory, where our research is based. Furthermore, because BGL is not a grid system, we wanted to demonstrate that tools normally applied to grid systems could also be applied to non-grid parallel systems. This will demonstrate Karajan's flexibility in creating robust workflow solutions; the techniques used to integrate Cobalt with Karajan can be employed for other queueing systems as well, as we will show through our additional integration with PBS systems. Before we go into details about Karajan, a brief introduction to Cobalt will be useful.

2 Cobalt: System software for parallel machines

Cobalt is used for handling jobs on BGL, the 1024-node, 2048-processor IBM BlueGene/L system at Argonne National Laboratory. BGL is used primarily for scientific computing, application porting, system software development and scaling studies [1]. Following a "smaller and simpler is better" philosophy, the Cobalt software package trades feature-richness for agility [3]. While its core implementation comprises less than 4000 lines of code, mostly in Python, its component-based architecture makes it a highly adaptable and useful research platform [3]. Researchers can readily add new components or rewrite existing ones. Users may log in to BGL via SSH using public key authentication. The cqsub command submits a job to the queue. Like the qsub command found on other queueing systems, cqsub takes a variety of command line arguments, including the desired execution time, the number of nodes requested, the path to the executable, and so on. Cqsub also takes some specialized arguments, including one for dynamic kernel selection. This feature, which is only found in Cobalt, allows users to test experimental kernels and conduct system software research [3]. It is special features like this that motivated our integration of a specialized Cobalt submission library with the Karajan workflow engine.

Figure 1: sample caption

3 Karajan: A workflow scripting language

Karajan is a parallel scripting language which uses a declarative concurrency approach to parallel programming [4]. It is a dual-syntax language, supporting both a native syntax and an XML-based format. For the purposes of this paper, we will use the XML syntax, although there is no underlying difference between the two forms. Basic sequential workflows in Karajan are more or less lists of tasks; the user simply creates the task elements in the proper execution order. To execute tasks in parallel, the user places the tasks in question within a parallel element. Karajan also provides a sequential element, in case the user requires certain tasks within a parallel element to execute sequentially. In addition to its parallel and sequential workflow components, Karajan provides elements for remote task execution, file transfers, a variety of data structures, logical operators, variables, access to GUI forms, Java object and method bindings, and more. Karajan contains a general task execution framework which employs the CoG Kit's task/provider model. The task/provider model provides a consistent programmatic interface while the underlying provider implementation (which may be GT2, GT4, SSH, etc.) changes dynamically. This approach is useful in many ways, affording developers a high-level syntax for specifying tasks while leaving actual submission and execution details to the CoG Kit. It also allows developers a simple way to add new protocols without having to change the task implementation; they need only write a new provider. One downside, however, is that it does not allow for special features specific to a certain queueing system, such as Cobalt's kernel profile selection options. In order to utilize Cobalt's features within the Karajan framework, we employ Karajan's Java binding functionality, which allows limited interfacing with Java classes, objects, and methods. Through the Java bindings, we are able to invoke Java code which logs into a Cobalt system, submits a job, and monitors its status. This approach allows us to bypass the CoG Kit abstraction layer while still affording the same workflow capabilities as Karajan's task/provider model.

*Java Binding in Karajan

Objects can be instantiated in a Karajan script with a dedicated object-instantiation element, provided that the class is present in the CoG Kit's classpath.

The instantiated object's methods can be accessed with a corresponding method-invocation element; static methods may also be accessed with this element by either setting its "static" attribute to true or by specifying a class name. Creating these bindings can be a tedious task, especially if many different methods are involved. To automate this process we created the Java2Karajan tool.

3.1 Java2Karajan

The Java2Karajan program takes a compiled Java class file and, using the Java reflection API, constructs a Karajan Java binding. The approach is similar to that taken by SWIG [14], where libraries implemented in a lower-level language, in this case Java, are bound and translated to a higher-level language, in this case Karajan. As an example, Java2Karajan can generate a binding for instantiating an Example object, as well as a binding for that object's method myMethod(), which takes an int and a String as parameters. If a script imports this Java binding, it can, for example, instantiate an Example object with the generated instantiation element and then call myMethod(4, "hello") with the generated method element. If a method returns a value, that value can be placed within a Karajan variable and used elsewhere in the script.
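As a rough illustration of the reflection step such a generator has to perform, the following self-contained Java sketch enumerates the public constructors and methods of a compiled class; the printed output format is purely illustrative and is not the Karajan syntax actually emitted by Java2Karajan.

import java.lang.reflect.Constructor;
import java.lang.reflect.Method;

// Minimal sketch: list what a binding generator would need to know about a class.
public class BindingSketch {
    public static void main(String[] args) throws Exception {
        // e.g. java BindingSketch Example
        Class<?> cls = Class.forName(args[0]);

        for (Constructor<?> c : cls.getConstructors()) {
            System.out.println("constructor binding for " + cls.getSimpleName()
                    + " with " + c.getParameterTypes().length + " parameter(s)");
        }
        // getMethods() returns all public methods, including inherited ones.
        for (Method m : cls.getMethods()) {
            StringBuilder params = new StringBuilder();
            for (Class<?> p : m.getParameterTypes()) {
                if (params.length() > 0) params.append(", ");
                params.append(p.getSimpleName());
            }
            // A real generator would emit a Karajan element definition here.
            System.out.println("method binding: " + m.getName() + "(" + params + ")");
        }
    }
}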

* Integrating Cobalt job submission with Karajan workflows

In order to take advantage of Java2Karajan for our purposes, we first had to create a set of Java classes for submitting and monitoring the status of jobs on a Cobalt queueing system. The basic sequence of events in a Cobalt submission is as follows: SSH initialization → SSH authentication → open SSH session channel → execution of cqsub, returning the job's ID → close channel. We created a CobaltSubmitter object which utilizes the SSH connection, authentication, and command execution functionality found in the open-source J2SSH SSHTools libraries [13]. First, we instantiate a CobaltSubmitter object, which takes as parameters the hostname, the username, the port, and the path to the user's private key. In order to avoid a situation where the user would need to enter their passphrase in cleartext, we used a function in the CoG Kit which, upon execution of the Karajan script, masks the passphrase console input with random characters. Assuming all of the authentication information is correct, and that connection and authentication are successful, we are left with an initialized CobaltSubmitter object, through which we can submit jobs using its submit() method. Any command line argument which can be passed to Cobalt's cqsub command can be given as a parameter to the CobaltSubmitter's submit() method. This method formulates a cqsub command based on the given parameters and executes it on the server. Cobalt, after adding the job to its queue, responds with a job ID number, which the CobaltSubmitter's submit() method returns as a string. These two processes, initialization and submission, are essentially all we need to submit jobs to a Cobalt system. However, the CobaltSubmitter object would be useless for workflow purposes without a third function: job status monitoring. Simply submitting jobs in parallel does not preserve dependencies; what we need to do is not only submit the job, but continually monitor its state, moving on to subsequent jobs in the workflow only when the job has finished executing.

Cobalt provides the command cqstat for queue monitoring purposes. The CobaltSubmitter object has a method status() which takes the ID of a job as a parameter, executes the cqstat command for that particular job, and returns a string representing the job's state. If the cqstat command returns a blank table, the job in question has completed execution and left the queue, so status() returns the string "finished." Having written this code in Java, we can now use our Java2Karajan tool to generate our bindings for us. Once we do that, one further step remains: defining a dedicated Karajan element which combines the job submission and job monitoring methods provided by the CobaltSubmitter bindings into a single task element. This element invokes the submit() method, placing the returned job ID string in a variable. We then enter a loop, using Karajan's looping facilities, in which we repeatedly invoke status() with that returned job ID string as a parameter, exiting the loop when the method returns "finished." Of course, we wait an arbitrary amount of time between invocations of status() to prevent undue stress on the server. It should be noted that although polling is usually regarded as a dirty word, it is necessary for our purposes, as we can provide no callback function to inform us of the job's status. While polling is not an ideal solution, the flexibility afforded by our approach makes it worth the relatively crude implementation. Having created this element, we now have a simple, convenient way to submit Cobalt jobs from a Karajan workflow. A simple abbreviated workflow can then, for instance, execute two such Cobalt jobs in parallel and a third one afterward, by placing two of these elements inside a parallel construct followed by a third one in sequence.
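To make the submit-then-poll pattern concrete, here is a minimal Java sketch of driving such a submitter directly; the interface below is only a stand-in whose method signatures are assumptions based on the description above, not the actual CobaltSubmitter class from the CoG Kit repository.

// Sketch only: stand-in for the CobaltSubmitter described above; real signatures may differ.
interface CobaltSubmitter {
    String submit(String executable, int nodes, int minutes) throws Exception; // wraps cqsub
    String status(String jobId) throws Exception;                              // wraps cqstat
}

public class CobaltWorkflowStep {
    // Submit a job and block until cqstat reports it has left the queue.
    public static void runJob(CobaltSubmitter submitter, String executable,
                              int nodes, int minutes) throws Exception {
        String jobId = submitter.submit(executable, nodes, minutes);
        while (!"finished".equals(submitter.status(jobId))) {
            Thread.sleep(30000); // pause between polls to avoid stressing the server
        }
    }
}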

4 Qstat Monitor: observing the workflow

As we developed the Cobalt submission classes, we concurrently developed the Qstat Monitor, which is a graphical component displaying the current state of a parallel system. It provides the user with a clean, intuitive interface to a cluster or grid system. Essentially, it parses the output of a qstat command into a Swing JTable. The table is color-coded (running jobs are green, queued ones are yellow) and allows users to customize which fields they would like to view. Jobs which have left the queue remain in the Qstat Monitor's table, colored blue, providing the user with a record of completed jobs. The monitor, in addition to being of general use for users of BGL and other parallel systems, was specifically relevant for the Cobalt workflow research, because it allowed us to observe the queue's status throughout the various stages of the workflow execution. We can check if a given job has begun execution or if it is still waiting, and we can monitor other system activity; for example, perhaps the system is particularly backlogged with other users' jobs, which will slow down our workflow execution. The Qstat Monitor also offers job submission functionality, using the same Cobalt submission classes used in the Karajan workflow integration. Through a dialog box, the user can fill in the relevant information for the job (wall time, number of nodes, etc.) and submit the job directly through the Qstat Monitor program.
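The color-coding described above can be pictured with a standard Swing table cell renderer; the following is only a minimal sketch under an assumed column layout and assumed state strings, not the Qstat Monitor's actual implementation.

import java.awt.Color;
import java.awt.Component;
import javax.swing.JTable;
import javax.swing.table.DefaultTableCellRenderer;

// Colors each row by the job state found in a (hypothetical) "State" column.
public class JobStateRenderer extends DefaultTableCellRenderer {
    private final int stateColumn;

    public JobStateRenderer(int stateColumn) {
        this.stateColumn = stateColumn;
    }

    @Override
    public Component getTableCellRendererComponent(JTable table, Object value,
            boolean isSelected, boolean hasFocus, int row, int column) {
        Component c = super.getTableCellRendererComponent(
                table, value, isSelected, hasFocus, row, column);
        String state = String.valueOf(table.getValueAt(row, stateColumn));
        if ("running".equalsIgnoreCase(state)) {
            c.setBackground(Color.GREEN);
        } else if ("queued".equalsIgnoreCase(state)) {
            c.setBackground(Color.YELLOW);
        } else if ("done".equalsIgnoreCase(state)) {  // assumed marker for jobs that left the queue
            c.setBackground(Color.BLUE);
        } else {
            c.setBackground(Color.WHITE);
        }
        return c;
    }
}

Installing the renderer with table.setDefaultRenderer(Object.class, new JobStateRenderer(stateColumn)) would then color every cell of a row according to that row's job state.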

5 Implications and possibilities

Although our integration of a custom queueing system library with Karajan is fairly basic, it has significant implications for researchers interested in maximizing their usage of diverse parallel systems. While Karajan previously supported heterogeneous workflows, allowing multiple systems and security contexts, we now have the ability to fine-tune our workflows to take advantage of the unique features provided by specific queueing systems.

This approach is by no means limited to Cobalt.

5.1 Extensibility: adding PBS system support

As a demonstration of the extensibility of our approach, we developed a set of PBS submission classes which can be integrated with a Karajan workflow in the same way as the Cobalt classes. PBS is a widely used batch queueing system; here at Argonne, it is used on the UC/ANL TeraGrid [15]. We use the same SSH methods as for the Cobalt classes. What changes is the structure of the qsub command and the arguments available to the user. Having implemented the methods to generate the qsub command, and with the features provided by our Java2Karajan tool, creating the corresponding PBS submission and monitoring elements is a fairly straightforward task. This makes possible a Karajan workflow combining Cobalt and PBS elements, as well as Karajan's existing task submission elements, and potentially any other system for which a library is available. Additionally, we implemented PBS functionality in the Qstat Monitor to demonstrate that component's extensibility. The Qstat Monitor can switch between the two queueing system modes, providing features specific to each, while the underlying SSH interaction mechanism remains the same.
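One way to picture this extensibility is as a small command-builder abstraction in which only the command formatting differs per system while the SSH machinery is shared; the interface below and the exact flag spellings are illustrative assumptions rather than the project's actual classes, and the options should be checked against the cqsub and qsub manuals.

// Illustrative only: a tiny abstraction over per-system submit commands.
// Flag spellings below are assumptions from memory; check the cqsub/qsub manuals.
interface QueueCommandBuilder {
    String submitCommand(String executable, int nodes, int walltimeMinutes);
}

class CobaltCommandBuilder implements QueueCommandBuilder {
    public String submitCommand(String executable, int nodes, int walltimeMinutes) {
        // e.g. "cqsub -t 30 -n 64 /path/to/app"
        return String.format("cqsub -t %d -n %d %s", walltimeMinutes, nodes, executable);
    }
}

class PbsCommandBuilder implements QueueCommandBuilder {
    public String submitCommand(String executable, int nodes, int walltimeMinutes) {
        // e.g. "qsub -l nodes=64 -l walltime=00:30:00 /path/to/script"
        return String.format("qsub -l nodes=%d -l walltime=%02d:%02d:00 %s",
                nodes, walltimeMinutes / 60, walltimeMinutes % 60, executable);
    }
}

The SSH connection and command execution code is shared; only the builder differs, which is essentially the extensibility argument made above.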

5.2 Further research: graphical workflows in the Qstat Monitor

One possibility for further research in this area might be the integration of the Karajan workflow with the graphical Qstat Monitor component. Instead of defining workflows in Karajan XML, the Qstat Monitor component could be expanded to allow the graphical creation of workflows, which could then be passed to the Karajan engine. This would combine the customizability offered by the custom queueing system submission libraries and the convenient graphical interface provided by the Qstat Monitor. It may also be possible to integrate the queueing system-specific features described in this paper with the existing Karajan GUI application [8].

6 Conclusion

We have demonstrated an effective way to extend Karajan to take advantage of the unique features provided by the Cobalt queueing system, further improving Karajan's client-based heterogeneous workflow capabilities. In doing so, we have also demonstrated a general methodology for augmenting Karajan through its Java binding functionality. Furthermore, our Karajan implementation works without installing any grid software, and authenticates through standard SSH public key authentication. As part of the CoG Kit, this research expands the commodity vision, in which we integrate grid resources with non-grid resources.

7 Availability

The Java CoG Kit is available through its homepage at http://wiki.cogkit.org. A Java Web Start release of the Qstat Monitor is available at http://wiki.cogkit.org/index.php/Java_CoG_Kit_Qstat. Code referred to in this paper can be found in the qstat section of the CoG Kit's Subversion repository, viewable at http://svn.sourceforge.net/viewvc/cogkit/trunk/five/qstat/. Instructions for downloading the repository to Eclipse, using the Maven build system, can be found at http://wiki.cogkit.org/index.php/MavenRepository.

Acknowledgments

The REU was supported by NSF REU site grant 0353989. The submitted manuscript has been created by the University of Chicago as Operator of Argonne National Laboratory ("Argonne") under Contract No. W-31-109-ENG-38 with the U.S. Department of Energy.

The U.S. Government retains for itself, and others acting on its behalf, a paid-up, nonexclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government.

References

[1] Argonne National Laboratory BG/L System. Web Page. Available from: http://www.bgl.mcs.anl.gov/.
[2] Cobalt: System Software for Parallel Machines. Web Page. Available from: http://www-unix.mcs.anl.gov/cobalt/.
[3] Cobalt: An Open Source Platform for HPC System Software Research. Web Page. Available from: http://www-unix.mcs.anl.gov/cobalt/Cobalt-epcc-10-05.pdf.
[4] Java CoG Kit Karajan Workflow Reference Manual. Web Page. Available from: http://wiki.cogkit.org/index.php/Java_CoG_Kit_Karajan_Workflow_Reference_Manual.
[5] Condor: High Throughput Computing. Web Page. Available from: http://www.cs.wisc.edu/condor/.
[6] The Globus Toolkit. Web Page. Available from: http://www.globus.org.
[7] Java Commodity Grid (CoG) Kit. Web Page. Available from: http://www.cogkit.org.
[8] Java CoG Kit - Karajan GUI. Web Page. Available from: http://www.cogkit.org/release/4_1_2/webstart/#karajan-cog-workflow-gui.
[9] Load Sharing Facility. Web Page, Platform Computing, Inc. Available from: http://www.platform.com/.
[10] Portable Batch System. Web Page, Veridian Systems. Available from: http://www.openpbs.org/.
[11] Altair PBS Professional. Web Page, Altair Engineering. Available from: http://www.altair.com/software/pbspro.htm.
[12] gridengine: Home. Available from: http://gridengine.sunsource.net.
[13] J2SSH SSHTools. Web Page. Available from: http://sourceforge.net/projects/sshtools.
[14] Simplified Wrapper and Interface Generator (SWIG). Web Page. Available from: http://www.swig.org/.
[15] TeraGrid. Web Page, 2001. Available from: http://www.teragrid.org/.
[16] TeraGrid Portal. Web Page. Available from: http://www.teragrid.org/userinfo/portal.php.
[17] IBM Tivoli Workload Scheduler. Web Page. Available from: http://www-306.ibm.com/software/sysmgmt/products/support/IBMTivoliWorkloadScheduler.html.


Workflow-level Parameter Study Management in Multi-Grid Environments by the P-GRADE Grid Portal*

Peter Kacsuk, Zoltan Farkas, Gergely Sipos, Adrian Toth, Gabor Hermann

Abstract Workflow applications are frequently used in many production Grids. There is a natural need to run the same workflow with many different parameter sets. Unfortunately current Grid portals either do not support this kind of applications or give only specialized support and hence users are obliged to do all the tedious work needed to manage such parameter study applications. P-GRADE portal has been providing a high-level, graphical workflow development and execution environment for various Grids (EGEE, UK NGS, GIN VO, OSG, TeraGrid, etc.) built on second and third generation Grid technologies (GT2, LCG-2, GT4, gLite). Feedback from the user communities of the portal showed that parameter study support is highly needed and hence the next release of the portal will support the workflow-level parameter study applications. The current paper describes the semantics and implementation principles of managing and executing workflows as parameter studies. Two algorithms are described in detail. The black box algorithm optimizes the usage of storage resources while the PS-labeling algorithm minimizes the load of Grid processing resources. Special emphasis is on the concurrent management of large number of files and jobs in the portal and in the Grids as well as providing a user-friendly, easy-to-use graphical environment to define the workflows and monitor their parametric study execution.

1. Introduction One of the most promising utilizations of Grid resources comes to life with parameter study (or sometimes written as “parametric study” or “parameter sweep”) applications where the same application should be executed with a large set of input parameters. Such parameter study applications are easy to implement in the Grid since the different executions started with different parameters are completely independent. Indeed, there are several projects [1], [2], [3] that demonstrated that parameter study applications are easily manageable in the Grid. However, most of these projects tackled only single job based applications. The real challenge comes when complex applications consisting of large number of jobs/services connected into a workflow should be executed with many different parameter sets. There have been only two projects that tried to combine parameter studies with workflow-level support in the Grid. ILab [4], [5], [6] enables the user to create a special parameter study oriented workflow. With the help of a sophisticated GUI, the user can explicitly define how to distribute and replicate the parameter files in the Grid and how many independent jobs are to be launched for each segment of the data files. This approach is very static restricting the exploitation of the dynamic nature of Grids that enables the dynamic collection of resources. The SEGL [7] approach puts much more emphasis on exploring the dynamic nature of the Grid. They also provide a GUI to define the workflows and to hide the low level details of the underlying Grid. The SEGL workflow provides tools for several levels of parameterization, repeated processing, data archiving, handling conclusions and branches during the processing as well as synchronization of parallel branches and processes. The problem with this GUI is that it might be too sophisticated, requiring very large skill from the application developer. Furthermore both ILab and SEGL are connected to a particular Grid although in case of a parameter study execution there is a large need to exploit as many resources as possible even if they should be collected from different Grids. Finally, neither of them can be used as a service through a Grid portal.

* This research work is carried out under the FP6 Network of Excellence CoreGRID funded by the European Commission (Contract IST-2002-004265) and under the SEE-GRID-2 project funded by the European Commission (Contract number 031775).

Although our approach to supporting workflow-level parameter study applications in the Grid has many similarities with these two projects, there are significant differences, too. Our main goals are as follows:
1. Keep both the workflow GUI and the parameter study support concept as simple as possible. This enables the fast and easy learning of the tool as well as its easy usage.
2. Enable running any existing workflow with different parameter sets without modifying the structure of the workflow.
3. Manage the execution of the workflows on as many Grid resources as possible. Enable the collection of Grid resources from several Grids even if they are based on different Grid technologies.
4. Enable access to the workflow-oriented GUI and the available Grids via a single Grid entry point, i.e., via a Grid portal, without installing any software on the user's machine.
5. Provide a dynamic balance between the usage of processing resources, storage resources and network resources.

Clearly, these goals differ from the main concept of the two above-mentioned projects. The last goal can be found in several other parameter study projects that aim at the support of parameter study applications at the individual job level. In the case of Nimrod/G [3] and Apples/APST [8] the main emphasis is on scheduling, and hence in these projects the fifth goal of our project is dominant. In Nimrod even an economic model is considered during resource scheduling. The starting point for our project was the P-GRADE Grid portal [9], which provides a workflow-oriented GUI as well as workflow-level interoperability between various Grids, even if they are built on different Grid technologies. This means that the same portal can be connected to several different Grids and the portal manages the workflow execution among these Grids according to the users' requirements [10]. The portal even enables the parallel exploitation of the connected Grids, i.e., different jobs of the same workflow can simultaneously be executed on Grid resources taken from different Grids. Such a multi-Grid workflow execution mechanism is a unique feature of the P-GRADE portal, which is now widely used for many different Grids (SEE-GRID, VOCE, EGrid, HunGrid, CroGrid, GILDA, etc.). Besides LCG-2 and gLite based production Grids, the portal is successfully used as a service for the GT2 based UK National Grid Service (NGS) and it was also successfully connected to the GT4 based WestFocus Grid (UK). Recently, the GIN (Grid Interoperation/Interoperability Now) VO of OGF has been supported by the portal, enabling simultaneous access to all of its resources coming from different Grids. This portal is also connected to the US OSG and TeraGrid, the UK NGS, the UK WestFocus Grid and EGEE Grids, and hence via this portal all the major Grids of the US and Europe can be accessed, even from the same workflow. Experience with the portal revealed that many applications require not only the single execution of a workflow but rather parameter study support to execute an existing workflow application with many different parameter sets. Therefore, our motivation was to extend the existing single workflow support of the portal towards generic workflow-level parameter study execution support. Such support should enable the automatic starting, execution, monitoring and visualization of all the workflows belonging to the same parameter study. Of course, just as in the case of the single workflow management environment, users should neither need to know any details of the underlying Grids nor be forced to use any particular programming language. Even legacy codes can be used as services in the workflows if the portal is integrated with the GEMLCA legacy code architecture service [11]. In order to reach the five main goals mentioned above we have developed two algorithms to manage workflow-level parameter studies across multiple Grids. The first algorithm is based on the "black box" concept, which minimizes the required storage resources of the portal but results in superfluous job executions in many cases. The second algorithm, called the PS-labeling algorithm, minimizes the usage of processing resources of the Grids but increases the storage needs of the portal. They represent two extreme points on the scale of possible execution management algorithms. The algorithms should be mixed according to a dynamic and adaptive scheduling algorithm, which is the subject of our future research. The paper first introduces the workflow concept of the P-GRADE portal in Section 2.
The next section explains the "black box" execution semantics and its portal support, both at the user interface and the portal workflow manager level.

Section 4 introduces the PS-labeling algorithm and compares it with the "black box" execution mechanism. Section 5 summarizes the various problems that should be tackled in a parameter study environment (resilience, multi-Grid execution, large numbers of processes and workflows, dynamic changes of the input parameter set, etc.). Finally, Section 6 compares our research with related work.

2. The workflow concept of P-GRADE portal

When designing and implementing the P-GRADE portal, we focused on the following basic principles.

Simplicity and expressiveness of the user interface

The P-GRADE portal provides a graphical editor to develop workflow applications. The workflow graph is a simple DAG (Directed Acyclic Graph) where nodes of the graph are either jobs (sequential, MPI or PVM) or legacy code services (if GEMLCA is integrated with the portal). For each node, input and output ports can be defined. Input ports represent the input files to be used by the node and output ports represent output files to be generated by the node. Input and output ports can be connected by directed arcs (from the output port of one node to the input port of another). The arcs represent the necessary file transfer between the two connected nodes. Indeed, the portal run-time system automatically takes care of these predefined file transfers, so the user is completely relieved of transferring files among jobs during the workflow execution. A node can be executed if all of its input arcs have received the necessary input files. This very simple dataflow semantics enables the parallel execution of those nodes that are simultaneously supplied with the required input files. Notice that, based on these semantics, the user can easily achieve two-level parallelism. The first level is among nodes of parallel branches of the workflow (inter-node parallelism); the second level is within a node if the job executable defined for the node is a parallel code (MPI or PVM). This second level is called intra-node parallelism. To keep the concept as simple as possible, neither if-then-else nor loop constructs are provided in the graph. (However, based on the existing graph facilities, the user can easily create if-then-else graph templates [12].) In order to illustrate the graphical workflow concept of the P-GRADE portal, Figure 1 shows a simple example workflow used for solving the equation Ax = B, where A is a matrix and B and X are vectors. The application consists of 5 jobs (all of them having sequential executable code). The first job, called "Separator", accepts A and B as input parameters on its input port, separates them and then copies A to the jobs "Invert_A" and "A_mul_X", and copies B to "Multip_B" and "Subtr-B". The job "Invert_A" creates the inverse of A, which is multiplied by B in job "Multip_B". The output of "Multip_B" is the sought X vector. The next two jobs are used to check the quality of the result.

Figure 1 Workflow for Computing the Ax=B equation
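To illustrate the dataflow rule above (a node may run once every one of its input ports has received its file), here is a minimal, self-contained Java sketch that walks the Ax = B workflow of Figure 1; the node and port names follow the example, but the classes and the wiring details are hypothetical and are not part of the portal.

import java.util.*;

// Minimal dataflow sketch: a node runs when all of its input ports have a file.
// Assumes every referenced port exists; this is an illustration, not portal code.
public class DataflowSketch {
    static class Node {
        final String name;
        final Set<String> waitingInputs;          // input ports still missing a file
        final Map<String, List<String>> outputs;  // output port -> "Node.port" destinations
        Node(String name, Set<String> inputs, Map<String, List<String>> outputs) {
            this.name = name;
            this.waitingInputs = new HashSet<>(inputs);
            this.outputs = outputs;
        }
        boolean ready() { return waitingInputs.isEmpty(); }
    }

    public static void main(String[] args) {
        Map<String, Node> wf = new LinkedHashMap<>();
        wf.put("Separator", new Node("Separator", Set.of("in0"),
                Map.of("A", List.of("Invert_A.in0", "A_mul_X.in0"),
                       "B", List.of("Multip_B.in1", "Subtr_B.in1"))));
        wf.put("Invert_A", new Node("Invert_A", Set.of("in0"),
                Map.of("invA", List.of("Multip_B.in0"))));
        wf.put("Multip_B", new Node("Multip_B", Set.of("in0", "in1"),
                Map.of("X", List.of("A_mul_X.in1"))));
        wf.put("A_mul_X", new Node("A_mul_X", Set.of("in0", "in1"),
                Map.of("AX", List.of("Subtr_B.in0"))));
        wf.put("Subtr_B", new Node("Subtr_B", Set.of("in0", "in1"), Map.of()));

        wf.get("Separator").waitingInputs.remove("in0"); // the user supplies A and B

        boolean progress = true;
        Set<String> done = new HashSet<>();
        while (progress) {
            progress = false;
            for (Node n : wf.values()) {
                if (!done.contains(n.name) && n.ready()) {
                    System.out.println("executing " + n.name);   // submit the job here
                    done.add(n.name);
                    for (List<String> dests : n.outputs.values())
                        for (String d : dests)                    // deliver files along arcs
                            wf.get(d.split("\\.")[0]).waitingInputs.remove(d.split("\\.")[1]);
                    progress = true;
                }
            }
        }
    }
}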

Exploiting the available services of existing Grids

Instead of developing a new kind of middleware, as has been done in several other parameter study related Grid projects [3], [8], our aim was to exploit the available services of existing production Grids. For this reason we have not developed our own enactment service; rather, we use Condor DAGMan [13]. If there are alternative services, we use them in order to ensure Grid interoperability through the portal. For example, the portal can access, process and visualize the information in both MDS and BDII type information systems. However, the portal should work even if a specific service is not available. For example, the portal can exploit the EGEE broker services to assign jobs to Grid resources, but it can also manage GT2 Grids where such a broker does not exist (e.g. the UK NGS). In the former case the portal automatically generates a default JDL file based on the user-created workflow definition and enables the user to tune this JDL file with additional visual editing. In the latter case, the portal enables the user to explicitly assign jobs to Grid resources. In both cases, the user should first select the Grid where the job is to be executed. For different jobs of the workflow the user can select different Grids, either with or without a broker.

Tailoring the Grids to user needs

The portal administrator can connect the portal to several Grids. The portal maintains a list of connected Grids common to all users. This list cannot be modified by the users, only by the portal administrator. For each connected Grid a list of available resources is provided for every single user. The content of these Grid resource lists can be modified by the user. Grids usually have many unreliable sites, and the user can remove from his Grid resource list those sites that he identifies as unreliable. Conversely, if he knows of extra sites not shown by the Grid resource list, he can easily add those sites to his own Grid resource list. If later the user would like to assign a job to a Grid resource, the portal offers the available Grid resources according to these Grid resource lists.

Flexibility, expandability, standardization

In order to provide a flexible, easy to expand and standard portal, we selected the Gridsphere portal framework technology [14] and used it to build our portal. Gridsphere is used by a large community, enabling the exchange of available portlets that are written according to the Gridsphere standard. It also enables the easy tailoring of the portal to specific user needs. Our aim with the P-GRADE portal was to create a workflow-level core portal that can easily be extended and adapted to the needs of different user communities.

3. The "black box" execution semantics of workflow-level parameter study execution

The simplest approach to supporting parameter studies at the workflow level is based on the "black box" execution semantics. It means that we consider a workflow as a black box that should be executed with many different parameter sets. These parameter sets are placed on the so-called PS input ports of the workflow. An input port is called a PS input port if a set of parameter files can be received on that port. If a workflow has one such PS input port, it should be executed as many times as there are elements in the parameter file set of that port. If there are several PS input ports, the workflow should be executed according to the cross-product of these input sets.
From now on, a workflow (WF) which has at least one PS input port is called a PS workflow (PS-WF). The concept of workflow-level PS support based on the "black box" execution semantics is illustrated in Figure 2. The original WF, consisting of 4 jobs, is considered as a black box that has two input ports capable of accepting inputs from the outside world. On both ports the user can provide several input sets, and the workflow should be executed with the cross-product of these input sets. Figure 2 shows a case where input port 0 accepts an input set with two elements (values 1 and 2) and input port 1 accepts an input set with three elements (values 3, 4 and 5). Therefore, the workflow should be executed six times, as shown in Figure 2. In order to manage the execution of workflows according to the "black box" execution semantics, the workflow manager of the portal was extended in the following way. Let

M = N1 × N2 × … × Nm

where m is the number of PS input ports and Ni denotes the number of input files on the i-th PS input port. At run-time the portal PS-WF manager generates M executable workflows (e-WFs) from the original PS-WF. Every e-WF is labeled by m labels: 1-n1, 2-n2, …, m-nm.

Figure 2. Concept of "black box" execution semantics
The internal structure of label_i is i-n_i, where i identifies the i-th PS input port and n_i represents the ordering number of the input file taken from the i-th PS port in the given execution instance (0 ≤ n_i < N_i). Figure 2 also shows these labels for each of the 6 execution instances. This labeling scheme identifies for the PS-WF manager which input file to take from the different PS input ports in the case of the different execution instances (e-WFs). It also helps in identifying the output files generated at the output ports: every output file is labeled with the label of the e-WF that generated it.
Notice that output files can be local or remote. Remote output files are always permanent, and once they are produced by an e-WF they can be immediately read by the user. This enables the user to study the partial results even before an e-WF has completed. Local output files can be permanent or volatile. Permanent means that the user would like to get access to this output file only when the whole e-WF execution has been completed. These partial results are collected and stored by the portal while the e-WF runs. When the e-WF is completed, these files are zipped (together with the standard output and error logs) and placed by the portal on a Grid storage resource defined by the user. These local permanent files are typically small; by collecting and storing them at the portal, the number of accesses to Grid storage resources, and hence the overall execution time, can be significantly reduced. Finally, local volatile files represent temporary partial results. Once they are consumed by the connected job(s) they can be removed from the portal, which reduces the load on the portal's storage resources.
Developing and running PS-WF applications according to the "black box" execution semantics requires three main steps. The first step is the development of the WF application that the user would like to run as a PS-WF application. The development process of a WF application has been described in previous publications [9] as well as in the User's Manual [15] of the current service portals, and hence we do not describe it in this paper; the basic concepts are summarized in Section 2. The second step is the transformation of the WF into a PS-WF. Finally, the third step includes the submission of the PS-WF application to the Grid and the monitoring of its execution. Consequently, the PS-WF user interface has two major parts:
1. definition of PS-WF graphs, and
2. monitoring the PS-WF graph execution.
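Before turning to these two parts of the user interface, the black-box labeling scheme defined above can be illustrated with a short sketch that enumerates the e-WF instances as the cross product of the PS input port file sets and attaches the i-n_i labels to each instance. It is an illustration only, not part of the portal code; the port numbering and file lists are assumptions chosen to match the Figure 2 example (two PS ports with 2 and 3 files).

```python
from itertools import product

def enumerate_ewfs(ps_port_files):
    """ps_port_files maps a PS input port index i to its ordered list of input files.
    Yields (label, file_assignment) pairs, one per execution instance (e-WF),
    where the label is the concatenation of the i-n_i parts."""
    ports = sorted(ps_port_files)                      # PS port indices i
    counts = [range(len(ps_port_files[i])) for i in ports]
    for choice in product(*counts):                    # cross product of 0 <= n_i < N_i
        label = "_".join(f"{i}-{n}" for i, n in zip(ports, choice))
        files = {i: ps_port_files[i][n] for i, n in zip(ports, choice)}
        yield label, files

# Example matching Figure 2: PS port 0 holds 2 files, PS port 1 holds 3 files,
# which yields the 6 instances 0-0_1-0 ... 0-1_1-2.
example = {0: ["a0.dat", "a1.dat"], 1: ["b0.dat", "b1.dat", "b2.dat"]}
for label, files in enumerate_ewfs(example):
    print(label, files)
```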

3.1 PS-WF graph definition

In order to turn a WF application into a PS-WF application, the graphical Workflow Editor (WE) of the portal was slightly extended. The user can open the existing WF in the WE and can turn any of its existing input ports into PS input ports. To illustrate the process we use the same Ax_EQUAL_B workflow application that was shown in Section 2. The task is to modify this WF so that it solves the equation for a set of A and B parameters. Figure 3 shows how to turn the input port of the Separator job into a PS port.
Notice the difference between the definition of an input port and of a PS input port. In the case of a normal input port, a file is associated with the port. This file can be either local (originating from the user's machine and part of the input sandbox) or remote (placed in a storage resource of the Grid). In the case of a PS port (Figure 3), a directory is associated with the port. This directory should always be placed in a storage resource of the Grid. The user should place the series of input files into this directory, which must not contain any other file. Currently the portal does not provide a portlet to support placing the input files into the selected Grid storage resource; the user should use the command line interface of the actual Grid. E.g., in EGEE Grids there are file catalog commands by which the user can place the files in such a directory. After defining the PS input ports the user should identify the Grid and the storage resource where the local permanent files should be stored at the end of each e-WF execution.
In summary, turning an existing WF into a PS-WF is an extremely easy task: simply turn some of the input ports into PS input ports and define the target Grid storage resource for the local permanent files. This was exactly our aim: to simplify for the user the process of utilizing existing workflows and running them as parameter studies.

Figure 3. Definition of a PS input port

3.2 Monitoring the PS-WF graph execution

Monitoring even a single job is important for the user, not to mention the case when he runs thousands of jobs as part of a PS-WF execution. The challenge here is how to visualize the execution status of thousands of e-WFs and jobs in an easily understandable and manageable way. Again, the monitoring of a single WF as managed by the portal was a good starting point. It was only slightly extended, and yet it helps to monitor all the e-WFs and all their jobs in an efficient way.
The ordinary WFs of a user are listed by the portal in the Workflow Manager window. Here the user can submit a WF, attach the WF to the Workflow Editor to see its graphical view, and delete the WF. Moreover, the "Details" button enables the user to see the details of the WF, i.e., the component jobs and their assignment to Grid resources. The PS-WFs are listed in the Workflow Manager window in the same way as the WFs. The only difference is that the PS-WFs have a "PS Details" button to show their internal details. Figure 4 shows a snapshot of the Workflow Manager window where the original "Ax_EQUAL_B_seegrid_broker" WF and its PS-WF version ("Ax_EQUAL_B_PS") are listed before submitting the PS-WF.

Figure 4. The Workflow Manager window before submitting the PS-WF

Once the user submits the PS-WF (by simply clicking the corresponding "Submit" button), the portal workflow manager (WM) creates all the e-WFs defined by the cross-product of the PS input ports' file sets. The WM then submits simultaneously as many e-WFs as are permitted by the portal administrator. In principle all the e-WFs could be submitted into the Grid. However, every e-WF submission requires significant resources both from the portal server and from the underlying Grid. In order to prevent the flooding of the portal and the Grid by the e-WFs and jobs of a single user, the portal administrator can restrict the number of e-WFs that can be submitted to the Grid simultaneously.
After the PS-WF submission, using the PS Details button the user can see the statistics of the e-WFs: how many were initiated, submitted, or finished, and how many went into Rescue or Error state. Figure 5 shows the situation where the "Ax_EQUAL_B_PS" PS-WF was submitted with 6 input parameter sets. As a result, 6 e-WFs were generated by the portal: 2 of them have already finished, 3 have been submitted, and one is still in the init state, i.e., waiting for submission. The figure also shows that any submitted e-WF can be viewed in detail using the "Details" button. Clicking it, the detailed view of the e-WF shows the component jobs of the e-WF, their Grid resource assignment, and their status. Notice that any e-WF can be aborted; this kills the selected e-WF, but the other e-WFs can continue their activity. Figure 5 also reveals that e-WFs that have finished cannot be viewed in the PS Workflow Details window. Their results (including every stdout and stderr file) are already stored in the defined Grid storage resource, so the user can check those files there.

Figure 5. Detailed monitoring view of a PS-WF's execution

4. PS-labeling algorithm

Although the "black box" execution semantics is easy to understand and apply, it is unfortunately not optimal. In the case of the example shown in Figure 2, the whole PS-WF as a black box is executed 2x3 = 6 times, i.e., all the jobs of the PS-WF are executed 6 times. However, analyzing the PS-WF it quickly turns out that it is enough to execute Job3 and Job5 once for each of the two input values of PS input port 0, i.e., twice. Job4 and Job6 indeed should be executed with the cross-product of the two PS input ports, i.e., 6 times. This example shows that the "black box" execution semantics results in a redundant execution that is not tolerable if the number of files on the PS input ports is in the range of hundreds or thousands, as frequently happens in large-scale scientific simulations.
In order to solve the problem illustrated by Figure 2 we have to modify the "black box" execution mechanism. The new method is based on the "black box" approach in the sense that the user does not have to modify anything in an existing WF, except for simply turning the required input ports into PS input ports. Once this is done, the portal handles the WF as a PS-WF and assigns a unique natural number as PS port identifier to every PS input port. Of course, the user should place the necessary number of input files on every PS input port. Based on the PS port identifiers and the number of input files placed on the PS ports, the portal can identify the optimal execution number for each node of the PS-WF graph.

This is done by the PS-labeling algorithm run by the workflow manager of the portal. The PS-labeling algorithm has two phases:
- the preparation phase, and
- the execution phase.

Preparation phase. The preparation phase is executed by the workflow manager before submitting a PS-WF. Starting from each node having a PS input port, represented by the label i-N_i (where i denotes the unique natural number index of the PS port and N_i denotes the number of input files belonging to this port), the algorithm extends the current label of all dependent nodes by i-N_i. (A node j is dependent on node i if there is a directed path of arcs starting from node i and ending at node j.) At the end of the algorithm every node is either without any label or labeled with a series of labels label_1, label_2, ..., label_n. The former case means that the node does not depend on any PS input port and hence should be executed only once. The latter case means that the node is on at least n paths rooted at n different PS input port nodes. Overall, such a node should be executed N_1 x N_2 x ... x N_n times, corresponding to the cross-product of the initial input file sets.
This label set is extended into a full label set for every node. A full label set is an ordered set of the N_i values, i.e., position i holds the value of N_i. If there was no label for position j (PS port j) in the generated label set, then an empty position is placed there. The full label set of a given node thus shows which PS input ports have an impact on the node, and for those ports it gives the number of input files, indicating the strength of this impact. Example: let us suppose that M = 4, N_0 = 4, N_1 = 3, N_2 = 5 and N_3 = 2, and that the generated label set for node A, which depends on PS port 1 and PS port 3, is 1-3, 3-2. Then the full label set for A will be {_, 3, _, 2}. After completing the labeling, for each node a power set of the full label set is generated. For node A in the example above the power set will look like: {_,1,_,1; _,1,_,2; _,2,_,1; _,2,_,2; _,3,_,1; _,3,_,2}.

Execution phase. After completing the preparation phase, the workflow manager begins to generate and submit the e-WFs labeled as n_1, n_2, ..., n_m, where i identifies the i-th PS input port and n_i represents the ordering number of the input file taken from the i-th PS port in the given execution instance (0 ≤ n_i < N_i). During the e-WF execution the workflow manager tries to substitute the execution of a node with the file(s) resulting from a previous run of the node. First the e-WF label is matched against the power set of each node of the PS-WF. If a node has no power set, it has to be executed only once and is then marked as executed for later e-WFs. If a node has a power set and the matching element is already marked for the node, then this node was already executed; the workflow manager therefore does not submit the job (node) again, but immediately provides the output files that were generated by this node during the marked run. If the matching element is not marked for a node, then the node should be executed. After executing the node, its matching element is marked and the resulting output files are stored either in a Grid storage resource, if they were defined as remote files, or inside the portal, if they were defined as local files.
To illustrate the algorithm let us take the example shown in the preparation phase (M = 4, N_0 = 4, N_1 = 3, N_2 = 5 and N_3 = 2) and consider the e-WF with label "3,2,4,1". If node A has the power set {_,1,_,1; _,1,_,2; _,2,_,1; _,2,_,2; _,3,_,1; _,3,_,2} and the matching element _,2,_,1 is not marked yet, the job or service belonging to node A should be executed. Once it is executed, the matching element _,2,_,1 is marked. If later another e-WF, for example with label "1,2,3,1", is to be executed, then the workflow manager recognizes that the matching element _,2,_,1 is already marked and hence node A must not be executed again. Notice that the price for this optimized usage of processing resources is an increase in the storage capacity requirements of the portal server. There are three types of files handled by the portal in a PS-WF (a small sketch of the labeling and matching steps follows this list):
1. Local volatile files: During normal WF execution these files are immediately removed from the portal when they are consumed by their connected input ports. The same is true for the black box PS-WF algorithm. However, in the PS-labeling PS-WF execution algorithm they should be stored in the portal in order to substitute the re-execution of their producer node. They should be stored as long as there is an e-WF that has not completed yet and must use these files.
2. Local permanent files: During normal WF execution these files are stored in the portal and delivered to the user after completing the WF execution. In the case of the black box PS-WF algorithm they must be stored until the corresponding e-WF has completed. After that they can be removed without waiting for the completion of other e-WFs. In the case of the PS-labeling PS-WF algorithm they must be stored as long as there is an e-WF that has not completed yet and must use these files, so their average storage time is much longer than in the case of the black box PS-WF algorithm.
3. Remote output files: They are stored in Grid storage resources and hence do not increase the storage requirements of the portal, either in the case of the black box or of the PS-labeling algorithm.
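The following sketch illustrates, under simplifying assumptions, the two phases just described: the preparation phase computes for every node the set of PS ports it depends on (by graph reachability), and the execution phase memoizes node runs keyed by the projection of the e-WF label onto those ports, so that a node is executed only once per distinct combination of relevant input files. This is an illustrative reimplementation, not the portal's actual code; it uses 0-based indices throughout, and the run_node function is a hypothetical stand-in for job submission.

```python
from itertools import product

def dependent_ports(dag, ps_port_of_node):
    """dag: node -> list of successor nodes; ps_port_of_node: node -> PS port index
    for nodes that own a PS input port. Returns node -> frozenset of PS ports that
    reach it (the non-empty positions of the node's 'full label set')."""
    deps = {n: set() for n in dag}
    for start, port in ps_port_of_node.items():
        stack = [start]
        while stack:                      # propagate the port index along all arcs
            n = stack.pop()
            if port not in deps[n]:
                deps[n].add(port)
                stack.extend(dag.get(n, []))
    return {n: frozenset(p) for n, p in deps.items()}

def execute_ps_wf(dag, ps_port_of_node, port_sizes, run_node):
    deps = dependent_ports(dag, ps_port_of_node)
    executed = set()                      # marked (node, projected label) pairs
    ports = sorted(port_sizes)
    for combo in product(*(range(port_sizes[i]) for i in ports)):   # all e-WFs
        label = dict(zip(ports, combo))
        for node in dag:                  # assumes dag keys are in topological order
            key = (node, tuple(sorted((i, label[i]) for i in deps[node])))
            if key not in executed:       # run only for an unseen projected label
                run_node(node, {i: label[i] for i in deps[node]})
                executed.add(key)

# Example from the paper: ports 0 and 1 hold 2 and 3 files; Job3/Job5 depend only
# on port 0 and therefore run twice, while Job4/Job6 run 2 x 3 = 6 times.
dag = {"Job3": ["Job4", "Job5"], "Job5": ["Job6"], "Job4": ["Job6"], "Job6": []}
execute_ps_wf(dag, {"Job3": 0, "Job4": 1}, {0: 2, 1: 3},
              lambda n, sel: print("run", n, sel))
```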

In conclusion, the more internal output files are defined as remote files, the lower the storage price we have to pay for the optimized usage of processing resources. Since remote files also help in the online checking and monitoring of the execution of PS-WFs, it is recommended to use remote files instead of local files for PS-WFs. On the other hand, remote files restrict the distribution of the execution of PS-WF nodes among different Grids: usually, a node should be executed in the same Grid where the associated input/output remote file is stored. Since local files are stored on the portal, the PS-WF nodes processing them can be assigned to any Grid connected to the portal (provided that the user has a certificate and VO membership for the given Grid and VO). Therefore, if a user would like to exploit many Grids for the execution of a PS-WF, it is recommended to use local files instead of remote files. In this case the choice of Grids and Grid resources can be made on-the-fly by the portal, provided that it is supported by a meta-broker. Developing the optimal Grid selection algorithm and such a meta-broker is the subject of future research [16].
Notice that from the user interface point of view the PS-labeling PS-WF execution does not differ in any way from the black box PS-WF execution. Therefore it is the task of the portal to decide which algorithm is to be used for a given PS-WF. Such a decision will be the subject of future research. The decision algorithm can be even more fine-grained, i.e., the portal could dynamically change the applied PS-WF execution algorithm node by node according to the size of the generated output files and the execution time of the different nodes. For example, if a node runs very quickly but generates a large local file, it would be better to execute it according to the black box algorithm.

5. Further aspects of PS-WF execution

Fault tolerance has an outstanding importance in PS-WF execution, since a PS-WF typically uses many Grid resources for a long period of time in order to execute all the e-WFs derived from it. The fault tolerant execution of a PS-WF can be supported by the portal in many ways.

First of all, any job of the PS-WF that is assigned to a Grid broker will be resubmitted if the broker rejects its execution on a selected Grid site. If a WF node is assigned by the user to a dedicated Grid resource and that resource fails, or if after three attempts the Grid broker still fails to run a node, the particular node and e-WF are given RESCUE status and the portal sends an e-mail message to the user (if the user requested it). The e-WF goes into rescue mode according to the Condor DAGMan control. The user can re-assign the failed node and resume the execution of the e-WF from the rescue state.
Sending e-mail to the user is of great importance, since a PS-WF application can run for days or even weeks and the user does not want to pay continuous attention to the execution status. However, there are situations where the user's intervention is needed by the portal or the user is interested in accessing partial results of the execution. The user can define such situations for the portal, and whenever a predefined situation occurs the portal sends an e-mail to the user. One such situation is when a job fails and its e-WF goes into rescue state. Another important situation is when the user's certificate is getting close to expiration. The user can also ask for e-mails after the completion of every e-WF or after the completion of the whole PS-WF.
A portal can be used by many users, and many of them can submit PS-WFs. This can result in an overload of the portal, of the connected Grids, or of both. In order to avoid such a harmful situation, the portal administrator can set a limit on the number of workflows that can be submitted by a single user. If this number N is less than the number of generated e-WFs of a submitted PS-WF (M), then the portal simultaneously submits N e-WFs out of the potential M generated e-WFs. Whenever an e-WF is finished, the portal submits another e-WF until all M e-WFs have been submitted (a sketch of this throttling scheme is given below). However, this restriction does not protect other portal users. In order to evenly distribute the portal and Grid resources among the different users, the portal administrator can also set a limit on the number of jobs that can simultaneously be submitted by the portal into the Grid. This limit is applied to all users of the portal, and hence the portal tries to distribute the available job submissions equally among the actual users of the portal.
The P-GRADE portal is a multi-Grid portal, as explained in the Introduction. As such it can distribute e-WFs among all the connected Grids. The question is how to select the right Grid and the best performing Grid resource for a given e-WF or for a given node of an e-WF. The current solution is largely controlled by the user. He can directly assign a node to a certain Grid and Grid resource in the original WF. In this case all the e-WFs generated from the PS-WF created from this WF will assign the job to this Grid site. This solution must be used in Grids where no broker is available. By allocating different Grids and different resources in the selected Grids the user can achieve a static distribution of e-WF nodes among the connected Grids. The situation can be improved by assigning nodes to brokers in Grids where brokers are available. In this case the user statically defines in advance which Grid is to be used for the execution of a node but gives the Grid broker the freedom to assign the node dynamically within that particular Grid.
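The submission throttling mentioned above can be sketched as follows: at most N of the M generated e-WFs are in the Grid at any time, and a new e-WF is submitted whenever one finishes. This is only an illustration of the policy, not portal code; the submit and is_finished callbacks are hypothetical placeholders for the portal workflow manager's operations.

```python
import time

def throttled_submit(ewfs, limit, submit, is_finished, poll_interval=30):
    """Keep at most `limit` e-WFs in the Grid; submit the next one when a slot frees up."""
    pending = list(ewfs)          # the M e-WFs generated from the PS-WF
    running = []
    while pending or running:
        # reclaim the slots of finished e-WFs
        running = [e for e in running if not is_finished(e)]
        # fill free slots up to the administrator-defined limit N
        while pending and len(running) < limit:
            ewf = pending.pop(0)
            submit(ewf)
            running.append(ewf)
        if pending or running:
            time.sleep(poll_interval)
```
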
Assigning nodes to different Grids, and particularly to different Grid brokers, results in a static or semi-dynamic distributed processing of the e-WFs among the connected Grids and among the Grid sites of those Grids. A fully dynamic solution could be reached if a Grid meta-broker were available that can dynamically select Grids. We are currently engaged in research to define and create such a Grid meta-broker [16].
Users of long running PS-WFs would like to dynamically access the partial results generated by the e-WFs and modify the input data sets according to these partial results. The P-GRADE portal enables this dynamic change of input data sets. First of all, the user can check any remote output file as soon as the status of the job generating it becomes finished. If the user decides to modify the input data sets, he can suspend the execution of the PS-WF. In this case all the already submitted e-WFs go into rescue state and the user can modify any input data. After modifying the input data set, the user can make the suspended e-WFs resume their work, utilizing the rescue mechanism of Condor DAGMan. Using the same mechanism even the graph structure could be modified by the user. This facility is not implemented yet and requires further research to ensure the consistency of all the e-WFs when the PS-WF has been changed.

6. Related research

There are two main streams of research that have a strong relationship with our work.

Research on scientific workflows in Grid environments is an exciting and richly investigated subject. A good survey of this field has been written by Yu and Buyya [17], and a whole special issue was recently devoted to the subject in the Journal of Grid Computing [18]. These papers clearly show that, while significant efforts are being made to increase the usability of scientific workflows in Grid environments from many aspects (workflow composition, scheduling, performance estimation, fault tolerance, etc.), there has been little work on how to define and manage parameter studies at the workflow level.
The second main direction relevant to our investigation is research on parameter studies. Some of the existing tools like Condor [1], UNICORE [19] or AppLeS (Application-Level Scheduler) [2] can be used to launch pre-existing parameter studies on distributed resources. There are several important projects specifically aiming at the realization of parameter studies in Grid environments, but most of them consider only individual jobs and not workflows [3], [8], [20], [21]. Usually their main concern is the scheduling of jobs in Grids, although [21] concentrates on data management. In many cases they define specialized middleware in order to optimize the scheduling [8] or introduce new scheduling concepts [3]. A good comparison of several of these tools and projects can be found in [22].
There are very few projects that deal with the integration of parameter study and workflow research. VisPortal [23] supports workflow-level parameter study applications, but it is not a general purpose portal; rather, it is a specialized one supporting only rendering applications. It was based on the Grid Portal Development Toolkit, the predecessor of GridSphere, which is used for building the P-GRADE portal. ILab [4] and SEGL [7] are two main projects that provide a graphical workflow concept particularly tailored to support large parameter study applications. Notice that their goal is different from ours: they want to support parameter study applications by a workflow whose components specify the next stage of the parameter study processing activity. As a result their workflow is much more sophisticated than the P-GRADE workflow and its creation requires much more skill. Our goal was to enable existing DAG workflows to be executed as parameter studies, simultaneously exploiting as many Grids and Grid resources as possible.

Conclusions

The workflow concept of the P-GRADE portal has been very successful and popular among Grid users because of its simplicity and expressiveness. Developing and monitoring Grid applications based on the workflow concept of the portal is extremely easy. Due to these advantages we have been asked to set up the portal for many different Grids (OGF GIN VO, EGRID, SwissGrid, Turkish Grid, BalticGrid, BioInfoGrid, CroGrid, Bulgarian Grid, Macedon Grid, etc.); meanwhile it runs as the official portal of several Grids (SEE-GRID, HunGrid, VOCE) and serves other Grids as a volunteer service (UK NGS, GILDA, etc.). The feedback from the users made it clear that they want parameter study support at the workflow level, but in a way that keeps the simplicity and expressiveness of the original workflow concept. Based on their request we have extended the portal with workflow-level parameter study support. The new version of the portal has been prototyped and was publicly demonstrated at the EGEE conference in September 2006. The new version of the portal (version 2.5), which gives full, service-quality support for workflow-level parameter studies, will be released in November 2006.
We have described two algorithms for executing PS-WFs in a multi-Grid environment. The black box algorithm gives an optimal solution concerning the utilization of storage resources, while the PS-labeling algorithm optimizes the utilization of processing resources. Further research is required to create a dynamic and adaptive integration of the two methods, where the portal can dynamically decide which method to use for a particular node of the PS-WF graph. The current execution method of PS-WFs enables the static distribution of nodes among different Grids and different Grid resources if brokers are not available in the connected Grids. If brokers are available, Grid resources can be assigned dynamically, but the Grid assignment is still static. To provide a fully dynamic allocation of Grids and Grid resources, a meta-broker should be developed and connected to the portal and to the Grids. The development of such a broker is the subject of further research in the framework of the EU CoreGRID project.

References
[1] Thain, D., Tannenbaum, T., and Livny, M., Condor and the Grid, in: Fran Berman, Anthony J.G. Hey, Geoffrey Fox, editors, Grid Computing: Making The Global Infrastructure a Reality, John Wiley, ISBN 0-470-85319-0, pp. 299-336, 2003.
[2] Casanova, H., Obertelli, G., Berman, F. and Wolski, R., The AppLeS Parameter Sweep Template: User-Level Middleware for the Grid, Proceedings of the Super Computing (SC 2002) Conference, Dallas, USA, 2002.
[3] Abramson, D., Giddy, J., and Kotler, L., High Performance Parametric Modeling with Nimrod/G: Killer Application for the Global Grid?, IPDPS'2000, Mexico, IEEE CS Press, USA, 2000.
[4] Yarrow, M., McCann, K., Biswas, R. and van der Wijngaart, R., An Advanced User Interface Approach for Complex Parameter Study Process Specification on the Information Power Grid, Proceedings of the 1st Workshop on Grid Computing (GRID 2002), Bangalore, India, December 2000.
[5] Yarrow, M., McCann, K. M., Tejnil, E., and deVivo, A., Production-Level Distributed Parametric Study Capabilities for the Grid, Grid Computing - GRID 2001 Workshop Proceedings, Denver, CO, November 2001.
[6] McCann, K. M., Yarrow, M., deVivo, A. and Mehrotra, P., ScyFlow: An Environment for the Visual Specification and Execution of Scientific Workflows, GGF10 Workshop on Workflow in Grid Systems, Berlin, 2004.
[7] N. Currle-Linde, F. Boes, P. Lindner, J. Pleiss and M.M. Resch, A Management System for Complex Parameter Studies and Experiments in Grid Computing, in: Proc. of the 16th IASTED Intl. Conf. on PDCS (ed.: T. Gonzales), Acta Press, 2004.
[8] Casanova, H. and Berman, F., Parameter Sweeps on the Grid with APST, in: Fran Berman, Anthony J.G. Hey, Geoffrey Fox, editors, Grid Computing: Making The Global Infrastructure a Reality, John Wiley, ISBN 0-470-85319-0, pp. 773-788, 2003.
[9] P. Kacsuk and G. Sipos, Multi-Grid, Multi-User Workflows in the P-GRADE Grid Portal, Journal of Grid Computing, Vol. 3, No. 3-4, pp. 221-238, 2005.
[10] P. Kacsuk, T. Kiss and G. Sipos, Solving the Grid Interoperability Problem by P-GRADE Portal at Workflow Level, Proc. of the Grid-Enabling Legacy Applications and Supporting End User Workshop, in conjunction with HPDC'06, Paris, pp. 3-7, 2005.
[11] T. Delaitre, et al., GEMLCA: Running Legacy Code Applications as Grid Services, Journal of Grid Computing, Vol. 3, No. 1-2, pp. 75-90, 2005.
[12] R. Lovas et al., Dynamic workflows in the service-oriented P-GRADE portal using Grid superscalar, Austrian Grid Symposium, Innsbruck, 2006.
[13] J. Frey, Condor DAGMan: Handling Inter-Job Dependencies, http://www.cs.wisc.edu/condor/dagman/, 2002.
[14] http://www.gridsphere.org/
[15] http://www.lpds.sztaki.hu/pgportal/v23/manual/users_manual/UsersManualReleaseV2.html
[16] A. Kertesz and P. Kacsuk, Grid Meta-Broker Architecture: Towards an Interoperable Grid Resource Brokering Service, CoreGRID Workshop on Grid Middleware in Conjunction with EuroPar'06, Dresden, 2006.
[17] J. Yu and R. Buyya, Taxonomy of Workflow Management Systems for Grid Computing, Journal of Grid Computing, Vol. 3, No. 3-4, pp. 171-200, 2005.
[18] E. Deelman and I. Taylor (guest editors), Special Issue on Scientific Workflows in Grid Environments, Journal of Grid Computing, Vol. 3, No. 3-4, pp. 151-304, 2005.
[19] Erwin, D. (Ed.), Joint Project Report for the BMBF Project UNICORE Plus, Grant Number: 01 IR 001 A-D, Duration: January 2000 to December 2002, ISBN 3-00-011592-7.
[20] Abramson, D., Lewis, A.
and Peachey, T.: Nimrod/O: A Tool for Automatic Design Optimization, The 4th International Conference on Algorithms & Architectures for Parallel Processing (ICA3PP 2000), Hong Kong, 2000.
[21] H.A. James and K.A. Hawick, Scientific Data Management in a Grid Environment, Journal of Grid Computing, Vol. 3, No. 1-2, pp. 39-51, 2005.
[22] A. DeVivo, M. Yarrow, and K. McCann, A comparison of parameter study creation and job submission tools, Technical Report NAS-01-002, NASA Ames Research Center, Moffett Field, CA, 2001.
[23] http://www-vis.lbl.gov/Publications/2004/LBNL-PUB-893Visportal.pdf#search=%22%22parameter%20study%22%20AND%20%22Grid%20portal%22%22

The Java CoG Kit Experiment Manager

Gregor von Laszewski (1,2), Phillip Zimny (3,1), Tan Trieu (4,1), David Angulo (5,1)

1 Argonne National Laboratory, 9700 S. Cass Ave., Argonne, IL 60440
2 University of Chicago, Computation Institute, Research Institutes Building #402, 5640 South Ellis Ave., Chicago, IL 60637-1433
3 PHILS INSTITUITION
4 Santa Clara University, Department of Computer Engineering, 500 El Camino Real, Santa Clara, CA 95053
5 DePaul University, DAVES ADDRESS

Abstract

In this paper we introduce a framework for experiment management that simplifies the users' interaction with grid environments by managing a large number of tasks to be conducted as part of an experiment by the individual scientist. Our framework is an extension to the Java CoG Kit. We have developed a client-server approach that allows us to utilize the grid task abstraction of the Java CoG Kit and expose it easily to the experimentalist. Similar to the definition of standard output and standard error, we have defined a standard status that allows us to conduct application status notifications. We have tested our tool with a large number of long running experiments and show its usability in practical applications such as bioinformatics.

1 Introduction

Grid computing addresses the challenge of coordinating resource sharing and problem solving in dynamic, multi-institutional virtual organizations [?]. The analogy between the computational grid and the power grid highlights the emphasis on virtualization. When a user plugs an appliance into the power outlet, he/she expects the delivery of power without concern for the whereabouts of the power source. Just as the electric power grid allows pervasive access to electric power, computational grids provide pervasive access to compute-related resources and services [1]. The Grid's focus on integrating heterogeneous, distributed resources for the purpose of high performance computing differentiates it from other technologies such as cluster computing and the Web. The Grid's ability to virtualize a collection of disparate resources to solve problems promises effortless collaboration among the scientific communities.

The construction of the Grid requires the establishment of standards for a secure and robust infrastructure. One such undertaking is the definition of the Open Grid Services Architecture (OGSA), which provides a specification for a standard service-oriented Grid framework [1]. The implementation of the services forms the Grid middleware, and the Globus Toolkit is today's de facto standard Grid middleware [2]. The toolkit provides an elementary set of facilities to handle security, communication, information, resource management, and data management services [1]. However, this set of services may not be compatible with the commodity technologies that Grid application developers use. The Commodity Grid project addresses the incompatibility by creating Commodity Grid (CoG) Kits that define mappings and interfaces between Grid services and particular commodity frameworks such as Java, Perl, and Python [1]. The Java CoG Kit provides more than just a mapping between Java and the Globus Toolkit: it bridges the Java commodity framework and Grid technology. This means it not only defines a set of convenient classes that provide the Java programmer with access to basic Grid services [3], but also integrates a number of sophisticated abstractions, one of which is a workflow system [4]. Hence, it provides a significant feature enhancement to existing Grid middleware [1].

A popular use of the Grid is motivated by the field of bioinformatics, where applications such as Grid-enabled Blast [?] are used to compare base or amino acid sequences registered in a database with sequences provided by the user [?]. Blast runs can generate numerous queries that require hours or even days to complete. Managing such studies requires that scientists maintain the status and outputs of the individual queries, distancing them from the experiment at hand and burdening them with the tedious task of checking for job status and output. In an effort to relieve the scientist from the drudgery of managing output data and provide the scientist with a tool to monitor the progress of his/her jobs, we introduce the concept of an experiment. An experiment can be defined as tasks that are executed on the Grid with their associated output stored in a user-defined location. In this paper we show that the Java CoG Kit is ideally suited to support such a high level service. Using the facilities provided by the Java CoG Kit, we create a user driven experiment management system to simplify the administration and execution of repetitive tasks that use similar parameters.

The user driven experiment management tool combines features of several tools to empower the novice Grid user. It includes features typically found in queuing systems, shells with history, and process monitoring programs such as the well known UNIX ps command. Naturally it is targeted to include specific enhancements for the Grid environment. To emphasize the similarities, let us revisit a typical use case of a user using a UNIX command shell. A user working in the UNIX command shell queries which jobs have been submitted via the history function. The status of a process, a running instance of a program, can be obtained by issuing the ps command. The output provides information such as the process ID, current status, the cumulated CPU time, and executable name. Our experiment management system provides a similar interface, displaying the added experiments in a format that includes the experiment ID, current status, cumulative time the experiment has been queued, and experiment name. However, in extension to the normal command and history management tools of shells, we must integrate user-accessible output and error files on a command-by-command basis.

A preliminary version of the execution manager has been available for years as part of the Java CoG Kit under the name Grid Command Manager (GCM). However, we have enhanced its functionality significantly. The enhancements include experiment status checkpointing, management support for a large number of experiment submissions, and the integration of fault tolerant queues for managing experiment submissions.

The rest of the paper is structured as follows. First, we revisit our requirements that lead us to a redesign of the Grid Command Manager for experiment support. This includes the presentation of a use case for our framework. Next we describe the architecture that fulfills our requirements. We then describe the implementation and present preliminary performance results. We conclude the paper with our thoughts on future work to be conducted.

2 Requirements

The experiment management system has several major requirements, including automated experiment checkpointing, transparent output management, automated version control, metadata management, detailed status reporting, persistent experiment sessions, and scalable experiment status updating. Next we will discuss each of the requirements in more detail.

Automated Checkpointing. A basic assumption that the experiment management system makes about experiments is that they are non-interactive, long running jobs. With long running experiments, the expectation that the host requesting the remote resource maintains an uninterrupted connection with the remote resource is impractical. From this stems the requirement that checkpointing, or saving the state, of an experiment must be a transparent process, so that users do not have to associate experiments with checkpoint files. Instead, once a user submits an experiment he/she must only associate the experiment with its name in order to track its status. To address the overhead of maintaining a persistent connection, the Java CoG Kit abstractions module provides a checkpointing mechanism that enables users to reconnect to a submitted job at a later time.

Transparent Output Management. To shield the user from details about the Grid, the standard error (stderr) and standard output (stdout) are automatically saved in a predetermined experiment path location, to prevent the impression that stdout and stderr have vanished because they reside on the remote execution host or because the experiment has been duplicated (see also Version Control). Such functionality provides the illusion of localized computing while using the Grid.

Version Control. Storage of output files leads to the next requirement of output version control. When an experiment is submitted more than once, the output from its previous runs needs to be stored and accessible for comparison to its future runs. An automated version control system sequentially names versions of output. This takes the responsibilities of renaming, moving, and organizing different versions of output away from the scientist.

Metadata Management. A scientist often has additional information about an experiment that needs to be managed. Such information includes the authors of the experiment, the date of the experiment, and other information pertinent for organizing and documenting an experiment. Hence, an additional requirement is to provide a system to automatically maintain the metadata of each experiment. This system must allow for easy entry of an experiment's metadata as well as allow for changes to be made. It will allow the scientist to reference more than just the output to uniquely identify each experiment.

Application Status Reporting. [why we have done this] Besides retrieving stdout and stderr, we believe that users will benefit from application status reporting. Similar to stdout, we introduce a standard status (stdstatus) that can be used to report more detailed experiment states as well as application specific information (this is an implementation detail). When checking the status of an experiment that reached the failed state, the user may wonder what triggered or caused the failure. The standard status provides this service by trapping signals that interrupted the job. The user can then query the standard status to review the events that occurred before the failure. The use of the standard status goes beyond error reporting; it provides a simple technique for runtime application status notification. For experiments that take days to complete, knowing that the experiment is running is often inadequate. The standard status provides a mechanism for application developers to expose a more detailed record of the application's progress during execution.

Persistent Experiment Sessions. The ability to load information about previous experiments when restarting the experiment manager is an important maintenance tool. In case the experiment manager abruptly shuts down or if the user has multiple instances of the experiment manager running, persistence enables the user to maintain sessions.

Scalable Experiment Status Updating. With persistent sessions, the number of experiments within a session can grow quite large. The task of updating the status of such a large number of experiments can consume a disproportionate amount of computing resources on the client machine. The experiment management system thus needs to simulate thread execution when updating the status of each experiment, rather than creating a thread dedicated to status updating for each experiment.

3 Use Case

[focus on biology case; derive requirements] In this section we describe a scenario for the use of the cog-experiment tool.

The Basic Local Alignment Search Tool (BLAST) is one of the most popular tools for searching nucleotide and protein databases. It tests a nucleotide sequence against a database of known sequences and returns similarities. BLAST offers many different types of queries to the database, including nucleotide to nucleotide, protein to nucleotide, protein to protein, nucleotide to protein, and many more. Biologists use this tool to help them discover the identity of the sequence they are studying or to identify its function by comparing it to similar sequences BLAST finds.

A biologist may find themselves in a scenario where they wish to research a specific genetic sequence by making 500 slight modifications to the sequence, running each through BLAST, and seeing whether the modification produces characteristics similar to those of other sequences. The biologist would have to start by running the original sequence and then run each of the 500 modifications. At the beginning of each submission of the BLAST run, the biologist would have to name the task himself and then keep track of the 500 different names. Upon the completion of the BLAST run the biologist would then have to move his output files into separate directories, which they would have to name and then remember.

Once the biologist has completed his 500 BLAST runs, he now has 500 different outputs to manage. The biologist most likely does not wish to devise a method of organizing all of the different output they have created. Even once organized, there is very little additional data associated with the outputs that would allow the biologist to search through the output files.

This research method is inefficient and time consuming, and not the optimum way a biologist would like to conduct their research. The cog-experiment tool offers a way to make this process not only much simpler but also much more efficient. This tool allows the biologist to set up the submission process to repeat itself as many times as needed, in this case 500, while making a slight modification to the submission parameters each time. For this example, the modification to the parameters would be the file name in which each modified sequence was stored. The submission process no longer requires the biologist's presence. If these submissions took long periods of time to complete, such as several days, the cog-experiment tool would also checkpoint progress on the completion of a submitted task so the researcher would not have to start the task over from the beginning.

In addition to offering a better submission procedure to the biologist, the cog-experiment tool simplifies file management for the biologist. The biologist may pick one name and the cog-experiment tool will automatically assign a sequential version number to each submission. Starting with one, the biologist now has an easy to understand versioning scheme. This auto-naming feature takes away any worries about overwriting data or forgetting the names used for submission. Once each submission has been automatically named, a folder of the same name is created to store that individual submission's files in the user's experiment path location. Now the biologist has all of their submission files properly named and the output files neatly stored in individual folders.

The biologist can use the option to enter metadata about the submissions on an individual level. Information about the specific submission such as author, time, or other notes can be saved to persistent storage. When the biologist does this, it allows them to search through all the outputs. For example, if the biologist wishes to view all the submissions from a certain day, they can simply search for that date and all of the submissions from that date are displayed on the screen.
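The batch-submission part of this scenario can be sketched as follows. This is only an illustration of the workflow the use case describes, not the cog-experiment tool itself; the submit_experiment function, the experiment name, and the file naming pattern are hypothetical placeholders, while the blastall options shown follow the legacy NCBI command line.

```python
def run_blast_study(submit_experiment, num_variants=500):
    """Submit the original sequence plus each modified sequence as one experiment.
    The tool's auto-naming assigns sequential version numbers, so a single base
    name can be reused for every submission."""
    base_name = "blast_sequence_study"            # hypothetical experiment name
    sequence_files = ["original.fasta"] + [
        f"modified_{i}.fasta" for i in range(1, num_variants + 1)
    ]
    for seq_file in sequence_files:
        # each submission differs only in the input-file parameter
        submit_experiment(name=base_name, executable="blastall",
                          arguments=["-p", "blastn", "-d", "nr", "-i", seq_file])

# Stand-in submitter that just prints what would be submitted.
run_blast_study(lambda **kw: print("submit", kw), num_variants=3)
```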

4 Architecture

The architecture of the experiment management system integrates with the Java CoG Kit's layered approach. The experiment management system is a module that reuses the abstractions layer while exposing a command line tool. The abstractions layer provides high level abstractions that include Grid tasks, transfers, jobs, and queues that make developing Grid programs easier [4].

The experiment management system consists of two primary components, an experiment manager and a command line component. These components communicate via a socket, with the experiment manager running as a background job that services requests from the command line component to add, remove, submit, list, and retrieve the status of experiments. Figure 1 depicts the architecture of the experiment management system. The heart of the system is the experiment manager component, which maintains experiment status with a set of four queues: pending, submitted, completed, and failed. An experiment's transition through the queueing system is illustrated by the state diagram in Figure 2. The user has control over two state transitions: adding an experiment to the pending queue, and performing a local submit to move the experiment to the submitted queue. The rest of the state transitions are handled by a background thread, which periodically updates the status of the queued experiments.

Figure 1: The architecture of the Java CoG Kit experiment management framework. (The figure shows the cog-experiment command line interface, the checkpointing experiment manager, the pending, submitted, completed, and failed queues, the experiments repository, and the Grid infrastructure, connected by read/write links.)

The experiment manager uses persistent storage to provide automated experiment checkpointing, transparent output management, and persistent experiment sessions. The automated experiment checkpointing and transparent output management functions rely on an experiment repository to store the checkpoint files for each submitted experiment and to save the stdout, stderr, and stdstatus resulting from an experiment. To provide the persistent experiment session function, the experiment manager periodically checkpoints the status of the four queues to persistent storage, from where the status of the four queues can be reloaded when the system is restarted.

The cog-experiment command line interface component provides access to the experiment manager functions to add, remove, submit, and retrieve the status of experiments. The command line interface also provides other functions such as metadata maintenance and version control, and thus requires read and write access to the experiments repository.

Figure 2: State diagram of an experiment as it transitions through the four queues. (The states shown are Pending, Submitted, Running, Failed, and Completed, with transitions labeled Add, Local Submit, Grid Submit, Local Submission Failure, Grid Submission Failure, Runtime Failure, Completion, and Completion with Failure, from Begin to End.)

[figures changes: class relationships - fill out page - redraw arch w/ lines]
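A minimal sketch of the queue transitions shown in Figure 2 follows. It illustrates the four-queue design described in the architecture, not the Java implementation; the poll_grid_status callback is a hypothetical placeholder for the Java CoG Kit task status query, and running experiments are kept in the submitted queue while their status is reported separately.

```python
from enum import Enum

class Queue(Enum):
    PENDING = "pending"
    SUBMITTED = "submitted"
    COMPLETED = "completed"
    FAILED = "failed"

class ExperimentManager:
    """Maintains the four queues of the architecture; a background thread is
    expected to call update_status() periodically, as described in the text."""
    def __init__(self, poll_grid_status):
        self.queues = {q: [] for q in Queue}
        # hypothetical callback returning "running", "completed", or "failed"
        self.poll_grid_status = poll_grid_status

    def add(self, exp):                      # user-controlled: add to pending queue
        self.queues[Queue.PENDING].append(exp)

    def local_submit(self, exp):             # user-controlled: pending -> submitted
        self.queues[Queue.PENDING].remove(exp)
        self.queues[Queue.SUBMITTED].append(exp)

    def update_status(self):                 # remaining transitions are automatic
        for exp in list(self.queues[Queue.SUBMITTED]):
            status = self.poll_grid_status(exp)
            if status in ("completed", "failed"):
                self.queues[Queue.SUBMITTED].remove(exp)
                self.queues[Queue[status.upper()]].append(exp)
```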

5 Implementation

The implementation of the experiment management system is split into the client and server components.

5.1 Client

[make sure that client and server are clearly separated in different subsections] The experiment manager client is implemented using an argument parser to retrieve the user's desired function. Once the user has entered a command, the experiment manager client parses it into its separate components to determine which method to call. The experiment manager client's methods use instances of the experiment metadata implementation and the experiment output manager. These two classes both use instances of the experiment data manager to help organize an experiment's metadata. [clarity on client/server]

The experiment data manager is designed to hold all of the metadata of an experiment in one object. This class stores the separate pieces of metadata in individual strings. The only methods the experiment data manager class contains are simple get and set methods to set and retrieve information within the class.

The experiment metadata implementation class is implemented using a variety of technologies. It uses JPanel from the Swing package to construct the graphical user interface that is presented to the user for entering the metadata of their experiment. This interface retrieves the metadata and sends it to be saved to persistent storage. The metadata is organized by being placed into an instance of the experiment metadata manager class. This instance is then written to an XML file using the XStream package [?]. The metadata is stored in the same location as the experiment's stderr and stdout, in a directory that shares the same name as the experiment it is associated with. The name of the experiment is ultimately determined not by the user but by the auto-naming method inside the experiment metadata implementation class. This method automatically increments an integer that is attached to the end of the experiment name the user enters. To keep track of all of the experiments that have been created, the experiment metadata implementation uses a vector that contains the string names of all of the experiments that have been added. This vector is stored to persistent storage in XML format using XStream in a file entitled versions.xml. This file is referenced when the auto-naming method determines the correct number to attach to the end of the name entered by the user (a sketch of this auto-naming scheme is given below).

The experiment output manager is designed to handle the user's query based commands. It uses XStream to load the vector stored in versions.xml as well as other saved instances of the experiment data manager class. Once the necessary information has been loaded, the experiment output manager conducts the specified query through the data and returns the results.
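The auto-naming and versioning behavior described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions rather than the actual Java/XStream code: the on-disk registry is simplified to a plain text list instead of the XStream-produced versions.xml, and the directory layout is an assumption based on the description (one directory per auto-named submission under the experiment path).

```python
from pathlib import Path

def next_versioned_name(experiment_path, base_name):
    """Append an incrementing integer to the user-chosen name, using a simple
    'versions.txt' registry in place of the XStream-serialized versions.xml."""
    registry = Path(experiment_path) / "versions.txt"
    known = registry.read_text().split() if registry.exists() else []
    count = sum(1 for name in known if name.startswith(base_name + "_"))
    new_name = f"{base_name}_{count + 1}"          # version numbers start at one
    registry.parent.mkdir(parents=True, exist_ok=True)
    with registry.open("a") as f:
        f.write(new_name + "\n")
    # each submission gets a directory of the same name for stdout/stderr/metadata
    (Path(experiment_path) / new_name).mkdir(parents=True, exist_ok=True)
    return new_name

# Example: repeated submissions under one base name get sequential versions.
for _ in range(3):
    print(next_versioned_name("/tmp/experiments", "blast_sequence_study"))
```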

5.2 Server

The server consists of four threads of execution: the main thread listens for and responds to requests from the client, one thread performs intermittent checkpointing of the queues, and another two threads update the status of submitted experiments.

The main thread instantiates the experiment manager class that is responsible for providing all the necessary methods that expose the interface through which the client communicates with the server. Once the checkpointed queues have been loaded from the experiment path, two new threads are spawned; they are responsible for updating the status of experiments in the submitted queue. The number of threads and their polling intervals can be adjusted for performance fine-tuning. The choice of using a couple of threads to monitor experiment status allows the experiment management system to provide reasonable response time while restricting resource consumption. These threads also automate the retrieval of any output associated with an experiment. Because they can detect an experiment's change in state, these threads update the standard output, error, and status only when necessary, conserving computational resources.

The final thread initiated during the experiment manager startup process saves the states of the four status queues at a configurable polling interval. This functionality addresses the possibility of an abrupt interruption preventing the experiment manager from halting gracefully: the checkpointed queues ensure that the experiment management system can be relaunched with minimal loss.

The server uses four levels of class containment, and Figure 3 illustrates the containment relationship. At the root of the containment relationship is the ExperimentManager class that provides the server-side functions to respond to the commands that the client issues. The commands supported by the ExperimentManager class include add, submit, list, status, and stop. The ExperimentManager class implements the functions based on the four queues that it maintains: pending, submitted, completed, and failed. The queues consist of objects that hold an experiment data structure along with the experiment's id and a dependencies list. The Experiment class is the core data structure mediating communication between the experiment management system and the Grid. The class exposes an interface that simplifies the experiment submission process and incorporates enhanced status reporting through the standard status.

Figure 3: Containment relationship among the primary classes used to implement the experiment manager server. (The classes shown are ExperimentManagerServer, ExperimentManager, ExperimentStatusQueue, ExperimentQueueObject with its id and dependencies, and Experiment.)

The standard status is simply a file with entries describing the current status of an experiment that is written to the working directory where the executable program is invoked. The standardized format of each entry, as shown in Figure 4, permits customized status reporting. However, the applications used must be rebuilt to append additional information about their execution state to the standard status. The standard status, as currently implemented, reports the latter three states of the experiment state diagram: running, completed, and failed. It also supports more detailed error detection by reporting trapped signals that otherwise would have vanished on the remote host.

#CoG: :