On Interoperability: the Execution Management Perspective Based on ChinaGrid Support Platform*

Yongwei Wu, Likun Liu, Weimin Zheng, Feng He
Department of Computer Science and Technology; Tsinghua National Laboratory for Information Science and Technology
Tsinghua University, Beijing 100084, China
{wuyw, zwm}@tsinghua.edu.cn {reality, hefeng}@chinagrid.edu.cn

Abstract
Interoperability between different Grid implementations is attracting more and more attention as a means to seamless job execution. In this paper, we propose an approach to implementing basic interoperability between heterogeneous Grids, based on the Grid abstractions provided by the ChinaGrid Support Platform (CGSP). We first abstract the minimal set of services and functionalities required for seamless job execution. We then describe the problems encountered, and the solutions adopted, when integrating CGSP with other OGSA-compliant Grids, and address the specific interoperability issues of job submission and job execution management, data management and staging, resource representation and discovery, and security and identity management. Finally, GPE4CGSP, which integrates CGSP with the Grid Programming Environment (GPE), is presented as a case study to verify the feasibility of the approach.

1. Introduction
Interoperability is the ability of two or more systems or components to exchange information (syntactic interoperability) and to use the information that has been exchanged (semantic interoperability). Implementing interoperability allows applications to be deployed across Grid infrastructures, is a natural extension of each Grid's efforts to include more resources in its own infrastructure, and enables users to use the Grids transparently. However, implementing interoperability among all kinds of heterogeneous Grids is very difficult, and no universal solution has been designed yet. Two of the most important difficulties are the following.

Lack of Grid standards. Standards are extremely important for Grids because they enable interaction between heterogeneous resources, services and solutions. Because Grid standards have developed more slowly than the Grids themselves, many significant Grids that were developed and deployed before the standards came out do not support the new standards.

Heterogeneity and variety of Grid implementations. Most of the current standards are defined at the level of functionality and architecture rather than at the level of interface and implementation. Different people design and implement Grids according to their own understanding of the standards, which results in a variety of Grids.

As Grids develop, interoperability between different Grids becomes more and more important. A global Grid environment is becoming possible thanks to the development of Grid standards and technologies. In such an environment, all Grids conform to the OGSA [2] standard, whose aim is to provide an extensible, manageable and dynamic framework to support the global Grid. The most important steps towards a seamless Grid environment are the WSRF standards, which define conventions for managing the state of resources across multiple Web service interactions within the context of established Web services, and the JSDL standard, which enables users to submit jobs to a Grid through a uniform interface.

In this paper, we propose an approach to implementing basic interoperability between heterogeneous Grids based on the ChinaGrid Support Platform (CGSP) [1], by abstracting a minimal set of services and functionalities required for seamless job execution. We also describe the interoperability problems encountered, and the solutions adopted, when integrating CGSP with other OGSA-compliant Grids.

This paper is organized as follows. Section 2 discusses related work, followed by an overview of CGSP in Section 3. Section 4 describes in detail the key concepts of our approach to interoperability based on CGSP. Section 5 demonstrates the feasibility of the approach by showcasing GPE4CGSP, which integrates CGSP and GPE [3]. Our conclusions and future work are given in Section 6.

* This work is supported by the ChinaGrid project of the Ministry of Education of China, the Natural Science Foundation of China (60373004, 60373005, 90412006, 90412011, 60573110, 90612016), and the National Key Basic Research Project of China (2004CB318000, 2003CB316907).

2. Related work
Several efforts have addressed interoperability between heterogeneous Grids.

The Grid Interoperability Project (GRIP) [4] is an EU-funded project, carried out in collaboration with Argonne National Laboratory, that allows Globus resources to be used from a UNICORE client and UNICORE-managed resources to be accessed from Globus. It bridges UNICORE and Globus by means of an interoperability layer with two main functions: translating UNICORE requests for job submission, job monitoring and output retrieval into the corresponding Globus constructs, and mapping permanent UNICORE certificates to a Globus user proxy certificate.

Interoperability between Condor and OGSA is addressed by Clovis Chapman et al. [5]. A Condor service for the global Grid is proposed with the aim of bringing Condor in line with advances in Grid computing and providing the Grid community with a mature suite of high-throughput computing job and resource management services. The work identifies mappings between elements of the OGSA and Condor infrastructures, points out potential areas of conflict, and defines a set of complementary architectural options by which individual Condor services can be exposed as OGSA Grid services, in order to integrate Condor resources seamlessly into a standardized Grid environment [5].

AliEn-EDG interoperability [6], whose goal is to exploit upcoming Grid resources to run AliEn-managed jobs and store the produced data in EDG, uses a resource-broker-based architecture in which an interface machine acts as the interface node between the two systems and provides a single resource view for both Grids.

WorldGrid is an intercontinental testbed that runs a given set of core services and several optional collective services, shares the same authentication and authorization mechanism, and uses a common schema for resource location and status in the context of a Virtual Organization. The WorldGrid testbed was successfully demonstrated at SuperComputing 2002 and IST2002 with real HEP applications [7].

3. Overview of CGSP
ChinaGrid Support Platform (CGSP) is a Grid middleware developed for the construction of ChinaGrid. Funded by the Ministry of Education of China, ChinaGrid aims to build a public service system for Chinese education and research by exploiting the various resources available on the existing, well-developed Internet infrastructure CERNET (China Education and Research Network) [8]. In contrast to the bottom-up approach of Globus [9], [10], which starts by making the basic services compliant, CGSP takes a top-down, application-driven approach. It provides a consistent view and an almost universal level of functionality of the underlying Grid resources by exposing a collection of WSRF-compliant services.

Figure 1 outlines the CGSP architecture in a logical view. The components of CGSP are the Job Manager, Information Center, Data Manager and Data Service, General Running Service, Workflow Engine and Domain Manager [11]. A user-friendly Portal and a Grid Parallel Programming Interface (GridPPI) are also provided so that users can work with CGSP more conveniently and flexibly.

Job Manager (JM): The Job Manager is responsible for submitting, scheduling, monitoring and controlling jobs launched by end users. It gives applications consistent access to the underlying resources through a uniform interface.

Information Center (IC): The Information Center keeps track of information about all resources in ChinaGrid and connects all domains, all CGSP components and all resources together.

Data Manager (DM) & Data Service (DS): Data management aims at integrating the heterogeneous storage resources distributed in the Grid environment, and at providing a consistent, virtual view of storage resources together with a reliable, high-performance data transfer mechanism.

Figure 1. CGSP Architecture & job executing flow

General Running Service (GRS): GRS is responsible for executing programs with specific execution environment requirements. It hides the heterogeneity of the various target systems and provides a consistent view of all kinds of computing resources.

Workflow Engine (WE): The Workflow Engine provides workflow execution, control and monitoring, as well as workflow load balancing.

Domain Manager: The Domain Manager is responsible for identity management and for performing user authentication.

To illustrate how the CGSP modules work together, Figure 1 outlines a general job execution flow in CGSP; the circled numbers indicate the order of the steps.

4. Grid abstractions based on CGSP
The Grid abstractions offered by CGSP have been designed with a top-down approach. Rather than exposing complex Grid functionality to an application in a non-intuitive fashion, we identify the most general and basic functionalities needed by a range of applications and provide appropriate interfaces for them. In the rest of this section we describe the important abstractions and the specific interoperability issues that must be addressed to achieve seamless job execution: job submission and job execution management, data management and staging, resource representation and discovery, and security and identity management.

4.1. Job submission & job execution management

From the execution perspective, interoperability means seamless job execution across all kinds of heterogeneous Grid systems. When attempting such interoperability, however, the difference between a job and a task must be analyzed carefully. A task, which is an abstraction of a computing operation, is an execution of a program with specific data input and output on a single specific computing resource. A job is the atomic unit of Grid execution that can be managed by a job manager. Usually a job consists of more than one task, and the tasks depend on each other; that is, a job is a partially ordered set of tasks.

Global job submission using JSDL. Since different Grid systems usually use different job submission languages and provide different interfaces and utilities for job definition and submission, the user must know in which Grid the job will be executed and must state that explicitly, which conflicts with the transparency of the Grid. Our approach to this problem is to use a standardized job submission language. In CGSP we choose JSDL [12], an XML-based language for describing the requirements of computational jobs for submission to resources, particularly in Grid environments. It is developed by GGF's JSDL working group [13] and is expected to become the standard language for Grid middleware systems.
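The job model above, a partially ordered set of tasks, can be illustrated with a minimal sketch. The task names and the dictionary layout are hypothetical and are not part of CGSP; the sketch only shows that any execution order must respect the dependencies, while independent tasks may run in either order. It uses `graphlib` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical job: each task maps to the set of tasks it depends on.
# "preprocess_a" and "preprocess_b" are independent of each other,
# so the job is only partially ordered.
job = {
    "preprocess_a": set(),
    "preprocess_b": set(),
    "compute": {"preprocess_a", "preprocess_b"},
    "collect": {"compute"},
}


def execution_order(tasks):
    """Return one task order that respects every dependency."""
    return list(TopologicalSorter(tasks).static_order())


order = execution_order(job)
print(order)
```

A job manager is free to pick any order that satisfies the constraints, or to run the two preprocessing tasks in parallel.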

To enable existing Grid systems that lack JSDL support to accept JSDL without significant changes, a conversion must map a JSDL job submission to the native job submission accepted by the target Grid. The conversion should be performed by an adapter so that the existing Grid remains unchanged. Another abstraction provided by CGSP is the submission task. From a functional view, a task is an execution of a program with specific data on a computing resource. We can therefore treat the submission of a job as a normal task, namely an execution of the submission program. Since GRS can execute a program in a separate environment exactly as the user specified, the interoperability problem is reduced to configuring the environment for the submission program.

Job representation & management. A job can be defined as a dynamic resource with a life cycle and a specific status that can be changed by operations of the associated services. The fundamental properties of a job resource that must be supported by all Grids are its ID, job status, user credential, and the endpoint address of the backend service. A job is identified by its resource ID, which is assigned by the JobManager and returned via an EndpointReference (EPR) in the response to the submission request. To stay general and provide the most capability, CGSP abstracts the minimal set of job control interfaces that any job management system of any Grid should provide, considering the core needs of interoperability. This fundamental set of job control interfaces includes start, abort, suspend, resume, and getStatus.

The two abstractions above allow Grids to manage jobs in a consistent way at an abstract level. To achieve basic interoperability, a Grid must support these abstractions.
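The minimal job control interface can be sketched as a small state machine. This is an illustration under assumptions, not the CGSP implementation: the paper names the operations (start, abort, suspend, resume, getStatus) and the job properties, but the status values and the allowed transitions below are hypothetical.

```python
from enum import Enum


class JobStatus(Enum):
    # Hypothetical status values; the paper does not enumerate them.
    PENDING = "pending"
    RUNNING = "running"
    SUSPENDED = "suspended"
    ABORTED = "aborted"


# operation -> {status before: status after}; assumed transitions.
TRANSITIONS = {
    "start": {JobStatus.PENDING: JobStatus.RUNNING},
    "suspend": {JobStatus.RUNNING: JobStatus.SUSPENDED},
    "resume": {JobStatus.SUSPENDED: JobStatus.RUNNING},
    "abort": {
        status: JobStatus.ABORTED
        for status in (JobStatus.PENDING, JobStatus.RUNNING, JobStatus.SUSPENDED)
    },
}


class JobResource:
    """Job resource carrying the properties listed in the text."""

    def __init__(self, job_id, credential, backend_epr):
        self.job_id = job_id            # resource ID assigned by the job manager
        self.credential = credential    # user credential
        self.backend_epr = backend_epr  # endpoint address of the backend service
        self.status = JobStatus.PENDING

    def _apply(self, operation):
        allowed = TRANSITIONS[operation]
        if self.status not in allowed:
            raise ValueError(f"cannot {operation} a job that is {self.status.value}")
        self.status = allowed[self.status]

    def start(self):
        self._apply("start")

    def suspend(self):
        self._apply("suspend")

    def resume(self):
        self._apply("resume")

    def abort(self):
        self._apply("abort")

    def get_status(self):
        return self.status
```

An adapter for another Grid would implement the same five operations on top of that Grid's native job manager, so that clients never see the difference.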

4.2. Storage management & data staging
Data management is a basic component of any Grid. It integrates the heterogeneous storage resources distributed in the Grid environment, provides a consistent view of storage resources, and offers a location-independent representation of files. Its two basic functionalities are storage management and reliable data transfer.

Storage management. The CGSP data space is designed as a Grid file system that provides almost all basic file system interfaces through WSRF-compliant services. The interfaces that must be provided include ListDirectory, ListProperties, Copy, CreateDirectory, Delete, and ChangePermissions. In addition, ImportFile and ExportFile interfaces are required so that users can import data into, and export data from, the Grid data space.

Data staging. The CGSP data service supports two transfer modes: GridFTP-based third-party file transfer, and traditional client/server (C/S) downloading and uploading. Third-party file transfer provides parallel, efficient and reliable data transfer, but it requires a GridFTP server to be installed on both the computing resource and the data server, which C/S mode data transfer does not.
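To make the storage interface concrete, the following in-memory sketch mimics a subset of the operations named above (ListDirectory, CreateDirectory, Copy, Delete, ImportFile, ExportFile). The class, its method names and its behavior are assumptions for illustration only; the real CGSP data space exposes these operations as WSRF services, and ListProperties and ChangePermissions are omitted here.

```python
import posixpath


class DataSpace:
    """Toy stand-in for a Grid data space (hypothetical, in memory)."""

    def __init__(self):
        self._dirs = {"/"}
        self._files = {}  # path -> file content as bytes

    def create_directory(self, path):
        path = path.rstrip("/")
        if posixpath.dirname(path) not in self._dirs:
            raise FileNotFoundError(posixpath.dirname(path))
        self._dirs.add(path)

    def import_file(self, path, data):
        """Bring client data into the data space."""
        if posixpath.dirname(path) not in self._dirs:
            raise FileNotFoundError(posixpath.dirname(path))
        self._files[path] = data

    def export_file(self, path):
        """Return the content of a file to the client."""
        return self._files[path]

    def copy(self, source, destination):
        self.import_file(destination, self._files[source])

    def delete(self, path):
        del self._files[path]

    def list_directory(self, path):
        path = path.rstrip("/") or "/"
        entries = self._dirs | set(self._files)
        return sorted(
            p for p in entries if p != path and posixpath.dirname(p) == path
        )


space = DataSpace()
space.create_directory("/data")
space.import_file("/data/in.txt", b"payload")
space.copy("/data/in.txt", "/data/copy.txt")
print(space.list_directory("/data"))
```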

Figure 2. An example of translating between JSDL data staging

From the computer's point of view, data staging is just a normal task: the execution of a data transfer program with the source URL and the destination URL as its arguments. We can therefore unify execution and data transfer under the task model and abstract a data transfer as a Grid task. Because a task can be executed in exactly the environment the user specified, which may differ from the current Grid environment, it can access data in another Grid environment just as a client of that Grid would. Interoperability problems in data management can be avoided with this approach. Figure 2 shows an example of translating between JSDL data staging and GRS tasks of a job.
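The translation from JSDL data staging to transfer tasks can be sketched as follows. This is a minimal illustration, not the CGSP adapter: the namespace URI is the one used by the final JSDL 1.0 specification (the draft cited in this paper may differ), the example URLs are made up, and the `transfer` program name and the task layout (a plain argument list) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the final JSDL 1.0 specification (assumed here).
NS = {"jsdl": "http://schemas.ggf.org/jsdl/2005/11/jsdl"}

JSDL_DOC = """<jsdl:JobDefinition xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl">
  <jsdl:JobDescription>
    <jsdl:DataStaging>
      <jsdl:FileName>input.dat</jsdl:FileName>
      <jsdl:Source><jsdl:URI>gsiftp://data.example.org/in/input.dat</jsdl:URI></jsdl:Source>
    </jsdl:DataStaging>
    <jsdl:DataStaging>
      <jsdl:FileName>result.dat</jsdl:FileName>
      <jsdl:Target><jsdl:URI>gsiftp://data.example.org/out/result.dat</jsdl:URI></jsdl:Target>
    </jsdl:DataStaging>
  </jsdl:JobDescription>
</jsdl:JobDefinition>"""


def staging_to_tasks(jsdl_text, transfer_program="transfer"):
    """Turn each DataStaging element into a transfer task.

    A task is modelled as [program, source, destination]. Elements with a
    Source become stage-in tasks; elements with a Target become stage-out tasks.
    """
    root = ET.fromstring(jsdl_text)
    stage_in, stage_out = [], []
    for staging in root.iterfind(".//jsdl:DataStaging", NS):
        name = staging.findtext("jsdl:FileName", namespaces=NS)
        source = staging.findtext("jsdl:Source/jsdl:URI", namespaces=NS)
        target = staging.findtext("jsdl:Target/jsdl:URI", namespaces=NS)
        if source:
            stage_in.append([transfer_program, source, name])
        if target:
            stage_out.append([transfer_program, name, target])
    return stage_in, stage_out


stage_in, stage_out = staging_to_tasks(JSDL_DOC)
print(stage_in)
print(stage_out)
```

The stage-in tasks would run before the application task and the stage-out tasks after it, which fits the partial-order job model of Section 4.1.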

4.3. Resource representation & resource discovery

In CGSP, resource information is managed by the Information Center (IC). To support all kinds of resources, the IC is designed as a large XML database that users query with XPath; the query result is returned as a list of XML elements. This design makes the IC scalable and able to store almost any information. The ICs of different domains can be integrated to provide a global view of a collection of domains.

The two most important attributes of an information element are its ID, which identifies the information, and its expiration time, after which the information is removed from the Information Center. The IC guarantees neither the correctness of the information nor the consistency between information elements. If a resource crashes abnormally and its information is not unregistered correctly, the information is kept until it expires. All components that depend on the Information Center must therefore tolerate stale or incorrect information. This design keeps the Information Center simple, makes the whole system robust, and prevents an Information Center crash from bringing down the whole system.

To implement interoperability of resource discovery, solutions to the following key aspects had to be developed:
The resources of the target Grid must register with one of the CGSP Information Centers in a format that CGSP accepts.
The target Grid should accept, and be able to process, CGSP's information format.

Since CGSP's information model is XML-based and queried with XPath, it is not difficult for a Grid to obtain the information it needs from a CGSP Information Center. Some conversion is still needed to synchronize the information centers of interoperable Grids, for which an information proxy is recommended.
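The behavior described in this section (XML storage, XPath queries, and removal after an expiration time) can be sketched in a few lines. The class below is a hypothetical toy, not the CGSP Information Center; the element and attribute names are invented, and `xml.etree.ElementTree` supports only a subset of XPath.

```python
import time
import xml.etree.ElementTree as ET


class InformationCenter:
    """Toy XML registry with expiring entries (hypothetical schema)."""

    def __init__(self):
        self._root = ET.Element("resources")

    def register(self, resource_id, kind, ttl_seconds, now=None, **properties):
        now = time.time() if now is None else now
        self.unregister(resource_id)  # re-registration replaces the old entry
        element = ET.SubElement(
            self._root,
            "resource",
            id=resource_id,
            kind=kind,
            expires=str(now + ttl_seconds),
        )
        for name, value in properties.items():
            ET.SubElement(element, name).text = str(value)

    def unregister(self, resource_id):
        for element in self._root.findall(f"resource[@id='{resource_id}']"):
            self._root.remove(element)

    def purge_expired(self, now=None):
        now = time.time() if now is None else now
        for element in list(self._root):
            if float(element.get("expires")) <= now:
                self._root.remove(element)

    def query(self, xpath, now=None):
        """Drop expired entries, then return the elements matching the XPath."""
        self.purge_expired(now)
        return self._root.findall(xpath)


ic = InformationCenter()
ic.register("node-1", "compute", ttl_seconds=60, cpus=8)
print([e.get("id") for e in ic.query("resource[@kind='compute']")])
```

A crashed resource that never unregisters simply disappears once its entry expires, which is why clients must be prepared for entries that are still listed but no longer valid.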

4.4. Security and identity management
Security across heterogeneous Grids is always a challenging problem, and no universal solution has been designed yet, since most Grid middleware implementations use different security models. Here we give some suggestions based on the CGSP security model without going into implementation details.

Because the CGSP container is extended from the Globus container, GSI [9] is adopted as the underlying security infrastructure. It provides remote user authentication and credential mapping between global identities and identities local to a CGSP domain. We adopt a group-based authorization policy and a two-layered approach to access control: the container validates permission to access a service, and the service validates permission to access resources.

To implement an interoperable security model in a complex heterogeneous Grid environment, Globus proxy delegation is recommended, so that the user certificate is never exposed to the Grid. Each Grid that wants to interoperate with others must support proxy delegation and credential mapping between global identities (Grid certificates) and its local identities. Authorization is then performed on the basis of the local identities, which should be transparent to other Grids.
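The combination of credential mapping, group-based policy and two-layered access control can be sketched as below. All names, distinguished names and policy tables are hypothetical examples; the sketch does not model GSI, certificates or proxy delegation, only the order of the checks.

```python
# Hypothetical mapping from global identities (certificate subjects)
# to identities local to one domain.
CREDENTIAL_MAP = {
    "/C=CN/O=ChinaGrid/CN=Alice": "alice",
    "/C=CN/O=ChinaGrid/CN=Bob": "bob",
}

# Hypothetical group membership of local identities.
GROUPS = {"alice": {"physics"}, "bob": {"biology"}}

# Layer 1 (container): which groups may call which service.
SERVICE_ACL = {"JobManager": {"physics", "biology"}}

# Layer 2 (service): which groups may use which resource.
RESOURCE_ACL = {"cluster-a": {"physics"}, "cluster-b": {"biology"}}


def map_identity(global_identity):
    """Map a global identity to a local one, or None if unknown."""
    return CREDENTIAL_MAP.get(global_identity)


def authorize(global_identity, service, resource):
    """Apply the container-level check, then the service-level check."""
    local_identity = map_identity(global_identity)
    if local_identity is None:
        return False
    groups = GROUPS.get(local_identity, set())
    if not groups & SERVICE_ACL.get(service, set()):
        return False  # rejected by the container
    return bool(groups & RESOURCE_ACL.get(resource, set()))  # decided by the service


print(authorize("/C=CN/O=ChinaGrid/CN=Alice", "JobManager", "cluster-a"))
```

Because the decision is made on local identities only, another Grid needs to know nothing about these tables; it only has to present a credential that can be mapped.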

5. GPE4CGSP: interoperability between CGSP and GPE
To support the CGSP-based interoperability approach proposed above, we present GPE4CGSP, which implements interoperability between CGSP and GPE, as a case study. Our objective is to show that several Grids can be made interoperable through our approach without significant modification. A brief overview of GPE is given first, followed by a schematic implementation of the interoperability.

5.1. GPE concepts
Grid Programming Environment (GPE) is a Grid project funded by Intel whose purpose is to bring applications to the Grid by establishing a stable interface to the underlying Grid implementations, encapsulating Grid-specific protocols and languages and exposing only stable Web service interfaces to the client side [3].

Figure 3. Overview of GPE Components [3]

By building on standards such as OGSA, WSRF, JSDL, BPEL and CIM, GPE hopes to achieve interoperability. An overview of the components that make up the GPE framework is shown in Figure 3. Two basic concepts of GPE serve the aim of interoperability between heterogeneous Grids.

Target system abstraction. The actual target system is abstracted as a WS-Resource managed by the Target System Service (TSS). Properties of the target system resource, including available hardware, installed software, current workload and available disk space, are registered at the Target System Registry, which keeps track of all target systems.

Access to the target system through atomic Grid services. Basic access to the target system in GPE is provided by a set of atomic WSRF services: TSS, Target System Factory (TSF), Job Management Service (JMS), Storage Management Service (SMS), File Import Service (FIS), and File Export Service (FES). The TSS, which is created by the TSF, manages WS-Resources representing the actual target systems and can be registered at the Target System Registry (TSR), which can be queried by the GPE client. The TSS also provides a factory method for jobs, which are then managed through the JMS. The storage resources on the target file system are managed by the SMS. The FIS and FES manage file import and export resources, which are created by the SMS and represent file imports to, and exports from, a storage resource at the target system.

5.2. Integrating CGSP and GPE
Integrating CGSP and GPE is straightforward, since the two share a similar abstraction of the Grid. To adopt the atomic service layer, GPE4CGSP implements its interfaces directly by connecting them to the existing CGSP services. Figure 4 outlines the integration architecture. The TSS passes the invocation to the underlying Job Manager in standard JSDL format, generates job instance resources, and returns the resource ID to the submitter. The JMS invokes the underlying Job Manager and receives notifications from it in order to maintain the job resources created by the TSS.

Figure 4. Mapping CGSP to GPE

The SMS manages the virtual file space provided by CGSP Data Management. All operations defined in the SMS are directly supported by the corresponding functionalities of the CGSP Data Management service. File import and file export are supported by the CGSP Data Manager through its download and upload interfaces.

Since CGSP and GPE use different resource representations, a proxy daemon is provided between the CGSP IC and the GPE Registry. The proxy daemon is responsible for converting the CGSP information representation to GPE's CIM representation and for automatically synchronizing the two.

A typical job execution flow, including data stage-in from and stage-out to the client, is as follows:
1. The GPE client uploads the files required by the job to the CGSP data space via the FIS, which uses the upload interface of the CGSP DM, since all data files must be in the CGSP data space before they can be accessed by any GRS.
2. The client submits a JSDL script, which contains the data stage-in, the data stage-out and the invocation of an application, to the CGSP JM via the TSS.
3. The client starts the job via the JMS, which invokes the start interface of the CGSP JM. The CGSP JM manages job execution by using the underlying GRS to perform staging in, computing, and staging out.
4. When the job completes, the CGSP JM notifies the GPE JMS.
5. The GPE client downloads the result files from the CGSP data space via the FES, which uses the download interface of the CGSP DM.
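The five steps above can be traced with a small in-memory simulation. Everything in it is a hypothetical stand-in: `CgspBackend` plays the role of the CGSP Data Manager and Job Manager, the job description is a plain dictionary rather than JSDL, and the "application" is an ordinary Python function. The sketch only shows the order of the calls and the completion notification.

```python
class CgspBackend:
    """Toy stand-in for the CGSP Data Manager and Job Manager."""

    def __init__(self):
        self.data_space = {}  # file name -> content
        self.jobs = {}        # job ID -> {"description": ..., "status": ...}

    def upload(self, name, data):
        self.data_space[name] = data

    def download(self, name):
        return self.data_space[name]

    def submit(self, description):
        job_id = f"job-{len(self.jobs) + 1}"
        self.jobs[job_id] = {"description": description, "status": "submitted"}
        return job_id

    def start(self, job_id, on_done):
        job = self.jobs[job_id]
        description = job["description"]
        staged = self.data_space[description["stage_in"]]              # staging in
        result = description["application"](staged)                    # computing
        self.data_space[description["stage_out"]] = result             # staging out
        job["status"] = "done"
        on_done(job_id)                                                # notification


def run_job(backend, input_name, data, application, output_name):
    notifications = []
    backend.upload(input_name, data)                      # step 1: via FIS
    job_id = backend.submit({                             # step 2: via TSS
        "stage_in": input_name,
        "application": application,
        "stage_out": output_name,
    })
    backend.start(job_id, on_done=notifications.append)   # steps 3 and 4: via JMS
    return backend.download(output_name), notifications   # step 5: via FES


result, notifications = run_job(CgspBackend(), "in.txt", b"abc", bytes.upper, "out.txt")
print(result, notifications)
```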

6. Conclusions and future work
Interoperability between different Grid implementations is essential to seamless job execution. This paper proposes an approach to implementing basic interoperability between heterogeneous Grids by abstracting a minimal set of services and functionalities based on CGSP. We describe the problems encountered, and the solutions adopted, when integrating CGSP with other OGSA-compliant Grids, and in particular address the specific interoperability issues of job submission and job execution management, data management and staging, resource representation and discovery, and security and identity management. We have tested the approach by integrating CGSP and GPE.

In the future, we plan to continue our work in the following directions. First, to achieve wider interoperability, the job model should be improved so that it is compatible with more Grids. Second, we will establish a Grid workflow model based on the BPEL specification. Third, CGSP should conform to and support more standards. Furthermore, solutions for security interoperability should be studied. Finally, we will try to integrate a security model into GPE4CGSP, which is a challenging problem.

References
[1] The ChinaGrid Project, http://www.chinaGrid.edu.cn
[2] Foster, I., Kesselman, C., Nick, J. and Tuecke, S., The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. Globus Project, 2002.
[3] Ralf Ratering, Grid Programming Environment (GPE) Concepts, Revision 1.0, 2 June 2005.
[4] David Snelling, Sven van den Berghe, Gregor von Laszewski, Philipp Wieder, Jon MacLaren, John Brooke, A Unicore Globus Interoperability Layer.
[5] Clovis Chapman, Paul Wilson, Todd Tannenbaum, Matthew Farrellee, Miron Livny, John Brodholt, and Wolfgang Emmerich, Condor services for the Global Grid: Interoperability between Condor and OGSA.
[6] S. Bagnasco, P. Cerello, AliEn – EDG Interoperability in ALICE.
[7] F. Donno, V. Ciaschini, D. Rebatto, L. Vaccarossa, M. Verlato, The WorldGrid transatlantic testbed: a successful example of Grid interoperability across EU and U.S. domains, Computing in High Energy and Nuclear Physics, 24-28 March 2003.
[8] Hai Jin, Zhaoneng Chen, Hsinchun Chen, Qihao Miao, ChinaGrid: Making Grid Computing a Reality, Digital Libraries: International Collaboration and Cross-Fertilization, Lecture Notes in Computer Science, Vol. 3334.
[9] The Globus Project, http://www.globus.org
[10] Foster, I. and Kesselman, C., Globus: A Metacomputing Infrastructure Toolkit, Intl J. Supercomputer Applications, 11(2), 1997.
[11] Wu Yongwei, Wu Song, Yu Huashan, Hu Chunming, Introduction to ChinaGrid Support Platform, Lecture Notes in Computer Science, Vol. 3759, pp. 232-240, 2005 (ISPA2005).
[12] Ali Anjomshoaa, Fred Brisard, An Ly, Stephen McGough, Darren Pulsipher, Andreas Savva, Job Submission Description Language (JSDL) Specification, Version 1.0 (draft 19), Global Grid Forum, 27 May 2005.
[13] Global Grid Forum, http://www.ggf.org [02/28/2005]
[14] The Condor Project Homepage, http://www.cs.wisc.edu/condor [02/28/2005]
[15] Michael Rambadt, Philipp Wieder, UNICORE – Globus Interoperability: Getting the Best of Both Worlds.
