The GSML tool suite - IEEE Xplore

The GSML Tool Suite: A Supporting Environment for User-level Programming in Grids

Zhiwei Xu, Wei Li, Donghua Liu, Haiyan Yu, Bingchen Li
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100080, China
Email: {zxu, liwei, dliu, yuhaiyan, libingchen}@ict.ac.cn

Abstract- Grid technology emerged from the need for resource sharing and cooperation over wide areas. In the OGSA framework, grid resources are abstracted as Grid Services in order to hide their distributed and heterogeneous properties. A Grid Service can be seen as a server-side interface that defines how grid resources are accessed. To explore and utilize grid resources effectively and conveniently, we propose a user-level programming language called Grid Service Markup Language (GSML) to help end users describe their usage of grid services. The basis of GSML is the CAM model, which abstracts the grid as a computer with active memories. In designing GSML, we define several GSML tags to indicate the structure of a GSML page, its operation elements, and the relations between operation elements. The key feature of GSML is the use of a special tag to mark up grid services. To support the running of a GSML page, we design a set of tools called the GSML tool suite to compose, execute and deploy GSML pages. In order to make GSML independent of physical resources, we also propose the concepts of virtual resource space and community. As a user-level programming language, GSML can complement the OGSA framework. A prototype that demonstrates our concepts is also presented.

Keywords: Grid, Grid Service Markup Language, GSML, Virtual Resource Space, Community

The work is supported in part by the China Natural Science Foundation (69925205), the China National "863" Project (2002AA104310), the China National "973" Project (G1998030602) and the Chinese Academy of Sciences (20014010).

I. Introduction

Grid computing [5] has emerged as a technology for wide-area resource sharing. The Open Grid Service Architecture (OGSA) proposed in [4] abstracts grid resources as Grid Services and enables the integration of Grid Services across multiple Virtual Organizations. When we adopt the Grid Service concept to build a grid system, an important issue is how to explore and utilize the distributed grid services effectively and conveniently. In other words, we need to build an environment that supports user-level programming in a grid. There are several approaches to this problem:
• Customize an application that integrates several services together. That is, develop custom-built software that fulfills the required functions, with a customized GUI, a customized operating mode, etc.
• Build a grid portal [7] that integrates several services together. With this method, we can build a grid portal that contains all applications of various users. An end user can then use a general-purpose Web browser to interact with his own applications via the grid portal.

The above approaches are common in server-centric environments. But in a user-centric environment such as a grid, they face several challenges:
• Hard to express the needs of end users. It is hard to provide the flexibility to express the requirements of end users, because these requirements often change dynamically and are not determined in advance. To support a new requirement, new programs must be coded, compiled, debugged and deployed, which is time-consuming and laborious work.
• Hard to organize the resources for end users to develop applications. In many cases, a user needs to utilize multiple resources to fulfill one task. Because grid resources are heterogeneous and dynamic, it is very hard for end users to describe how and in what order resources are accessed; this is usually done by experienced programmers.
• Hard to deploy the needs of end users. Another problem is the deployment of applications. In many cases, end users want to deploy applications as they need them. In a server-centric environment, applications are deployed and managed by server-side engineers, so it is hard to provide on-demand application deployment for end users.

In addition, when designing a support environment for user-level programming in grid environments, we should consider the following two issues:
• Physical-resource-independent requirement of programmers. Machine-independent programming ability is a primary requirement in computer architecture design. For instance, the virtual memory technology [2] provides a machine-independent memory space for programmers. Although this is not fulfilled by the language itself, the support environment should supply such abilities.
• Runtime support for physical-resource-independent programs running on grid systems. When a user-level program runs on a grid, the supporting runtime system should satisfy several requirements, such as hiding heterogeneity, resource selection and error recovery.

In order to enable user-level programming in grid environments, we propose a Grid Service Markup Language (GSML) to express and compose grid services.


0-7803-7840-7/03/$17.00 © 2003 IEEE.


We define the different partitions of a GSML page with GSML structure tags such as Head, Body and Tail. Another class of GSML tags is the GSML component tag, such as Service. The last class of GSML tags is the GSML relation tag, such as Transition. As a markup language, GSML enables end users to describe their requirements to a grid simply by creating or modifying a text-format page.

To support the function of GSML, we provide a GSML Tool Suite to compose, explore and execute GSML pages. In particular, we propose a Grid Resource Space Hierarchy to guide the design of the GSML tool suite. In this tool suite, we develop a GUI tool called the Grid Browser to render the GSML language in a graphic format. A GSML Server is a container of GSML pages and processes the requests from Grid Browsers. Between the Grid Browsers and GSML Servers, we design a new communication protocol called the Grid Service Request Protocol (GSRP).

The paper is organized as follows: Section II introduces the relation between GSML and the CAM model. Section III discusses the basic concepts and design principles of GSML. Section IV describes the Grid Resource Space Hierarchy and the structure of the GSML tool suite. Section V gives the implementation details of the GSML tool suite. Section VI gives the conclusions and the future plan of the GSML tool suite.

II. GSML versus the CAM Model

To better understand the nature of the grid, we propose a new model, the Computer with Active Memory (CAM) [9]. Fig. 1 illustrates the principle of the CAM model. The CAM model is better suited than RAM [1] and PRAM [3] for modeling grid computing. Each memory cell in CAM can be viewed as a grid service. In the CAM model, the memory cells are equipped with the traditional read/write operations as well as new execution operations. The CAM model can be used to study architectural mechanisms and the design and analysis of algorithms for grid computing. The entire memory can be viewed as the grid. A processor of CAM (the box labeled "P" in the figure) represents a client device or a user of the grid. Instructions of CAM can be viewed as users' requests to the grid for desired services.
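The CAM analogy can be made concrete with a small sketch. This is only an illustration under stated assumptions: the class names `ActiveCell` and `ActiveMemory`, the `issue` method and the string-valued instruction names are hypothetical and do not come from the paper; the paper states only that cells support read, write and execute operations and that instructions correspond to user requests.

```python
from typing import Any, Callable, Dict


class ActiveCell:
    """One CAM memory cell; in the paper's analogy, one grid service."""

    def __init__(self, value: Any = None,
                 action: Callable[[Any], Any] = lambda v: v) -> None:
        self.value = value
        # Hypothetical: the computation triggered by an "execute" instruction.
        self.action = action

    def read(self) -> Any:
        return self.value

    def write(self, value: Any) -> None:
        self.value = value

    def execute(self) -> Any:
        # The extra operation that a plain RAM cell lacks: the cell computes.
        self.value = self.action(self.value)
        return self.value


class ActiveMemory:
    """The entire memory, standing in for the grid."""

    def __init__(self) -> None:
        self.cells: Dict[str, ActiveCell] = {}

    def issue(self, op: str, address: str, operand: Any = None) -> Any:
        """Model one processor instruction, i.e. one user request."""
        cell = self.cells[address]
        if op == "read":
            return cell.read()
        if op == "write":
            cell.write(operand)
            return None
        if op == "execute":
            return cell.execute()
        raise ValueError(f"unknown instruction: {op}")


grid = ActiveMemory()
grid.cells["doubler"] = ActiveCell(value=1, action=lambda v: v * 2)
grid.issue("write", "doubler", 5)
print(grid.issue("execute", "doubler"))  # prints 10
```

The processor never sees how a cell computes; it only names a cell and an operation, which is the property the GSML design relies on.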

Fig. 1 The principle of the CAM model

When designing the GSML, we use the CAM as the base of our programming model. Firstly, we view the grid as a huge memory space where each grid service is a memory cell. In this way, we can access the heterogeneous distributed resources in a universal fashion. The Grid Service concept gives a general format for accessing various resources. This unified resource access interface makes it possible to use general-purpose tags to indicate the grid resources in GSML.

III. Basic Concepts of GSML

A. Design Principles of GSML

The first design principle of GSML is easy-to-present. From the experience of HTML, we know that HTML gives end users an easy way to present contents. In designing GSML, we follow this approach and make an effort to simplify its definitions, so that an end user can easily use GSML tags to express his requirements. The main difference between GSML and HTML is that GSML can present services in addition to contents.

The next design principle of GSML is easy-to-organize. A challenge in designing GSML is how to organize the basic elements in a GSML page, especially the internal relations among operation elements. GSML does not aim at providing omnipotent programming abilities, only limited ones. Compared to HTML, GSML adds one important enhancement in addition to the hyperlink, called the Transition. The Transition supplies a simple conditional expression among several operation elements. In a GSML page, a user can freely place basic operation elements anywhere and use Transition tags to establish simple conditional relations among them.

The last design principle of GSML is easy-to-deploy. Using GSML, a user can create an application by describing his requirements in a text file. Deploying this application only requires copying the text file to a GSML server. This process is similar to the deployment of an HTML page. The easy-to-deploy property of GSML can greatly reduce the effort needed to build a grid application.

B. Structure of GSML Pages

The structure tags define the basic structure of a GSML page. These tags include the Head tag, the Body tag and the Tail tag. The Head tag includes context information shared between a client and a GSML server. This includes private information, such as the authors and the administrators, which is invisible to end users. It also includes public information, such as the title, the index and the communities (discussed later). The Body tag comprises the operation elements of a GSML page and their relations. This tag gives a section visible to end users, and end users can operate on any resources included in this section. The Tail tag specifies information invisible to end users that is used for the clean-up work after a client disconnects from a GSML server.
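The three-part page structure can be summarized as a data model. The sketch below is an assumption-laden illustration, not GSML syntax: the class and field names (`Head.private`, `Head.public`, `Body.elements`, `Body.relations`, `Tail.cleanup`, `visible_to_user`) are hypothetical and are chosen only to mirror the visibility rules described above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Head:
    # Invisible to end users, e.g. authors and administrators.
    private: Dict[str, str] = field(default_factory=dict)
    # Visible to end users, e.g. title, index and communities.
    public: Dict[str, str] = field(default_factory=dict)


@dataclass
class Body:
    elements: List[str] = field(default_factory=list)   # operation elements
    relations: List[str] = field(default_factory=list)  # relations among them


@dataclass
class Tail:
    # Invisible clean-up work performed after the client disconnects.
    cleanup: List[str] = field(default_factory=list)


@dataclass
class GsmlPage:
    head: Head
    body: Body
    tail: Tail

    def visible_to_user(self) -> Dict[str, object]:
        """Return only the parts of the page that an end user may see."""
        return {
            "public": dict(self.head.public),
            "elements": list(self.body.elements),
            "relations": list(self.body.relations),
        }


page = GsmlPage(
    head=Head(private={"authors": "admin"}, public={"title": "Ticket query"}),
    body=Body(elements=["query-service"], relations=[]),
    tail=Tail(cleanup=["release-instance"]),
)
view = page.visible_to_user()
```

Here `view` contains the title and the operation elements but neither the authors nor the clean-up list, matching the split between private and public information in the text.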

C. Resource Component of GSML Pages

The component tags define the basic operation elements or resources in a GSML page, of which the most important is the Service tag. A Service tag is an abstraction of a grid service. In the OGSA framework, a grid service is described by a GSDL description [8]. In GSML, the description of a service is much simpler than the GSDL description of a grid service, because the description in GSML serves only display and interaction between end users and browsers. The following description gives the non-normative ...

... keyword gives the communication mode between end users and grid services; the address keyword gives the location of a grid service; the instance keyword gives the handle of an activated grid service; the operation keyword indicates a method that will be called by end users; and the parameter element uses the following syntax to pass all needed parameters when end ...

... definitions in OGSA, the grid service defined in GSML is very simple and is a minimum set for end users to interact with a grid service.
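Since the tag definition itself is not reproduced here, the following is only a sketch of the information a Service tag carries according to the keyword descriptions above. It is not GSML syntax. The field name `mode` for the communication-mode keyword is an assumption (its actual name is not given in the text), and `ServiceTag` and `to_request` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ServiceTag:
    mode: str                       # assumed name: communication mode
    address: str                    # location of the grid service
    operation: str                  # method the end user will call
    instance: Optional[str] = None  # handle of an activated service, if any
    parameters: Dict[str, str] = field(default_factory=dict)

    def to_request(self) -> Dict[str, object]:
        """Flatten the tag into the data needed to issue one call."""
        # Prefer the handle of an already activated instance when present.
        target = self.instance if self.instance is not None else self.address
        return {
            "mode": self.mode,
            "target": target,
            "operation": self.operation,
            "parameters": dict(self.parameters),
        }


tag = ServiceTag(
    mode="sync",                          # hypothetical value
    address="grid://example/timetable",   # hypothetical address
    operation="query",
    parameters={"from": "Beijing", "to": "Shanghai"},
)
request = tag.to_request()
```

The small number of fields reflects the point made in the text: the tag holds only what an end user needs to interact with a service, not a full service description.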

D. Relations among the Components

The relation tags define the relations among different resource components in a GSML page. When designing the relation tags, we focus on two important issues: how to empower users with the ability to program the grid and, at the same time, how to keep the complexity manageable. The objective is an interface that is powerful enough yet still usable, so that an end user with simple training can use it. Among the relation tags, the most important one is the Transition tag. The following description gives the definition of the Transition tag in GSML ...

In this definition, the condition keyword uses boolean expressions to describe the conditions that will fire a transition event; the target attribute determines the destination of a transition event; local means the transition will happen in the current GSML page, and new indicates that the transition will transfer to another GSML page.
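The behavior of a Transition can be sketched as follows. This is an illustration under assumptions, not the GSML definition: the boolean expression is represented by a Python callable over a state dictionary rather than by parsed GSML text, and the `destination` field and the `fire` function are hypothetical. Only the condition, the target attribute and its two values, local and new, come from the description above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Transition:
    # Stands in for the boolean expression of the condition keyword.
    condition: Callable[[Dict[str, object]], bool]
    target: str        # "local" (current page) or "new" (another page)
    destination: str   # hypothetical: element id or page name to move to


def fire(transitions: List[Transition],
         state: Dict[str, object]) -> Optional[Transition]:
    """Return the first transition whose condition holds, or None."""
    for transition in transitions:
        if transition.target not in ("local", "new"):
            raise ValueError(f"unknown target: {transition.target}")
        if transition.condition(state):
            return transition
    return None


transitions = [
    Transition(lambda s: s.get("seats", 0) > 0, "local", "booking-service"),
    Transition(lambda s: s.get("seats", 0) == 0, "new", "waiting-list-page"),
]
chosen = fire(transitions, {"seats": 0})
print(chosen.target, chosen.destination)  # prints: new waiting-list-page
```

Restricting relations to "if this condition holds, go there" keeps the language far short of general-purpose programming, which is the trade-off between power and usability that the text describes.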

IV. The GSML Tool Suite

A. Grid Resource Space Hierarchy

In traditional computer architecture design, an indispensable function is the machine-independent feature. A classical example is the virtual memory [2] technology, which provides a physical-independent storage view for programmers. With the support of virtual memory, programmers can concentrate on solving their own problems. There is also no need to reprogram when the hardware changes, and programs can run on computers with different hardware configurations.

In the GSML design, we propose a similar concept called virtual resource space (VRS) to provide the physical-resource-independent feature for programmers. In this virtual space, a programmer need not care about the location or the dynamic, heterogeneous properties of grid resources. All physical properties of resources are hidden by a virtual resource layer. To explain the virtual resource space concept, we first give the following definitions:
• Physical Resource. An entity with compute abilities that can be accessed via a network. In the OGSA framework, physical resources are represented by grid services.
• Virtual Resource. A common abstraction of physical resources. All virtual resources form a space called the virtual resource space. The VRS is a globally visible resource view for programmers.
• Community. A subset of the VRS that includes the most commonly used virtual resources. Unlike the VRS, a community is owned by a certain user and provides a collection of usable resources to this user.

If a programmer programs directly on the physical resources, he faces the following problems:
• The programmer must know the physical location of resources. When the location of a resource changes, he must learn of the change and modify the code.
• The programmer must handle the heterogeneity of multiple resources of the same kind. Although the OGSA framework gives a standard for representing a grid service, it is hard to determine the compatibility among multiple resources of the same kind. The programmer must know and handle the differences among these resources.
• It is hard to share code between programmers. If a programmer wants to reuse a program module written by another programmer, the physical resources referred to in this module may not be available to him for reasons such as access rights.
• At run time, a program must handle all issues related to physical resources, such as resource locating, resource allocating and resource selecting. In most cases, the code for these issues can be reused.

To release the programmer from these heavy burdens, we use the virtual resource concept and provide a programming-friendly interface. The physical resource layer, virtual resource layer and community layer form the grid resource space hierarchy illustrated in Fig. 2. In our design, a virtual resource is abstracted as a location-independent identifier that uniquely distinguishes it from all others. With this identifier, multiple physical resources of the same kind can be mapped to one virtual resource. The virtual resource layer in Fig. 2 also performs resource locating, resource allocating, resource selecting, etc. It also enables a program to switch automatically to another available physical resource of the same kind when a physical resource fails. When programming on the virtual resource layer, the programmer obtains a stable, homogeneous and reliable view of resources.
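The mapping from one virtual identifier to several physical resources, with automatic switching on failure, can be sketched as below. This is a minimal illustration and not the Vega implementation: physical resources are modeled as plain Python callables, and `VirtualResourceSpace`, `register`, `invoke` and `ResourceUnavailable` are hypothetical names.

```python
from typing import Callable, Dict, List


class ResourceUnavailable(Exception):
    """Raised when a physical resource, or every candidate, has failed."""


class VirtualResourceSpace:
    def __init__(self) -> None:
        # Location-independent identifier -> physical resources of the same kind.
        self._physical: Dict[str, List[Callable[[str], str]]] = {}

    def register(self, virtual_id: str, resource: Callable[[str], str]) -> None:
        self._physical.setdefault(virtual_id, []).append(resource)

    def invoke(self, virtual_id: str, request: str) -> str:
        """Call the first working physical resource bound to the identifier."""
        for resource in self._physical.get(virtual_id, []):
            try:
                return resource(request)
            except ResourceUnavailable:
                continue  # switch to the next physical resource of the same kind
        raise ResourceUnavailable(virtual_id)


def broken_node(request: str) -> str:
    raise ResourceUnavailable("node-1 is down")


def healthy_node(request: str) -> str:
    return "node-2 handled " + request


vrs = VirtualResourceSpace()
vrs.register("timetable", broken_node)
vrs.register("timetable", healthy_node)
print(vrs.invoke("timetable", "query"))  # prints: node-2 handled query
```

The caller names only `"timetable"`; which node answers, and the fact that one node failed, stay hidden behind the virtual layer.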

Fig. 2 Grid Resource Space Hierarchy

Although the VRS provides the above benefits, it only provides a globally visible resource view for programmers. At run time, a visible virtual resource must be mapped to a usable physical resource. A visible virtual resource may not be usable for some reasons, for example when it refers to a physical resource that the user is not authorized to access. So, when a program runs, we need to turn the visible resources into usable resources.

The time spent mapping virtual resources to physical resources is an important factor in the performance of a program. Generally, there are three methods of mapping. The first is the all-mapping method, which maps each virtual resource to an available physical resource before the program runs. The second is the run-time-mapping method, which maps a virtual resource to an available resource at the moment the program accesses it. The third is the partial-mapping method, which maps a subset of virtual resources to available resources in advance. The first method is costly because of the huge number of virtual resources. The second method performs badly because mapping is a time-consuming operation. The third method avoids the shortcomings of the other two and can achieve better performance at low cost if the subset consists of the most commonly used virtual resources.

We think partial mapping is a good choice if we can determine the resources most commonly used by users, so we propose the Community concept to achieve this goal. The community brings benefits both in the programming phase and at run time. We can predefine communities that include the most commonly used resources, so that a programmer can find most of the required resources in these communities and need not search for them in the whole VRS. At run time, the community can bind its virtual resources to usable physical resources in advance, which provides better performance for programs running on the virtual layer.

In the design of the GSML tool suite, we build the relevant tools and runtimes on the virtual resource layer.
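The trade-off among the three mapping methods can be illustrated by counting mapping operations. The sketch below is a toy model under assumptions: `Mapper`, `_locate`, the `physical://` placeholder and the example identifiers are hypothetical, and a real mapping step would involve locating and selecting a resource rather than building a string.

```python
from typing import Dict, Iterable, List, Tuple


class Mapper:
    """Binds virtual identifiers to physical resources, counting the work."""

    def __init__(self, premapped: Iterable[str] = ()) -> None:
        self.bound: Dict[str, str] = {}
        self.upfront = 0    # mappings done before the program runs
        self.on_demand = 0  # mappings done while the program is running
        for virtual_id in set(premapped):
            self.bound[virtual_id] = self._locate(virtual_id)
            self.upfront += 1

    @staticmethod
    def _locate(virtual_id: str) -> str:
        # Placeholder for the expensive locate/select step.
        return "physical://" + virtual_id

    def access(self, virtual_id: str) -> str:
        if virtual_id not in self.bound:
            self.bound[virtual_id] = self._locate(virtual_id)
            self.on_demand += 1
        return self.bound[virtual_id]


def run(mapper: Mapper, accesses: List[str]) -> Tuple[int, int]:
    for virtual_id in accesses:
        mapper.access(virtual_id)
    return mapper.upfront, mapper.on_demand


VRS = ["a", "b", "c", "d", "e"]
program = ["a", "b", "a", "c"]

print(run(Mapper(VRS), program))         # all-mapping: (5, 0)
print(run(Mapper(), program))            # run-time mapping: (0, 3)
print(run(Mapper(["a", "b"]), program))  # partial mapping: (2, 1)
```

With a community of `a` and `b`, only two mappings are paid in advance and only one access stalls at run time, which is the effect the Community concept aims for when the pre-mapped subset matches what users actually access.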

B. GSML Tool Suite

In order to fulfill the function of GSML, we need a set of tools and protocols that enable end users to use grid resources conveniently via GSML. Such a tool suite is called the GSML tool suite. It consists of the following protocols and tools, as illustrated in Fig. 3:
• A software Grid "browser" at the client side, which renders GSML pages and provides an interface for users to access a grid. The kernel of the Grid Browser receives an end user's request and transfers it to a server, then receives a GSML page from the server and displays it. This process is similar to that in current Web browsers.
• A software grid server (also called a GSML server) that receives service requests from the client side, processes the requests, and sends results back to the client side. The GSML server is a container of GSML pages. The most important function of the GSML server is to translate a service tag into a usable physical resource and to access that resource on behalf of the end user.
• A protocol for the interactions between the GSML server and the Grid Browser, called the Grid Service Request Protocol (GSRP). The goal of GSRP is to provide lightweight, highly efficient communication for both Grid Browsers and GSML servers.


Fig. 3 The relation between the GSML tool suite and VRS

Figure 3 illustrates the relation between the GSML tool suite and the VRS. The key feature of the GSML tool suite is that it is built on the VRS rather than on the physical resource layer. As the container and manager of GSML pages, the GSML server predefines several communities to satisfy end users' requirements when they edit GSML pages. At run time, all communities that may be used are loaded into the GSML server in advance. By the time an end user requests a GSML page, the virtual resources in these communities have already been mapped to usable physical resources, so every service tag in the GSML page is bound to a usable physical resource. When the Grid Browser parses the GSML page and sends a service access request, the GSML server finds the associated virtual resource in a community and accesses the physical resource bound to this virtual resource on behalf of the end user.

The integration of the GSML tool suite and the VRS brings benefits to both end users and GSML servers. With the support of the VRS, GSML servers need not deal with the various problems brought by physical resources, for instance resource locating and resource selecting. The only thing the GSML server needs to know is the identifier of a resource. In addition, the GSML server simply uses the programming interface provided by the VRS to customize communities as it needs. This makes the development of a GSML server easy and quick, and also provides better performance for GSML pages. An end user who edits GSML pages can find most of the resources he needs in communities and mark up a resource in his page with a simple service tag. End users do not see physical resources at all. Most of their resource requirements can be satisfied by the communities in GSML servers.

V. Implementations

The GSML tool suite is part of the Vega Grid project [9] and the CNGrid project. The Vega Grid project aims to learn the fundamental properties of grid computing and to develop key techniques that are essential for building grid systems and applications. The Vega team is conducting research in several areas, such as Dawning Superservers, the Vega Grid software platform, the Vega Information Grid, and the Vega Knowledge Grid. The full name of the CNGrid project is the Chinese National Grid project, which aims to integrate the high-performance computers of China to provide a virtual supercomputing environment. The GSML tool suite will be deployed in this testbed as a development environment and an operating environment. We are also cooperating with IBM Research to build a prototype of the GSML tool suite on IBM's portal products. The benefit of using portal technologies is that we can develop a version of the GSML tool suite that supports current Web browsers.

In addition to building prototypes in research projects, we use the GSML tool suite to develop applications for end users, such as that of the railway department of China. We use GSML service tags to integrate database services according to their business requirements. The end users can easily compose and deploy new applications as they need, just by rewriting a GSML page. A technical report [10] gives the detailed implementation of the GSML tool suite and those applications.

The current implementation of the GSML tool suite is based on the physical resource layer. These tools are now migrating to a virtual resource layer provided by the Vega Grid software platform (discussed below). The Grid Browser is developed on the Java platform, and the GSML server is a standalone daemon process running on UNIX platforms. The GSRP protocol enables the read, write and execute operations between Grid Browsers and GSML servers.

VI. Conclusions and Future Plan

In this paper, we propose a user-level programming language called GSML and its supporting environment, called the GSML tool suite. As a markup language similar to HTML, GSML has the properties of being easy-to-present, easy-to-organize and easy-to-deploy. We also introduce the basic concepts and syntax of GSML.

To better support the ability of GSML, we propose an important concept called virtual resource space. The virtual resource space provides the physical-resource-independent property to programmers and makes modular programming possible. Other benefits of the VRS arise at run time, since the VRS runtime system can do most of the work related to physical resources, such as resource locating and selecting.

As the next step, we will integrate the VRS runtime system with other systems in Vega GOS to build a self-contained grid software platform. This platform will support not only the GSML tool suite, but also other grid applications. We will also make efforts to solve key problems of the virtual resource space.

References

[1] S.A. Cook, R.A. Reckhow, "Time Bounded Random Access Machines", Journal of Computer and System Sciences, Vol. 7, No. 4, pp. 354-375, 1973.
[2] Robert C. Daley, Jack B. Dennis, "Virtual memory, processes, and sharing in Multics", Proceedings of the 1st ACM Symposium on Operating Systems Principles, October 1967, Gatlinburg, TN, USA.
[3] S. Fortune, J. Wyllie, "Parallelism in random access machines", Proc. 10th ACM Symp. on Theory of Computing, pp. 114-118, May 1978.
[4] I. Foster, C. Kesselman, etc., "Grid Services for Distributed System Integration", Computer, 2002, 35(6).
[5] I. Foster, C. Kesselman, and S. Tuecke, "The Anatomy of the Grid: Enabling Scalable Virtual Organizations", International J. of High Performance Computing Applications, 2001, 15(3): pp. 200-222.
[6] Wei Li, Zhiwei Xu, etc., "A Grid Resource Discovery Model Based On The Routing-Transferring Method", The 3rd International Workshop on Grid Computing, Baltimore, November 18, 2002.
[7] M. Russel, G. Allen, etc., "The Astrophysics Simulation Collaboratory: A Science Portal Enabling Community Software Development", Cluster Computing, 5(3):297-304, 2002.
[8] S. Tuecke, K. Czajkowski, etc., "Grid Service Specification", Open Grid Service Infrastructure WG, Global Grid Forum, Draft 2, 7/17/2002.
[9] Zhiwei Xu, Wei Li, Hongguang Fu, Zhenbing Zeng, "The Vega Grid and Grid-Based Education", The 1st International Conference on Web-based Learning, 2002.
[10] Zhiwei Xu, Bingchen Li, Wei Li, etc., "Implementation Issues of a Grid Service Markup Language", Technical Report of Vega Grid Project, TRO1-Vega, 2003.