An Efficient Web Services Based Approach to Computing Grid

Weimin Zheng, Meizhi Hu, Guangwen Yang, Yongwei Wu, Ming Chen, Ruojie Ma, Shucheng Liu and Baoyin Zhang
Tsinghua University, Beijing, 100084, P.R. China
[email protected]

Abstract

The main focus of Grid computing is resource sharing at the substrate level. With existing toolkits, however, considerable effort is still required to share resources at a higher level, for example by publishing grid services on grid substrates. In this paper, we present the Construction Platform for Specific Computing Grid (CPSCG), a system that lets users construct a computing grid easily and quickly. In CPSCG, shared computational resources are wrapped as web services and exported as grid services with little manual effort. A web user interface, which the system builds automatically from the web services, is supplied so that end users can invoke grid services.

1. Introduction

The ambitious goal of grid computing is resource sharing [2]. Considerable effort is needed to construct a grid operation platform and to publish shared resources as grid services. Although some projects provide toolkits that help with grid development, such as GT [1, 2], the work remains arduous and time-consuming for users. In this paper, we present CPSCG, the Construction Platform for Specific Computing Grid, a system that lets users construct a computing grid easily and quickly. Shared resources are wrapped as web services and exported as grid services. Thanks to easy-to-use interfaces, little manual effort is required either to wrap the shared resources or to publish grid services. For the convenience of end users, a web interface, which the system builds automatically from the web services, is supplied for invoking grid services.



CPSCG is a fully decentralized system that scales gracefully. Each administrative domain manages its internal affairs and decides its cooperation policies independently, according to its own policies and the system's global requirements. Scheduling at multiple levels is a feature of CPSCG. The system directly handles two levels of scheduling: scheduling between domains and scheduling between nodes within a domain. A further level of scheduling may be handled by the underlying management software of the nodes. CPSCG supports flexible security. Resource owners can set access restrictions on the resources they share, and the system automatically searches for appropriate, available services for users in accordance with those restrictions. With these mechanisms, we seek the best benefit for both resource owners and consumers.

Conformance with industry standards is another emphasis of CPSCG. The Web Services framework [4] is followed when publishing and consuming grid services. The J2EE framework [3] is used to handle local issues within a domain. The primary elements of the information subsystem are UDDI registries.

The remainder of this paper is organized as follows. Section 2 gives an overview of CPSCG. The architecture of CPSCG is described in Section 3. Related work is discussed in Section 4. Finally, concluding remarks are given in Section 5.

2. Overview

Grid applications are becoming more and more popular in many research fields. Some toolkits already help users develop grids, such as GT from GLOBUS [1, 2]. However, even with these toolkits, constructing a grid system is still arduous and time-consuming. Reducing the manual effort and shortening the development time needed to construct computing grid systems are the two key motivations of CPSCG. A CPSCG prototype is currently under development. This section provides a brief overview of the system; its architecture is described in the following section.

This work is supported by the National Natural Science Foundation of China under Grant 60273007, 60373004, 60373005

Proceedings of the IEEE International Conference on E-Commerce Technology for Dynamic E-Business (CEC-East’04) 0-7695-2206-8/04 $ 20.00 IEEE

2.1 Assumptions

There are two kinds of actors in CPSCG: owners of resources and consumers of shared resources. A piece of software can be installed in more than one domain, and even on several nodes within a domain. CPSCG makes the following further assumptions:

a) Each domain has a front end, which is its only entry point into the system. All communication between domains goes through their front ends. If the front end fails, the whole domain is considered to have failed.

b) Any participating site (also called a “node”) in a domain is assumed to be a cluster or a mainframe that works independently.

2.2 Domains, Users and Tasks

Domains: Each domain in CPSCG is autonomous and independent of the others. The participating nodes in the domains are the final execution sites of users’ tasks. From the overall perspective of the system, the domains constitute a self-discovering and self-maintaining overlay that supports the grid service level above it. Each domain maintains a database that stores information about its locally registered users and about tasks that are either executed in this domain or belong to locally registered grid users. Information about the grid services available in the domain is also kept in this database. In order to schedule tasks and balance load, each domain maintains a living peer list and a living node list. All of these are usually managed at the domain’s front end, which is the more practical choice from an implementation standpoint.

Users: Users are classified into two groups according to their operating restrictions: administrators and consumers. Administrators are responsible for maintaining the internal environment of their domains. They can add or delete consumers in their domains and publish or remove grid services. Consumers can submit tasks through the web user interface or through programming APIs. The global identifier of a user is formed from the user’s local registered name in the administrative domain together with that domain’s global identifier. With a global identifier, a user can log on at any interface, but the user’s registration information is stored only in the domain where the user registered.

Tasks: Currently we consider only tasks that can be executed at a single site. When a task is submitted, an appropriate domain that satisfies its requirements is chosen, and a suitable node in that domain is selected to execute the task. A task is identified by an identifier that is unique within its executing domain, together with that domain’s global identifier. Information about a task is kept both in the domain where the task is executed and in the domain where its owner is registered, but the inputs and output results of the task are kept only in the execution domain. This mechanism makes it easy to locate any task of any user and to manage task data. When users check the states of their tasks, two steps are enough to reach the execution domain of a task: one step to the user’s registered domain, and one to the execution domain.
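The identifier scheme and the two-step lookup described above can be sketched as follows. This is a minimal illustration rather than the CPSCG implementation: the names `GlobalId`, `Domain`, `submit` and `locate_tasks`, and the in-memory dictionaries standing in for each domain's database, are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class GlobalId:
    """A local name qualified by the global identifier of its domain."""
    local: str
    domain: str

    def __str__(self) -> str:
        return f"{self.local}@{self.domain}"


@dataclass
class Domain:
    """Hypothetical stand-in for the per-domain database."""
    name: str
    # Tasks owned by locally registered users: user name -> list of task ids.
    owner_index: dict = field(default_factory=dict)
    # Tasks executed in this domain: local task id -> state and output.
    executed: dict = field(default_factory=dict)


def submit(domains, user, exec_domain, local_task):
    """Record a task in its execution domain and in the owner's domain."""
    task = GlobalId(local_task, exec_domain)
    # Inputs and outputs stay in the execution domain only.
    domains[exec_domain].executed[task.local] = {"state": "running", "output": None}
    # The owner's registered domain only keeps a pointer to the task.
    domains[user.domain].owner_index.setdefault(user.local, []).append(task)
    return task


def locate_tasks(domains, user):
    """Two-step lookup: registered domain first, then each execution domain."""
    tasks = domains[user.domain].owner_index.get(user.local, [])       # step 1
    return {str(t): domains[t.domain].executed[t.local]["state"]       # step 2
            for t in tasks}


domains = {name: Domain(name) for name in "ABCD"}
user = GlobalId("User1", "A")
submit(domains, user, exec_domain="C", local_task="task-1")
print(locate_tasks(domains, user))
```

Because the task identifier carries the name of its execution domain, the registered domain never needs to search other domains to answer a status query.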

Figure 1. A typical interaction between a user and the system

Figure 1 shows a typical interaction session between a user and the system. A user identified as “User1 @ Domain A” logs in at “Domain B”; the authentication request is forwarded to “Domain A”, which returns the authentication response. When the user submits a new task, “Domain B” first queries the information subsystem for available services that satisfy the requirements. An answer (for example, a service at “Domain C”) is returned to “Domain B”, and the task is then submitted to “Domain C”. After receiving the task, “Domain C” sends the submission information and execution states of the task to “Domain A”. When the user checks whether any task has finished, “Domain B” sends the check request to “Domain A”, and the results are returned (for example, one of the user’s tasks, executed at “Domain D”, has finished). A “download results” request is then sent to “Domain D”, and the results are downloaded to “Domain B” and shown to the user.
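The submission step of this session involves both levels of scheduling handled by the system: first a domain offering the requested service is chosen, then a node inside it. The sketch below illustrates that selection under stated assumptions. The paper does not specify the scheduling policy, so the least-loaded rule, the `load` metric, and the names `Node`, `DomainInfo`, `choose_domain` and `choose_node` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    load: float           # hypothetical load metric, lower is better
    alive: bool = True


@dataclass
class DomainInfo:
    name: str
    services: set          # (category, service name) pairs offered here
    nodes: list
    # Access restriction set by the resource owner; None means unrestricted.
    allowed_user_domains: Optional[set] = None


def live_nodes(domain):
    return [n for n in domain.nodes if n.alive]


def choose_domain(domains, service, user_domain):
    """Level 1: pick a live domain that offers the service and admits the user."""
    candidates = [
        d for d in domains
        if service in d.services
        and (d.allowed_user_domains is None or user_domain in d.allowed_user_domains)
        and live_nodes(d)
    ]
    # Assumed policy: prefer the domain whose best node is least loaded.
    return min(candidates,
               key=lambda d: min(n.load for n in live_nodes(d)),
               default=None)


def choose_node(domain):
    """Level 2: pick the least-loaded live node inside the chosen domain."""
    return min(live_nodes(domain), key=lambda n: n.load)


svc = ("mathematics", "solver")
c = DomainInfo("C", {svc}, [Node("c1", 0.7), Node("c2", 0.2)])
d = DomainInfo("D", {svc}, [Node("d1", 0.1)], allowed_user_domains={"D"})
chosen = choose_domain([c, d], svc, user_domain="A")
print(chosen.name, choose_node(chosen).name)
```

Here “Domain D” has the least-loaded node but is skipped because its owner admits only users registered in “Domain D”, which mirrors the access restrictions that resource owners can set.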

2.3 Services

In CPSCG, shared resources are wrapped as web services and exported as grid services. All services are registered with the information subsystem, whose primary elements are UDDI registries, and are classified into categories according to their application domains, such as mathematics or biology. A service name is unique within its category. Services that belong to different categories may share the same name, so a service is globally identified by its service name together with the category it belongs to.
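A minimal sketch of this naming rule follows, using an in-memory dictionary in place of a UDDI registry. The class `ServiceRegistry` and its methods are hypothetical and do not correspond to any UDDI API; mapping each service to the set of domains that provide it is also an assumption, motivated by the fact that the same software may be installed in several domains.

```python
from collections import defaultdict


class ServiceRegistry:
    """Hypothetical in-memory stand-in for a UDDI registry."""

    def __init__(self):
        # (category, service name) -> set of domains providing the service
        self._providers = defaultdict(set)

    def register(self, category, name, domain):
        self._providers[(category, name)].add(domain)

    def lookup(self, category, name):
        """Return the providing domains for a globally identified service."""
        return sorted(self._providers.get((category, name), ()))

    def names_in(self, category):
        """Return the service names registered under one category."""
        return sorted(n for (c, n) in self._providers if c == category)


registry = ServiceRegistry()
registry.register("mathematics", "solver", "DomainA")
registry.register("biology", "solver", "DomainB")   # same name, other category
print(registry.lookup("mathematics", "solver"))
print(registry.lookup("biology", "solver"))
```

Keying on the pair rather than on the name alone is what keeps the two `solver` services distinct.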

3. Architecture

The architecture of CPSCG is shown in Figure 2. There are five components: Resource Wrapper, Information subsystem (namely the UDDI component), GridFrontEnd, GridBackEnd and Web Interface. All components except the UDDI component reside in every administrative domain.

3.1 GridBackEnd and GridFrontEnd

The GridBackEnd component manages a domain’s local issues and acts as the local resource manager and monitor. It receives tasks from the GridFrontEnd and ensures that they are executed. It maintains a living node list that tracks the state of the participating nodes. The component interfaces with the nodes’ local systems and is compatible with OpenPBS, a popular management system for clusters.

The GridFrontEnd handles the interactions between domains, scheduling among the nodes within the domain, and user management. It also keeps the submission information and execution states of tasks that are either executed in this domain or belong to locally registered users. It maintains a living peer list that records all participating domains and is periodically retrieved from the information subsystem. This ensures that newly joined or removed domains are detected after at most one refresh interval.

Both GridFrontEnd and GridBackEnd provide real-time information about the participants in the system to the information subsystem so that scheduling between domains can be handled.
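The periodic refresh of the living peer list can be illustrated with a small sketch. The class `LivingPeerList`, the `fetch_domains` callback standing in for a query to the information subsystem, and the explicit `now` argument (used instead of a real clock so that the behaviour is easy to follow) are all assumptions made for this example.

```python
class LivingPeerList:
    """Cached view of the participating domains, refreshed at a fixed interval."""

    def __init__(self, fetch_domains, interval_s):
        self._fetch = fetch_domains      # hypothetical query to the information subsystem
        self._interval = interval_s
        self._peers = frozenset()
        self._last_refresh = None

    def peers(self, now):
        """Return the peer list, re-fetching it if the interval has elapsed."""
        stale = self._last_refresh is None or now - self._last_refresh >= self._interval
        if stale:
            self._peers = frozenset(self._fetch())
            self._last_refresh = now
        return self._peers


registered = {"DomainA", "DomainB"}
peer_list = LivingPeerList(lambda: set(registered), interval_s=60)

print(sorted(peer_list.peers(now=0)))    # initial fetch
registered.add("DomainC")                # a new domain joins
registered.discard("DomainA")            # an existing domain is removed
print(sorted(peer_list.peers(now=30)))   # still the cached view
print(sorted(peer_list.peers(now=60)))   # refreshed after one interval
```

The cached view can lag behind the registries, but never by more than one interval, which is the trade-off between freshness and query load that a periodic retrieval implies.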

Figure 2. The architecture of CPSCG

3.2 Resource Wrapper

This component wraps shared resources, exports them as grid services, and registers the services with the information subsystem. We extend the WSDL standard for the wrapper.

3.3 Information subsystem

The information subsystem may have several members distributed across different domains, each of which can perform all functions of the subsystem. Each member has two parts: a UDDI registry and a global scheduler. The former serves as an information center and the latter schedules tasks between domains. The UDDI registry of a newly joined member can retrieve its contents from the existing UDDI registries. A synchronization mechanism based on Bloom filters [7, 8] is adopted to keep the UDDI registries of different members consistent.

The global schedulers handle scheduling between domains. When a task is submitted, a suitable living domain is chosen according to the global scheduling policy and to particular constraints, such as the security policies of the domains and the restrictions of the consumer.
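The paper does not detail how Bloom filters are used for synchronization. One common pattern, sketched here purely as an assumption, is for a registry to send a compact Bloom filter summarizing the keys it holds, so that a peer can reply with only the entries that appear to be missing. The class `BloomFilter`, the helper names, the filter size and the key format are all hypothetical.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter; membership tests may give false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0                      # a Python int used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def __contains__(self, key):
        return all((self.bits >> p) & 1 for p in self._positions(key))


def summarize(entries):
    """Build the compact summary a registry would send to its peers."""
    summary = BloomFilter()
    for key in entries:
        summary.add(key)
    return summary


def missing_at_peer(local_entries, peer_summary):
    """Entries the peer appears to lack, judged from its summary alone."""
    return {k: v for k, v in local_entries.items() if k not in peer_summary}


registry_a = {"mathematics/fft": "DomainA", "biology/blast": "DomainA"}
registry_b = {"mathematics/fft": "DomainA"}
print(missing_at_peer(registry_a, summarize(registry_b)))
```

A false positive would cause an entry to be skipped in one round, so a scheme of this kind needs an occasional full comparison or a filter sized for a low error rate; reference [8] discusses how such summaries can be compressed for transmission.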

3.4 Web Interface

This component first analyzes the WSDL file of a service and automatically transforms the information needed to invoke the service into input items shown on web pages, so that users can supply the inputs easily. It acts as a programmatic consumer of CPSCG and hides the implementation details from ordinary consumers.
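As an illustration of this transformation, the sketch below reads the input element of a request message from a small WSDL fragment and emits matching HTML input fields. The WSDL fragment, the element name `SolveRequest`, the type mapping and the function names are invented for the example; the CPSCG WSDL extensions are not described in the paper and are not modeled here.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical WSDL fragment describing the inputs of one service operation.
WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <types>
    <xsd:schema>
      <xsd:element name="SolveRequest">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="matrixFile" type="xsd:string"/>
            <xsd:element name="iterations" type="xsd:int"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>
  </types>
</definitions>"""

# Assumed mapping from schema types to HTML input types.
HTML_INPUT_TYPE = {"xsd:string": "text", "xsd:int": "number"}


def form_fields(wsdl_text, request_element):
    """Return (name, html input type) pairs for the children of one request element."""
    root = ET.fromstring(wsdl_text)
    for element in root.iter(XSD + "element"):
        if element.get("name") == request_element:
            return [
                (child.get("name"), HTML_INPUT_TYPE.get(child.get("type"), "text"))
                for child in element.iter(XSD + "element")
                if child is not element
            ]
    return []


def render_form(fields):
    """Render the fields as a plain HTML form."""
    rows = [f'<input name="{name}" type="{kind}">' for name, kind in fields]
    return "<form>\n" + "\n".join(rows) + "\n</form>"


print(render_form(form_fields(WSDL, "SolveRequest")))
```

Deriving the form from the service description means that publishing a new service requires no hand-written page, which is the convenience the Web Interface is meant to provide.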

4. Related work

Many grid projects exist today; GLOBUS [1, 2], Legion [5] and UNICORE [6] are typical examples. The architecture of CPSCG shares some similarities with OGSA [1] of GLOBUS. Both are based on Web Services and seek unified methods for invoking grid services in different domains. However, the work in GLOBUS focuses on the grid substrate, so it is still arduous and time-consuming for users to construct grid systems with its toolkits.

Legion [5] is an object-based metasystems software project. It aims to develop an object-based integrated system, a grid OS. Everything in it is wrapped as an object, and the execution of a task takes the form of message communication between objects.

UNICORE [6] seeks easy and uniform access to distributed HPC computing resources. It has a client/server structure: end users must install a client before they can enter the system. The client submits tasks to a fixed server determined when the client is installed, although a task that the server cannot satisfy can be transferred to a suitable server. This makes it less convenient for general use than CPSCG. On the other hand, UNICORE supports simple workflows, whereas CPSCG leaves this issue as future work, since its emphasis is on making it easy and fast to wrap resources and construct grid systems.

5. Conclusion

This paper presents the Construction Platform for Specific Computing Grid (CPSCG), a fully decentralized grid system that scales gracefully, and describes its architecture. CPSCG makes it easy and fast for users to construct a computational grid. With the tools provided by CPSCG, users can conveniently and efficiently wrap shared computational resources and export them as web/grid services. Little manual effort is required either to wrap resources or to publish services. For the convenience of end users, programming APIs and a web interface are supplied for invoking grid services. The web user interface, which is based on web services, automatically transforms the information needed to access a shared resource into input items shown on web pages. Other features of CPSCG include scheduling at multiple levels, flexible security, and diverse accounting policies.

References

[1] Globus project. http://www.globus.org
[2] I. Foster and C. Kesselman. The Globus Project: A Status Report. Proc. IPPS/SPDP '98.
[3] J2EE Tutorial. http://java.sun.com/j2ee/
[4] http://www.ibm.com/developerworks/webservices/
[5] Legion project. http://legion.virginia.edu/
[6] UNICORE project. http://www.unicore.org
[7] B. Bloom. Space/time trade-offs in hash coding with allowable errors. CACM, 13(7): 422-426, July 1970.
[8] Michael Mitzenmacher. Compressed Bloom Filters. The 20th PODC, Aug. 2001.
