PERVASIVE ACCESS TO THE DATA GRID§
Sunirmal Khatua, Subhasis Dasgupta, Nandini Mukherjee
Department of Computer Science and Engineering, Jadavpur University, Kolkata 700 032, India

Abstract: A natural extension of existing Grid technology is to integrate mobile and low-profile devices into the Grid and to access Grid services in a pervasive manner. However, existing Grid middleware does not support mobility of the Grid nodes and is generally suitable only for high-profile service providers and clients. This paper proposes an architecture for accessing Grid services from low-profile clients. The main focus of this work is to support day-to-day data-intensive query applications for commercial rather than scientific use.

Keywords: Grid Middleware, Data Services, Mobile Client, Pervasive Access

1. Introduction
In recent years Grid technology has emerged as the central technology for efficient resource sharing among many autonomous organizations known as “Virtual Organizations” [1] or VOs. The relationships among the VOs are dynamic, and Quality of Service (QoS) plays an important role in defining these relationships. According to Foster [1], a Grid is a system that coordinates resources that are not subject to centralized control, uses standard, open, general-purpose protocols and interfaces, and delivers non-trivial qualities of service. With the proliferation of wireless and mobile technologies, a natural extension of Grid-based computing is to incorporate mobile devices into a Grid environment and to access Grid-based applications in a pervasive way. The Open Grid Services Architecture (OGSA) [4], developed by the Global Grid Forum [3], is a common, standard and open architecture for Grid-based applications. OGSA aims to standardize the services that are commonly found in Grid applications and specifies a set of standard interfaces for them. Within this architecture the Grid requires the VOs to provide a stable QoS, and sharing relationships do not change frequently. Thus, when mobile devices are integrated with the current Grid architecture, new challenges arise and must be addressed. In a mobile environment networks are unstable, mobile devices can join and leave the network frequently, and the quality of the connection is unpredictable. Moreover, these devices have limited resources and short battery life. The middleware developed for Grid environments mainly targets static, high-end computing resources and is therefore unsuitable for service providers and clients that are mobile.

This paper proposes an architecture that allows clients with low computing resources to be used in a Grid environment and to access data resources distributed among the VOs through a wireless mobile network. The architecture is built on top of Grid Services, which are essentially stateful web services that conform to a set of conventions for purposes such as service lifetime management, inspection, and notification of service state changes.

2. Associated Technologies and Related Works
Grid services, along with some middleware support, are the emerging technologies in the field of large-scale heterogeneous distributed computing. The Globus [5] group, in collaboration with institutions such as Argonne National Laboratory, University of Chicago and UC Los Angeles, and organizations such as IBM, SGI and SUN, is the pioneer in this field. The Globus Toolkit developed by the Globus group is the standard middleware for accessing Grid services. The OGSA-DAI [6,7] project provides middleware support (as a part of the Globus Toolkit) for accessing and integrating data from several sources via the Grid. The project was conceived by the UK Database Task Force [8] and works closely with the Global Grid Forum [3], DAIS-WG [17], the OMII [18] and the Globus team [5].

Little work has yet been done on mobile Grid environments. GridLab [9] proposes an architecture for mobile clients, although the approach increases the overhead on the mobile devices. Moreover, the GridLab community focuses on scientific computations, whereas our motivation is to focus on data-intensive query applications that are used in the scientific world as well as in the commercial world.

§ This work has been done as a part of the project “Extension of Grid Middleware to Incorporate mobile devices” under the “Centre for Mobile Computing and Communication” (www.cmccju.org), Jadavpur University, India, and sponsored by UGC under the “University for Potential Excellence Scheme”, India.

3. Grid and Services
As discussed in [10], resources in a Grid environment are organized among heterogeneous, distributed, dynamic Virtual Organizations (VOs). A VO can be formally defined as a tuple (O, RS, I, PY, PL), where
O → the set of concrete organizations ({o}) forming an instance of the VO;
RS → the set of resources and services ({rs}) supported by the VO;
I → the interface for accessing RS;
PY → the set of policies ({py}) for the operation of the VO; and
PL → the set of protocols ({pl}) for the implementation of PY.
A concrete organization (o) can be defined as a tuple (RS, I, PY, PL), where RS, I, PY and PL have the same meaning as above.

The protocols and interfaces in the definition of a VO must be standard and open. Moreover, multiple Quality of Service (QoS) issues, including security, reliability and performance, must also be addressed. The Open Grid Services Architecture (OGSA) of the Global Grid Forum (GGF) provides one such open, standard and general-purpose protocol that negotiates and manages distributed sharing of the resources while addressing the various QoS issues. OGSA standardizes the services that are typically used by a Grid application, including the VO Management Service, Resource Discovery & Management Service, Job Management Service and Data Management Service.

One important requirement in a Grid environment is that all these services be invoked in a common and standard manner. For this purpose Web Services can be chosen as the underlying technology. Web Services are commonly implemented as stateless services, whereas OGSA requires the services to be “stateful”. To meet this requirement, the Web Services Resource Framework (WSRF) [11], a specification developed by OASIS [12], has been introduced.

Globus Toolkit – With the advent of Grid middleware, the Globus Toolkit developed by the Globus Alliance [5] has become the de facto standard in the Grid community for realizing the OGSA requirements. The toolkit provides high-level services that can be used to build Grid applications. These services include a resource monitoring and discovery service, a job submission infrastructure, a security infrastructure, and data management services. The latest version of the Globus Toolkit (GT4) offers a complete implementation of the WSRF specification. Most of the services are implemented on top of WSRF (WS components). The relationship among GT4 (the toolkit), OGSA (the architecture) and WSRF (the specification), as depicted in [13], is presented in Figure 1.
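The tuple definitions of a VO and of a concrete organization given at the start of this section can be written down as a small data structure. The following is only an illustrative sketch: the class names, the `name` label and the `join`/`leave` methods are hypothetical, and deriving the RS of a VO as the union of its members' RS is an assumption of this sketch that the formal definition does not state.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Organization:
    """A concrete organization o = (RS, I, PY, PL)."""
    name: str              # label used only in this sketch, not part of the tuple
    resources: frozenset   # RS: resources and services offered
    interface: str         # I: interface for accessing RS
    policies: frozenset    # PY: policies governing operation
    protocols: frozenset   # PL: protocols implementing PY


@dataclass
class VirtualOrganization:
    """A VO = (O, RS, I, PY, PL); RS is derived from the members here."""
    interface: str         # I
    policies: frozenset    # PY
    protocols: frozenset   # PL
    organizations: dict = field(default_factory=dict)  # O, keyed by name

    def join(self, org):
        """Membership is dynamic: an organization may join at any time."""
        self.organizations[org.name] = org

    def leave(self, name):
        """Remove a member; unknown names are ignored."""
        self.organizations.pop(name, None)

    @property
    def resources(self):
        # Assumption of this sketch: RS of the VO is the union of member RS.
        shared = set()
        for org in self.organizations.values():
            shared |= org.resources
        return shared
```

Because `resources` is recomputed on each access, the set of shared resources changes as soon as an organization joins or leaves, which mirrors the dynamic nature of VO relationships.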

4. Data Management in Grid
The Grid community is divided into two distinct groups: one that uses the Grid for compute-intensive applications, and another that concentrates on data-intensive query applications. The focus of this paper is on users of the Data Grid who access data services to answer their data-intensive queries. A data service (DS) provides the entry point for clients who want to access data resources. A data service can be discovered using a registry.

The types of data in the Grid vary from simple files to relations or XML data [14]. For simple file transfer, GridFTP [15], a component of the Globus Toolkit, is an effective tool that facilitates secure file transfer from one virtual organization to another. However, file transfer alone is not sufficient for information retrieval and for answering structured queries. The Open Grid Services Architecture - Data Access and Integration (OGSA-DAI) project [6,7] was established to produce a common middleware solution, aligned with the Global Grid Forum (GGF) OGSA vision, that allows uniform access to data resources using a service-based architecture. OGSA-DAI provides an extensible framework that allows flexible representation of a data resource. A data resource need not be a single physical resource such as an RDBMS; it can also represent a complex virtual data resource. For example, a federation of many physical data resources can be represented as a single virtual data resource. This allows new data resources to be easily added to OGSA-DAI [19].

The requirements for data services in a Data Grid environment have been highlighted in [19]. Based on these requirements, OGSA-DAI provides a client-side library, the client toolkit (CTK), which offers a programming abstraction and provides a common interface for all the platforms supported by OGSA-DAI. The client makes the appropriate (SOAP) requests depending on the data service. Developers need only learn the APIs provided by the client toolkit to access any OGSA-DAI data service. OGSA-DAI, along with the Globus Toolkit, has been used in our architecture to provide data services to the mobile clients. The next section gives an overview of this architecture.
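The idea that a federation of physical data resources can be exposed as a single virtual data resource behind one data service can be illustrated with a short sketch. None of the names below come from OGSA-DAI or its client toolkit; `DataResource`, `FederatedResource` and `DataService` are hypothetical stand-ins, and in-memory lists of rows take the place of real databases.

```python
class DataResource:
    """Hypothetical stand-in for one physical data resource (e.g. one RDBMS)."""

    def __init__(self, name, rows):
        self.name = name
        self._rows = list(rows)

    def query(self, predicate):
        # Return every row that satisfies the predicate.
        return [row for row in self._rows if predicate(row)]


class FederatedResource:
    """Several physical resources presented as a single virtual resource."""

    def __init__(self, name, members):
        self.name = name
        self._members = list(members)

    def query(self, predicate):
        # Fan the query out to each member and concatenate the results.
        results = []
        for member in self._members:
            results.extend(member.query(predicate))
        return results


class DataService:
    """Entry point that exposes zero or more data resources by name."""

    def __init__(self):
        self._resources = {}

    def add_resource(self, resource):
        self._resources[resource.name] = resource

    def perform(self, resource_name, predicate):
        # The caller does not need to know whether the resource is
        # physical or a federation; both offer the same query method.
        return self._resources[resource_name].query(predicate)
```

Since a federation offers the same `query` method as a physical resource, a client of the data service uses both in exactly the same way, which is the uniform-access property described above.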

5. An Overview of the Pervasive Grid Architecture
This section presents a pervasive Grid architecture that allows low-profile clients to access Grid services in a pervasive way. The client may be a low-profile desktop PC or a GPRS-enabled mobile phone. The architecture mostly targets commercial usage, where client interaction is much higher than in scientific usage. In particular, it concentrates on day-to-day data-intensive query applications. The architecture considers the heterogeneity of the resources as well as of the clients. It also addresses performance issues by providing a high-end proxy for the clients. In the following sections we describe the architecture with the help of a layered diagram and present the workflow using a sequence diagram.

5.1 Layered Architecture of Pervasive Data Grid
The architecture of the proposed pervasive data Grid may be described by a layered diagram in which each layer is defined by its core functionality. The architecture consists of the six layers presented in Figure 2. The functionality of each layer is described below.

Grid Data Layer - This layer provides and manages the data for a particular application. It encapsulates all of the physical, logical and conceptual data. At present the Grid supports relational database management systems such as MySQL, IBM DB2, SQL Server and PostgreSQL, XML databases such as Apache Xindice [16], and some file-based systems. This layer thus abstracts away how the data required by a particular application is physically accessed.

Grid Middleware Layer - Grid middleware plays a central role as the supporting software for Grid services. The main function of this layer is to create interfaces between the various nodes of the Grid (which together provide the Grid infrastructure). This layer maintains connections with the data resources and provides data services to the virtual organizations. The service instances are deployed on the “Grid Service Container” (described later) through this layer. Since the Globus Toolkit offers standard middleware-level services, we use part of this toolkit in our architecture. The data services in this architecture are created using OGSA-DAI, which follows the WSRF specification (including WS-Addressing, WS-Notification etc.) and supports heterogeneity of the data resources. A data service supports zero or more data resources. So a data resource at a specific site is a part of a data service, which in turn can be a part of the entire service of the virtual organization. As discussed in the formal definition of Grid, the policies and other constraints are defined through the policy file of the Grid. In addition, there may be local database constraints that also apply to the data services.

Grid Container Layer - This layer contains the services (more specifically, the service instances) supported by the Grid. Currently the Grid supports two types of containers for its services: the one provided by the Globus Toolkit, and Jakarta Tomcat. However, to use Tomcat as the Grid container the user needs to configure it. OGSA-DAI provides tools that automatically configure Tomcat for this purpose.

Client Proxy Layer - In our earlier discussion we mentioned that accessing a Grid service from a device with low storage and low computing capability is not possible because of the heavy overhead in the “Grid Middleware” layer. The “Client Proxy” layer in our architecture solves this problem by placing a high-profile client as a proxy between the low-profile clients and the Grid Container. This proxy layer acts as the service provider for the low-profile clients and, at the same time, as a client of the Grid services. It receives requests from the low-profile clients, forwards them to the Grid Container, and relays the replies back to the low-profile clients.

Service Registry Layer - Apart from heterogeneity, loose coupling is a key factor in distributed computing. In a Grid environment, clients need not know who the service providers are. Similarly, the service providers do not know who the exact clients are; they only have information about the intended clients. To cope with such loose coupling, a registry is maintained that stores a list of services along with their proxies. Thus one can easily find the client proxy for a particular service by querying the registry.

Client/WAP Gateway - Our architecture is intended to support clients with low storage and computing capability. Generally the clients are simple GPRS-enabled mobile phones or PDAs. Considering the heterogeneity of the client devices, our architecture also supports low-profile desktop PCs. Generally the low-profile clients communicate with the proxy through the HTTP protocol. However, if the low-profile client is not HTTP-compliant, an intermediary gateway is used. Since most mobile devices and PDAs use WAP (Wireless Application Protocol), we propose to use a WAP Gateway in our architecture. This layer receives the client’s request from a mobile phone or a PDA through the standard mobile network and forwards it to the client proxy, and relays the reply in the opposite direction.
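The dual role of the client proxy and the loose coupling provided by the registry can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation described in this paper: `ServiceRegistry`, `ClientProxy`, their methods and the `health_info_service` function are hypothetical names, and a plain Python callable stands in for a Grid service deployed in the container.

```python
class ServiceRegistry:
    """Maps a service name to the client proxy that fronts it."""

    def __init__(self):
        self._proxies = {}

    def register(self, service_name, proxy):
        self._proxies[service_name] = proxy

    def find_proxy(self, service_name):
        # Raises KeyError when no proxy is registered for the service.
        return self._proxies[service_name]


class ClientProxy:
    """Service provider towards low-profile clients, client towards the Grid."""

    def __init__(self, grid_service):
        # grid_service: any callable taking a request and returning a reply.
        self._grid_service = grid_service

    def handle(self, request):
        # Forward the request to the Grid service and relay the reply back.
        return self._grid_service(request)


def health_info_service(request):
    """Hypothetical Grid service used only to exercise the sketch."""
    return {"service": "HealthInfo", "echo": request}
```

A low-profile client only needs the service name: it asks the registry for the proxy and calls `handle`, without ever learning which provider answers the request.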

5.2 Workflow of Pervasive Data Grid
The sequence diagram in Figure 4 explains the workflow in the proposed architecture. As mentioned earlier, the proposed architecture works with two types of clients: desktop PCs and mobile devices. The mobile clients are connected only through the mobile radio network, which is mainly available for mobile phones. To find the proxy for a particular kind of service, a mobile client sends its query to the WAP gateway, which is located at a static IP address. The WAP gateway forwards the message to the Service Registry, which invokes the client proxy. In the case of a desktop PC (static client), the intermediary gateway is not required; the client sends its query directly to the registry. The proxy invokes the particular client code and obtains the answer from the container, which uses the Grid resources to find the answer. Thus, for mobile clients the WAP Gateway is used only for protocol conversion.

Figure 4 Sequence diagram showing the workflow
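The order of hops in the workflow of Section 5.2 can be traced with a small simulation. Everything here is a hypothetical sketch: the function names, the dictionary used as a stand-in for a WAP message, and the `answer(...)` string are assumptions made for illustration, and no real WAP, HTTP or Grid calls are involved.

```python
def grid_container(query, trace):
    """Stand-in for the container answering a query from Grid resources."""
    trace.append("grid_container")
    return f"answer({query})"


def client_proxy(query, trace):
    """The proxy invokes the client code and obtains the answer."""
    trace.append("client_proxy")
    return grid_container(query, trace)


def service_registry(service_name, query, trace, proxies):
    """Looks up the proxy registered for the service and invokes it."""
    trace.append("service_registry")
    proxy = proxies[service_name]
    return proxy(query, trace)


def wap_gateway(service_name, wap_message, trace, proxies):
    """Protocol conversion only: unwrap the request, wrap the reply."""
    trace.append("wap_gateway")
    reply = service_registry(service_name, wap_message["body"], trace, proxies)
    return {"body": reply}


def submit(client_kind, service_name, query, proxies):
    """Route a query according to the client type; return (reply, hops)."""
    trace = []
    if client_kind == "mobile":
        message = {"body": query}
        reply = wap_gateway(service_name, message, trace, proxies)["body"]
    elif client_kind == "desktop":
        reply = service_registry(service_name, query, trace, proxies)
    else:
        raise ValueError(f"unknown client kind: {client_kind}")
    return reply, trace
```

Calling `submit` once with `"mobile"` and once with `"desktop"` for the same service returns the same reply, and the two traces differ only by the leading `wap_gateway` hop, which reflects the statement that the gateway is used only for protocol conversion.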

6. Preliminary Implementation Details
A preliminary implementation of the proposed architecture has been made on a local Grid. The local Grid test-bed consists of a high-end resource (an IBM P690 Regatta server with 16 processors and 32 GB memory), a medium-range HP Xeon server, a number of Pentium-4 and Pentium-3-based desktop PCs as clients, and a simulated mobile client. We have used SuSE Linux and RedHat Linux as the operating systems. The Grid is built using the Globus Toolkit (GT4), and the Tomcat web server is used as the Grid container. A PostgreSQL relational database and an XML database are used as data resources. OGSA-DAI provides data services using these data resources. The port types and descriptions of the data services are available and can be accessed by generating WSDL. The client proxy acts as the service provider for end users and as a client for Grid data services. We include some additional API support for Tomcat, and the client proxy is also deployed on top of the Tomcat web server (which may be located on a different node).

We have implemented a small database for a health management system with the hierarchy shown in Figure 5. Heterogeneous databases are implemented at the Block level, and HealthInfo Services are implemented at the State level. Both the service and the proxy are deployed on the HP Xeon server. The proxy is basically a servlet that intelligently discovers a service in response to a request from the client. The proxy invokes the service and dynamically generates the responses for the client. The WSDL of the service, along with the client requests and responses for a desktop and a simulated mobile device, is depicted in Figure 6.

Figure 6 Health Management on a Mobile Grid

7. Performance
This section analyses the performance of the preliminary implementation of the proposed architecture by comparing the performance of the Data Grid when accessed by a client without a proxy and by a client with a proxy. In both cases a desktop PC has been used as the client. For mobile clients the response time is expected to be slightly higher because of the gateway conversion. Figure 7 depicts the performance.

8. Conclusion
A layered architecture for accessing Grid data services in a pervasive manner has been presented in this paper. The architecture is implemented on top of the Globus Toolkit and uses the OGSA-DAI data services in particular. A preliminary implementation of the architecture has been carried out, and its initial results are presented. The initial results are encouraging because they demonstrate a considerable improvement over traditional data services based on OGSA-DAI only. This is possibly due to the presence of the proxy, although further experimentation is needed to understand the performance issues. A pervasive Data Grid is particularly useful in applications such as health management and disaster management, where queries are made from low-profile mobile devices from anywhere at any time. The same model can also be used to update the data resources from mobile devices. Currently, we are engaged in deploying location-based services on top of the Grid and using these services in similar applications.

Figure 7 Overall performance

References:
1. I. Foster, C. Kesselman, S. Tuecke, “Anatomy of Grid”, in Grid Computing – Making the Global Infrastructure a Reality, edited by F. Berman, A. Hey and G. Fox, 2003, John Wiley & Sons, Ltd, ISBN: 0-470-85319-0.
2. Ian Foster, “What is Grid? A Three Point List”, Argonne National Laboratory & University of Chicago, July 20, 2002.
3. GLOBAL GRID FORUM (GGF) http://www.gridforum.org/
4. I. Foster et al., “The Open Grid Service Architecture, Version 1.0”. Available at http://www.gridforum.org/documents/GFD.30.pdf, published on 25th January 2005.
5. GLOBUS Alliances http://www.globus.org/
6. M. Antonioletti, M.P. Atkinson, R. Baxter, A. Borley, N.P. Chue Hong, B. Collins, N. Hardman, A. Hume, A. Knox, M. Jackson, A. Krause, S. Laws, J. Magowan, N.W. Paton, D. Pearson, T. Sugden, P. Watson, and M. Westhead, “The Design and Implementation of Grid Database Services in OGSA-DAI”. Concurrency and Computation: Practice and Experience, Volume 17, Issue 2-4, Pages 357-376.
7. K. Karasavvas, M. Antonioletti, M.P. Atkinson, N.P. Chue Hong, T. Sugden, A.C. Hume, M. Jackson, A. Krause, C. Palansuriya, “Introduction to OGSA-DAI Services”. Lecture Notes in Computer Science, Volume 3458, Pages 1-12, May 2005.
8. Database Access and Integration Service Group http://www.cs.man.ac.uk/grid-db/
9. GRIDLAB, Mobile Grid http://www.gridlab.org/WorkPackages/wp-12/
10. M. Parashar, “Conceptual and Implementation of Model of Grid”. Proceedings of IEEE, vol. 93, No. 3, March 2005, pages 653-668.
11. GLOBUS WSRF http://www.globus.org/wsrf/
12. OASIS project http://www.oasis-open.org/home/index.php
13. GLOBUS Programmer’s Tutorial: http://gdp.globus.org/gt4-tutorial/
14. Paul Watson, “Database and the Grid”, in Grid Computing – Making the Global Infrastructure a Reality, edited by F. Berman, A. Hey and G. Fox, 2003, John Wiley & Sons, Ltd, ISBN: 0-470-85319-0.
15. GLOBUS GRID FTP http://www.globus.org/grid_software/data/gridftp.php
16. XML DATABASE http://xml.apache.org/xindice/
17. DAIS Working Group http://forge.gridforum.org/projects/dais-wg
18. Open Middleware Infrastructure Institute (OMII) U.K. http://www.omii.ac.uk/about/employment.jsp
19. M. Atkinson et al., “A new Architecture for OGSA-DAI”, Proceedings of the UK e-Science All Hands Meeting 2005, September 2005.
