Journal of Grid Computing (2004) 2: 109–120
© Springer 2004
Vega: A Computer Systems Approach to Grid Computing
Zhiwei Xu, Wei Li, Li Zha, Haiyan Yu and Donghua Liu
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
E-mail: {zxu, liwei, char, yuhaiyan, dliu}@ict.ac.cn
Key words: Agora, Grid architecture, Grid programming, Grip, OGSA, Virtualization
Abstract
In this paper, we contrast four approaches to Grid computing and discuss a computer systems approach in detail. This approach views a Grid as a distributed computer system, and its main concerns are systems abstractions and constructs, such as the Grid equivalents of computer architecture, address space, process, device, file system, and user/developer’s interface. Following this methodology, we identified several essential issues, developed a Vega Grid systems architecture, and proposed several systems techniques such as Grid routers, Grid address spaces, Grid process (grip), Grid community (agora), and a Grid Service Markup Language (GSML) software suite. We evaluated the computer systems approach and the associated techniques by discussing an OGSA-compliant Grid system software implementation and a travel agency example.
1. Introduction
The Vega Grid team has witnessed a rapidly growing interest in Grid technology in China. The application domain has also expanded significantly, from scientific computing to applications in manufacturing, transportation, drug discovery, environment, natural resources, scientific databases, education, and e-governance. Many Chinese users are easily drawn to the Grid vision. They appreciate the idea of nontrivial quality of service through resource sharing, collaboration, and integration over a distributed platform (e.g., the Internet), based on general-purpose open standards. They like the promise of a single system image and utility computing, with users and resources in multiple domains still maintaining autonomous control. At the same time, two sets of questions are often asked by users and developers of Grids. One set of questions is: How is Grid technology different? After all, people have been building what we call Grids today for many years. Many existing distributed systems already provide nontrivial quality of service with resource sharing, collaboration, and integration. Some are even based on open standards. The utility computing concept is not new either. People have been trying to provide information utilities since the 1960s.
Another set of questions arises in the Grid research community. This community has made great strides in the past few years, generating important standards such as the Open Grid Services Architecture (OGSA) [7] standard put forth by the Global Grid Forum. Nevertheless, a question remains: How should these standards be implemented? For instance, who should be responsible for executing the protocols in the OGSA standard? Is an all-embracing Grid middleware the only way? To answer these questions, we summarize four approaches and present in detail a computer systems approach used by the Vega Grid project. The Vega approach aims to develop a Grid platform that provides the following properties, which we conveniently term VEGA properties:
• Versatile services: Various application services can be provided by integrating a few key systems constructs.
• Enabling intelligence: The Grid platform should provide common support to help achieve automatic, dynamic, and interactive properties, thus reducing cost.
• Global uniformity: From the user’s viewpoint, the Grid provides connectivity, single system image, and interoperability.
• Autonomous control: For management, the Grid is an open architecture comprised of decentralized domains.
To validate the usefulness of these properties, we note that others in the community also use similar properties in their approaches. Examples of such systems are Avaki [2], the Globus Alliance [10], the Java CoG Kit [17], and Chimera [9]. However, the Vega approach has three main premises: (1) Identify lasting computer systems problems essential to Grids, which are abstractions from common requirements of current and future Grid applications; (2) Leverage the knowledge accumulated in the past 40 years in computer systems design; (3) Leverage current Grid research and industrial development efforts, such as OGSA and mainstream Web service standards.
This paper is organized as follows. Section 2 summarizes four approaches to Grids and comments on related work. The remaining sections follow the computer systems approach and discuss issues, architecture, and system techniques. Section 3 discusses several computer system problems essential for Grids, integrated into the metaphor of man–computer society. Section 4 presents an OGSA-compliant Grid systems architecture and several systems techniques, including Grid routers, Grid address spaces, Grid process (grip), Grid community (agora), and the Grid Service Markup Language (GSML) software suite. Section 5 discusses implementation and evaluation. Section 6 offers some concluding remarks.
2. Four Approaches to Grids
We first need to differentiate two concepts, which are often confused in practice:
• Distributed application systems with Grid characteristics. These systems are simply called Grids. Grid characteristics include nontrivial quality of service, resource sharing, collaboration, integration, single system image, and autonomous control.
• Grid technology. It is an emerging technology that provides a specific methodology and a set of tools to build and use Grids, as exemplified by OGSA, J2EE Web services [27], and Microsoft.Net [23].
While we develop Grid technology to better support application Grids, a distributed application system with Grid characteristics does not necessarily utilize Grid technology. We can summarize four approaches that have been used in the research and
Figure 1. Four approaches to Grids.
development of Grids, which can be grouped into two classes (see Figure 1). The four approaches are not necessarily mutually exclusive, and a Grid application system may use a mixture of them. The first class is called the solution-oriented approach, which emphasizes the total solution for a specific application that presents Grid characteristics. The underlying technology is of secondary consideration, and often consists of a bag of available technical tools, patterns, and ad hoc rules. This approach is probably the most widely used today. The second class is called the platform approach, which tries to provide a general-purpose platform to support the development and operation of Grid applications. The Grid platform aims to provide common support for Grid features, so that an application developed to run on top of this platform can focus on application-specific functionality. If we compare Grids to semiconductor chips, the solution-oriented approach is very similar to the ASIC (Application Specific Integrated Circuit) approach. The platform approach is more like the microprocessor chip approach. The platform approach can be further divided into three approaches. The middleware approach focuses on realizing Grid capability through a general-purpose middleware platform. Its basic methodology is not unlike that of CORBA [29] and the earlier IBM San Francisco [16] project, if we set aside new features such as service orientation. Many Grid-related software packages are labeled as middleware, as they run on top of native operating systems but below applications. The middleware view of Grids is still very prevalent, so much so that many people equate Grid technology with a kind of middleware. The network systems approach inherits wisdom from the Internet and the Web. It views a Grid as a network of resources, and focuses on networking issues such as naming, protocols, interfaces, QoS, and
the Grid network architecture. A basic principle often utilized in the network approach is the end-to-end argument [25]. The W3C Web service standards and the OGSI standard both have a network flavor. The computer systems approach views a Grid as a distributed computer system, especially at the hardware and the system software levels. It tries to leverage lessons learnt and knowledge accumulated in the past 40 years of computer and operating systems development (e.g., [13, 14, 26]). Its main concerns are systems abstractions and constructs, such as the equivalents of computer architecture, address space, process, device, file system, user/developer’s interface (e.g., the instruction set architecture), etc. The computer systems approach also pays great attention to performance and to the partitioning of functionality among the hardware level, the system software level, and the other levels. While the middleware approach has many benefits, it would be wrong to equate Grid technology with only a kind of middleware. Grid functionality can be provided at the system software and the hardware levels as well. It is interesting to look at the history. When IBM PC operating systems migrated from DOS to Windows, the industry took two approaches. IBM developed a new operating system called OS/2. Microsoft developed Windows on top of DOS. This “middleware” approach adopted by Microsoft later disappeared. Today’s Microsoft Windows is a standalone operating system, without the need to run on top of DOS. Microsoft’s next generation of operating system, code named Longhorn [24], will again be a standalone operating system, without the need to run on top of a “native” operating system. In the new wave of Grid research that started in the late 1990s, the computer systems approach has been revived. For instance, the Globus project [10] tries to provide a Grid operating system kernel, while the Legion project [12] aims to build a worldwide computer.
IBM is consolidating its Grid-related activities into an On Demand Computing initiative, and is using WebSphere [15] as IBM’s brand of an Internet operating system. Microsoft also describes its Windows.Net as an Internet operating system, with Office.Net an interface to the net.
3. Man–Computer Society
In his Turing Award lecture of 1999 [11], Jim Gray pointed out three lasting visions of computer science
research, symbolized as Charles Babbage’s computers, Vannevar Bush’s Memex, and the Turing Test. Bush’s vision mainly deals with the problem of how to design and use computers to help humans, that is, the human–computer relationship. After Bush published his article “As We May Think” in 1945, the 1960s witnessed a surge in human–computer relationship research: J.C.R. Licklider published his influential article “Man–Computer Symbiosis” [20], the Multics team proposed time-sharing, and John McCarthy developed Lisp. These studies produced significant innovations that are still in use today, such as process, virtual address space, time-shared operating system, and dynamic scoping. As we enter the 21st century, network computing is becoming a mainstream paradigm, with Grid as a driving technology. Grid systems have significantly changed the landscape of the human–computer relationship. Man–computer symbiosis has shifted to a new metaphor of man–computer society, as characterized by the following six distinct features of a Grid:
• People and Computers. Although the idea of a networked world is touched upon, man–computer symbiosis is concerned mainly with 1-person–1-computer systems. However, a Grid is an m-person–n-computer system, where both m and n could be large. This raises several challenging systems issues. We must consider scalability, load balancing, and availability, while at the same time maintaining a single-system image. The human–computer relationship is no longer limited to man–computer interaction, but also includes computer–computer interaction and man–man interaction.
• Open Society. A Grid is a dynamic, open society of heterogeneous users and computing resources. We cannot assume a closed world with predetermined user requirements and resource space. User requirements, as well as the available resources, are always changing. A user or a resource can join or leave a Grid at any time.
A Grid program may reference a resource whose existence the programmer is not even aware of. Users and resources are often autonomous, not subject to the control of a centralized entity such as the kernel in UNIX.
• Standards-Based Interactions. The Grid program itself is no longer a pre-specified algorithm, but could change through interactions. Unlike the World Wide Web, which mainly allows users to read the contents of a Web page through the HTTP protocol, Grids call for a small set of more sophisticated, standards-based interaction operations for
write, operate, and inter-create. Furthermore, it is desirable that the user interface for interaction be as user-friendly as the Web interface (e.g., HTML). Users should not be forced to learn or perform procedural programming in order to create or use an interface page. Our goal is an interface with which a secretary should be able to program the Grid, and an executive should be able to use the Grid.
• 4A Patterns. The Grid computing paradigm inherits the utility concept from the electricity grid. We should be able to obtain and use computing resources the same way as we obtain electricity: at any time, any place, on any device, by any user. This calls for, among other things, characteristics of persistence and context.
• Full Lifecycle. Past computing models such as the Turing machine focus on only the usage phase of the lifecycle. With Grid computing, we must consider the full lifecycle, from development and deployment to usage and maintenance. The complexity metrics should not be just the algorithmic time/space complexity of an application run, but also new measurements such as time to production.
• High-Productivity Service. Although algorithmic time/space complexities are still important, performance evaluation in Grid computing should be more user-centric, emphasizing metrics such as user productivity and service-level agreements.
If we follow the computer systems approach and view a Grid as a distributed computer, we must ask several questions related to supporting man–computer society, such as the following: What should be the underlying theoretical model? What should be this Grid computer’s basic encoding? What should be this Grid computer’s system architecture? What should be this Grid computer’s address space? What should be this Grid computer’s process? What should be this Grid computer’s programming language? For the encoding question, we already have a consensus, which is XML. The remaining questions are still active research topics.
The Grid research community has made significant progress in answering the architecture question. The Global Grid Forum has proposed the OGSA architecture [7]. The OGSI standard [28] has several implementations available. However, much remains to be done at the OGSA Platform layer [8]. Furthermore, the OGSA/OGSI standards may be considered as offering a Grid system software architecture, and there is no standard yet for Grid system hardware.
4. Vega Grid Systems Techniques
Figure 2 illustrates the Vega Grid system architecture for a Grid computer system, including both software and hardware. This is an OGSA-compliant architecture. We will only discuss the salient features of the architecture here. Section 5 contains an example illustrating more details.
4.1. The Vega Hardware Architecture
The Vega Grid hardware architecture consists of multiple Grid nodes that could span wide-area locations, interconnected through the Internet. Within each node, there are one or more computers (servers) that host computing, storage, data, and program resources. All resources are encapsulated as services. Currently, the Vega Grid architecture supports Web services and OGSI-compliant Grid services (such as GT3 services). Two new architectural components are the Grid resource router [19] and switch. They differ from a network router/switch in that they do not just route messages, but provide a link to other Grid nodes and to the users. The main function of the Grid router/switch is to connect Grid resources together. A Grid switch connects resources within a Grid node, which is often managed by one institution. A Grid router connects resources among multiple Grid nodes, which could be owned by multiple organizations. These Grid router/switch features are already implemented in software on a Linux server. A dedicated hardware box is being implemented with Dawning 4000A, a Grid-enabling superserver under development at ICT. The goal is to enable a supercomputer or network storage device to be easily connected to a Grid, in ways similar to today’s routers connecting machines to a network.
4.2. GSML and Abacus
To enable effective Grid programming, we are developing a suite of software, called the GSML [18] (Grid Service Markup Language) suite, which includes a GSML language and a set of tools. The GSML is an XML-based markup language for users to program a Grid and to access Grid services.
GSML supports services as first-class citizens. Its main constructs include elements, transitions, conditionals, a block construct to provide structure, and a display construct to support presentation. The work most related to GSML is Microsoft’s XAML language [6], which is a flexible language allowing the mixing of XML tags and executable code
Figure 2. The Vega Grid system architecture.
written in C#. GSML differs in that its design focuses on ease of use by limiting expressive power. GSML is more powerful than HTML but less powerful than XAML. Our goal is an interface with which a secretary should be able to program the Grid, and an executive should be able to use the Grid. In case a Grid application needs more expressive power than GSML can provide, a lower-level language called Abacus can be used. Abacus is a service-oriented programming language that is as powerful as a Turing machine, but requires the skills of a programmer instead of a secretary. We believe that many Grid applications, when eventually presented to the end user, will only require the expressive power of GSML. We view Abacus as an assembly language, and GSML as a high-level language, of the Grid computer. A tool called the GSML browser is provided at the client side, which renders GSML pages and provides an interface for users to interact with a Grid. This Grid browser is different from a Web browser in that
it allows the users not only to read contents from a Grid, but also to write to and to operate a Grid, by sending service requests to the Grid side. A protocol called Grid Service Request Protocol (GSRP) is used between the GSML server and the GSML browser. GSRP supports wide-area services with a number of techniques, such as persistent handles and soft connections.
4.3. The EVP Model and the Vega GOS
Learning from experiences in computer architecture and operating systems design, we developed an address space model for Grids, called the EVP Grid resource space model. Based on the EVP model, we developed a software runtime called the Vega Grid operating system (Vega GOS), using the Globus Toolkit 3 as a kernel. See Section 5.1 for more details. Native resources and services located in individual Grid nodes are called physical resources (P), such as Web services and GT3 Grid services. These resources
and services are abstracted into location-independent virtual resources (V) at the Grid operating system layer (i.e., the OGSA Platform layer). The Grid users see effective resources (E), e.g., through a GSML page. A technical staff member can use a software tool called Grid Resource Mapper to convert native Grid resources into virtual resources, which are presented and organized in a form that a secretary can understand. The secretary can then use another software tool called GSML Composer to browse and select virtual resources and integrate them into a GSML page.
4.4. Grip and Agora
Learning from traditional operating system research, we developed a runtime construct called grip (for Grid process) [21] in the Vega Grid architecture. A grip is a handle for users to access Grid resources and services. It helps the Vega GOS maintain user data structures and handle service composition, name translation, integrated service discovery and access, and exception handling. Agora [30] is a persistent construct realizing the virtual organization (VO) concept, which is frequently used in the Grid research community but rarely precisely defined. In the Vega Grid architecture, an agora has a precise definition, consisting of Grid subjects (e.g., Grid users), Grid objects (e.g., resources and services), policies, and context. Working with grip and the Vega GOS, the agora construct provides support for policy enforcement (e.g., Grid access control [3]) and context utilization. The grip and the agora constructs are helpful for dealing with issues of open society and 4A usage in the man–computer society metaphor.
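To make the effective/virtual/physical layering of Section 4.3 more concrete, the following Java fragment gives a minimal sketch of a two-step name translation. It is not taken from the Vega GOS: the class EvpResourceSpace, its methods, the "vres:" naming scheme, and the example.org endpoint are all hypothetical names introduced for illustration. Only the AirlineFactoryService URL is the one that appears in Table 2.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of EVP-style name translation. The class and method
// names are illustrative assumptions, not part of the Vega GOS API.
final class EvpResourceSpace {
    // Effective name (as referenced in a GSML page) -> virtual name.
    private final Map<String, String> effectiveToVirtual = new HashMap<>();
    // Virtual name (location independent) -> physical service endpoint.
    private final Map<String, String> virtualToPhysical = new HashMap<>();

    void bindEffective(String effectiveName, String virtualName) {
        effectiveToVirtual.put(effectiveName, virtualName);
    }

    void bindVirtual(String virtualName, String physicalEndpoint) {
        virtualToPhysical.put(virtualName, physicalEndpoint);
    }

    // Two-step translation E -> V -> P; empty if either mapping is missing.
    Optional<String> resolve(String effectiveName) {
        return Optional.ofNullable(effectiveToVirtual.get(effectiveName))
                .map(virtualToPhysical::get);
    }

    public static void main(String[] args) {
        EvpResourceSpace space = new EvpResourceSpace();
        space.bindEffective("Airline", "vres:airline-reservation"); // assumed naming scheme
        space.bindVirtual("vres:airline-reservation",
                "http://vega.ict.ac.cn/ogsa/services/examples/AirlineFactoryService");
        System.out.println(space.resolve("Airline").orElse("unresolved"));

        // Rebinding the virtual name redirects callers of "Airline"
        // without any change on their side.
        space.bindVirtual("vres:airline-reservation",
                "http://example.org/services/AirlineFactoryService"); // placeholder URL
        System.out.println(space.resolve("Airline").orElse("unresolved"));
    }
}
```

The point of the sketch is only that code written against the effective name never sees an endpoint; in the real system the translation would additionally take agora policy and context into account.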
5. Implementation and Evaluation
A prototype of the Vega GOS and the associated utilities on top of it have been implemented in Java, to validate the Vega Grid architecture and core techniques, such as GSML, agora, grip, and the EVP Grid resource space. Most core services of the Vega GOS (i.e., the OGSA Platform layer) have been heavily debugged and integrated into a package called Vega GOS 1.0. This package has been released and deployed on the China National Grid. In this section, we describe the implementation, illustrate some salient features with a travel agency example, and evaluate the advantages and tradeoffs of the Vega Grid approach.
5.1. Implementation
The Vega GOS design and implementation follow the premises we set out in the Introduction. That is, we need to achieve the Grid goals of resource sharing, collaboration, and integration by leveraging past computer systems research and current Grid research. Our research should merge seamlessly with international research settings as well as commercial efforts. We have implemented the Vega GOS as shown in Figure 3. While providing a single-system image, the Vega GOS is a distributed software system, with a copy running on each Grid node. Note that Figure 3 only shows our current implementation. It does not imply that the Globus Toolkit is concentrated on the OGSI layer. We are well aware that the Globus Alliance as well as the commercial Web service software suites are building up tools to cover the OGSA Platform layer. The Vega GOS software will utilize these tools when they become available.
Kernel, Core, and Utilities. Learning from traditional operating system design, we divide the Vega GOS into three layers: kernel, core, and utilities. Our main focus is on core and utilities, built on top of a kernel software that is widely available, such as the open source Globus Toolkit, a J2EE Web application server (e.g., Apache Axis), or Microsoft.Net.
Virtualization. The EVP resource space model is the foundation for virtualization in the Vega GOS. In the current implementation, the utility layer usually only references effective services, which are translated by the Vega GOS runtime into virtual services, taking context and policy into consideration. The Address Space Management module is responsible for mapping virtual services into physical services (e.g., GT3 Grid services or Web services).
OGSA Compliance. The Vega GOS is made OGSA-compliant by mapping the aforementioned three layers to the OGSA architecture as shown in Figure 3.
At the Kernel layer (the OGSI layer), the Vega GOS supports the encapsulation of resources as Grid services or web services, and uses GT3 software as the runtime environment for Grid service. It also supports basic Grid services (e.g., GridFTP, GASS, and GRAM) and other OGSI features (e.g., GSI). At the Core layer (the OGSA Platform layer), the Vega GOS provides a set of core services implemented as runtime modules, including:
Figure 3. Implementations of Vega GOS.
• Grid-level task management. Creation, execution and scheduling of grips; exception handling; management of the relations among grips, agoras and the Grid address space.
• Global data management. This module is mainly used to maintain a global file directory and to provide a global data space, while hiding low-level file service implementations such as GASS and GridFTP.
• Grid address space management. The functions include address space related operations such as virtual-physical resource mapping, service registry, service discovery, and service deployment.
• Agora management. Management of policy and context governing the interaction between Grid users and resources.
At the Utility layer (the OGSA application layer), the Vega GOS provides two important tools: the GSML tool suite and the Abacus programming environment. The Vega GOS also provides other utility tools, such as system monitoring, resource registry, data transfer, and user management.
5.2. A Travel Agency Example
A Grid-enabled travel agency application integrates three types of physical services (airline service, bus service, and hotel service) to provide a virtualized high-level service to tourists in China. This application differentiates three classes of service: VIP, regular, and economy. We discuss below how the Vega GOS supports such a Grid application by looking at the use case of a VIP customer, Mr. Zhao.
First, the travel agency creates an agora called TravelAgora, which relates registered customers to accessible resources via policies and contexts. The travel agency then provides an application written as two GSML pages, as illustrated in Table 1. This program can also be written by Mr. Zhao’s son, and stored locally in their family PC. Figure 4 shows two GSML browser screen shots when Mr. Zhao invokes this application to reserve a 5-day Beijing–Hangzhou tour for his family. A GSML document has three parts: head, tail, and body. The head and the tail parts are usually not visible via the browser interface, so that an end customer (e.g., Mr. Zhao) does not have to worry about the information contained in these two parts. Their contents are visible through a “show source code” feature in the GSML browser, to be used by a secretary (e.g., Mr. Zhao’s son) to change the information therein. The tail part mostly contains clean-up information. The head part contains document-wide information such as encoding, name space, and most importantly, policy and context information specified in an agora. This way, policy and context information is hidden from the user, and separate from user application logic. The body is where the application contents and logic are specified. The application logic of GSML is not a procedural logic, but a limited form of an interactive functional style. GSML does not provide control flow constructs such as conditional jumps. Interaction is supported mainly through conditional events via a construct called transition. Figure 5 illustrates how Mr. Zhao’s reservation request is transmitted to the physical services. The GSML code in Table 1 references effective services.
Table 1. Skeleton of the travel agency application code written in GSML.
// source file for travel.gsml
. . . . . . . . .
Welcome to our VEGA Travel Service!
Please fill in the following items:
User Name:
Current Location:
Destination:
Departure Time:
Days of Tour:
. . . . . .
// source file for Reservation.gsml
...
. . .
1. Airline: ${plane#2} from ${plane#3} to ${plane#4} at ${plane#5};
2. Bus: ${bus#2} from ${bus#3} to ${bus#4} at ${bus#5};
3. Hotel: ${hotel#2}, from ${hotel#3} to ${hotel#4};
4. Bus: ${busr#2} from ${busr#3} to ${busr#4} at ${busr#5};
5. Airline: ${planer#2} from ${planer#3} to ${planer#4} at ${planer#5}
. . .
They are abstract ones named “Airline”, “Bus”, and “Hotel”. Their parameters are also very simple, as illustrated in Figure 4(a). At this level, Mr. Zhao only needs to provide the minimal information required. This information is then passed to the TravelAgora, which has access to Mr. Zhao’s context information that is not explicitly specified in each request. For instance, it knows that Mr. Zhao is a VIP customer
with a big family of 12 people. The TravelAgora then converts all this information into location-independent virtual service requests, which are then mapped to physical services by the Vega GOS. For instance, an abstract Airline reservation request “Mr. Zhao from Beijing to Hangzhou on 2004-12-06” from Figure 4(a) will eventually be translated into a physical service invocation as shown in Table 2.
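The role the agora plays in this translation can be sketched as follows. This is a hypothetical Java illustration, not Vega GOS code: the class TravelAgoraSketch and its nested types are invented for this example, and the only element borrowed from the paper is the parameter list of the physical reserve call in Table 2 (level, nPeople, departure, arrival, date). The sketch assumes that service level and party size are the context items the agora contributes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: an agora supplies context that the user never types in.
// All type and field names here are illustrative assumptions.
final class TravelAgoraSketch {
    // Context the agora keeps for a registered customer.
    static final class UserContext {
        final String level;  // e.g., "VIP", "regular", or "economy"
        final int nPeople;   // size of the travelling party
        UserContext(String level, int nPeople) {
            this.level = level;
            this.nPeople = nPeople;
        }
    }

    // The minimal information entered on the GSML page.
    static final class EffectiveRequest {
        final String user, departure, arrival, date;
        EffectiveRequest(String user, String departure, String arrival, String date) {
            this.user = user;
            this.departure = departure;
            this.arrival = arrival;
            this.date = date;
        }
    }

    // Location-independent request with the agora's context filled in.
    static final class VirtualRequest {
        final String service, level, departure, arrival, date;
        final int nPeople;
        VirtualRequest(String service, UserContext ctx, EffectiveRequest req) {
            this.service = service;
            this.level = ctx.level;
            this.nPeople = ctx.nPeople;
            this.departure = req.departure;
            this.arrival = req.arrival;
            this.date = req.date;
        }
    }

    private final Map<String, UserContext> contexts = new HashMap<>();

    void register(String user, UserContext context) {
        contexts.put(user, context);
    }

    // Enrich an effective request with stored context; reject unknown users.
    VirtualRequest toVirtual(String service, EffectiveRequest request) {
        UserContext ctx = contexts.get(request.user);
        if (ctx == null) {
            throw new IllegalArgumentException("not registered in this agora: " + request.user);
        }
        return new VirtualRequest(service, ctx, request);
    }

    public static void main(String[] args) {
        TravelAgoraSketch agora = new TravelAgoraSketch();
        agora.register("Mr. Zhao", new UserContext("VIP", 12));
        VirtualRequest v = agora.toVirtual("Airline",
                new EffectiveRequest("Mr. Zhao", "Beijing", "Hangzhou", "2004-12-06"));
        System.out.println(v.service + " " + v.level + " " + v.nPeople + " "
                + v.departure + " -> " + v.arrival + " on " + v.date);
    }
}
```

Keeping the context lookup inside the agora is what lets the GSML page stay as small as the form in Figure 4(a); the mapping of the resulting virtual request to a physical endpoint is a separate step performed by the Vega GOS.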
Figure 4. GSML browser screenshots for travel application: (a) initial interface; (b) result interface.
Figure 5. Virtualization in an agora-based travel application.
5.3. Evaluation
The computer systems approach used in the Vega Grid development complements the network systems approach, and can offer several advantages. The Vega approach also makes some tradeoffs, notably between ease of use and expressive power of the Grid programming environment. These issues are discussed below.
Virtualization. Learning from the address space work in traditional computer systems (e.g., [5]) that has been successfully used for forty years, we developed a virtualization scheme based on the EVP resource space model. At the application level, an end user only sees effective services that hide many low-level details, as illustrated by the travel
agency example. The Vega GOS, together with the agora and grip, automatically maps such high-level effective services into physical services. This is akin to how the operating system, with the help of the CPU hardware, performs address translation. With the Vega GOS, an application can be implemented by several methods. The first is to use Java (or another language) and a low-level API (e.g., the GT3 API or a Web service API). This is probably the most common method used in current Grid application development. For instance, many applications in the UK e-Science project use a Web portal as an application entry point, and the portal server uses low-level APIs to make direct, location-aware calls to GT2 or GT3 [1]. The second method uses Java with the Vega GOS API, which utilizes virtual services and is location independent. The third method is to use the Abacus or the GSML tool suite, which access effective services and can utilize policy and context information.
Change Management. With virtualization, it is easier to adapt to dynamic changes of Grid resources. For instance, when a new airline route opens, it can be encapsulated into a Web service or a GT3 Grid service, and deployed into the Grid as a physical service. Then all customers can utilize this new resource. There is no need to change the application code.
Table 2. The source code of invocation of a physical airline service.
import org.ict.examples.Airline.*;
...
String level, departure, arrival, date;
int nPeople;
...
OGSIServiceGridLocator gridLocator = new OGSIServiceGridLocator();
Factory factory = gridLocator.getFactoryPort(
    new java.net.URL("http://vega.ict.ac.cn/ogsa/services/examples/AirlineFactoryService"));
GridServiceFactory airlineFactory = new GridServiceFactory(factory);
// Create a new AirlineService instance and get a reference to its Airline PortType
LocatorType locator = airlineFactory.createService();
AirlineServiceGridLocator airlineLocator = new AirlineServiceGridLocator();
AirlinePortType airline = airlineLocator.getAirlineService(locator);
...
// Call remote method 'reserve'
String strRet = airline.reserve(level, nPeople, departure, arrival, date);
// Get a reference to the GridService PortType and destroy the instance
...
Changes in user requirements are also supported. For instance, when a user wants to change her service policy, she only needs to switch the current agora specified in her GSML document head to a different agora that provides the needed policy. The application itself, specified in the GSML document body, does not have to be modified.
Separation of Concerns. The Vega GOS tries to separate several concerns of Grid application development. The GSML document body specifies application logic and user-Grid interactions. Context and policy are specified in an agora, whose name is given in the GSML document head. Name mappings among effective, virtual, and physical services are performed by the Vega GOS runtime. In the current implementation of the Vega GOS, we treat an agora as a persistent entity, which can only be changed by an agora manager via a software tool. End user applications cannot create, delete, or modify an agora. In this regard, an agora is different from the MPI communicator concept [22]. We trade off flexibility for simplicity, as we think dynamic policy and context are more than a secretary can handle.
Integrated Service Discovery and Access. Currently, many Grid applications must contain explicit code that first discovers a service and then accesses that same service. In the Vega GOS, GSML, Abacus, and the Vega GOS API do not force this separation upon developers. All that is needed is a service request. Service discovery and access are integrated. The Vega GOS and the grip corresponding to an application automatically handle the details.
Ease of Use versus Expressive Power. To make GSML usable by a secretary through a GSML browser/composer tool, we have deliberately limited its expressive power to be no more than that of finite-state machines. When more expressive power is needed, we allow the embedding of Abacus code in GSML, similar to embedding assembly code in a C program. In contrast, Microsoft XAML is very flexible, including the capability of a full-fledged programming language. GSML is meant mainly for ordinary users, while XAML and Abacus are meant for programmers.
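The separation of concerns and the integrated discovery and access described above can be illustrated with a minimal Java sketch. It is not Vega GOS code: the class AgoraSketch, the types Agora and PhysicalService, the methods request and bookTrip, and the service names AirlineA and AirlineB are all hypothetical, and the effective/virtual/physical mapping is simplified to a single virtual-to-physical table. The sketch only shows that application logic can refer to a virtual service name, that one request call can cover both discovery and access, and that switching the agora changes the physical service without touching the application.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these types belong to the Vega GOS.
public class AgoraSketch {

    /** Stand-in for a physical service: takes a request string, returns a reply. */
    interface PhysicalService {
        String invoke(String request);
    }

    /** An agora carries policy and context; here just a virtual-name to service table. */
    static final class Agora {
        private final String name;
        private final Map<String, PhysicalService> bindings = new HashMap<>();

        Agora(String name) {
            this.name = name;
        }

        /** Only an "agora manager" would call this; applications never do. */
        Agora bind(String virtualName, PhysicalService service) {
            bindings.put(virtualName, service);
            return this;
        }

        PhysicalService resolve(String virtualName) {
            PhysicalService service = bindings.get(virtualName);
            if (service == null) {
                throw new IllegalArgumentException(
                    "agora '" + name + "' has no binding for '" + virtualName + "'");
            }
            return service;
        }
    }

    /** Runtime entry point: one call performs discovery (resolve) and access (invoke). */
    static String request(Agora agora, String virtualName, String payload) {
        return agora.resolve(virtualName).invoke(payload);
    }

    /** Application logic: mentions only the virtual name "airline". */
    static String bookTrip(Agora agora) {
        return request(agora, "airline", "Beijing->Shanghai");
    }

    public static void main(String[] args) {
        Agora economy = new Agora("economy")
            .bind("airline", r -> "AirlineA reserved " + r);
        Agora premium = new Agora("premium")
            .bind("airline", r -> "AirlineB reserved " + r);

        // Same application code, different agora, different physical service.
        System.out.println(bookTrip(economy));
        System.out.println(bookTrip(premium));
    }
}
```

In this sketch bookTrip plays the role of the GSML document body and the Agora argument plays the role of the agora named in the document head: changing the policy means passing a different agora, not editing bookTrip.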
6. Concluding Remarks
Four approaches to building and using Grids are summarized in this paper. A main difference between Grid technology and previous technologies is that it puts more emphasis on a general-purpose, open, service-oriented platform. Three platform approaches are discussed. The middleware approach, although prevalent and important, is not the only path. The network systems approach and the computer systems approach are both viable ways of developing Grids. A good approach in practice may learn from all of them. This paper presents a computer systems approach used in the Vega Grid project. Our experience in the past three years shows that this approach helps us clarify application and resource requirements, reveal computer systems problems, and provide a reference methodology to organize our research activities. Using this approach, the Vega Grid team identified several core problems, which are integrated into the metaphor of man–computer society. We developed an OGSA-compliant Vega Grid systems architecture. Several systems techniques were also developed, such as Grid routers, Grid address spaces, Grid process (grip), Grid community (agora), and the GSML software suite. Implementation and evaluation reveal that the Vega approach has several advantages. It achieves virtualization and hides system details from users. It helps applications adapt to resource changes seamlessly. It helps separate the concerns of application logic and interaction, policy and context, and resource naming. Some of the Vega GOS techniques are already deployed and in use in the China National Grid project. After we gain more application experience and perform extensive testing, the Vega GOS code will be released in open source form to the Grid research community. We are also evaluating the newly announced WSRF and WS-Notification specifications [4], and exploring the opportunity to merge Web services and Grid services.
Acknowledgement
This work is supported in part by the National Natural Science Foundation of China (Grant No. 69925205), the China Ministry of Science and Technology 863 Program (Grant No. 2002AA104310), and the Chinese Academy of Sciences Oversea Distinguished Scholars Fund (Grant No. 20014010).
References
1. M. Atkinson, “UK e-Science and the National e-Science Centre”, Presentation to the China National Grid delegation, UK National e-Science Center, Edinburgh, October 2003.
2. Avaki, http://www.avaki.com, October 2004.
3. G. Bu and Z. Xu, “Access Control in Semantic Grid”, Future Generation Computer Systems, Vol. 20, No. 1, pp. 113–122, January 2004.
4. K. Czajkowski, D.F. Ferguson, I. Foster and J. Frey, The WS-Resource Framework, http://www.globus.org/wsrf/, 2004.
5. R.C. Daley and J.B. Dennis, “Virtual Memory, Processes, and Sharing in Multics”, in Proceedings of the 1st ACM Symposium on Operating System Principles, January 1967, pp. 12.1–12.8.
6. D. Esposito, “A First Look at Writing and Deploying Apps in the Next Generation of Windows”, Microsoft MSDN Magazine, Vol. 19, No. 1, January 2004.
7. I. Foster, C. Kesselman, J. Nick and S. Tuecke, “Grid Services for Distributed Systems Integration”, IEEE Computer, Vol. 35, No. 6, pp. 37–46, June 2002.
8. I. Foster and D. Gannon (eds), Open Grid Services Architecture Platform, http://www.ggf.org/ogsa-wg, 2003.
9. I. Foster, J. Voeckler, M. Wilde and Y. Zhou, “Chimera: A virtual data system for representing, querying, and automating data derivation”, in Proceedings of the 14th Conference on Scientific and Statistical Database Management, July 2002, pp. 37–46.
10. Globus Alliance, http://www.globus.org/
11. J. Gray, “What next: A Dozen Information-Technology Research Goals”, Journal of the ACM, Vol. 50, No. 1, pp. 41–57, January 2003.
12. A.S. Grimshaw and W. Wulf, “The Legion Vision of a Worldwide Virtual Computer”, Communications of the ACM, Vol. 40, No. 1, pp. 39–45, January 1997.
13. J.L. Hennessy, D.A. Patterson and D. Goldberg, Computer Architecture: A Quantitative Approach, 3rd edn. Morgan Kaufmann, 2002.
14. K. Hwang and Z. Xu, Scalable Parallel Computers: Technology, Architecture, Programming. McGraw-Hill: New York, 1998.
15. IBM, “Websphere Platform”, http://www.ibm.com/websphere
16. The IBM San Francisco Project, IBM Systems Journal, Vol. 37, No. 2, 1998, Special Issue.
17. G. von Laszewski, I. Foster, J. Gawor, W. Smith and S. Tuecke, “CoG Kits: A Bridge between Commodity Distributed Computing and High-Performance Grids”, in Proceedings of the ACM 2000 Conference on Java Grande, June 2000, pp. 97–106.
18. B. Li, W. Li and Z. Xu, “Implementation issues of a Grid Service Markup Language”, in Proceedings of the 4th International Conference on Parallel and Distributed Computing, Applications and Technologies, August 2003, pp. 620–624.
19. W. Li, Z. Xu, F. Dong and J. Zhang, “Grid Resource Discovery Based on a Routing-Transferring Model”, in Proceedings of the 3rd International Workshop on Grid Computing, November 2002, pp. 145–156.
20. J.C.R. Licklider, “Man–Computer Symbiosis”, IRE Transactions on Human Factors in Electronics, HFE-1, pp. 4–11, 1960.
21. T. Liu, X. Li, W. Li, N. Sun and Z. Xu, “Notes on a Run-Time Construct for Grid”, Journal of Computer Research and Development, Vol. 40, No. 12, pp. 1811–1815, December 2003.
22. Message Passing Interface Forum, MPI: A Message-Passing Interface Standard, 1994.
23. Microsoft, .NET Framework, http://www.microsoft.com/net/
24. Microsoft Longhorn Developer Center Home, Understanding Longhorn, MSDN website, http://msdn.microsoft.com/Longhorn/understanding/, October 2004.
25. J.H. Saltzer, D.P. Reed and D.D. Clark, “End-to-End Arguments in System Design”, ACM Transactions on Computer Systems, Vol. 2, No. 4, pp. 277–288, 1984.
26. A. Silberschatz, P. Galvin and G. Gagne, Operating System Concepts, 6th edn. Wiley, 2001.
27. Sun Microsystems, Java 2 Platform Enterprise Edition, http://java.sun.com/j2ee
28. S. Tuecke, K. Czajkowski, I. Foster, J. Frey et al., Open Grid Services Infrastructure (OGSI) Version 1.0, Global Grid Forum Draft Recommendation, 2003.
29. S. Vinoski, “CORBA: Integrating Diverse Applications within Distributed Heterogeneous Environments”, IEEE Communications Magazine, Vol. 14, No. 2, pp. 46–55, February 1997.
30. H. Wang, Z. Xu, Y. Gong and W. Li, “Agora: Grid Community in Vega Grid”, in Grid and Cooperative Computing: 2nd International Workshop, December 2003, pp. 685–691.