Software Components Enable Wide-Area Supercomputing: Takeoff of the Albatross

Thilo Kielmann, Aske Plaat, Henri E. Bal
Dept. of Mathematics and Computer Science
Vrije Universiteit, Amsterdam, The Netherlands
http://www.cs.vu.nl/albatross/
April 1998

Abstract
Ever-growing computational demands, along with the need for efficient utilization of existing supercomputing facilities, have recently fostered the development of a software infrastructure for constructing large virtual supercomputers from geographically distributed resources that are linked by high-speed networks [10, 14]. Such virtual supercomputers are typically referred to as metacomputers. Recent work indicated the utility of metacomputers not only for embarrassingly parallel problems, but also for applications with finer communication granularity [4]. But metacomputing software is still in its early development stages, mainly concerned with recognizing the many sources of additional complexity introduced by wide-area computing. While the necessity of resource-aware applications has been recognized, there is hardly any programming support for building such applications. The goal of the Albatross project, currently conducted at the Vrije Universiteit, is to focus on programmability and performance of parallel applications on top of metacomputing platforms. In this paper, we propose the construction of component toolkits as the most promising approach to building metacomputing applications, because components allow programs to be made resource-aware and simultaneously resource-independent. Finally, we sketch a set of possible components of such a programming toolkit.
1 Introduction

New classes of high-performance applications are being developed that require unique capabilities not available in a single computer. Such applications are enabled by the construction of networked virtual supercomputers, or metacomputers [7]. Currently, several research projects are underway in order to provide so-called metacomputer infrastructure toolkits. Examples are Legion [14] and Globus [10]. Recent evaluations of metacomputing platforms
indicate the applicability of metacomputing to problems with medium- to coarse-grained communication [4, 9]. In contrast to traditional (single-platform) parallel computing environments, metacomputers exhibit a much higher system complexity that applications have to deal with. This complexity arises from the heterogeneity inherent to metacomputers, from the dynamically changing and unpredictable behaviour of network connections as well as of (possibly shared) computing nodes, and from the autonomy of the involved administrative domains:
Heterogeneity of hardware and software. Metacomputers are built by combining several platforms, ranging from dedicated supercomputers to networks of workstations. Obviously, CPU speed varies from platform to platform. Sometimes, CPUs even differ within a platform (e.g. in networks of workstations). Besides heterogeneous speed, binary formats also differ between platforms, making the distribution of program executables a non-trivial task. Because metacomputers are allocated dynamically, applications cannot make any static assumptions about the structure of the system on top of which they will execute.
Dynamic and unpredictable behaviour. A single platform, being a component of a metacomputer, can be assumed to have predictable behaviour as long as it can be allocated for exclusive use. In contrast, wide-area network connections typically rely on shared media (mostly the Internet). Given the highly dynamic behaviour of such links, the communication behaviour of a parallel application becomes highly unpredictable.
Site autonomy. Each platform that is part of a metacomputer constitutes its own administrative domain, typically with its own access policies. Metacomputers hence have to integrate several such domains, coping with security and authentication as well as with temporary (un)availability caused by the different time zones spanned by a globally operating system.
Fault tolerance. With an increasing number of cooperating entities, unpredictable system behaviour, and dynamically changing platform availability, the probability of (partial) system failure rises to a level that long-running applications can no longer afford to neglect.
2 Why Existing Approaches to Parallel Programming Fail

Parts of the complexity of metacomputers (like security and coarse-grain task assignment) should preferably be handled by a suitable software infrastructure at the middleware layer. But in order to achieve reasonable efficiency, important tasks concerned with the heterogeneity and dynamic behaviour of network and computing nodes remain to be dealt with by the application itself. From a software engineering viewpoint, this additional complexity must
not be fully exposed to the core application code, because this would lead to mixtures of application properties and algorithms with concerns related to the heterogeneous computation substrate as well as with dynamically changing network behaviour. As a consequence, application software becomes highly complex and extremely hard to maintain, analyze, and optimize, not to mention any kind of reuse of given algorithms or data structures. Unfortunately, most existing parallel applications are written in such a way. The reason why they work is either a sacrifice of portability, the assumption of homogeneous computing nodes and highly efficient network connections, or a combination of both. Experience with programming systems that aim at completely hiding system complexity from application code indicates that even in simple cases (e.g. with homogeneous, dedicated parallel computers), programmers have to tune communication behaviour manually [11, 12]. Hence, the application has to know which kind of behaviour is "expensive". These problems get even worse in metacomputers and have led, for example, to the introduction of application-level scheduling facilities [5]. As a consequence, all existing approaches to parallel programming fall short when applied to metacomputing. Into this category fall message-passing systems [17], automatic parallelization [11], distributed shared memory [3, 6], as well as distributed object computing [20].

The only way out of the dilemma outlined above seems to be the construction of applications that are resource-aware and simultaneously resource-independent, in order to make applications efficient as well as capable of adapting themselves to changing system properties. In the following, we outline how this goal can be achieved by constructing toolkits of reusable components, providing suitable abstractions of all sources of complexity, ranging from low-level communication and synchronization, via design patterns like managers and workers, up to parallel algorithm schemata. Our vision of building parallel applications for metacomputers is a process of gluing together given components, hence orthogonally integrating resource-awareness with resource-independent application code.
3 Component-Based Programming

Component-based programming, originally targeted at the construction of sequential programs [19], is based on functional composition making use of various flavours of procedure calls. Whereas the idea of gluing together existing components significantly improves the software development process, this composition mechanism is obviously not suited to the construction of parallel (and distributed) applications [1]. Consequently, the notion of software composition has to be transferred to the field of parallel applications. As outlined in [18], components at different abstraction levels can be identified here. Such components range from low-level idioms, like futures and remote procedure calls, via design patterns, like manager/worker, proxy, and active objects, to architectural styles like pipes and filters, blackboards, shared objects and data structures, and event-based models. Obviously, components at different abstraction levels require different composition mechanisms, whereas one kind of component can help glue together abstractions of other layers. For example, dynamic software composition can be achieved by structuring parallel applications into active objects that are composed via shared data structures and Linda-like, generative communication [16].
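To make this last point concrete, the following minimal Python sketch (our illustration only; the names TupleSpace, ActiveObject, and Squarer do not refer to any existing toolkit) shows active objects that never address each other directly, but are composed solely via a Linda-like tuple space:

```python
import threading

class TupleSpace:
    """A minimal Linda-like tuple space: entities communicate by generating
    tuples (put) and consuming them (take), without knowing each other."""
    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def put(self, *tup):
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def take(self, pattern):
        """Remove and return the first tuple matching 'pattern'; None fields
        act as wildcards. Blocks until a matching tuple exists."""
        with self._cond:
            while True:
                for tup in self._tuples:
                    if len(tup) == len(pattern) and all(
                            p is None or p == f for p, f in zip(pattern, tup)):
                        self._tuples.remove(tup)
                        return tup
                self._cond.wait()

class ActiveObject(threading.Thread):
    """An active object: encapsulated state plus its own thread of control,
    composed with other active objects only via the shared tuple space."""
    def __init__(self, space, name):
        super().__init__(daemon=True)
        self.space, self.name = space, name

class Squarer(ActiveObject):
    def run(self):
        while True:
            _, x = self.space.take(("task", None))      # anonymous consumer
            self.space.put("result", self.name, x * x)  # generative communication

if __name__ == "__main__":
    space = TupleSpace()
    Squarer(space, "worker-1").start()
    for i in range(3):
        space.put("task", i)
    for _ in range(3):
        print(space.take(("result", None, None)))
```

Because the producer of a tuple is decoupled from its consumer in both space and time, the same application code runs unchanged whether the tuple space is implemented locally or distributed across a metacomputer.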
One example of a collection of reusable abstractions for concurrently operating entities is the ACE toolkit [21]. It integrates lower-level entities like futures and singletons with design patterns like active objects, and even with architectural styles like event dispatching mechanisms and message queues. Because ACE is targeted at distributed object computing, it lacks suitable abstractions for parallel programming. Nevertheless, ACE is a valuable example of how to build component toolkits. The idea of identifying coordination patterns for parallel applications has already been proposed in previous work [13]. Here, the expression of parallel program structure is supposed to rely on the integration of reusable components. Nevertheless, the formulation of a thoroughly defined set of components is still the subject of ongoing work. Finally, the work on class libraries containing parallel algorithms should be mentioned [8]. Whereas such pre-implemented parallel algorithms allow application construction at a very high level, they typically lack mechanisms for composing multiple instances of these algorithms into applications. Furthermore, these parallel algorithm libraries are implemented directly on top of communication primitives (like message passing) and are hence as sensitive to the peculiarities of metacomputers as manually implemented parallel applications. In the following, we suggest the construction of component toolkits in order to provide programming platforms for metacomputing that orthogonally integrate resource-aware components with resource-independent, application-level components.
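Before turning to that toolkit, the lowest abstraction level mentioned above can be illustrated briefly: a future decouples issuing a (possibly remote, high-latency) request from consuming its result, which is exactly what wide-area links call for. The following Python sketch only illustrates the idiom and is unrelated to ACE's actual C++ interface; remote_call merely simulates a slow wide-area request:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def remote_call(x):
    """Stand-in for a request that crosses a slow wide-area link."""
    time.sleep(0.5)          # simulated wide-area latency
    return x * x

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Issue the requests immediately; each submit() returns a future.
        futures = [pool.submit(remote_call, x) for x in range(4)]
        # Overlap local computation with the outstanding remote requests.
        local = sum(range(1000))
        # Block only when the remote results are actually needed.
        print(local, [f.result() for f in futures])
```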
4 A Metacomputing Toolkit

The goal of this work is the assembly of a component toolkit that reflects and abstracts the properties of metacomputing platforms and allows the construction of resource-aware (and still resource-independent) applications. We can identify the following classes of components:
Configuration management. Most importantly, metacomputers expose highly varying and, especially, dynamically changing configuration properties. As a basis for any metacomputing application, components representing the configuration topology, like the clustering of CPUs into platforms and the resulting effects on communication behaviour, have to be provided. Special care has to be taken with dynamic changes, in network performance as well as in platform availability.
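A minimal Python sketch of such a configuration component is given below. The class and field names (Configuration, Platform, Link) and the platform names are hypothetical; in a real toolkit, the component would be fed by the monitoring services of the metacomputing infrastructure rather than by hand:

```python
from dataclasses import dataclass, field
import time

@dataclass
class Link:
    """Observed characteristics of a (possibly shared, wide-area) connection."""
    latency_ms: float
    bandwidth_mbps: float
    measured_at: float = field(default_factory=time.time)

@dataclass
class Platform:
    """One cluster or supercomputer participating in the metacomputer."""
    name: str
    cpus: int
    relative_speed: float          # normalized CPU speed
    available: bool = True

class Configuration:
    """Resource-aware component: reflects the current metacomputer topology
    so that resource-independent application code can query it."""
    def __init__(self):
        self.platforms = {}
        self.links = {}            # (platform_a, platform_b) -> Link

    def register(self, platform):
        self.platforms[platform.name] = platform

    def update_link(self, a, b, latency_ms, bandwidth_mbps):
        # Called periodically by a monitoring thread as network behaviour changes.
        self.links[(a, b)] = Link(latency_ms, bandwidth_mbps)

    def is_wide_area(self, a, b, threshold_ms=10.0):
        link = self.links.get((a, b)) or self.links.get((b, a))
        return link is None or link.latency_ms > threshold_ms

if __name__ == "__main__":
    conf = Configuration()
    conf.register(Platform("cluster-amsterdam", cpus=64, relative_speed=1.0))
    conf.register(Platform("cluster-delft", cpus=24, relative_speed=0.8))
    conf.update_link("cluster-amsterdam", "cluster-delft",
                     latency_ms=40.0, bandwidth_mbps=6.0)
    print(conf.is_wide_area("cluster-delft", "cluster-amsterdam"))   # True
```

Organizational patterns and algorithms (see below) consult such a component, for example to decide whether a piece of communication crosses a wide-area link and should therefore be batched or avoided.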
Shared data structures. Over the last years, shared data structures have been introduced as useful communication abstractions for parallel processes. Shared objects as in Orca [3] allow low-level communication, while event queues as in ActorSpaces [2] combine asynchronous messages with anonymous communication. Even higher-level containers like object spaces [15] or task queues are well suited to different application needs. Finally, persistent data structures seamlessly integrate file input/output into parallel applications. The core contribution of shared data structures is to abstract their efficient implementations into corresponding components, hence making applications simultaneously maintainable and efficient.
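As a flavour of such a component, the following Python sketch shows an Orca-like shared object with atomic operations. It is a local, thread-based illustration under our own naming; a metacomputing implementation would replicate the object and ship the operations across the wide-area network behind exactly the same interface:

```python
import threading

class SharedObject:
    """Shared data object: application code only invokes operations, while
    the component hides whether the object is replicated, where it lives,
    and how updates travel between platforms."""
    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def add(self, delta):
        # Operations are atomic; a distributed implementation would apply
        # them consistently on all replicas instead of taking a local lock.
        with self._lock:
            self._value += delta
            return self._value

if __name__ == "__main__":
    counter = SharedObject()
    threads = [threading.Thread(target=counter.add, args=(1,)) for _ in range(8)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(counter.read())   # 8
```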
Fault tolerance. With the increasing failure probabilities of metacomputers, applications have to take partial errors into account. The isolation of persistent tasks that are (re-)executed until they finally succeed is a key concept for metacomputing applications. The definition of parallel tasks for making fine-grained components fault tolerant is a natural, although non-trivial, extension of this concept.

Organizational patterns. Typical patterns for constructing parallel programs also have to be provided. Examples are components like manager and worker as well as divide-and-conquer or SPMD-style organizations. These components are ideally suited to interact with the configuration management components in order to adapt application behaviour to dynamically changing platform properties. One example of such adaptive behaviour is a manager component that takes the temporary unavailability of platforms into account when making scheduling decisions. Another example is an SPMD component that dynamically redistributes data in response to changing communication behaviour. A minimal sketch combining persistent tasks with the manager/worker pattern is given after this list.

Algorithms. At the highest abstraction level, parallel algorithms (along with composition mechanisms like pipes and filters) support metacomputing applications. Examples are data-parallel operations (e.g. for collective communication), numerical algorithms, and stencil computations. Such parallel algorithms, when constructed from the organizational patterns, easily transport adaptive behaviour to the application level.
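The sketch announced above combines the fault-tolerance and organizational-pattern components in a thread-based Python illustration; Manager, worker, and the simulated failure rate are our assumptions, not an existing API. A task remains pending until some worker reports success, so a failed or unreachable platform only delays, never loses, the task:

```python
import queue
import random
import threading

class Manager:
    """Manager/worker pattern with persistent tasks: a task stays in the
    pending pool until a worker reports success for it."""
    def __init__(self, tasks):
        self._pending = queue.Queue()
        self._results = {}
        self._lock = threading.Lock()
        self._total = len(tasks)
        for t in tasks:
            self._pending.put(t)

    def next_task(self, timeout=1.0):
        try:
            return self._pending.get(timeout=timeout)
        except queue.Empty:
            return None

    def report_success(self, task, result):
        with self._lock:
            self._results[task] = result

    def report_failure(self, task):
        self._pending.put(task)       # re-submit: the task persists until it succeeds

    def done(self):
        with self._lock:
            return len(self._results) == self._total

    def results(self):
        return dict(self._results)

def worker(mgr, failure_rate=0.3):
    while not mgr.done():
        task = mgr.next_task()
        if task is None:
            continue
        if random.random() < failure_rate:   # simulated platform failure
            mgr.report_failure(task)
        else:
            mgr.report_success(task, task * task)

if __name__ == "__main__":
    mgr = Manager(list(range(10)))
    threads = [threading.Thread(target=worker, args=(mgr,)) for _ in range(4)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(mgr.results())
```

An adaptive manager, as described above, would additionally consult the configuration management component before assigning a task, preferring platforms that are currently available and well connected.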
5 Conclusions

New classes of high-performance applications currently being developed require unique capabilities that can only be provided by metacomputers, which are composed of multiple high-performance computing sites, geographically distributed over various locations or even continents. In this work, we briefly presented the complexities introduced to parallel applications by metacomputing environments. We then argued that existing approaches to parallel programming fail to deliver reasonable metacomputing applications, because they either expose too much complexity, prohibiting maintainability, performance tuning, and reuse of code, or they hide too much complexity, resulting in inefficient programs and contradicting the primary goal of metacomputing: application speed. As (the only?) suitable approach to the construction of metacomputing applications, we then presented the concept of component-based programming and proposed the structure of a metacomputing toolkit. As the subject of our ongoing work is to concretize and implement suitable metacomputing components, we are looking forward to soon seeing the Albatross taking off and flying at full speed over wide-area computing platforms.
References

[1] G. Agha. Compositional Software Architectures. IEEE Concurrency, 6(1):2–3, 1998.
[2] G. Agha and C. J. Callsen. ActorSpace: An Open Distributed Programming Paradigm. In Proc. of the Fourth ACM Symposium on Principles and Practice of Parallel Programming, pages 23–32, San Diego, CA, 1993. Published in SIGPLAN Notices, Vol. 28, No. 7, 1993.

[3] H. E. Bal, M. F. Kaashoek, and A. S. Tanenbaum. Orca: A Language for Parallel Programming of Distributed Systems. IEEE Trans. Softw. Eng., 18(3):190–205, 1992.

[4] H. E. Bal, A. Plaat, M. G. Bakker, P. Dozy, and R. F. Hofman. Optimizing Parallel Applications for Wide-Area Clusters. In Proc. 12th International Parallel Processing Symposium (IPPS'98), pages 784–790, 1998.

[5] F. Berman and R. Wolski. Scheduling from the Perspective of the Application. In Proc. 5th IEEE Symp. on High Performance Distributed Computing, 1996.

[6] N. Carriero, D. Gelernter, T. G. Mattson, and A. H. Sherman. The Linda alternative to message-passing systems. Parallel Computing, 20(4):633–655, 1994.

[7] C. Catlett and L. Smarr. Metacomputing. Commun. ACM, 35:44–52, 1992.

[8] J. Dongarra, R. Pozo, and D. Walker. ScaLAPACK++: An object-oriented linear algebra library for scalable systems. In Proc. Scalable Parallel Libraries Conference, pages 216–223. IEEE, 1993.

[9] I. Foster, J. Geisler, W. Nickless, W. Smith, and S. Tuecke. Software Infrastructure for the I-WAY High-Performance Distributed Computing Experiment. In Proc. 5th IEEE Symp. on High Performance Distributed Computing, pages 562–570, 1996.

[10] I. Foster and C. Kesselman. The Globus Project: A Status Report. In Proc. IPPS/SPDP '98 Heterogeneous Computing Workshop, 1998.

[11] G. C. Fox. The Application Perspective for Scalable Data and Task Parallel Languages HPF and HPC++. In 1st Symposium on High Performance Computing and Communications, pages 445–457, Arlington, VA, USA, 1994. ARPA-CSTO.

[12] B. Freisleben and T. Kielmann. Automated Transformation of Sequential Divide-and-Conquer Algorithms into Parallel Programs. Computers and Artificial Intelligence, 14(6):579–596, 1995.

[13] B. Freisleben and T. Kielmann. Coordination Patterns for Parallel Computing. In D. Garlan and D. Le Métayer, editors, Coordination Languages and Models, number 1282 in Lecture Notes in Computer Science, pages 414–417, Berlin, Germany, 1997. Springer. Proc. COORDINATION'97.

[14] A. S. Grimshaw and W. A. Wulf. Legion: A View From 50,000 Feet. In Proc. Fifth IEEE International Symposium on High Performance Distributed Computing. IEEE Computer Society Press, 1996.
[15] T. Holvoet and T. Kielmann. Behaviour Specification of Parallel Active Objects. Parallel Computing, 1998. To appear in special issue on coordination languages and systems.

[16] T. Holvoet and T. Kielmann. Towards Generative Software Composition. In Proc. of the Thirty-First Annual Hawaii International Conference on System Sciences, volume 7, pages 245–254, Kona, Hawai'i, USA, 1998. IEEE.

[17] Message Passing Interface Forum. MPI: A Message Passing Interface Standard. International Journal of Supercomputing Applications, 8(3/4), 1994.

[18] O. Nierstrasz. Coordination Patterns. Tutorial Material, COORDINATION'97, Berlin, Germany, 1997.

[19] O. Nierstrasz and D. Tsichritzis, editors. Object-Oriented Software Composition. Prentice Hall, 1995.

[20] Object Management Group. The Common Object Request Broker: Architecture and Specification. (Draft) edition 2.0, 1995.

[21] D. C. Schmidt. A Family of Design Patterns for Flexibly Configuring Network Services in Distributed Systems. In International Conference on Configurable Distributed Systems, pages 124–135, 1996.