Journal of Systems Architecture 48 (2002) 81–98 www.elsevier.com/locate/sysarc
Building a dependable system from a legacy application with CORBA

D. Cotroneo a,*, N. Mazzocca b, L. Romano a, S. Russo a

a Università degli Studi di Napoli ‘‘Federico II’’, Dipartimento di Informatica e Sistemistica, Via Claudio 21, 80125 Napoli, Italy
b II Università degli Studi di Napoli, Via Roma 29, 81031 Aversa (CE), Italy

* Corresponding author. Tel.: +39-081-7683824; fax: +39-081-7683216. E-mail address: [email protected] (D. Cotroneo).
Abstract

This paper presents a dependability oriented, fault tolerance based system design, development, and deployment approach. The approach relies on an architectural framework, which allows legacy software modules to be reused as the basic building blocks of a distributed dependable application. Different levels of replication and alternative adjudication strategies are implemented behind a unified interface. These can be configured for achieving the optimal compromise between dependability and performance, according to application, deployment environment, and fault characteristics. The suggested solution can be implemented on top of any CORBA infrastructure. The architecture has been developed and tested. Experimental results are presented and discussed. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: Legacy systems; Fault tolerance; Replication; Adjudication; Multi-tier architectures; CORBA
1. Rationale and approach

Software systems have functional requirements (i.e., what services the system has to provide), and non-functional requirements (i.e., the quality the system must guarantee in the delivery of such services). Typical functional requirements are business specific services, and typical non-functional requirements are dependability and performance. Dependability is defined as the quality of the delivered service such that reliance can justifiably be placed on the service it delivers [5]. As such, it is a composite attribute which consists of an application dependent mixture of a variety of properties, such as reliability, availability, safety, maintainability, and security. Fault tolerance is a major category of dependability techniques, and the only one applicable when the dependability of individual components cannot be improved and a solution must be found at the architectural level. By fault tolerance we mean the ability of a system to behave correctly even in the presence of faults. However, several factors limit the deployment of fault tolerance solutions on a large scale. First, fault tolerance often involves some form of adjudication. Replication is indeed the most commonly used redundancy technique when specific assumptions on the application cannot be made. Adjudication is a flexible means of building
a single result from multiple sources of information, which is particularly useful when the pool of available resources and the dependability requirements of the application vary dynamically [21]. Unfortunately, replication and adjudication have extra costs, which typically include additional hardware and software, and/or additional execution time (i.e., a performance penalty). A cost-effective design, development, and deployment approach is required to implement different replication and adjudication strategies efficiently, in order to meet fault-tolerance requirements while minimizing the negative effects on performance. We suggest a dependability oriented, fault tolerance based system design, development, and deployment approach for the reuse of legacy applications. The suggested approach is suited to a wide spectrum of legacy applications. In this context, by legacy application we mean a software program for which maintenance actions are either impossible or prohibitively costly. This may be due to a variety of reasons, including:

• The application is written in a programming language which has become obsolete compared with the rest of the technologies used by the enterprise to develop its business applications. An example legacy application might be a COBOL program in an enterprise environment where web oriented technologies are in widespread use;
• The application is not well documented. As an example, it has been modified over the years by different people, who are no longer available and/or who have not adequately documented the modifications they performed;
• The application runs on a dedicated hardware platform, and the following two conditions hold: (i) the original platform does not support the deployment environments currently adopted by the enterprise, and (ii) the cost of rewriting the application in order to migrate it to a new platform is prohibitive.

There are a large number of legacy applications around. These are often a valuable resource to the enterprise, for three fundamental reasons:
• They fully satisfy the application's functional requirements, i.e., they work well in all currently needed ‘‘use cases’’;
• They play a crucial role in the production chain, since they are strongly coupled with the rest of the information infrastructure;
• They represent a significant investment, since they are the result of costly business analysis processes and projects.

However, the dependability level achieved by these applications, which might have been satisfactory at the time of the original design, has become more and more inadequate over the years, due to the fast pace at which the dependability requirements of software applications increase. Solutions are needed to improve the quality of the services which legacy applications deliver. The strategy we suggest provides an effective means to achieve dependability at a low cost, since it has some important advantages, which are briefly discussed in the following. First, re-programming of existing functions is avoided, since the existing software implementing those functions is integrated into the new system and thus reused [1]. Although this may involve wrapping the legacy modules which are to be integrated in the new system, we believe the final balance is positive under many circumstances. This is especially true if one adopts object oriented design (OOD) methodologies and an adequate middleware technology. Second, different levels of replication and alternative adjudication strategies are supported. The system can be configured so as to achieve the best compromise between fault tolerance and performance, according to the specific needs of the application at hand, as well as to the characteristics and the conditions of the deployment environment, and to the nature of faults. This is done behind a unified interface, which maximizes transparency to the clients. Third, the testing activity is reduced, since only the newly integrated mechanisms, which implement non-functional requirements, need to be validated. In fact, in the typical scenario, legacy modules have already been thoroughly tested and debugged during the years-long operational phase
prior to the integration. Since the software architecture and the hardware platform of the back-end tier of the system are left untouched, these modules do not need to be tested again after integration. Fourth, the resulting system is a distributed application based on standard technologies. As such, it does not mandate the development of a dedicated hardware platform. Instead, it can be deployed over a general purpose computing platform, which is either already available or can be acquired at a low cost. An example of such a platform is a Network Of Workstations (NOW). An architectural framework, implementing the suggested strategy, has been developed and applied to a case-study application. The framework can be deployed on top of any CORBA infrastructure, i.e. it does not mandate the adoption of a specific ORB implementation. Experimental results demonstrate that the architecture is able to achieve dependability at an acceptable performance penalty, even if stringent dependability requirements are enforced. The rest of the paper is organized as follows: Section 2 illustrates the assumptions we make and the definitions we use throughout the paper, including the fault models we have considered. Section 3 states the dependability objectives we want to achieve. Section 4 describes the architecture of the middle-tier server, and the protocols for system operation and re-configuration. Section 5 deals with implementation details of the system prototype. Section 6 presents experimental results about the cost of the wrapping and the trade-offs in terms of fault tolerance vs. performance introduced by replication, detection, and adjudication activities. Section 7 discusses pros and cons of our approach, as compared to some possible alternatives. Section 8 concludes the paper with final remarks and lessons learned.
2. Assumptions and definitions

Before illustrating in detail the approach we suggest, we need to clarify the conceptual model we adopt for the propagation of events within the system, and the assumptions we make about the legacy system and the supporting infrastructure.

2.1. Conceptual model

This section provides a brief overview of concepts crucial to fault tolerant computing, and clarifies the meaning of the terms we use throughout the paper. Most definitions are taken from [5], where further details can also be found. A system failure occurs when the delivered service deviates from the specified service, where the service specification is an agreed-upon description of the expected service. The failure occurred because the system was erroneous: an error is that part of the system state which is liable to have led to the failure. The cause of an error is a fault. A fault is transient if it disappears without repair; it is permanent if repair actions are needed to remove it from the affected system. The chain of events which leads from a fault in a component to a failure of the overall system is illustrated in Fig. 1. The figure reinterprets the error propagation scheme presented in [22] in the context of a three-tier architecture. The figure assumes that the CORBA interface in the first tier is the observation point for the overall system. That is, we say the system has experienced a failure only if the service exported to the CORBA ORB deviates from its specification. Faults and fault-related events propagate from the back-end to the system interface. As far as their origin is concerned, faults can be classified as internal or external [26]. In the back-end, faults, if activated, lead to errors, which in turn may produce failures of the back-end system. These failures represent external faults for the middle-tier. In the middle-tier, internal and external faults may generate errors. Finally, if an error affecting the middle-tier is activated, a system failure occurs.

2.2. Assumptions about the legacy system

As far as the legacy system is concerned, we assume that the application consists of a client and a server module, which interact according to the request/reply scheme. This is quite a common structure for legacy applications [6]. Nevertheless,
Fig. 1. System conceptual model and error propagation scheme.
the approach could be easily extended to legacy applications which do not have a client-server organization.

2.3. Assumptions about the deployment infrastructure

As far as the network is concerned, we assume that a local area network (LAN) connection exists between the middle-tier and the back-end. Such a connection is considered reliable, i.e. messages are always delivered without errors and with a limited delay. This is not a very strong assumption for currently available LAN technologies. As far as the middleware infrastructure is concerned, we assume the basic CORBA services are always available. In other words, we do not address issues related to faults affecting the infrastructure. Instead, we concentrate on faults in the back-end application.

2.4. Fault models

As far as faults/errors are concerned, we refer to the definitions given by Powell in [19]. In particular, we apply those definitions to the specific context of
a hierarchical system. In such a system, faults, errors, and failures propagate as described in Fig. 1. In this study, we only consider faults affecting the back-end servers. However, we do not handle such faults directly. Instead, we wait for the effects of such faults to manifest as failures of the back-end servers. Failures of the back-end servers are then treated as system level faults affecting the overall architecture. We consider two types of failures of the back-end servers, which correspond to two types of system level faults. These are briefly described in the following:

(1) Failures in the time domain (timing fault): A server instance S experiences a timing failure if a client C, trying to contact it, does not receive any reply within a predefined amount of time. This may occur for a number of reasons, including:
• Process expiration: the server process has crashed;
• Host crash: the machine hosting the server process instance is down or hung;
• Network failure: the machine hosting the server process instance is not reachable;
• Overload: the server process or host is too busy to deliver the requested service within the predefined timeout.

(2) Failures in the value domain (value fault): In this model, a server instance S experiences a failure if S is ‘‘alive and kicking’’, i.e. it does reply to stimuli from C, but the replies it delivers are incorrect, i.e., they do not conform to the specifications. This may occur for a number of reasons, including:
• Hardware-induced software errors: these are errors in the application due to faults affecting the underlying hardware platform [3];
• Communication errors: such errors occur when data gets corrupted while traversing the network infrastructure. Unless ‘‘leaky’’ communication protocols are adopted, this kind of error is fairly unlikely to happen. A leaky protocol is a protocol that allows corrupted information to be delivered to the destination [4];
• Software errors: these may be due to errors in the design and/or implementation process of the application and/or of the operating system.

It is worth noting that the timing fault model is not the most dangerous scenario, for at least two reasons. First, in this case the fault can be detected relatively easily. Second, it has a low probability of propagating to other components. Many permanent faults, for example, fit into this model [14]. The value fault model is useful for studying situations where the occurrence of a fault does not prevent the affected component from being operational, in the sense that it is still able to perform some work, although not conforming to the specifications, and it is also able to interact with other components. Transient faults are best represented by this latter model [15]. It should be noted that the dependability measures which result from these definitions are strongly dependent on the desired quality of service (QoS) level for the system under study. As an example, more stringent requirements on the timeliness of service delivery would result in a
higher rate of timing faults. A formal description of this phenomenon is presented in [24]. Similarly, more restrictive data integrity checks on replies delivered by the target system would result in a higher rate of value faults.
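To make the distinction between the two fault models concrete, the following minimal sketch shows how a middle-tier component could classify the outcome of a single back-end request: no reply within the deadline is recorded as a timing fault, while a reply that arrives in time but fails a specification check is recorded as a value fault. The class and method names, the use of a thread pool, and the ReplyCheck abstraction are our own illustration and do not appear in the paper.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative classification of a single back-end interaction into the two
// fault models of Section 2.4. All names and the timeout policy are assumptions.
enum FaultClass { NONE, TIMING_FAULT, VALUE_FAULT }

interface ReplyCheck {
    boolean conforms(byte[] reply);   // does the reply match its specification?
}

class FaultClassifier {
    private final ExecutorService pool = Executors.newCachedThreadPool();

    FaultClass invoke(Callable<byte[]> request, ReplyCheck check, long timeoutMillis) {
        Future<byte[]> future = pool.submit(request);
        try {
            byte[] reply = future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            // A reply arrived in time: accept it only if it conforms to the specification.
            return check.conforms(reply) ? FaultClass.NONE : FaultClass.VALUE_FAULT;
        } catch (TimeoutException noReplyInTime) {
            future.cancel(true);
            return FaultClass.TIMING_FAULT;   // late or missing reply: timing fault
        } catch (Exception crashedOrUnreachable) {
            return FaultClass.TIMING_FAULT;   // a crash or unreachable host is observed in the same way
        }
    }
}
```

Note that tightening the timeout or strengthening the ReplyCheck predicate directly raises the rate of timing and value faults, respectively, which is exactly the dependence on the desired QoS level discussed above.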
3. Dependability objectives

Our goal is to develop mechanisms which prevent faults occurring in the back-end from propagating to the system level and eventually generating failures of the overall system. In order to achieve this goal, a number of activities are necessary. Definitions are given further in this section, as individual activities are dealt with. Such definitions have been taken from [2] and adapted to our specific context. In general, the system will have to go through a number of steps. Which activities are performed at each step in specific situations depends on the type of the fault(s), on the level of replication currently available, and on the dependability settings of the middle-tier. Detailed descriptions of the individual activities are given in the following:

• Error detection: if a fault hits a back-end server and this generates an error, the system must recognize that something unexpected has occurred. The detected error represents a system level fault;
• Fault diagnosis: upon detection of a system level fault, the system must perform activities aiming at discovering the nature of the fault. In particular, it must be understood whether the fault is permanent or transient;
• Degradation: upon occurrence of a permanent fault, the system must exclude the affected component and continue operation. This may have an impact on the dependability characteristics of the system;
• Error processing: upon detection of an error in a system component, measures must be taken to condition the effects of such an error in the system. If enough replication is available, the system may hide errors by allowing redundant information to outweigh the incorrect information (error masking). In other cases, the system
might not be able to hide the errors, but it may be able to enforce a fail-silent behaviour. A system is said to be fail silent if it satisfies the fail-silent assumption, i.e. it either produces correct outputs or it does not produce any output at all [7]. Under many circumstances, the fail-silent behaviour can be a desirable property (both error-processing alternatives are sketched in code after this list);
• Fault treatment: even if the effects of errors are successfully conditioned or concealed, faults must be removed from the system, in order to prevent new errors from being generated and failures from eventually occurring;
• Re-integration: this is a crucial activity, since it prevents the exhaustion of system resources. It includes actions aimed at re-integrating resources into the system as they become available again.
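As a concrete illustration of the error-processing alternatives above, the following sketch adjudicates the replies returned by a set of replicated back-end servers: when at least a quorum of replicas agree, the error is masked; otherwise the adjudicator either stays silent or falls back to the reply of the historically most dependable replica. The types, the byte-wise notion of reply equality, and the ranking array are our own assumptions, not the classes actually used in the paper.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical adjudicator illustrating error masking and the fail-silent /
// most-dependable alternatives; it is not the paper's ServiceProvider code.
class Adjudicator {

    // Error masking: return the reply produced by at least 'quorum' replicas
    // (e.g. 2 out of 3), or null if no such agreement exists, which the caller
    // interprets as fail-silent behaviour (no output at all).
    static byte[] maskOrStaySilent(List<byte[]> replies, int quorum) {
        for (byte[] candidate : replies) {
            int votes = 0;
            for (byte[] other : replies) {
                if (Arrays.equals(candidate, other)) votes++;
            }
            if (votes >= quorum) return candidate;
        }
        return null;   // no quorum: produce no output rather than a possibly wrong one
    }

    // "Most dependable" variant: on disagreement, hand out the reply of the replica
    // with the best past record instead of staying silent. 'ranking' lists replica
    // indices from most to least dependable.
    static byte[] maskOrMostDependable(List<byte[]> replies, int quorum, int[] ranking) {
        byte[] agreed = maskOrStaySilent(replies, quorum);
        return (agreed != null) ? agreed : replies.get(ranking[0]);
    }
}
```

With two replicas and quorum 2 this corresponds to a two-out-of-two check, and with three replicas and quorum 2 to the two-out-of-three behaviour described later in Section 4.1.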
4. Overall system architecture

The overall organization of the system is depicted in Fig. 2. The solution we propose is flexible, since it relies on a modular architecture, consisting of re-configurable software components. As shown in the figure, the clients, the Service Manager (SM) component, and the legacy servers are located in the first, second, and third tier of a three-tier architecture, respectively. This favors a development approach which exploits a clean separation of responsibilities. In particular, the back-end provides application-specific functions, thus satisfying the functional requirements. The middle-tier, namely the SM component, is in charge of leveraging the services provided by the legacy servers in order to satisfy the dependability requirements. To this aim, the SM wraps and replicates the legacy servers. The enhanced services are then exported by means of CORBA [8]. The first tier is represented by clients that access the SM services using CORBA facilities [9]. Clients are not aware of the fault tolerance mechanisms implemented by the SM component. All they know is which services are available and what their interface is. Legacy clients can still contact the legacy server modules in the usual way. However, they are now only allowed read operations.

4.1. System operation modes

Services can be grouped in two main categories, namely queries and updates. For all services, the client addresses its request to the SM and waits for a reply. The SM forwards the request to the back-end servers. It then collects the replies from the servers and builds the reply to be sent back to the client. The SM supports different behaviours, which correspond to distinct operation modes of the system. An operation mode results from the combination of a replication level (i.e., the number of back-end servers which are contacted) and an adjudication strategy (i.e., the technique which is used to build the reply to be sent out to the client). Three levels of replication
Fig. 2. Overall system architecture.
are possible, namely triple modular redundancy (3MR), dual modular redundancy (2MR), and single modular redundancy, i.e. simplex execution (1MR). These can be combined with three alternative adjudication strategies, namely fail silent, most dependable, and first available. This results in nine possible combinations, of which only seven make sense. The system thus has seven distinct operation modes. Detailed descriptions of such operation modes are provided later in this section. The availability of a whole variety of operation modes makes it possible for the system administrator to create the deployment scenario which best accommodates the specific dependability/performance requirements of the application. To this aim, the Management interface allows the administrator to modify the operation mode and other configuration parameters at any time (Fig. 3). The seven operation modes are:

(1) Single modular redundancy (1MR): This corresponds to simplex execution. While in this configuration, the adjudication strategy is irrelevant, since a single information source is available. The SM wraps the back-end server, but it does not (and cannot) implement any form of fault tolerance. Fault detection would be possible, by introducing in the SM some form of checks external to the wrapped application. Assertions are an example of a possible mechanism [27];
(2) Dual modular redundancy with first available (2MR-FA): In this case, the first reply from any of the two legacy servers is immediately sent back to the client. If the reply consists of multiple chunks of data, this strategy is applied to individual chunks. No data comparison is performed. The strategy trades off fault tolerance for performance. In fact, on the one hand it allows the middle-tier to get data from the fastest (at any given time) back-end source; on the other hand, in the event of a value fault, the error is propagated to the client;
(3) Dual modular redundancy with most dependable (2MR-MD): This configuration should be used in situations where an erroneous result is to be preferred to no result at all (as opposed to the fail-silent
alternative). In this case, all replies from the legacy servers are collected and compared. Upon detection of a value fault (the back-end servers provide two different replies), the SM uses a set of heuristics aiming at picking the reply which is more likely to be correct. More precisely, the reply provided by the server which has proven more dependable in the past is handed out to the client. If the reply consists of multiple chunks of data, this strategy can be either applied to individual chunks separately, or to the reply as a whole. It should be noted that, in general, treating individual chunks of the same reply as independent entities is possible at no dependability penalty only if the chunks are independent of each other;
(4) Dual modular redundancy with fail silent (2MR-FS): This configuration should be used in situations where no result at all is to be preferred to an erroneous result (as opposed to the most dependable alternative). A system is said to be fail silent if it satisfies the fail-silent assumption, i.e. it either produces a correct output, or it produces no output at all. If the fail-silent assumption is not satisfied, a fail-silent violation occurs. Due to the hierarchical organization of the system, fail-silent violations of the back-end servers may lead to fail-silent violations of the overall system. While in this mode, the system collects and compares both replies, but an output is produced only if both units agree on the results (two-out-of-two mechanism). By choosing the 2MR-FS operation mode, one can significantly reduce the rate of fail-silent violations of the overall system. In particular, in the absence of common mode failures of the back-end servers, this operation mode can enforce fail-silent behaviour of the overall system even in the presence of fail-silent violations of the back-end servers;
(5) Triple modular redundancy with first available (3MR-FA): Same as (2), but three back-end servers are used instead of two;
(6) Triple modular redundancy with most dependable (3MR-MD): Similar to (3), but with some differences. First, three back-end servers are used instead of two. Second, whenever possible, the system exhibits a two-out-of-three
behaviour, i.e. if at least two units agree on a value, that value becomes the output of the overall system. Instead, in the case of a tie (i.e., the back-end servers provide three different replies), the SM uses a set of heuristics aiming at picking the reply which is most likely to be correct. Again, the reply provided by the server which has proven more dependable in the past is handed out to the client. If the reply consists of multiple chunks of data, this strategy can be either applied to individual chunks separately, or to the reply as a whole;
(7) Triple modular redundancy with fail silent (3MR-FS): Same as (4), but three back-end servers are used instead of two. The system delivers an output only if at least two units agree on the results (two-out-of-three behaviour). The additional redundancy may also allow the system to treat the fault which has led to the error (if it is not a permanent one).

System degradation occurs in two cases:
(1) The monitoring facility of the SM detects a timing fault;
(2) The following two conditions hold: (i) the adjudication facility of the SM detects a value fault, and (ii) the fault diagnosis procedure comes to the conclusion that the error is due to a permanent fault.

Monitoring capabilities are provided to periodically send heartbeat messages to the legacy servers. As soon as failed servers become available again, they are re-integrated in the system.

4.2. Availability of the Service Manager

In any adjudication architecture, the adjudicator is a single point of failure for the overall system. This also applies to the SM component. As a consequence, measures should be taken in order to prevent the SM from crashing. To this aim, the CORBA replication service might be a viable option [10]. However, the focus of this paper is replication of the back-end resources, and not of middle-tier components. Nevertheless, we implemented a simple but effective strategy to increase
the availability of the SM, which works as follows. The CORBA naming service stores object references in a hierarchical structure [9]. The hierarchy is made up of naming contexts, which can contain object references as well as other naming contexts. We created a specific naming context, called SM, under which we bound multiple instances of the SM component, which were launched on different machines. Clients look up the SM object by invoking the resolve() method on this naming context and obtain a reference to an available instance. In the absence of failures of the naming service, this strategy actually improves the availability of the SM.
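A minimal sketch of the lookup just described is given below, assuming the standard org.omg.CosNaming API. The ‘‘SM’’ naming context comes from the text above; the instance names passed in, the liveness probe, and the error handling are our own illustration rather than the authors' actual client code.

```java
import org.omg.CORBA.ORB;
import org.omg.CosNaming.NameComponent;
import org.omg.CosNaming.NamingContext;
import org.omg.CosNaming.NamingContextHelper;

// Hypothetical client-side helper: return the first reachable SM instance
// bound under the "SM" naming context.
public class SMLocator {

    public static org.omg.CORBA.Object locate(ORB orb, String[] instanceNames) throws Exception {
        org.omg.CORBA.Object nsRef = orb.resolve_initial_references("NameService");
        NamingContext root = NamingContextHelper.narrow(nsRef);

        // Resolve the "SM" context under which the SM replicas were bound.
        NameComponent[] smPath = { new NameComponent("SM", "") };
        NamingContext smContext = NamingContextHelper.narrow(root.resolve(smPath));

        for (int i = 0; i < instanceNames.length; i++) {
            try {
                NameComponent[] path = { new NameComponent(instanceNames[i], "") };
                org.omg.CORBA.Object candidate = smContext.resolve(path);
                if (!candidate._non_existent()) {   // remote ping: is this replica alive?
                    return candidate;               // first reachable instance wins
                }
            } catch (Exception unreachableOrUnbound) {
                // this replica is down or not bound: try the next one
            }
        }
        throw new Exception("no Service Manager instance currently available");
    }
}
```

The returned reference would then be narrowed to the service's IDL-defined interface, presumably via the helper generated for the DBOperation interface shown in Fig. 3, before issuing requests.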
5. Implementation

In this section, we provide implementation details for the main system components. In particular, we point out the pipeline mechanisms we have set up, which allow the system to exhibit good performance even when stringent dependability requirements are imposed. It should be emphasized that this was not a trivial task. In fact, wrapping and replication are both costly in terms of performance. Since our approach uses replication of wrapped legacy modules to achieve dependability, the goal of acceptable performance was hard to achieve. Our system integrates in the third tier the server module of a legacy client/server application. In particular, the application used for the prototype, a client/server information system for the management of a clip repository, uses the services of an off-the-shelf Database Management System (DBMS), namely PostgreSQL [16], for stable storage facilities. As far as development technologies are concerned, we used Java (specifically Sun JDK 1.3) [17] as the programming language for implementing the SM component and the CORBA client, and Inprise Visibroker (version 4) [18] for providing ORB services.

5.1. Client

The client is a Graphical User Interface (GUI) based access point to the SM services. As already
mentioned, multiple instances of the Service Manager are launched and bound under a CORBA naming context called SM. The client looks up the SM object by invoking the resolve() method on this naming context and obtains a reference to an available instance. Then, it is ready to issue requests. It has a push interface, which allows it to receive data from the SM as soon as it becomes available.

5.2. Service Manager component

The system has been developed according to OOD methodologies. The service implementation exploits polymorphism, i.e. the ability to hide
different implementations behind a common interface. Fig. 3 illustrates a simplified class diagram of the SM in Unified Modeling Language (UML) syntax [13]. System functions are exported over the ORB by the IDL DBOperationOperations interface. In order to make our CORBA server implementation independent of the specific ORB used, we exploited the Tie mechanism to implement delegation in the generated CORBA classes. This relies on the DBOperationPOA_tie class, which can be directly instantiated to implement ORB-specific services. For all business methods, the Tie defers, or delegates, the calls directly to those
Fig. 3. Class diagram of the SM.
(in the interface) with the same signature. The ServiceProvider is the class in charge of implementing the business services. The Status class is in charge of holding information about the current operation mode of the system and the currently available servers. The RebuildDB class is in charge of managing system mode transitions. Transitions are performed according to indications provided by the Status object. The ServiceProvider is the central component. It can be configured according to the replication levels and adjudication strategies described in Section 4.1. The administrator configures the system via the Management interface. The Gateway class represents the client-side entry point to the legacy server. It acts as a TCP/IP proxy, since the legacy application came with a TCP/IP socket-based interface. In order to be able to communicate with the server without performing any changes to it, we had to implement in the Gateway object the communication pattern originally in place between the legacy client and the legacy server. Since a reply can be very large, it is delivered to the SM in chunks of data. For this reason we implemented a load method in the
Gateway class. Each call to this method retrieves a single chunk of data belonging to the current reply. These chunks are then forwarded to the client via its push interface. Three threads, namely the GatewayThreads, are used to parallelize the activity of the Gateway objects. In order to decouple the GatewayThreads from the Service Manager, two independent buffer objects, namely the Response and the Request, have also been implemented. These are illustrated in the class diagram of Fig. 3. The combined use of the GatewayThread, Request, and Response objects allowed us to implement a caching strategy in the middle-tier, which consisted in prefetching new data belonging to the current reply while already available data was delivered to the client. A pipeline was thus set up between the client and the back-end servers, as shown in Fig. 4. This resulted in a significant performance improvement, as discussed later in Section 6.

5.3. Operation mode transitions

Operation mode transitions are necessary when one of the following events occurs:
Fig. 4. The multithreaded caching scheme of the SM.
• The system administrator changes the system operation mode, to better cope with modified dependability requirements of the application (this is done via the Management interface);
• The system detects that one or more of the back-end servers are unreachable;
• The following two conditions hold: (i) the adjudication facility of the SM detects a value fault, and (ii) the fault diagnosis procedure comes to the conclusion that the error is due to a permanent fault.

As already mentioned in Section 5.2, the RebuildDB object is in charge of maintaining stable state consistency when state transitions occur, i.e. when the operation mode of the system changes.

5.4. Fault detection

To implement detection of timing faults, we developed a class called FaultDetector. The FaultDetector object is in charge of periodically checking whether the active servers are still available. To this aim, the FaultDetector object reads the current configuration of active servers and sends them a heartbeat message. If a server fails to acknowledge the heartbeat message, the FaultDetector object notifies the Status object, which in turn has the Gateway object close the connection to that server. It is then the responsibility of the Gateway object to monitor the failed server, in order to re-integrate it in the system after recovery. An interesting study about the trade-offs in terms of resources vs. accuracy when designing a fault detector is [25].
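The following sketch shows one way the periodic heartbeat check described above could be realized. It is a simplified reconstruction under our own assumptions: a plain TCP connection attempt with a timeout stands in for the application-level heartbeat message, and ServerStatus and ServerInfo are minimal stand-ins for the paper's Status class and its server records, whose actual interfaces are not given.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.List;

// Minimal stand-ins for the paper's Status class and its server records.
interface ServerInfo { String getHost(); int getPort(); }
interface ServerStatus {
    List<ServerInfo> getActiveServers();
    void reportUnreachable(ServerInfo server);   // triggers degradation via the Gateway
}

// Sketch of a periodic timing-fault detector: probe each active server and
// report the ones that do not answer within the timeout.
class FaultDetector extends Thread {
    private final ServerStatus status;
    private final long periodMillis;
    private final int timeoutMillis;

    FaultDetector(ServerStatus status, long periodMillis, int timeoutMillis) {
        this.status = status;
        this.periodMillis = periodMillis;
        this.timeoutMillis = timeoutMillis;
    }

    public void run() {
        while (!isInterrupted()) {
            for (ServerInfo server : status.getActiveServers()) {
                if (!heartbeat(server)) {
                    status.reportUnreachable(server);   // suspected timing fault
                }
            }
            try { Thread.sleep(periodMillis); } catch (InterruptedException e) { return; }
        }
    }

    private boolean heartbeat(ServerInfo server) {
        try (Socket probe = new Socket()) {
            probe.connect(new InetSocketAddress(server.getHost(), server.getPort()), timeoutMillis);
            return true;    // the server answered within the timeout
        } catch (IOException noAnswer) {
            return false;   // no answer within the timeout: classify as timing fault
        }
    }
}
```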
6. Experimental results

In this section, we discuss the results obtained from the dependability and performance experiments we have conducted. The primary objectives of the experimental campaign can be stated as follows:

(1) To test system fault tolerance features, i.e. to verify that the dependability objectives presented in Section 3 have been achieved;
(2) To evaluate the performance penalty due to (i) integration-related activities (wrapping and CORBA overhead) and (ii) dependability-related activities (replication, detection, and adjudication), and to assess the effectiveness of the performance improvement schemes which have been implemented.

The testbed used for the experimental campaign is illustrated in Fig. 5. The figure illustrates the system configuration and the allocation of tasks to nodes. We believe such a configuration is suited to test the application we have developed in a realistic deployment scenario. During the tests, the external load was the normal background load of active services and applications. We did not mandate an idle workload, because we felt the normal workload would be more representative of the workload that would be experienced in a cluster of non-dedicated nodes.
6.1. Dependability experiments

Our objective was to verify that the dependability objectives stated in Section 3 were achieved. In order to do so, we conducted fault-injection experiments on the system. Fault injection is a method for testing the fault tolerance of a mechanism with respect to its own specific inputs: the faults [20]. An experiment consisted of the execution of system functions while faults were injected into the back-end servers. We developed two different kinds of fault injectors, to induce timing and value faults in the system. In particular, the timing fault injector consisted of a Simple Network Management Protocol (SNMP) Java client, which shut down the switch ports to which the legacy servers were attached for an amount of time greater than the timeout set in the middle-tier server. The value fault injector, instead, was a malicious client which read data from individual replicas of the legacy servers, corrupted it, and wrote it back. Permanent value faults were simulated by having the SM inhibit write operations to the injected data. Transient value faults, instead, were obtained by allowing the SM to rewrite the data upon error detection.
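For illustration, a value-fault injector along the lines described above could be as simple as the sketch below: read a record from one back-end replica, flip a few randomly chosen bits, and write it back so that the replicas diverge. The ReplicaAccess abstraction and all names are hypothetical; the actual injector operated directly on the legacy servers' data.

```java
import java.util.Random;

// Hypothetical abstraction over read/write access to a single back-end replica.
interface ReplicaAccess {
    byte[] read(String key);
    void write(String key, byte[] value);
}

// Sketch of a value-fault injector: corrupt the stored value for 'key' on one
// replica so that the replicas no longer agree on that record.
class ValueFaultInjector {
    private final Random random = new Random();

    void inject(ReplicaAccess replica, String key, int bitFlips) {
        byte[] value = replica.read(key);
        for (int i = 0; i < bitFlips && value.length > 0; i++) {
            int position = random.nextInt(value.length);
            value[position] ^= (byte) (1 << random.nextInt(8));   // flip one bit
        }
        replica.write(key, value);   // the replicas now disagree: a value fault is latent
    }
}
```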
Fig. 5. Experimental testbed.
A variety of injection tests was performed. Results have shown that the middle-tier structure actually improved the dependability characteristics of the legacy system, since the new system performed correctly in all monitored situations.
6.2. Performance experiments

Our objective was to evaluate the performance penalty due to integration and to dependability related activities, and in particular:

• The performance penalty due to the wrapping;
• The performance penalty due to the detection activity;
• The performance penalty due to the replication and adjudication activities.

In order to do so, we measured the performance of the legacy system and compared it with that of our system in its various operation modes. An experiment consisted of the cyclic execution of a query operation for different values of the reply size. Performance was measured in terms of the response times at the client side. That is, the performance of the legacy client and of the new client was measured in terms of the response times
at the legacy client side (tr_legacy) and at the new client side (tr_new), respectively. Since our objective was to evaluate the performance of the system under normal operation conditions, no faults were injected during the experiments. The performance penalty was measured as the ratio of the response time at the legacy client side to the response time at the new client side (in a given operation mode), i.e. K = tr_legacy / tr_new. In order to evaluate the performance penalty due to the wrapping, we measured the ratio of the response time at the legacy client side to the response time at the new client side while the system was in single modular redundancy operation mode (1MR), i.e. K = tr_legacy / tr_1MR. We repeated the experiments on the three servers of the experimental test-bed. While the values of the response times of the individual legacy servers varied with the particular server used for the experiment (as is apparent from Fig. 6), the ratio between the response time of the legacy client and that of the new client turned out to be relatively independent of this factor. Values of this ratio are reported in Fig. 7. Experimental results show that the performance penalty decreases as the reply size increases. In particular, for smaller values of the reply size, the penalty can be as high as 40%. As values increase,
Fig. 6. Performance of the legacy system on the different nodes.
Fig. 7. Performance penalty (K = tr_legacy / tr_new) due to the wrapping.
the penalty decreases. The new system achieves the same performance as the old one for a reply size of about 2.5 KB. For larger values of the reply size, the new system exhibits even better performance than the old one. This is made possible by the pipeline organization deriving from the multithreaded caching scheme, described in Section 5.2, which allows information exchanges between the first tier and the middle-tier, and between the middle-tier and the back-end, to proceed in parallel. In order to evaluate the performance penalty due to the replication and adjudication activities, we compared the response times of the system in dual modular redundancy with fail silent (2MR-FS) and in triple modular redundancy with fail
silent (3MR-FS) operation modes to that of the system in 1MR. Fig. 8 reports the experimental results. These show that the performance penalty due to replication and adjudication activities is relatively independent of the reply size. More precisely, a performance penalty of about 14% and of about 21% was observed when the system was operating in 2MR-FS and in 3MR-FS, respectively. In order to evaluate the effectiveness of the caching and pipelining schemes, we compared the response time of the system in triple modular redundancy with first available (3MR-FA) to that of the system in 1MR. In this case, for smaller values of the reply size (i.e. 1–5 KB) redundant execution still exhibited some performance penalty,
Fig. 8. Performance penalty due to replication and adjudication.
although far smaller than the one observed for 2MR-FS and 3MR-FS. On the contrary, for larger values of the reply size (i.e. larger than 5 KB), the system operating in redundant mode exhibited better performance than simplex execution did. This behaviour can be interpreted by taking into account that two conflicting factors impact system performance. On the one hand, having multiple threads run concurrently gives the system the possibility to receive data from the fastest (at any time) available source. On the other hand, thread management has a cost. For smaller values of the reply size the latter effect prevails, whereas for larger values the former effect prevails.
7. Related work

A great deal of research has been conducted on providing support for dependability to existing distributed applications. These projects differ from one another in many aspects, including the nature of the fault tolerance mechanisms (hardware, software, or a combination of the two), and the level of transparency to the application level (application aware/unaware approach). Here, we limit our analysis to projects operating within the framework of CORBA infrastructures. The need for fault tolerance in CORBA has long been advocated by the scientific community. This has led to the specification of fault tolerant CORBA by the OMG [10]. A number of projects have been started, both in academia and in industry, which aim at developing fault tolerant CORBA platforms. In the following, we briefly review some of the major projects in this area, and how they might have been used to provide alternative solutions to our problem. We then discuss pros and cons of our solution, as compared to such alternatives. Several systems have been developed to provide fault tolerance for CORBA applications. These systems incorporate fault tolerance management in the ORB and/or in additional software layers. AQuA [11] and Eternal [12] are examples of such systems. AQuA provides replication groups. A replication group is composed of one or more identical objects. Communication between
replication groups is done using connection groups. The Eternal system adds fault tolerance to CORBA applications by replicating objects and incorporating a number of mechanisms to maintain replica consistency. It would have been theoretically possible to develop our system prototype on top of the infrastructures provided by some of those projects. Had we done so, we would not have had to design and develop the middle-tier component. We would still have had to wrap the legacy server, but then we could have used the ORB built-in replication facilities to replicate the wrappers, and thus achieve fault tolerance. However, this approach has a number of disadvantages. First, to the best of our knowledge, having the individual gateway replicas ‘‘talk’’ to distinct instances of the legacy servers running in the back-end would not have been straightforward. We would have had to find some sort of work-around to circumvent the strong replica consistency enforced by the system. Second, assuming we managed somehow to have the replicas talk to distinct servers, we would have had to rely on the built-in voting capabilities of the ORB for fault handling. Studies demonstrate that ORB-level voting is still unsatisfactory, for a number of reasons [23], and in particular:

• Every CORBA implementation available to date performs voting in a byte-by-byte fashion on marshalled network messages;
• Little progress has been made toward the ability to change voting algorithms at runtime, in response to changing conditions.

Third, we would have been stuck with a specific ORB environment. In fact, all fault tolerant CORBA projects we are aware of mandate the adoption of a specific ORB implementation and/or of specific supporting components. Our approach is different, in that we provide fault tolerance via the business logic of the middle-tier. An interesting proposal which also relies on a multi-tier scheme is presented in [28]. In [28], the authors present a set of protocols, mechanisms
and services which allow a CORBA system to handle object replication. The paper discusses implementation details of the replication logic, and points out specific properties of the approach, in particular its non-intrusiveness and interoperability. Since we provide fault tolerance via the business logic of the middle-tier, we have to explicitly program the replication strategies and the adjudication strategies in the SM component. While on the one hand this approach is more costly in terms of programming effort, on the other hand it has several advantages. First, since we rely on server replication (as opposed to object replication), having the wrappers talk to distinct servers is completely hassle free. Second, we are able to adjudicate on unmarshalled application-level data (as opposed to byte-by-byte voting). Third, we can change the replication level and/or the adjudication strategies at runtime, in response to changing conditions, in order to provide alternative operating points with different trade-offs with respect to dependability and performance. Fourth, we do not mandate the adoption of any specific ORB. In conclusion, we believe that, at the time of this writing, there is still a great need for configurable architectural frameworks which improve the dependability level of existing applications, since the built-in fault tolerance features of existing ORB implementations are not flexible enough to accommodate the wide variety of needs such applications have.
8. Conclusions

This work presented a dependability oriented system development approach, based on the reuse of legacy applications. The suggested approach is suited to virtually any client-server application which is based on a request–reply interaction paradigm, and could be easily extended to applications which do not rely on such an organization. An architectural framework, implementing the suggested strategy, has been developed and applied to a case-study application. Such a
framework can be deployed on top of any CORBA infrastructure, i.e. it does not mandate the adoption of a specific ORB implementation. A system prototype was developed and tested over a distributed heterogeneous platform. The prototype incorporated the server module of a legacy client/server information system and replicated it according to different levels of replication. Dependability was achieved by means of the business logic of the middle-tier component. The service was delivered to the clients at a higher level of dependability, in a transparent way, since it relied on a polymorphic implementation of a unified interface. From a methodological point of view, this experience has shown the following:

(1) The suggested strategy provides an effective means for achieving dependability at a low cost, since it favors a development approach based on a clean separation of duties. In fact, the strategy makes it possible to: (i) guarantee functional properties, by incorporating in the back-end existing software modules, and (ii) add dependability to the resulting system via proper programming of the business logic of the middle tier(s). This has several advantages in terms of development effort, since:
• The software implementing pre-existing functions is reused. Although this may involve wrapping the legacy modules which are to be integrated in the new system, we believe the final balance is positive under many circumstances. This is especially true if one adopts OOD methodologies and an adequate middleware technology;
• The complexity of the overall system is reduced, as compared to typical software fault tolerance solutions. In fact, there are studies which demonstrate that such solutions typically lead to a dramatic increase in the complexity of the software design, implementation, and testing phases, since (i) functional and non-functional requirements are dealt with together throughout the entire system development process, and (ii) code implementing non-functional requirements is often redundant and spread all over the application.
(2) The resulting architecture enables the system developer to incorporate fault tolerance strategies specific to the application, and the system deployer to customize those strategies according to specific environment and platform characteristics. This ultimately provides a flexible mechanism to implement different levels of dependability, which can adapt to changing application requirements and environment conditions.

From an experimental point of view, the results demonstrated the following:

(1) The dependability gain includes various features, such as error detection, fault diagnosis, different forms of error processing, fault treatment, graceful degradation, and re-integration;
(2) The architecture is able to achieve dependability at an acceptable performance penalty, even if stringent dependability requirements are enforced. It should be emphasized that this was a particularly challenging issue to address, since wrapping and replication are themselves costly in terms of performance. A multi-threaded caching scheme was implemented in order to limit the performance penalty experienced by the system. Such a scheme allowed information exchanges between the first tier and the middle-tier, and between the middle-tier and the back-end, to proceed in parallel. Measurements indicated that:
• The performance penalty due to the wrapping decreases as the reply size increases. In particular, for smaller values of the reply size, the penalty can be as high as 40%. As values increase, the penalty decreases. The new system achieves the same performance as the old one for a reply size of about 2.5 KB. For larger values of the reply size, the new system exhibits even better performance than the old one does;
• The performance penalty due to detection, replication, and adjudication activities is relatively independent of the reply size. More precisely, a performance penalty of about 14% and of about 21% was observed when the
system was operating in 2MR with fail silent and in 3MR with fail silent, respectively;
• The system exhibits quite a different behaviour when operating in the 3MR with first available operation mode. In this case, for smaller values of the reply size (i.e. 1–5 KB) redundant execution still incurs some performance penalty, although far smaller than the one observed for 2MR with fail silent and 3MR with fail silent. On the contrary, for larger values of the reply size (i.e. larger than 5 KB), the system operating in redundant mode exhibits even better performance than simplex execution.
Acknowledgements

This work has been partially carried out under the financial support of the Italian Ministry of University and Scientific and Technological Research (MURST) within the project ‘‘MUSIQUE: Infrastructure for QoS in Web Multimedia Services with Heterogeneous Access’’. The authors are grateful to Angelo Testa and Carlo Mascolo for the excellent job they have done in writing part of the code. The authors also wish to acknowledge the contribution of Andrea Bondavalli and Silvano Chiaradonna for their suggestions and comments.
References

[1] H.M. Sneed, Encapsulating legacy software for use in client/server systems, in: Proceedings of the Working Conference on Reverse Engineering, 1996, pp. 104–119.
[2] D.P. Siewiorek, R.S. Swarz, Reliable Computer Systems: Design and Evaluation, second ed., Digital Press, 1992.
[3] K.K. Goswami, R.K. Iyer, Simulation of software behavior under hardware faults, in: Proceedings of the 23rd Annual International Symposium on Fault-Tolerant Computing, 1993.
[4] R. Han, D. Messerschmitt, A progressively reliable transport protocol for interactive wireless multimedia, Multimedia Systems 7 (1999) 141–156.
[5] J.-C. Laprie, Dependable computing and fault tolerance: concepts and terminology, in: Proceedings of the 15th International Symposium on Fault-Tolerant Computing Systems, IEEE Computer Society, 1985.
[6] R. Orfali, D. Harkey, J. Edwards, Client/Server Survival Guide, third ed., 1999.
[7] D. Powell, G. Bonn, D. Seaton, P. Verissimo, F. Waeselynck, The Delta-4 approach to dependability in open distributed computing systems, in: Proceedings of the 18th International Symposium on Fault-Tolerant Computing Systems, IEEE Computer Society, 1988.
[8] Object Management Group, The common object request broker: architecture and specification, second ed., OMG, 1995.
[9] Object Management Group, CORBA services: common services specification, 95-3-31 ed., OMG, 1995.
[10] Object Management Group, Fault Tolerant CORBA Specification, V1.0, April 2000.
[11] M. Cukier, J. Ren, C. Sabnis, D. Henke, J. Pistole, W. Sanders, D. Bakken, M. Berman, D. Karr, R. Shantz, AQuA: an adaptive architecture that provides dependable distributed objects, in: Proceedings of the 17th Symposium on Reliable Distributed Systems (SRDS-17), IEEE-CS, 1998.
[12] P. Narasimhan, L.E. Moser, P.M. Melliar-Smith, State synchronization and recovery for strongly consistent replicated CORBA objects, in: Proceedings of the 2001 International Conference on Dependable Systems and Networks, IEEE-CS, 2001, pp. 261–270.
[13] I. Jacobson, G. Booch, J. Rumbaugh, Unified Software Development Process, Addison-Wesley Object Technology Series, January 1999.
[14] R.K. Iyer, D. Tang, Experimental analysis of computer system fault tolerance, in: D.K. Pradhan (Ed.), Fault-Tolerant Computer System Design, Prentice Hall Inc, 1996 (Chapter 5).
[15] A. Bondavalli, S. Chiaradonna, F. Di Giandomenico, F. Grandoni, Threshold-based mechanisms to discriminate transient from intermittent faults, IEEE Trans. Comput. 49 (2000) 230–245.
[16] Available from .
[17] Available from .
[18] Available from .
[19] D. Powell, Failure mode assumptions and assumption coverage, in: Proceedings of the 22nd International Symposium on Fault-Tolerant Computing, IEEE-CS, Los Alamitos, CA, USA, 1992.
[20] B. Randell, J.C. Laprie, H. Kopetz, B. Littlewood, in: Predictably Dependable Computer Systems, Springer-Verlag, Berlin, 1995, pp. 307–308 (Chapter 5).
[21] F. Di Giandomenico, L. Strigini, Adjudicators for diverse-redundant components, in: Proceedings of the 9th Symposium on Reliable Distributed Systems, Huntsville, Alabama, 1990, pp. 114–123.
[22] B. Randell, A. Avizienis, J.C. Laprie, Fundamental concepts of dependability, Technical Report 01145, LAAS, April 2001.
[23] D.E. Bakken, Z. Zhan, C.C. Jones, D.A. Karr, Middleware support for voting and data fusion, in: Proceedings of the 2001 International Conference on Dependable Systems and Networks, IEEE-CS, 2001, pp. 453–462.
[24] F. Cristian, C. Fetzer, The timed asynchronous distributed system model, in: Proceedings of the 28th International Symposium on Fault-Tolerant Computing, IEEE-CS, 1998.
[25] W. Chen, S. Toueg, On the quality of service of failure detectors, in: Proceedings of the 2000 International Conference on Dependable Systems and Networks, IEEE-CS, 2000.
[26] A. Avizienis, The N-version approach to fault tolerant software, IEEE Trans. Softw. Eng. SE-11 (12) (1985) 1491–1501.
[27] E.J. McCluskey, Built-in self-test techniques, IEEE Des. Test (1985) 21–28.
[28] R. Baldoni, C. Marchetti, M. Mecella, A. Virgillito, An interoperable replication logic for CORBA systems, in: Proceedings of the 2nd International Symposium on Distributed Object Applications, 2000.

Domenico Cotroneo is currently a research associate and a teaching assistant at the University of Naples, Italy. His research focus is on distributed computing, object-oriented programming, and some of the fundamental aspects of computer system dependability, namely availability, reliability, and performability. He received his MS degree and Ph.D. in Computer Science Engineering from the University of Naples.

Nicola Mazzocca graduated in Electronic Engineering at the University of Naples ‘‘Federico II’’ in 1987. He spent one year at Ansaldo Trasporti, where he worked on the design of process control systems. In 1988 he joined the Computer Science Department of the University of Naples, where he received his Ph.D. degree in Computer Science. He is currently a full Professor at the Department of Information Engineering of the Second University of Naples. His research interests include dependable and parallel computing, performance evaluation, and embedded systems.

Luigi Romano is currently an Assistant Professor at the University of Naples. His research interests include some of the fundamental aspects of computer system dependability, namely availability, reliability, performability, and security. He has been at the Center for Reliable and High-Performance Computing of the University of Illinois at Urbana-Champaign for 18 months, doing research with Prof. R.K. Iyer. He has also worked as a consultant for Ansaldo Trasporti and Ansaldo Segnalamento Ferroviario in the field of safety critical computer systems design and evaluation. He received his MS degree in Electronic Engineering and Ph.D. degree in Computer Science from the University of Naples.
Stefano Russo is currently a Full Professor at the University of Naples ‘‘Federico II’’. He graduated in Electronic Engineering and received his Ph.D. degree in Computer Science at the same university, in 1988 and 1993, respectively. His research interests include software engineering for dependable and parallel computing, object oriented and middleware-based methodologies and technologies, and distance learning environments.