2010 Seventh IEEE International Conference and Workshops on Engineering of Autonomic and Autonomous Systems

Formalizing A Methodology for Design- and Runtime Self-healing

Georg Jung, Tiziana Margaria, Christian Wagner, Marco Bakera
Chair Service and Software Engineering, Universität Potsdam, Potsdam, Germany
{margaria,wagner}@cs.uni-potsdam.de

Abstract—In this paper we report our experience with extracting and formalizing the methodology for the development of self-healing capabilities that arose in the context of the recently concluded SHADOWS project. Defining a methodology for system augmentation, technology description, and technology introduction is in fact a central output of the project. This methodology effort additionally aimed at enhancing the coordination between project partners, the interaction between the various technologies, and the applicability of self-healing to software projects.

Keywords: Software Methodologies, Self-healing, Autonomic Computing, Methodology Description Languages

This work was carried out partly within the SHADOWS Project funded by the EU under contract No. 035157.

978-0-7695-4004-7/10 $26.00 © 2010 IEEE. DOI 10.1109/EASe.2010.21

I. INTRODUCTION

Methodologies [2], [7] address the problem of organizing the development of information and computation systems in a software process, which itself is subject to evolutionary refinement. Research on software methodology reaches back to the 1970s (e. g., [6]) and has gained momentum with the increasing complexity of information and computation systems and of their development. Several related approaches also aim at handling the integrity of software systems in large-scale, long-lived projects, such as Model Driven Design [12], [25], or at the efficiency of software development, such as Software Product Lines [9], [38]. In contrast, the methodology evolution strategy is rather non-intrusive and relatively easy to adopt and blend into existing software processes [30], and it has been endorsed in software development management and consulting [14].

SHADOWS (Self-healing Approach to Designing Complex Software Systems) [34], [35], [36] aimed at developing technologies that augment large software systems with a sort of immune response against various issues and contingencies that can occur at design time or runtime. It targeted general issues in the areas of performance, concurrency, and functional problems. Without self-healing protection, these would result in a costly partial or complete breakdown of the system.

By design, the methodology presented here is widely applicable, reaching well beyond the scope of the case studies and application domains investigated in the project. In fact, the aim was to provide general techniques that augment a given or planned system independently of the system’s functionality. A methodological description for the very heterogeneous capabilities offered by the integrated SHADOWS platform
is highly beneficial to its adoption. In fact, outfitting a generic program with SHADOWS self-healing capabilities is a non-trivial multi-step process. The highly heterogeneous techniques from the various contributors in SHADOWS may each require design- or coding-time preparations, a specific execution environment, or a monitoring infrastructure. These preparations and the integration of the execution environment and this infrastructure are complex tasks. Accordingly, the methodology aspects are an integral part of the SHADOWS research, and the project conversely provides a large-scale case study with multiple stakeholders, ideal for the investigation of methodology issues.

In this paper we describe the SHADOWS methodology, which is presented more extensively in detailed project deliverables [1]. We have also
• extended preexisting notations developed for capturing methodologies, in order to adapt them and render them suitable for the specific requirements of injectable self-healing capabilities,
• carried out a complex case study applying those notations and our meta-methodology. The entire span of SHADOWS techniques has been successfully captured and expressed, and communicated to and shared with our industrial partners,
• achieved a deep uniformation effect, which comes close to a standardization, between the different technologies and their contributed methodologies. This was surprising, since we started with technologies, case studies, and engineering and design cultures that were very diverse. Providing an adequate notation for visualizing and communicating at the methodology level spawned a succession of discussion and uniformation rounds that ended up with the uniform and streamlined formulations and diagrams described here.

In the remainder of this paper, Sec. II describes the project, its conceptual architecture, and its thematic structure, Sec. III introduces our approach to method engineering and the notations developed, Sec. IV describes how we established the methodology in the project and captured the individual processes and presents a case study example, Sec. V gives an overview of related work, and Sec. VI offers our conclusions.

II. THE SHADOWS PROJECT

SHADOWS was a three-year EU IST STREP project started in June 2006. The consortium included nine partner institutions from industry, industrial research, and academia, from seven countries.


Fig. 1. The architecture of the SHADOWS self-healing framework (according to [36]).

Large-scale computing and information systems are a vital and integral part of today’s society. Many modern IT systems are so critical for essential infrastructure, and sometimes for human lives, that no downtime is acceptable. A recent event that illustrates the cost of a computing system’s failure is the crash of the central server of the German railway company Deutsche Bahn AG. The company serves more than five million people daily with more than 1,300 long-distance and 22,000 regional trains and multiple local bus services. The crash on January 14, 2009, blanked out inter-train and inter- and intra-station communication as well as the screens in over 400 ticket offices and 7,000 ticket vending machines, and the online ticketing system, resulting in hours of delay for thousands of travellers and an undisclosed financial loss.

The robustness of software systems can be increased by adopting development methods for high assurance software. However, given the complexity of modern computation systems, contingencies remain a risk even in high assurance systems. SHADOWS proposes the automatic or semi-automatic detection and repair of possibly problematic behavior in the early design and development stages, as soon as it can be spotted, and consistently throughout the system’s life-cycle. Upon detection of a possible issue, SHADOWS techniques react immediately, applying appropriate countermeasures aimed at handling, containing, or solving the problem before it becomes fatal.

SHADOWS aimed at developing a well-integrated set of technologies, embodied in tools that implement a well-defined methodology for developing systems capable of self-healing. Our current solution provides support for parts of the design, testing, and deployment stages. Specifically, we are addressing the following dimensions:

• Self-healing of software, both at design time and at testing and deployment time.
• Self-healing at the component level, as well as at the system level.
• Self-healing for multiple types of errors (performance, function, concurrency).

The self-healing technologies we developed are based on novel techniques for detection, prediction, and classification of faults, and for performing corresponding corrective actions. Detection and prediction are performed against models of desirable behavior. Fig. 1 illustrates a conceptual architecture of our self-healing framework. The framework comprises:
• a model repository with models for safe concurrency, functional requirements, and performance behaviors,
• a monitoring layer A that detects and predicts the problems manifested by deviations from the models, and
• an analysis layer B that classifies and identifies the problems.
Detection and analysis of a problem by layer B leads to the invocation of a corresponding corrective action (either by layer C at design time, or by layer D at deployment time). Such corrective actions may then feed into the managed system or influence its working environment.
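The layering described above can be illustrated with a small Python sketch. It is not taken from the SHADOWS implementation: the class and function names, the reduction of a model to a simple predicate, and the example thresholds are all assumptions made for illustration. The sketch only shows the flow from a model repository through monitoring (layer A) and analysis (layer B) to a corrective action selected by phase (layers C and D).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Model:
    """A model of desirable behavior, reduced here to a predicate (assumption)."""
    name: str
    category: str  # "performance", "concurrency", or "functional"
    holds: Callable[[dict], bool]


@dataclass
class Problem:
    model: str
    category: str


def monitor(models: List[Model], observation: dict) -> List[Model]:
    """Layer A: report the models the observation deviates from."""
    return [m for m in models if not m.holds(observation)]


def analyze(violated: List[Model]) -> List[Problem]:
    """Layer B: classify each deviation by the violated model's category."""
    return [Problem(m.name, m.category) for m in violated]


Action = Callable[[Problem], str]


def correct(problem: Problem, phase: str,
            actions: Dict[Tuple[str, str], Action]) -> Optional[str]:
    """Layers C/D: run the action registered for (phase, category), if any."""
    action = actions.get((phase, problem.category))
    return action(problem) if action else None


# Hypothetical model repository and corrective actions.
models = [
    Model("heap below 512 MB", "performance", lambda o: o["heap_mb"] <= 512),
    Model("lock order respected", "concurrency", lambda o: o["lock_order_ok"]),
]
actions = {
    ("deployment", "performance"): lambda p: f"release memory ({p.model})",
}

observation = {"heap_mb": 700, "lock_order_ok": True}
problems = analyze(monitor(models, observation))
outcomes = [correct(p, "deployment", actions) for p in problems]
print(outcomes)
```

Keying the corrective actions by phase and problem category mirrors the split between design-time (layer C) and deployment-time (layer D) correction; a real framework would of course work on much richer models than predicates.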

A. Conceptual healing architecture

Adopting self-healing is a strategy-level decision. Self-healing, then, encompasses a number of techniques targeted either at design time, when a program structure is scrutinized before implementation (i. e., based on model validation), or at deployment time and runtime (i. e., when a program is “out in the field”). A self-healing system in the sense of SHADOWS realizes two essential elements:
• Its structure is based on formal models, which abstractly represent the system’s architecture and behavior, and it is scrutinized during abstract execution with design-time self-healing technologies.
• During and after implementation it is integrated with special components which detect and handle problems at the system’s runtime, based on the system-specific models and on general (i. e., system-independent) considerations.

Fig. 2. Healing architecture of a SHADOWS-augmented system (Program, Monitoring Infrastructure / Execution Environment, Oracles, Healers, Actuators, Communication Bus).

Fig. 2 illustrates the conceptual runtime architecture of a system (here called program) augmented with SHADOWS self-healing features. The execution of the system itself is connected to an appropriate monitoring infrastructure. One or more oracles process the monitoring information to detect possible problems. The healers analyze the data from the oracles and decide whether and how to act. Finally, the healers call actuators to effect the countermeasures to potential problems.

B. Addressed problem categories

Our self-healing technologies target performance, concurrency, and functional aspects.

1) Performance healing: Issues that may slow down a system’s execution and problems that come from overtaxing a system with requests are addressed by the SHADOWS performance self-healing technologies. These fall into the following three main categories:

Object Dump Healer. This healer monitors memory utilization and may decide, according to its algorithm, to store living objects in persistent storage and to release the memory allocated to them. The healer uses a configurable policy to decide which components should be saved first (any standard caching policy can be used).

Loitering Objects Healer. This healer is an extension on top of the Object Dump Healer, and it is used to deal with Java memory leaks, specifically with loitering objects [13]. The healer uses various heuristics to detect objects that could potentially be hanging around unused.

Resource Pooling Healer. This healer dynamically creates resource pools to allow resource reuse in cases where resource re-instantiation introduces undesirable delays in response time.

All three are based on IBM’s PANACEA framework and ConTest technology, which provide instrumentation, monitoring, and a runtime environment [24].
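To make the idea behind the Resource Pooling Healer more tangible, the following Python sketch switches from direct instantiation to a pool once an instantiation is observed to exceed a latency threshold. It is an assumption-laden illustration and not the PANACEA or ConTest implementation: the names `ResourcePool` and `PoolingHealer`, the threshold parameter, and the injectable clock are all hypothetical.

```python
import time
from collections import deque
from typing import Callable, Deque, Generic, Optional, TypeVar

T = TypeVar("T")


class ResourcePool(Generic[T]):
    """Keeps released resources for reuse instead of re-instantiating them."""

    def __init__(self, factory: Callable[[], T], max_idle: int = 4) -> None:
        self._factory = factory
        self._idle: Deque[T] = deque()
        self._max_idle = max_idle

    def acquire(self) -> T:
        # Reuse an idle resource if one exists, otherwise create a new one.
        return self._idle.popleft() if self._idle else self._factory()

    def release(self, resource: T) -> None:
        # Resources beyond the idle limit are simply dropped.
        if len(self._idle) < self._max_idle:
            self._idle.append(resource)


class PoolingHealer(Generic[T]):
    """Creates a pool dynamically once instantiation is observed to be slow."""

    def __init__(self, factory: Callable[[], T], threshold_s: float,
                 clock: Callable[[], float] = time.perf_counter) -> None:
        self._factory = factory
        self._threshold_s = threshold_s
        self._clock = clock
        self.pool: Optional[ResourcePool[T]] = None

    def obtain(self) -> T:
        if self.pool is not None:
            return self.pool.acquire()
        start = self._clock()
        resource = self._factory()
        if self._clock() - start > self._threshold_s:
            self.pool = ResourcePool(self._factory)  # heal: pool from now on
        return resource

    def give_back(self, resource: T) -> None:
        if self.pool is not None:
            self.pool.release(resource)


# Fake clock (assumption): the first instantiation appears to take 0.5 s.
ticks = iter([0.0, 0.5])
healer = PoolingHealer(factory=object, threshold_s=0.1, clock=lambda: next(ticks))
first = healer.obtain()    # slow instantiation observed, pool is created
healer.give_back(first)
second = healer.obtain()   # served from the pool
print(second is first)
```

The clock is injected only so that the example behaves deterministically; in a monitored system the latency information would come from the monitoring infrastructure rather than from the healer itself.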

2) Concurrency healing: These technologies address concurrency-related issues such as shared variable access and race conditions. Depending on the stage of the development life-cycle (e. g., testing or deployment phase, application runtime), SHADOWS offers a collection of approaches with different levels of over- or under-approximation when detecting race conditions [17]. Some of these approaches require static analysis of the systems and healing assurance to avoid introducing deadlocks [16].

3) Functional healing: These techniques detect and identify functional deficiencies and subsequently perform actions that reduce their impact or permanently solve the problem. There is a host of relevant, recurring problems which can be categorized according to their prevalent nature and which justify considering some kind of “immune response” to handle them in a systematic way. In SHADOWS, the following set of functional technologies has been developed and can be reused in self-healing systems.

Healing connectors for COTS. Modern system development relies heavily on commercial off-the-shelf (COTS) components for cost reduction. However, the use of COTS components comes with the risk of usage faults and integration errors due to, e. g., a lack of proper documentation. COTS components may also contain internal errors or deficiencies. Their widespread usage helps to identify common misunderstandings and internal issues through experience, and known issues can be found, for example, in forums and fault repositories. Nevertheless, this accumulation of knowledge per se does not prevent errors from recurring. SHADOWS proposes to introduce healing connectors that deal with known integration issues and internal COTS software errors [8].

Behavior monitoring for fault diagnosis. If models of the expected behavior of a system or system component exist (e. g., interface protocols, parameter domain boundaries, invocation sequences), then deviating behavior can be detected through analysis (at design time) or through observation (at run time). SHADOWS proposes either to perform an in-depth drill-down analysis of the design, for example with GEAR’s advanced game-based model checking techniques [5], [4], or to mine appropriate specifications through passive learning techniques [20].

Software history tracing to heal regression faults. Often, behavioral issues arise through refactoring of components (i. e., augmentations, bug-fixes, etc.). SHADOWS keeps track of the versions, analyzes the changes, and identifies the affected components. In case of a failure, the faulty components are detected and, under certain conditions, replaced by previous versions.

Transactional services to protect from recurring failures. The From Failure To Vaccine (FFTV) technology [19] analyzes the execution context during the occurrence of failures. Whenever an execution context similar to one with previously observed failures is detected, SHADOWS healers take appropriate actions.

III. DESCRIBING THE METHODOLOGY

A software design methodology (SDM) [2], [7] is a composition of design methods along with rules or guidelines for
applying them to arrive at design decisions. An SDM can consist of multiple elements such as concepts, criteria, notations, artifacts, etc. [37]. Together, they should complement each other to form a process that guides the creation of software artifacts that satisfy the given requirements.

For SHADOWS, the original thrust of methodology research (i. e., the evolutionary improvement of existing software development processes) is only of limited applicability, since SHADOWS does not aim at a development process per se, but rather at the collection of artifacts resulting from this development. While self-healing capabilities should ideally be based on a thorough, model-based or even model-driven process (e. g., to ensure a certain level of accuracy of the abstractions which determine the healing process), SHADOWS targets design or implementation artifacts regardless of the specifics of their creation.

Some of the notations from method or methodology engineering, particularly the process-deliverable diagrams (PDDs) [39], [40], are, with some adaptation, well suited for describing
• the SHADOWS system augmentation, where software in the process of implementation or already existing software (even legacy software, where the development process is concluded) is outfitted with various SHADOWS self-healing capabilities, and
• the process of SHADOWS healing, where the system recovers from occurring issues.

Fig. 3. A template PDD (based on [40]).

A. The notion of the PDD

Fig. 3 displays a template PDD to explain the notation concepts. It is a heterogeneous representation:
• The left-hand side of a PDD describes a (software development) process in a notation similar to UML activity diagrams.
• The right-hand side describes the deliverables of the activities in a form close to a class diagram.
PDDs depart from the UML standard in some respects, specific to their own methodology description purposes. The PDD distinguishes simple activities (displayed as rectangles
with rounded sides, e. g., sub-activity 1, sub-activity 2, etc., in fig. 3) and complex activities (displayed as shadowed rectangles with rounded sides, e. g., sub-activity 5). Intuitively, simple activities or concepts form atomic components that cannot be subdivided into smaller parts. Complex activities, in contrast, can reasonably be split into sub-activities. An activity is open if this kind of refinement is part of the model (i. e., the sub-activities or sub-concepts are displayed too) and closed otherwise. Open and closed activities are displayed with an outlined and a solid shadow, respectively.

The right-hand side of the template PDD in Fig. 3 links the activities to concepts, which form the deliverables of the activities. Like activities, concepts can be simple or complex, and if complex they can be open or closed. Dashed arrows connect the activities with the concepts they produce. Altogether, the following three combinations occur:
1) simple activities/concepts (e. g., sub-activity 1),
2) open complex activities/concepts (e. g., sub-activity 5),
3) closed complex activities/concepts (e. g., the closed concept at the top of fig. 3).

Concepts are captured in UML class-diagram notation and related through UML class relations such as aggregation, composition, or inheritance when appropriate. For instance, the dashed arrow from sub-activity 1 to the closed concept denotes that sub-activity 1 delivers several instances of classes represented by the closed concept. Since the concept is closed, some of these deliverables are not part of the model. Other kinds of arrows found in the diagram, like those for composition, aggregation, inheritance, etc., conform to standard UML notation and meaning. A detailed overview of generic PDD syntax is available in [39], [40].

The resulting PDD is a compact and intuitive diagrammatic overview of activities and their effects on the deliverables. It can be modified or incrementally refined and augmented with side- or sub-activities to model improvements to the overall process.

B. Extending PDDs to XPDDs

For the SHADOWS methodology, however, this expressiveness is insufficient: we need to capture the information needed for adding the self-healing capabilities (augmentation) and the self-healing infrastructure necessary to perform problem responses (healing). To adapt the PDD notation to these two contexts, we need to enrich the description means in order to express SHADOWS influences on the process. For this purpose, we allow using class-diagram notation on the left-hand side of the activity description too. We call these eXtended PDDs XPDDs. An XPDD then consists of a triple (SHADOWS, Activities, Impact), with

1. Activities in the middle: this is the process description itself in PDD-UML activity-diagram notation. It describes the order, flow, and hierarchy of the tasks that are performed to accomplish the development/augmentation/healing effort.
• For augmentation, these are the activities necessary to analyze and outfit a program with SHADOWS capabilities.
• For healing, these are the self-healing activities performed during runtime.

2. Impact on the right: these are the process deliverables in PDD-UML class-diagram notation. This part captures the objects/objectives of the tasks displayed in the Activities. The classes denoting these objectives are linked to the tasks which produce them or operate on them by means of dashed arrows pointing from activity to objective. Further, the objectives can be interrelated in many ways. For example, if an activity produces a feature belonging to an already existing objective, the objective denoting the feature is connected to the latter through an aggregation, and if an activity produces a refined version of an existing objective, the objective and its refinement are connected through an inheritance relation.
• For augmentation, the objectives are the additions to the program and the analysis results which enable healing.
• For healing, the effects of the healing activities on the running processes fall into this part.

3. SHADOWS on the left: these are the SHADOWS-specific input data and functionality in PDD-UML class notation. This is an extension to the standard PDD notation, capturing the necessary inputs of SHADOWS-related activities. These inputs, including abstractions, models, and functional artifacts such as healer methods, are linked to the activities that rely on them, also via dashed arrows, which point from input to activity.

Naturally, the three-part XPDD topology is not absolute, since objectives produced by earlier activities can well serve as input to later activities. What matters is rather the essential idea of XPDDs (i. e., linking activities to deliverables, and now in addition linking necessary inputs to activities), which allows capturing the development, augmentation, and healing process in a flexible and intuitive way that in particular enables methodology increments and improvements. In fact, we experienced an initially high volatility of the diagrams, and the ease of adoption and adaptation proved to be a central asset here.
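The triple structure described above lends itself to a simple data-structure view. The following Python sketch is only an illustration under stated assumptions: the class `XPDD`, its field names, and the `dangling` check are hypothetical and not part of the PDD or XPDD notation. The example instance uses names from the static analysis process that the paper presents as Fig. 4; which input feeds which activity follows the description in the text.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class XPDD:
    """Hypothetical in-memory form of the (SHADOWS, Activities, Impact) triple."""
    inputs: Set[str] = field(default_factory=set)        # SHADOWS side (left)
    activities: Set[str] = field(default_factory=set)    # process (middle)
    deliverables: Set[str] = field(default_factory=set)  # Impact side (right)
    requires: Set[Tuple[str, str]] = field(default_factory=set)  # input -> activity
    produces: Set[Tuple[str, str]] = field(default_factory=set)  # activity -> deliverable

    def dangling(self) -> List[str]:
        """List dashed arrows whose endpoints are not declared in the diagram."""
        problems = []
        # Deliverables of earlier activities may serve as inputs to later ones.
        sources = self.inputs | self.deliverables
        for source, activity in sorted(self.requires):
            if source not in sources:
                problems.append(f"unknown input: {source}")
            if activity not in self.activities:
                problems.append(f"unknown activity: {activity}")
        for activity, deliverable in sorted(self.produces):
            if activity not in self.activities:
                problems.append(f"unknown activity: {activity}")
            if deliverable not in self.deliverables:
                problems.append(f"unknown deliverable: {deliverable}")
        return problems


static_analysis = XPDD(
    inputs={"Soot call-graph analysis", "Soot points-to analysis", "GEAR Modelchecker"},
    activities={"find concurrent related code blocks", "convert code to call graph",
                "compare lock objects", "create lock graph", "analyze lock graph"},
    deliverables={"synchronized blocks graph", "lock graph over-approximation",
                  "pre-analyzed lock graph"},
    requires={("Soot call-graph analysis", "convert code to call graph"),
              ("Soot points-to analysis", "compare lock objects"),
              ("GEAR Modelchecker", "analyze lock graph")},
    produces={("find concurrent related code blocks", "synchronized blocks graph"),
              ("create lock graph", "lock graph over-approximation"),
              ("analyze lock graph", "pre-analyzed lock graph")},
)
print(static_analysis.dangling())
```

Allowing a deliverable to appear as the source of a `requires` arrow reflects the remark that the three-part topology is not absolute. Activity hierarchy, control flow, and the open/closed distinction are deliberately left out of this sketch.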
As a simple example of an XPDD, we show in Fig. 4 the XPDD describing an augmented static analysis process for the ConTest-based concurrency self-healing. The left side (inputs and preconditions) specifies the need for Soot call-graph analysis and Soot points-to analysis, as well as the GEAR Modelchecker. These need to be available for the activities static code analysis (sub-activity convert code to call graph) and lock graph creation and analysis (sub-activities compare lock objects and analyze lock graph), respectively. The activity static code analysis in turn is represented as the open activity find concurrent related code blocks, which generates the synchronized blocks graph as output. Consequently, the synchronized blocks graph is specified on the right side. Other specified outputs are the lock graph over-approximation produced by the sub-activity create lock graph as well as the pre-analyzed lock graph produced by the closed sub-activity analyze lock graph.

Fig. 4. The structure of an XPDD.

IV. THE SHADOWS METHODOLOGY

A. Requirements to the Methodology

In the context of SHADOWS, the methodology establishment effort faces various challenges. On the one hand, the SHADOWS project is highly modular and heterogeneous, and built bottom-up (driven by the needs of the industrial case studies); therefore the chances of a natively uniform methodology are dim. On the other hand, once one goes beyond the surface, many analogies and parallels are found between the different sub-projects and the way they handle information and actions. The task was therefore threefold:
• Establishing the methodology should be non-intrusive. Each sub-project within SHADOWS is research-intensive and necessarily guided by project-specific considerations. Imposing a fixed methodology for establishing prototypes within the sub-projects seemed counterproductive.
• The methodology should deliver comparable specifications. In spite of the heterogeneous nature of the different sub-projects, any parallel found helps to unify the approach and thereby makes it more practicable.
• The end-product of the methodology effort should be suitable as (part of) the documentation of the SHADOWS project, making the project’s results accessible to third parties (i. e., users and applicants of SHADOWS technologies).
Non-intrusion and comparability are unfortunately competing goals. A non-intrusive methodology specification and
evolution suggests separate methodologies for each sub-project (i. e., distinct methodologies for performance healing, concurrency healing, and functional healing), while aiming for comparability leads to structuring the capture of the tasks along the stages of the life-cycle of the development effort (i. e., it leads to distinct methodologies for design-, coding-, and deployment-time). Further, the unified SHADOWS methodology should provide indications, guidance, and rationales concerning which of the various self-healing techniques (or combinations thereof) is most applicable and beneficial in a given industrial project. Finally, the methodology should exhibit the adaptations necessary to the software development process of a product when augmenting it with SHADOWS technologies.

B. Method of building a methodology

Fig. 5 describes the workflow for establishing the SHADOWS methodology, i. e., our meta-methodology. The first step towards establishing an overall methodology in SHADOWS was to collect the individual methodologies of the various pre-existing self-healing approaches. Once these were captured in the concise XPDD notation, optimizations could be suggested. The understanding of the process was improved by adding precision and details to the corresponding XPDDs.

Once the individual processes are captured, comparing the methodologies may reveal opportunities for unification, such as shared use of tools (e. g., tools that instrument the programs for monitoring purposes). It may also uncover conflicts or dependencies between different healers. For example, some healing activities change the code at runtime (e. g., healing for regression faults, see sec. II-B3), which might cancel or invalidate the results of prior static analysis or pre-deployment activities.

Fig. 5. Overview of the general methodology collecting workflow.
Fig. 6. Template: Design Time Analysis.
Fig. 7. Template: Environment Analysis.
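The comparison step of the collecting workflow, looking for shared tools and for conflicts between healers, can be sketched in a few lines of Python. The record type, its boolean flags, and the two helper functions below are assumptions made for illustration; the project captured this information in XPDDs, not in code. The example entries follow statements made in the text (ConTest underlies both performance and concurrency healing, concurrency healing partly requires static analysis, regression-fault healing changes code at runtime).

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import List, Set, Tuple


@dataclass
class ProcessDescription:
    """Hypothetical summary of one captured healing process."""
    name: str
    tools: Set[str] = field(default_factory=set)
    changes_code_at_runtime: bool = False
    relies_on_static_analysis: bool = False


def shared_tools(processes: List[ProcessDescription]) -> List[Tuple[str, str, Set[str]]]:
    """Pairs of processes with at least one common tool (unification candidates)."""
    return [(a.name, b.name, a.tools & b.tools)
            for a, b in combinations(processes, 2) if a.tools & b.tools]


def conflicts(processes: List[ProcessDescription]) -> List[Tuple[str, str]]:
    """(modifier, affected) pairs: runtime code changes may invalidate static results."""
    return [(a.name, b.name)
            for a in processes for b in processes
            if a is not b and a.changes_code_at_runtime and b.relies_on_static_analysis]


processes = [
    ProcessDescription("performance healing", tools={"ConTest"}),
    ProcessDescription("concurrency healing", tools={"ConTest"},
                       relies_on_static_analysis=True),
    ProcessDescription("regression-fault healing", changes_code_at_runtime=True),
]
print(shared_tools(processes))
print(conflicts(processes))
```

A pairwise scan like this only flags candidates; whether a shared tool really permits unification, or whether a flagged conflict is real, remains a judgment for the people comparing the process descriptions.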

C. The template approach

When capturing the individual processes, the competing goals of non-intrusion and comparability need to be reconciled. Therefore, we introduced a flexible template structure that allows specifying each individual process with precision and detail while using similar milestones and terminology. Currently, we have defined
• four templates capturing the various preparation tasks that roughly follow the stages of the SHADOWS development, analysis, and deployment (i. e., design-time analysis, environment analysis, static analysis, and testing-time analysis), and
• a fifth that captures the run-time healing activities.
These templates are the result of the uniformation process for the SHADOWS methodology, and we are convinced that they form the core of a generic and generalizable methodology for self-healing techniques. The detailed processes for the single technologies are then instances and concretizations of these templates.

1) The methodology patterns: We present here the methodology patterns in their XPDD formulation.

Design-time analysis. Fig. 6 shows the template for capturing design-time analysis methodologies. Inputs to design-time processes for healing approaches include all kinds of models or graphs that describe or specify the system under consideration. Possible outcomes of this phase are any analysis results that are used for self-healing and that might be useful for further analytic activities. If the development process is model-driven, the source-code generation that may follow the design phase should be able to output self-healing enabled code, or code that supports self-healing technologies.

Environment analysis. This template (fig. 7) serves to capture all environmental properties which a self-healing approach might rely on. It considers as inputs development- or execution-environment aspects such as the availability of code version-control systems, the libraries used, or the milestones and update cycles of the development and maintenance process. Generally, a list of opportunities for runtime healing that springs from these development-environment capabilities should be the output of the process.

Static analysis. The static analysis process, which is central to most healing approaches, is captured along the template in Fig. 8. The static analysis phase is based on the source code or on any kind of intermediate representation (e. g., flow graphs, call graphs, lock graphs). Various SHADOWS approaches use instrumented byte-code for the analysis. Expected results should again provide the software with the information necessary for healing capabilities, which in some cases can also be used to augment further analyses.

Testing-time analysis. This template (fig. 9) is analogous to the static analysis template; it differs only in the inputs which are generally used. Instead of source code or other static artifacts, the basis for the analysis is data collected from test runs.

Runtime self-healing. The methods used during the application of SHADOWS technologies are captured along the template in Fig. 10. The self-healing process itself takes inputs from all previous analysis and preparation processes. Outputs are all kinds of logs and other information about events that occurred. Following the SHADOWS architecture, the runtime self-healing process splits into four different sub-processes for the monitors, oracles, healers, and actuators. Some approaches do not distinguish all those elements; others additionally consider a healing assurance process, which is not yet captured in the template of fig. 10.

Fig. 8. Template: Static Analysis.
Fig. 9. Template: Testing Time Analysis.

2) Examples of Concrete Processes Expressed in XPDD: Fig. 11 specializes the static analysis template (fig. 8) for
control actuators

choose cure choose cure choose cure

influence influence program program execution execution

control actuators

Fig. 10.

e.g. log procedure

Template: Runtime Self Healing Overview

Pattern based Analysis BCEL intermediate representation

iterate through code next?

CFG for each method no

yes

check check for for patterns check for patterns patterns patterns found?

no

yes

produce related atomic section

program locations

save program location create document

Fig. 11.

XML document with related atomic sections

Concurrency Healing: Static Analysis

the pattern based lock safety analysis, which is part of the concurrency healing static preparations (see sec. II-B2). Of the inputs suggested by the pattern, the intermediate representation is chosen, here the Byte Code Engineering Library (BCEL) representation and Control Flow Graphs (CFG). Output of the process are the program locations relevant for locking in form

112

SHADOWS in fact initially used differing monitoring infrastructures.

AtomRace detection Message from ConTest about events occured

check event if apply to Oracle

D. Status of XPDD Capture and Standardization

appropriate event?

yes

set of program locations with final or volatile variables

Healer category performance concurrency functional SHADOWS total

check concurrency violation violation check concurrency check concurrency violation

atomic sections of interest

violation?

yes

no

message to healer with information about to detected violation

Dynamic: Atom Race Detection

BCT anomaly detection Message from ConTest about events occured

check event if apply to Oracle appropriate event?

yes

Behavioral model

Filter Out Violations in Correct Actions

Filter Out Violations in Correct Actions

model violations BCT anomaly graph

build anomaly graph no

failures analyse anomaly graph

Fig. 13.

total XPDDs 3 14 20 37

Table I lists the current status of the methodology descripion and structuring process, expressed in numbers of XPDDs. Of the 37 XPDDs currently available, 16 have been already defined (or restructured) along the templates introduced in Sect. IV-C and 21 are manually derived, well formed XPDDs, but still need to be standardized in a template-conform style. According to the healing areas, we have 3 performance XPDDs, 14 concurrency XPDDs, and 20 functional XPDDs. These numbers are likely subject to change in the course of the uniformation process, since some of the manually defined diagrams are rather comprehensive, and cover the scope of several templates. Of the 16 template-based XPDDs, 10 concern static analysis and 6 runtime self-healing. The testing and environment analysis patterns have been prepared by analysis of the possible synergies among the functional (manual) XPDDs currently available, and are going to be heavily used in the course of their ongoing uniformation.

send instance of variable to healer

Fig. 12.

from templates 3 13 0 16

TABLE I C URRENT SPLIT OF XPDD S WITHIN THE SHADOWS METHODOLOGY

no

detect instance of variable

manual 0 1 20 21

message to healer with information about faulty components

V. R ELATED WORK

Dynamic: BCT diagnosis

of an XML specification, they embody the “results for runtime analysis. . . ” (sec. 8) suggested by the template. When processes from similar stages of the various SHADOWS techniques are captured along the lines of the same template, the methodologies become comparable, and analogies concerning tool integration and shared data and infrastructure use can be leveraged. Fig. 12 and Fig. 13 implement the runtime self-healing template (fig. 10) • for the atomic race conditions detection oracle from the concurrency self-healing sub-project (fig. 12) • and for the behavior monitoring oracle from the functional self-healing sub-project (fig. 13). Note that both activities, as suggested by their respective XPDDs, rely on events received from the ConTest monitoring framework. This analogy actually originates from the ongoing integration effort: the prototypes produced in the first phase of

Self-healing, self-management, self-inspection, self-configuration, etc., generally called self-* (self-star) capabilities [3], [18], [23], are increasingly studied and developed to address the challenges of modern, high-reliability computing. While systems that benefit from such concepts are generally cutting-edge in size, complexity, and heterogeneity, there is little previous work on the methodological structuring of their creation, enhancement, and management. Methodological viewpoints are found in [11] concerning comparisons between different families of systems, as well as in concrete analyses of phenomena, for instance in autonomous networks [29], but no study of how to describe the methodology for self-healing is available so far. Instead, structure has been scrutinized in terms of architecture (e. g., [41]). Even large-scale programs like the German DFG SPP on Organic Computing [28], which supports tens of projects over 6–9 years, remain at the level of developing techniques and applying them, without addressing a coherent and cohesive methodological layer above the single projects. Yet to address the intricacies of the development process of self-* capabilities, with sophisticated, injectable functionality orthogonal to a system’s main purpose, engineering of the method itself seems essential. In this respect, SHADOWS and the present work are unique.

Methodology issues have been investigated in the software process design community, and in fact our inspiration and the source of our notation stem from that line of research. In that domain, connecting node-action flow-graph diagrams or similar state-transition systems with a representation of the respective resources or deliverables is a natural choice when capturing algorithms, protocols, or processes. Informal notations of this kind are found in many different flavors in standard textbooks on software engineering (see [15] as a random example). The notation and graphic syntax of this paper are heavily based on the work of van de Weerd, Brinkkemper, et al. [7], [39], [40], who proposed their notation particularly for methodology research. Besides their contribution, little work has been dedicated explicitly to method engineering in general (i. e., as a stand-alone subject of research independent of other areas of computing and information sciences).

With the widespread use and availability of UML [27], it seemed sensible to employ UML or a derived notation for the specification of the SHADOWS methodologies. The PDD notation, which has been established in method engineering, is closely related to UML activity diagrams and therefore immediately familiar to all project members. However, starting with version 2.0, UML includes its own standardized input-output notation within the activity diagrams [26]. Fig. 14 displays the relevant graphic syntax. Activities can have objects or collections (i. e., sets) of objects associated either with their input or with their output. Nevertheless, in this rather restrictive notation it does not seem feasible to capture complex resource-activity-deliverable relations, and it does not allow specifying relations between deliverables, resources, or both (to capture, e. g., alternating dependencies between separate processes). It is also not possible to express refinements (inheritance), aggregation, or deliverables of an activity that serve as a resource for a process other than the one producing them.

[Fig. 14. UML 2.0 graphic input-output syntax]

The Structured Analysis and Design Technique (SADT) [10], [21], [31], [32], [33] offers a hierarchic, diagrammatic syntax for capturing complex systems and applications. While SADT is in many respects comparable to UML, it was developed much earlier and differs in its strengths and weaknesses (e. g., the notations of SADT are far less inspired by the concepts of object-oriented programming than those of UML, while UML in contrast is lacking with regard to functional calculus notions). SADT’s base elements are functions; they have inputs and outputs, but additionally also controls that influence the function, and so-called mechanisms that represent the resources associated with that function. In SADT it is possible to define objectives (comparable to the deliverables) and associate them with hierarchically modeled functions (e. g., business objectives as results of business processes). It is conceivable that (an altered or augmented version of) SADT syntax would have been equally suitable for the capture and manipulation of methodologies, but we preferred a UML-like notation in view of compatibility with the mainstream formalism in software engineering.
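To make the SADT vocabulary above more concrete, the following Python sketch models a single SADT function with its inputs, outputs, controls, and mechanisms, plus hierarchical refinement. It is only an illustration under stated assumptions: the class SadtFunction, its field names, and the example values (which loosely reuse the static analysis process of Fig. 11) are hypothetical and are not part of SADT tooling or of the SHADOWS project.

```python
from dataclasses import dataclass, field


@dataclass
class SadtFunction:
    """One SADT function box; all names here are illustrative assumptions."""
    name: str
    inputs: list[str] = field(default_factory=list)      # data consumed by the function
    outputs: list[str] = field(default_factory=list)     # data produced by the function
    controls: list[str] = field(default_factory=list)    # constraints that influence the function
    mechanisms: list[str] = field(default_factory=list)  # resources associated with the function
    children: list["SadtFunction"] = field(default_factory=list)  # hierarchical refinement

    def all_outputs(self) -> list[str]:
        """Collect the outputs of this function and of all its refinements, depth first."""
        collected = list(self.outputs)
        for child in self.children:
            collected.extend(child.all_outputs())
        return collected


# Hypothetical mapping of the pattern-based static analysis onto SADT roles.
static_analysis = SadtFunction(
    name="pattern-based static analysis",
    inputs=["BCEL intermediate representation", "control flow graphs"],
    outputs=["XML document with related atomic sections"],
    controls=["lock-safety patterns"],   # assumed control, for illustration only
    mechanisms=["static analyzer"],      # assumed mechanism, for illustration only
)
preparation = SadtFunction(
    name="concurrency healing preparation",
    children=[static_analysis],
)
```

In this reading, the outputs play the role of deliverables, and calling preparation.all_outputs() gathers them across the function hierarchy. Relations between deliverables themselves are not modeled in this sketch.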

VI. CONCLUSIONS

We have addressed, to our knowledge for the first time, the issue of a methodology for self-healing augmentation and execution in a heterogeneous, complex, industrial context driven by concrete case studies. We have shown how to adapt the PDD notation for expressing software-related methodologies to the more complex and articulated case of injectable third-party software that achieves self-healing for the target systems. We have also introduced patterns as a means to capture generic elements of the methodology. This way we account for ease of specialization to the rich heterogeneity encountered in the project (of the platforms, of the systems, of the purposes, goals, and techniques), while still guaranteeing ease of understanding and a uniform look and feel of the descriptions. In fact, we have already started reaping the benefits of this introduction: we have found similarities at the template level between previously apparently unrelated technologies, and the technology providers and users have started to realign them, guided by the templates, which suggested a potential for synergy, harmonization, and simplification.

As the project progressed, we unified the individual methodologies wherever possible, to obtain an overall concise yet precise methodology description for the SHADOWS technologies. It is our intention to achieve a smooth and standardizable methodology, which we aim to propose to the community of developers and users of self-healing techniques as a basis for a standardization effort.

From conversations with industry partners it has become clear that the methodology not only has a value by itself, but may also provide guidance and a rationale about when and where the use of which SHADOWS technology is most beneficial. Further, as previously noted, industry partners, when using SHADOWS technologies, face the need to integrate the additional requirements posed by SHADOWS into their product development process. To do so, they need clear listings of the static requirements at a very concrete level (e. g., where exactly within the process a tool, piece of information, or property of the code has to be available).

With IBM as a project partner, the use of UML-style notation was an early decision. Nevertheless, we plan to establish tool support for the capture of and experimentation with methodologies with an augmented lightweight process coordination framework [22]. This way we hope to establish “executable” methodologies and stricter adherence to the specified processes.


REFERENCES

[1] Methodology for the development of self-healing systems. Technical report, SHADOWS EU IST FP6 Project, Deliverable D6.5, Nov. 2008, 83 pg.
[2] M. Arisawa, G. Bergland, and E. C. B. et al. Evaluation of design methodologies. ACM SIGSOFT Software Engineering Notes, 7(1):56–69, Jan. 1982.
[3] O. Babaoglu, M. Jelasity, A. Montresor, C. Fetzer, S. Leonardi, A. van Moorsel, and M. van Steen, editors. Self-star Properties in Complex Information Systems. Conceptual and Practical Foundations, volume 3460 of LNCS. Springer, 2005.
[4] M. Bakera, T. Margaria, C. Renner, and B. Steffen. Property-driven functional healing: Playing against undesired behavior. In Int. Symp. on Quality Engineering for Embedded Systems (QEES), 2008.
[5] M. Bakera, T. Margaria, C. Renner, and B. Steffen. Tool-supported enhancement of diagnosis in model-driven verification. Innovations in Systems and Software Engineering, 5:211–228, Sept 2009.
[6] G. D. Bergland. Structured design methodologies. In Proc. of the 15th Conf. on Design automation, pages 475–493. IEEE, 1978.
[7] S. Brinkkemper. Method engineering: engineering of information systems development methods and tools. Information and Software Technology, 38(4):275–280, April 2008.
[8] H. Chang, L. Mariani, and M. Pezzè. Self-healing strategies for component integration faults. In Proc. of the Int. Worksh. ARAMIS, 23, pages 25–32. IEEE, 2008.
[9] P. Clements and L. Northrop. Software Product Lines. Addison Wesley, 2002.
[10] T. DeMarco. Structured Analysis and System Specification. Prentice Hall, 1979.
[11] B. D.W., S. R., T.-B. A., and L. A. Autonomic system design based on the integrated use of ssm and vsm. AI Review, 25(4):313–327, 2006.
[12] D. S. Frankel. Model Driven Architecture: Applying MDA to Enterprise Computing. Wiley, 2003.
[13] M. Goldstein, O. Shehory, and Y. Weinsberg. Can self-healing software cope with loitering? In 4th SOQUA, pages 1–8. ACM Press, 2007.
[14] G. J. Hidding. Reinventing methodology: who reads it and why? Communications of the ACM, 40(11):102–109, November 1997.
[15] E. Horn and T. Reinke. Softwarearchitektur und Softwarebauelemente. Eine Einführung für Softwarearchitekten. Hanser Fachbuch, August 2002.
[16] V. Hrubá, B. Křena, and T. Vojnar. Using JavaPathFinder for self-healing assurance. In 3rd Doctoral Wrksh. MEMICS, pages 67–73, 2007.
[17] Z. Letko, T. Vojnar, and B. Křena. AtomRace: Data race and atomicity violation detector and healer. In 6th PADTAD, pages 1–10. ACM Press, 2008.
[18] B. C. Ling and A. Fox. A self-tuning, self-protecting, self-healing session state management layer. In Proc. of the Autonomic Computing Wrksh., 5th Int. Wrksh. On Active Middleware Services, pages 131–140. IEEE, 2003.
[19] D. Lorenzoli, L. Mariani, and M. Pezzè. Towards self-protecting enterprise applications. In 15th ISSRE, pages 39–48, Trollhättan, Sweden, 2007.
[20] D. Lorenzoli, L. Mariani, and M. Pezzè. Automatic generation of software behavioral models. In 30th ICSE, 30, pages 501–510. IEEE, 2008.
[21] D. Marca and C. McGowan. SADT: Structured Analysis and Design Technique. Software Engineering Series. McGraw-Hill, 1988.
[22] T. Margaria and B. Steffen. Lightweight coarse-grained coordination: a scalable system-level approach. Int. Jrnl. on Software Tools for Technology Transfer, 5(2):107–123, March 2004.
[23] N. H. Minsky. On conditions for self-healing in distributed software systems. In Proc. of the Autonomic Computing Wrksh., 5th Int. Wrksh. On Active Middleware Services, page 86. IEEE, 2003.
[24] Y. Nir-Buchbinder and S. Ur. ConTest listeners: a concurrency-oriented infrastructure for Java test and heal tools. In 4th SOQUA, pages 9–16, New York, 2007. ACM Press.
[25] Object Management Group. MDA guide version 1.0.1. http://www.omg.org/docs/omg/03-06-01.pdf.
[26] Object Management Group. UML version 2.1.2. http://www.omg.org/spec/UML/2.1.2/Superstructure/PDF/.
[27] Object Management Group. Unified modeling language. http://www.uml.org.
[28] Organic computing initiative. http://organic-computing.de/.
[29] S. R. Autonomic networks: Engineering the self-healing property. Engineering Applications of Artificial Intelligence, 17(7):727–739, 2004.
[30] C. Roland. A primer for method engineering. In Proc. Conf. INFormatique des ORganisations et Systèmes d’Information et de Décision (INFORSID), Toulouse (F), 1997.
[31] D. Ross. Structured analysis (SA): A language for communicating ideas. IEEE Trans. on SE, 3(1):16–34, 1977.
[32] D. Ross. Applications and extensions of SADT. IEEE Computer, 18(4):25–34, April 1985.
[33] D. Ross and K. Schoman. Structured analysis for requirements definition. IEEE Trans. on SE, 3(1):6–15, 1977.
[34] SHADOWS. A self-healing approach to designing complex software systems. https://sysrun.haifa.ibm.com/shadows/.
[35] O. Shehory. SHADOWS: Self-healing complex software systems. In 23rd IEEE/ACM Int. Conf. on Automated Software Engineering Workshops (ASE), pages 71–76. IEEE, 2008.
[36] O. Shehory, S. Ur, and T. Margaria. Self-healing technologies in SHADOWS: Targeting performance, concurrency and functional aspects. In Proc. of the 10th Int. Conf. on Quality Engineering in Software Technology (CONQUEST), 2007.
[37] X. Song and L. J. Osterweil. Experience with an approach to comparing software design methodologies. IEEE Trans. on Soft. Eng., 20(5):364–384, May 1994.
[38] C. Szyperski. Component Software: Beyond Object-Oriented Programming. ACM Press / Addison-Wesley, 2nd edition, 2002.
[39] I. van de Weerd, S. Brinkkemper, J. Souer, and J. Versendaal. A situational implementation method for web-based content management system-applications: method engineering and validation in practice. Software Process: Improvement and Practice, 11(5):521–538, July 2006.
[40] I. van de Weerd, S. Brinkkemper, and J. Versendaal. Advanced Information Systems Engineering, volume 4495 of LNCS, chapter Concepts for Incremental Method Evolution: Empirical Exploration and Validation in Requirements Management, pages 469–484. Springer, June 2007.
[41] S. R. White, J. E. Hanson, I. Whalley, D. M. Chess, and J. O. Kephart. An architectural approach to autonomic computing. In 1st ICAC, pages 2–9. IEEE, 2004.
