MAINTAINING CONSISTENCY AND BEHAVIOR OF OBJECT-ORIENTED SYSTEMS DURING EVOLUTION

A Thesis Presented to the Faculty of the Graduate School of the College of Computer Science of Northeastern University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

by Walter L. Hürsch

August 28, 1995

Copyright © MCMXCV by Walter L. Hürsch. All Rights Reserved.

Dedicated to my wife Karin.

[Figure: Thesis Advisor Genealogy. A genealogy chart of thesis advisors and habilitation advisors. Names appearing in the chart: Jim Finn, Doug Long, Ian Holland, Ignacio Silva-Lepe, Walter Hürsch, Cristina Videira Lopes, Linda Keszenheimer, Efstathios Zachos, Karl Lieberherr, Martin Fuerer, Cun Xiao, Richard Büchi, Hugo Ribiero Vermuis, Ernst Zermelo, David Hilbert, Erwin Engeler, Paul Bernays, Edmund Landau, Kurt Schütte, Paul Bergstein, Saunders Mac Lane, John Scranton, Haskell Curry.]

Abstract

The area of software evolution has gained tremendous interest in the software research and industry communities during the last decade. This is not surprising considering that, according to a 1988 study conducted by the National Institute of Standards and Technology (NIST), 60 to 85 percent of the total cost of software is due to maintenance, and of this, about 60 percent is invested in so-called perfective maintenance: the continued development and evolution of a software system after it has become operational.

Any change to a software system is a prodigious source of new software faults and of inconsistencies in other parts of the system. The arising inconsistencies can be conceptual, structural, or behavioral, depending on whether they affect the schema, the objects, or the methods, respectively. The evolution process must resolve such inconsistencies in order to keep the system operational. Therefore, changes potentially trigger other changes, which in turn may trigger yet other changes. Moreover, any change must be followed by extensive regression testing as part of the revalidation and reverification process. As Jacobson observes, "a seemingly small change often results in a disproportionately expensive alteration". The high costs of these activities are a major reason why usually either the changes are not made or the subsequent testing is reduced, both of which are unsatisfactory.

Most of the work to date, especially in the area of object-oriented database systems, has focused on the conceptual and structural consistency problems, investigating how the schema and the data of a system can be kept consistent. The problem of maintaining behavioral consistency, which deals with keeping the methods consistent, is much less well understood.

This thesis presents a framework for automatically maintaining the overall consistency of an object-oriented system under the additional semantic constraint that its behavior is preserved. The framework formally defines behavioral equivalence and presents a comprehensive set of primitive schema transformations. The framework's versatility is demonstrated by applying it to a variety of language types: typed (C++), untyped (CLOS), and adaptive (Propagation Patterns). For the adaptive software model, the proofs of behavioral equivalence and consistency are carried out formally, using a new formal and executable semantics for adaptive software and a newly developed set of proof techniques.

The thesis also addresses how best to realize the framework. Software evolution is ideally dealt with at a higher level of abstraction, the meta level. It is shown that implementing software evolution is tantamount to meta-programming. An implementation of the evolution framework based on a metaobject protocol is suggested and outlined using the metaobject protocol of Closette, a subset of CLOS.

A secondary focus of the thesis is on the design of dynamic algorithms. Dynamic algorithms represent a different aspect of evolution. In contrast to a static (off-line) problem, where a solution to a fixed problem instance is sought, a dynamic problem assumes that a solution for a given problem instance has been found previously and asks for a solution to a "slightly" modified (evolved) instance. The goal of a dynamic algorithm is to effectively exploit the fact that a previous solution is known. It is thus expected to be faster than the corresponding static algorithm, which must compute a solution without previous knowledge. The thesis studies the problem of maintaining the set of medians and their minimal objective function value in a tree while the tree undergoes repeated dynamic modifications. The employed data structure is such that only O(D) nodes need be traversed to update the tree after a modification, where D is the diameter of the tree. In contrast, the optimal static algorithm needs to traverse O(n) nodes to compute the median, where n is the number of nodes in the tree.
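To make the static baseline concrete, the following is a minimal sketch of a linear-time median computation on a tree. It is not the data structure or algorithm developed in the thesis. It assumes the usual 1-median objective (the sum over all nodes of node weight times distance to the candidate node), and the function name `tree_medians` and its input format are hypothetical choices made for illustration only.

```python
from collections import defaultdict


def tree_medians(weights, edges):
    """Static O(n) sketch: return (set of medians, minimal objective value).

    Assumed objective (not taken from the thesis text): for a candidate
    node u, cost(u) = sum over all nodes v of weights[v] * dist(u, v).

    weights: dict mapping node -> non-negative node weight
    edges:   list of (u, v, length) tuples that form a tree
    """
    adj = defaultdict(list)
    for u, v, length in edges:
        adj[u].append((v, length))
        adj[v].append((u, length))

    # Iterative DFS from an arbitrary root; parents precede children in `order`.
    root = next(iter(weights))
    parent = {root: None}
    parent_len = {root: 0}   # length of the edge to the parent
    dist = {root: 0}         # distance from the root
    order = []
    stack = [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v, length in adj[u]:
            if v not in parent:
                parent[v] = u
                parent_len[v] = length
                dist[v] = dist[u] + length
                stack.append(v)

    # Accumulated subtree weights, computed children-first.
    total = sum(weights.values())
    sub = dict(weights)
    for u in reversed(order):
        if parent[u] is not None:
            sub[parent[u]] += sub[u]

    # Cost at the root, then "reroot" across each edge in O(1):
    # moving from p to child c brings sub[c] weight closer by the edge
    # length and pushes the remaining weight (total - sub[c]) further away.
    cost = {root: sum(weights[v] * dist[v] for v in order)}
    for u in order[1:]:
        cost[u] = cost[parent[u]] + parent_len[u] * (total - 2 * sub[u])

    best = min(cost.values())
    return {u for u in cost if cost[u] == best}, best


# Path a - b - c with unit node weights and unit edge lengths.
print(tree_medians({"a": 1, "b": 1, "c": 1}, [("a", "b", 1), ("b", "c", 1)]))
# -> ({'b'}, 2)
```

Every call visits all n nodes, which is the cost a dynamic algorithm tries to avoid: after a single modification, the approach described in the abstract is stated to traverse only O(D) nodes instead of recomputing from scratch.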

Preface

"You can't steal second base and keep one foot on first."

American proverb

In the spring of 1989, after having successfully completed my Diploma in Physics at the Swiss Federal Institute of Technology in Zurich, I was confronted with a challenging decision: whether to go right away into industry as a physicist, to continue my studies by starting a Ph.D. in physics, or to acquire more knowledge in my minor field, computer science, which I felt was closer to my heart. After consulting with several managers in industry and academia about the advantages and disadvantages of doing a Ph.D., I became convinced that going the research route and taking on the challenge of writing a dissertation was the right thing for me; the question remained whether to do it in physics or in computer science.

Fortunately, the decision was made easy for me by an offer that I could not refuse. In the summer of 1989, during my work on database design for a new object-oriented project at Mettler-Toledo AG in Greifensee, Switzerland, I was lucky to meet Prof. Karl Lieberherr on the occasion of a talk on object orientation that he gave at Mettler. Karl was (and still is) Professor and Director of Research at Northeastern University and an expert in the fascinating and then newly emerging field of object orientation. Since I knew that Mettler had previously supported Ph.D. students, I took the opportunity to suggest to both Mettler and Karl a "ternary relationship" in which Mettler would support my research in Karl's Demeter group at Northeastern University in Boston. By the time I came back from four months of travel to Australia, New Zealand and Thailand at the end of 1989, my proposal had been accepted. Mettler and Northeastern University made me the offer to start in June 1990. Thus I got the fascinating chance to combine three different goals in one: continue my education in (and move to) computer science, do a Ph.D., and become acquainted with a new culture and way of living in the United States of America. I cannot thank both Mettler-Toledo AG and Karl Lieberherr enough for providing me with this opportunity.

Doing a Ph.D. in the USA had one drawback: it usually takes longer than doctoral studies in Europe and thus would keep me away longer from "real work". However, the longer duration was at the same time an advantage for me: the initial portion of the Ph.D. curriculum requires fulfilling certain course work, which allowed me both to broaden my background in computer science and to start my research at the same time. Nevertheless, had I known and realized in advance how much effort and time it would take to finish this thesis, I probably would have thought twice. Of course, now that it's all done I'm glad.

Working in the USA had many other exciting aspects. One of them was the multicultural and international atmosphere that I encountered. An example of this cultural diversity was the Demeter team itself. During the time I worked with the team, Karl's group hosted members from China, Greece, Holland, India, Ireland, Mexico, Portugal, the USA (yes, really), and Switzerland. I also had the pleasure of attending international conferences in Phoenix, Indianapolis, Vancouver, Bologna, Kanazawa, and Montreal, and of visiting other universities, including the University of Berne, the University of Frankfurt, the Tokyo Institute of Technology, and the University of Tokyo.

The thesis at hand represents a compilation of my work done over these past five years at Northeastern University in Boston as a member of the Demeter team. Major portions of this compilation have been previously published in slightly different forms as part of conference proceedings and a journal publication. Other portions present new research results, as well as unifying motivation and background for the work.

Boston, Massachusetts, USA, September 1995

W.H.

Acknowledgments

For the past five years I have been working closely with my supervisor, Professor Karl Lieberherr. I feel very lucky to have met Karl and I cannot thank him enough for his unflagging support and friendship.

I wish to express my gratitude to the members of my comprehensive and thesis committees: Ken Baclawski, Jens Palsberg, Boaz Patt-Shamir, Bryant York and Roberto Zicari. They gave me valuable comments and feedback on earlier drafts of this thesis. Special thanks to Jens Palsberg for his guidance and friendship and to Bryant York for his sage advice. In addition, I had many helpful discussions with other faculty members: William Clinger, Bob Futrelle, Betty Salzberg, Mitch Wand, Ronald Williams, and Bulent Yener.

I would especially like to thank Mettler-Toledo AG for their generous support of my work. It was a pleasure to have Messrs. Rudolf Kubli, Fredy Ulmer, Felix Brunner, Rene Scheidegger, and Linus Meier as my contact persons at Mettler.

The members of the Demeter team, Ivan Baev, Paul Bergstein, Ian Holland, Linda Keszenheimer, Cristina Lopes, Salil Pradhan, Ignacio Silva-Lepe, and Cun Xiao, have always provided me with a pleasant working environment. I respect them as colleagues and thank them for their friendship. Thanks also to the other graduate students whom I met on my way: Mohammad Al-Ansari, Mike Cleary, Dave Gladstein, Hank Hughes, Geoff Hulten, Amarit Laoragpong, Philippe Mulet, Shankara Shankaranarayanan Ra, Paul Steckler, Greg Sullivan, Michael Tselman, Liz Twarog, Chendong Zou, and many others.

Finally, I wish to thank Remy Evard, his Systems Group, and the Crew for the great computing environment that they provided. Their timely responses to all sorts of computing problems are phenomenal. Thanx also to Pat Hinds, Diane Burke, Janet Josephs, and Marylin DuBois for their work and support in office matters.

I would not have had the energy to finish this thesis if it hadn't been for the support and warm friendship of many personal friends: Karin Beerli, Lydia Cartar, Alice Giubellini, Thomas Harder, Urs Hafeli, Peter Lohse, and Stefan Neeser. My gratitude goes also to my parents, who never stopped supporting, encouraging and loving me. This thesis is the culmination of their constant efforts to provide me with a good education. Thank you for everything.

I conclude by thanking my wife Karin, who made much of this work possible. Karin, thanks for your love and support. This thesis is dedicated to you.


Contents

Thesis Advisor Genealogy
Abstract
Preface
Acknowledgments
Contents
List of Tables
List of Figures

1 Introduction
  1.1 Motivation and Background
  1.2 Objectives and Limitations
  1.3 Contributions
  1.4 Thesis Outline

2 Related Work
  2.1 Schema evolution in database systems
  2.2 Refactorings
  2.3 Program restructuring
  2.4 Program transformations
  2.5 Language preserving transformations
  2.6 Change avoidance
    2.6.1 Structural change avoidance
    2.6.2 Behavioral change avoidance

3 A Framework for Evolution
  3.1 Introduction
    3.1.1 Who performs and initiates changes?
    3.1.2 What is changed?
    3.1.3 How to change?
    3.1.4 When to change?
    3.1.5 Summary
  3.2 Schema Evolution
  3.3 Object Evolution
    3.3.1 Object evolution strategies
    3.3.2 A qualitative model
  3.4 Method Evolution
  3.5 Evolution Framework
    3.5.1 System consistency
    3.5.2 Behavioral equivalence
    3.5.3 Designing a change management mechanism

4 The Kernel Data Model
  4.1 The Kernel Model Schema
  4.2 Class Graphs
  4.3 Object Graphs

5 Primitive Transformations
  5.1 Basic Primitive Transformations
  5.2 Composite Primitive Transformations
  5.3 A Collection of Primitive Transformations
    5.3.1 Addition of Concrete Class
    5.3.2 Deletion of Concrete Class
    5.3.3 Renaming of Class
    5.3.4 Addition of Abstract Class
    5.3.5 Deletion of Abstract Class
    5.3.6 Addition of Reference
    5.3.7 Deletion of Reference
    5.3.8 Renaming of Reference
    5.3.9 Addition of Inheritance Relation
    5.3.10 Deletion of Inheritance Relation
    5.3.11 Replacement of Reference
    5.3.12 Generalization of Reference
    5.3.13 Addition of Subclass
    5.3.14 Abstraction of Common Reference
    5.3.15 Distribution of Common Reference
    5.3.16 Telescoping of Reference
    5.3.17 Telescoping of Inheritance
  5.4 Characterizing Transformations
  5.5 Transformation Transactions
  5.6 Related work
    5.6.1 Database systems
    5.6.2 Programming languages

6 Class Graph Extension
  6.1 Class Graph Relations
  6.2 Implication to Objects
    6.2.1 Effect on modeling power
    6.2.2 Object evolution
  6.3 Decomposition into Primitive Transformations
    6.3.1 Correctness
    6.3.2 Completeness
    6.3.3 Minimality
  6.4 Related Problems
  6.5 Related Work
  6.6 Conclusion

7 Class Graph Abstraction
  7.1 Motivation and Applications
  7.2 The Abstraction Algorithms
    7.2.1 Preliminary Definitions
    7.2.2 The Algorithm CGA
    7.2.3 The Algorithm TCGA
  7.3 Properties of the Abstraction Algorithm
    7.3.1 Correctness
    7.3.2 Special cases
    7.3.3 Analysis of CGA and TCGA
  7.4 Related Work
  7.5 Conclusions

8 The Abstract Superclass Rule
  8.1 Introduction
  8.2 Abstractness in Object-Oriented Languages
  8.3 Discussion of the Abstract Superclass Rule
    8.3.1 The Expressiveness Aspect
    8.3.2 The Covariance Aspect
    8.3.3 The Data Modeling Aspect
    8.3.4 The Evolution Aspect
    8.3.5 The Reusability Aspect
    8.3.6 The Simplicity Aspect
  8.4 Conclusion

9 Adaptive Software
  9.1 Introduction
  9.2 Semantics of Propagation Patterns
  9.3 Comparison to Previous Semantics
  9.4 Adaptive Software as Separation of Concerns
    9.4.1 Motivation
    9.4.2 Concepts and Implementation
    9.4.3 Benefits
    9.4.4 Existing Approaches
    9.4.5 Meta-level Programming
    9.4.6 Composition Filters
    9.4.7 Discussion
  9.5 Future Work
  9.6 Related Work
  9.7 Summary

10 Minimizing Information Acquisition Cost
  10.1 Introduction
  10.2 The Problem
    10.2.1 Assumptions
    10.2.2 Traversal sequences
    10.2.3 Acquisition cost
  10.3 Optimization and Algorithm
    10.3.1 Optimization
    10.3.2 Brute-Force Algorithm
    10.3.3 Optimal Algorithm
  10.4 Implementation and Measurements
  10.5 Related and Future Work
  10.6 Conclusion

11 Maintaining Medians During Evolution
  11.1 Introduction
  11.2 Static Tree Algorithm
    11.2.1 Brute-force algorithm
    11.2.2 Optimal algorithm
  11.3 Characterization of Median
    11.3.1 The accumulated weight function W
    11.3.2 Single median
    11.3.3 Multiple Medians
    11.3.4 Summary
  11.4 Median in a Dynamic Tree
    11.4.1 Insertion of a Node
    11.4.2 Deletion of a Node
    11.4.3 Change of Node Weight
    11.4.4 Change of Edge Weight
    11.4.5 Summary
  11.5 Concluding Remarks

12 Maintaining Behavior and Consistency (PP)
  12.1 Sample Language
  12.2 Proof Techniques
    12.2.1 Path set lemmas
    12.2.2 Lookup lemma
    12.2.3 Traversal Theorem
  12.3 Application of Framework
    12.3.1 Addition of Concrete Class
    12.3.2 Addition of Subclass
    12.3.3 Addition of Abstract Class
    12.3.4 Deletion of Abstract Class
    12.3.5 Addition of Reference
    12.3.6 Abstraction of Common Reference
    12.3.7 Distribution of Common Reference
    12.3.8 Replacement of Reference
    12.3.9 Generalization of Reference
    12.3.10 Renaming of Class
    12.3.11 Renaming of Reference
    12.3.12 Telescoping of Reference
    12.3.13 Telescoping of Inheritance
    12.3.14 Summary
  12.4 Related and Future Work
  12.5 Summary

13 Maintaining Behavior and Consistency (CLOS)
  13.1 Introduction
    13.1.1 Structural Consistency
    13.1.2 Assumptions
  13.2 The CLOS Language Model
  13.3 Code Transformations
    13.3.1 Addition of abstract class (AddA)
    13.3.2 Distribution of common reference (DisR)
    13.3.3 Abstraction of common reference (AbsR)
    13.3.4 Replacement of reference (RepR)
    13.3.5 Deletion of abstract class (DelA)
    13.3.6 Addition of concrete class (AddC)
    13.3.7 Addition of reference (AddR)
    13.3.8 Generalization of reference (GenR)
    13.3.9 Addition of Subclass (AddS)
    13.3.10 Renaming of Class (RenC)
    13.3.11 Renaming of Reference (RenR)
    13.3.12 Telescoping of Inheritance (TelI)
    13.3.13 Telescoping of Reference (TelR)
  13.4 Discussion

14 Maintaining Behavior and Consistency (C++)
  14.1 The C++ Language Model
  14.2 Code Transformations
    14.2.1 Addition of abstract class (AddA)
    14.2.2 Distribution of common reference (DisR)
    14.2.3 Abstraction of common reference (AbsR)
    14.2.4 Replacement of reference (RepR)
    14.2.5 Deletion of abstract class (DelA)
    14.2.6 Addition of concrete class (AddC)
    14.2.7 Addition of reference (AddR)
    14.2.8 Generalization of reference (GenR)
    14.2.9 Addition of Subclass (AddS)
    14.2.10 Renaming of Class (RenC)
    14.2.11 Renaming of Reference (RenR)
    14.2.12 Telescoping of Inheritance (TelI)
    14.2.13 Telescoping of Reference (TelR)
  14.3 Discussion

15 Evolutionary Metaobject Protocol
  15.1 Reflective Systems
  15.2 Reflection and Evolution
  15.3 Mademoiselle Closette
  15.4 Closette's EMOP
  15.5 Conclusion

16 Conclusions

Bibliography

B Code Examples for the Window System
  B.1 Window System in CLOS
  B.2 Window System in C++

List of Tables

1 Data model and meta primitives yield basic primitive transformations
2 Basic primitive transformations for class graphs
3 Class abstraction
4 Summary of existing abstractness mechanisms
5 The abstract library rule for leaf classes
6 Summary: benefits of the abstract superclass rule
7 Existing approaches to separate concerns from the basic concern
8 Dictionary of concepts
9 Overview of Adaptive Software Terms

List of Figures

Thesis Advisor Genealogy
1 Behavior extension facilitated by behavior preserving transformations
2 Thesis outline and dependency between chapters
3 Relation between extensional and intensional data
4 The duality between objects and classes
5 The major elements of an object-oriented system
6 Expected access times for different evolution strategies
7 Expected total access times for different object evolution strategies
8 Schema, object and program transformations
9 Behavioral equivalence
10 Example of object graph consistent with class graph
11 Characterization of primitive transformations
12 Example of object-equivalence: G1 G2
13 Example of weak-extension: G1 G2
14 Example of extension: G1 G2
15 Extension at the object level
16 The concept of the abstraction mechanism
17 Example abstraction of CGA
18 Example abstraction of TCGA
19 Class hierarchy transformation to comply with the abstract superclass rule
20 The regions of type-(un)safety
21 Class hierarchy transformation for the square-rectangle example
22 Computing the total local memory of systems running OS/2 2.1
23 The major elements of an adaptive object-oriented system
24 Propagation pattern os2 mem
25 Example class transformation
26 Example class transformation
27 Illustration of calling-reachability
28 Terminology overview and consistency dependencies
29 Path check and dynamic bypassing
30 Intertwining algorithm and synchronization
31 Separation of concerns at both the conceptual and implementation level
32 Example object graph with optimal and non-optimal hub
33 Implementation with non-optimal hub Button
34 Implementation with the optimal hub Control
35 An illustration of the Cost Theorem
36 An example of a tree illustrating cost, node weights, and accumulated weights
37 An illustration of the insertion update mechanism
38 An illustration of the deletion update mechanism
39 Behavioral equivalence proof technique
40 Extending a class graph
41 Steps in the object-preserving transformation
42 The distribution of responsibility among the three levels of objects

Chapter 1

Introduction

"A man who has the knowledge but lacks the power clearly to express it is no better off than if he never had any ideas at all."
Thucydides (300 B.C.)

Software evolution and maintenance are becoming more popular and more important in today's software industry. The major reason for this is the increasing number of software applications in use and the increasing costs associated with maintaining them and adapting them to changing requirements. Developing applications is expensive, but maintaining and evolving them is even more expensive.

1.1 Motivation and Background

As Wilma Osborne of the National Institute of Standards and Technology (NIST), formerly the National Bureau of Standards, reports in an interview with Ware Myers [Mye88], a Bureau study found that 60 to 85 percent of the total cost of software is due to maintenance. Equally surprising is the fact that only 20 percent of maintenance consists of fixing bugs, so-called corrective maintenance. Another 20 percent goes to adaptive maintenance, work resulting from changes in the software environment. But the majority of maintenance efforts, about 60 percent, is invested in so-called perfective maintenance, the continued development and evolution of a software system after it has become operational. Therefore, as Johnson and Foote point out, an evolutionary phase is an important part of the lifecycle of a successful software product [JF88].

The emphasis given to the evolutionary aspect of software development is underscored by the advent and use of software life cycle models like the "spiral model" [Boe86], and others like "evolutionary prototypes", "reusable software", and "operational prototypes" [BD91]. All of the above-mentioned models are based on the assumption that the life of a successful software system does not fade out in a maintenance phase, but that, in fact, the maintenance phase is in itself another cycle through requirements, analysis, design and implementation phases.

Object-oriented systems fit very well into these life cycle models. Object-oriented languages are often touted as facilitating reuse, reducing the cost not only of development but also of maintenance, and simplifying the evolution of software systems [WEK90, JF88]. But even when built with advanced object-oriented technology, today's software systems are not immune to the effects of changes. Moreover, no technology, no matter how refined and advanced it is, can prevent changes from happening at all.

Evolutionary changes to a system can occur at various stages in the life cycle of the system and for a number of reasons. Among the reasons are: (1) Experience shows how the system can be improved. Especially for object-oriented systems, an optimal class hierarchy is not always readily apparent for complex real-world situations, and its design is further complicated by the wealth of mechanisms provided by object-oriented languages. (2) User needs change and additional functionality has to be integrated. (3) The application domain modeled by the system changes and the system has to be adapted. The above reasons are inherent in the complexity of reality and in the limited ability of humans to cope with this complexity.

Any change to a software system is a prodigious source of new software faults and inconsistencies in other parts of the system. Consider for example changing an object-oriented application by simply renaming an instance variable of a class. This modification renders all methods that use the instance variable inconsistent. As another example, consider the addition of an instance variable to a class. Such a change leaves all objects of the class inconsistent since they do not contain a value for the added variable. The evolution process must resolve such inconsistencies in order to keep the system operational. Therefore, changes potentially trigger other changes which in turn may trigger yet other changes. Moreover, any change must be followed by extensive regression testing as part of the revalidation and reverification process. As Jacobson observes [JCJO 92], "a seemingly small change often results in a disproportionately expensive alteration". The high cost of these activities is a major reason why often either the changes are not made or the subsequent testing is reduced, both of which are unsatisfactory.
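The two examples above can be made concrete with a small sketch. It is illustrative only: the class name `Window`, the dictionary that stands in for a stored object, and the helpers `add_instance_variable` and `rename_instance_variable` are hypothetical, and Python is used purely for brevity rather than one of the languages studied in the thesis.

```python
# A stored object, written out under the original class definition,
# which had only the instance variables "width" and "height".
stored_window = {"class": "Window", "width": 80, "height": 24}


def area(obj):
    """A method written against the original schema."""
    return obj["width"] * obj["height"]


def describe(obj):
    """A method written against the evolved schema, which adds "title"."""
    return f'{obj["title"]}: {obj["width"]}x{obj["height"]}'


def add_instance_variable(obj, name, default):
    """Object transformation: give an old object a value for the new variable."""
    updated = dict(obj)
    updated.setdefault(name, default)
    return updated


def rename_instance_variable(obj, old, new):
    """Object transformation for a renaming; methods must be updated as well."""
    updated = dict(obj)
    updated[new] = updated.pop(old)
    return updated


# Adding a variable leaves the old object inconsistent with the new schema.
try:
    describe(stored_window)
    old_object_is_consistent = True
except KeyError:
    old_object_is_consistent = False

repaired = add_instance_variable(stored_window, "title", "untitled")

# Renaming a variable renders every method that uses the old name inconsistent.
renamed = rename_instance_variable(stored_window, "width", "columns")
try:
    area(renamed)
    old_method_is_consistent = True
except KeyError:
    old_method_is_consistent = False
```

In this sketch the added variable is repaired on the object side by supplying a default value, whereas the renaming can only be repaired by also rewriting the methods. This is one way to see why a single schema change can trigger further changes.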


The research into evolution, restructuring and reorganization of object-oriented systems therefore addresses an issue with important theoretical and practical implications. Not only does this research help to manage complexity and keep software systems up-to-date and well-maintained, it also reveals how to build these systems so that they are more robust to a changing environment.

1.2 Objectives and Limitations

There have been various efforts to solve the problems incurred by evolution, and to reduce both the costs and time involved. First of all, source code control systems like SCCS [Roc75], CMS [Dig82], and RCS [Tic85] were devised to keep track of textual changes. Other tools like MAKE [Fel79] tried to reduce the effort necessary to reconstruct system integrity by capturing the interdependencies of the files that build the system. There are other so-called configuration management systems, like the CMTOOLS for multi-platform code management [Jam94] or the NuMIL prototype [NS87], whose goal is to reduce the impact of system changes and to automate the necessary change propagation. Such configuration management systems typically work at the level of "programming-in-the-large", capturing the interaction between larger modules, as opposed to "programming-in-the-small", which involves intra-module activities like data structure design, coding and so forth.

In general terms, the goal of this thesis is to provide solutions to the software evolution problem at the source code level. The solutions support change management of the program code to keep track of and resolve inconsistencies of the system due to changes at the system design level. The change management activities are mostly programming-in-the-small activities but span the entire system and thus also solve problems in-the-large.

System inconsistencies can in principle be resolved in a number of ways depending on the semantics and the purpose of the modification. The proposed change management system focuses on maintaining the consistency of an object-oriented system while preserving its behavior. Intuitively, the behavior of a system is preserved if its functionality is the same before and after a modification. In other words, if the system is run twice, before and after a modification, with the same input, it continues to produce equivalent output. The original and the modified system are then said to be behaviorally equivalent.

The behavior preserving property of modifications is important for two reasons. First, it is crucial in itself for all modifications that are part of the perfective maintenance of a system. This holds in particular when the intention of the modification is to improve the system, e.g., to make it more reusable, more readable, more maintainable, more portable, and so forth. It is typically a case of hindsight by which one realizes how the system could have been designed better. Moreover, as Ong and Tsai point out [OT93], without some deliberate preventive maintenance effort, the unstructuredness, or entropy, of a system will increase and hinder further code understanding and maintenance. System administrators are usually reluctant to perform such perfective improvements out of fear they might break the code.

[Figure 1 diagram: System-0 (original) is turned into System-1 (structure extended, behavior preserved) by Phase 1, a structure extending, behavior preserving transformation. System-1 is turned into System-2 (structure extended, behavior extended) by Phase 2, a structure preserving, behavior extending transformation. Together, the two phases form the system (structure and behavior) extending transformation from System-0 to System-2.]

Figure 1: Behavior extension facilitated by behavior preserving transformations

Second, the behavior preserving modifications can also facilitate the extension of a system's behavior. For that purpose, the behavior extension can be broken down into two phases (see Figure 1). Phase one only performs the necessary structural changes to the system while still maintaining the original behavior. This phase makes use of the behavior preserving modifications and involves all structural and static aspects of the system while also maintaining the consistency of the behavioral aspects. The resulting system is thus fully operational and serves as a platform for phase two, which in general only needs to extend the behavior without having to deal with structural problems. Hence the behavior preserving modifications of phase one constitute a critical first step in the overall extension of the system.

In more specific terms, then, the goals of this thesis are the following.

1. The development of a comprehensive theory on the evolution of object-oriented systems including an overview of the state-of-the-art and related work.

2. The design of an evolution framework for maintaining consistency and behavior of an object-oriented system. The framework should be applicable to a variety of programming language models.

3. The design of an evolution management system which automates the changes necessary for behavior preserving software transformations, ensuring the consistency of the overall system.
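The two-phase decomposition described before the list of goals, together with the intuitive notion of behavioral equivalence (same input, equivalent output), can be sketched as follows. All names (`Rectangle0`, `Shape`, `system0`, `behaviorally_equivalent`, and so on) are hypothetical and chosen only for illustration. Note that comparing outputs on sample inputs gives evidence of equivalence, not a proof.

```python
class Rectangle0:
    """System-0: the original class."""

    def __init__(self, width, height):
        self.width, self.height = width, height

    def area(self):
        return self.width * self.height


def system0(sizes):
    return [Rectangle0(w, h).area() for w, h in sizes]


# Phase 1: the structure is extended by a new superclass, behavior is preserved.
class Shape:
    def area(self):
        raise NotImplementedError


class Rectangle1(Shape):
    def __init__(self, width, height):
        self.width, self.height = width, height

    def area(self):
        return self.width * self.height


def system1(sizes):
    return [Rectangle1(w, h).area() for w, h in sizes]


# Phase 2: the structure of System-1 is kept, the behavior is extended.
def perimeter(rect):
    return 2 * (rect.width + rect.height)


def system2(sizes):
    rects = [Rectangle1(w, h) for w, h in sizes]
    return [(r.area(), perimeter(r)) for r in rects]


def behaviorally_equivalent(before, after, inputs):
    """Run both systems on the same inputs and compare their outputs."""
    return all(before(i) == after(i) for i in inputs)


samples = [[(2, 3)], [(1, 1), (4, 5)], []]
phase1_preserves_behavior = behaviorally_equivalent(system0, system1, samples)
```

Here only phase one is expected to pass the equivalence check; phase two changes the output on purpose.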


1.3 Contributions

In accordance with the above goals, the salient contributions of this research include:

• The design of a framework for the evolution of object-oriented systems. The framework thoroughly investigates the issues that arise during evolution. In particular, it addresses the problem of maintaining the overall consistency of the system during evolution. The three major components of an object-oriented system are considered to be the schema, the objects, and the program code (methods), where the schema essentially consists of the class definitions. The system consistency, then, consists of three parts: conceptual, structural, and behavioral consistency. Conceptual consistency relates to the schema design, structural consistency refers to the objects, and behavioral consistency refers to the program (methods) of the system. The framework presents a high-level process of how to automatically maintain the consistency of a system. As a semantic guideline for the update process, the original and the updated system need to be behaviorally equivalent. For this purpose, the framework contains a formal definition of behavioral equivalence and system consistency.

• The demonstration of the feasibility of the framework. A general object-oriented data model is defined using class graphs to describe the structural aspects of an object-oriented schema. A comprehensive set of primitive schema transformations is presented including, for each transformation, a) a set of preconditions that guarantees conceptual consistency, b) an object transformation to preserve structural consistency, and c) a program transformation to maintain behavioral consistency. Program transformations are applied to a variety of object-oriented language models: (1) to CLOS, as representative of the untyped model; (2) to C++, as representative of the typed model; and (3) to propagation patterns, as representative of the adaptive software model. For the adaptive software model, behavioral equivalence and consistency are proved formally, utilizing a new formal and executable semantics for adaptive software and a newly developed set of proof techniques. An implementation of the evolution framework is suggested with a metaobject protocol, and outlined with Closette.

• The design of a set of fully dynamic algorithms to maintain the median in an evolving tree.

This contribution goes beyond the stated goal and represents an additional important result in an area related to evolution. A system that is designed for evolution should not only be able to maintain its consistency, but it should also employ algorithms that take advantage of the fact that the system is embedded in an evolving environment. Such algorithms are called dynamic algorithms. In contrast to a static (off-line) problem, where a solution to a fixed problem instance is sought, dynamic problems assume that a solution for a given problem instance has been found previously, and need to find a solution for a "slightly" modified (evolved) instance. The goal of a dynamic algorithm is to effectively exploit the fact that a previous solution is known. A successful dynamic algorithm is expected to be faster than the corresponding static algorithm, which needs to compute a solution without previous knowledge.

We study the problem of maintaining the set of medians and their minimal objective function value in a tree while the tree is undergoing repeated dynamic modifications. The supported modifications are node insertion and deletion as well as node and edge weight changes. The employed data structure is such that only O(D) nodes need be traversed to update the tree after a modification, where D is the diameter of the tree. In contrast, the optimal static algorithm needs to traverse O(n) nodes to compute the median, where n is the number of nodes in the tree.
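For orientation, the static side of this comparison can be sketched as follows. The sketch assumes the usual objective function for a tree median, namely the sum over all nodes v of weight(v) times the distance from the candidate node to v; the passage above does not spell the objective out, so this is an assumption. It uses a standard rerooting technique and is not necessarily the linear-time algorithm developed in Chapter 11, nor does it show the dynamic data structure. The function name `tree_medians` and the input format are hypothetical.

```python
from collections import defaultdict


def tree_medians(node_weight, edges):
    """Return (set of medians, minimal objective value) of a weighted tree.

    node_weight: dict mapping each node to a non-negative weight
    edges: iterable of (u, v, length) triples that form a tree
    """
    adj = defaultdict(list)
    for u, v, length in edges:
        adj[u].append((v, length))
        adj[v].append((u, length))

    # Depth-first traversal from an arbitrary root; parents precede children.
    root = next(iter(node_weight))
    parent = {root: None}
    parent_len = {root: 0}
    dist = {root: 0}
    order = []
    stack = [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v, length in adj[u]:
            if v != parent[u]:
                parent[v] = u
                parent_len[v] = length
                dist[v] = dist[u] + length
                stack.append(v)

    # Accumulated weight of every subtree, computed bottom-up.
    acc = dict(node_weight)
    for u in reversed(order):
        if parent[u] is not None:
            acc[parent[u]] += acc[u]
    total = acc[root]

    # Objective at the root, then rerooting across one edge at a time:
    # moving from p to child c brings acc[c] closer and the rest farther away.
    cost = {root: sum(node_weight[v] * dist[v] for v in order)}
    for u in order[1:]:
        cost[u] = cost[parent[u]] + parent_len[u] * (total - 2 * acc[u])

    best = min(cost.values())
    return {u for u in order if cost[u] == best}, best
```

Every node is visited a constant number of times, which matches the O(n) traversal cost quoted for the static case. A dynamic algorithm would try to avoid recomputing `acc` and `cost` for the whole tree after each modification.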

1.4 Thesis Outline

This thesis is organized as outlined in Figure 2, which illustrates the structure of the thesis and how the chapters are interrelated.

Chapter 2, Related Work, presents the relationship of this work with other research, and provides further motivation and context.

Chapter 3, A Framework for Evolution, describes a framework for software evolution with emphasis on the problem of maintaining consistency and behavior of the system. It presents a thorough overview of concepts and terminology including definitions for conceptual, structural, and behavioral consistency, as well as behavioral equivalence. Furthermore, this chapter gives an overview of issues arising during schema, object, and program evolution.

Chapter 4, The Kernel Data Model, defines the kernel data model that is used in most of the subsequent chapters; in particular, it introduces the notions of class graph and object graph.

Chapter 5, Primitive Transformations, describes an extensive set of primitive schema transformations including preconditions that ensure conceptual consistency. In addition, the object evolution process associated with each transformation is specified as an object transformation.

Chapter 6, Class Graph Extension, analyzes in detail the object-extending class graph transformations and their impact on the object store. An algorithm is devised to decompose an object-extending transformation into a sequence of primitive schema transformations.

Chapter 7, Class Graph Abstraction, uses the class graph extensions to define candidate class graphs for maximal reusability. An algorithm is presented that finds the largest class graph such that two input class graphs are extensions of it.

[Figure 2 diagram: the chapters shown as boxes connected by dependency arrows. The boxes are Introduction (Chapter 1), Related Work (Chapter 2), Evolutionary Framework (Chapter 3), Object-Oriented Data Model (Chapter 4), Primitives (Chapter 5), Extension Transformations (Chapter 6), Schema Abstraction (Chapter 7), Abst. Superclass Transformations (Chapter 8), Adaptive SW (Chapter 9), Information Acquisition (Chapter 10), Median (Chapter 11), Propagation Patterns (Chapter 12), C++ (Chapter 13), CLOS (Chapter 14), EMOP (Chapter 15), and Conclusions (Chapter 16).]

Figure 2: Thesis outline and dependency between chapters

Chapter 8, The Abstract Superclass Rule, discusses advantages and disadvantages of the abstract superclass rule as a style rule in software engineering. As part of this chapter, the abstract superclass transformation is introduced and its impact on the object-oriented system analyzed.

Chapter 9, Adaptive Software, presents the paradigm of adaptive object-oriented software as an effective change avoidance approach using the separation of concerns technique. The primary contribution of this chapter is the definition of a novel formal semantics for adaptive software.

Chapter 10, Minimizing Information Acquisition Cost, represents a slight aside from the main topic. This chapter formalizes the problem of finding an optimal hub for collecting a set of data values distributed in a connected object structure. An optimal algorithm is presented for finding such a hub, which can be used to automatically adjust the hub when the class structure changes.

Chapter 11, On Medians in Dynamic and Static Trees, continues the aside by providing the theoretical background of the information acquisition problem. A theory for characterizing single and multiple medians is developed which yields a novel linear-time algorithm for computing the median. Moreover, a set of fully dynamic algorithms is developed to maintain the median in a tree during the evolution of the tree.

Chapter 12, Maintaining Behavior and Consistency (Propagation Patterns), analyzes the behavior preserving problem of schema evolution for propagation patterns. The essential question is when propagation patterns adapt themselves to schema changes due to their robustness, and when they need to be changed to maintain behavior.

Chapter 13, Maintaining Behavior and Consistency (CLOS), completes the investigation of the consistency and behavior maintenance problem with the case of CLOS.

Chapter 14, Maintaining Behavior and Consistency (C++), investigates the impact of schema transformations on the consistency and behavior of C++ programs. This chapter closes with a comparison of the three programming languages, propagation patterns, C++, and CLOS, with respect to their changeability.

Chapter 15, Evolutionary Metaobject Protocol, describes the implementation of the evolutionary framework with the metaobject protocol of Closette, making it a self-maintaining reflective system.

Chapter 16, Conclusions, summarizes and evaluates the key ideas of the thesis. It also contains a list of open questions and possible areas of future research into software evolution.

The core of the thesis consists of Chapters 3, 9, 11, 12, and 15.

Chapter 2

Related Work

Code management is important because poor code management costs more money and time than good code management.
Kevin Jameson [Jam94]

This chapter provides further motivation for the general problem of the thesis by putting it into context with related work. The chapter establishes the state of the art in the area of object-oriented software evolution, and shows specifically what has been done so far, and more importantly, what has not been done. Further discussion of related work, especially where it is specific to a particular chapter, is distributed throughout the thesis.

2.1 Schema evolution in database systems

The area of software evolution has gained tremendous popularity in the research community during the last few years. The advent of commercial database systems supporting schema evolution, such as ORION [BKKK87], O2 [BDK92], or GemStone [PS87], has accentuated this trend. The literature in this area is quite extensive. Interestingly, most of the work is recent and has been done primarily in the object-oriented database field [SZ86, BKKK87, PS87, BCG+87, AH88, Osb89, LH90, Tre91, Bar91, Cas91, Zic92, Ber92, EvBD92, Sch93, SGD93, TS93, RR94, FMZ94b].

Special attention should be given to the seminal work by Banerjee et al. [BKKK87] at MCC, who developed the ORION object-oriented database system and established a standard set of schema transformation primitives. These primitives were adapted and utilized in various forms by many other schema evolution research groups. The theory and implementations presented in the above papers focus on maintaining structural consistency during the evolution of an (object-oriented) database schema; that is, the proposed systems guarantee the correctness of the performed schema changes and reflect the schema changes in the persistent instances of the database. The issue of behavioral consistency has been dealt with only marginally.

A first overview of existing approaches to schema evolution is given by Eduardo Casais in his thesis [Cas91]. In addition, Casais proposes an algorithmic approach to object-oriented software evolution which allows for automatic restructuring of a class hierarchy when classes are added to it. The hypothesis of his approach is that design flaws can be uncovered at the time a hierarchy is extended with an additional object description. The approach deals in depth with all the aspects of class hierarchy restructuring but addresses their impact neither on the object base nor on code fragments.

2.2 Refactorings

At the University of Illinois, a group headed by Ralph Johnson has investigated the refactoring approach, a program restructuring aid for designing object-oriented application frameworks [OJ90, Opd92, JO93]. Refactorings do not themselves change the behavior of a program, but they restructure it in a way that makes the software easier to extend and reuse. Opdyke analyzed a comprehensive set of 26 primitive and 3 high-level C++ refactorings [Opd92], similar in flavor to the primitive transformations presented in Chapter 5. For each refactoring, a set of preconditions is given which is claimed to ensure that the refactoring preserves the behavior of the program. The following notion of behavioral equivalence is used.

"Semantic equivalence is defined as follows: let the external interface to the program be via the function main. If the function main is called twice (once before and once after a refactoring) with the same set of inputs, the resulting set of output values must be the same." [Opd92, page 40]

The major difference between refactorings and the work presented here is that refactorings do not take into account persistent objects. This becomes apparent in the above notion of behavioral equivalence. For object-oriented programs that do not support persistence, objects themselves cannot be part of the output; only their constituting base types (e.g., integers, strings, and so forth) are "visible" as output. Since base types never change, they can actually be compared before and after a refactoring. However, if objects are persistent then they too must be considered as output (and input, for that matter), and thus must be compared before and after a change. But when a refactoring is applied, the type of the objects might be changed and, as a result, the objects become incomparable. This is obviously not considered in the above definition. In contrast, the framework presented here introduces a different notion of object-equivalence which explicitly uses the object transformation to account for the correspondence between old and new types of objects. This allows us to formally define when two programs are behaviorally equivalent.

A related problem of refactorings is that they leave it completely open how "new" objects are created as compared to how the "old" objects were created. Moreover, how exactly objects differ in the new and old version of the programs is not considered. Even when no persistence is supported, there might be many possible ways to "transform" the corresponding objects. And for each such object transformation, a different code transformation might be necessary to guarantee behavior preservation.

To ensure the semantic equivalence given in the above quote, for each primitive transformation a set of preconditions is provided that must hold before the primitive can be applied. No actual account is given of which behavioral inconsistencies might arise as a result of applying a primitive. Furthermore, no proofs are given that behavioral equivalence is actually maintained. This comes as no surprise given the richness of the C++ language. In contrast, Chapter 12 will prove in detail for adaptive software that the original and the transformed programs are behaviorally equivalent.

Another difference is that refactorings are formulated in terms of source code transformations at a lower level of abstraction, compared to the behavior preserving transformations, where updates are defined at the schema level. In addition, the refactorings presented in [Opd92] are C++ specific, whereas the behavior preserving transformations are applied to a variety of languages. An implementation of refactorings for C++ is in a prototype stage, and has been sketched in [Opd92].
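The role that the object transformation plays when comparing persistent objects across a change can be illustrated with a small sketch. It is not the formal definition used in this thesis. The classes `OldPoint`, `NewPoint`, and `Coordinates` and the functions `transform`, `old_shift`, `new_shift`, and `equivalent_on` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class OldPoint:
    """Object type before the schema change."""
    x: int
    y: int


@dataclass
class Coordinates:
    x: int
    y: int


@dataclass
class NewPoint:
    """Object type after the schema change: x and y moved into a part object."""
    position: Coordinates


def transform(old):
    """Object transformation that accompanies the schema change."""
    return NewPoint(Coordinates(old.x, old.y))


def old_shift(point, dx):
    """Program fragment of the original system."""
    return OldPoint(point.x + dx, point.y)


def new_shift(point, dx):
    """The same program fragment, transformed for the new schema."""
    return NewPoint(Coordinates(point.position.x + dx, point.position.y))


def equivalent_on(old_input, dx):
    """Compare persistent output objects across the change.

    The old output is mapped to the new schema first, because objects of
    the old and the new type cannot be compared directly.
    """
    expected = transform(old_shift(old_input, dx))
    actual = new_shift(transform(old_input), dx)
    return expected == actual
```

A direct comparison such as `old_shift(p, 3) == new_shift(transform(p), 3)` is false simply because the two results have different types, which is the incomparability problem described above.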

2.3 Program restructuring

Griswold [Gri91] presents a transformation-based tool that automates program restructuring by means of meaning-preserving transformations. The tool is based on a model that exploits the preservation of data flow dependence and control flow dependence. The work focuses on transformations of the syntactic constructs of a block-structured language. His language of choice was Scheme, mostly because a tool was available for generating and manipulating program dependency graphs. The main contribution of his work is the technique of using program dependency graphs and control flow graphs to describe source-to-source transformations. An interesting generalization of his work would be to apply this technique to object-oriented languages.

Other techniques to analyze the behavioral aspects of methods and their interactions have been proposed in [Wal91b, HTY89]. Waller [Wal91b] develops a formal model to capture the evolution of recursion-free method schemas and to obtain basic results on such evolutions. The core of this work is the presentation of an algorithm for incremental consistency checking. Hull et al. [HTY89] study issues of reachability, nontermination and well-typedness in the context of restricted classes of object-oriented database systems. For this purpose they develop a theoretical model for analyzing the behavior of method execution. In itself, this work does not provide solutions for behavioral consistency, nor does it define any update mechanism. But it constitutes a theoretical foundation and a workable formalism for studying the behavior of object-oriented database systems.

Erradi et al. [EvBH92, EvBD92] investigated the evolution of executable object-oriented specifications of distributed systems. The fact that the specifications are developed with formal methods allows for a systematic way of propagating changes at the design level down to the implementation level (modules, code). For this purpose, an object-oriented specification language, called Mondel, was developed. The implementation of the transformations was done in RMondel, a reflective object-oriented specification language suitable for the specification and modeling of distributed systems. The reflective nature of the system was exploited to provide for dynamically modifiable specifications. Only those type modifications were considered for which the evolved types conform to the old ones. In particular, the following structural modifications were studied: addition of attribute, change of attribute type, addition of operation, change of operation signature, making a type a supertype, and addition of a new type.

2.4 Program transformations

Palsberg and Schwartzbach present an extensive set of algorithms for the type system of their BOPL language [PS94]. The BOPL language was designed to accommodate several type systems within the same framework. In particular, the full BOPL language can model eight distinct language types: any combination of typed or untyped, with or without inheritance, and with or without genericity. The type system algorithms include type checking, type inference, and expansion of inheritance and substitution. They are all formulated as program transformations from one BOPL program text to another; that is, they are syntactic rewritings. The transformations are behavior preserving in the sense that the transformed program must have the same semantics as the original one. This property makes them very similar both to the transformations presented in this thesis and to refactorings.

Program transformations have also been used to transform formal specifications into executable programs [HKB93, Par90]. An integral part of this process is that each step corresponds to a meaning-preserving transformation. Studies of evolving specifications have been conducted in [BLY93], investigating software transformations that go beyond meaning-preserving program transformations.


2.5 Language preserving transformations

Bergstein's work [Ber94b] on the evolution of object-oriented systems focuses on maintaining functionally analogous programs in the presence of language-preserving transformations. He uses extended class dictionary graphs to model object-oriented schemas and associates with each class dictionary graph a grammar. The grammar defines an object language which is used to describe instances of the schema through sentences in the language. The grammar can be used to convert a sentence into an object (parsing), and vice versa (printing). When the intersection between the original object language and the transformed object language is not empty, the original methods are transformed in a way that gives the same output for all the objects in the intersection.

A restriction of the extended class dictionary graphs is that they are limited to specifying only LL(1) grammars. Moreover, only tree objects are allowed; that is, no kind of object sharing can be modeled. Since input and output objects are modeled by extended class dictionary graphs, it is even required that programs may not change objects so that they share subobjects. For real-world applications, this restriction seems too stringent.

Jointly with the author, Bergstein has studied how to maintain behavior and consistency in C++ and CLOS for the object-extending transformations [BH93]. Chapters 13 and 14 extend this work, showing results for a more complete set of transformations and providing a full set of code examples. Bergstein also analyzed the object-equivalent transformations [Ber91], which were integrated and generalized by Lieberherr, Xiao, and this author into the object-extending transformations [LHX94] (see Chapter 6). Both can be viewed from a language-preserving viewpoint as well as from within the behavior and consistency preserving framework.

2.6 Change avoidance

The philosophy behind the change avoidance approach is quite different from the ones discussed so far. While the conventional approaches employ a brute-force method to fix everything that becomes inconsistent after a schema update, the change avoidance approach employs a more subtle method whose goal is to engineer software in a way that makes it adaptive and robust to schema changes. Change avoidance tries to "fix" inconsistencies before they occur. Thus, change avoidance can be characterized as an active, as opposed to reactive, update mechanism. Considering the fact that change propagation is usually expensive, especially when changes are frequent and/or the database is huge, it is well worth the effort to design robustness into an application. Robustness comes in two facets: structural and behavioral robustness.


2.6.1 Structural change avoidance

Structural change avoidance has been particularly interesting for the database community with respect to the organization of the persistent data store. The goal of this research area is to avoid database restructuring and reorganization after a schema update. So far, a number of researchers have addressed this topic [RR94, TS93, Ber92, Zic92, ALP91, BKKK87, SZ86], also in non-object-oriented databases [GR93]. The idea common to most of this work is the decoupling of the logical object model from its physical representation, which makes instances immune to changes of the object model. A standard way to achieve this decoupling is by means of views on the database [SZ86, RR94, TS93, Ber92, ALP91]. There, changes are not actually performed on the underlying database; instead they are simulated by changing external views. Another example of a change avoidance technique is the organization of objects into transposed files [EF90]. Transposed files associate one file with each variable of a class. The representation of an instance is thus spread over several files and only reconstituted on demand.

In an earlier form, the idea of change avoidance was introduced in the classical three-schema (or ANSI/SPARC) architecture for database management systems [ANS75, ANS86]. The main goal of the separation of software into three levels is to provide both logical and physical data independence. The internal schema represents the lowest, so-called physical level and is concerned with the physical layout of the data. At the central level is the conceptual schema, which describes the data logically and abstracts from the physical description. Finally, at the highest level is the external schema. The external schema is sometimes also called an external view (see above) and usually represents only that part of the conceptual schema that is used by a particular application.
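The transposed-file organization mentioned above can be illustrated with a minimal sketch. It is not taken from [EF90]; the class name `TransposedStore` and its methods are hypothetical, and in-memory dictionaries stand in for files. The point is only that each instance variable lives in its own column, so adding a variable to a class leaves the stored instances untouched.

```python
from typing import Dict, Optional


class TransposedStore:
    """Keeps each instance variable of a class in its own 'file' (a dict here)."""

    def __init__(self, *variables: str) -> None:
        # One column per variable, mapping object id -> value.
        self.columns: Dict[str, Dict[int, object]] = {v: {} for v in variables}

    def insert(self, oid: int, **values: object) -> None:
        # Spread the representation of one instance over the columns.
        for name, value in values.items():
            self.columns[name][oid] = value

    def add_variable(self, name: str) -> None:
        # A schema change only creates a new, empty column; the existing
        # columns (and thus the stored instances) are not rewritten.
        self.columns[name] = {}

    def load(self, oid: int) -> Dict[str, Optional[object]]:
        # Reconstitute an instance on demand from all columns.
        return {name: col.get(oid) for name, col in self.columns.items()}


store = TransposedStore("name", "ssn")
store.insert(1, name="Peter", ssn="032-72-5372")
store.add_variable("dept")
peter = store.load(1)
```

In this sketch an instance written before the schema change simply reports no value for the new variable when it is loaded.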
A variation on change avoidance is the lazy (or deferred) database update mechanism [FMZ94b, FMZ94a]. While lazy database updates do not avoid change, they try to defer change as much as possible. In contrast, eager (or immediate) database updates propagate schema changes to the database immediately. Which approach to use is a trade-off that has to be made by the system administrator. On the one hand, eager conversion potentially requires locking the entire database until all objects are updated. This effectively stops all applications from accessing the database during the update. After the locks are released, no run-time overhead is incurred by this approach. On the other hand, lazy conversion does not suffer from performance degradation after the schema update, but incurs some space and run-time overhead during normal operation of the database. Upon every object access, the system needs to check whether a conversion is necessary. For that purpose, the system needs to maintain a history of all prior schema updates and their conversion functions [FMZ94b]. A more detailed discussion of object update strategies will be given in Section 3.3.
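The difference between eager and lazy conversion can be made concrete with a small sketch. It is an illustration under simplifying assumptions, not the mechanism of [FMZ94b]: objects are plain field dictionaries, the schema version is the length of the conversion history, and all names (`ObjectStore`, `update_schema`, and so on) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A conversion function maps an object's fields from schema version v to v + 1.
Conversion = Callable[[Dict[str, object]], Dict[str, object]]


@dataclass
class StoredObject:
    version: int
    fields: Dict[str, object]


@dataclass
class ObjectStore:
    conversions: List[Conversion] = field(default_factory=list)  # update history
    objects: Dict[str, StoredObject] = field(default_factory=dict)

    @property
    def schema_version(self) -> int:
        return len(self.conversions)

    def put(self, key: str, fields: Dict[str, object]) -> None:
        self.objects[key] = StoredObject(self.schema_version, dict(fields))

    def update_schema(self, conversion: Conversion, eager: bool = False) -> None:
        self.conversions.append(conversion)
        if eager:
            # Eager: convert every object now (a real system would lock the store).
            for obj in self.objects.values():
                self._convert(obj)

    def _convert(self, obj: StoredObject) -> None:
        # Replay all conversions the object has not yet seen, in order.
        while obj.version < self.schema_version:
            obj.fields = self.conversions[obj.version](obj.fields)
            obj.version += 1

    def get(self, key: str) -> Dict[str, object]:
        obj = self.objects[key]
        self._convert(obj)  # Lazy: the version check happens on every access.
        return obj.fields


lazy_store = ObjectStore()
lazy_store.put("peter", {"name": "Peter"})
lazy_store.update_schema(lambda f: {**f, "dept": None})
version_before_access = lazy_store.objects["peter"].version
fields_after_access = lazy_store.get("peter")
```

The sketch shows both costs named in the text: the lazy store must keep the whole conversion history, and every `get` pays for a version check even when no conversion is needed.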


2.6.2 Behavioral change avoidance

Comparatively little attention has been given to the issue of behavioral change avoidance, which tries to avoid or minimize behavioral inconsistencies due to schema modifications. Currently, two approaches in this direction can be identified.

In the database area, some of the work on views has been extended so that the views support a limited independence between manipulation programs and schema [SZ86, ALP91, TS93, RR94]. One such system, proposed by Skarra and Zdonik [SZ86], manages class evolution by using versions and exception handling mechanisms. A problem common to all view approaches is that in some cases, especially during the initial design and testing of a system, the applied updates are not just other "views" of the system but are supposed to actually change the core of the system. For example, if a design flaw is discovered, then simply defining a view does not actually remove the flaw but only simulates the removal.

In the programming languages area, the adaptive software project of the Demeter team at Northeastern University [Lie96, Xia94, SLHS94] supports change avoidance by minimizing dependencies between the program text and the class structure [LZHL95, LSLX94, LX93a, LHSLX92]. Only a brief description of adaptive software is given here; a full definition including a formal semantics of adaptive software is found in Chapter 12.

Traditional object-oriented programs build undesirable dependencies when hardcoding the details of the class structure into the methods. Because the class structure specifies the relations between objects, and since methods use these relations to implement object collaboration by sending messages along the relation links, the class structure becomes hardcoded into the methods. Consequently, changes to the class structure require changes to the methods. In contrast, using adaptive software, changes to the class structure require only minimal program maintenance. An adaptive program is written at a higher, schematic level that abstracts the details of object traversal and navigation. Different tasks tend to exhibit similar traversal paths in the class structure [WMH93]. These traversal patterns are effectively separated from the other aspects of the task by using propagation patterns, which employ a polymorphic reuse mechanism in the program specification [LZHL95]. A propagation pattern becomes executable when it is instantiated with a particular class graph. In this sense it is a kind of metaprogramming [Kic92, WY90]. Propagation patterns are written with only minimal and indirect knowledge of the class structure since they are implicitly parameterized by the schema. This allows propagation patterns to be polymorphically reused (instantiated) with an infinite set of schemas. Therefore, the consistency and behavioral equivalence of the high-level adaptive program is maintained in many cases of class structure evolution.
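The idea of a traversal specification that is instantiated with a class graph can be sketched as follows. This is a strongly simplified illustration, not the Demeter notation or the formal semantics of Chapter 12: a class graph is assumed to be a dictionary from class names to their parts, the specification consists only of a source and a target class, and the function `instantiate` is hypothetical.

```python
from typing import Dict, List

# Assumed representation: class name -> {part name -> class of that part}.
ClassGraph = Dict[str, Dict[str, str]]


def instantiate(graph: ClassGraph, source: str, target: str) -> List[List[str]]:
    """Return every attribute path that leads from `source` to `target`."""
    paths: List[List[str]] = []

    def walk(cls: str, path: List[str], seen: frozenset) -> None:
        if cls == target:
            paths.append(path)
            return
        for part, part_cls in graph.get(cls, {}).items():
            if part_cls not in seen:  # guard against cyclic class graphs
                walk(part_cls, path + [part], seen | {part_cls})

    walk(source, [], frozenset({source}))
    return paths


# The same specification, "from Company to Salary", against two class structures.
flat = {
    "Company": {"employees": "Employee"},
    "Employee": {"salary": "Salary"},
}
nested = {
    "Company": {"divisions": "Division"},
    "Division": {"employees": "Employee"},
    "Employee": {"salary": "Salary"},
}

flat_paths = instantiate(flat, "Company", "Salary")
nested_paths = instantiate(nested, "Company", "Salary")
```

The specification itself does not mention `employees` or `divisions`; the concrete navigation code is derived from whichever class graph is supplied, which is why inserting the `Division` class requires no change to the specification in this sketch.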
A robustness motivation similar to the one in adaptive software is found behind several design patterns described in [GHJV94], including the Visitor, Observer, and Chain of Responsibility patterns.

Other change avoidance approaches have been proposed to support evolution at the more coarse-grained level of tool and module integration, minimizing dependencies between tools [NS87, SN92].

Chapter 3

A Framework for Evolution

If we do not learn to manage change, we will become its victims, not its beneficiaries.
Edward H. Bersoff and Alan M. Davis [BD91]

This chapter takes a closer look at the general issues involved in software evolution. It introduces commonly used terminology and sets the stage for later chapters. At the center of the chapter is the framework for behavior preserving evolution. The chapter is organized as follows. Section 3.1 gives a quick tour of the basic issues. It answers the four questions "Who initiates and performs changes?", "What is changed?", and "How and when is it changed?". By answering these questions, we give an outline of the subsequent sections of this chapter. Sections 3.2, 3.3, and 3.4 discuss schema, object, and method evolution, respectively. The core of this chapter is Section 3.5, which details a framework for consistency and behavior maintaining software evolution.

3.1 Introduction

A software system models those parts of the real world that are relevant to the application the system is intended for. These relevant parts are called the domain of the software system. An object-oriented system uses the object-oriented data model to describe its domain; it models with objects. The structure and behavior of an object are described by its class. The collection of all class definitions that describe the domain objects is called the schema. A data model provides the constructs for writing data descriptions. The object-oriented data model provides constructs like instance variables, methods, inheritance, and so forth to describe classes. It is important to distinguish between data and the description of data. In this context, the notions of intensional and extensional data are used to make this distinction. Intensional data denote the set of all possible data values (objects) the system can model. Extensional data denote the set of the actual data values present at a certain point in time [Bra87].

[Figure 3 (diagram not reproduced): data model, schema, and object store, stacked by level of abstraction. The extension of each level defines the intension of the level below, and the objects in the object store model the real world.]

Figure 3: Relation between extensional and intensional data

The relation between intensional and extensional data for an object-oriented database system is illustrated in Figure 3. The contents of the data model, its extension, describes the schema, and thus defines the intension at the schema level. The contents of the schema, its extension, describes the object store, and thus builds the intension at the level of the objects. The object store contains as its extension all the objects of the system which build the model of the real world (domain).

In object-oriented systems, the intension is described by classes which define the set of all possible object values (see Figure 4). The objects at the data level are the entities that model the real world, and build the extension of the classes in the schema. The schema classes define the intension at the data level. When the schema classes are considered as objects, too, they build the extension of the classes at the data model level, the meta level. The classes at the meta level define the intension at the schema level. When these classes, too, are considered as objects, they build the extension of even higher level classes. In principle, this scheme can be extended to arbitrarily high levels of abstraction. This highlights an interesting duality between objects and classes that is illustrated in Figure 4: objects build extensional data whereas classes define intensional data.

[Figure 4 (diagram not reproduced): at the data model level (meta level), the class Class with attributes name, super, attributes, and methods describes the schema level. At the schema level, the classes Person (name, ssn) and Employee (ISA Person; dept) describe the data level. At the data level, an instance (Peter, 032-72-5372, CCS) models the real world. Each level is the extension of the level above and the intension of the level below.]

Figure 4: The duality between objects and classes

Like the real world they model, software systems constantly change and evolve. When the domain of a software system changes, then so must the system itself in order to maintain the consistency between the domain and the model of the domain. Two different types of domain changes can be distinguished. The first type of domain change is such that the objects that represent the domain can be updated by means of normal system operations; that is, by the execution of methods of domain objects. In this case the intension defined by the schema classes is still sufficient to model the new extension. The second type of domain change is such that the objects modeling the new domain are not elements of the intensional data defined by the schema. In this case normal system operations are incapable of updating the existing objects, and a change to the schema itself becomes necessary. The second type of domain change requires software evolution. Software evolution can be viewed as system operations by other means. It is applied whenever normal system operations become insufficient to update the model's view of the real world.

The above characterization of domain changes allows for the characterization of changes of the software system. Again two types are distinguished: extensional and intensional changes. Extensional changes manipulate extensional data, and similarly, intensional changes manipulate intensional data. Depending on the level of abstraction (see Figure 4), the types of changes can manipulate different entities of the system. At the lowest level of the system, normal system operations perform extensional changes, whereas software evolution performs intensional changes. In other words, extensional changes do what the system is designed to do while intensional changes modify the design of the system. Both types respond to changes in the domain in order to update the software model of the domain.

At the next higher level, the schema level, intensional changes of the lower level become extensional changes; that is, changing the classes now represents extensional changes, whereas changing the data model classes represents intensional changes. The latter are triggered whenever changes to the schema objects cannot be achieved by execution of operations defined in data model classes.
In turn, completing the descent down the abstraction echelon, the changes to the schema objects were triggered because changes to the data objects could not be achieved by operations defined in the schema classes. In turn, the changes to the data objects were triggered by changes in the domain. As an example, suppose the data model supports only single inheritance, and a domain change is such that a data model with multiple inheritance is required. This domain change would trigger a change over two levels up to the data model classes in Figure 4 to change their attribute super to reference a collection of superclasses instead of a single superclass. This would then allow schema classes to inherit from several superclasses, and so forth. The above process can be repeated to even higher levels of abstraction. Hence, depending on how powerful the system is, and how drastic the domain changes are, several levels of abstraction can be involved.

System designers rarely have the chance to modify their data model ad libitum. Usually, they have to live with one data model, and work their way around it to accommodate the domain changes. For this reason, the remainder of this section will concentrate on the lowest level of the abstraction echelon. The higher levels will, however, be addressed later in Chapter 15 when discussing the evolutionary metaobject protocol.

Several questions arise naturally in the context of software evolution.

Who? Who performs and initiates changes to the software system?
What? What are the relevant elements of an object-oriented software system that can be changed?
How? How can the software elements be changed? What tools are available to change software?
When? When should the software elements be changed?

These questions will be answered briefly in the remainder of this section to give a quick tour of software evolution. Some aspects have already been discussed above, and many issues will be addressed in more detail in later sections and chapters.

3.1.1 Who performs and initiates changes?

Extensional changes are typically initiated by the user of the software system in order to manipulate the data objects. The update itself is performed by the system operations. This is what the system is designed to do. Intensional changes are initiated by the designer whenever extensional changes are not sufficient to keep the model up-to-date with the changes in the domain. The designer is also the person who has to perform the updates manually unless there is automatic support to manipulate the design of the system.

3.1.2 What is changed?

At the lowest level of abstraction, extensional changes to an object-oriented system are performed to the data objects by using the system operations. Intensional changes affect the object-oriented schema. In addition, intensional changes can be applied to the method implementations (the program) to change their functionality. Intensional changes can also be applied to the persistent object store to transform the objects into a consistent state after the schema has been updated. This typically is a transitional step after intensional changes to the schema. Assuming that the program works on a persistent object store (objects), three essential elements can be distinguished in an object-oriented system: schema, objects, and program (see Figure 5; footnote 1).

Footnote 1: The OMT methodology requires two additional elements, the dynamic model and the functional model [RBP+ 91], but as Rumbaugh et al. state, the object model (schema) is the most fundamental of the three.

[Figure 5 (diagram not reproduced): an object-oriented system with the three elements Schema, Objects, and Program, connected by edges labeled "dependency" and "computes with".]

Figure 5: The major elements of an object-oriented system

Each of the three elements in an object-oriented system has its own consistency rules. Extending the consistency terminology from [DZ91], we distinguish between conceptual consistency (schema), structural consistency (objects), and behavioral consistency (program/methods). The inherent structure of the system makes the three elements and their consistency conditions mutually dependent. The fundamental element is the schema since it represents an abstraction of both the objects and the methods. On one hand, a change to the schema is likely to affect both the objects and the code. However, changes to objects and code are not completely determined by schema changes. Given a schema change, the corresponding objects could be brought to a consistent state in several different ways. On the other hand, a direct change to the objects or the code might necessitate further changes to the schema.

One of the goals of object orientation has been to ease the effort needed to make changes to software systems. By closing the perception gap between objects in the domain and objects in the software system, changes in the domain can be more readily tracked down to the appropriate changes in the software. Furthermore, the strong encapsulation capabilities of objects limit the impact of code changes on other parts of the system. By adhering to the coding style rule proposed by the Law of Demeter [LH89], the ability to hide changes can be improved even more. Nevertheless, since objects have to interact with each other to achieve a coherent overall behavior, there is a limit to how far they can be isolated from each other, and thus, there always exist changes that affect other parts of the system.

Whenever intensional changes are applied, a software system evolves into its next version. Due to the interdependency of the system elements, initial changes trigger further changes if consistency has to be maintained within and between the system's elements. The intensional changes and the triggering of further changes are what we call software evolution. A simple change to one system component that does not also update the other inconsistent components leaves the system unusable. It is this cascade of evolutionary changes that is of particular interest to software engineers.


To summarize this issue, an important property of a software evolution mechanism is its ability to change the entire system from one consistent state into another. From the above arguments it becomes clear in which order the triggering mechanism affects the components of a system. There are two phases: the first one determines which changes are necessary by going upwards in the abstraction levels of Figure 4. The second one then performs the changes going downwards in the abstraction levels. Starting from the change in the domain, the first, upward phase checks which system operations would be sufficient to change the data objects. If no operations provide the necessary tools, then a change to the schema is performed. In principle, this phase could go to even higher levels of abstraction. It is assumed here that the change to the schema suffices. Following the schema change, the second, downward phase updates objects and methods to restore their consistency.

An additional motivation for a top-down update order is provided by the following argument. Software evolution, as any development activity, has to be done according to software engineering principles, and has to fit into established software life cycle models. Most life cycle models require a top-down approach proceeding from higher to lower levels of abstraction (e.g., the waterfall model or Boehm's spiral model [Boe86]). In accordance with these models it seems only natural to apply the initial intensional change to the schema and then proceed to methods and objects.

In the following chapters, the term software evolution will be used to denote the process of changing the schema, followed by changes to the objects and the methods to restore their consistency. Depending on which consistency is restored, the terms schema evolution, object evolution, or method evolution will be used. These topics will be discussed further in Sections 3.2, 3.3, and 3.4, respectively.

3.1.3 How to change?

There are basically two ways to specify the new schema: 1) using the schema description language to describe the full new schema without reference to the old schema, or 2) using a schema update language to describe a Δ-change relative to the old schema. The first approach is advantageous if the modifications are extensive, and describing the change in an update language becomes tedious. But describing only the new schema completely ignores the fact that the system is evolving from older to newer versions. Furthermore, it is wasteful when the changes are small and the schema is large. In addition, the information of what specifically has changed in the transition (the Δ-change) needs to be computed anyway in order to trigger further updates in a meaningful way.

The second approach, a schema update language, provides some guidance to the designer by specifying a set of primitive update operations (transformations), rather than allowing free, uncontrolled changes to the schema. The granularity of these transformations plays an important role. Basic primitive operations are determined by the richness of the underlying object model. In general, each entity of the object model can be combined with one of the four meta primitive operations add, delete, change, or rename to build a basic primitive operation. An example of a catalogue of such primitives has been established for the ORION data model [BKKK87]. More coarse-grained, and thus more powerful and often semantically more meaningful, transformations can be obtained by a sequence of basic primitives. If such a sequence is provided as a self-contained operation, then it is referred to as a composite primitive operation.

For the remainder of this thesis, we will follow the approach of using a schema update language to describe changes. A more detailed discussion of primitives and a catalogue of those used in our model is given in Chapter 5.
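How basic primitives combine into a composite primitive can be sketched in a few lines. The sketch is illustrative only and does not reproduce the catalogue of Chapter 5 or of ORION: a schema is assumed to be a dictionary from class names to attribute tables, and the names `add_attribute`, `delete_attribute`, `compose`, and `move_attribute` are hypothetical.

```python
from typing import Callable, Dict

# Assumed representation: class name -> {attribute name -> type name}.
Schema = Dict[str, Dict[str, str]]
Primitive = Callable[[Schema], Schema]


def _copy(schema: Schema) -> Schema:
    return {cls: dict(attrs) for cls, attrs in schema.items()}


def add_attribute(cls: str, attr: str, typ: str) -> Primitive:
    """Basic primitive: meta primitive 'add' applied to the entity 'attribute'."""
    def apply(schema: Schema) -> Schema:
        if attr in schema[cls]:
            raise ValueError(f"{cls}.{attr} already exists")
        new = _copy(schema)
        new[cls][attr] = typ
        return new
    return apply


def delete_attribute(cls: str, attr: str) -> Primitive:
    """Basic primitive: meta primitive 'delete' applied to the entity 'attribute'."""
    def apply(schema: Schema) -> Schema:
        if attr not in schema[cls]:
            raise ValueError(f"{cls}.{attr} does not exist")
        new = _copy(schema)
        del new[cls][attr]
        return new
    return apply


def compose(*steps: Primitive) -> Primitive:
    """Bundle a sequence of primitives into one self-contained operation."""
    def apply(schema: Schema) -> Schema:
        for step in steps:
            schema = step(schema)
        return schema
    return apply


def move_attribute(src: str, dst: str, attr: str, typ: str) -> Primitive:
    """Composite primitive: delete the attribute in `src`, then add it to `dst`."""
    return compose(delete_attribute(src, attr), add_attribute(dst, attr, typ))


old_schema = {"Person": {"name": "String", "dept": "String"}, "Employee": {}}
new_schema = move_attribute("Person", "Employee", "dept", "String")(old_schema)
```

Because each primitive returns a new schema instead of modifying its argument, the old schema remains available, and the sequence of applied primitives is itself a record of the Δ-change.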

3.1.4 When to change?

There are three approaches currently recognized that perform change operations at different points of time: surgery, versioning, and tailoring, corresponding to the time-of-change strategies now, never, and mixed, respectively [Cas91]. Surgery updates the software immediately and overrides the old software completely, keeping only the latest version in the system. Versioning never deletes old software but instead adds new versions to the system without losing old information. Finally, tailoring keeps the old schema stable and limits changes to deriving new subclasses and adapting their inherited properties.

As Casais points out [Cas91], tailoring techniques are useful for performing small adjustments to a schema. However, tailoring relies too heavily on the inheritance mechanism alone to be useful in general and can quickly lead to an incomprehensible class hierarchy.

Schema surgery updates the system instantaneously. Its power depends heavily on the employed primitive transformations. The more powerful the primitives are, the more powerful surgery becomes. If the primitives are too fine-grained, then large schema updates become a problem since they entail large sequences of small updates, each one triggering potentially expensive updates of the entire system. This underlines the importance of the ability to bundle several primitive transformations into a single logical transformation. Such a bundling of primitives is referred to as a transformation transaction. Transformation transactions are discussed in more detail in Chapter 5 but are otherwise not further considered.

Versioning has been used in a number of object-oriented systems such as Encore [SZ86] and ORION [BKKK87]. Its main appeal comes from the fact that several versions of the same software can be run simultaneously. Also, it is ideal for testing different paths of evolution while still allowing the system to be used continually. The major drawback of the versioning approach is the large overhead necessary both to maintain links between versions and to access objects. Another disadvantage is that even for small changes, which usually are not meant to build an entire new version, the full versioning mechanism needs to be invoked. Versioning has to deal with the same problems as surgery in terms of maintaining consistency of the system's components. The major difference is that versioning must additionally maintain the versioning mechanism.

For the purpose of simplicity, the approach followed in this thesis is schema surgery. It should be emphasized, however, that all the results are also applicable to versioning approaches.

There is another timing problem in software evolution: when to update the persistent object store. As was briefly discussed in the previous chapter, the common techniques are eager and lazy database update [FMZ94b, FMZ94a, Sch93]. A third technique, called screening, performs only logical database updates, but never updates objects physically.

3.1.5 Summary

In summary, software evolution consists of intensional changes applied by system designers to the schema, the objects and the program. Following current life cycle models, initial updates are performed on the schema and are propagated to objects and methods in order to maintain their consistency. Updates are chosen from a set of primitive transformations that are made available by the change management system. In this approach, updates are performed immediately by surgically altering the system. The rest of this chapter is organized as follows. Sections 3.2, 3.3, and 3.4, discuss the evolution of schema, objects, and methods in more detail. A general evolution framework and process is presented in Section 3.5.

3.2 Schema Evolution The schema of a software system focuses upon the essential characteristics of the system as seen by the designer of the system. In an object-oriented system, these essential characteristics consist of the classes, their attributes and method signatures, and the class relationships. The schema represents an abstraction of the system that helps the designer cope with its complexity. The higher level of abstraction hides the unnecessary details of the system and concentrates on the relevant aspects only. Its focus on the essential characteristics makes the schema ideal for reasoning about evolution of the system, and for introducing evolutionary changes. The central role of the schema for evolution had been recognized very early in the database community where changes to the database system are introduced through the schema, at the conceptual level. It is in this context that the term schema evolution was coined. Banerjee et al. [BCG+ 87] de ne schema evolution to be \the ability to dynamically make changes to the class

26

Chapter 3. A Framework for Evolution

de nitions and the structure of the class lattice." Delcourt and Zicari included subsequent updates of the object store in the term schema evolution and also pointed out the need for updates of the methods [DZ91]. In order to factor out the di erent processes occurring during evolution, we will use the term schema evolution strictly for the modi cations of the schema: schema evolution is the process of changing the schema of an object-oriented application while maintaining its consistency. The goal of schema evolution is thus to evolve a schema from one consistent state to the next. The consistency of the schema is referred to as conceptual consistency. The conceptual consistency of a schema is typically de ned by a set of schema invariants (axioms). A schema is said to be conceptually consistent (well-formed) if it satis es all schema invariants. During schema evolution, the schema invariants must hold in every quiescent state of the system. An example of a set of schema invariants is provided for the O2 database system [CPLZ91, DZ91]. The invariants state that (1) the class structure is a directed acyclic graph (DAG), (2) the attribute names and types, and the method names and signatures are all compatible (method overloading must follow the covariant condition), and (3) no name con icts occur. Another example of schema invariants are the ones from the ORION database system [BKKK87]. (1) Class lattice invariant: the class lattice is a singly rooted and connected directed acyclic graph with uniquely named nodes and labeled edges. (2) Distinct name invariant: all instance variables and methods of a class must have distinct names. (3) Distinct identity invariant: all instance variables and methods of a class have distinct identity. (4) Full inheritance invariant: a class inherits all instance variables and methods from its superclasses, except when a violation of (2) and (3) occurs. 
(5) Domain compatibility constraint: redefinitions of instance variables must follow the covariance rule.

Schema evolution is realized by applying schema transformations. The kinds of primitive transformations vary from system to system. They depend on the employed data model and the supported evolution capabilities. For each application of a primitive transformation, the resulting schema must be conceptually consistent. There are essentially two approaches to enforcing conceptual consistency: checking after the transformation (a posteriori) and checking before it (a priori).

In the former approach, schema consistency is checked after the schema transformation is performed. The advantage of this approach is that all transformations use a single check for their validation. The disadvantage lies in the fact that the check is performed after the schema is changed. Consequently, the transformation needs to be reversed in case the check fails. Furthermore, checking the entire schema is costly, and, when transformations are small and local, unnecessary for the most part.

Rather than checking schema consistency after performing a transformation, the second approach defines a set of preconditions for each kind of schema transformation. It must be proven that for each instance of the transformation, schema consistency will be maintained if (1) the schema is in a consistent state prior to the update, and (2) the preconditions are satisfied. Since preconditions are specific to each transformation, checking the consistency of the schema can be limited to the local range affected by the transformation. Moreover, no rollback mechanism needs to be supported since inconsistencies are discovered prior to the modification. A disadvantage of using preconditions is that they need to be formulated in addition to the schema invariants, and proofs of consistency preservation need to be provided.

A mixed approach was followed by the integrity constraint checker (ICC) proposed by Delcourt and Zicari [DZ91]. The ICC does not use preconditions but applies specialized consistency checks for each type of transformation. This scheme enables the ICC to indicate exactly where inconsistencies arise, if any. We will define a comprehensive set of primitive transformations in Chapter 5. The definitions include preconditions such that it is possible to follow both of the above approaches. For a discussion of the extensive related work available on schema evolution, see Chapter 2.
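
To make the contrast between the two checking approaches concrete, the following is a minimal Python sketch restricted to a single invariant (the class structure is a DAG) and a single hypothetical transformation (adding a superclass). The schema representation, a dictionary from class names to sets of direct superclasses, and all function names are assumptions made for illustration; they are not taken from O2, ORION, or the ICC.

```python
# Assumed toy schema: maps each class name to the set of its direct superclasses.

def is_dag(schema):
    """Global invariant check: the inheritance structure contains no cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {c: WHITE for c in schema}

    def visit(c):
        color[c] = GREY
        for s in schema[c]:
            if color[s] == GREY:
                return False              # back edge: cycle found
            if color[s] == WHITE and not visit(s):
                return False
        color[c] = BLACK
        return True

    return all(color[c] != WHITE or visit(c) for c in schema)


def reaches(schema, start, goal):
    """True if goal is start itself or a direct or indirect superclass of start."""
    stack, seen = [start], set()
    while stack:
        c = stack.pop()
        if c == goal:
            return True
        if c not in seen:
            seen.add(c)
            stack.extend(schema[c])
    return False


def add_superclass_posterior(schema, sub, sup):
    """Checking afterwards: apply, check the entire schema, undo on failure."""
    already_there = sup in schema[sub]
    schema[sub].add(sup)
    if is_dag(schema):
        return True
    if not already_there:
        schema[sub].discard(sup)          # reverse the transformation
    return False


def add_superclass_anterior(schema, sub, sup):
    """Checking beforehand: a local precondition, no rollback needed."""
    if reaches(schema, sup, sub):         # the new edge would close a cycle
        return False
    schema[sub].add(sup)
    return True
```

Both functions accept and reject the same requests on a schema that is consistent beforehand; the second one only inspects the classes reachable from the proposed superclass rather than the whole schema.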

3.3 Object Evolution

Due to the dependency between the schema and the objects (Figure 5), the evolution of the schema will often, but not always, result in inconsistencies of the objects. The consistency of the objects with their defining schema is referred to as structural consistency. The structural consistency conditions define when an object conforms with its schema. An object is said to be structurally consistent with its defining schema if it satisfies all consistency conditions. During evolution, the objects must be structurally consistent every time they are accessed. Object evolution is the process of changing the objects of an object-oriented system to restore the structural consistency with the changed schema.

Structural consistency is not the only invariant at the object level. There may be additional consistency constraints defined for the objects due to special requirements in the application domain which cannot be expressed with the object model. These consistency constraints evolve together with the rest of the system, and might make it necessary to check the new, structurally consistent object store for possible violations. Changing the contents of the new object store due to changed consistency constraints generally does not preserve the behavior of the application. Such additional changes to objects are thus considered to occur in the second phase of the evolution process, when the behavior is evolved (see Figure 1).

For an object-oriented database system (OODBS), the key focus of interest is the data. It is therefore not surprising that in the area of OODBSs, and especially in commercial OODBSs, object evolution is an important feature. This is also the reason why support for schema evolution generally includes object evolution capabilities.


In general, object transformation procedures are non-trivial and must deal with a number of issues, many of which are not yet fully solved. The existing approaches to object evolution can be categorized along two lines: time of update, and granularity. The granularity of the object evolution strategy refers to the context that is used to transform a single object. The strategy is either object based, class based or database based. One of the most elaborate object conversion mechanisms, the Object Transformer Generator (OTGen) [SKG88, LH90], allows for arbitrarily complex transformations on the contents of individual objects as well as the database as a whole.

3.3.1 Object evolution strategies

Depending on the time objects are updated, one distinguishes between immediate (eager) update, deferred (lazy) update, and no update (screening).

Eager updates convert all objects to the new schema before allowing access to the database to continue. For large object stores an immediate update of all the objects renders the system unavailable for an extensive period of time, which makes this approach unacceptable for certain applications. On the other hand, after the update is completed the entire object store is in a consistent state and can be used without further performance degradation. In particular, no knowledge about previous versions need be retained and the system behaves as if it had been developed from the ground up. This approach has been implemented by the GemStone system [PS87, BMO+ 89], and by ObjectStore [LLOW91].

Lazy updates delay the conversion of the objects until the time when they are actually accessed. In effect, the physical database update is separated from the logical database update. This allows the system to continue to function with a minimal period of interruption. Lazy conversion requires a test upon every access to the object store whether the accessed object has been converted or not. If the object is not up-to-date, then the conversion procedure is invoked. Since objects may date back several versions, a history of all updates including their conversion functions needs to be retained. Moreover, since it is hard to determine when all objects have been converted, testing continues even after the entire object store has been converted. Thus, the disadvantage of lazy conversion is that execution speed is compromised on a long-term basis, which might be unacceptable in certain applications where degradation of execution speed during critical phases is not tolerable. The lazy update approach has been taken by the ORION system [BKKK87], and was also proposed for the GOODSTEP project [FMZ94b].
An in-depth analysis of lazy updates and their implementation has been described by Ferrandina et al. [FMZ94b, FMZ94a].

Screening also separates physical from logical database update, but does not perform any physical updates. Whenever an object is fetched from the object store it is converted to its new type; the actual physical update is screened from the database. This approach is advantageous during experimental and testing phases of software evolution since it facilitates the reversion to the original state of the system. The disadvantage is that run-time performance is severely sacrificed since every access to an object is followed by a conversion, possibly over several revision steps.

Screening is often used in connection with versioning approaches to database evolution which were briefly discussed in Section 3.1. With object versioning, each evolution step simply defines a new version of the schema and the objects without affecting older versions. Thus, old schemas and objects persist indefinitely, and are tagged with a particular schema version number. It is then the responsibility of the run-time system to emulate (or simulate) the new interface to old objects. A combined screening/versioning approach has been implemented in the Encore system [SZ86] and has also been suggested by others [Cla92].

A qualitative model that allows us to compare the above three object evolution strategies is developed in the next subsection. A very versatile approach that integrates the three strategies has been proposed for the OBST database system by Schiefer [Sch93]. Schiefer also provides an extensive comparison and discussion of object evolution capabilities in current OODBs. A problem that has not been addressed in the literature is how sharedness of objects is affected by object conversion. Ideally, shared objects should remain shared after conversion.
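
The three update strategies can be contrasted in a small Python sketch. The class `ObjectStore`, its method names, and the representation of stored objects as `(version, payload)` pairs are hypothetical and chosen only for illustration; none of the cited systems is claimed to work this way internally.

```python
class ObjectStore:
    """Toy object store supporting eager, lazy, and screening updates (illustrative only)."""

    def __init__(self, objects, strategy):
        # objects: dict mapping an object id to a (version, payload) pair
        # strategy: "eager", "lazy", or "screening"
        self.objects = dict(objects)
        self.strategy = strategy
        self.version = 0
        self.converters = []      # converters[v] maps a version-v payload to version v + 1
        self.conversions = 0      # counts executed conversion steps

    def evolve(self, converter):
        """Logical update: register the conversion function for the new schema version."""
        self.converters.append(converter)
        self.version += 1
        if self.strategy == "eager":
            # Physical update of the whole store before access continues.
            for oid in self.objects:
                self.objects[oid] = self._convert(self.objects[oid])

    def _convert(self, entry):
        version, payload = entry
        while version < self.version:             # version check, possibly several steps
            payload = self.converters[version](payload)
            version += 1
            self.conversions += 1
        return (version, payload)

    def fetch(self, oid):
        entry = self._convert(self.objects[oid])
        if self.strategy == "lazy":
            self.objects[oid] = entry             # physical update on first access
        # Under screening the stored object is never overwritten.
        return entry[1]
```

In this sketch an eager store pays for all conversions inside `evolve`, a lazy store converts each object once on its first access, and a screening store converts on every access while leaving the stored data in its original version.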

3.3.2 A qualitative model

To give an intuition of the tradeoffs involved in the above update strategies, we develop and analyze the following simple qualitative model. The model considers only the transition from one version to the next such that any object is either in the current version, or can be converted to the current version by the application of a single conversion function. Let there be $N + 1$ objects $o_0, \ldots, o_N$ in the object store, and let $t(i)$ be the total time for accessing object $o_i$ using a lazy update strategy. The access time consists of three parts

\[ t(i) = t_{\mathrm{acc}}(i) + t_{\mathrm{check}}(i) + \delta(i) \cdot t_{\mathrm{conv}}(i) \tag{1} \]

where the individual terms denote the following:

$t_{\mathrm{acc}}(i)$ = access time to fetch object $o_i$ from the database
$t_{\mathrm{check}}(i)$ = time to check the version number of object $o_i$
$t_{\mathrm{conv}}(i)$ = time to convert object $o_i$ to the latest version
$\delta(i)$ = 1 if the version of object $o_i$ is not current, 0 otherwise

The version check is typically independent of the object that is being checked, $t_{\mathrm{check}}(i) = t_{\mathrm{check}}, \forall i$. Both $t_{\mathrm{acc}}(i)$ and $t_{\mathrm{conv}}(i)$ are typically dependent on the size or the class of the object. In a first approximation, they will be considered constant: $t_{\mathrm{acc}}(i) = t_{\mathrm{acc}}$, $t_{\mathrm{conv}}(i) = t_{\mathrm{conv}}, \forall i$. The magnitudes of the three constants compare as follows: $t_{\mathrm{acc}} \gg t_{\mathrm{conv}} > t_{\mathrm{check}}$. Equation 1 can now be rewritten.

\[ t(i) = t_{\mathrm{acc}} + t_{\mathrm{check}} + \delta(i) \cdot t_{\mathrm{conv}} \tag{2} \]

Let $F(i)$ be the relative frequency of object $o_i$ ($0 \le i \le N$) being accessed by a given application. The values $F(i)$ are normalized such that $\sum_{i=0}^{N} F(i) = 1$. In the simplest case, the number of accesses to object $o_i$ can be modeled as Bernoulli trials with a binomial distribution. If the number of accesses to the database is high, then Bernoulli's law of large numbers says that the relative frequency $F(i)$ gets arbitrarily close to the probability $p_i$ that object $o_i$ is accessed. Thus, we have $p_i \approx F(i)$, and define $q_i = 1 - p_i$. According to the binomial distribution, the probability $P_i(n, k)$ of object $o_i$ being accessed $k$ times within the first $n$ database accesses after a conversion is

\[ P_i(n, k) = \binom{n}{k} \, p_i^{k} \, q_i^{\,n-k} \tag{3} \]

Now we can define $P_i(n)$ to be the probability that object $o_i$ has not been accessed during the first $n$ database accesses ($k = 0$). With Equation 3 it follows that

\[ P_i(n) = P_i(n, 0) = q_i^{\,n} = (1 - p_i)^n \approx (1 - F(i))^n \tag{4} \]

$P_i(n)$ expresses the probability of $\delta(i) = 1$ after $n$ accesses, and $1 - P_i(n)$ expresses the probability of $\delta(i) = 0$. With this, the expectation value $T(n)$ of the database access time $t(i)$ after $n$ accesses, weighted over all objects $o_i$, can be computed as follows.

\begin{align*}
T(n) &= \sum_{i=0}^{N} F(i) \cdot \left[ P_i(n)\,(t_{\mathrm{acc}} + t_{\mathrm{check}} + t_{\mathrm{conv}}) + (1 - P_i(n))\,(t_{\mathrm{acc}} + t_{\mathrm{check}}) \right] \\
     &= \sum_{i=0}^{N} F(i) \cdot \left[ t_{\mathrm{acc}} + t_{\mathrm{check}} + P_i(n)\, t_{\mathrm{conv}} \right] \\
     &= t_{\mathrm{acc}} + t_{\mathrm{check}} + t_{\mathrm{conv}} \cdot \sum_{i=0}^{N} F(i)\, P_i(n) \\
     &= t_{\mathrm{acc}} + t_{\mathrm{check}} + t_{\mathrm{conv}} \cdot \sum_{i=0}^{N} F(i) \cdot (1 - F(i))^n \tag{5}
\end{align*}

Equation (5) is now analyzed for two different access frequency distributions $F$.

Uniform distribution

First, the simplifying assumption is used that the access frequency of the objects is uniformly distributed; that is, $F(i) = \frac{1}{N+1}$, and hence

\[ T(n) = t_{\mathrm{acc}} + t_{\mathrm{check}} + t_{\mathrm{conv}} \cdot \left( \frac{N}{N+1} \right)^{n} \tag{6} \]

[Figure: plot of the expected access time T(n) after n previous accesses against the number of accesses n (axis ticks at 1000, 2000, 3000), for N = 1000. Curves: screening; deferred conversion with F(i) = 1/(N+1); deferred conversion with F(i) ~ exp(-(i^2)/(2s^2)), s = 200; immediate conversion. The vertical axis marks the contributions t_acc, t_check, and t_conv.]

Figure 6: Expected access times for different evolution strategies

A plot of this function for $N = 1000$ is shown in Figure 6. The figure illustrates how the expected access time decreases steadily during the operation of the database. This is intuitively clear: the more objects have already been accessed (and converted), the fewer objects need to be converted on subsequent accesses. For large $n$, the expected access time converges towards the value $t_{\mathrm{acc}} + t_{\mathrm{check}}$. The expected access times for the other two object evolution strategies are also evident in the figure. Immediate conversion results in no performance penalties during operation such that every access takes only $t_{\mathrm{acc}}$ time (lower line). Screening carries full performance degradation such that every access takes $t_{\mathrm{acc}} + t_{\mathrm{check}} + t_{\mathrm{conv}}$ time (upper line).
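
As a numerical cross-check of the derivation, the following Python sketch evaluates the sum in Equation (5) directly and compares it with the closed form of Equation (6) for a uniform access distribution. The function names and the concrete timing constants used in the usage note are assumptions chosen only to respect the ordering of the constants given above; they do not come from measurements.

```python
def expected_access_time(F, n, t_acc, t_check, t_conv):
    """Equation (5): T(n) = t_acc + t_check + t_conv * sum_i F(i) * (1 - F(i))**n."""
    return t_acc + t_check + t_conv * sum(f * (1.0 - f) ** n for f in F)


def uniform_frequencies(N):
    """Uniform access distribution over N + 1 objects: F(i) = 1 / (N + 1)."""
    return [1.0 / (N + 1)] * (N + 1)


def uniform_closed_form(N, n, t_acc, t_check, t_conv):
    """Equation (6): T(n) = t_acc + t_check + t_conv * (N / (N + 1))**n."""
    return t_acc + t_check + t_conv * (N / (N + 1)) ** n
```

For example, with the illustrative constants `t_acc = 1000.0`, `t_conv = 100.0`, `t_check = 1.0` and `N = 1000`, both functions agree up to floating-point rounding, start at `t_acc + t_check + t_conv` for `n = 0`, and approach `t_acc + t_check` as `n` grows.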

Normal distribution

For the second access frequency distribution, it is assumed that the objects are accessed according to a normal distribution.

\[ F(i) = \frac{1}{M} \cdot \exp\left( -\frac{i^2}{2\sigma^2} \right) \tag{7} \]

where $M$ is a normalizing constant such that $\sum_{i=0}^{N} F(i) = 1$, and $\sigma$ is the standard deviation. This distribution models an application that accesses objects with indices between $0$ and $\sigma$ more often than the others, with object $o_0$ being accessed most often. For this distribution, Figure 6 shows a numeric solution of Equation (5) with $\sigma = 200$. The expectation value of the access time drops more quickly since with a normal distribution it is more likely that objects are accessed twice or more. In general, the smaller the value of $\sigma$, the faster the expected access time drops.
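
The numeric solution mentioned above can be reproduced in outline with a few lines of Python. This is a sketch under the assumptions of Equations (5) and (7) only; the function names and the timing constants in the usage note are illustrative and not taken from the original figure.

```python
import math


def normal_frequencies(N, sigma):
    """Equation (7): F(i) = exp(-i^2 / (2 sigma^2)) / M for i = 0 .. N, normalized to sum 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(N + 1)]
    M = sum(weights)                      # normalizing constant
    return [w / M for w in weights]


def expected_access_time(F, n, t_acc, t_check, t_conv):
    """Equation (5), evaluated numerically for an arbitrary distribution F."""
    return t_acc + t_check + t_conv * sum(f * (1.0 - f) ** n for f in F)
```

Evaluating `expected_access_time` for `normal_frequencies(1000, 200.0)` and for the uniform distribution at `n = 1000`, with any constants satisfying the ordering of the model, gives a lower value for the normal distribution, in line with the faster initial drop described in the text.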


Another important aspect for the comparison of the three strategies is the expected total time $T_{\mathrm{tot}}(n)$ to perform $n$ accesses to the database. $T_{\mathrm{tot}}(n)$ is computed by summing up all expected access times $T(k)$ for $k = 0$ to $n - 1$. For the immediate conversion strategy, the time to initially convert the entire database also needs to be counted.

Immediate:

\[ T_{\mathrm{tot}}^{I}(n) = (N + 1) \cdot (t_{\mathrm{acc}} + t_{\mathrm{conv}}) + n \cdot t_{\mathrm{acc}} \tag{8} \]

Deferred:

\[ T_{\mathrm{tot}}^{D}(n) = n \cdot (t_{\mathrm{acc}} + t_{\mathrm{check}}) + t_{\mathrm{conv}} \cdot \sum_{k=0}^{n-1} \sum_{i=0}^{N} F(i) \cdot (1 - F(i))^{k} \tag{9} \]

Screening:

\[ T_{\mathrm{tot}}^{S}(n) = n \cdot (t_{\mathrm{acc}} + t_{\mathrm{check}} + t_{\mathrm{conv}}) \tag{10} \]

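Using a uniform distribution $F(i) = \frac{1}{N+1}$, the total time for deferred update simplifies to

\begin{align*}
T_{\mathrm{tot}}^{D}(n) &= n \cdot (t_{\mathrm{acc}} + t_{\mathrm{check}}) + t_{\mathrm{conv}} \cdot \sum_{k=0}^{n-1} \left( \frac{N}{N+1} \right)^{k} \\
                        &= n \cdot (t_{\mathrm{acc}} + t_{\mathrm{check}}) + t_{\mathrm{conv}} \cdot (N + 1) \cdot \left[ 1 - \left( \frac{N}{N+1} \right)^{n} \right] \tag{11}
\end{align*}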
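
The total-time formulas can be checked against each other numerically. The Python sketch below implements Equations (8) to (11) as stated; the function names are hypothetical, and the double sum of Equation (9) is evaluated naively, which is adequate only for small illustrative values of N and n.

```python
def total_immediate(N, n, t_acc, t_conv):
    """Equation (8): convert all N + 1 objects up front, then n plain accesses."""
    return (N + 1) * (t_acc + t_conv) + n * t_acc


def total_deferred(F, n, t_acc, t_check, t_conv):
    """Equation (9): sum of the expected access times T(k) for k = 0 .. n - 1."""
    expected_conversions = sum(f * (1.0 - f) ** k for k in range(n) for f in F)
    return n * (t_acc + t_check) + t_conv * expected_conversions


def total_screening(n, t_acc, t_check, t_conv):
    """Equation (10): every access pays for fetch, check, and conversion."""
    return n * (t_acc + t_check + t_conv)


def total_deferred_uniform(N, n, t_acc, t_check, t_conv):
    """Equation (11): closed form of Equation (9) for F(i) = 1 / (N + 1)."""
    return n * (t_acc + t_check) + t_conv * (N + 1) * (1.0 - (N / (N + 1)) ** n)
```

The closed form follows from the geometric series: for uniform access the inner sum of Equation (9) equals $(N/(N+1))^k$, and summing over $k$ yields the bracketed term of Equation (11).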

[Figure: qualitative plot of the expected total access time T_tot(n) for n accesses against the number of accesses n, assuming t_acc >> t_conv > t_check. Curves: screening; immediate conversion; deferred conversion with F(i) = 1/(N+1).]

Figure 7: Expected total access times for different object evolution strategies

A qualitative picture of the total times for the three strategies is given in Figure 7. In this simplified model, $T_{\mathrm{tot}}$ for both screening and immediate updates is linear in $n$ with slopes $t_{\mathrm{acc}} + t_{\mathrm{check}} + t_{\mathrm{conv}}$ and $t_{\mathrm{acc}}$, respectively. $T_{\mathrm{tot}}$ for deferred updates is slightly curved, but asymptotically converges to a linear function with slope $t_{\mathrm{acc}} + t_{\mathrm{check}}$.

Setting $T_{\mathrm{tot}}^{I}(n) = T_{\mathrm{tot}}^{D}(n)$ and solving for $n$ yields an estimate of when the immediate conversion of the entire database is amortized in comparison with the deferred conversions. The number of accesses for which the two times are equal will be denoted $n^{*}$. For $n \to \infty$, the last term in Equation 11 goes towards $t_{\mathrm{conv}} \cdot (N + 1)$, as expected, since the two strategies ultimately perform the same number of conversions. Thus for large $n$, the difference between the two strategies lies only in the fact that deferred conversion performs a check on every access. Because $t_{\mathrm{acc}} \gg t_{\mathrm{check}}$, $n^{*}$ must be large such that the last term in Equation 11 will be close to $t_{\mathrm{conv}} \cdot (N + 1)$. An approximate condition for amortization can thus be given as $(N + 1) \cdot t_{\mathrm{acc}} = n \cdot t_{\mathrm{check}}$. Hence an approximation for $n^{*}$ is the following.

\[ n^{*} \approx (N + 1) \cdot \frac{t_{\mathrm{acc}}}{t_{\mathrm{check}}} \tag{12} \]

Since $t_{\mathrm{acc}}$ is typically three orders of magnitude greater than $t_{\mathrm{check}}$, $n^{*}$ is quite large.

This concludes the analysis of object conversion strategies. The focus of this work is on method evolution. For that purpose, the actual object evolution strategy is not essential. What matters is only the logical database update, which constitutes the methods' view of the object store. For the logical database update it suffices to specify the logical object conversion function present in all of the above approaches. When the physical update takes place is irrelevant in this context.
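
The quality of the approximation in Equation (12) can be examined with a short Python sketch that searches for the exact break-even point under uniform access, using Equations (8) and (11). The function names are hypothetical, and the constants in the usage note are illustrative values with $t_{\mathrm{acc}}$ three orders of magnitude above $t_{\mathrm{check}}$.

```python
def break_even_exact(N, t_acc, t_check, t_conv):
    """Smallest n with T_tot^I(n) <= T_tot^D(n) for uniform access (Equations 8 and 11)."""
    r = N / (N + 1)

    def gap(n):
        # T_tot^D(n) - T_tot^I(n); this difference increases monotonically with n.
        return n * t_check - (N + 1) * t_acc - (N + 1) * t_conv * r ** n

    lo, hi = 0, 1                     # gap(0) is negative by construction
    while gap(hi) < 0:
        hi *= 2
    while lo + 1 < hi:                # bisection on the monotone gap
        mid = (lo + hi) // 2
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return hi


def break_even_approx(N, t_acc, t_check):
    """Equation (12): n* is approximately (N + 1) * t_acc / t_check."""
    return (N + 1) * t_acc / t_check
```

With `N = 1000`, `t_acc = 1000.0`, `t_check = 1.0`, and `t_conv = 100.0`, the exact and the approximate break-even points differ by far less than one percent, which supports neglecting the residual conversion term for large `n`.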

3.4 Method Evolution

Due to the dependency between the schema and the program (Figure 5), the evolution of the schema will often result in inconsistencies of the methods defining the program. The consistency of the methods with both the defining schema and the employed programming language is referred to as behavioral consistency. Behavioral consistency is constrained by several factors. First, the schema fixes the allowed signatures of the methods. Through scoping and access rules the schema also determines which references may be used within the method bodies. Finally, syntactic aspects are controlled by the programming language. All these aspects are typically checked by a compiler. A method is said to be behaviorally consistent with its defining schema if it satisfies all consistency conditions defined by the schema and by the programming language. During evolution, the methods must be behaviorally consistent at every quiescent state of the system.

Method evolution is the process of changing the methods of an object-oriented system to restore the behavioral consistency. Method evolution is required by behavioral inconsistencies due to schema evolution. But behavioral consistency alone does not constrain method evolution sufficiently. For example, even if a method compiles correctly after an update, it still could fail at run-time, or exhibit unexpected behavior. Thus, additional semantic constraints are needed to guide method evolution.


This is intuitively clear since the transformation specification for evolving the schema contains only semantic information on how to change the structure of the schema; little or no information is conveyed as to how to change the behavior. The additional semantic constraint used to guide method evolution in this work is behavioral equivalence, defined formally in the next section.

To guide the evolution of a system's behavior, either more semantic information needs to be attached to the schema transformations, or the transformations need to change the formal specifications directly. Such approaches make use of formal specification languages such as Object-Z [CDD+89] or OOVDM [LS93]. Changing a method's behavior might be the ultimate goal of a schema update. However, for our purposes we assume that the system should not change its behavior due to a schema update. Thus, the goal of method evolution is to restore the behavior that the system had prior to the schema update. Since the behavior of the system is exhibited by the objects in the system's object store, method evolution is heavily constrained by the updated schema and the updated object base.

Whereas mechanisms exist for restoring structural consistency, none of today's database systems provides a mechanism to restore behavioral consistency. Some papers propose possible mechanisms, but none of them have been implemented [Opd92, Zic92, Gri91, Wal91b, Bar91]. There are primarily two reasons why changes to methods have only recently begun to be considered by the research community. The first is that database systems focus primarily on the data; the update of programs is therefore left to programmers. This may explain why the issue has not yet been attacked by the database community. The second reason is simply the difficulty of the problem. There is no doubt that capturing the intention of schema changes and assessing their consequences for instances and especially their methods is very difficult.
Nevertheless, the ability to propagate schema changes to all the affected components of a system is of great importance.

3.5 Evolution Framework

The evolution framework that we now present characterizes and supports the process of transforming an object-oriented system from one consistent state to another while preserving the behavior of the system. This section defines the notions of system, system consistency, and behavioral equivalence, and proposes a model for the evolution process.

3.5.1 System consistency

For a given object-oriented data model, let $\mathcal{S}$ be the set of all consistent schemas in the data model. Furthermore, let $\mathcal{O}$ be the set of objects structurally consistent with a given schema $S \in \mathcal{S}$. $\mathcal{O}$ is the intension defined by $S$. The set of objects $O \subseteq \mathcal{O}$ actually present during the operation of the system forms the extension of the schema. Finally, let $\mathcal{P}_L$ be the set of programs behaviorally consistent with a given schema $S \in \mathcal{S}$ and a programming language $L$. Every program $P \in \mathcal{P}_L$ consists of a set of methods. For the purpose of the framework, an object-oriented system is a triple $\langle S, O, P \rangle$ where the three relevant components are (1) a schema $S$, (2) a set of objects $O$ consistent with $S$, and (3) a program $P$ consistent with $S$ and with the programming language $L$.

The overall consistency of a system relies on both the inter- and intra-component consistencies which depend on the underlying data and programming language model. As was discussed above, the consistency of a schema is typically defined in terms of a set of schema invariants that must hold in every quiescent state of the system. Rather than checking schema consistency after performing a schema transformation, a set of preconditions is defined for a given transformation which ensures that schema consistency will be maintained.

[Figure: diagram of a system with schema S, objects O, and program P and its transformed counterpart with S', O', and P'. The schema transformation σ leads from S to S', the object transformation ω from O to O', and the program transformation π from P to P'; τ denotes the transformation of the system as a whole. Edges labeled "consistent" and "computes with" relate the components within each system.]

Figure 8: Schema, object and program transformations

The schema $S$ constrains the consistency of both the objects $O$ and the program $P$ (see Figures 5 and 8). The set of objects $O$ is also referred to as the environment since it contains objects from both the persistent object store and the run-time environment. More precisely, while the system is not operational, $O$ is identical with the database; however, during operation, $O$ also contains objects created by the program at run-time. Object consistency is additionally constrained by the employed data model which may enforce data integrity constraints conjointly with the schema. Program consistency is additionally constrained by the rules of the programming language. A notion of consistency between programs and objects is defined by transitivity through their consistency with the schema. Of course, system consistency does not imply program correctness.


3.5.2 Behavioral equivalence

Figure 8 illustrates the evolution of an object-oriented system $\langle S, O, P \rangle$ characterized by the three interrelated components schema, objects, and program. Given a schema transformation $\sigma : \mathcal{S} \to \mathcal{S}$, maintaining the consistency of the system makes it necessary to apply an object transformation $\omega : \mathcal{O} \to \mathcal{O}'$, and a program transformation $\pi : \mathcal{P} \to \mathcal{P}'$, where $\mathcal{O}'$ is the intension of the new schema and $\mathcal{P}'$ is the set of programs consistent with the new schema. The transformation of the system as a whole is thus described by the triple $\tau = \langle \sigma, \omega, \pi \rangle$.

Figure 9 illustrates the notion of behavioral equivalence. The behavior preservation problem presents itself as follows: given the software system $\langle S, O, P \rangle$, and the transformations $\sigma$ and $\omega$, find a program transformation $\pi$ such that the transformed program computing with the transformed objects "behaves like" the original program computing with the original objects. The meaning of "behaves like" needs to be clarified. A difficulty arises when comparing the two environments resulting from the programs $P$ and $P'$. The two environments cannot be directly tested for equality since they belong to two different class structures, namely $S$ and $S'$. What is actually meant is that the output of $P$ must correspond to the output of $P'$, with the correspondence relation being the object transformation $\omega$.

[Figure: commuting diagram. Under schema S, Run(P, O) leads from the environment O to O*. Under schema S', Run(P', O') leads from O' = ω(O) to O*' = ω(O*). The object transformation ω maps O to O' and O* to O*'.]

Figure 9: Behavioral equivalence

Formally, behavioral equivalence can be defined by requiring that the diagram in Figure 9 commute. Assume $\tau$ is a system transformation $\tau = \langle \sigma, \omega, \pi \rangle$ in a software system with schema $S$ and program $P$. Let $O$ and $O^{*}$ be two environments for schema $S$ such that $O^{*}$ is the resulting environment after executing $P$ on the run-time environment initialized by $O$. If $Run$ is the partial function that executes the program $P$ on the object environment $O$, then the resulting environment is denoted by $O^{*} = Run(P, O)$. Similarly, let $O'$ and $O^{*\prime}$ be two environments for schema $S' = \sigma(S)$ such that $O^{*\prime} = Run(P', O')$. The diagram in Figure 9 commutes if the following equation holds for all environments $O$ consistent with $S$.

\[ Run(P', \omega(O)) = \omega(Run(P, O)) \]

The above notion of behavioral equivalence is a partial equivalence; that is, the equality only holds when Run terminates. This form of equivalence has also been termed weak equivalence [BLY93].
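
The commuting condition can be illustrated on a toy example in Python. Everything in the sketch is an assumption made for illustration: environments are lists of dictionaries, a program is a function applied to each object, and the object transformation renames a single instance variable. Testing the equation on sample environments can only refute behavioral equivalence or lend it some support; it does not replace the proofs required by the framework.

```python
def run(program, env):
    """Run: execute the program on an environment and return the resulting environment."""
    return [program(dict(obj)) for obj in env]   # copy so the input environment is untouched


def omega(env):
    """Object transformation: rename the instance variable 'name' to 'label'."""
    return [{("label" if k == "name" else k): v for k, v in obj.items()} for obj in env]


def p(obj):
    """Original program P, written against schema S."""
    obj["name"] = obj["name"].upper()
    return obj


def p_prime(obj):
    """Transformed program P', written against schema S'."""
    obj["label"] = obj["label"].upper()
    return obj


def p_wrong(obj):
    """A consistent but not behaviorally equivalent candidate for P'."""
    obj["label"] = obj["label"].lower()
    return obj


def commutes(prog, prog_prime, transform, sample_envs):
    """Check Run(P', omega(O)) == omega(Run(P, O)) on the given sample environments."""
    return all(run(prog_prime, transform(env)) == transform(run(prog, env))
               for env in sample_envs)
```

Here `commutes(p, p_prime, omega, envs)` holds for any sample of environments whose objects carry a string-valued `name`, whereas `p_wrong` is rejected as soon as a sample contains a name with mixed case, even though it is just as consistent with the new schema.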

3.5.3 Designing a change management mechanism

Designing a change management mechanism that supports the maintenance of consistency and behavioral equivalence during evolution needs to address all of the issues discussed so far. Most importantly, evolution must be system-wide. The goal of the system transformations $\tau = \langle \sigma, \omega, \pi \rangle$ is to restore consistency and maintain behavior. Consequently, when defining a type of schema transformation, object and program transformations must be defined along with it, and it must be proven that consistency and behavioral equivalence hold each time a specific instance of the transformation is applied. The following list summarizes the required design decisions.

1. Define or choose a suitable data model to express schemas.
2. Define a set of consistency rules that define the conceptual consistency of schemas in the data model.
3. Define structural consistency conditions for objects of the schemas.
4. Define or choose an object-oriented programming language $L$.
5. Specify a set of primitive schema transformations. For each of the primitive transformations follow the process below:
   (a) Define schema transformation $\sigma$ such that $S' = \sigma(S)$. Define a set of preconditions that guarantees $S'$ is well-formed if the preconditions are satisfied.
   (b) Define object transformation $\omega$ such that $O' = \omega(O)$. Prove that $\omega(O)$ is consistent with $S'$.
   (c) Define program transformation $\pi$ such that $P' = \pi(P)$.
       i. Prove that $\pi(P)$ is consistent with $S'$ and language $L$.
       ii. Prove that $P$ and $\pi(P)$ are behaviorally equivalent.

Given a system $\langle S, O, P \rangle$ with a schema $S$, a program $P$ consistent with $S$, and an environment $O$ consistent with $S$, and given a specific instance of a system transformation $\tau = \langle \sigma, \omega, \pi \rangle$, the following model of the evolution process performs the transformations outlined in Figure 8 to restore consistency.

1. Check that $S$ and the transformation satisfy the preconditions. Either accept, reject, or clarify the transformation depending on the result of the check.
2. Apply the schema transformation $\sigma$ such that $S' = \sigma(S)$.
3. Apply the object transformation $\omega$ such that $O' = \omega(O)$.
4. Apply the program transformation $\pi$ such that $P' = \pi(P)$.

The design of the change management mechanism guarantees that the resulting system $\tau(\langle S, O, P \rangle)$ is consistent and behaviorally equivalent.

The above list of design decisions gives a rough outline of the remainder of the thesis. In Chapter 4, an object-oriented data model is defined to specify the structure of schemas, and a set of rules is given that defines the conceptual consistency of schemas. In addition, the structural consistency rules for objects are included. Chapter 5 defines a set of primitive schema transformations with preconditions, as well as a set of associated object transformations. Finally, Chapters 12, 13, and 14 define associated program transformations for the object-oriented programming languages of Adaptive Software, CLOS, and C++, respectively, and prove behavioral consistency.
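
The four-step model of the evolution process given earlier in this section can be sketched in Python as follows. The class `SystemTransformation`, the function `evolve`, and the example primitive `rename_attribute` are hypothetical names introduced for illustration. The sketch makes strong simplifying assumptions: a schema is a dictionary from class names to lists of instance variable names, all objects are instances of the renamed class, and the program is plain source text rewritten by textual substitution, which a real program transformation would not do.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SystemTransformation:
    """One instance of tau = <sigma, omega, pi> together with its precondition."""
    precondition: Callable    # schema -> bool
    sigma: Callable           # schema -> schema
    omega: Callable           # objects -> objects
    pi: Callable              # program -> program


def evolve(system, tau):
    """Steps 1 to 4 of the evolution process; rejects the change if step 1 fails."""
    schema, objects, program = system
    if not tau.precondition(schema):                      # step 1
        raise ValueError("transformation rejected: precondition not satisfied")
    return (tau.sigma(schema),                            # step 2
            tau.omega(objects),                           # step 3
            tau.pi(program))                              # step 4


def rename_attribute(cls, old, new):
    """Hypothetical primitive: rename instance variable `old` of class `cls` to `new`."""
    return SystemTransformation(
        precondition=lambda s: old in s[cls] and new not in s[cls],
        sigma=lambda s: {**s, cls: [new if a == old else a for a in s[cls]]},
        omega=lambda objs: [{(new if k == old else k): v for k, v in o.items()}
                            for o in objs],
        pi=lambda prog: prog.replace("." + old, "." + new),   # crude textual rewrite
    )
```

Because the precondition is evaluated before anything is changed, a rejected transformation leaves the system untouched and no rollback is required.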

Chapter 4

The Kernel Data Model

There are considerable differences among the various current object-oriented data models and their support of the object-oriented paradigm [Boo91, RBP+ 91, JCJO 92]. The richness of the data model impacts directly its modeling power. On the one hand, the richer the data model (the more concepts it contains), the more powerful it is and the more it facilitates the modeling of complex real-world situations. On the other hand, the more concepts a data model contains, the more complex it is and the harder it is to understand. This work strikes a balance between the two extremes by defining an object-oriented kernel model which is simple enough to be easily generalized to other data models and yet is able to capture the key features of the object-oriented paradigm. It is called the Kernel Model because it is restricted to a kernel of concepts common to many object-oriented data models. The Kernel Model is thus easily applicable to many of the above data models.

This chapter provides a formal definition of the model which will be used in most parts of the thesis. Parts of the Kernel Model are based on the Demeter data model [LX93b, SLHS94] but extended and adapted to move its terminology into the object-oriented mainstream. The chapter is organized as follows. Section 4.1 presents the Kernel Model Schema which is a high-level design description of the system. The structural aspects of a Kernel Model Schema are defined as class graphs in Section 4.2. Object graphs, discussed in Section 4.3, are used to describe instances of classes in the class graph.

4.1 The Kernel Model Schema As in all object-oriented systems, the conceptual entities of the domain are modeled as objects. Objects encapsulate their state and have a well-de ned behavior. An object's state is represented 39

40

Chapter 4. The Kernel Data Model

by the values of its instance variables which themselves can be objects. An object's behavior is represented by methods which consist of code that can manipulate the state of the object. 'Similar' objects are grouped together into a class. All objects belonging to the same class contain the same instance variables and can execute the same methods. If an object belongs to a class it is called an instance of that class. The domain of a class's instance variable is again a class. A class can share its structure and behavior with one or more other classes through inheritance. Inheritance de nes an \IS-A" or \kind-of" relationship between classes in which a subclass inherits from one or more superclasses. It is not allowed for a class to inherit from itself. A class can be either concrete or abstract. An abstract class is not allowed to have instances. Its only purpose is to serve as common repository of state or behavior for subclasses of it. We apply the simplifying rule that all superclasses are abstract. This rule has several advantages and disadvantages, but in general it is considered bene cial to inherit from abstract superclasses [Hur94]. A detailed discussion of this abstract superclass rule is given in Chapter 8. The structure of an object-oriented system can be described by two hierarchies, the inheritance hierarchy (also called the class hierarchy ) and the reference hierarchy . The inheritance hierarchy is built up through the inheritance relationship between classes, and the reference hierarchy is built up through the instance variables of classes. We say a class A references another class B if the domain of one of A's instance variables is B. Sometimes, we also say B is a part-of A (see also [Boo91]). A kernel model schema is de ned to be the collection of all class de nitions of a system which includes for each class: i) its immediate instance variables ii) its interface; that is, the signatures of its immediate methods, and iii) its superclasses. 
The following is a set of consistency rules, called invariants, for kernel model schemas. The invariants must hold for every schema. During schema evolution, the invariants must hold at any quiescent state of the schema; that is, before and after a schema transformation.

Class hierarchy invariant The class hierarchy is a directed acyclic graph (DAG).

Unique name invariant All class names are unique. In addition, for each class, all instance variables and all methods, whether directly defined or inherited, have distinct names within the class scope.

Full inheritance invariant A class inherits all instance variables and methods from each of its superclasses.

Abstract superclass invariant All superclasses must be abstract.

The full inheritance invariant and the unique name invariant imply that no redefinition of instance variables or methods in subclasses may take place. A method may have different implementations in subclasses but its interface must be exactly the same as in the superclass where the method was defined. In particular, neither the covariant nor the contravariant typing rules may be applied. This rule is also applied in C++ [ES90], where virtual functions may be overridden in subclasses.

We will now describe the structural part of a Kernel Model schema as a graph, called the class graph. This structural part is common to all language models that will be considered in later chapters. The complementary behavioral part of a Kernel Model schema depends on the employed language model and will therefore be supplied later. For example, in the adaptive software model, the behavioral part is defined through propagation patterns.

4.2 Class Graphs

A class graph captures the inheritance and reference hierarchies of a schema. It has two kinds of labeled vertices and two kinds of edges. Abstract and concrete vertices represent abstract and concrete classes, respectively. A reference edge is labeled and represents the reference relationship between two classes. The label on a reference edge represents the name of the instance variable incorporating the reference. The source vertex of a reference edge represents the class defining the instance variable; the target vertex represents the domain of the instance variable. An inheritance edge is unlabeled and represents an inheritance relationship between two classes. The source vertex of the inheritance edge represents the subclass; the target vertex represents the superclass. Table 9 in the appendix gives an overview of the employed terminology.

Definition 1 (Class graph)

A class graph, denoted $G = (V, E, L)$, is a finite, labeled, directed graph, where $V$ is a set of vertices, $E \subseteq V \times V$ is a set of edges, and $L$ is a set of labels. Each vertex represents either an abstract or a concrete class, each edge represents either an inheritance or a reference relation between classes, and each label represents either a class name or a reference name.

The set of vertices $V$ is partitioned into three disjoint sets $V = VA \cup VC \cup VP$, where $VA$ represents a set of abstract classes, $VC$ represents a set of concrete, non-primitive classes, and $VP$ represents a set of concrete, primitive classes. The predicate Abstract is true of vertices that represent abstract classes and is false otherwise. The set of abstract vertices is defined as $VA = \{v \in V \mid \mathrm{Abstract}(v)\}$. The set of primitive vertices is defined as $VP \subseteq \{\mathrm{Integer}, \mathrm{Real}, \mathrm{String}, \mathrm{Boolean}\}$. Finally, $VC = \{v \in V - VP \mid \neg\mathrm{Abstract}(v)\}$ contains all concrete vertices not in $VP$.

Vertices are labeled by the function $\mathrm{Classname}: V \longrightarrow LV$ that assigns each vertex $v$ a unique class name $\mathrm{Classname}(v) \in LV$, where $LV \subseteq L$. It is assumed that $(LV, \le)$ is a total order.


Each edge is labeled by an element $\ell$ of $LE \cup \{\epsilon\}$, where $LE \subseteq L$. If an edge $e = (u, v)$ has a label $\ell = \epsilon$ then $e$ is called an inheritance edge, indicating that the class represented by $u$ is a subclass of the class represented by $v$. Inheritance edges are restricted to $(VC \cup VA) \times VA$; that is, primitive classes cannot inherit from other classes and superclasses must be abstract. The set of inheritance edges is denoted $EI$.

If an edge $e = (u, v)$ has a label $\ell \in LE$ then $e$ is sometimes written $(u, \ell, v)$, and is called a reference edge, indicating that the class represented by $u$ has an instance variable named $\ell$ with a type represented by $v$. Reference edges are restricted to $(VC \cup VA) \times V$; that is, primitive classes cannot have references. The set of reference edges is denoted $ER$. It is assumed that $(LE, \le)$ is a total order so that for each vertex its outgoing reference edges are ordered.

The set of inheritance edges $EI$ and the set of reference edges $ER$ build a partition of the set of all edges: $E = ER \cup EI$. In summary, for a class graph $G = (V, E, L)$, the following set inclusions hold:

$V = VA \cup VC \cup VP$; $E = ER \cup EI$; $L = LV \cup LE$.

$\Diamond$

Since there is a bijection between vertices in a class graph and their labels (Classname), the meaning of vertex in a class graph will be overloaded for notational convenience. For example, the formula $v \in V_G \Longrightarrow v \in V_{G'}$ should be interpreted as $v \in V_G \Longrightarrow \exists u \in V_{G'} : \mathrm{Classname}_G(v) = \mathrm{Classname}_{G'}(u)$.

The two kinds of edges of a class graph induce two kinds of reachability relations on classes. The inheritance reachability relation connects direct sub- and superclasses in a class hierarchy. It can also be used to compute, for a given class, the set of all its sub- and superclasses.

Definition 2 (Inheritance reachability)

Let $G = (V, E, L)$ be a class graph. The inheritance edges define a relation $\Longrightarrow$ on $(VC \cup VA) \times VA$ such that for every $u \in V$ and $v \in VA$, $u \Longrightarrow v$ if and only if $(u, v) \in EI$. The inheritance reachability relation, denoted by $\stackrel{+}{\Longrightarrow}$, is the transitive closure of the $\Longrightarrow$ relation. A vertex $v \in VA$ is considered to be inheritance reachable from vertex $u \in V$ if and only if $u \stackrel{+}{\Longrightarrow} v$. For completeness, the $\stackrel{*}{\Longrightarrow}$ relation is defined to be the reflexive, transitive closure of the $\Longrightarrow$ relation.

$\Diamond$

Definition 3 (Sub- and superclasses)

Let $G = (V, E, L)$ be a class graph and $v \in V$ a vertex in $G$. The set of all direct sub- and superclasses of $v$, and the set of all sub- and superclasses of $v$, are defined as follows.

$\mathrm{DirectSubclasses}(v) = \{u \mid u \Longrightarrow v\}$
$\mathrm{DirectSuperclasses}(v) = \{u \mid v \Longrightarrow u\}$
$\mathrm{Subclasses}(v) = \{u \mid u \stackrel{+}{\Longrightarrow} v\}$
$\mathrm{Superclasses}(v) = \{u \mid v \stackrel{+}{\Longrightarrow} u\}$
$\mathrm{Subclasses}^*(v) = \{u \mid u \stackrel{*}{\Longrightarrow} v\}$
$\mathrm{Superclasses}^*(v) = \{u \mid v \stackrel{*}{\Longrightarrow} u\}$

$\Diamond$
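Definitions 2 and 3 can be illustrated with a minimal Python sketch. The class names are borrowed from Figure 10, but the concrete inheritance edges, the dictionary encoding, and all function names are assumptions made for illustration only; they are not part of the formal model.

```python
from typing import Dict, Set

# Assumed toy data: each subclass maps to its set of direct superclasses.
EI: Dict[str, Set[str]] = {
    "TempSensor": {"Sensor"},
    "ProbeTempSensor": {"TempSensor"},
    "FurnaceTempSensor": {"TempSensor"},
    "Kelvin": {"Temperature"},
    "Celsius": {"Temperature"},
}
V: Set[str] = {"Sensor", "TempSensor", "ProbeTempSensor", "FurnaceTempSensor",
               "Temperature", "Kelvin", "Celsius", "Real"}


def direct_superclasses(v: str) -> Set[str]:
    """All u with v ==> u."""
    return set(EI.get(v, set()))


def direct_subclasses(v: str) -> Set[str]:
    """All u with u ==> v."""
    return {u for u, sups in EI.items() if v in sups}


def superclasses(v: str) -> Set[str]:
    """All u reachable from v by one or more inheritance edges."""
    seen: Set[str] = set()
    stack = list(direct_superclasses(v))
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(direct_superclasses(u))
    return seen


def subclasses(v: str) -> Set[str]:
    """All u from which v is inheritance reachable."""
    return {u for u in V if v in superclasses(u)}


def superclasses_star(v: str) -> Set[str]:
    """Reflexive, transitive variant of superclasses."""
    return superclasses(v) | {v}


def subclasses_star(v: str) -> Set[str]:
    """Reflexive, transitive variant of subclasses."""
    return subclasses(v) | {v}
```

The closure is computed by a plain graph traversal with a visited set, so the functions terminate even on an ill-formed graph that contains an inheritance cycle.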

It is sometimes important to know what kind of instances can be assigned to a variable of a given class. In general, these are instances of the given class itself and all of its subclasses. In the Kernel Model, these are only instances of those leaf classes in the class graph which are subclasses of the given class, since only leaf classes are concrete.

Definition 4 (Class type)

The set of all concrete subclasses of a class is called its class type. The class type operator CType maps a given vertex to the set of all its concrete subclasses. Similarly, CType maps a given edge to the set of edges connecting all the concrete subclasses. It is straightforward to extend CType to sets of vertices and sets of edges.

$\mathrm{CType}(v) = \{u \in VC \mid u \stackrel{*}{\Longrightarrow} v\}$
$\mathrm{CType}(u, \ell, v) = \{(u', \ell, v') \mid u' \in \mathrm{CType}(u) \text{ and } v' \in \mathrm{CType}(v)\}$

$\Diamond$

Definition 5 (Reference reachability)

Let $G = (V, E, L)$ be a class graph. The reference relation expresses direct and inherited references between vertices in $G$. For every $u, v \in V$ the reference relation $\longrightarrow$ on $V \times V$ is defined by $u \longrightarrow v$ if and only if one of the following holds.

$\exists \ell \in LE : (u, \ell, v) \in ER$ (direct reference)
$\exists w \in VA, \ell \in LE$ s.t. $u \stackrel{+}{\Longrightarrow} w$ and $(w, \ell, v) \in ER$ (inherited reference)

If it is necessary to indicate the label involved in the $\longrightarrow$ relation, the relation is written as $u \stackrel{\ell}{\longrightarrow} v$. The reference reachability relation, denoted by $\stackrel{+}{\longrightarrow}$, is the transitive closure of the $\longrightarrow$ relation. A vertex $v \in V$ is considered to be reference reachable from vertex $u \in V$ if and only if $u \stackrel{+}{\longrightarrow} v$. For completeness, the $\stackrel{*}{\longrightarrow}$ relation is defined to be the reflexive, transitive closure of the $\longrightarrow$ relation.

$\Diamond$
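The class type operator and the reference relation can be sketched in Python as follows. The vertex sets and edges are an assumed reading of Figure 10 (in particular, which class defines trigger, temp, and value is a guess), and the function names are hypothetical.

```python
from typing import Set, Tuple

# Assumed toy class graph; names from Figure 10, edges guessed for illustration.
VA = {"Sensor", "TempSensor", "Temperature"}                     # abstract
VC = {"ProbeTempSensor", "FurnaceTempSensor", "Kelvin", "Celsius"}  # concrete
VP = {"Real"}                                                    # primitive
EI = {("TempSensor", "Sensor"), ("ProbeTempSensor", "TempSensor"),
      ("FurnaceTempSensor", "TempSensor"),
      ("Kelvin", "Temperature"), ("Celsius", "Temperature")}     # (sub, super)
ER = {("Sensor", "trigger", "Real"),
      ("TempSensor", "temp", "Temperature"),
      ("Temperature", "value", "Real")}                          # (source, label, target)


def superclasses_star(v: str) -> Set[str]:
    """v together with everything inheritance reachable from v."""
    seen, stack = {v}, [v]
    while stack:
        u = stack.pop()
        for sub, sup in EI:
            if sub == u and sup not in seen:
                seen.add(sup)
                stack.append(sup)
    return seen


def ctype(v: str) -> Set[str]:
    """Concrete, non-primitive classes u with u ==>* v (Definition 4)."""
    return {u for u in VC if v in superclasses_star(u)}


def refs(u: str) -> Set[Tuple[str, str]]:
    """Direct and inherited references of u as (label, target) pairs."""
    return {(label, target) for (source, label, target) in ER
            if source in superclasses_star(u)}


def reference_reachable(u: str) -> Set[str]:
    """Vertices reachable from u by one or more reference steps."""
    seen: Set[str] = set()
    stack = [target for _, target in refs(u)]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(target for _, target in refs(v))
    return seen
```

Note that reference reachability, as defined above, follows only direct and inherited reference edges; it does not descend into the subclasses of a reference target.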

For a given class, it is interesting to ask what the set of its direct references is, and what the set of all its (direct and inherited) references is. Those sets are given by the following definitions.


Definition 6 (References)

Let $G = (V, E, L)$ be a class graph and $v \in V$ a vertex in $G$. The set of all direct references of $v$ is defined as:

$\mathrm{DirectRefs}(v) = \{(\ell, u) \mid (v, \ell, u) \in ER\}$

The set of all (direct and inherited) references of $v$ is defined as:

$\mathrm{Refs}(v) = \{(\ell, u) \mid v \stackrel{\ell}{\longrightarrow} u\}$

The set of all reference labels of $v$ is defined as:

$\mathrm{RefLabels}(v) = \{\ell \mid (\ell, u) \in \mathrm{Refs}(v)\}$

$\Diamond$

Class graphs must satisfy the Kernel Model schema invariants. Two of the invariants are automatically satisfied by the way class graphs are defined: (1) the formal definition of references and the Refs function guarantees the full inheritance invariant, and (2) the abstract superclass invariant is automatically satisfied by the definition of class graphs and the restriction on their inheritance edges. Finally, the notion of well-formedness of a class graph ensures that the remaining schema invariants are satisfied.

Definition 7 (Well-formedness of class graphs)

A class graph $G = (V, E, L)$ is considered to be well-formed if the following constraints are satisfied.

Cycle-free inheritance constraint: $\forall u, v \in V : u \stackrel{+}{\Longrightarrow} v$ implies $u \neq v$

Unique reference label constraint: $\forall v \in VC;\ x, y, w \in V;\ \ell \in \mathrm{RefLabels}(v) : v \stackrel{*}{\Longrightarrow} x \wedge v \stackrel{*}{\Longrightarrow} y \wedge (x, \ell, w), (y, \ell, w) \in ER$ implies $x = y$

Unique class label constraint: $\forall v, w \in V : v \neq w \Longrightarrow \mathrm{Classname}(v) \neq \mathrm{Classname}(w)$

$\Diamond$

For the remainder of this paper all class graphs are assumed to be well-formed unless they result from a class graph transformation, in which case their well-formedness needs to be proved.
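A well-formedness check along the lines of Definition 7 might look like the following Python sketch. The encoding (integer vertex identifiers, a `classname` dictionary, edge sets of tuples) is an assumption, and the label check follows the slightly stricter reading of the unique name invariant: no reference label may occur twice among a concrete class and its superclasses.

```python
from collections import Counter
from typing import Dict, Hashable, Set, Tuple

Vertex = Hashable


def superclasses(v: Vertex, EI: Set[Tuple[Vertex, Vertex]]) -> Set[Vertex]:
    """Vertices reachable from v over one or more (sub, super) edges."""
    seen: Set[Vertex] = set()
    stack = [sup for sub, sup in EI if sub == v]
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(sup for sub, sup in EI if sub == u)
    return seen


def is_well_formed(classname: Dict[Vertex, str],
                   VC: Set[Vertex],
                   EI: Set[Tuple[Vertex, Vertex]],
                   ER: Set[Tuple[Vertex, str, Vertex]]) -> bool:
    # Cycle-free inheritance: no vertex is its own (transitive) superclass.
    if any(v in superclasses(v, EI) for v in classname):
        return False
    # Unique reference labels within the scope of every concrete class.
    for v in VC:
        scope = superclasses(v, EI) | {v}
        labels = Counter(label for (src, label, _) in ER if src in scope)
        if any(count > 1 for count in labels.values()):
            return False
    # Unique class labels: Classname must be injective.
    return len(set(classname.values())) == len(classname)


# Hypothetical example: vertex 2 (TempSensor) inherits from vertex 1 (Sensor).
names = {1: "Sensor", 2: "TempSensor", 3: "Real"}
ok = is_well_formed(names, {2}, {(2, 1)}, {(1, "trigger", 3)})
```

In this toy example vertex 2 is treated as concrete purely to exercise the label check; in the Kernel Model a class with subclasses would have to be abstract.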


4.3 Object Graphs

This section defines the notion of object as it is used in subsequent chapters. An object is an instance of a class in a given class graph. Vice versa, a class graph defines a set of objects. For a specific class graph, valid objects must be structurally consistent with the class graph. Objects are embedded in bigger entities, called object graphs, which describe the structure of a group of objects mathematically.

Definition 8 (Object graph)

An object graph is a finite connected directed graph $\Omega = (V_\Omega, E_\Omega, L_\Omega)$ representing a group of objects interconnected through reference relations. Each vertex represents an object, and the function Class maps each object vertex to a vertex in some class graph representing "its class". Since abstract classes do not have instances, Class always maps to a concrete class. For a given vertex $o$, if $\mathrm{Class}(o) \in VP$ then $o$ represents a primitive value; if $\mathrm{Class}(o) \in VC$ then $o$ represents a unique object identifier (OID). Each edge in an object graph is labeled by an element of $L_\Omega$. The edge $(u, \ell, v)$ indicates that the object represented by $u$ has a reference to an object represented by $v$. For each vertex $u$ and each label $\ell \in L_\Omega$, there is at most one outgoing edge from $u$ with label $\ell$.

Given a class graph $G = (V, E, L)$ and an object graph $\Omega$, $\Omega$ is (structurally) consistent with $G$ if for every vertex $o$ of $\Omega$:

(1) $\mathrm{Class}(o) \in VC \cup VP$
(2) $\forall (o, \ell, o') \in E_\Omega : \exists (\ell, v) \in \mathrm{Refs}(\mathrm{Class}(o))$ such that $\mathrm{Class}(o') \stackrel{*}{\Longrightarrow} v$
(3) $\forall (\ell, v) \in \mathrm{Refs}(\mathrm{Class}(o)) : \exists (o, \ell, o') \in E_\Omega$ such that $\mathrm{Class}(o') \stackrel{*}{\Longrightarrow} v$

Given an object graph $\Omega$ consistent with $G$ and rooted at a vertex $o$, $\Omega$ is considered a $v$-object graph if $\mathrm{Class}(o) \stackrel{*}{\Longrightarrow} v$, for some vertex $v \in V$. Simultaneously, $\Omega$ is also considered a $w$-object graph for all superclasses $w$ of $v$. The set of all object graphs consistent with class graph $G$ is called $\mathrm{Objects}(G)$.

An object graph $\Omega'$ is an extension of object graph $\Omega$, written as $\Omega \sqsubseteq \Omega'$, if and only if:

(1) $V_\Omega \subseteq V_{\Omega'}$
(2) $(o, \ell, o') \in E_\Omega$ implies $(o, \ell, o') \in E_{\Omega'}$

$\Diamond$
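The three consistency conditions of Definition 8 can be checked mechanically. The following Python sketch does so for an assumed reading of the class graph in Figure 10; the edge sets, the object identifiers `r900` and `r749` for the primitive values, and the function names are all hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

# Assumed class graph (names from Figure 10, edges guessed for illustration).
CONCRETE = {"ProbeTempSensor", "FurnaceTempSensor", "Kelvin", "Celsius", "Real"}
EI = {("TempSensor", "Sensor"), ("ProbeTempSensor", "TempSensor"),
      ("FurnaceTempSensor", "TempSensor"),
      ("Kelvin", "Temperature"), ("Celsius", "Temperature")}
ER = {("Sensor", "trigger", "Real"),
      ("TempSensor", "temp", "Temperature"),
      ("Temperature", "value", "Real")}


def superclasses_star(c: Optional[str]) -> Set[Optional[str]]:
    seen, stack = {c}, [c]
    while stack:
        u = stack.pop()
        for sub, sup in EI:
            if sub == u and sup not in seen:
                seen.add(sup)
                stack.append(sup)
    return seen


def refs(c: str) -> Set[Tuple[str, str]]:
    return {(l, t) for (s, l, t) in ER if s in superclasses_star(c)}


def is_consistent(class_of: Dict[str, str],
                  edges: Set[Tuple[str, str, str]]) -> bool:
    for o, c in class_of.items():
        if c not in CONCRETE:                       # condition (1)
            return False
        out = {(l, t) for (s, l, t) in edges if s == o}
        required = refs(c)

        def matches(label: str, target: str, rl: str, rv: str) -> bool:
            return label == rl and rv in superclasses_star(class_of.get(target))

        # condition (2): every outgoing edge is justified by a reference
        if not all(any(matches(l, t, rl, rv) for rl, rv in required)
                   for l, t in out):
            return False
        # condition (3): every reference is realized by an outgoing edge
        if not all(any(matches(l, t, rl, rv) for l, t in out)
                   for rl, rv in required):
            return False
    return True


# Object graph of Example 4.1, with primitive values modelled as vertices.
class_of = {"i69": "FurnaceTempSensor", "i144": "Kelvin",
            "r900": "Real", "r749": "Real"}
edges = {("i69", "trigger", "r900"), ("i69", "temp", "i144"),
         ("i144", "value", "r749")}
```

Condition (3) is what forces a consistent object graph to contain all nested referenced objects, which is why a consistent object graph usually describes a group of objects rather than a single one.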

How does an object graph relate to single objects as referred to in programming languages like C++ or Smalltalk, or in database systems? There, an object is a unit of storage which contains all data members either directly or as pointers. In the object graph model, values of primitive vertices are contained directly; references to other vertices are contained through pointers. A conventional object is represented by a vertex in the object graph together with all its outgoing edges. A consistent object graph is therefore often a group of objects, because all its immediate and nested referenced objects are forced to be included by the consistency constraint. Thus an object graph has semantics similar to what has been called complex objects [LP83] or composite objects [BKK+ 85] in the literature.

Example 4.1 Consider Fig. 10 for an example of an object graph. The left-hand side pictures a class graph; the right-hand side pictures a FurnaceTempSensor-object graph consistent with the class graph. The object node i69 together with the value 900 for the trigger reference and the pointer temp to the object vertex i144 represent a FurnaceTempSensor-object in the sense of C++.



[Figure 10: Example of object graph consistent with class graph. Left panel (Class Graph): the classes Sensor, TempSensor, ProbeTempSensor, FurnaceTempSensor, Temperature, Kelvin, Celsius, and Real, with the reference labels trigger, temp, and value. Right panel (Object Graph): a FurnaceTempSensor-object with the object vertices i69 and i144 and the values 900 and 749.]

While class graphs define the intension at the object level, objects build the extension. The set of all actual objects at a certain point in time is captured by the notion of environment. An environment describes the persistent object store, the extension of the class graph.

Definition 9 (Environment)

An environment $\mathcal{E}$ consists of a set of object graphs, representing both a persistent object store and a run-time environment. Every environment contains a distinguished object stream of type String that simulates output to a terminal device. There is an accessor function EnvLookup which returns the object vertex referenced by a given variable within an environment. Given an object vertex $o$ in an object graph $\Omega$ in $\mathcal{E}$, and a label $\ell$, $\mathrm{EnvLookup}(o.\ell, \mathcal{E})$ returns the object vertex $o'$, where the edge $(o, \ell, o')$ is in $\Omega$.

An environment $\mathcal{E}$ is consistent with a class graph $G$ if all object graphs in $\mathcal{E}$ are consistent with $G$. An environment $\mathcal{E}'$ is an extension of environment $\mathcal{E}$, written as $\mathcal{E} \sqsubseteq \mathcal{E}'$, if and only if there is an isomorphism $\phi : \mathcal{E} \longrightarrow \mathcal{E}'$ such that for each object graph $\Omega$ in $\mathcal{E}$ there exists an object graph $\Omega' = \phi(\Omega)$ in $\mathcal{E}'$ with $\Omega \sqsubseteq \Omega'$.

For two given class graphs $G$ and $G'$, an object transformation $\omega$ transforms environments consistent with $G$ to environments consistent with $G'$. Any object transformation must leave the stream object invariant.

$\Diamond$


Chapter 5

Primitive Transformations

He found himself transformed in his bed into a gigantic insect.
Franz Kafka, Metamorphosis

Schema transformations are the force that drives the evolution process. Any software change management system must provide an interface that allows the designer to evolve the underlying application. The kernel of this interface consists of a set of primitive schema transformations. This set determines the capability of the change management system which, in turn, is dependent on the scope and goal of the system.

Two kinds of primitive transformations are distinguished: basic and composite. The basic primitive transformations are the most primitive from a data model point of view. All other transformations can be composed of these basic primitive transformations. Very often, some of these composed transformations are also provided by the change management interface. They are then called composite primitive transformations. All other transformations must be composed manually from the composite and basic primitive transformations.

A collection of primitive transformations can be categorized according to several criteria, the most important of which are completeness, correctness, minimality, information conservation, object preservation, and object extension. Other characteristics of transformations are granularity and transactional character.

This chapter provides an analysis of the various aspects of primitive transformations. The chapter is organized as follows. Sections 5.1 and 5.2 discuss basic and composite primitive transformations. The core of the chapter is Section 5.3, which defines a set of primitive transformations, a subset of which will be utilized in subsequent chapters. Section 5.4 characterizes individual transformations and collections of transformations according to several criteria. Section 5.5 discusses issues pertinent to the transactional properties of transformations. Finally, a section on related work concludes the chapter.

5.1 Basic Primitive Transformations

As was shown in Figure 3 on page 20, the data model defines the intension at the schema level. In practice, this means that the data model provides the building blocks for the schema design. Consequently, the smallest elements of a schema that can be modified by schema transformations are these primitive entities provided by the data model. The transformations that modify such primitive data model entities are called basic primitive transformations. Since they modify the smallest schema elements, basic primitive transformations are atomic; that is, they cannot be composed of smaller transformations.

Basic primitive transformations depend heavily on the employed data model. The more powerful and expressive the underlying data model, the more detailed the primitives must be in order to be complete. For a data model to be object-oriented it must contain certain basic entities like class, method, attribute, inheritance relation, or delegation relation (see the first row in Table 1). Each of these entities can be updated by one of the meta primitive transformations: add, delete, change, and rename. A full set of basic primitives can thus be defined by the cross product of the set of data model entities and the set of meta primitives, as illustrated for a sample object-oriented model in Table 1. Note that "unnamed" entities like the initial value of an attribute, the access control properties, and the inheritance relation cannot be renamed.

Model entity           Add (Create)      Remove (Delete)      Change, Rename
Class (empty)          add class         delete class         rename class
Method signature       add method        delete method        rename method, args; change return and arg types
Attribute              add attribute     delete attribute     rename attribute, change type
Initial value          add value         delete value         change value
Access control         add control (a)   delete control (b)   change control
Inheritance relation   add relation      delete relation      change relation
Composite relation     add relation      delete relation      change relation

(a) override default
(b) use default

Table 1: Data model and meta primitives yield basic primitive transformations

Since the data model is nothing else but the meta schema, we can also say that the cross product of the meta schema entities with the meta primitives builds the set of basic primitive transformations of the schema.

Note that the set of basic primitives of Table 1 is not minimal with respect to completeness; we could do without the change and rename meta primitives. However, they add some important semantic variations to the set. A change (or a rename) operation is not equivalent to a remove followed by an add. The difference is that changing (or renaming) an entity maintains its identity, while removing and adding it loses its identity. For example, renaming an attribute clearly maintains the attribute's identity within the class and within all the objects, suggesting the attribute's value is retained in the objects; removing the attribute and then adding it with a different name would suggest that all objects need to be updated and the attribute's value newly initialized.

Basic primitive transformations build the backbone of the change management interface in the sense that they can be used to compose other transformations. This will be further explored in the next section. Since basic primitive transformations can potentially violate conceptual consistency, each one must be formulated with a set of preconditions which guarantees that the transformation preserves conceptual consistency.
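The difference between a rename and a remove followed by an add can be made concrete at the data level. In the following Python sketch objects are modelled as plain dictionaries from attribute names to values; this encoding, the attribute names, and the function names are assumptions chosen for illustration only.

```python
from typing import Any, Dict, List

Obj = Dict[str, Any]


def rename_attribute(objects: List[Obj], old: str, new: str) -> None:
    """Rename keeps the attribute's identity, so each object keeps its value."""
    for obj in objects:
        obj[new] = obj.pop(old)


def delete_attribute(objects: List[Obj], name: str) -> None:
    """Delete discards the stored value in every object."""
    for obj in objects:
        obj.pop(name, None)


def add_attribute(objects: List[Obj], name: str, initial: Any) -> None:
    """Add creates a fresh attribute that must be newly initialized."""
    for obj in objects:
        obj[name] = initial


# Variant 1: rename "trigger" to "threshold".
renamed = [{"trigger": 900}]
rename_attribute(renamed, "trigger", "threshold")

# Variant 2: delete "trigger", then add "threshold".
replaced = [{"trigger": 900}]
delete_attribute(replaced, "trigger")
add_attribute(replaced, "threshold", None)
```

Both variants yield the same schema (one attribute named threshold), but only the rename retains the value 900 in the object store.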

5.2 Composite Primitive Transformations

The basic primitive transformations build a complete set of schema transformations. In principle, all other schema transformations can be composed from the primitives. However, often a selected set of composed transformations is also provided by the change management interface. Such transformations are then called composite primitive transformations. All other transformations must be composed manually from the composite and basic primitive transformations.

There are primarily three reasons why it is beneficial to provide composed transformations directly through the change management interface instead of using a sequence of basic primitive transformations.

1. Composite primitive transformations are more powerful. The designer does not need to go through every step of the sequence of basic primitives to achieve the desired effect.

2. Composite primitive transformations have coarser granularity and thus represent a higher level of abstraction.

3. Composite primitive transformations often have more meaningful semantics. For example, the semantics of distributing a common reference from a superclass to all its subclasses is clearly different from the semantics of deleting the reference from the superclass and adding new references with the same label to all subclasses. While the effect on the schema as a set of classes is equivalent, the semantic difference becomes apparent both at the schema level, when viewing the schema classes as objects, and at the data level, when considering the impact on the objects (see Figure 4 on page 21). On the one hand, deletion and subsequent insertion creates new identities for the involved schema objects, while distribution maintains identities. On the other hand, deletion and insertion at the data level loses the information contained in the distributed reference, while distribution preserves it.

The term primitive transformations will refer to the combined set of basic and composite primitive transformations. They are called primitive since they need to be used to build all other transformations.

In Section 3.1.3, a complementary method for change specification was briefly discussed. The method directly specifies the entire new schema without giving change information. The method is advantageous when the schema modification is extensive, since the change information becomes large and designing the new schema directly is easier. However, as a consequence the problem arises of how to decompose the large change into a sequence of primitive changes that could then be executed by the change management system. The system should be able to either perform an automatic decomposition, or support the designer in finding such a decomposition. Chapter 6 presents an algorithm to automatically decompose an object-extending transformation, and Chapter 7 presents a schema abstraction mechanism that can be used to decompose schemas into smaller ones.

5.3 A Collection of Primitive Transformations

This section introduces an extensive set of primitive transformations for the kernel model schema presented in Chapter 4. The transformations are first outlined informally and then defined formally in subsequent subsections. A four-letter mnemonic acronym will be used with each transformation for notational convenience.

Not all of the presented transformations will be usable for the behavior preserving framework in later chapters. Whenever a transformation forces the deletion of information from the object store, it is in general not possible to maintain behavior. For example, the deletion of a concrete class would cause the subsequent deletion of all instances of that class. Previous behavior could therefore not be preserved. The following list covers a wide variety of useful schema modifications, and deliberately also includes transformations that do not allow maintenance of behavior.

Basic primitive class graph transformations

The basic primitive class graph transformations can be obtained by the cross product of the basic class graph entities and the meta primitives (see Table 2). The basic class graph entities are the three kinds of vertices (concrete, abstract, and primitive) and the two kinds of edges (inheritance and reference).

Class graph entity          Add (Create)   Remove (Delete)   Change, Rename
Concrete vertex (empty)     AddC           DelC              RenC
Abstract vertex             AddA           DelA              RenC
Primitive concrete vertex   AddC           DelC              - (a)
Reference edge              AddR           DelR              GenR, RepR, RenR
Inheritance edge            AddI           DelI              - (b)

(a) not allowed
(b) renaming not meaningful

Table 2: Basic primitive transformations for class graphs

The transformations are informally defined as follows.

• Addition of concrete class (AddC) adds a new concrete class, without any incoming or outgoing edges. The class may be primitive or not.

• Deletion of concrete class (DelC) deletes a concrete class which must not have any incoming or outgoing inheritance edges.

• Renaming of class (RenC) renames a class.

• Addition of abstract class (AddA) adds a new abstract class to the class graph. The new abstract class has neither incoming nor outgoing reference edges, and no outgoing inheritance edges. It may have incoming inheritance edges from some of the existing classes in the graph.

• Deletion of abstract class (DelA) is the opposite transformation of (AddA), where an abstract class is deleted from the graph. The class to be deleted must not have any incoming or outgoing reference edges, nor any outgoing inheritance edges.

• Addition of reference relation (AddR) adds a reference edge between two existing classes.

• Deletion of reference relation (DelR) deletes a reference edge between two classes.

• Renaming of reference (RenR) renames the label of a reference edge.

• Addition of inheritance relation (AddI) adds an inheritance edge between two existing classes.

• Deletion of inheritance relation (DelI) deletes an inheritance edge between a sub- and a superclass.

• Replacement of reference relation (RepR) reroutes the target of a reference edge from one abstract class v to another abstract class w, where v and w have the same set of concrete subclasses.

• Generalization of reference relation (GenR) reroutes the target of a reference edge from a subclass to its superclass.

Composite primitive class graph transformations

• Addition of Subclass (AddS) adds a new class as a subclass of an existing abstract class.

• Abstraction of Common Reference (AbsR) abstracts a reference edge that a group of subclasses have in common, and moves the edge up the inheritance hierarchy to a common superclass.

• Distribution of Common Reference (DisR) performs class flattening, where a reference edge is pushed down the inheritance hierarchy from a superclass to all of its direct subclasses.

• Telescoping of Reference (TelR) replaces a single reference edge between two vertices with two reference edges that route through another vertex.

• Telescoping of Inheritance (TelI) replaces a single inheritance edge between two vertices with two inheritance edges that route through another vertex.

Each of the transformations above will now be formally defined for class graphs. For each transformation below, let $G = (V, E, L)$ be the original class graph and $\mathcal{E}$ an environment consistent with $G$. Each of the following subsections is structured similarly. First, an informal definition is given for the class graph transformation, followed by a formal definition. Then, the preconditions are stated, and finally, the object transformation is given.

5.3.1 Addition of Concrete Class

Addition of concrete class (AddC) adds a concrete class $v$. The class $v$ is added "empty", without any outgoing reference or inheritance edges.

Class graph transformation. The class graph transformation "addition of concrete class $v$" is performed by the operation $\tau \equiv \mathrm{AddC}(G, v)$, which produces a class graph $G' = (V', E, L')$ such that

$V' = V \cup \{v\}$ and $LV' = LV \cup \{\mathrm{Classname}(v)\}$

Preconditions. $v \notin V$; $\neg\mathrm{Abstract}(v)$

Environment transformation. Any object graph in $\mathcal{E}$ consistent with $G$ is also consistent with $G'$. Therefore, the environment transformation is simply the identity transformation: $\omega \equiv I$, implying $\mathcal{E}' = \omega(\mathcal{E}) = \mathcal{E}$.


5.3.2 Deletion of Concrete Class

Deletion of concrete class (DelC) deletes a concrete class $v$. The class $v$ must not have any incoming reference edges, nor any outgoing inheritance edges.

Class graph transformation. The class graph transformation "deletion of concrete class $v$" is performed by the operation $\tau \equiv \mathrm{DelC}(G, v)$, which produces a class graph $G' = (V', E', L')$ such that

$VC' = VC - \{v\}$;
$LV' = LV - \{\mathrm{Classname}(v)\}$;
$ER' = ER - \{(v, \ell, w) \mid (\ell, w) \in \mathrm{DirectRefs}(v)\}$;
$LE' = LE$ if $\exists u_1, u_2 \in V, u_1 \neq v : (u_1, \ell, u_2) \in ER$, and $LE' = LE - \{\ell\}$ otherwise

Preconditions. $\exists v \in V$; $\neg\mathrm{Abstract}(v)$; $\mathrm{Superclasses}(v) = \emptyset$; $\forall u \in V, \ell \in LE : \not\exists (u, \ell, v) \in ER$

Environment transformation. Any object graph $\Omega$ in $\mathcal{E}$ that contains a $v$-object will no longer be consistent with $G'$. Assume $\mathrm{Refs}(v) = \{(\ell_i, v_i) \mid i \in 0..n\}$, with $(n \geq 0)$. Since $v$ has no incoming reference edges, and no superclasses, an object graph $\Omega$ that contains a $v$-object will be rooted at a node $o$ such that $\mathrm{Class}(o) = v$, and $o$ has $n$ outgoing edges to vertices $o_i$ such that $\mathrm{Class}(o_i) \in \mathrm{CType}(v_i)$, $(0 \leq i \leq n)$. Then, the environment transformation $\omega$ deletes the root vertex $o$ and all the outgoing edges from $\Omega$, leaving $n$ subgraphs rooted at the vertices $o_i$, $(0 \leq i \leq n)$.
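As a rough illustration of DelC, the following Python sketch checks the preconditions, removes the class and its outgoing reference edges, and then applies the corresponding object transformation. The set-based encoding, the class Alarm, and the function names are hypothetical; label-set bookkeeping ($LV'$, $LE'$) is omitted because labels are implicit in this encoding.

```python
from typing import Dict, Set, Tuple

RefEdge = Tuple[str, str, str]  # (source, label, target)


def del_c(V: Set[str], VA: Set[str], EI: Set[Tuple[str, str]],
          ER: Set[RefEdge], v: str) -> Tuple[Set[str], Set[RefEdge]]:
    """Return (V', ER') after deleting the concrete class v."""
    if v not in V or v in VA:
        raise ValueError("v must be an existing concrete class")
    if any(sub == v for sub, _ in EI):
        raise ValueError("v must not have superclasses")
    if any(target == v for _, _, target in ER):
        raise ValueError("v must not have incoming reference edges")
    # Remove v together with its direct (outgoing) references.
    return V - {v}, {edge for edge in ER if edge[0] != v}


def del_c_objects(class_of: Dict[str, str], edges: Set[RefEdge],
                  v: str) -> Tuple[Dict[str, str], Set[RefEdge]]:
    """Delete every v-object (a root) and its outgoing edges."""
    doomed = {o for o, c in class_of.items() if c == v}
    kept = {o: c for o, c in class_of.items() if o not in doomed}
    return kept, {edge for edge in edges if edge[0] not in doomed}


# Hypothetical schema: a standalone concrete class Alarm referencing Real.
V = {"Alarm", "Real"}
ER = {("Alarm", "level", "Real")}
V2, ER2 = del_c(V, set(), set(), ER, "Alarm")
objs, obj_edges = del_c_objects({"a1": "Alarm", "r1": "Real"},
                                {("a1", "level", "r1")}, "Alarm")
```

After the object transformation the formerly referenced object r1 survives as the root of its own object graph, which matches the description of the subgraphs left behind.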

5.3.3 Renaming of Class

Renaming of class (RenC) renames a vertex labeled $v$ to label $w$.

Class graph transformation. The class graph transformation "renaming of vertex $v, w$" is performed by the operation $\tau \equiv \mathrm{RenC}(G, v, w)$, which produces a class graph $G' = (V', E', L')$ such that

$V' = V[w/v]$; $E' = E[w/v]$; and $LV' = LV[w/v]$

The substitution operator $[w/v]$ is read: "$w$ for $v$".

Preconditions. $\exists v \in V$; $w \notin V$

Environment transformation. Object graphs need not be changed since they do not refer directly to class labels. Therefore, the environment transformation is simply the identity transformation: $\omega \equiv I$, implying $\mathcal{E}' = \omega(\mathcal{E}) = \mathcal{E}$.

In all the transformations discussed so far it has been left open to the actual implementation of the object store how the function Class is computed. In general, Class needs information from both the class graph and the object graph. The implementation can decide whether the class information is directly stored with each object or computed in a more space-saving manner. However, for the renaming of class transformation the dependency of the Class function needs to be made explicit, since the transformation requires it to be updated. The new Class function substitutes $w$ for $v$:

$\mathrm{Class}'(o) = \mathrm{Class}(o)[w/v] = w$ if $\mathrm{Class}(o) = v$, and $\mathrm{Class}(o)$ otherwise
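The substitution $[w/v]$ and the updated Class function can be sketched in Python as follows. The encoding assumes that the object store keeps the class name with each object (one of the implementation choices mentioned above); the function name and the sample data are hypothetical.

```python
from typing import Dict, Set, Tuple


def ren_c(V: Set[str],
          EI: Set[Tuple[str, str]],
          ER: Set[Tuple[str, str, str]],
          class_of: Dict[str, str],
          v: str, w: str):
    """Apply [w/v] to vertices, edges, and the Class function."""
    if v not in V or w in V:
        raise ValueError("v must exist and w must be a fresh class name")

    def subst(x: str) -> str:
        # The substitution operator [w/v]: "w for v".
        return w if x == v else x

    V2 = {subst(x) for x in V}
    EI2 = {(subst(sub), subst(sup)) for sub, sup in EI}
    ER2 = {(subst(src), label, subst(tgt)) for src, label, tgt in ER}
    class_of2 = {o: subst(c) for o, c in class_of.items()}
    return V2, EI2, ER2, class_of2


# Hypothetical example: rename Kelvin to AbsoluteTemp.
V = {"Temperature", "Kelvin", "Real"}
EI = {("Kelvin", "Temperature")}
ER = {("Temperature", "value", "Real")}
class_of = {"i144": "Kelvin", "r749": "Real"}
V2, EI2, ER2, class_of2 = ren_c(V, EI, ER, class_of, "Kelvin", "AbsoluteTemp")
```

The object vertices and object edges are untouched; only the mapping from objects to class names changes, which mirrors the identity environment transformation combined with the new Class function.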

5.3.4 Addition of Abstract Class

Addition of abstract class (AddA) adds an abstract class $u$ and an inheritance edge from each class in $S \subseteq (VA \cup VC)$ to $u$. $S$ is the set of subclasses of $u$. The class $u$ must not have any outgoing reference edges.

Class graph transformation. The class graph transformation "addition of abstract class $u$" is performed by the operation $\tau \equiv \mathrm{AddA}(G, u, S)$. Let $S = \{v_1, \ldots, v_n\} \subseteq (VA \cup VC)$, then $\tau$ produces a class graph $G' = (V', E', L')$ such that

$VA' = VA \cup \{u\}$; $EI' = EI \cup \{(v_i, u) \mid i \in 1..n\}$; and $LV' = LV \cup \{\mathrm{Classname}(u)\}$

Preconditions. $\forall v \in S : \exists v \in (VA \cup VC)$; $u \notin V$; $\mathrm{Abstract}(u)$

Environment transformation. Any object graph in $\mathcal{E}$ consistent with $G$ is also consistent with $G'$. Therefore, the environment transformation is simply the identity transformation: $\omega \equiv I$, implying $\mathcal{E}' = \omega(\mathcal{E}) = \mathcal{E}$.
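A minimal Python sketch of AddA, under the assumption that the class graph is encoded as plain sets; the function name and the example classes are hypothetical.

```python
from typing import Set, Tuple


def add_a(VA: Set[str], VC: Set[str], VP: Set[str],
          EI: Set[Tuple[str, str]],
          u: str, S: Set[str]) -> Tuple[Set[str], Set[Tuple[str, str]]]:
    """Return (VA', EI') after adding abstract class u above the classes in S."""
    if u in VA | VC | VP:
        raise ValueError("u must not already be a vertex of the class graph")
    if not S <= (VA | VC):
        raise ValueError("S must contain only existing non-primitive classes")
    # One new inheritance edge (v_i, u) per designated subclass.
    return VA | {u}, EI | {(v, u) for v in S}


# Hypothetical example: introduce Device as a common superclass.
VA = {"Sensor"}
VC = {"Alarm"}
VP = {"Real"}
VA2, EI2 = add_a(VA, VC, VP, set(), "Device", {"Sensor", "Alarm"})
```

Because the new class has no references of its own, Refs of every existing class is unchanged, which is why existing object graphs stay consistent.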

5.3.5 Deletion of Abstract Class

Deletion of abstract class (DelA) deletes an abstract class $u$ and inheritance edges from all its subclasses $v_1, \ldots, v_n$. The class $u$ must not have incoming reference edges nor any outgoing edges (reference or inheritance). An abstract class meeting these conditions is in effect structurally useless, since it is not a part of any concrete class, and since it does not provide any structure for its subclasses to inherit.

Definition 10 (Useless) A class $u$ is considered useless if the following predicate Useless holds for $u$:

$$\mathrm{Useless}(u) \iff \mathrm{Abstract}(u) \wedge \mathrm{Refs}(u) = \emptyset \wedge \nexists w \in V : u \stackrel{+}{\Longrightarrow} w \wedge \{(\ell, u) \mid \exists w \in V, \ell \in L : (\ell, u) \in \mathrm{Refs}(w)\} = \emptyset$$


Class graph transformation. The class graph transformation "deletion of abstract class $u$" is performed by the operation $\mathrm{DelA}(G, u)$, which produces a class graph $G' = (V', E', L')$ such that

$VA' = VA - \{u\}$, $LV' = LV - \{\mathrm{Classname}(u)\}$, and $EI' = EI - \{(v, u) \mid v \in \mathrm{DirectSubclasses}(u)\}$.

Preconditions. $\exists u \in V$, $\mathrm{Useless}(u)$.

Environment transformation. Any object graph in $E$ consistent with $G$ is also consistent with $G'$. Therefore, the environment transformation is simply the identity transformation: $\omega \equiv I$, implying $E' = \omega(E) = E$.
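The Useless predicate and the DelA operation can be sketched in a few lines of Python. The sketch assumes that Refs refers to direct reference edges and represents the class graph by three plain sets; the function and variable names are hypothetical and chosen for illustration only.

```python
def is_useless(u, abstract, inherits, refs):
    """Sketch of the Useless predicate.

    abstract: set of abstract class names
    inherits: set of (subclass, superclass) inheritance edges
    refs:     set of (source, label, target) reference edges
    """
    no_outgoing_refs = not any(src == u for src, _, _ in refs)
    no_superclasses = not any(sub == u for sub, _ in inherits)
    no_incoming_refs = not any(tgt == u for _, _, tgt in refs)
    return (u in abstract and no_outgoing_refs
            and no_superclasses and no_incoming_refs)


def delete_abstract_class(u, abstract, inherits, refs):
    """Sketch of DelA: drop u and the inheritance edges of its direct subclasses."""
    if not is_useless(u, abstract, inherits, refs):
        raise ValueError("precondition violated: u must be a useless abstract class")
    new_abstract = abstract - {u}
    new_inherits = {(sub, sup) for sub, sup in inherits if sup != u}
    return new_abstract, new_inherits


# Hypothetical toy class graph.
abstract = {"Shape", "Tagged"}
inherits = {("Circle", "Shape"), ("Circle", "Tagged"), ("Square", "Tagged")}
refs = {("Canvas", "content", "Shape"), ("Circle", "center", "Point")}

tagged_useless = is_useless("Tagged", abstract, inherits, refs)
shape_useless = is_useless("Shape", abstract, inherits, refs)
new_abstract, new_inherits = delete_abstract_class("Tagged", abstract, inherits, refs)
```

In this toy graph, Shape is not deletable because Canvas refers to it, whereas Tagged contributes no structure and is not part of any other class.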

5.3.6 Addition of Reference

Addition of reference (AddR) adds a new reference edge labeled $\ell$ between existing vertices $v$ and $w$ to the class graph.

Class graph transformation. The class graph transformation "addition of reference $v, \ell, w$" is performed by the operation $\mathrm{AddR}(G, v, \ell, w)$, which produces a class graph $G' = (V, E', L')$ such that $ER' = ER \cup \{(v, \ell, w)\}$ and $LE' = LE \cup \{\ell\}$. It is assumed that $(LE', \ldots$

[…] $W_T/2$ for all other neighbors $w$ of $u$.

Proof. Assume $w \neq v$ is a neighbor of $u$, and $U$ contains all neighbors of $u$ except $w$ and $v$. $W(u, v) < W_T/2$ implies $W(v, u) > W_T/2$ and thus with Lemma 13 it follows that:

$$W(u, w) = \omega_n(u) + W(v, u) + \sum_{x \in U} W(x, u) > \omega_n(u) + W_T/2 + \sum_{x \in U} W(x, u) \geq W_T/2 \qquad \Box$$

Lemma 15 The accumulated weight function $W$ is non-decreasing on any path $p = \langle x_0, \ldots, x_n \rangle$ in $T$. For $0 \leq i \leq n-2$: $W(x_i, x_{i+1}) \leq W(x_{i+1}, x_{i+2})$.

Proof. Assume $U$ contains all vertices adjacent to $x_{i+1}$ except $x_{i+2}$. In particular, $U$ contains $x_i$. Assume further that $U'$ is such that $U = U' \cup \{x_i\}$. Then the equality in Lemma 13 can be used to prove the inequality in this lemma.

$$W(x_{i+1}, x_{i+2}) = \omega_n(x_{i+1}) + W(x_i, x_{i+1}) + \sum_{u \in U'} W(u, x_{i+1}) \geq W(x_i, x_{i+1}) \qquad \Box$$

The above lemma has an immediate important implication to the cost function $C$.


Theorem 16 Let $p = \langle x_0, \ldots, x_k, \ldots, x_n \rangle$ be any path in $T$ with $k$ such that $W(x_{k-1}, x_k) < W_T/2$ and $W(x_k, x_{k+1}) \geq W_T/2$. Then the objective cost function $C$ is monotonic non-increasing on the path $\langle x_0, \ldots, x_k \rangle$ and monotonic non-decreasing on the path $\langle x_k, \ldots, x_n \rangle$.

Proof. Let $\Delta C(x_i, x_{i+1}) = C(x_{i+1}) - C(x_i)$. With Equation (12) it follows that $\Delta C(x_i, x_{i+1}) = \omega_e(x_i, x_{i+1}) \cdot (2 \cdot W(x_i, x_{i+1}) - W_T)$ for $i \in 0 \ldots n-1$. Then $\Delta C \leq 0$ on $\langle x_0, \ldots, x_k \rangle$ and $\Delta C \geq 0$ on $\langle x_k, \ldots, x_n \rangle$. $\Box$

11.3.2 Single Median

Theorem 16 and the above lemmas allow a characterization of a median with respect to the accumulated weight function.

Lemma 17 If $W(u, v) > W(v, u)$ then the partition $P(v, u)$ does not contain a median.

Proof. If $W(u, v) > W(v, u)$ then it follows with Theorem 11 that $C(v) > C(u)$. Furthermore, $C$ is non-decreasing on all paths $\langle u, v, \ldots \rangle$ which together cover the entire partition $P(v, u)$. $\Box$

Corollary 18 If $W(u, v) \leq W_T/2$ then the partition $P(v, u)$ contains at least one median.

Proof. Immediate using the facts that (1) there exists at least one median, and (2) $W(u, v) \leq W_T/2$ if and only if $W(u, v) \leq W(v, u)$. $\Box$

The following theorem presents a necessary and sufficient condition for a node to be a median.

Theorem 19 (Median Theorem) A node $m$ is a median if and only if for every adjacent node $v$ of $m$ it holds that $W(m, v) \geq W(v, m)$.

Proof. Suppose, first, that node $m$ is a median. Then $\forall u \in V : C(m) \leq C(u)$. In particular, this statement holds for every node $v$ adjacent to $m$. Using $C(v) = C(m) + \omega_e(m, v) \cdot (W(m, v) - W(v, m))$ from Theorem 11, it follows that $W(m, v) \geq W(v, m)$.

Now, suppose that $W(m, v) \geq W(v, m)$ holds for each adjacent node $v$ of node $m$. This implies that $W(m, v) \geq W_T/2$. With Theorem 16 it follows that $C$ is monotonic non-decreasing along any path $p = \langle m, v, \ldots \rangle$. Since all these paths cover the entire tree $T$, it follows that $C$ is at a global minimum and thus $m$ must be a median. $\Box$
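The Median Theorem can be checked on a small example. The Python sketch below computes the accumulated weight $W$ for every directed edge with one tree traversal, selects the nodes that satisfy the local condition of the theorem, and compares them with a brute-force evaluation of the cost $C(v) = \sum_x \omega_n(x) \cdot d(x, v)$. The adjacency-dictionary representation, the function names, and the example tree are assumptions made for this illustration.

```python
def accumulated_weights(adj, node_w):
    """W[(u, v)] = total node weight on u's side once edge (u, v) is cut."""
    total = sum(node_w.values())
    root = next(iter(adj))
    parent, order, stack = {root: None}, [], [root]
    while stack:                       # preorder walk: parents before children
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                stack.append(v)
    subtree = dict(node_w)
    W = {}
    for u in reversed(order):          # children are handled before parents
        p = parent[u]
        if p is not None:
            W[(u, p)] = subtree[u]
            W[(p, u)] = total - subtree[u]
            subtree[p] += subtree[u]
    return W


def medians_by_theorem(adj, node_w):
    """Nodes m with W(m, v) >= W(v, m) for every neighbor v."""
    W = accumulated_weights(adj, node_w)
    return {m for m in adj if all(W[(m, v)] >= W[(v, m)] for v in adj[m])}


def cost(adj, node_w, target):
    """Brute-force objective: sum of node_w[x] * d(x, target)."""
    dist, stack = {target: 0}, [target]
    while stack:
        u = stack.pop()
        for v, edge_w in adj[u].items():
            if v not in dist:
                dist[v] = dist[u] + edge_w
                stack.append(v)
    return sum(node_w[x] * dist[x] for x in adj)


# Hypothetical example tree: node -> {neighbor: edge weight}.
adj = {
    "a": {"b": 2},
    "b": {"a": 2, "c": 1},
    "c": {"b": 1, "d": 5, "e": 3},
    "d": {"c": 5},
    "e": {"c": 3},
}
node_w = {"a": 1, "b": 2, "c": 1, "d": 3, "e": 1}

found = medians_by_theorem(adj, node_w)
costs = {x: cost(adj, node_w, x) for x in adj}
brute = {x for x, c in costs.items() if c == min(costs.values())}
```

For this tree both approaches single out node c. Note that `medians_by_theorem` never looks at the edge weights, only at the accumulated node weights.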


Similar forms of the Median Theorem have been presented earlier in [KH79] and [KRS84], obtained from different considerations. Notice that the characterization of the median depends on $W$ alone. Thus it proves a surprising fact: the location of the median in a tree is independent of the weight function on edges as long as the function is positive. This fact was first observed by Hakimi [KH79] and later by Karaata et al. [KPBG94].

Furthermore, the above theorems prove that the local minima of the objective function are also global minima in a tree. This has an important implication for the search for medians. It means that a local search optimization method may be utilized to find the median [PS82, Chapter 19]. In addition, since from any one point in the tree there is a unique path to the minimum, it follows that at any one point there is only one path that lowers the cost. The criterion to find this path can be deduced from Theorem 16. Assume $u$ is the current point of the search and $v$ a neighbor of $u$. Then the median will be found in the direction of $v$ if $W(u, v) < W_T/2$. It follows from Lemma 14 that there exists at most one such node $v$. This criterion can thus be used to do a descent towards the median.
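The descent criterion can be sketched as follows. The Python code below is a minimal illustration, assuming an unweighted adjacency list (edge weights play no role in locating the median) and computing $W(u, v)$ recursively as the node weight of $u$ plus the accumulated weights arriving from all other neighbors of $u$, which is how Lemma 13 is used in the proofs above. The function names and the example tree are hypothetical.

```python
from functools import lru_cache


def make_accumulated_weight(adj, node_w):
    """Return a memoized function W(u, v): weight on u's side of edge (u, v)."""
    @lru_cache(maxsize=None)
    def W(u, v):
        return node_w[u] + sum(W(x, u) for x in adj[u] if x != v)
    return W


def descend_to_median(adj, node_w, start):
    """Follow the unique neighbor v with W(u, v) < W_T / 2 until none exists."""
    W = make_accumulated_weight(adj, node_w)
    total = sum(node_w.values())
    u, path = start, [start]
    while True:
        candidates = [v for v in adj[u] if 2 * W(u, v) < total]
        if not candidates:
            return u, path             # u satisfies the median condition
        u = candidates[0]              # at most one candidate exists
        path.append(u)


# Hypothetical example tree (neighbors only).
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d", "e"], "d": ["c"], "e": ["c"]}
node_w = {"a": 1, "b": 2, "c": 1, "d": 3, "e": 1}

median_from_a, path_from_a = descend_to_median(adj, node_w, "a")
median_from_d, path_from_d = descend_to_median(adj, node_w, "d")
```

Starting from either leaf, the search walks straight to node c without ever backtracking, because each step moves to the heavier side of the edge just crossed.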

11.3.3 Multiple Medians

Based on the Median Theorem 19 and the Cost Theorem 11, the following results can be proved for the existence and the structure of multiple medians.

Lemma 20 Let $u$ and $v$ be two adjacent vertices in $T$. Then both $u$ and $v$ are medians if and only if $W(u, v) = W(v, u)$.

Proof. Suppose, first, that $u$ and $v$ are both medians. This implies that $C(u) = C(v)$. Using $C(v) = C(u) + \omega_e(u, v) \cdot (W(u, v) - W(v, u))$ from Theorem 11, it follows that $W(u, v) = W(v, u)$.

Now suppose that $W(u, v) = W(v, u)$. This implies that $W(u, v) = W(v, u) = W_T/2$. With Theorem 16 it follows that $C(x)$ is monotonic non-decreasing on all paths $p = \langle u, v, \ldots \rangle$ and all paths $p = \langle v, u, \ldots \rangle$. Since all these paths cover the entire tree $T$, $C$ is at a global minimum at $u$ and $v$, and thus both are medians. $\Box$

Theorem 21 (Multiple Median Theorem) The set of medians $M$ of the tree $T$ has the following structure.

1. $M$ is connected, inducing a subtree $T_M$ of $T$.
2. If $W_T = 0$ then $T_M = T$; otherwise $T_M$ has maximum degree not more than 2 (all medians are on one line).
3. All nodes of $T_M$ with degree 2 have weight 0.


Proof. First, connectivity is proved. Pick an arbitrary median, say $m$, from the set of medians. With the previous Lemma 20, each of its neighbors $v$ is a median only if $W(m, v) = W(v, m)$. If that is not the case then $W(m, v) > W(v, m)$ according to the Median Theorem. Using Lemma 17 it follows that the partition $P(v, m)$ contains no other medians. On the other hand, if a neighbor $v$ is also a median then, obviously, $m$ and $v$ are connected, and furthermore, the same argumentation can be applied recursively for $v$.

Second, if $W_T = 0$ then $C(v) = 0$ for all nodes and thus $M = V$. Now, suppose $W_T > 0$ and, for the sake of contradiction, suppose $T_M$ has maximum degree higher than 2. Let node $m$ exhibit that maximum degree. Then $m$ must have at least three neighboring medians, say $x$, $y$ and $z$. According to Lemma 20, $W(x, m) = W(y, m) = W(z, m) = W_T/2$. But then the total weight of the tree must be $W_T \geq \omega_n(m) + W(x, m) + W(y, m) + W(z, m) \geq 3/2 \cdot W_T$, a contradiction since $W_T > 0$. Thus, the maximal degree of $T_M$ is at most 2.

Third, if $W_T = 0$ then the statement is trivially true. Now, suppose $W_T > 0$ and $T_M$ consists of the vertices $m_1, m_2, \ldots, m_n$ arranged on a line in that order. Using Lemma 13, it follows for $1 < i < n$:

$$W(m_i, m_{i+1}) = \omega_n(m_i) + W(m_{i-1}, m_i) + \sum_{u \in U'} W(u, m_i),$$

where $U'$ contains all other neighbors of $m_i$ in $T$. Since $W(m_i, m_{i+1}) = W(m_{i-1}, m_i)$, this implies that

$$\omega_n(m_i) + \sum_{u \in U'} W(u, m_i) = 0$$

and thus, since $W \geq 0$, it follows that $\omega_n(m_i) = 0$. Note that the above equation also proves that any branch of $m_i$ that does not contain a median must have accumulated weight zero. $\Box$

This result generalizes the one commonly found for trees with positive or unit node weight which says that a tree has a single or two adjacent medians. An immediate corollary of the theorem is the following classical property of the median. An alternative proof of this can be found in [BH90] for general graphs.

Corollary 22 If weights on edges and nodes are positive then there exist either a single or two adjacent medians in a tree.

Proof. Immediate from point (3) in Theorem 21. $\Box$

For the classical median problem where all weights are 1, two medians can only exist when the total number of nodes is even.

Corollary 23 If the weight function $\omega_n$ is uniformly equal to 1, then multiple median nodes can only exist if the number of nodes is even.

Proof. If the weight function $\omega_n$ is unity then $W(u, v)$ represents the number of nodes in partition $P(u, v)$. The rest of the proof follows immediately from the theorem. $\Box$

11.3.4 Summary

This section has analyzed the location of medians and provided a sufficient and necessary local condition for a median. For positive edge weight function $\omega_e$ and nonnegative node weight function $\omega_n$, the medians in a tree are completely determined by the accumulated weight function $W$. If there are multiple medians then they are all localized on one line with the interior nodes having weight zero. The objective cost function is such that local minima are global minima on the tree. Intuitively, the shape of the objective function $C$ is convex ("bowl-like") such that on all paths in the tree $C$ is first decreasing and then increasing.

11.4 Median in a Dynamic Tree

In a dynamic tree, the median problem consists of keeping track of the median and its cost while the tree is evolving. Given a tree and its median(s), one wants to calculate the new median(s) and its cost after a node has been added or deleted, or a weight has been changed. The goal of a dynamic (on-line) median algorithm is to recompute the new median faster than the non-dynamic (off-line) algorithm. To achieve this goal, one needs to exploit the knowledge about the previous solution, thus gaining efficiency by not having to recompute the median "from scratch". We will see that the Median Theorem can be used to build a fully dynamic algorithm that recomputes the median with a lower time bound than the static algorithm.

The supported procedures on the tree $T$ are: InsertionUpdate($v$), adding a leaf node $v$; DeletionUpdate($v$), deleting a leaf node $v$; NodeWeightUpdate($v, \delta$), changing the weight of node $v$ by $\delta$; EdgeWeightUpdate($v, u, \delta$), changing the weight of edge $(v, u)$ by $\delta$; Medians(), returning the set of medians of the tree $T$; and MinCost(), returning the minimal cost of the median of $T$.

For the tree transformations, let $T = (V, E, \omega_n, \omega_e)$ be the original tree with nonnegative node weight function $\omega_n$, positive edge weight function $\omega_e$, and positive total weight $W_T$. Furthermore, assume for the original tree $T$ the following entities are known: the accumulated weight function $W$ for all edges of $T$, the total weight $W_T$ of $T$, the set of original median nodes $M$, and the minimal cost $C_{min} = C(m)$ for $m \in M$. The initial computation of these entities requires $O(V)$ time using the algorithms presented in Section 11.3. For each of the entities, the primed ($'$) version will denote the changed entity after the modification. The dynamic tree algorithms maintain the accumulated


weight function $W$, the set of median nodes $M$, and the minimal cost $C_{min}$. Thus Medians() simply returns $M$, and MinCost() returns $C_{min}$.

The dynamic tree algorithms take advantage of the following supporting data structure. Superimposed on the original tree is a directed tree $\tilde{T} = (V, \tilde{E})$, where $\tilde{E} = \{(u, v) \mid W(u, v) < W(v, u)\}$. $\tilde{T}$ will be used to efficiently find the way to the nearest median. Lemma 14 guarantees that for each node $u \in V$ there is at most one outgoing edge in $\tilde{T}$. Moreover, Lemma 17 ensures that the edge points towards the median. Finally, from Theorem 19 it follows that there are no edges outgoing from medians. The implementation of $\tilde{T}$ utilizes one additional variable per node. The set of medians is maintained in a singly linked list such that the neighbor relation of medians in $T$ is preserved in the list. This is always possible according to Theorem 21. The linked list allows all nodes in $M$ to be directed in $\tilde{T}$ towards some node $m \in M$ in $O(M)$ time.

11.4.1 Insertion of a Node

The insertion of a node adds to $T$ a new leaf node $v$ with weight $\omega_n'(v)$, connecting it to an existing node $u$ through a new edge $(v, u)$ with edge weight $\omega_e'(v, u)$. The new tree $T' = (V', E', \omega_n', \omega_e')$ can be characterized as follows.

$V' = V \cup \{v\}$, $E' = E \cup \{(v, u)\}$   (30)

$\omega_n'|_V = \omega_n$, $\omega_e'|_E = \omega_e$, $W_T' = W_T + \omega_n'(v)$   (31)

Assume $(x, y) \in E$ is such that there is a path $\langle v, \ldots, x, y \rangle$. Then the following holds for the new values of the accumulated weight function.

$W'(x, y) = W(x, y) + \omega_n'(v)$, $W'(y, x) = W(y, x)$   (32)

$W'(u, v) = W_T$, $W'(v, u) = \omega_n'(v)$   (33)

The value of the accumulated weight function is unchanged on edges pointing towards the newly added node and increased by $\omega_n'(v)$ on edges pointing away. The following lemma restricts the locations of the new median(s).

Lemma 24 If $\omega_n'(v) = 0$ then $M' = M$; if $\omega_n'(v) > 0$ then a new median can only be located on the path from $v$ to the nearest old median $m$.

Proof. It is straightforward to see that for $\omega_n'(v) = 0$ nothing changes. Now, suppose that $\omega_n'(v) > 0$ and $x \notin M$ is not on the path from $v$ to $m$. Assume further that $z, y_1, \ldots, y_n$ are the neighbors of $x$, where $z$ is the one closest to $v$ (and $m$). Since $x$ was not a median, it holds that $W(x, y_i) > W(y_i, x)$ $(1 \leq i \leq n)$ and $W(x, z) < W(z, x)$. From Equation 32 we conclude that $W'(x, y_i) = W(x, y_i) + \omega_n'(v)$ and $W'(y_i, x) = W(y_i, x)$ $(1 \leq i \leq n)$, and also $W'(x, z) = W(x, z)$ and $W'(z, x) = W(z, x) + \omega_n'(v)$. Thus, it follows that $W'(x, y_i) > W'(y_i, x)$ $(1 \leq i \leq n)$ and $W'(x, z) < W'(z, x)$, which, with Lemma 17, concludes the proof. $\Box$

As a consequence of the above lemma, new medians have to be searched for on the path to an old median only. The general criterion for a new median is the same as the one in the Median Theorem 19. However, in the dynamic tree case, it can be reduced to comparing the accumulated weight function of on-path edges only.

Theorem 25 In the new tree $T'$, a node $x$ is a median if and only if it is on the path from $v$ to an old median $m$ and it holds that $W'(x, z) \geq W_T'/2$ and $W'(x, y) \geq W_T'/2$, where $z$ and $y$ are the successor and predecessor nodes of $x$ on the path, if any.

Proof. Suppose, first, that node $x$ is a median in $T'$. With Lemma 24 it follows that $x$ either is an old median or must be on the path from $v$ to the median closest to $v$. Furthermore, with the Median Theorem 19 it follows that $W'(x, z) \geq W'(z, x)$ and $W'(x, y) \geq W_T'/2$. This proves the "only if" part.

Now, suppose that $x$ is on the path from $v$ to an old median $m$ and it holds that $W'(x, z) \geq W'(z, x)$ and $W'(x, y) \geq W_T'/2$, where $z$ and $y$ are the successor and predecessor nodes of $x$ on the path, if any. Let $w$ be a neighbor of $x$ that is not on the path. Then we know that $W(x, w) > W(w, x)$. With Equation 32 it follows that also $W'(x, w) > W'(w, x)$ holds. Hence, $x$ satisfies the conditions of the Median Theorem 19. $\Box$

The above condition lets us find the new median in the new tree. It requires that the new accumulated weight function is known on the path from $v$ to $m$. These values can be computed with the following update mechanism for $W$, which basically uses Equation 32.

Insertion Update Mechanism.

1. $W_T' = W_T + \omega_n'(v)$
2. $W'(v, u) = \omega_n'(v)$; $W'(u, v) = W_T' - W'(v, u) = W_T$;
3. Let $m$ be the median closest to $v$. Apply Equation 32 to all edges on the path $p(u, m) = \langle x_0, \ldots, x_n \rangle$, where $x_0 = u$ and $x_n = m$. For $0 \leq i \leq n-1$: $W'(x_i, x_{i+1}) = \omega_n'(v) + W(x_i, x_{i+1})$; $W'(x_{i+1}, x_i) = W_T' - W'(x_i, x_{i+1})$;
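The three steps of the insertion update mechanism, followed by the on-path median test of Theorem 25, can be sketched in Python as below. This is only an illustration under simplifying assumptions: the path from $u$ to the nearest old median is passed in explicitly instead of being read off the directed tree, the table `W` is a plain dictionary, and `exact_W` is a hypothetical reference routine used solely to check the updated values.

```python
def exact_W(adj, node_w, u, v):
    """Reference value: weight on u's side once edge (u, v) is cut."""
    return node_w[u] + sum(exact_W(adj, node_w, x, u) for x in adj[u] if x != v)


def insertion_update(adj, node_w, W, total, v, u, weight, path):
    """Insert leaf v with node weight `weight` under u; path = [u, ..., m]."""
    adj[v] = [u]
    adj[u].append(v)
    node_w[v] = weight
    total += weight                                  # step 1
    W[(v, u)] = weight                               # step 2
    W[(u, v)] = total - weight
    for x, y in zip(path, path[1:]):                 # step 3 (Equation 32)
        W[(x, y)] += weight
        W[(y, x)] = total - W[(x, y)]
    return total


def new_medians_on_path(W, total, full_path):
    """On-path test: x qualifies if W'(x, y) >= W_T'/2 for its path neighbors."""
    result = []
    for i, x in enumerate(full_path):
        neighbors = full_path[max(i - 1, 0):i] + full_path[i + 1:i + 2]
        if all(2 * W[(x, y)] >= total for y in neighbors):
            result.append(x)
    return result


# Hypothetical example tree; its single old median is c.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d", "e"], "d": ["c"], "e": ["c"]}
node_w = {"a": 1, "b": 2, "c": 1, "d": 3, "e": 1}
total = sum(node_w.values())
W = {(x, y): exact_W(adj, node_w, x, y) for x in adj for y in adj[x]}

total = insertion_update(adj, node_w, W, total, "f", "a", 6, ["a", "b", "c"])
medians = new_medians_on_path(W, total, ["f", "a", "b", "c"])
```

After inserting the heavy leaf f, the medians move to a and b, and only the on-path entries of `W` have been touched. Entries such as `W[("c", "d")]` are deliberately left stale.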


For a single update, the new median can be found correctly using the above mechanism and Theorem 25. However, the goal is to find a mechanism that works repeatedly. The question is whether it suffices that the mechanism updates only the accumulated weight values of all on-path edges, leaving the values of all the other edges incorrect. To answer this question notice that the update mechanism uses only old weights of edges pointing towards the old median. Thus, a sufficient condition for the update mechanism to work correctly after repeated updates is that the accumulated weight values are correct on edges towards the old median. This condition is called the update invariant.

Update Invariant. The total weight $W_T$, and each accumulated weight $W(x, y)$ of an edge $(x, y)$ on a path to some node $m \in M$ is correct at any quiescent state of the tree.

Note that the update invariant does not require $W(y, x)$ to be correct. If it can be proved that the update mechanism preserves the update invariant then it is enough to simply perform the mechanism without updating all accumulated weight values. Refer to Figure 37 for an illustration.

[Figure 37: An illustration of the insertion update mechanism]

Theorem 26 The insertion update mechanism preserves the update invariant.

Proof. Assume the update invariant holds before the insertion. After applying the update mechanism, it follows from Equation 32 that all the accumulated weights on the path from $v$ to $m$ and from $m$ to $v$ will be correctly updated. In particular, since the new median is on this path, the accumulated weights of those edges leading to the new median will be correct. Furthermore, all other edges $(y, x)$ on paths to the new median (including possible paths in $M$) are such that there is no path $\langle v, \ldots, y, x \rangle$ and therefore $W'(y, x) = W(y, x)$ by Equation 32. Thus, these edges, too, remain correct, and the update invariant holds after the update procedure has been executed. This proves the theorem. $\Box$


Once a new median $m^*$ is found, the new value $C_{min}' = C(m^*)$ of the objective function is computed as follows. The new cost of an old median $m$ is $C'(m) = C_{min} + \omega_n'(v) \cdot d(v, m)$. The cost of $m^*$ can then be computed using Theorem 11, starting from $m$ as a seed.

The InsertionUpdate algorithm maintains the median after a node insertion by implementing the insertion update mechanism, and by using Theorem 25 to find the new medians. In addition, it computes the new objective function value of the new median and maintains the directed tree $\tilde{T}$. It is assumed that $\omega_n'(v) > 0$; if $\omega_n'(v) = 0$ then nothing has to be done.

InsertionUpdate($v$)

1. Initialization: Compute new total weight $W_T$.
2. Forward phase: Starting from $v$, traverse $\tilde{T}$ to the nearest old median $m$, computing the distance $d(v, m)$. On the path, (1) adjust accumulated weights, (2) find new medians $M'$ (setting their links in $\tilde{T}'$ to nil), and (3) adjust other links for $\tilde{T}'$. At this point, $\tilde{T}'$ is established.
3. Old median phase: For all medians in old $M$, establish new links in $\tilde{T}$ towards $m$. Compute new cost of old median.
4. Backward phase: Starting from $m$, traverse $\tilde{T}'$ to the nearest new median $m'$, computing $C_{min}'$.

Analyzing InsertionUpdate, initialization takes time $O(1)$. Finding the old median is $O(d(v, m))$, where $d$ denotes the unit distance. The processing done at each node on the path is $O(1)$, so the worst case time for the forward phase is $O(D)$, where $D$ is the diameter of the tree. Adjusting the old medians takes $O(M)$ time, which is also $O(D)$ in the worst case. Finally, finding the new median $m'$ from the old median $m$ is $O(d(m, m'))$ using the new directed tree $\tilde{T}'$. Thus, the backward phase is also $O(D)$. In total, InsertionUpdate takes time $O(D)$ in the worst case.

11.4.2 Deletion of a Node

The deletion of a node removes from $T$ a leaf node $v$ connected to node $u$, together with the connecting edge $(v, u)$. The weight functions are unchanged on the remaining nodes and edges. The new tree $T' = (V', E', \omega_n', \omega_e')$ can be characterized as follows.

$V' = V - \{v\}$, $E' = E - \{(v, u)\}$   (34)

$\omega_n' = \omega_n|_{V'}$, $\omega_e' = \omega_e|_{E'}$, $W_T' = W_T - \omega_n(v)$   (35)

Assume $(x, y) \in E'$ is such that there is a path $\langle x, y, \ldots, u \rangle$ in $T'$. Then the following holds for the new values of the accumulated weight function.

$W'(x, y) = W(x, y)$, $W'(y, x) = W(y, x) - \omega_n(v)$   (36)

The value of the accumulated weight function is unchanged on edges pointing towards the deleted node and decreased by $\omega_n(v)$ on edges pointing away. The location of the new median is less restricted than in the insertion case. For example, consider the case where the deleted node had such a large weight that it overshadowed all other nodes and made it the lone median. Deleting this node reveals the median of the overshadowed nodes, which could potentially be any one of them. If the deleted node was not a median then the following restriction applies.

Lemma 27 If $v$ is not a median and $\langle v, \ldots, w, m \rangle$ is the path from $v$ to its nearest median $m$, then the new median is in $P(m, w)$.

Proof. Since $m$ is the nearest median, $w$ cannot be a median, and thus from the Median Theorem 19 it follows that $W(m, w) > W(w, m)$. According to Equation 36, $W'(m, w) = W(m, w)$ and $W'(w, m) = W(w, m) - \omega_n(v)$. Thus, it also holds that $W'(m, w) > W'(w, m)$. Using Lemma 17 it follows immediately that $P(w, m)$ cannot contain a median. $\Box$

The search for the new median can thus be started from the old median (or from $u$ if $v$ was the old median), guided by Lemma 17 until the condition of the Median Theorem 19 is met (descent). The accumulated weights are first updated on the path from $u$ to $m$. Second, on the way to the new median, the accumulated weights of edges leading to the new median are updated so that the update invariant remains satisfied.

Deletion Update Mechanism.

1. $W_T' = W_T - \omega_n(v)$
2. Let $m$ be the nearest median of $u$. For $0 \leq i \leq n-1$, apply Equation 36 to the path $p(u, m) = \langle x_0, \ldots, x_n \rangle$, where $x_0 = u$ and $x_n = m$: $W'(x_i, x_{i+1}) = W(x_i, x_{i+1}) - \omega_n(v)$;
3. Starting from $m$, traverse the tree in the direction $(x, y)$, where $W'(y, x) > W_T'/2$, updating the accumulated weight on traversed edges as follows: $W'(x, y) = W_T' - W(y, x)$;

The deletion update mechanism uses only accumulated weight values of edges leading towards the old median. Thus, its computations are (repeatedly) correct provided the update invariant is preserved.
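The deletion update mechanism can be sketched in the same style. The Python code below is a hedged illustration, not the DeletionUpdate algorithm itself: the path from $u$ to the nearest old median is supplied by the caller, links of the directed tree and the cost are not maintained, and `exact_W` is a hypothetical reference routine used only for checking.

```python
def exact_W(adj, node_w, u, v):
    """Reference value: weight on u's side once edge (u, v) is cut."""
    return node_w[u] + sum(exact_W(adj, node_w, x, u) for x in adj[u] if x != v)


def deletion_update(adj, node_w, W, total, v, u, path):
    """Delete leaf v attached to u; path = [u, ..., m] to the nearest old median.

    Returns the new total weight and one new median found by descent.
    """
    weight = node_w.pop(v)
    adj[u].remove(v)
    del adj[v], W[(v, u)], W[(u, v)]
    total -= weight                                  # step 1
    for x, y in zip(path, path[1:]):                 # step 2 (Equation 36)
        W[(x, y)] -= weight
    x = path[-1]                                     # step 3: descend from m
    while True:
        heavier = [y for y in adj[x] if 2 * W[(y, x)] > total]
        if not heavier:
            return total, x
        y = heavier[0]
        W[(x, y)] = total - W[(y, x)]
        x = y


# Hypothetical example tree; with the heavy leaf h present, b is the median.
adj = {"h": ["a"], "a": ["h", "b"], "b": ["a", "c"],
       "c": ["b", "d", "e"], "d": ["c"], "e": ["c"]}
node_w = {"h": 4, "a": 1, "b": 2, "c": 1, "d": 3, "e": 1}
total = sum(node_w.values())
W = {(x, y): exact_W(adj, node_w, x, y) for x in adj for y in adj[x]}

total, new_median = deletion_update(adj, node_w, W, total, "h", "a", ["a", "b"])
```

Removing h shifts the median from b to c. The edges pointing towards c carry correct values afterwards, which is exactly what the update invariant asks for, while edges pointing away from c may remain stale.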


[Figure 38: An illustration of the deletion update mechanism]

Theorem 28 The deletion update mechanism preserves the update invariant.

Proof. Assume the update invariant holds before the deletion. After applying step 2 of the deletion update mechanism, it follows from Equation 36 and the update invariant that all the accumulated weights on the path from $u$ to $m$ will be correctly updated. In particular, the accumulated weights of those edges leading to the new median will be correct. It remains to update all edges from $m$ to the new median. This is done in step 3 of the update mechanism. The accumulated weight $W'(x, y)$ on the path from $m$ to the new median is correctly updated since, according to Equation 36, $W'(y, x) = W(y, x)$. All other edges $(s, t)$ on paths to the new median are such that there is a path $\langle s, t, \ldots, u \rangle$ in $T'$ and therefore $W'(s, t) = W(s, t)$, again by Equation 36. Thus, these edges, too, remain correct, and the update invariant holds after the update procedure has been executed. This proves the theorem. $\Box$

Once a new median $m^*$ is found, the new value $C_{min}' = C(m^*)$ of the objective function is computed as follows. The new cost of an old median $m$ is $C'(m) = C_{min} - \omega_n(v) \cdot d(v, m)$. The cost of $m^*$ can then be computed using Theorem 11, starting from $m$ as a seed.

The DeletionUpdate algorithm maintains the median after a leaf node deletion by implementing the deletion update mechanism, and by using the Median Theorem 19 to find the new median(s). In addition, it computes the new objective function value of the new median(s) and maintains the directed tree $\tilde{T}$. It is assumed that $\omega_n(v) > 0$; if $\omega_n(v) = 0$ then nothing has to be done.

DeletionUpdate($v$)

1. Initialization: Compute new total weight $W_T$.


2. Forward phase I: Starting from $v$, traverse $\tilde{T}$ to the nearest old median $m$, adjusting all weights on the path and computing the distance $d(v, m)$.
3. Old median phase: For all medians in old $M$, establish new links in $\tilde{T}$ towards $m$. Compute new cost of old median.
4. Forward phase II: Starting from $m$, find the nearest new median $m'$, using the descent method. On the path, (1) adjust accumulated weights, (2) adjust the links towards the new median for $\tilde{T}'$, and (3) compute $C_{min}'$.
5. New median phase: Find all new medians $M'$ starting from $m'$, adjusting their links in $\tilde{T}'$ to nil.

Analyzing DeletionUpdate, initialization takes time $O(1)$. Finding the old median is $O(d(v, m))$. Updating the weights at each node on the path is $O(1)$. Adjusting the old medians takes $O(M)$ time. Finally, finding the new median $m'$ from the old median $m$ traverses $O(d(m, m'))$ nodes, and finding all new medians traverses $O(M')$ nodes. As a consequence of Theorem 21, nodes in $M$ and $M'$ must be on the path from $v$ to the new median that is furthest away from $v$, say $m''$. So the total number of nodes traversed is $O(D)$. The path from $v$ to $m''$ is found with the descent method which needs to explore, in the worst case, all edges on the adjacency list of each node on the path. Thus, if $\Delta$ is the maximum degree of nodes in $T$, then the worst case running time of DeletionUpdate is $O(D \cdot \Delta)$.

11.4.3 Change of Node Weight

The change of node weight transformation changes $T$ into $T' = (V, E, \omega_n', \omega_e)$ such that $\omega_n$ is updated for a given node $v$ and remains unchanged for all other nodes in $V$. The value $\omega_n'(v)$ must be positive as a precondition.

The update mechanism for this tree modification proceeds similarly as in the previous cases. Let $\delta = \omega_n'(v) - \omega_n(v)$. If $\delta$ is positive then the update procedure follows the one for node insertion. The new median can only be located on the path from the updated node to the nearest old median. This can be proved with similar arguments as in Lemma 24. The core of the InsertionUpdate algorithm can be applied as is to update the tree, with the exception that $\delta$ is used instead of $\omega_n'(v)$ to update the accumulated weight function. If $\delta$ is negative, the procedure follows the one for node deletion. Again, similar arguments as in Theorem 28 hold, and the core of the DeletionUpdate algorithm can be used to update the tree.

The algorithm NodeWeightUpdate thus discriminates on the sign of $\delta$ and proceeds similarly to either InsertionUpdate or DeletionUpdate. The running time is $O(D)$ for $\delta > 0$, and


$O(D \cdot \Delta)$ for $\delta < 0$. In either case, NodeWeightUpdate preserves the update invariant.

11.4.4 Change of Edge Weight

The change of edge weight transformation changes $T$ into $T' = (V, E, \omega_n, \omega_e')$ such that $\omega_e$ is updated for a given edge $(u, v)$ and remains the same for all other edges in $E$. The value $\omega_e'(u, v)$ must be nonnegative as a precondition. It follows immediately from the Median Theorem 19 that the set of medians $M$ is unchanged by this transformation since the location of a median is independent of the edge weight function. In addition, it is clear that $W' \equiv W$. Thus, the change of edge weight preserves the update invariant.

The value of the objective function at the median, however, does change. The new value $C_{min}'$ of the objective function can be computed from the original value $C_{min}$ as follows. Let $\delta = \omega_e'(u, v) - \omega_e(u, v)$ be the difference between the new and the old edge weight, and let $m$ be some median in $M$. If $m \in P(u, v)$ then $C(m)$ is increased by the total weight in $P(v, u)$ times the edge weight difference.

Lemma 29 If the weight of edge $(u, v)$ is changed by the amount $\delta$, then the objective function is changed by $\delta \cdot W(u, v)$ for all nodes $w \in P(v, u)$.

Proof. Let $L = P(u, v)$ and $R = P(v, u)$, and assume $w \in R$.

$$\begin{aligned}
C'(w) &= \sum_{x \in V} \omega_n(x)\, d'(x, w) = \sum_{x \in L} \omega_n(x)\, d'(x, w) + \sum_{x \in R} \omega_n(x)\, d'(x, w) \\
&= \sum_{x \in L} \omega_n(x) [d'(x, u) + d'(u, v) + d'(v, w)] + \sum_{x \in R} \omega_n(x)\, d'(x, w) \\
&= \sum_{x \in L} \omega_n(x) [d(x, u) + \omega_e(u, v) + \delta + d(v, w)] + \sum_{x \in R} \omega_n(x)\, d(x, w) \\
&= \sum_{x \in L} \omega_n(x) [d(x, u) + \omega_e(u, v) + d(v, w)] + \sum_{x \in L} \omega_n(x) \cdot \delta + \sum_{x \in R} \omega_n(x)\, d(x, w) \\
&= \sum_{x \in L} \omega_n(x)\, d(x, w) + \sum_{x \in R} \omega_n(x)\, d(x, w) + \delta \cdot \sum_{x \in L} \omega_n(x) \\
&= C(w) + \delta \cdot W(u, v)
\end{aligned}$$

Above, the fact was used that the distance between two nodes remains the same if the path between them does not contain $(u, v)$. $\Box$

Corollary 30 If the weight of edge $(u, v)$ is changed by the amount $\delta$ then the objective function value of the median is changed by $\delta \cdot \min\{W(u, v), W(v, u)\}$.

Proof. If $m$ is in $P(u, v)$ then the previous lemma yields $C_{min}' - C_{min} = \delta \cdot W(v, u)$, and if $m$ is in $P(v, u)$ then $C_{min}' - C_{min} = \delta \cdot W(u, v)$. Since $W(u, v) > W(v, u)$ implies $m \in P(u, v)$ (Lemma 17), the two cases can be combined to $C_{min}' - C_{min} = \delta \cdot \min\{W(u, v), W(v, u)\}$. $\Box$

Note that $\min\{W(u, v), W(v, u)\}$ is the accumulated weight value of the edge pointing towards the median. The update invariant guarantees the correctness of that value at all times. Using the directed tree $\tilde{T}$ it is easy to find the direction towards the median. The EdgeWeightUpdate algorithm below implements this.

Algorithm 6 EdgeWeightUpdate($u, v, \delta$)

    if $(u, v) \in E(\tilde{T})$ then
        $C_{min} \leftarrow C_{min} + \delta \cdot W(u, v)$;
    else
        $C_{min} \leftarrow C_{min} + \delta \cdot W(v, u)$;
    return $C_{min}$;

Note that Cmin is the only thing that needs to be updated. Therefore, the running time of the EdgeWeightUpdate algorithm is O(1).
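The constant-time cost update can be checked numerically. The Python sketch below applies the formula of Corollary 30 to a small tree and compares the result with a brute-force recomputation of the median cost after the edge weight has been changed. The tree, the accumulated weight values for the changed edge, and the function names are assumptions made for this example.

```python
def cost(adj, node_w, target):
    """Brute-force objective: sum of node_w[x] * d(x, target)."""
    dist, stack = {target: 0}, [target]
    while stack:
        u = stack.pop()
        for v, edge_w in adj[u].items():
            if v not in dist:
                dist[v] = dist[u] + edge_w
                stack.append(v)
    return sum(node_w[x] * dist[x] for x in adj)


def edge_weight_update(c_min, W, u, v, delta):
    """C_min' = C_min + delta * min{W(u, v), W(v, u)}."""
    return c_min + delta * min(W[(u, v)], W[(v, u)])


# Hypothetical example tree with median c: node -> {neighbor: edge weight}.
adj = {
    "a": {"b": 2},
    "b": {"a": 2, "c": 1},
    "c": {"b": 1, "d": 5, "e": 3},
    "d": {"c": 5},
    "e": {"c": 3},
}
node_w = {"a": 1, "b": 2, "c": 1, "d": 3, "e": 1}
# Accumulated weights of the edge to be changed: d's side weighs 3, c's side 5.
W = {("c", "d"): 5, ("d", "c"): 3}

c_min = cost(adj, node_w, "c")
delta = 2
updated = edge_weight_update(c_min, W, "c", "d", delta)

# Apply the change to the tree and recompute from scratch for comparison.
adj["c"]["d"] += delta
adj["d"]["c"] += delta
recomputed = cost(adj, node_w, "c")
```

Only the weight on the far side of the edge, as seen from the median, is multiplied by the change, so lengthening the edge to d by 2 raises the median cost by 2 · 3 = 6.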

11.4.5 Summary

All of the above update mechanisms preserve the update invariant. This has the important consequence that the transformations may be applied in any order, mixing deletions, insertions, and weight changes. Thus, the presented algorithms constitute a complete set of fully dynamic tree algorithms to maintain the median in the tree.

11.5 Concluding Remarks

In this chapter, a complete theory of medians in a tree has been presented including the characterization of single and multiple medians. The study resulted in a number of efficient algorithms to find the median in both dynamic and static trees. A novel off-line algorithm was presented that finds the median in a tree in time $O(V)$. The salient features of the algorithm are that it computes the cost function for all nodes and works even when the weights are negative. The core of the chapter presented a set of fully dynamic algorithms that maintain the median and its cost during the evolution of the tree due to insertion and deletion of a node, as well as weight changes. The running time for these algorithms is $O(1)$ for an edge weight change, $O(D)$ for insertion, $O(D \cdot \Delta)$ for deletion, and a mixture of the latter two for a node weight change, where $D$ is the diameter


and $\Delta$ is the maximum degree of the tree. This result improves on the $O(V)$ time bound for static trees. Note that the dynamic algorithms are amenable to a distributed setting. All information is local and communication links are only along the edges of the tree. We are currently investigating a generalization of the algorithms to concurrent updates of the tree.

Chapter 12

Maintaining Behavior and Consistency (PP)

This chapter is the first in a series of three that apply the evolution framework developed in Chapter 3 to a specific programming language model. The chapter at hand studies evolution in the Adaptive Software model as presented in Chapter 9 when the capacity-preserving primitive transformations of Chapter 5 are applied. Chapter 13 studies evolution for CLOS with the same set of primitive transformations. Finally, Chapter 14 investigates evolution under C++.

This chapter is organized as follows. Section 12.1 defines a sample language for method bodies in propagation patterns. Section 12.2 presents a set of proof techniques to prove behavioral equivalence for the semantics of propagation patterns. The core Section 12.3 applies the evolution framework to the Adaptive Software model and a set of thirteen primitive schema transformations. For each of these transformations, the corresponding object and program transformations are defined, and a subsequent proof of consistency and behavioral equivalence is given. The chapter is concluded with a discussion of related and future work. This chapter represents joint work with Linda Keszenheimer [HK95].

12.1 Sample Language

Chapter 9 introduced adaptive object-oriented software, and defined the semantics for propagation patterns. The semantics was deliberately defined independently of any specific programming language $\mathcal{L}$ for method bodies. This demonstrated the broad applicability of the Adaptive Software concept. In order to investigate the behavior of propagation patterns during evolution, a specific language for method bodies needs to be supplied. The following language will serve as such a sample language. It will be used to demonstrate the mechanism and the components of the evolution framework.

Definition 25 (Label language) The label language $\mathcal{L}_3$ is a simple language to express method bodies that can print arbitrary strings and values of primitive objects (attributes), and make calls to other propagation patterns. Each method body consists of a sequence of expressions where an expression is either a call to another propagation pattern named $N$, a label $\ell$, or a string. Syntactic correctness of method bodies in $\mathcal{L}_3$ is defined by the following grammar.

$$\begin{array}{lcl}
\langle \mathit{MethodBody} \rangle & ::= & \langle \mathit{ExpSequence} \rangle \\
\langle \mathit{ExpSequence} \rangle & ::= & \langle \mathit{Exp} \rangle \; ; \; \langle \mathit{ExpSequence} \rangle \;\mid\; \epsilon \\
\langle \mathit{Exp} \rangle & ::= & N() \;\mid\; \ell \;\mid\; \mathit{String}
\end{array}$$

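The intention of executing a string expression $s$ is printing the string by appending it to the stream object within environment $\mathcal{E}$. The intention of executing a label expression $\ell$ is printing the value of the object referenced by $\ell$ by appending it to the stream object within $\mathcal{E}$. The intention of executing a propagation pattern call expression $N()$ is to call the corresponding propagation pattern $\mathcal{P}_N$ with the current environment. The exact semantics of the expressions is given below, where the value of the stream object within $\mathcal{E}$ is made explicit by the notation $[\mathcal{E}; \mathit{stream}]$.

For $\mathit{ExpSequence} = \mathit{Exp};\ \mathit{ExpSequence}'$:

$$\begin{array}{rcl}
\mathit{Execute}_{\mathcal{L}_3}(o, \mathit{ExpSequence}, [\mathcal{E}; \mathit{stream}]) & = & \mathit{Execute}_{\mathcal{L}_3}(o, \mathit{ExpSequence}', \mathit{Execute}_{\mathcal{L}_3}(o, \mathit{Exp}, [\mathcal{E}; \mathit{stream}])) \\
\mathit{Execute}_{\mathcal{L}_3}(o, \ell, [\mathcal{E}; \mathit{stream}]) & = & [\mathcal{E}; \mathit{stream} \cdot \mathit{EnvLookup}(o.\ell, \mathcal{E})] \\
\mathit{Execute}_{\mathcal{L}_3}(o, s, [\mathcal{E}; \mathit{stream}]) & = & [\mathcal{E}; \mathit{stream} \cdot s] \\
\mathit{Execute}_{\mathcal{L}_3}(o, N(), [\mathcal{E}; \mathit{stream}]) & = & \mathit{Call}(\mathcal{G}, \mathcal{P}_N, [\mathcal{E}; \mathit{stream}], o)
\end{array}$$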
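The execution rules above can be made concrete with a small interpreter sketch in Python. It is only an illustration under stated assumptions: the environment is modeled as a dictionary from (object, label) pairs to printable strings, the stream as a string, and the call to a propagation pattern is delegated to a caller-supplied function. The class names `Label`, `Str`, `CallPP` and the function `execute` are hypothetical and do not come from the thesis.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass(frozen=True)
class Label:      # label expression: print the value referenced by the label
    name: str


@dataclass(frozen=True)
class Str:        # string expression: print the literal text
    text: str


@dataclass(frozen=True)
class CallPP:     # call to the propagation pattern with the given name
    name: str


Exp = Union[Label, Str, CallPP]
Env = Dict[Tuple[str, str], str]     # (object id, label) -> printable value
State = Tuple[Env, str]              # the pair [E; stream]


def execute(o: str, body: List[Exp], state: State,
            call: Callable[[str, State, str], State]) -> State:
    """Run a method body on object o, threading [E; stream] left to right."""
    for exp in body:
        env, stream = state
        if isinstance(exp, Str):
            state = (env, stream + exp.text)
        elif isinstance(exp, Label):
            state = (env, stream + env[(o, exp.name)])   # EnvLookup(o.l, E)
        else:
            state = call(exp.name, state, o)             # Call(G, P_N, [E; stream], o)
    return state


def no_patterns(name, state, o):
    """Stand-in for Call when no propagation patterns are defined."""
    raise KeyError(name)


env = {("emp1", "name"): "Ada"}
result = execute("emp1", [Str("Name: "), Label("name"), Str(";")],
                 (env, ""), no_patterns)
```

The loop corresponds to the first rule (execute the head expression, then the rest of the sequence on the resulting state); the three branches correspond to the string, label, and call rules.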

Let $\mathcal{G} = (V, E, L)$ be a class graph. A method body attached to a vertex $v \in V$ is semantically correct for $\mathcal{G}$ and $\mathcal{L}_3$ if and only if (1) for all label expressions $\ell$ in the method body there exists a pair $(\ell, u)$ in $\mathit{Refs}(v)$ such that $u \in VP$, and (2) for all propagation pattern calls $N()$, there exists a propagation pattern $\mathcal{P} = (N, D, M)$, $D = (S, X, T)$, such that $\exists w \in S : v \in \mathit{Subclasses}^{*}(w)$.


While $\mathcal{L}_3$ itself seems quite simple, it should be noted that $\mathcal{L}_3$ constitutes only a part of the computational power of propagation patterns. In particular, propagation patterns themselves add message passing capabilities to the propagation pattern language. Another advantage of choosing $\mathcal{L}_3$ is that it allows behavioral equivalence of propagation patterns to be proven formally for a given set of class transformations. The function UseSet computes the set of labels used in a given method body. This function will be important during method evolution for some primitive transformations.
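As a rough illustration of UseSet, the following Python sketch collects the labels of a method body. The encoding of a body as a list of (kind, value) pairs is an assumption made for this example only, as are the function name `use_set` and the sample labels.

```python
def use_set(body):
    """Collect the labels mentioned in a method body.

    A body is a list of (kind, value) pairs with kind in
    {"label", "string", "call"}; this encoding is hypothetical.
    """
    return {value for kind, value in body if kind == "label"}


body = [("string", "Salary of "), ("label", "name"), ("string", ": "),
        ("label", "salary"), ("call", "PrintDept")]
labels = use_set(body)
```

A transformation that renames or removes a reference label would only need to touch method bodies whose use set contains that label.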

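Lemma 31 (Execute lemma) Let $\mathcal{G}$ be a class graph, $\mathcal{E}$ an environment consistent with $\mathcal{G}$, $m$ a method body consistent with $\mathcal{G}$ and $\mathcal{L}_3$, and $\omega$ an object transformation on $\mathcal{E}$ such that $\mathcal{E} \preceq \omega(\mathcal{E})$. Then for all $o \in \mathcal{E}$:

$$\omega(\mathit{Execute}_{\mathcal{L}_3}(o, m, \mathcal{E})) = \mathit{Execute}_{\mathcal{L}_3}(o, m, \omega(\mathcal{E}))$$

Proof. Let $[\mathcal{E}^{*}; \mathit{stream}^{*}] = \mathit{Execute}_{\mathcal{L}_3}(o, m, \mathcal{E})$.

$$\omega(\mathit{Execute}_{\mathcal{L}_3}(o, m, [\mathcal{E}; \mathit{stream}])) = \omega([\mathcal{E}^{*}; \mathit{stream}^{*}]) = [\omega(\mathcal{E}^{*}); \mathit{stream}^{*}] = \mathit{Execute}_{\mathcal{L}_3}(o, m, [\omega(\mathcal{E}); \mathit{stream}])$$

$\Box$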

12.2 Proof Techniques

The behavioral equivalence proofs of a number of primitive transformations share certain aspects and techniques. This section provides such general guidelines and techniques for demonstrating behavioral equivalence. These techniques can be applied to a variety of languages and transformations.

For the Adaptive Software model, the three interrelated components $\langle S, O, P \rangle$ as described in Figure 8 (page 38) become the three components $\langle \mathcal{G}, \mathcal{E}, \mathcal{P} \rangle$: a class graph $\mathcal{G}$, an environment $\mathcal{E}$, and a propagation pattern $\mathcal{P}$. These three entities define the context to which we apply the change management mechanism presented in Section 3.5.3 (page 40). In fact, by using $\langle S, O, P \rangle$ as our software environment, and the schema transformations as presented in Chapter 5, we have completed the first four design decisions. It remains to perform step 5 for each of the transformations. Given a class graph $\mathcal{G}$, a propagation pattern $\mathcal{P} = (N, D, M)$ consistent with $\mathcal{G}$ and $\mathcal{L}_3$, an environment $\mathcal{E}$ consistent with $\mathcal{G}$, and a class graph transformation, step 5 presents itself as follows.

1. Define class graph transformation $\sigma$ such that $\mathcal{G}' = \sigma(\mathcal{G})$.
2. Define object transformation $\omega$ such that $\mathcal{E}' = \omega(\mathcal{E})$. Show that $\omega(\mathcal{E})$ is consistent with $\mathcal{G}'$.
3. Define program transformation $\pi$ such that $\mathcal{P}' = (N, D', M') = \pi(\mathcal{P})$.
   (a) Show that $\pi(\mathcal{P})$ is consistent with $\mathcal{G}'$ and $\mathcal{L}_3$.
   (b) Show that $\mathcal{P}$ and $\pi(\mathcal{P})$ are behaviorally equivalent.

Of the steps prescribed by the above process, proving behavioral equivalence (step 3b) is substantially more involved than the others. The ultimate goal of proving behavioral equivalence is accomplished by showing that $\mathit{Run}(P', \omega(O)) = \omega(\mathit{Run}(P, O))$. In adaptive software, an executable program $\mathcal{P}_{\mathcal{G}}$ results from instantiation of a propagation pattern $\mathcal{P} = (N, D, M)$ with a given class graph $\mathcal{G}$. Running the program $\mathcal{P}_{\mathcal{G}}$ means executing method bodies in $M$ while traversing the class graph according to the path set defined by the propagation directive $D$, starting from a given initial object $o \in \mathcal{E}$. In short: $\mathit{Run}(\mathcal{P}_{\mathcal{G}}, \mathcal{E}) = \mathit{Call}(\mathcal{G}, \mathcal{P}, \mathcal{E}, o)$.

Figure 9 illustrated the notion of behavioral equivalence in general. The behavior preservation problem for Adaptive Software presents itself as follows: given the software system $\langle \mathcal{G}, \mathcal{P}, \mathcal{E} \rangle$ and the transformations $\sigma$ and $\omega$, find a program transformation $\pi$ such that the transformed program working on the transformed environment behaves like the original program working on the original environment, in the sense that the output of $\mathcal{P}$ must correspond to the output of $\mathcal{P}'$ with the correspondence relation being $\omega$. Let $\mathcal{E}$ and $\mathcal{E}^{*}$ be two environments for class graph $\mathcal{G}$ such that $\mathcal{E}^{*}$ is the resulting environment after executing $\mathcal{P}$ on $\mathcal{E}$; that is, $\mathcal{E}^{*} = \mathit{Call}(\mathcal{G}, \mathcal{P}, \mathcal{E}, o)$ for some object $o$ in $\mathcal{E}$. Similarly, let $\mathcal{E}'$ and $\mathcal{E}'^{*}$ be two environments for class graph $\mathcal{G}' = \sigma(\mathcal{G})$ such that $\mathcal{E}'^{*} = \mathit{Call}(\mathcal{G}', \mathcal{P}', \mathcal{E}', o)$. The two software systems are behaviorally equivalent if the following holds.

$$\mathit{Call}(\mathcal{G}', \mathcal{P}', \omega(\mathcal{E}), o) = \omega(\mathit{Call}(\mathcal{G}, \mathcal{P}, \mathcal{E}, o))$$

Showing the above equality is equivalent to showing that

$$\mathit{Trav}_{\mathcal{G}', M'}(o, \mathit{PathSet}(\mathcal{G}', D'), \omega(\mathcal{E})) = \omega(\mathit{Trav}_{\mathcal{G}, M}(o, \mathit{PathSet}(\mathcal{G}, D), \mathcal{E}))$$

The proof system depicted in Figure 39 is employed to show the latter equality. The proof system consists of four building blocks. The major work is done by a traversal theorem which proves the desired equality. The traversal theorem typically relies on path set equivalence and method execution equivalence which in turn relies on method lookup equivalence. The rest of this section provides suitable lemmas for each of these building blocks.
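The equivalence criterion is a commuting condition: transforming the environment and then running the new program must give the same result as running the old program and then transforming its result. The following Python sketch checks this for a single environment. Everything in it is a hypothetical toy: environments are dictionaries, the object transformation renames one key, and the two "programs" are ordinary functions.

```python
def behaviorally_equivalent(run_old, run_new, omega, env):
    """Check run_new(omega(env)) == omega(run_old(env)) for one environment."""
    return run_new(omega(env)) == omega(run_old(env))


# Hypothetical object transformation: rename the part "name" to "label".
def omega(env):
    return {("label" if key == "name" else key): value
            for key, value in env.items()}


# Hypothetical original program: append the stored name to the stream.
def run_old(env):
    return {**env, "stream": env["stream"] + env["name"]}


# Hypothetical transformed program: same behavior on the renamed part.
def run_new(env):
    return {**env, "stream": env["stream"] + env["label"]}


ok = behaviorally_equivalent(run_old, run_new, omega,
                             {"name": "Ada", "stream": ""})
```

A check of this kind only tests individual environments; the lemmas of this section are what establish the equality for all of them.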

12.2.1 Path set lemmas

[Figure 39: Behavioral equivalence proof technique. Building blocks: PathSet Lemma (PathSet = PathSet'), Lookup Lemma (Lookup = Lookup'), Execute Lemma (Execute = Execute'), and Traversal Theorem (Trav = Trav').]

The following three lemmas relate the path sets for different class graphs and propagation directives. These lemmas will be useful to show path set equivalence of transformed directives. The canonical operator for propagation directives (see Definition 19, page 138) plays an important role in showing equivalence. The following lemma proves that if two directives have the same canonical form then they define the same path set.

Lemma 32 (Canonical PathSet Lemma) Given a class graph $\mathcal{G}$ and two propagation directives $D$ and $D'$ compatible with $\mathcal{G}$ such that both have the same canonical form, $\overline{D} = D_c = \overline{D'}$, then $D$, $D_c$, and $D'$ all define the same path set in $\mathcal{G}$:

$$\mathit{PathSet}(\mathcal{G}, D) = \mathit{PathSet}(\mathcal{G}, D_c) = \mathit{PathSet}(\mathcal{G}, D')$$

Proof. Immediate from the definition of PathSet and the fact that the canonical operator is idempotent, ensuring that $\mathit{CType}(\overline{S}) = \mathit{CType}(S)$, $\mathit{CType}(\overline{T}) = \mathit{CType}(T)$, and $\mathit{CType}(\overline{X}) = \mathit{CType}(X)$. $\Box$

The next path set lemma shows for which class graphs a single directive yields the same path sets.

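Lemma 33 (PathSet Lemma) Let $\mathcal{G}$ and $\mathcal{G}'$ be two class graphs, and $D = (S, X, T)$ a propagation directive compatible with, and in canonical form with respect to, both $\mathcal{G}$ and $\mathcal{G}'$: $\overline{D}_{\mathcal{G}} = D = \overline{D}_{\mathcal{G}'}$. If the following three conditions hold:

(1) $VC \subseteq VC'$, $LE \subseteq LE'$,
(2) $\forall v, w \in VC, \ell \in LE : v \stackrel{\ell}{\leadsto}_{\mathcal{G}} w$ iff $v \stackrel{\ell}{\leadsto}_{\mathcal{G}'} w$, and
(3) $\forall u \in VC' - VC, s \in S, t \in T : s \leadsto^{*}_{\mathcal{G}'} u$ implies $u \not\leadsto^{*}_{\mathcal{G}'} t$,

then $D$ defines the same path sets in both $\mathcal{G}$ and $\mathcal{G}'$:

$$\mathit{PathSet}(\mathcal{G}, D) = \mathit{PathSet}(\mathcal{G}', D)$$

Proof. First notice that $D$ refers to the same concrete vertices both in $\mathcal{G}$ and $\mathcal{G}'$. Thus the concrete subclass operator CType need not be considered in the definition of PathSet. The lemma is proven in two steps.

Step 1. $\mathit{PathSet}(\mathcal{G}, D) \subseteq \mathit{PathSet}(\mathcal{G}', D)$. Assume $p = \langle v_0, \ell_1, v_1, \ldots, v_n \rangle \in \mathit{PathSet}(\mathcal{G}, D)$. By the definition of PathSet it holds that $v_i \stackrel{\ell_{i+1}}{\leadsto}_{\mathcal{G}} v_{i+1}$ for $i = 0, \ldots, n-1$, and $\mathit{Source}(p) \in S \wedge \mathit{Target}(p) \in T \wedge E(p) \cap X = \emptyset$. Due to premise (2) this implies $v_i \stackrel{\ell_{i+1}}{\leadsto}_{\mathcal{G}'} v_{i+1}$ for $i = 0, \ldots, n-1$, and thus $p \in \mathit{PathSet}(\mathcal{G}', D)$.

Step 2. $\mathit{PathSet}(\mathcal{G}, D) \supseteq \mathit{PathSet}(\mathcal{G}', D)$. Assume $p = \langle v_0, \ell_1, v_1, \ldots, v_n \rangle \in \mathit{PathSet}(\mathcal{G}', D)$. Because of premise (3), $v_i \in VC$ for all $i \in 1..n$. Using premise (2) and following the same argument as above yields that $p \in \mathit{PathSet}(\mathcal{G}, D)$. $\Box$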
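The effect of condition (3) can be seen on a toy graph with a deliberately simplified notion of path set. The Python sketch below is an assumption-laden illustration, not the PathSet of Definition 19's chapter: it ignores inheritance and the CType operator, treats a class graph as a set of labelled edges, assumes the graph is acyclic, and enumerates all paths from a source to a target that avoid excluded edges. The function name `path_set` and the sample classes are hypothetical.

```python
def path_set(edges, sources, targets, excluded):
    """Enumerate paths (v0, l1, v1, ..., vn) from a source to a target.

    edges: set of labelled edges (v, label, w); the graph is assumed acyclic.
    Paths may not use an excluded edge.
    """
    usable = [e for e in edges if e not in excluded]
    found = set()

    def extend(path):
        last = path[-1]
        if last in targets:
            found.add(tuple(path))
        for (v, label, w) in usable:
            if v == last:
                extend(path + [label, w])

    for s in sources:
        extend([s])
    return found


G = {("Company", "staff", "Employee"), ("Employee", "pay", "Salary")}
# G' adds a vertex "Badge" that is reachable from the source,
# but from which no target is reachable (condition (3)).
G_prime = G | {("Company", "badge", "Badge")}
same = (path_set(G, {"Company"}, {"Salary"}, set())
        == path_set(G_prime, {"Company"}, {"Salary"}, set()))
```

In this toy setting the new vertex creates only a dead end, so no path from a source to a target passes through it and the two path sets coincide.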

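The following corollary uses the previous lemmas to give a sufficient condition for deciding when two directives and two class graphs yield the same path set.

Corollary 34 (PathSet Corollary) Given two class graphs $\mathcal{G}$ and $\mathcal{G}'$, and given two propagation directives $D$ and $D'$ compatible with $\mathcal{G}$ and $\mathcal{G}'$, respectively, such that both reduce to the same canonical directive in their respective class graph: $\overline{D}_{\mathcal{G}} = D_c = \overline{D'}_{\mathcal{G}'}$. If the following three conditions hold:

(1) $VC \subseteq VC'$, $LE \subseteq LE'$,
(2) $\forall v, w \in VC, \ell \in LE : v \stackrel{\ell}{\leadsto}_{\mathcal{G}} w$ iff $v \stackrel{\ell}{\leadsto}_{\mathcal{G}'} w$, and
(3) $\forall u \in VC' - VC, s \in S_{D_c}, t \in T_{D_c} : s \leadsto^{*}_{\mathcal{G}'} u$ implies $u \not\leadsto^{*}_{\mathcal{G}'} t$,

then the following holds:

$$\mathit{PathSet}(\mathcal{G}, D) = \mathit{PathSet}(\mathcal{G}', D')$$

Proof.

$$\begin{array}{rcl}
\mathit{PathSet}(\mathcal{G}, D) & \stackrel{\text{Lemma 32}}{=} & \mathit{PathSet}(\mathcal{G}, \overline{D}_{\mathcal{G}}) = \mathit{PathSet}(\mathcal{G}, D_c) \\
 & \stackrel{\text{Lemma 33}}{=} & \mathit{PathSet}(\mathcal{G}', D_c) = \mathit{PathSet}(\mathcal{G}', \overline{D'}_{\mathcal{G}'}) \\
 & \stackrel{\text{Lemma 32}}{=} & \mathit{PathSet}(\mathcal{G}', D')
\end{array}$$

$\Box$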

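12.2.2 Lookup lemma

The next lemma relates the results of method lookup for different class graphs.

Lemma 35 (Lookup Lemma) For a compatible and unambiguous method map, Lookup remains invariant while transforming a class graph as long as inheritance reachability stays invariant. If $\mathcal{G} = (V, E, L)$ and $\mathcal{G}' = (V', E', L')$ are two class graphs such that $V \subseteq V'$, and $\mathcal{M}$ is a method map unambiguous in, and compatible with, $\mathcal{G}$ and $\mathcal{G}'$, then the following implication holds.

If $\forall v, w \in V : v \Longrightarrow^{*}_{\mathcal{G}} w$ iff $v \Longrightarrow^{*}_{\mathcal{G}'} w$, then $\forall v \in V : \mathit{Lookup}_{\mathcal{G}}(v, \mathcal{M}) = \mathit{Lookup}_{\mathcal{G}'}(v, \mathcal{M})$.

Proof. Due to $\mathcal{M}$ being compatible with $\mathcal{G}$, it holds that $\forall w \in V' - V : \mathcal{M}(w) = \epsilon$. Consider the following two cases.

If $\mathit{Lookup}_{\mathcal{G}}(v, \mathcal{M}) = \epsilon$ then $\forall w \in V$ such that $v \Longrightarrow^{*}_{\mathcal{G}} w : \mathcal{M}(w) = \epsilon$. From the above restriction of compatibility and the premise it then follows that $\forall w \in V'$ such that $v \Longrightarrow^{*}_{\mathcal{G}'} w : \mathcal{M}(w) = \epsilon$ and therefore $\mathit{Lookup}_{\mathcal{G}'}(v, \mathcal{M}) = \epsilon$.

For the second case assume that $\mathit{Lookup}_{\mathcal{G}}(v, \mathcal{M}) \neq \epsilon$. If $\mathcal{M}(v) \neq \epsilon$ then $\mathit{Lookup}_{\mathcal{G}}(v, \mathcal{M}) = \mathcal{M}(v) = \mathit{Lookup}_{\mathcal{G}'}(v, \mathcal{M})$, which proves the lemma. If $\mathcal{M}(v) = \epsilon$ then assume $\mathit{Lookup}_{\mathcal{G}}(v, \mathcal{M}) = \mathcal{M}(u)$ for some $u$ with $v \Longrightarrow^{+} u$. Due to the ambiguity constraint, $\mathcal{M}$ must be undefined for all superclasses $w$ of $v$ except for those that are also superclasses of $u$: $\mathcal{M}(w) = \epsilon$ for all $w$ such that $v \Longrightarrow^{*}_{\mathcal{G}} w \wedge u \not\Longrightarrow^{*}_{\mathcal{G}} w$. Due to the premise and the above restriction of compatibility, $\mathcal{M}$ must remain undefined for all superclasses $w$ of $v$ in $\mathcal{G}'$ except for those that are also superclasses of $u$ in $\mathcal{G}'$: $\mathcal{M}(w) = \epsilon$ for all $w$ such that $v \Longrightarrow^{*}_{\mathcal{G}'} w \wedge u \not\Longrightarrow^{*}_{\mathcal{G}'} w$. So it follows that $\mathit{Lookup}_{\mathcal{G}'}(v, \mathcal{M}) = \mathcal{M}(u)$. $\Box$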
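The statement of the Lookup Lemma can be tried out on a toy hierarchy. The Python sketch below assumes a simplified reading of Lookup: return the method attached to the class itself if there is one, otherwise the method found on the nearest superclass (under the ambiguity constraint this inherited method is unique). The representation of the inheritance relation as a dictionary, the function name `lookup`, and the class names are all hypothetical; `None` plays the role of the empty method.

```python
def lookup(superclasses, methods, v):
    """Return the method body for class v, searching v first and then its
    superclasses breadth-first; None stands for the empty method."""
    seen, frontier = set(), [v]
    while frontier:
        c = frontier.pop(0)
        if c in seen:
            continue
        seen.add(c)
        if methods.get(c) is not None:
            return methods[c]
        frontier.extend(superclasses.get(c, []))
    return None


supers = {"Manager": ["Employee"], "Employee": ["Person"]}
methods = {"Person": "print name"}
# G' adds a class "Intern" below "Employee"; inheritance reachability
# among the old classes is unchanged and no method is attached to "Intern".
supers_prime = {**supers, "Intern": ["Employee"]}
unchanged = all(lookup(supers, methods, c) == lookup(supers_prime, methods, c)
                for c in ("Manager", "Employee", "Person"))
```

Because the added class carries no method and does not alter which old classes reach which, every old class resolves to the same body in both graphs.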

12.2.3 Traversal Theorem

The goal of the traversal theorem is to show when two calls to Trav result in equivalent environments. Of course, equivalence can only be shown when both calls terminate. This notion of equivalence is referred to as partial equivalence.

The proof of the traversal theorem relies on a further assumption about the inter-propagation pattern call graph structure defined by mutual calls of propagation patterns. The call graph structure needs to be acyclic; that is, the graph needs to be a tree. In particular, a propagation pattern may not call itself recursively within the method bodies. This assumption does not restrict the intra-propagation pattern call graph from being recursive. The intra-propagation pattern call graph is built from the path set of the propagation pattern. The inter-propagation pattern call graph is built from calls to other propagation patterns within the method bodies. Note that the above assumption is required only for the traversal theorem and is not a restriction of the semantics. The following lemma proves a property used by the subsequent traversal theorem.

Lemma 36 If $\mathcal{E}$ and $\mathcal{E}'$ are two environments such that $\mathcal{E} \preceq \mathcal{E}'$, then for all object vertices $o$ in $\mathcal{E}$ and for all labels $\ell$ of $o$: $\mathit{RTC}_{\mathcal{G}, M}(o, \ell, \mathit{PS}, \mathcal{E})$ terminates immediately with result $\mathcal{E}$ if and only if $\mathit{RTC}_{\mathcal{G}', M'}(o, \ell, \mathit{PS}, \mathcal{E}')$ terminates immediately with result $\mathcal{E}'$.

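Proof. Immediate from the definition of RTC and the following equivalence, which is a consequence of the definition of object extension.

$$(\mathit{Class}(o), \ell, \mathit{Class}(\mathit{EnvLookup}(o.\ell, \mathcal{E}))) \in \mathit{Car}(\mathit{Filt}(\mathit{PS}, \mathit{Class}(o), \epsilon))$$

if and only if

$$(\mathit{Class}(o), \ell, \mathit{Class}(\mathit{EnvLookup}(o.\ell, \mathcal{E}'))) \in \mathit{Car}(\mathit{Filt}(\mathit{PS}, \mathit{Class}(o), \epsilon))$$

$\Box$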

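Theorem 37 (Traversal Theorem) Let $\mathcal{G}$ and $\mathcal{G}'$ be two class graphs, $\mathit{PS}$ some path set, and $\omega$ an object transformation from environments consistent with $\mathcal{G}$ to environments consistent with $\mathcal{G}'$ such that for all environments $\mathcal{E}$ consistent with $\mathcal{G}$: $\mathcal{E} \preceq \omega(\mathcal{E})$. Furthermore, let $\mathcal{M}$ and $\mathcal{M}'$ be corresponding prefix or suffix method maps in $M$ and $M'$, respectively, such that $\mathcal{M}$ and $\mathcal{M}'$ are consistent with language $\mathcal{L}$ and with $\mathcal{G}$ and $\mathcal{G}'$, respectively. If for all $o \in \mathcal{E}$:

$$\omega(\mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}}(\mathit{Class}(o), \mathcal{M}), \mathcal{E})) = \mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}'}(\mathit{Class}(o), \mathcal{M}'), \omega(\mathcal{E}))$$

then the following implication holds: If $\mathit{Trav}_{\mathcal{G}, M}(o, \mathit{PS}, \mathcal{E})$ terminates in a recursive descent of depth $d$ with result $\mathcal{E}^{*}$, then $\mathit{Trav}_{\mathcal{G}', M'}(o, \mathit{PS}, \omega(\mathcal{E}))$ terminates in a recursive descent of depth $d$ with result $\omega(\mathcal{E}^{*})$.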

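Proof. The proof is done by induction on the depth $d$ of the recursive descent of the calls to Trav. A mathematically strict proof would use the fixpoint induction technique, which defines Trav to be the least upper bound of a family of functions indexed by the depth $d$ of the recursion.

Induction basis. $d = 0$

$\mathit{Trav}_{\mathcal{G}, M}(o, \mathit{PS}, \mathcal{E})$ must terminate in one of the two termination cases discussed in the definition of Trav. Since the number $n$ of immediate calls to RTC is determined solely by the path set $\mathit{PS}$, the termination case for $\mathit{Trav}' \stackrel{\text{def}}{=} \mathit{Trav}_{\mathcal{G}', M'}(o, \mathit{PS}, \omega(\mathcal{E}))$ will be the same. Using Lemma 36 in case (2), it follows that $\mathit{Trav}'$ will also terminate in $d = 0$ steps. In either termination case, the effective results are:

$$\mathit{Trav}_{\mathcal{G}, M}(o, \mathit{PS}, \mathcal{E}) = \mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}}(\mathit{Class}(o), \mathcal{M}_S), \mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}}(\mathit{Class}(o), \mathcal{M}_P), \mathcal{E})) = \mathcal{E}^{*}$$

$$\begin{array}{cl}
 & \mathit{Trav}_{\mathcal{G}', M'}(o, \mathit{PS}, \omega(\mathcal{E})) \\
= & \mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}'}(\mathit{Class}(o), \mathcal{M}'_S), \mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}'}(\mathit{Class}(o), \mathcal{M}'_P), \omega(\mathcal{E}))) \\
\stackrel{\text{premise}}{=} & \mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}'}(\mathit{Class}(o), \mathcal{M}'_S), \omega(\mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}}(\mathit{Class}(o), \mathcal{M}_P), \mathcal{E}))) \\
\stackrel{\text{premise}}{=} & \omega(\mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}}(\mathit{Class}(o), \mathcal{M}_S), \mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}}(\mathit{Class}(o), \mathcal{M}_P), \mathcal{E}))) \\
= & \omega(\mathit{Trav}_{\mathcal{G}, M}(o, \mathit{PS}, \mathcal{E})) = \omega(\mathcal{E}^{*})
\end{array}$$

This proves the induction basis.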

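Inductive step. $d = h + 1$

The induction hypothesis says that the traversal theorem holds for all calls to Trav that terminate at depth $d \le h$.

$$\begin{array}{cl}
 & \mathit{Trav}_{\mathcal{G}, M}(o, \mathit{PS}, \mathcal{E}) \\
= & \mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}}(\mathit{Class}(o), \mathcal{M}_S), \mathit{RTC}_{\mathcal{G}, M}(o, \ell_n, \mathit{PS}, \ldots, \mathit{RTC}_{\mathcal{G}, M}(o, \ell_1, \mathit{PS}, \mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}}(\mathit{Class}(o), \mathcal{M}_P), \mathcal{E})) \ldots)) \\
= & \mathcal{E}^{*}
\end{array}$$

$$\begin{array}{cl}
 & \mathit{Trav}_{\mathcal{G}', M'}(o, \mathit{PS}, \omega(\mathcal{E})) \\
= & \mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}'}(\mathit{Class}(o), \mathcal{M}'_S), \mathit{RTC}_{\mathcal{G}', M'}(o, \ell_n, \mathit{PS}, \ldots, \mathit{RTC}_{\mathcal{G}', M'}(o, \ell_1, \mathit{PS}, \mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}'}(\mathit{Class}(o), \mathcal{M}'_P), \omega(\mathcal{E}))) \ldots)) \\
\stackrel{\text{premise}}{=} & \mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}'}(\mathit{Class}(o), \mathcal{M}'_S), \mathit{RTC}_{\mathcal{G}', M'}(o, \ell_n, \mathit{PS}, \ldots, \mathit{RTC}_{\mathcal{G}', M'}(o, \ell_1, \mathit{PS}, \omega(\mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}}(\mathit{Class}(o), \mathcal{M}_P), \mathcal{E}))) \ldots)) \\
\stackrel{\text{I.H., Lemma 36}}{=} & \mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}'}(\mathit{Class}(o), \mathcal{M}'_S), \mathit{RTC}_{\mathcal{G}', M'}(o, \ell_n, \mathit{PS}, \ldots, \omega(\mathit{RTC}_{\mathcal{G}, M}(o, \ell_1, \mathit{PS}, \mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}}(\mathit{Class}(o), \mathcal{M}_P), \mathcal{E}))) \ldots)) \\
\stackrel{\text{I.H., Lemma 36}}{=} & \mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}'}(\mathit{Class}(o), \mathcal{M}'_S), \omega(\mathit{RTC}_{\mathcal{G}, M}(o, \ell_n, \mathit{PS}, \ldots, \mathit{RTC}_{\mathcal{G}, M}(o, \ell_1, \mathit{PS}, \mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}}(\mathit{Class}(o), \mathcal{M}_P), \mathcal{E})) \ldots))) \\
\stackrel{\text{premise}}{=} & \omega(\mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}}(\mathit{Class}(o), \mathcal{M}_S), \mathit{RTC}_{\mathcal{G}, M}(o, \ell_n, \mathit{PS}, \ldots, \mathit{RTC}_{\mathcal{G}, M}(o, \ell_1, \mathit{PS}, \mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}}(\mathit{Class}(o), \mathcal{M}_P), \mathcal{E})) \ldots))) \\
= & \omega(\mathit{Trav}_{\mathcal{G}, M}(o, \mathit{PS}, \mathcal{E})) = \omega(\mathcal{E}^{*})
\end{array}$$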

All calls to RTC invoke a call to Trav that terminates with depth $d \le h$, and at least one terminates with depth $d = h$. Thus, the total depth is $d = h + 1$. This concludes the proof of the traversal theorem. $\Box$

Let $\langle \mathcal{G}, \mathcal{P}, \mathcal{E} \rangle$ be an initial adaptive object-oriented system, and $\langle \mathcal{G}', \mathcal{P}', \mathcal{E}' \rangle$ the transformed counterpart with $\mathcal{P} = (N, D, M)$ and $\mathcal{P}' = (N, D', M')$. For a specific primitive transformation to be able to utilize the Traversal Theorem 37, the following properties must be shown:

1. The theorem assumes a single path set $\mathit{PS}$, therefore $D'$ must be such that: $\mathit{PathSet}(\mathcal{G}, D) = \mathit{PathSet}(\mathcal{G}', D')$.
2. Assuming $M$ is already consistent with $\mathcal{G}$ and $\mathcal{L}$, it remains to be shown that $M'$ is consistent with $\mathcal{G}'$ and $\mathcal{L}$.
3. Directly from the premise, $\omega$ must be such that $\mathcal{E} \preceq \omega(\mathcal{E}) = \mathcal{E}'$.
4. Directly from the premise, it must hold that $\forall o \in \mathcal{E}$:

$$\omega(\mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}}(\mathit{Class}(o), \mathcal{M}), \mathcal{E})) = \mathit{Execute}_{\mathcal{L}}(o, \mathit{Lookup}_{\mathcal{G}'}(\mathit{Class}(o), \mathcal{M}'), \omega(\mathcal{E}))$$

where $\mathcal{M}$ and $\mathcal{M}'$ are corresponding prefix or suffix method maps in $M$ and $M'$, respectively.

Once all of the above premises for the theorem have been established, partial behavioral equivalence can be concluded.

12.3 Application of Framework

The feasibility of the Famous framework is demonstrated using the above example language for implementing method bodies in propagation patterns and a subset of the primitive class graph transformations presented in Chapter 5. This subset includes:

• Addition of Concrete Class (AddC)
• Addition of Subclass (AddS)
• Addition of Abstract Class (AddA)
• Deletion of Abstract Class (DelA)
• Addition of Reference (AddR)
• Abstraction of Common Reference (AbsR)
• Distribution of Common Reference (DisR)
• Replacement of Reference (RepR)
• Generalization of Reference (GenR)
• Renaming of Class (RenC)
• Renaming of Reference (RenR)
• Telescoping of Reference (TelR)
• Telescoping of Inheritance (TelI)

For each of these primitive transformations, the model of the evolution process defined in Section 3.5.3 is followed. In particular, for each of the transformations, a rigorous proof is given that behavioral equivalence is preserved as defined in Section 3.5.2. For reading convenience, the definitions of the class graph transformations from Chapter 5 are repeated.

12.3.1 Addition of Concrete Class

Addition of concrete class (AddC) adds a concrete class $v$. The class $v$ must not have any outgoing reference edges.

Class graph transformation. Preconditions: $v \notin V$ and $\neg \mathit{Abstract}(v)$. The class graph transformation "addition of concrete class $v$" is performed by the operation $\sigma \equiv \mathit{AddC}(\mathcal{G}, v)$ which produces a class graph $\mathcal{G}' = (V', E, L')$ such that $V' = V \cup \{v\}$ and $LV' = LV \cup \{\mathit{Classname}(v)\}$.

Environment transformation. Any object in $\mathcal{E}$ consistent with $\mathcal{G}$ is also consistent with $\mathcal{G}'$. Therefore, the environment transformation is simply the identity transformation: $\omega \equiv I$, implying $\mathcal{E}' = \omega(\mathcal{E}) = \mathcal{E}$.

Program transformation. The propagation pattern need not be changed: $\mathcal{P}' = \mathcal{P}$; thus the program transformation is the identity $\pi \equiv I$.

Program consistency. Consistency of $\mathcal{P}'$ with $\mathcal{G}'$ and $\mathcal{L}_3$ clearly holds since $M$ was not changed and neither compatibility nor ambiguity is affected by the additional class $v$.

Behavioral equivalence. Behavioral equivalence can be shown by using the Traversal Theorem 37. The prerequisites are fulfilled: (1) $\mathit{PathSet}(\mathcal{G}, D) = \mathit{PathSet}(\mathcal{G}', D)$ using the PathSet Lemma 33. (2) Method map consistency is subsumed by program consistency. (3) $\mathcal{E} = \omega(\mathcal{E}) \Longrightarrow \mathcal{E} \preceq \omega(\mathcal{E})$. (4) The Execute Lemma 31 applies since its premises can be shown to hold by the Lookup Lemma 35 and the above properties (2) and (3).

12.3.2 Addition of Subclass

Addition of subclass (AddS) adds a concrete class $u$ as a subclass of $v$ and an inheritance edge from class $u$ to $v$. The class $v$ must exist and be abstract.

Class graph transformation. Preconditions: $v \in VA$, $u \notin V$. The class graph transformation "addition of subclass $u, v$" is performed by the operation $\sigma \equiv \mathit{AddS}(\mathcal{G}, u, v)$ which produces a class graph $\mathcal{G}' = (V', E', L')$ such that

$$V' = V \cup \{u\}, \quad EI' = EI \cup \{(u, v)\}, \quad\text{and}\quad LV' = LV \cup \{\mathit{Classname}(u)\}$$

Environment transformation. Any object in $\mathcal{E}$ consistent with $\mathcal{G}$ is also consistent with $\mathcal{G}'$. Therefore, the environment transformation is simply the identity transformation: $\omega \equiv I$, implying $\mathcal{E}' = \omega(\mathcal{E}) = \mathcal{E}$.

Program transformation. The added inheritance edge could potentially add new paths to the path set defined by the original propagation directive. To prevent new paths, all calling edges with target $u$ are added to the exclusion constraints of the new propagation directive. The method map need not be changed. The propagation pattern is changed to:

$$\pi = \mathit{AddS}(\mathcal{P}, u, v) = \mathcal{P}' = (N, D', M)$$

where $D' = (S, X', T)$ and $X' = X \cup \{(w, \ell, u) \mid \exists w \in V, \ell \in LE : w \stackrel{\ell}{\leadsto} u\}$.
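The directive part of the AddS program transformation can be sketched in Python as follows. The sketch rests on an assumption about the calling relation that is not spelled out in this section: the new class $u$ is taken to be reachable through every reference edge whose declared target is $v$ or a superclass of $v$. The function name `add_subclass_exclusions`, the data representation, and the sample classes are hypothetical.

```python
def add_subclass_exclusions(ref_edges, superclasses, excluded, u, v):
    """Compute X' = X united with {(w, l, u)} for every reference that can reach u.

    ref_edges:    set of reference edges (w, label, target)
    superclasses: dict mapping a class to the list of its direct superclasses
    Assumption: u is reachable via any reference whose declared target is v
    or an ancestor of v.
    """
    ancestors, stack = set(), [v]
    while stack:
        c = stack.pop()
        if c not in ancestors:
            ancestors.add(c)
            stack.extend(superclasses.get(c, []))
    return set(excluded) | {(w, label, u)
                            for (w, label, target) in ref_edges
                            if target in ancestors}


refs = {("Company", "staff", "Employee"), ("Company", "hq", "Address")}
new_x = add_subclass_exclusions(refs, {}, set(), "Contractor", "Employee")
```

With these bypassing edges in place, no path of the transformed directive can enter the new subclass, which is the property the subsequent path set argument depends on.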

Program consistency. Consistency of $\mathcal{P}'$ with $\mathcal{G}'$ and $\mathcal{L}_3$ clearly holds since $M$ was not changed and neither compatibility nor ambiguity is affected by the additional subclass. The syntactic and semantic correctness also remain invariant. The new directive $D'$ is compatible with $\mathcal{G}'$ by construction.

Behavioral equivalence. Behavioral equivalence can be shown by using the Traversal Theorem 37. The prerequisites are fulfilled: (1) $\mathit{PathSet}(\mathcal{G}, D) = \mathit{PathSet}(\mathcal{G}', D')$ using the following Lemma 38. (2) Method map consistency is subsumed by program consistency. (3) $\mathcal{E} \preceq \omega(\mathcal{E})$. (4) The Execute Lemma 31 applies since its premises can be shown to hold by the Lookup Lemma 35 and the above properties (2) and (3).

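Lemma 38 (AddS-PathSet Lemma) Given a class graph $\mathcal{G}$ and a vertex $v \in V$, let $\mathcal{G}' = \mathit{AddS}(\mathcal{G}, u, v)$ be the transformed graph of $\mathcal{G}$ under AddS. Similarly, let $D$ be a propagation directive compatible with $\mathcal{G}$, and $D' = \mathit{AddS}(\mathcal{G}, D, u, v)$ the transformed propagation directive as defined by AddS. Then

$$\mathit{PathSet}(\mathcal{G}, D) = \mathit{PathSet}(\mathcal{G}', D')$$

Proof. The lemma is proven in two steps.

(1) $\mathit{PathSet}(\mathcal{G}, D) \subseteq \mathit{PathSet}(\mathcal{G}', D')$. Assume $p = \langle v_0, \ell_1, v_1, \ldots, v_n \rangle \in \mathit{PathSet}(\mathcal{G}, D)$. By the definition of PathSet it holds that $v_i \stackrel{\ell_{i+1}}{\leadsto}_{\mathcal{G}} v_{i+1}$ for $i = 0, \ldots, n-1$, and $\mathit{Source}(p) \in S \wedge \mathit{Target}(p) \in T \wedge E(p) \cap X = \emptyset$. Since $V \subseteq V'$ and $ER = ER'$ this implies $v_i \stackrel{\ell_{i+1}}{\leadsto}_{\mathcal{G}'} v_{i+1}$ for $i = 0, \ldots, n-1$, and thus $p \in \mathit{PathSet}(\mathcal{G}', D)$. Furthermore, since $D'$ only differs from $D$ in $X$ and $X'$ only adds bypassing edges to $u$, it follows that $p \in \mathit{PathSet}(\mathcal{G}', D')$.

(2) $\mathit{PathSet}(\mathcal{G}, D) \supseteq \mathit{PathSet}(\mathcal{G}', D')$. Assume $p = \langle v_0, \ell_1, v_1, \ldots, v_n \rangle \in \mathit{PathSet}(\mathcal{G}', D')$. Since $V' = V \cup \{u\}$, and any edges involving $u$ are bypassed, it follows that $V(p) \subseteq VC$. It remains to show that $v_i \stackrel{\ell_{i+1}}{\leadsto}_{\mathcal{G}} v_{i+1}$ for $i = 0, \ldots, n-1$. But this is clear since $E(p)$ cannot contain edges to $u$ (bypassed) and $E \subseteq E'$. Thus it follows that $p \in \mathit{PathSet}(\mathcal{G}, D)$. $\Box$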

12.3.3 Addition of Abstract Class

Addition of abstract class (AddA) adds an abstract class $u$ and inheritance edges from classes $v_1, \ldots, v_n$ to $u$. The class $u$ must not have any outgoing reference edges.

Class graph transformation. Preconditions: $v_1, \ldots, v_n \in V - VP$, $u \notin V$ and $\mathit{Abstract}(u)$. The class graph transformation "addition of abstract class $u$" is performed by the operation $\sigma \equiv \mathit{AddA}(\mathcal{G}, u, v_1, \ldots, v_n)$ which produces a class graph $\mathcal{G}' = (V', E', L')$ such that $V' = V \cup \{u\}$, $EI' = EI \cup \{(v_i, u) \mid i \in 1..n\}$, and $LV' = LV \cup \{\mathit{Classname}(u)\}$.

Environment transformation. Any object in $\mathcal{E}$ consistent with $\mathcal{G}$ is also consistent with $\mathcal{G}'$. Therefore, the environment transformation is simply the identity transformation: $\omega \equiv I$, implying $\mathcal{E}' = \omega(\mathcal{E}) = \mathcal{E}$.

Program transformation. The propagation pattern need not be changed: $\mathcal{P}' = \mathcal{P}$; thus the program transformation is the identity $\pi \equiv I$.

Program consistency. Consistency of $\mathcal{P}'$ with $\mathcal{G}'$ and $\mathcal{L}_3$ clearly holds since $M$ was not changed and neither compatibility nor ambiguity is affected by the additional class $u$.

Behavioral equivalence. Behavioral equivalence can be shown by using the Traversal Theorem 37. The prerequisites are fulfilled: (1) $\mathit{PathSet}(\mathcal{G}, D) = \mathit{PathSet}(\mathcal{G}', D)$ using the PathSet Lemma 33. (2) Method map consistency is subsumed by program consistency. (3) $\mathcal{E} = \omega(\mathcal{E}) \Longrightarrow \mathcal{E} \preceq \omega(\mathcal{E})$. (4) The Execute Lemma 31 applies since its premises can be shown to hold by the Lookup Lemma 35 and the above properties (2) and (3).

12.3.4 Deletion of Abstract Class

Deletion of abstract class (DelA) deletes an abstract class $u$ and inheritance edges from all its subclasses $v_1, \ldots, v_n$. The class $u$ must be useless in the sense of Definition 10.

Class graph transformation. Preconditions: $u \in V$ and $\mathit{Useless}(u)$. Then the class graph transformation "deletion of abstract class $u$" is performed by the operation $\sigma \equiv \mathit{DelA}(\mathcal{G}, u)$ which produces a class graph $\mathcal{G}' = (V', E', L')$ such that

$$V' = V - \{u\}, \quad EI' = EI - \{(v, u) \mid v \in \mathit{DirectSubclasses}(u)\}, \quad LV' = LV - \{\mathit{Classname}(u)\}$$

Environment transformation. Any object in $\mathcal{E}$ consistent with $\mathcal{G}$ is also consistent with $\mathcal{G}'$. Therefore, the environment transformation is simply the identity transformation: $\omega \equiv I$, implying $\mathcal{E}' = \omega(\mathcal{E}) = \mathcal{E}$.

Program transformation. If the deleted class $u$ is mentioned in the propagation pattern then the following program transformations are applied:

$$\pi = \mathit{DelA}(\mathcal{P}, u) = \mathcal{P}' = (N, D', M')$$

The directive transformation produces a new propagation directive $D' = (S', X', T')$ by replacing each occurrence of $u$ with the set of all its concrete subclasses $\mathit{CType}(u)$. The transformation renders $D'$ canonical with respect to $u$.

$$S' = \begin{cases} (S - \{u\}) \cup \mathit{CType}(u) & \text{if } u \in S \\ S & \text{otherwise} \end{cases}$$

$$T' = \begin{cases} (T - \{u\}) \cup \mathit{CType}(u) & \text{if } u \in T \\ T & \text{otherwise} \end{cases}$$

$$X' = (X - \{x \in X \mid u \in x\}) \cup \{x' = x_{u \to C(u)} \mid u \in x\}$$

The method map transformation produces a new $M'$ in which for each prefix and suffix method map $\mathcal{M}$, a method attached to $u$ is pushed down to those direct subclasses of $u$ that do not define their own method.

$$\mathcal{M}'(v) = \begin{cases} \epsilon & \text{if } v = u \\ \mathcal{M}(u) & \text{if } \mathcal{M}(u) \neq \epsilon \wedge \mathcal{M}(v) = \epsilon \wedge v \in \mathit{DirectSubclasses}_{\mathcal{G}}(u) \\ \mathcal{M}(v) & \text{otherwise} \end{cases}$$
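The two case definitions above translate almost directly into code. The following Python sketch is illustrative only: a method map is modeled as a dictionary from class names to method bodies, a missing key or `None` stands for the empty method, and the function names `dela_method_map` and `dela_vertex_set` as well as the sample classes are hypothetical.

```python
def dela_method_map(methods, direct_subclasses, u):
    """Method map after deleting class u: u loses its entry, and u's method
    is pushed down to direct subclasses that define no method of their own."""
    pushed = methods.get(u)
    new_map = {v: body for v, body in methods.items() if v != u}
    if pushed is not None:
        for v in direct_subclasses:
            if new_map.get(v) is None:
                new_map[v] = pushed
    return new_map


def dela_vertex_set(vertices, ctype_u, u):
    """Source or target set after deleting u: replace u by CType(u)."""
    if u in vertices:
        return (set(vertices) - {u}) | set(ctype_u)
    return set(vertices)


methods = {"Shape": "draw outline", "Circle": "draw circle"}
new_methods = dela_method_map(methods, ["Circle", "Square"], "Shape")
new_sources = dela_vertex_set({"Shape", "Canvas"}, {"Circle", "Square"}, "Shape")
```

In the example, `Circle` keeps its own method while `Square` receives the method formerly attached to `Shape`, and `Shape` is replaced by its concrete subclasses in the source set.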

Program consistency. The consistency of $\mathcal{P}'$ with $\mathcal{G}'$ and $\mathcal{L}_3$ follows immediately from the definition of consistency, the above directive and method map transformations, and the fact that $\mathit{PathSet}(\mathcal{G}', D') = \mathit{PathSet}(\mathcal{G}, D)$, which follows directly from the PathSet Corollary 34.

Behavioral equivalence. Behavioral equivalence can be shown by using the Traversal Theorem 37. The prerequisites are fulfilled: (1) $\mathit{PathSet}(\mathcal{G}, D) = \mathit{PathSet}(\mathcal{G}', D')$ (see above). (2) Method map consistency is subsumed by program consistency. (3) $\mathcal{E} = \omega(\mathcal{E}) \Longrightarrow \mathcal{E} \preceq \omega(\mathcal{E})$. (4) The Execute Lemma 31 applies since its premises can be shown to hold by the following DelA-Lookup Lemma 39 and the above properties (2) and (3). Note that the Lookup Lemma 35 cannot be applied since both the method map and inheritance reachability change.

Lemma 39 (DelA-Lookup Lemma) Given a class graph $\mathcal{G}$ and a useless vertex $u \in V$, let $\mathcal{G}' = \mathit{DelA}(\mathcal{G}, u)$ be the transformed graph of $\mathcal{G}$ under DelA. Similarly, let $\mathcal{M}$ be a prefix or suffix method map unambiguous and compatible with $\mathcal{G}$, and $\mathcal{M}' = \mathit{DelA}(\mathcal{G}, \mathcal{M}, u)$ the transformed method map as defined by DelA. Then $\forall v \in V'$:

$$\mathit{Lookup}_{\mathcal{G}'}(v, \mathcal{M}') = \mathit{Lookup}_{\mathcal{G}}(v, \mathcal{M})$$

Proof. The lemma is proven by showing that the first method body returned by $\mathit{Lookup}^{*}$ is the same before and after the transformation. First note that for all $v$ such that the useless class $u$ is not a superclass of $v$ the lemma is trivially true because $\mathcal{M}$ is changed only for direct subclasses of $u$. Thus for the rest of the proof it is assumed that $v \Longrightarrow^{*} u$. The lemma is proven by induction on the depth of the inheritance hierarchy rooted at $v$. To shorten the presentation, $\mathit{Lookup}^{*}_{\mathcal{G}}(v, \mathcal{M})$ will be written as $L(v)$ whenever it is clear from the context which $\mathcal{G}$ and $\mathcal{M}$ are meant.

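Assume $\mathit{Superclasses}(v) = \{v_1, \ldots, v_n \mid n \ge 0\}$ and $v_k \Longrightarrow^{*} u$ for some $k \in 1..n$.

Induction basis. Assume the inheritance hierarchy rooted at $v$ has depth 1 and thus $v_1, \ldots, v_n$ are (all) the superclasses of $v$ in $\mathcal{G}$ with $u = v_k$ for some $k \in 1..n$. Then $v_1, \ldots, v_{k-1}, v_{k+1}, \ldots, v_n$ are the superclasses of $v$ in $\mathcal{G}'$. Two cases must be distinguished:

1. $\mathcal{M}(u) = \epsilon$. Then it is known that $\mathcal{M}' \equiv \mathcal{M}$ and thus:

$$\begin{array}{rcl}
\mathit{Lookup}^{*}_{\mathcal{G}}(v, \mathcal{M}) & = & \mathcal{M}(v) \cdot \mathcal{M}(v_1) \cdots \mathcal{M}(v_k) \cdots \mathcal{M}(v_n) \\
 & = & \mathcal{M}(v) \cdot \mathcal{M}(v_1) \cdots \mathcal{M}(v_{k-1}) \cdot \mathcal{M}(v_{k+1}) \cdots \mathcal{M}(v_n) \\
\mathit{Lookup}^{*}_{\mathcal{G}'}(v, \mathcal{M}') & = & \mathcal{M}'(v) \cdot \mathcal{M}'(v_1) \cdots \mathcal{M}'(v_{k-1}) \cdot \mathcal{M}'(v_{k+1}) \cdots \mathcal{M}'(v_n) \\
 & = & \mathcal{M}(v) \cdot \mathcal{M}(v_1) \cdots \mathcal{M}(v_{k-1}) \cdot \mathcal{M}(v_{k+1}) \cdots \mathcal{M}(v_n) = \mathit{Lookup}^{*}_{\mathcal{G}}(v, \mathcal{M})
\end{array}$$

2. $\mathcal{M}(u) \neq \epsilon$. Here two subcases have to be distinguished.

(a) $\mathcal{M}(v) = \epsilon$. Due to the ambiguity constraint it follows that $\mathcal{M}(v_i) = \epsilon$ for all $i \neq k$, and thus it follows:

$$\begin{array}{rcl}
\mathit{Lookup}^{*}_{\mathcal{G}}(v, \mathcal{M}) & = & \epsilon \cdot \epsilon \cdots \mathcal{M}(u) \cdots \epsilon = \mathcal{M}(u) \\
\mathit{Lookup}^{*}_{\mathcal{G}'}(v, \mathcal{M}') & = & \mathcal{M}(u) \cdot \epsilon \cdots \epsilon = \mathcal{M}(u) = \mathit{Lookup}^{*}_{\mathcal{G}}(v, \mathcal{M})
\end{array}$$

(b) $\mathcal{M}(v) \neq \epsilon$. Then the sequences returned by $\mathit{Lookup}^{*}$ are different but both start with $\mathcal{M}(v)$:

$$\begin{array}{rcl}
\mathit{Lookup}^{*}_{\mathcal{G}}(v, \mathcal{M}) & = & \mathcal{M}(v) \cdot \mathcal{M}(v_1) \cdots \mathcal{M}(v_n) \\
\mathit{Lookup}^{*}_{\mathcal{G}'}(v, \mathcal{M}') & = & \mathcal{M}(v) \cdot \mathcal{M}(v_1) \cdots \mathcal{M}(v_{k-1}) \cdot \mathcal{M}(v_{k+1}) \cdots \mathcal{M}(v_n)
\end{array}$$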

Inductive step. Assume the inductive hypothesis is true; that is, the rst method body returned by Lookup? in an inheritance hierarchy of depth d  n is the same both before and after

the transformation. It will now be proven that the rst body is also the same in an inheritance hierarchy of depth d = n + 1. Again, two cases are distinguished.

1. M(u) = . As in the induction basis it is known that M0  M. If v 2 DirectSubclasses G (u) then vk = u and Lookup?G (v; M) = M(v )  Lookup?G (v1; M)    Lookup?G (u; M)    Lookup?G (vn ; M) = M(v )  L(v1)    L(vk?1 )  L(vk+1 )    L(vn ) Lookup?G (v; M0) = M(v )  L0(v1 )    L0(vk?1 )  L0 (vk+1 )    L0 (vn ) 0

With the induction hypothesis it follows that the rst body is the same.


If $v \notin DirectSubclasses_{G}(u)$ then the following holds:

$$\begin{aligned}
Lookup^{*}_{G}(v, M) &= M(v) \cdot Lookup^{*}_{G}(v_1, M) \cdots Lookup^{*}_{G}(v_n, M) \\
Lookup^{*}_{G'}(v, M') &= M(v) \cdot Lookup^{*}_{G'}(v_1, M') \cdots Lookup^{*}_{G'}(v_n, M')
\end{aligned}$$

And again, with the induction hypothesis it can be concluded that the first body is the same.

2. $M(u) \neq \epsilon$. Again, there are two subcases according to the definition of $M'$.

(a) $M(v) \neq \epsilon$. Then $M'(v) = M(v)$ holds and the first method body is $M(v)$ for both $Lookup^{*}_{G}(v, M)$ and $Lookup^{*}_{G'}(v, M')$.

(b) $M(v) = \epsilon$. As in the induction basis, the ambiguity constraint requires that for all $i \neq k$: $Lookup^{*}_{G}(v_i, M) = \epsilon$. And so it follows that $Lookup^{*}_{G}(v, M) = Lookup^{*}_{G}(v_k, M)$.

If $v \in DirectSubclasses_{G}(u)$ then $v_k = u$ and $M'(v) = M(u)$ (because $M(u) \neq \epsilon$ and the ambiguity constraint). Thus it follows that $Lookup^{*}_{G'}(v, M') = M(u)$. Since $u$ does not have any superclasses, it also follows that $Lookup^{*}_{G}(v, M) = Lookup^{*}_{G}(u, M) = M(u)$ and the conclusion has been reached.

If $v \notin DirectSubclasses_{G}(u)$ then

$$Lookup^{*}_{G'}(v, M') = M(v) \cdot Lookup^{*}_{G'}(v_k, M') = Lookup^{*}_{G'}(v_k, M')$$

By the induction hypothesis it can be concluded that $Lookup^{*}_{G}$ and $Lookup^{*}_{G'}$ start with the same body. $\Box$

This concludes the proof.
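The statement of the lemma can be exercised on a small hierarchy. The following C++ fragment is only a sketch under stated assumptions: the names `ClassGraph`, `lookup`, `deleteUseless`, and `firstBodyPreserved` are hypothetical, and the lookup order (the body of a class first, then the lookups of its superclasses from left to right) is inferred from the equations in the proof rather than taken from a definition.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical class graph: ordered superclass lists plus the method map M.
// A class without an entry in `body` has the empty method body.
struct ClassGraph {
    std::map<std::string, std::vector<std::string>> supers;
    std::map<std::string, std::string> body;
};

// Assumed lookup: the body of v, followed by the lookups of its superclasses.
std::vector<std::string> lookup(const ClassGraph& g, const std::string& v) {
    std::vector<std::string> out;
    auto b = g.body.find(v);
    if (b != g.body.end()) out.push_back(b->second);
    auto s = g.supers.find(v);
    if (s != g.supers.end()) {
        for (const std::string& sup : s->second) {
            std::vector<std::string> rest = lookup(g, sup);
            out.insert(out.end(), rest.begin(), rest.end());
        }
    }
    return out;
}

// Delete the useless class u (assumed to have no superclasses) and build M':
// a direct subclass without its own body receives a copy of M(u).
ClassGraph deleteUseless(ClassGraph g, const std::string& u) {
    const bool uHasBody = g.body.count(u) > 0;
    const std::string uBody = uHasBody ? g.body[u] : std::string();
    for (auto& entry : g.supers) {
        std::vector<std::string>& sups = entry.second;
        auto it = std::find(sups.begin(), sups.end(), u);
        if (it == sups.end()) continue;  // not a direct subclass of u
        sups.erase(it);
        if (uHasBody && g.body.count(entry.first) == 0) g.body[entry.first] = uBody;
    }
    g.supers.erase(u);
    g.body.erase(u);
    return g;
}

// Checks the claim on one hierarchy: v and x inherit from u, w inherits from v.
bool firstBodyPreserved() {
    ClassGraph g;
    g.supers["u"] = std::vector<std::string>();
    g.supers["v"] = std::vector<std::string>(1, "u");
    g.supers["w"] = std::vector<std::string>(1, "v");
    g.supers["x"] = std::vector<std::string>(1, "u");
    g.body["u"] = "u::m";
    g.body["x"] = "x::m";  // x overrides the method
    const ClassGraph g2 = deleteUseless(g, "u");
    for (const char* c : {"v", "w", "x"}) {
        if (lookup(g, c).front() != lookup(g2, c).front()) return false;
    }
    return true;
}
```

Calling `firstBodyPreserved()` compares the first body found for `v`, `w`, and `x` before and after the deletion. The sequences themselves may differ (for `x` the old sequence has two entries, the new one only one), which corresponds to subcase (b) of the induction basis.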

12.3.5 Addition of Reference

Addition of reference (AddR) adds a new reference edge labeled $\ell$ between existing vertices $v$ and $w$ to the class graph.

Class graph transformation. Preconditions: $v, w \in V$ and $\forall u \in Subclasses^{*}(v): \ell \notin RefLabels(u)$. The class graph transformation "addition of reference $v, \ell, w$" is performed by the operation $AddR(G, v, \ell, w)$, which produces a class graph $G' = (V, E', L')$ such that

$ER' = ER \cup \{(v, \ell, w)\}$ and $LE' = LE \cup \{\ell\}$

It is assumed that (L0; getPosition(); }

Then Tool::getPosition is replaced by a pure virtual function, and the following new methods are added:

    Position *Tool::getPosition() = 0;
    Position *RectTool::getPosition() { return interface->getPosition(); }
    Position *OvalTool::getPosition() { return interface->getPosition(); }
    Position *SelectTool::getPosition() { return interface->getPosition(); }

The above transformation introduces another problem in C++: if any of the subclasses called the distributed member function directly through the scope resolution operator (e.g., Tool::getPosition()), this call now fails to build since the member function is pure virtual and no longer has a body. This can be resolved as follows. In each subclass that uses the scope resolution operator to call the distributed method, create a new method with a unique name containing the original code of the distributed method, and replace calls through the scope resolution operator with calls to this new method. For example, if the method OvalTool::getPosition() calls the method Tool::getPosition() explicitly, then the method OvalTool::getPosition_new() is created with the same code as in Tool::getPosition(). All calls to Tool::getPosition() through the scope resolution operator are replaced by this->getPosition_new().
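The renaming rule just described might look as follows in compilable form. This is a sketch, not the thesis's generated code: the `Position` and `MouseInterface` definitions, the constructor signatures, and the helper `ovalToolX` are assumptions introduced only to make the fragment self-contained.

```cpp
// Assumed stand-ins for types that the surrounding text uses but does not define.
struct Position { int x; int y; };

struct MouseInterface {
    Position pos{3, 4};
    Position* getPosition() { return &pos; }
};

class Tool {
public:
    explicit Tool(MouseInterface* i) : interface(i) {}
    virtual ~Tool() {}
    virtual Position* getPosition() = 0;  // pure virtual after distribution
protected:
    MouseInterface* interface;            // not private, so subclasses can use it
};

class OvalTool : public Tool {
public:
    explicit OvalTool(MouseInterface* i) : Tool(i) {}
    // Previously this body contained the explicit call Tool::getPosition().
    Position* getPosition() override { return this->getPosition_new(); }
private:
    // Uniquely named copy of the original Tool::getPosition() body.
    Position* getPosition_new() { return interface->getPosition(); }
};

// Hypothetical helper: reads the x coordinate through the base-class interface.
int ovalToolX() {
    MouseInterface mi;
    OvalTool oval(&mi);
    Tool* tool = &oval;
    return tool->getPosition()->x;
}
```

The copy is private here because only `OvalTool` needs it; that visibility is a design choice of the sketch and is not prescribed by the transformation.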


Note that the distribution of base class methods requires that attributes of the base class are not private, or, if they are, that they are accessed through non-private accessors. The requirement is needed since the distributed methods might also access non-distributed attributes of the base class.

An even more serious problem for maintaining behavior occurs in the presence of multiple inheritance. The problem is very similar to the one encountered with the deletion of abstract class (DelA) modification in CLOS. Assume again that a class D has three direct superclasses A, B, and C, each of which defines its own constructor (with the same name), and assume class A has another subclass E. Furthermore, suppose class B defines a reference which is distributed to all its subclasses and which is used by its constructor B. The execution of constructors is incremental in nature, as was the execution of before and after methods in CLOS. The execution occurs in the order from least specific to most specific, so the creation of a D-object entails the execution of the following sequence of constructors: A B C D. Similarly, the creation of an E-object executes the constructor sequence A E.

According to the transformation rule for DisR, every method that uses the distributed reference needs to be distributed to all subclasses. This would include the distribution of the constructor. The problem is that in the example constellation it is impossible to distribute the constructor code down without changing the behavior. Prepending B's code to D exchanges the order of B and C, and appending B to A introduces extraneous behavior during creation of an E-object. The only solution is to distribute all constructors down to the concrete classes such that the DisR rule need not be applied. This is not a viable solution, considering that it introduces a lot of redundant code.
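The constructor orders named above can be observed directly. The sketch below is illustrative: the global `trace` string, the class `D2` (standing for D after B's constructor code has been prepended to D's), and the helper template `constructionOrder` are assumptions introduced for the demonstration.

```cpp
#include <string>

// Records the order in which constructor bodies run (helper for this sketch only).
static std::string trace;

struct A { A() { trace += "A "; } };
struct B { B() { trace += "B "; } };
struct C { C() { trace += "C "; } };
struct D : A, B, C { D() { trace += "D "; } };  // original constellation
struct E : A { E() { trace += "E "; } };

// Attempted distribution: B is no longer a base, and its constructor code
// is prepended to the constructor body of the subclass.
struct D2 : A, C { D2() { trace += "B "; trace += "D "; } };

// Constructs one object of type T and returns the recorded constructor order.
template <typename T>
std::string constructionOrder() {
    trace.clear();
    T object;
    (void)object;
    return trace;
}
```

Here `constructionOrder<D>()` yields `"A B C D "` and `constructionOrder<E>()` yields `"A E "`, while `constructionOrder<D2>()` yields `"A C B D "`: the code of B now runs after C, which is the reordering described in the text.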
It is interesting to observe that the above problem again occurs due to the combination of incremental inheritance and multiple inheritance.

14.2.3 Abstraction of common reference (AbsR)

As in the untyped model, no change is necessary for the implementation of member functions, since data members are defined to be protected. Hence member functions of any subclass that accessed an abstracted reference still have access through inheritance.

14.2.4 Replacement of reference (RepR)

In the untyped language model, reference replacement does not require any modification of the code since the objects that can be assigned to the replaced reference are unchanged. However, in a typed language, the reference replacement implies a change in the type declaration of the reference. Two problems occur in this case. First, messages sent to the reference might no longer be understood since there may be no such method known to the reference's new class. Second, wherever the reference is involved in an assignment statement, function call (as a passed parameter), or function return (as the returned value), the reference's new type will no longer be compatible.


The first problem can be solved by supplying, for each method defined in the reference's old class, a corresponding pure virtual function in the reference's new class. Since each construction subclass now inherits methods from both the reference's new and old classes, it must provide its own method to resolve the ambiguity in favor of its original (possibly inherited) method. The second problem requires that objects be converted to the appropriate type in assignment statements, function calls, and function returns. Note that simple casting will not work in C++ under multiple inheritance.

Consider what happens when the reference class of Screen's inputTool is changed from Tool to CanvasTool by reference replacement. Suppose that the following methods were originally defined:

    void Tool::handleMouseClick(DrawWindow *win) = 0;
    void Screen::handleMouseClick(DrawWindow *win) { inputTool->handleMouseClick(win); }
    Screen::Screen(Tool *t) { inputTool = t; }

To solve the first problem, we define a pure virtual function in the CanvasTool class and a disambiguating method in each construction subclass:

    void CanvasTool::handleMouseClick(DrawWindow *win) = 0;
    void RectTool::handleMouseClick(DrawWindow *win) { Tool::handleMouseClick(win); }
    void OvalTool::handleMouseClick(DrawWindow *win) { Tool::handleMouseClick(win); }
    void SelectTool::handleMouseClick(DrawWindow *win) { Tool::handleMouseClick(win); }

To solve the second problem, we generate methods to transform the type of objects from Tool to CanvasTool and from CanvasTool to Tool. Wherever inputTool occurs on the right hand side of an assignment, is passed as a parameter to a function, or is returned from a function, it is first converted to its original type (Tool). Wherever inputTool occurs on the left hand side of an assignment statement, the expression on the right hand side is converted to its new type (CanvasTool).

    Tool *CanvasTool::CT_to_T() = 0;
    CanvasTool *Tool::T_to_CT() = 0;

    Tool *RectTool::CT_to_T() { return this; }
    CanvasTool *RectTool::T_to_CT() { return this; }

    Tool *OvalTool::CT_to_T() { return this; }
    CanvasTool *OvalTool::T_to_CT() { return this; }

    Tool *SelectTool::CT_to_T() { return this; }
    CanvasTool *SelectTool::T_to_CT() { return this; }

    Screen::Screen(Tool *t) : inputTool(t->T_to_CT()) { /* empty */ }
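The conversion methods can be tried out in a compilable form. The following is a sketch under assumptions: the class bodies are reduced to the two conversion functions, the accessor `Screen::tool()` and the helpers `roundTrip` and `crossCastAgrees` are hypothetical, and only `RectTool` is shown. `Tool` and `CanvasTool` are unrelated classes, so a `static_cast` between their pointer types is ill-formed; the conversion works because each `return this;` is an implicit derived-to-base conversion performed inside the concrete class, where the object layout is known. In current C++, `dynamic_cast` can perform the same cross-cast for polymorphic classes, which the sketch uses as a cross-check.

```cpp
class CanvasTool;

class Tool {
public:
    virtual ~Tool() {}
    virtual CanvasTool* T_to_CT() = 0;
};

class CanvasTool {
public:
    virtual ~CanvasTool() {}
    virtual Tool* CT_to_T() = 0;
};

// A construction class inheriting from both the old and the new reference class.
class RectTool : public Tool, public CanvasTool {
public:
    CanvasTool* T_to_CT() override { return this; }  // RectTool* to CanvasTool*
    Tool* CT_to_T() override { return this; }        // RectTool* to Tool*
};

class Screen {
public:
    // The parameter keeps its old type; the stored reference has the new type.
    explicit Screen(Tool* t) : inputTool(t->T_to_CT()) {}
    // Reads convert back to the original type (hypothetical accessor).
    Tool* tool() const { return inputTool->CT_to_T(); }
private:
    CanvasTool* inputTool;
};

// Converting to CanvasTool and back yields the original Tool pointer.
bool roundTrip() {
    RectTool rect;
    Tool* t = &rect;
    Screen screen(t);
    return screen.tool() == t;
}

// The generated conversion agrees with a run-time cross-cast.
bool crossCastAgrees() {
    RectTool rect;
    Tool* t = &rect;
    return dynamic_cast<CanvasTool*>(t) == t->T_to_CT();
}
```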

14.2.5 Deletion of abstract class (DelA)

As in the case of the untyped language model, one problem with deleting a "useless" alternation class is that there may be methods attached to the class. There is additionally the problem that the class name may be used in the static type declarations of objects.

If there are any methods attached to the useless alternation class A, their implementations must be distributed to its immediate subclasses unless they are overridden in those subclasses. If anywhere in the program an explicit call (i.e., through the scope resolution operator "::") to a method A::m is made, we create a new method with a unique name, say A_m, defined for each of A's immediate subclasses. The implementation for A_m is the same as for A::m. Then, every occurrence of an explicit call to A::m is replaced with a call to A_m.

If there are any variables defined of static type A in the program, then we need to find an equivalent substitute type or else keep a class definition of A to be used only to satisfy the type system. If there is an alternation class B with the same set of derived construction classes as for A, B can serve as a substitute type for A. In this case, all the member functions which were defined for A are now declared as pure virtual functions in class B, and class A is deleted. Wherever class A appeared in a type declaration, class B is substituted. Note that in conjunction with the reference replacement transformation, there is always such a corresponding class B. If there is no such corresponding class B, then A cannot be deleted since it must continue to be used in type declarations. In this case, class A is preserved but contains only pure virtual functions. We regard A as a type rather than a class.

Example 14.2 Consider what happens when the Tool class is deleted in the transformation from Figure 41-RepR to Figure 40-B. Suppose the methods declared in class Tool are these:

    virtual void handleMouseClick(DrawWindow *) = 0;
    virtual Position *getPosition() = 0;
    virtual CanvasTool *T_to_CT() = 0;

All the methods happen to be pure virtual, so there are no implementations to be distributed. Furthermore, class CanvasTool qualifies as an equivalent substitute type for class Tool. For each method declared in class Tool, a pure virtual method is declared in class CanvasTool. Everywhere that class Tool is used in a type declaration, it is replaced with class CanvasTool. Finally, class Tool can be deleted.

14.2.6 Addition of concrete class (AddC)

As in the untyped language case, no changes are necessary for the method implementations.

14.2.7 Addition of reference (AddR)

As in the untyped language case, no changes are necessary for the method implementations.

14.2.8 Generalization of reference (GenR)

The problem that occurs with reference generalization is similar to one of the problems that occurs with reference replacement. If the reference class C of some reference is generalized to a base class of C, say B, then we must ensure that for every method in class C there is a corresponding method defined in class B. This is done by defining empty virtual functions in B wherever necessary. Moreover, as for reference replacement, wherever the reference is involved in an assignment statement, function call (as a passed parameter), or function return (as the returned value), the reference's new type will no longer be compatible. In this case, however, a simple cast will suffice since the new class is a base class of the original.

Note that the GenR transformation indicates that a behavior extension is in order. Our goal in this work is simply to ensure behavior preservation. The above transformation achieves this goal, but the resulting code is not desirable from a software engineering point of view. The inserted cast operations are therefore seen as a hint to the programmer as to where the behavior of the program should be extended.

14.2.9 Addition of Subclass (AddS)

As in the propagation pattern model and the untyped model, this transformation does not require any changes to existing code.

14.2.10 Renaming of Class (RenC)

In C++ methods, class names are used in declarations, constructor and destructor definitions, calls to constructors, and in connection with the scope resolution operator "::". All these locations must be updated to ensure behavioral equivalence. In order to find these locations, the correct scope needs to be taken into consideration.
Using methods or variables with the same name as a class is considered bad programming style. The same caveat holds as in the untyped model: if meta information like the class name is utilized in the programs, then maintaining the behavior might require a very involved program analysis, or might not be possible at all.

14.2.11 Renaming of Reference (RenR)

Renaming of reference requires the substitution of the new reference name for the old one at every point of use in the program. Doing this requires knowing when the reference is in scope, since the same name might also be used to denote other program entities. If the reference name is encapsulated in accessor methods, then the changes are localized to within the class defining the reference, and in particular to within the accessor methods.

14.2.12 Telescoping of Inheritance (TelI)

Telescoping an inheritance relation does not require any modifications to the program code except for one special case. Suppose the (empty) class SpecialTool is inserted between CanvasTool and SelectTool. If a SelectTool constructor called the CanvasTool constructor in its initialization list, then this call no longer remains valid since only direct base classes may be mentioned in initialization lists.

    SelectTool::SelectTool(MouseInterface) : CanvasTool(MouseInterface) { ... }

Such a call must be replaced with a call to the new base class, SpecialTool, which must provide an appropriate constructor.

    SelectTool::SelectTool(MouseInterface) : SpecialTool(MouseInterface) { ... }
    SpecialTool::SpecialTool(MouseInterface) : CanvasTool(MouseInterface) { /* empty */ }
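A compilable version of this special case might look as follows. It is a sketch only: passing `MouseInterface` by pointer, its `id` member, the accessor `mouse()`, and the helper `telescopedId` are assumptions, since the shorthand above does not fix the parameter type.

```cpp
// Assumed minimal stand-in for the constructor argument.
struct MouseInterface { int id; };

class CanvasTool {
public:
    explicit CanvasTool(MouseInterface* mi) : interface(mi) {}
    virtual ~CanvasTool() {}
    MouseInterface* mouse() const { return interface; }
private:
    MouseInterface* interface;
};

// Inserted (otherwise empty) class: forwards the constructor argument unchanged.
class SpecialTool : public CanvasTool {
public:
    explicit SpecialTool(MouseInterface* mi) : CanvasTool(mi) { /* empty */ }
};

class SelectTool : public SpecialTool {
public:
    // Before telescoping the initializer was CanvasTool(mi); CanvasTool is
    // now an indirect base, so the direct base SpecialTool is named instead.
    explicit SelectTool(MouseInterface* mi) : SpecialTool(mi) {}
};

// Hypothetical helper: the argument still reaches CanvasTool through SpecialTool.
int telescopedId() {
    MouseInterface mi{7};
    SelectTool tool(&mi);
    return tool.mouse()->id;
}
```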

14.2.13 Telescoping of Reference (TelR)

As in the untyped model, the entire interface of the telescoped class needs to be provided at the inserted class. The only task of these methods is to delegate the message to the telescoped class. Again, every read access through the original reference name needs to strip off the inserted object, and every write access needs to insert a new object. In particular, a constructor for the inserted class needs to be provided which takes as an argument an object of the telescoped class. The above modifications are considerably simplified if accessors are used throughout the program.
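The delegation and the wrapping on read and write can be illustrated as follows. This is a sketch with assumed names: `ToolHolder` stands for the inserted class, the `int`-based `handleMouseClick` signature and the accessors `getInputTool`, `setInputTool`, and `click` are hypothetical simplifications.

```cpp
#include <memory>

// Telescoped class with a deliberately small interface (assumed for the sketch).
class Tool {
public:
    virtual ~Tool() {}
    virtual int handleMouseClick(int x) { return x + 1; }
};

// Inserted class: offers the interface of Tool and delegates every message.
class ToolHolder {
public:
    explicit ToolHolder(Tool* t) : tool(t) {}  // constructor taking the telescoped object
    int handleMouseClick(int x) { return tool->handleMouseClick(x); }
    Tool* get() const { return tool; }
private:
    Tool* tool;
};

class Screen {
public:
    explicit Screen(Tool* t) : inputTool(new ToolHolder(t)) {}
    // Read access strips off the inserted object.
    Tool* getInputTool() const { return inputTool->get(); }
    // Write access inserts a new object around the assigned value.
    void setInputTool(Tool* t) { inputTool.reset(new ToolHolder(t)); }
    // Existing code that sent messages to inputTool keeps working via delegation.
    int click(int x) { return inputTool->handleMouseClick(x); }
private:
    std::unique_ptr<ToolHolder> inputTool;
};
```

Because all accesses go through the two accessors, only `Screen` had to change in this sketch, which matches the remark that accessors considerably simplify the modifications.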


14.3 Discussion

When comparing the update operations necessary in the two language models, the differences are striking. While in the untyped language model only a few updates to method implementations are necessary, the programmer working in the typed language model is faced with numerous problems. For the untyped language model, we have shown that a schema extension can always be propagated to the method implementations such that the behavior of the program is preserved. For the typed language model, however, a behavior preserving update mechanism could only be outlined and is far from satisfactory. The major reason for this is that the type system poses severe restrictions on how updates can be performed. Without semantic information on what the update's intentions are, it is not always possible to change the typing specifications in a reasonable way.

The above comparison helps to explain the popularity of untyped languages for prototyping purposes. Their ability to adapt flexibly to different class structures gives them a major advantage over typed languages in environments where structural changes occur frequently. For typed languages, the propagation pattern approach [LXSL91, LHSLX92] achieves the same flexibility by decoupling the programs from the class structure. Consequently, any change in the class structure affects the propagation pattern only marginally.


Chapter 15

Evolutionary Metaobject Protocol

An important aspect of the evolution framework is its incorporation into the target programming environment. This chapter proposes an embedding of the evolution framework into a reflective system, using a metaobject protocol to provide an interactive and integrated meta-level interface to the target application. This protocol is referred to as the evolutionary metaobject protocol (EMOP). The EMOP is able to take advantage of the reflective and introspective facilities of the meta layer.

The chapter is organized as follows. Section 15.1 presents a brief overview of reflective systems and defines terminology. Section 15.2 motivates and describes the use of a reflective system to realize an evolution management system. Section 15.3 gives a short outline of Closette, the target reflective system for the EMOP. Finally, Section 15.4 presents an implementation for a subset of the primitive transformations.

15.1 Reflective Systems

Reflection has been widely recognized as a powerful and flexible mechanism to implement applications and programming languages [Smi84, dRS84, Coi87, WY88, Bel93, MC93, Mul94, FDM94, DF94]. A major advantage of reflective systems is their flexibility, which is achieved by providing a dual interface. The dual interface consists of both a traditional interface and an adjustment interface [Kic92]. This results in a so-called open implementation. Programmers can take advantage of the open implementation by adjusting it to their needs. In this way, as Kiczales et al. [KdRB91] point out, a reflective system provides an entire region of programming languages. Each language in the region can be realized by appropriately customizing the dual interface that is provided by the reflective system. As was shown in Section 9.4, reflection can also be used to achieve separation of concerns.

Watanabe and Yonezawa [WY88] define reflection to be the process of reasoning about and acting upon itself. In her seminal paper, Pattie Maes provides the following definitions [Mae87]:

Computational system. Computer-based system whose purpose is to answer questions about and/or support actions in some domain. We say a system is about its domain. It incorporates internal structures representing the domain. These structures include data representing entities and relations in the domain, and a program prescribing how these data may be manipulated.

Reflective System. A computational system which is about itself in a causally connected way. [...] A system which incorporates structures representing (aspects of) itself. We call the sum of these structures the self-representation of the system. This self-representation makes it possible for the system to answer questions about itself and support actions on itself. Because the self-representation is causally connected to the aspects of the system it represents, we can say that:

1. The system always has an accurate representation of itself.
2. The status and computation of the system are always in compliance with this representation. This means that a reflective system can actually bring modifications to itself by virtue of its own computation.

Computational reflection. The behavior exhibited by a reflective system.

In a reflective system, one can thus distinguish two kinds of structures, those that represent the domain, and those that represent the system itself. The latter structures are commonly said to be at the meta level. Kiczales et al. pointedly remark [KdRB91]: "The meta level is about the program rather than about whatever the program happens to be about." For an illustration, refer also to Figure 4 (page 21).

A language architecture where reflection is paired with metaclasses results in an object-oriented system that is open-ended and easily extendable, since the metaclasses provide hooks to extend or modify an existing kernel. Examples of such systems are CLOS [Kee89, Ste90, LM91, KdRB91] and ObjVlisp [Coi87]. In addition, the introspective abilities of metaclass-based systems provide the basis for a range of program analysis tools and browsers. Analysis tools typically need to have access to meta level information of the system at hand to answer queries about the system itself. The advantage of self-describing and reflective systems is that the same tools can be used by both developers and users of the system, and that access to data and meta data is uniform.

The database community recognized the power of self-describing database systems early on [FGMS81, Bra87, TS92] and has used them extensively to allow the user to interactively build up a database schema, and to query the structure of the schema. In the database context, the part of the system that contains the meta information is usually called the data dictionary or meta-database. As was shown in Figure 4, the class objects of the object-oriented schema are instances of classes at the meta level; in database terminology, the schema is the content of the meta-database.

15.2 Reflection and Evolution

A reflective system incorporates structures for representing itself. The basic constructs of the programming language, such as classes or object invocation, are described at the meta-level and can be extended or redefined by meta-programming. Each object is associated with a metaobject through a meta-link. The metaobject is responsible for the semantics of operations on the base object. The interface between the base-level program and the higher levels is provided by metaobject protocols. Metaobject protocols are interfaces to the programming language that give users the ability to incrementally modify the language's behavior and implementation. Cointe indicates a number of features that may be controlled by metaclasses [Coi87]:

• The inheritance strategy (simple, multiple, method wrapping, etc.)
• The internal representation of objects
• The access of methods by implementing a caching technique
• The access of instance variable values

For the implementation of the evolution framework we go one step further and add to the above list the responsibility of evolving the system in a behavior preserving manner. The resulting metaobject protocol is called the evolutionary metaobject protocol (EMOP). The responsibility of managing the change of schema objects fits very well into the overall responsibility of a metaclass. In fact, we claim that metaclasses are the ideal place to put this responsibility since they have access to all the necessary information about the evolving system.

Figure 42 illustrates the distribution of responsibility for the three levels of objects. In the same way as the schema objects ("classes" like Employee) define the behavior of their instances (e.g., Joanna), the metaobjects (e.g., Std-Class) define the behavior of their instances (e.g., Employee). The collection of all schema objects constitutes the schema. So in essence, the metaobjects define how the schema behaves. In particular, metaobjects are able to define how the schema reacts when, for example, a new attribute or a new superclass is added to a class, since they have access to all the necessary information of the schema.

[Figure 42 shows three levels connected by instance-of links. Meta objects: Std-Class (method-lookup(), add-attribute(), add-superclass(), ...), Method, Attribute, Counted-Class (inc-counter()). Schema objects (class meta objects): Person, Employee, with the methods family-tree(), yearly-salary(), is-retired(). Data objects: Joanna, John.]

Figure 42: The distribution of responsibility among the three levels of objects

In summary, the idea of the evolutionary metaobject protocol is to provide metaclasses with the ability of evolving the schema. The advantages of a metaobject protocol based evolution system can be characterized as follows:

• Integrated and uniform access to data and meta-data
• Interactive programming environment for evolution
• Extendable programming environment for evolution

Contrast the EMOP approach with a straightforward, non-reflective implementation of the behavior preserving transformations. The straightforward solution provides a simple transformation language whose purpose is to let the software evolution/maintenance engineer specify a schema transformation (e.g., an extension). The transformation system then applies the specified transformation to the source code of the application and outputs the transformed source code. In addition, the system transforms the object base to the new specifications. A characteristic of this transformation system is that it is completely separated from the actual system; that is, the evolution system is not integrated into the overall development environment. All the information about the application needs to be built up from scratch every time the transformation system is invoked.

15.3 Mademoiselle Closette

In the remainder of this chapter, we outline how to realize an EMOP by enhancing the metaobject protocol of Closette [KdRB91]. For an in-depth description of Closette we refer to [KdRB91]. This section describes only those basic characteristics that are relevant for the EMOP.

Closette, as the name suggests, is the "little sister" of CLOS [Kee89]: it is a simple CLOS interpreter which, despite the simplifications, is representative of the architecture of CLOS. In general, Closette's complexity has been reduced by neglecting performance and error checking. As Kiczales et al. point out [KdRB91], the simplifications in Closette are for pedagogical purposes only. The most significant is that it is an interpreter rather than a compiler-based implementation. Other restrictions of Closette include:

• No class redefinition.
• No method redefinition.
• No forward-referenced superclasses.
• Explicit generic function definitions.
• Standard method combination only.
• No eql specializers.
• No slots with :class allocation.
• Types and classes not fully integrated.
• Minimal syntactic sugar.

Except for the above restrictions, Closette contains all the essential features of CLOS. In particular, it supports:

• classes with multiple inheritance,
• creation, initialization and manipulation of instances,
• generic functions, defining a common interface for sets of methods,
• methods, defining the implementation of generic functions, and class-specific behavior.

User-defined classes are represented as objects, called class metaobjects, and are created as the result of a call to defclass. As objects, they are instances of the class standard-class, which defines seven slots, one each for the name of the class, the list of direct superclasses, the list of direct slots, the class precedence list, the list of effective slots, the list of direct subclasses, and the list of direct methods. A class inherits structure and behavior from all its direct and indirect superclasses in the order from most specific to least specific. The classes standard-object and t are always at the top of the inheritance hierarchy.

Instances of user-defined classes all share a common storage layout: in addition to storage for their slot values, they contain storage for their class such that each object has access to its class. An object's class is accessed through the function class-of.

Besides class metaobjects, there are two other important kinds of metaobjects. A generic function metaobject represents a generic function which is created as the result of a call to defgeneric. A generic function metaobject contains slots for the name and argument list of the generic function, and one for its associated set of method metaobjects. A method metaobject captures the information supplied in a call to defmethod, including a link to the associated generic function metaobject, the list of class metaobjects that are the method's parameter specializers, and a form that constitutes the body of the method.

15.4 Closette's EMOP

As was shown in Section 5.6.2 (page 72), CLOS already has a limited capability of evolving programs. Closette has retained the generic function change-class, which changes the class of a single instance and uses update-instance-for-different-class. In this section, a limited set of primitive transformation generic functions is presented that provides Closette with more powerful evolution capabilities.

Following the guidelines of the evolution framework in Section 3.5, a primitive transformation is first applied to the schema. In Closette, the schema is internalized and represented by the collection of class metaobjects. Thus, a primitive transformation first operates on a class metaobject. Since every class metaobject is an instance of the class standard-class, this is achieved by defining a primitive transformation as a generic function that specializes on standard-class. The program maintainer can then automatically evolve the system by calling the generic function with the appropriate class metaobject. In C++ terminology, the maintainer can send an evolution message to the class metaobject.


Consider for example the transformation deletion of abstract class (DelA), de ned below. Two methods are de ned for this transformation. The rst takes as an argument a name (symbol), the second takes a class metaobject. The transformation is typically initiated by giving the name of the class to be deleted. The corresponding method checks whether the class with the given name exists, retrieves the class metaobject (both done by find-class), and then calls the generic function again with the retrieved object. This time, the method with standard-class as specializer is invoked. This method rst checks whether the class is really useless. Then it updates the schema by adjusting the direct sub- and superclasses of the class to be deleted. Note that standard-object is always a direct superclass of a useless class. Since Closette maintains the class precedence list for each class, that must also be updated. As was seen before, no instances must be updated. To update the code all methods de ned for the abstract class must be distributed to its subclasses unless they override the method. Finally, the class metaobject itself is deleted from the class table. (defgeneric deletion-of-abstract-class (a-class)) (defmethod deletion-of-abstract-class ((a-class-name symbol)) ;; 1. Check preconditions I. (let ((a-class (find-class a-class-name))) (deletion-of-abstract-class a-class)) (defmethod

deletion-of-abstract-class ((a-class standard-class))

(progn ;; 1. Check preconditions II. (when (not (useless-p a-class)) (error "~&~S is not a useless class." a-class-name)) ;; 2. Update schema. ;; Delete the a-class object from the superclass list of all subclasses (dolist (subclass (class-direct-subclasses a-class)) (setf (class-direct-superclasses subclass) (if (equal (class-direct-superclasses subclass) (list a-class)) (list (find-class 'standard-object)) (remove a-class (class-direct-superclasses subclass))))) ;; Delete a-class object from standard-object's subclass list (let* ((standard-object-class (find-class 'standard-object)))

264

Chapter 15. Evolutionary Metaobject Protocol (setf (class-direct-subclasses standard-object-class) (remove a-class (class-direct-subclasses standard-object-class)))) ;; Recompute the class precedence list of ALL subclasses. ;; We don't need to re-compute the class-slots list of the subclasses ;; since the class about to be deleted is useless. (dolist (subclass (subclasses a-class)) (setf (class-precedence-list subclass) (compute-class-precedence-list subclass))) ;; 3. Update objects. ;; No instances need be updated ;; 4. Update code. ;; Distribute each method in a-class to its subclasses (dolist (method (class-direct-methods a-class)) (distribute-common-method a-class method)) ;; Delete a-class itself (forget-class (class-name a-class)) ;; No reasonable return value (values) ))

    (defgeneric distribute-common-method (class method))

    (defmethod distribute-common-method ((class standard-class)
                                         (method standard-method))
      (let ((gf (method-generic-function method)))
        ;; First remove the method from the generic function list.
        ;; Caution: this sets the method-generic-function slot to nil.
        (remove-method gf method)
        ;; Then distribute the method down to every subclass
        (dolist (subclass (class-direct-subclasses class))
          (let* ((subclass-method-specializers
                   (substitute subclass class (method-specializers method)))
                 (subclass-method
                   (find-method gf
                                (method-qualifiers method)
                                subclass-method-specializers
                                nil)))
            (when (null subclass-method)
              (ensure-method gf
                             :lambda-list (method-lambda-list method)
                             :qualifiers (method-qualifiers method)
                             :specializers subclass-method-specializers
                             :body (method-body method)
                             :environment (method-environment method))
              )))
        ))
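The core of distribute-common-method is the computation of the new specializer lists: the abstract class is replaced by each direct subclass, and a subclass that already has a method with those specializers keeps its own. A minimal sketch of just that step is given below, using plain symbols in place of class metaobjects and a list of specializer lists in place of the generic function's method table. The function name distributed-specializers and this data representation are assumptions made for illustration, not part of Closette.

```lisp
(defun distributed-specializers (class subclasses specializers existing)
  "Return one specializer list per element of SUBCLASSES, obtained by
replacing CLASS in SPECIALIZERS, skipping lists already present in EXISTING
(i.e., subclasses that override the method)."
  (loop for subclass in subclasses
        for candidate = (substitute subclass class specializers)
        unless (member candidate existing :test #'equal)
          collect candidate))

;; CIRCLE already overrides the method, SQUARE does not:
(distributed-specializers 'shape '(circle square) '(shape t) '((circle t)))
;; => ((SQUARE T))
```

As in the EMOP method, substitute replaces every occurrence of the abstract class, so a method specialized on the class in several argument positions is distributed consistently.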

In the method distribute-common-method it was assumed that only primary methods are present in the abstract class, and that no calls to call-next-method are made in subclass methods. If that is not the case, a more complicated merging of distributed and subclass methods must be done, which may require user interaction.

Another example is the addition of abstract class transformation (AddA). Here only schema updates are necessary.

    (defgeneric addition-of-abstract-class (a-class subclass-names))

    (defmethod addition-of-abstract-class ((a-class-name symbol) subclass-names)
      ;; 1. Check preconditions (done through find-class)
      (let ((a-class (eval (list 'defclass a-class-name () () )))
            (subclasses (mapcar #'find-class subclass-names)))
        (addition-of-abstract-class a-class subclasses)))

    (defmethod addition-of-abstract-class ((a-class standard-class) subclasses)
      (progn
        ;; 2. Update schema.
        ;; For each direct subclass, add the new superclass to its superclass
        ;; list and add each subclass to the subclass list of the useless
        ;; alternation class.
        (dolist (subclass subclasses)
          ;; If standard-object is the only class on the superclass
          ;; list of a given subclass then replace by the
          ;; useless alternation class, otherwise add the a-class
          ;; to the end of the superclass list.
          (if (equal (class-direct-superclasses subclass)
                     (list (find-class 'standard-object)))
              (setf (class-direct-superclasses subclass) (list a-class))
              (push-on-end a-class (class-direct-superclasses subclass)))
          (push-on-end subclass (class-direct-subclasses a-class))
          )
        ;; Recompute the class precedence list of ALL subclasses.
        ;; We don't need to re-compute the class-slots list of the subclasses
        ;; since we added an abstract class.
        (dolist (subclass (subclasses a-class))
          (setf (class-precedence-list subclass)
                (compute-class-precedence-list subclass)))
        ;; 3. Update objects.
        ;; No instances need be updated
        ;; 4. Update code.
        ;; Code not affected.
        ;; Return newly added class
        a-class
        ))
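The superclass-list updates performed by AddA and DelA mirror each other. The sketch below isolates the two list manipulations, with symbols standing in for class metaobjects; the function names add-superclass and remove-superclass are hypothetical and only serve to show that the two updates cancel out in the cases handled by the methods above.

```lisp
(defun add-superclass (superclasses new-super)
  "AddA rule: replace a lone STANDARD-OBJECT, otherwise append at the end."
  (if (equal superclasses '(standard-object))
      (list new-super)
      (append superclasses (list new-super))))

(defun remove-superclass (superclasses old-super)
  "DelA rule: fall back to STANDARD-OBJECT if OLD-SUPER was the only entry."
  (if (equal superclasses (list old-super))
      (list 'standard-object)
      (remove old-super superclasses)))

(add-superclass '(standard-object) 'shape)   ; => (SHAPE)
(add-superclass '(drawable) 'shape)          ; => (DRAWABLE SHAPE)
(remove-superclass '(shape) 'shape)          ; => (STANDARD-OBJECT)
```

Both functions are non-destructive, whereas the EMOP methods update the class-direct-superclasses slot in place via setf and push-on-end.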

The transformation renaming of class (RenC) reveals another advantage of the EMOP. Since class metaobjects are stored as objects in all structures that reference them (e.g., sub- and superclass lists), changing a class name once, in the name slot of the metaobject, updates all references simultaneously. This would not be the case if references were resolved by name. The code update, in contrast, must change the class name at every location where it is used.

    ;;; Assumption: built-in classes are not allowed to be changed.
    (defgeneric rename-class (class new-class-name))

    (defmethod rename-class ((old-name symbol) new-name)
      ;; 1. Check preconditions
      (let ((old-class (find-class old-name))
            (new-class (find-class new-name nil)))
        (if (null new-class)
            (rename-class old-class new-name)
            (error "~&New class ~S already exists." new-name))))

    (defmethod rename-class ((class standard-class) new-name)
      (let ((old-name (class-name class)))
        ;; 2. Update schema.
        ;; change name in class metaobject
        (setf (class-name class) new-name)
        ;; assign new name in class table
        (assign-new-class-name old-name new-name)
        ;; Update all subclasses so that they reflect the correct superclass.
        ;; --> this is automatically done since subclasses store their
        ;;     superclasses as objects.
        ;; 3. Update objects.
        ;; No instances need be updated
        ;; 4. Update code.
        (update-code-for-rename-class class new-name)
        class))

    (defun update-code-for-rename-class (class new-name)
      (format t "Updating code for rename ...")
      ;; update all constructor calls.
      )
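The benefit of storing references as objects rather than names can be reproduced with a toy class table. The sketch below is not Closette: toy-class, *toy-class-table*, define-toy-class and rename-toy-class are hypothetical stand-ins for class metaobjects, the class table and rename-class, kept deliberately small.

```lisp
;; Hypothetical stand-in for a class metaobject.
(defstruct toy-class name (superclasses '()))

(defvar *toy-class-table* (make-hash-table :test #'eq))

(defun define-toy-class (name &rest superclass-names)
  "Create a toy class whose superclass list holds OBJECTS, not names."
  (setf (gethash name *toy-class-table*)
        (make-toy-class
         :name name
         :superclasses (mapcar (lambda (super)
                                 (gethash super *toy-class-table*))
                               superclass-names))))

(defun rename-toy-class (old-name new-name)
  "Rename a toy class: one slot update plus one class-table update."
  (let ((class (gethash old-name *toy-class-table*)))
    (when (null class)
      (error "No class named ~S." old-name))
    (when (gethash new-name *toy-class-table*)
      (error "New class ~S already exists." new-name))
    (setf (toy-class-name class) new-name)
    (remhash old-name *toy-class-table*)
    (setf (gethash new-name *toy-class-table*) class)
    class))

(define-toy-class 'shape)
(define-toy-class 'circle 'shape)
(rename-toy-class 'shape 'figure)

;; The subclass entry was never touched, yet it sees the new name.
(toy-class-name
 (first (toy-class-superclasses (gethash 'circle *toy-class-table*))))
;; => FIGURE
```

Had circle stored the symbol shape instead of the object, the rename would have had to visit every superclass list in the table.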

A transformation where both objects and code need to be updated is abstraction of common reference (AbsR). Note that in order to have access to all objects of a class (its extension), a new slot, instances, had to be added to the class standard-class, together with an accessor class-instances. This also affects the creation and deletion of objects: the list of instances must be kept up to date.

    ;;; Abstraction of Common Reference (AbsR)
    ;;;
    ;;; Abstract a slot common to all direct subclasses of a given target class
    ;;; up to the target class.
    ;;;
    ;;; The slot must be a valid slot in the direct subclasses of class target.
    ;;; It is assumed that the corresponding slot association lists in the
    ;;; subclasses are equivalent. It is the slot of the first subclass that
    ;;; is actually moved up to the superclass.

    (defgeneric abstract-common-slot (target slot-name))

    (defmethod abstract-common-slot ((target-name symbol) slot-name)
      ;; 1. Check preconditions I.
      (let ((target (find-class target-name)))
        (abstract-common-slot target slot-name)))

    (defmethod abstract-common-slot ((target standard-class) slot-name)
      (let ((slot (find slot-name
                        (class-direct-slots (car (class-direct-subclasses target)))
                        :key #'slot-definition-name)))
        ;; 1. Check preconditions II.
        ;; The slot to abstract must be a valid slot in each direct subclass.
        ;; Check for validity.
        (dolist (subclass (class-direct-subclasses target))
          (when (null (find slot-name (class-direct-slots subclass)
                            :key #'slot-definition-name))
            (error "~&Slot ~S does not exist in subclass ~S."
                   slot-name (class-name subclass))))
        ;; Now it's safe to proceed.
        ;; 2. Update schema.
        ;; Add the slot to the target class.
        (push-on-end slot (class-direct-slots target))
        (setf (class-slots target) (compute-slots target))
        ;; 2./3. Update schema and objects.
        ;; Adjust the slots (direct and inherited) of all subclasses
        ;; and of all their instances. Note that target is not supposed to
        ;; have instances since it is an abstract class.
        (dolist (subclass (subclasses target))
          (let ((old-location (slot-location subclass slot-name)))
            ;; Adjust effective slots.
            (setf (class-slots subclass) (compute-slots subclass))
            (let ((new-location (slot-location subclass slot-name)))
              ;; 3. Update objects (need extension of class)
              ;; Rearrange the objects (EAGER conversion).
              (dolist (instance (class-instances subclass))
                ;; rotate old to new slot-location and shift the rest
                (rotate-slot-storage (std-instance-slots instance)
                                     old-location new-location))
              )))
        ;; 4. Update code.
        ;; Add reader and writer methods for target class.
        (dolist (reader (slot-definition-readers slot))
          (add-reader-method target reader slot-name))
        (dolist (writer (slot-definition-writers slot))
          (add-writer-method target writer slot-name))
        ;; That's it for the target, now take care of the subclasses
        ;; For each direct subclass
        (dolist (subclass (class-direct-subclasses target))
          ;; Remove the slot from subclass.
          (setf (class-direct-slots subclass)
                (remove slot-name (class-direct-slots subclass)
                        :key #'slot-definition-name))
          ;; Remove reader and writer methods from subclass.
          ;; Note again: readers and writers are assumed to be
          ;; identical in each subclass!
          (dolist (reader (slot-definition-readers slot))
            (let* ((reader-gf (find-generic-function reader))
                   (reader-method (find-method reader-gf '()
                                               (list subclass) nil)))
              (remove-method reader-gf reader-method)))
          (dolist (writer (slot-definition-writers slot))
            (let* ((writer-gf (find-generic-function writer))
                   (writer-method (find-method writer-gf '()
                                               (list (find-class 't) subclass)
                                               nil)))
              (remove-method writer-gf writer-method))))
        target))
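AbsR depends on the extension of each class being available through class-instances. In Closette this is done by adding a slot to standard-class; the sketch below shows one possible way to obtain a comparable registry in standard CLOS without touching the metaclass, using a mixin class and an :after method on initialize-instance. All names (tracked-object, *class-extensions*, class-extension, forget-instance) are assumptions made for this illustration.

```lisp
(defvar *class-extensions* (make-hash-table :test #'eq)
  "Maps a class metaobject to the list of its recorded direct instances.")

;; Mixin: every class that inherits from it records its instances.
(defclass tracked-object () ())

;; Creation hook: register each new instance under its class.
(defmethod initialize-instance :after ((object tracked-object) &key)
  (push object (gethash (class-of object) *class-extensions*)))

(defun class-extension (name)
  "Return the recorded direct instances of the class named NAME."
  (values (gethash (find-class name) *class-extensions*)))

(defun forget-instance (object)
  "Deletion hook: remove OBJECT from the extension of its class."
  (setf (gethash (class-of object) *class-extensions*)
        (remove object (gethash (class-of object) *class-extensions*)))
  (values))

(defclass point (tracked-object) ())
```

After two calls to (make-instance 'point), (class-extension 'point) holds both objects, and forget-instance removes one of them again. Note that such a registry keeps its instances reachable, so an explicit deletion step is needed for them to be reclaimed.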

Below is the corresponding implementation for the transformation distribution of common reference (DisR).

    ;;;-----------------------------------------------------------------------
    ;;;
    ;;; Distribution of Common Reference (DisR)
    ;;;
    ;;; Distribute a slot from a source class to all its direct subclasses.
    ;;;
    ;;; It is important that the abstract superclass rule is followed since
    ;;; otherwise this operation would be tantamount to deleting a part of
    ;;; an object, an operation that cannot possibly maintain behavior.
    ;;;

    (defgeneric distribute-slot (source slot-name))

    (defmethod distribute-slot ((source-name symbol) slot-name)
      ;; 1. Check preconditions I.
      (let ((source (find-class source-name)))
        (distribute-slot source slot-name)))

    (defmethod distribute-slot ((source standard-class) slot-name)
      (let ((slot (find slot-name (class-direct-slots source)
                        :key #'slot-definition-name)))
        ;; 1. Check preconditions II.
        ;; The slot to distribute must be a valid slot in the source class.
        ;; Check for validity.
        (when (null slot)
          (error "~&Slot ~S does not exist in class ~S."
                 slot-name (class-name source)))
        ;; Now it's safe to proceed.
        ;; 2. Update schema.
        ;; Remove the slot from the source and recompute its effective slots.
        (setf (class-direct-slots source)
              (remove slot (class-direct-slots source)))
        (setf (class-slots source) (compute-slots source))
        ;; 2./3. Update schema and objects.
        ;; Adjust the slots (direct and inherited) of all subclasses
        ;; and of all of their instances. Note that the source is not
        ;; supposed to have instances since it is an abstract class.
        (dolist (subclass (subclasses source))
          (let ((old-location (slot-location subclass slot-name)))
            ;; Adjust effective slots.
            (setf (class-slots subclass) (compute-slots subclass))
            (let ((new-location (slot-location subclass slot-name)))
              ;; Rearrange the objects (EAGER conversion).
              (dolist (instance (class-instances subclass))
                ;; rotate old to new slot-location and shift the rest
                (rotate-slot-storage (std-instance-slots instance)
                                     old-location new-location))
              )))
        ;; 4. Update code.
        ;; Remove reader and writer methods from source class.
        (dolist (reader (slot-definition-readers slot))
          (let* ((reader-gf (find-generic-function reader))
                 (reader-method (find-method reader-gf '()
                                             (list source) nil)))
            (remove-method reader-gf reader-method)))
        (dolist (writer (slot-definition-writers slot))
          (let* ((writer-gf (find-generic-function writer))
                 (writer-method (find-method writer-gf '()
                                             (list (find-class 't) source)
                                             nil)))
            (remove-method writer-gf writer-method)))
        ;; That's it for the source, now take care of the subclasses
        ;; For each direct subclass
        (dolist (subclass (class-direct-subclasses source))
          ;; Add the distributed slot.
          (push-on-end slot (class-direct-slots subclass))
          ;; Add reader and writer methods.
          (dolist (reader (slot-definition-readers slot))
            (add-reader-method subclass reader (slot-definition-name slot)))
          (dolist (writer (slot-definition-writers slot))
            (add-writer-method subclass writer (slot-definition-name slot)))
          )
        source))
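Both AbsR and DisR convert existing instances eagerly by calling rotate-slot-storage, whose definition is not shown here. Going only by its comment ("rotate old to new slot-location and shift the rest"), one plausible reading is sketched below for a slot vector. The function rotate-storage is a hypothetical, non-destructive stand-in; the actual Closette extension may well update the storage in place or differ in other details.

```lisp
(defun rotate-storage (storage old new)
  "Return a fresh vector in which the element at index OLD of STORAGE has
been moved to index NEW, with the elements in between shifted by one."
  (let* ((items (coerce storage 'list))
         (value (nth old items))
         ;; All elements except the one being moved, in original order.
         (others (append (subseq items 0 old) (subseq items (1+ old)))))
    (coerce (append (subseq others 0 new) (list value) (subseq others new))
            'vector)))

;; A slot inherited last (index 3) becomes the first slot after abstraction:
(rotate-storage #(a b c d) 3 0)   ; => #(D A B C)
;; The reverse direction, e.g. after distributing a slot downwards:
(rotate-storage #(a b c d) 0 2)   ; => #(B C A D)
```

The relative order of all other slot values is preserved, which is what allows the remaining slot locations of the class to stay consistent with its recomputed effective slot list.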

Other transformations can be done similarly following the guidelines of Chapter 13.


15.5 Conclusion

This chapter suggested an implementation of the evolution framework with a metaobject protocol, embedding it into a reflective system. For demonstration purposes, an extension of the metaobject protocol of Closette, a subset of CLOS, was used.


Chapter 16

Conclusions

All good things have an end, only sausages have two.

Swiss proverb

The evolution and maintenance of software systems have gained tremendous importance in the last decade. Software evolution deals with the management of changes to a software system. Any change is a prodigious source of new software faults and inconsistencies in other parts of the system. Unless the evolution process resolves such inconsistencies, the system stops being fully operational. The extensive costs incurred by evolution activities make the study of evolution strategies and automatic evolution mechanisms a worthwhile topic for research. This thesis made the following contributions to the management of the evolution of object-oriented systems:

• We have developed a comprehensive theory on the evolution of object-oriented systems, including an overview of the state of the art and related work.

• We have designed an evolution framework for maintaining consistency and behavior of an object-oriented system. The framework presents a high-level process of how to automatically maintain the consistency of a system. As a semantic guideline for the update process, the original and the updated system need to be behaviorally equivalent. For this purpose, the framework contains a formal definition of behavioral equivalence and system consistency.

• We have demonstrated the feasibility of the framework. A general object-oriented data model has been defined using class graphs to describe the structural aspects of an object-oriented schema. A comprehensive set of primitive schema transformations was presented including, for each transformation, (a) a set of preconditions that guarantees conceptual consistency, (b) an object transformation to preserve structural consistency, and (c) a program transformation to maintain behavioral consistency. Program transformations are applied to a variety of object-oriented language models: (1) to CLOS, as representative of the untyped model; (2) to C++, as representative of the typed model; and (3) to propagation patterns, as representative of the adaptive software model. For the adaptive software model, the proofs of behavioral equivalence and consistency were demonstrated formally utilizing a new formal and executable semantics for Adaptive Software, and a newly developed set of proof techniques. An implementation of the evolution framework was suggested with a metaobject protocol, and outlined with Closette.

• We have designed a set of fully dynamic algorithms to maintain the median in an evolving tree. This contribution represents an additional important result in an area related to evolution. The supported tree modifications are node insertion and deletion as well as node and edge weight changes. The employed data structure is such that only O(D) nodes need be traversed to update the tree after a modification, where D is the diameter of the tree. In contrast, the optimal static algorithm needs to traverse O(n) nodes to compute the median, where n is the number of nodes in the tree.


[HTY89]

Richard Hull, Katsumi Tanaka, and Masatoshi Yoshikawa. Behavior Analysis of Object-Oriented Databases: Method Structure, Execution Trees, and Reachability. In Proceedings of the International Conference on Foundations of Data Organization and Algorithms, Paris, France, June 1989. Springer Verlag, Lecture Notes in Computer Science. 11

Bibliography

289

[Hur94]

Walter L. Hursch. Should Superclasses be Abstract? In Mario Tokoro and Remo Pareschi, editors, European Conference on Object-Oriented Programming (ECOOP), pages 12{31, Bologna, Italy, July 1994. Springer Verlag, Lecture Notes in Computer Science. Vol. 821. 40, 103

[IK83]

T. Ibaraki and N. Katoh. On-line Computation of Transitive Closure for Graphs. Journal of Information Processing, 16:95{97, 1983. 176

[Ita88]

Giuseppe F. Italiano. Finding Paths and Deleting Edges in Directed Acyclic Graphs. Journal of Information Processing, 28:5{11, 1988. 176

[Jam94]

Kevin Jameson. Multi-Platform Code Management. O'Reilly & Associates, Inc., Sebastopol, CA, August 1994. ISBN 1-56592-059-7. 3, 9 Ivar Jacobson, Magnus Christerson, Patrik Jonsson, and Gunnar O vergaard. ObjectOriented Software Engineering. Addison-Wesley, Reading, MA, 1992. ISBN 0-20154335-0. 2, 39, 150, 153

[JCJO 92]

[JF88]

Ralph E. Johnson and Brian Foote. Designing Reusable Classes. Journal of ObjectOriented Programming, 1(2):22{35, June/July 1988. 1, 2, 100, 105, 118, 119, 151

[JO93]

Ralph E. Johnson and William F. Opdyke. Refactoring and Aggregation. In S. Nishio and A. Yonezawa, editors, International Symposium on Object Technologies and Advanced Software (ISOTAS), pages 264{278, Kanazawa, Japan, November 1993. JSSST, Springer Verlag, Lecture Notes in Computer Science. Vol. 742. 10, 68

[Joh77]

Donald B. Johnson. Ecient algorithms for shortest paths in sparse networks. Journal of the Association for Computing Machinery, 24(1):1{13, 1977. 174

290

Bibliography

[Joh93]

Ralph Johnson. Abstract superclasses in object-oriented libraries. Private communication, November 1993. 105, 119

[KdRB91]

Gregor Kiczales, Jim des Rivieres, and Daniel G. Bobrow. The Art of the Metaobject Protocol. The MIT Press, Cambridge, Massachusetts, 1991. ISBN 0-262-11158-6 (hc.), second printing, 1992. 67, 146, 237, 257, 258, 261

[Kee89]

Sonya E. Keene. Object-Oriented Programming in Common Lisp: A Programmer's Guide to CLOS. Addison-Wesley, Reading, MA, 1989. ISBN 0-201-17589-4. 67, 105, 235, 237, 258, 261

[KH79]

O. Kariv and S. L. Hakimi. An Algorithmic Approach to Network Location Problems. II: the p-Medians. SIAM Journal of Applied Mathematics, 37(3):539{560, December 1979. 170, 175, 183

[Kic92]

Gregor Kiczales. Towards a New Model of Abstraction in the Engineering of Software. In International Workshop on Re ection and Meta-Level Architecture, Tama-City, Tokyo, Japan, November 1992. 15, 144, 145, 146, 257

[Kni89]

Kevin Knight. Uni cation: A multidisciplinary survey. ACM Computing Surveys, 21(1):93{124, March 1989. 100

[KPBG94] Mehmet Hakan Karaata, Sriram V. Pemmaraju, Steven C. Bruell, and Sukumar Ghosh. Self-stabilizing Algorithms for Finding Centers and Medians of Trees. In ACM Symposium on Principles of Distributed Computing, page 374, August 1994. Full version appeared as Technical Report TR 94-03, Department of Computer Science, The University of Iowa. 175, 183 [KR88]

Brian W. Kernighan and Dennis M. Ritchie. The C Programming Language. Software Series. Prentice-Hall, Englewood Cli s, NJ, second edition, 1988. ISBN 0-13-110362-8. 247

Bibliography

291

[KRS84]

E. Korach, D. Rotem, and N. Santoro. Distributed Algorithms for Finding Centers and Medians in Networks. ACM Transactions on Programming Languages and Systems, 6(3), July 1984. 170, 175, 183

[LBSL91]

Karl J. Lieberherr, Paul Bergstein, and Ignacio Silva-Lepe. From objects to classes: Algorithms for optimal object-oriented design. Journal of Software Engineering, 6(4):205{228, July 1991. 100

[LH89]

Karl J. Lieberherr and Ian M. Holland. Assuring Good Style for Object-Oriented Programs. IEEE Software, 6(5):38{48, September 1989. 22, 124, 154

[LH90]

Barbara Staudt Lerner and A. Nico Habermann. Beyond Schema Evolution to Database Reorganization. In Norman Meyrowitz, editor, Proceedings OOPSLA/ECOOP '90, pages 67{76, Ottawa, Canada, October 1990. ACM Press. Special Issue of SIGPLAN Notices, Vol.25, No.10. 9, 27, 77

[LHM92]

Karl J. Lieberherr, Walter L. Hursch, and Sougata Mukherjea. Optimal and Ecient Schema Abstraction. Technical Report NU-CCS-92-2, College of Computer Science, Northeastern University, Boston, MA, January 1992. 87

[LHSLX92] Karl J. Lieberherr, Walter L. Hursch, Ignacio Silva-Lepe, and Cun Xiao. Experience with a Graph-Based Propagation Pattern Programming Tool. In Gene Forte et al., editor, International Workshop on CASE, pages 114{119, Montreal, Canada, July 1992. IEEE Computer Society Press. 15, 127, 255 [LHX91]

Karl J. Lieberherr, Walter L. Hursch, and Cun Xiao. Object-extending Class Transformations. Technical Report NU-CCS-91-8, College of Computer Science, Northeastern University, Boston, MA, September 1991. 69

[LHX94]

Karl J. Lieberherr, Walter L. Hursch, and Cun Xiao. Object-extending Class Transformations. Formal Aspects of Computing, the International Journal of Formal Methods, 6(4):391{416, July 1994.

292

Bibliography

13, 64, 65, 69 [Lie96]

Karl J. Lieberherr. Adaptive Object-Oriented Software: The Demeter Method with Propagation Patterns. PWS Publishing Company, a Division of Wadsworth, Inc., Boston, Massachusetts, 1996. ISBN 0-534-94620-X. 15, 83, 101, 150, 151

[Lip93]

Stanley B. Lippman. C++ Primer. Addison-Wesley, Reading, MA, second edition edition, July 1993. ISBN. 247

[LL94]

Cristina Videira Lopes and Karl J. Lieberherr. Abstracting Process-to-Process Relations in concurrent Object-Oriented Applications. In Mario Tokoro and Remo Pareschi, editors, European Conference on Object-Oriented Programming (ECOOP), pages 81{99, Bologna, Italy, July 1994. Springer Verlag, Lecture Notes in Computer Science. Vol. 821. 143, 145, 146

[LLOW91] Charles Lamb, Gordon Landis, Jack Orenstein, and Dan Weinreb. The ObjectStore Database System. Communications of the ACM, 34(10):50{63, October 1991. 28 [LM91]

Jo A. Lawless and Molly M. Miller. Understanding CLOS, The Common Lisp Object System. Digital Press, Bedford, MA, 1991. ISBN 1-55558-064-5. 67, 237, 258

[Lop95]

Cristina Videira Lopes. Adaptive parameter passing. Technical Report NU-CCS95-05, College of Computer Science, Northeastern University, Boston, MA, February 1995. 234

[Lop96]

Cristina Videira Lopes. Adaptive Parameter Passing. In International Symposium on Object Technologies and Advanced Software (ISOTAS), Kanazawa, Japan, March 1996. Springer Verlag, Lecture Notes in Computer Science. Accepted for publication. 151

[LP83]

R. Lorie and W. Plou e. Complex objects and their use in design transactions. In Database Week 1983 (Database for Engineering Applications), pages 115{121. ACM, May 1983.

Bibliography

293 45

[LP91]

Wilf LaLonde and John Pugh. Subclassing 6= Subtyping 6= Is-a. Journal of ObjectOriented Programming, 3(5):57{62, January 1991. 113

[LS93]

Amarit Laorakpong and Motoshi Saeki. Object-Oriented Speci cation Development using VDM. In S. Nishio and A. Yonezawa, editors, International Symposium on Object Technologies and Advanced Software (ISOTAS), pages 529{543, Kanazawa, Japan, November 1993. JSSST, Springer Verlag, Lecture Notes in Computer Science. Vol. 742. 34

[LSLX94]

Karl J. Lieberherr, Ignacio Silva-Lepe, and Cun Xiao. Adaptive object-oriented programming using graph-based customization. Communications of the ACM, 37(5):94{ 101, May 1994. 15, 143, 146, 150

[LX93a]

Karl Lieberherr and Cun Xiao. Object-Oriented Software Evolution. IEEE Transactions on Software Engineering, 19(4):313{343, April 1993. 15

[LX93b]

Karl J. Lieberherr and Cun Xiao. Formal Foundations for Object-Oriented Data Modeling. IEEE Transactions on Knowledge and Data Engineering, 5(6), June 1993. 39, 114

[LXSL91]

Karl J. Lieberherr, Cun Xiao, and Ignacio Silva-Lepe. Graph-based software engineering: Concise speci cations of cooperative behavior. Technical Report NU-CCS-91-14, College of Computer Science, Northeastern University, Boston, MA, September 1991. 255

[LZHL95]

Ling Liu, Roberto Zicari, Walter Hursch, and Karl Lieberherr. The Role of Polymorphic Reuse Mechanisms in Schema Evolution in an Object-Oriented Database. IEEE Transactions on Knowledge and Data Engineering, 1995. Accepted for publication. 15

[Mae87]

Pattie Maes. Concepts and Experiments in Computational Re ection. In Norman Meyrowitz, editor, Object-Oriented Programming Systems, Languages and Applications Conference (OOPSLA), pages 147{155, Orlando, Florida, October 1987. ACM Press. Special Issue of SIGPLAN Notices, Vol.22, No.12.

294

Bibliography

146, 258 [MC93]

Philippe Mulet and Pierre Cointe. De nition of a re ective kernel for a prototypebased language. In S. Nishio and A. Yonezawa, editors, International Symposium on Object Technologies and Advanced Software (ISOTAS), pages 128{144, Kanazawa, Japan, November 1993. JSSST, Springer Verlag, Lecture Notes in Computer Science. Vol. 742. 257

[Mey88]

Bertrand Meyer. Object-Oriented Software Construction. International Series in Computer Science. Prentice-Hall, Englewood Cli s, NJ, 1988. ISBN 0-13-629049-3. 106, 109

[Mey90]

Bertrand Meyer. Lessons from the design of the Ei el libraires. Communications of the ACM, 33(9):68{88, September 1990. 151

[Mey92]

Bertrand Meyer. Ei el: The Language. Object-Oriented Series. Prentice-Hall, Englewood Cli s, NJ, 1992. ISBN 0-13-247925-7. 100, 106, 109, 149

[Min78]

Edward Minieka. Optimization Algorithms for Networks and Graphs, volume 1 of Industrial Engineering. Marcel Dekker, Inc., New York, 1978. ISBN 0-8247-6642-3. 170

[Mul94]

Philippe Mulet. Modelling delegation in a re ective and prototype-based language. Submitted for publication, February 1994. 257

[MY93]

Satoshi Matsuoka and Akinori Yonezawa. Analysis of inheritance anomaly in objectoriented concurrent programming languages. In Gul Agha, Peter Wegner, and Akinori Yonezawa, editors, Research Directions in Concurrent Object-Oriented Programming, chapter 1, pages 107{150. The MIT Press, Cambridge, Massachusetts, 1993. 142

[Mye88]

Ware Myers. Interview with Wilma Osborne. IEEE Software, 5(3):104{105, May 1988. 1

Bibliography

295

[NS87]

K. Narayanaswamy and Walt Scacchi. Maintaining Con gurations of Evolving Software Systems. IEEE Transactions on Software Engineering, 13(3):324{334, March 1987. 3, 16

[OI94]

Hideaki Okamura and Yutaka Ishikawa. Object Location Control Using Meta-level Programming. In Mario Tokoro and Remo Pareschi, editors, European Conference on Object-Oriented Programming (ECOOP), pages 299{319, Bologna, Italy, July 1994. Springer Verlag, Lecture Notes in Computer Science. Vol. 821. 143, 144, 145, 146

[OJ90]

William F. Opdyke and Ralph E. Johnson. Refactoring: An aid in designing application frameworks and evolving object-oriented systems. In Proceedings of the Symposium on Object-Oriented Programming emphasizing Practical Applications (SOOPA), pages 145{160, Poughkeepsie, NY, September 1990. ACM. 10

[Opd92]

William F. Opdyke. Refactoring: A Program Restructuring Aid in Designing objectOriented Application Frameworks. PhD thesis, Computer Science Department, University of Illinois, May 1992. 10, 11, 34, 68

[Osb89]

Sylvia L. Osborn. The role of polymorphism in schema evolution in an object-oriented database. IEEE Transactions on Knowledge and Data Engineering, 1(3):310{317, September 1989. 9

[OT93]

C.L. Ong and W.T. Tsai. Class and Object Extraction from Imperative Code. Journal of Object-Oriented Programming, 6(1):58{68, March/April 1993. 3 M. Tamer O zsu and Patrick Valduriez. Principles of Distributed Database Systems. Prentice-Hall, Englewood Cli s, NJ, 1991. ISBN 0-13-691643-0. 170

[O V91]

[Par90]

Helmut A. Partsch. Speci cation and Transformation of Programs: a Formal Approach to Software Development. Texts and Monographs in Computer Science. Springer Verlag, Berlin, Germany, 1990. ISBN 0-387-52356-1. 12, 233

296

Bibliography

[PS82]

Harilaos Papadimitriou and Irving Steiglitz. Combinatorial Optimization: Algorithms and Complexity. Prentice-Hall, Englewood Cli s, NJ, 1982. ISBN 0-13-152462-3. 183

[PS87]

Jason D. Penney and Jacob Stein. Class modi cation in the GemStone object-oriented DBMS. In Norman Meyrowitz, editor, Object-Oriented Programming Systems, Languages and Applications Conference (OOPSLA), pages 111{117, Orlando, Florida, October 1987. ACM Press. Special Issue of SIGPLAN Notices, Vol.22, No.12. 9, 28, 84

[PS90]

Jens Palsberg and Michael Schwartzbach. Type substitution for object-oriented programming. In N. Meyrowitz, editor, Object-Oriented Programming Systems, Languages and Applications Conference (OOPSLA), pages 151{160, Ottawa, 1990. ACM Press. 100

[PS94]

Jens Palsberg and Michael I. Schwartzbach. Object-Oriented Type Systems. Wiley Professional Computing. John Wiley & Sons, New York, NY, July 1994. ISBN 0-47194128-X. 12, 100, 151, 237

[PXL95]

Jens Palsberg, Cun Xiao, and Karl Lieberherr. Ecient Implementation of Adaptive Software. ACM Transactions on Programming Languages and Systems, 17(2):264{ 292, March 1995. 123, 137, 138, 139, 140, 141, 150, 151

[Rau92]

Monika H. Rauch. Fully Dynamic Biconnectivity in Graphs. In IEEE Symposium on Foundations of Computer Science, pages 50{59, 1992. 176

[RBP+ 91]

James Rumbaugh, Michael Blaha, William Premerlani, Frederick Eddy, and William Lorensen. Object-Oriented Modeling and Design. Prentice-Hall, 1991. ISBN. 21, 39, 150, 153

[RdP91]

S. Crespi Reghizzi and G. Galli de Paratesi. De nition of Reusable Concurrent Software Components. In Pierre America, editor, European Conference on Object-Oriented Programming (ECOOP), pages 148{166, Geneva, Switzerland, July 1991. Springer Verlag, Lecture Notes in Computer Science. Vol. 512. 146

Bibliography

297

[Rie94]

Arthur J. Riel. Object-Oriented Design Heuristics: Gateways for Design Transformation Patterns. Submitted for publication, February 1994. 117

[Roc75]

M. J. Rochkind. The Source Code control System. IEEE Transactions on Software Engineering, 1:364{370, December 1975. 3

[RR94]

Young-Gook Ra and Elke A. Rundensteiner. A Transparent Object-Oriented Schema Change Approach Using View Evolution. Technical Report R-94-4, Department of Electrical Engineering and Computer Science, The University of Michigan, Ann Arbor, MI, April 1994. 9, 13, 15

[Sak92]

Markku Sakkinen. A Critique of the Inheritance Principles of C++. Computing Systems, The Journal of the USENIX Association, 5(1):69{110, Winter 1992. 115

[Sch93]

Bernhard Schiefer. Eine Umgebung zur Unterstutzung von Schemaanderungen und Sichten in objektorientierten Datenbanksystemen. PhD thesis, Forschungszentrum Informatik an der Universitat Karlsruhe, Karlsruhe, Germany, December 1993. 9, 25, 29

[SGD93]

Stefan Scherrer, Andreas Geppert, and Klaus R. Dittrich. Schema Evolution in NO2 . Technical Report Nr. 93.12, Institut fur Informatik der Universitat Zurich, Zurich, Switzerland, April 1993. 9

[SH95]

Linda M. Seiter and Walter L. Hursch. Adaptive Behavioral Components: Bridging the Class ^ Module Gap. Technical Report NU-CCS-95-13, College of Computer Science, Northeastern University, Boston, MA, September 1995. 148, 151

[SKG88]

Barbara Staudt, Charles Krueger, and David Garlan. TransformGen: Automating the Maintenance of Structure-Oriented Environments. Technical Report CMU-CS88-186, Department of Computer Science, Carnegie Mellon Univerity, Pittsburgh, PA, November 1988. 27

298

Bibliography

[SLHS94]

Ignacio Silva-Lepe, Walter L. Hursch, and Greg Sullivan. A Report on Demeter/C++. C++ Report, 6(2):24{30, February 1994. 15, 39, 83, 101, 114, 151

[Smi84]

Brian Cantwell Smith. Re ection and Semantics in Lisp. In ACM Symposium on Principles of Programming Languages, pages 23{35, Salt Lake City, UT, January 1984. ACM Press. 146, 257

[SN92]

Kevin J. Sullivan and David Notkin. Reconciling environment integration and software evolution. ACM Transactions on Software Engineering and Methodology, 1(3):229{ 268, July 1992. 16

[ST83]

Daniel D. Sleator and Robert Endre Tarjan. A Data Structure for Dynamic Trees. Journal of Computer and System Sciences, 26:362{391, 1983. 176

[Ste90]

Guy L. Steele. Common Lisp, The Language. Digital Press, Bedford, MA, second edition, 1990. ISBN 1-55558-041-6. 67, 133, 235, 237, 258

[SZ86]

Andrea H. Skarra and Stanley B. Zdonik. The management of changing types in an object-oriented database. In Object-Oriented Programming Systems, Languages and Applications Conference (OOPSLA), pages 483{495. ACM Press, September 1986. 9, 13, 15, 24, 28

[Szy92]

Clemens A. Szyperski. Import is Not Inheritance. Why We Need Both: Modules and Classes. In O. Lehrman Madsen, editor, European Conference on Object-Oriented Programming (ECOOP), pages 19{32, Utrecht, The Netherlands, June/July 1992. Springer Verlag, Lecture Notes in Computer Science. Vol. 615. 149

[TB67]

Michael B. Teitz and Polly Bart. Heuristic methods for estimating the generalized vertex median of a weighted graph. Operations Research, 16:955{961, 1967. 175

Bibliography

299

[Tic85]

Walter F. Tichy. RCS { A System for Version Control. Software { Practice and Experience, 15(7):637{654, 1985. 3

[Tre91]

Markus Tresch. A framework for schema evolution by meta object manipulation. In Proceedings of the 3rd International Workshop on Foundations of Models and Languages for Data and Objects, Aigen, Austria, September 1991. 9, 84

[TS92]

Markus Tresch and Marc H. Scholl. Meta object management and its application to database evolution. ETH Zurich, Department of Computer Science, 1992. 259

[TS93]

Markus Tresch and Marc H. Scholl. Schema modi cation without database reorganization. SIGMOD Record, 22(1):21{27, March 1993. 9, 13, 15, 63

[TS94]

Christiaan Thieme and Arno Siebes. An approach to schema integration based on transformations and behaviour. In G. Wijers, S. Brinkkemper, and T. Wasserman, editors, International Conference on Advanced Information Systems Engineering (CAiSE), pages 297{310, Utrecht, The Netherlands, June 1994. Springer Verlag, Lecture Notes in Computer Science. Vol. 811. 67, 233

[Tsa94]

Chung-Shin Tsai. Corrections to the Abstract Concrete Type System. Private communication, March 1994. 111

[TT92]

Kazunori Takashio and Mario Tokoro. DROL: An Object-Oriented Programming Language for Distributed Real-Time Systems. In Andreas Paepcke, editor, ObjectOriented Programming Systems, Languages and Applications Conference (OOPSLA), pages 276{294, Vancouver, Canada, October 1992. ACM Press. 141

[US87]

David Ungar and Randall B. Smith. Self: The power of Simplicity. In Norman Meyrowitz, editor, Object-Oriented Programming Systems, Languages and Applications Conference (OOPSLA), pages 227{242, Orlando, Florida, October 1987. ACM Press. Special Issue of SIGPLAN Notices, Vol.22, No.12. 149

300

Bibliography

[Wal91a]

Jim Waldo. Controversy: The Case for Multiple Inheritance in C++. Computing Systems, The Journal of the USENIX Association, 4(2):157{171, 1991. 105, 114, 115

[Wal91b]

Emanuel Waller. Schema updates and consistency. In C. Delobel, M. Kifer, and Y. Masunaga, editors, Proceedings of the International Conference on Deductive and Object-Oriented Databases (DOOD), pages 167{188, Munich, Germany, December 1991. Springer Verlag, Lecture Notes in Computer Science. Vol. 566. 11, 34

[Wan94]

Paul S. Wang. C++ with Object-Oriented Programming. PWS Publishing Company, a Division of Wadsworth, Inc., Boston, MA, 1994. ISBN 0-534-19644-6. 247

[Weg90]

Peter Wegner. Concepts and paradigms of object-oriented programming. OOPS Messenger, 1(1):7{87, January 1990. 113

[Weg95]

Peter Wegner. A perspective on object-oriented research. Theory and Practice of Object Systems, 1(2), 1995. To appear. 115

[WEK90]

Ann L. Winblad, Samuel D. Edwards, and David R. King. Object-Oriented Software. Addison-Wesley, Reading, MA, September 1990. ISBN 0-201-50736-6. 2

[WH94]

Christian Wohrle and Walter Hursch. E-mail communication. Universitat, Frankfurt, Germany, November 1994.

J.W. Gothe247

[Win92]

Jurgen F. H. Winkler. Objectivism: 'class' considered harmful. Communications of the ACM, 35(8):128{130, August 1992. Technical correspondence. 113, 114

[Wir88]

Niklaus Wirth. The Programming Language Oberon. Software { Practice and Experience, 18(7):671{690, July 1988. 149

Bibliography

301

[WMH93]

Norman Wilde, Paul Matthews, and Ross Huitt. Maintaining object-oriented software. IEEE Software, 10(1):75{80, January 1993. 15, 120, 124, 125, 150, 154

[WY88]

Takuo Watanabe and Akinori Yonezawa. Re ection in an object-oriented concurrent language. In Norman Meyrowitz, editor, Object-Oriented Programming Systems, Languages and Applications Conference (OOPSLA), pages 306{315, San Diego, CA, September 1988. ACM Press. Special Issue of SIGPLAN Notices, Vol.23, No.11. 257, 258

[WY90]

Takuo Watanabe and Aknori Yonezawa. Re ection in an Object-Oriented Concurrent Language. In Akinori Yonezawa, editor, ABCL | An Object-Oriented Concurrent System, chapter 3, pages 45{70. The MIT Press, Cambridge, Massachusetts, 1990. ISBN 0-262-24029-7. 15, 146

[WZ88]

Peter Wegner and Stanley B. Zdonik. Inheritance as an Incremental Modi cation Mechanism or What Like Is and Isn't Like. In S. Gjessing and K. Nygaard, editors, European Conference on Object-Oriented Programming (ECOOP), pages 55{77, Oslo, Norway, August 1988. Springer Verlag, Lecture Notes in Computer Science. 113

[Xia94]

Cun Xiao. Foundations for Adaptive Object-Oriented Software. PhD thesis, College of Computer Science, Northestern University, Boston, MA, September 1994. 15, 141, 146, 151, 153

[ZG92]

Christian Zeidler and W. Gerteis. Distribution: Another Milestone of Application Management Issues. In G. Heeg, B. Magnusson, and B. Meyer, editors, Technology of Object-Oriented Languages and Systems (TOOLS Europe), pages 87{99, Dortmund, Germany, March 1992. 146

[Zic87]

Roberto Zicari. Schema Updates in the O2 Object-Oriented Database System. Technical Report 89-057, Politecnico di Milano, Dipartimento di Elettronica, Milano, Italy, October 1987. 84

302 [Zic92]

Bibliography

Roberto Zicari. A Framework for Schema Updates in an Object-Oriented Database System. In Francois Bancilhon, Claude Delobel, and Paris Kanellakis, editors, Building an Object-Oriented Database System, the Story of O2 , chapter 7, pages 176{182. Morgan Kaufmann, San Mateo, CA, 1992. ISBN 1-55860-169-4. 9, 13, 34


Appendix A

Data Model

| Symbol | Description | Reference |
|---|---|---|
| $G = (V, E, L)$ | Class graph | Def. 1, p. 45 |
| $V$ | Vertices (classes) | |
| $V_A$ | Abstract vertices (abstract classes) | |
| $V_C$ | Concrete vertices (concrete, non-primitive classes) | |
| $V_P$ | Primitive vertices (concrete, primitive classes) | |
| $E$ | Edges (relationships) | |
| $E_I$ | Inheritance edges (inheritance relationship) | |
| $E_R$ | Reference edges (reference relationship) | |
| $L$ | Labels (names) | |
| $L_V$ | Vertex labels (class names) | |
| $L_E$ | Reference edge labels (reference names) | |
| | Inheritance edge label | |
| $\mathit{Classname}(v)$ | Classname of vertex $v$ | |
| $u \Longrightarrow v$ | Inheritance reachability ($u$ subclass of $v$) | Def. 2, p. 46 |
| $u \longrightarrow v$ | Reference reachability ($u$ has reference to $v$) | Def. 5, p. 47 |
| $\mathit{CType}(v)$ | Concrete subclass operator | Def. 3, p. 47 |
| $\mathit{DirectRefs}(v)$ | Set of direct references of $v$ | |
| $\mathit{Refs}(v)$ | Set of direct and inherited references of $v$ | Def. 6, p. 48 |
| $\mathit{RefLabels}(v)$ | Set of reference labels of $v$ | |

Behavioral Model

| Symbol | Description | Reference |
|---|---|---|
| $u \leadsto v$ | Calling reachability ($u$ can call $v$) | Def. 17, p. 136 |
| $p$ | Calling path $\langle v_0, \ell_1, v_1, \ldots, \ell_n, v_n \rangle$ | Def. 18, p. 136 |
| $\mathit{Car}(P)$ | Set of first edges in $P$ | |
| $\mathit{Cdr}(P)$ | Set of truncated paths in $P$ | |
| $\mathit{Filt}(P, v, \ell)$ | Set of paths in $P$ starting with $v, \ell$ | |
| $P = (N, D, M)$ | Propagation pattern | Def. 23, p. 141 |
| $N$ | Propagation pattern name (method name) | |
| $D = (S, X, T)$ | Propagation directive | Def. 19, p. 138 |
| $S$ | Source vertices | |
| $X$ | Traversal exclusion constraints (bypassing edges) | |
| $T$ | Target vertices | |
| $\mathit{MatchSet}_G(e, X)$ | Edges in $X$ that match $e$ | |
| $M = (MP, MS)$ | Pair of method maps | Def. 21, p. 139 |
| $MP$ | Prefix method map | |
| $MS$ | Suffix method map | |
| $\mathit{PathSet}(G, D)$ | Calling paths in $G$ satisfying $D$ | Def. 19, p. 138 |
| $\mathit{Scope}(G, D)$ | Vertices in $G$ relevant to $D$ | Def. 20, p. 139 |

Table 9: Overview of Adaptive Software Terms

Appendix B

Code Examples for the Window System

This appendix provides a full-fledged code example in both CLOS and C++ for the class graph in Figure 40A. The two examples are referred to in Chapters 13 and 14, respectively.

B.1 Window System in CLOS

;;--------------------------------------------------------------------------
;; CLOS class definitions for class graph ORIGINAL
;;
(defclass DrawWindow ()
  ((canvas :accessor getCanvas :initarg :canvas)
   (shapes :accessor getShapes :initarg :shapes)))

(defclass Screen ()
  ((window    :accessor window       :initarg :window)
   (inputTool :accessor getInputTool :initarg :inputTool)))

(defclass Tool ()
  ((interface :accessor getInterface :initarg :interface)))

(defclass MouseInterface () ())

(defclass RectTool (Tool) ())


(defclass OvalTool (Tool) ())

(defclass SelectTool (Tool) ())

;; We'll use the Lisp internal list mechanism to store shapes
;; (defclass ShapeList (Rest)
;;   ((car :accessor getCar :initarg :car)
;;    (cdr :accessor getCdr :initarg :cdr)))
;;
;; (defclass Rest () ())
;;
;; (defclass End (Rest) ())

(defclass ShapeList ()
  ((shapes :accessor shapes :initarg :shapes)))

(defclass Shape ()
  ((position :accessor getPosition :initarg :position)))

(defclass Rectangle (Shape) ())

(defclass Oval (Shape) ())

;;===========================================================================
;; Method definitions for class dictionary graph ORIGINAL
;;
;;--------------------------------------------------------------------------
;; Handle a mouse click in the drawing window
;;
(defmethod handleMouseClick ((drawWindow DrawWindow) &optional win)
  (handleMouseClick (getCanvas drawWindow) drawWindow))

(defmethod handleMouseClick ((screen Screen) &optional win)
  (handleMouseClick (getInputTool screen) win))

(defmethod handleMouseClick ((rectTool RectTool) &optional win)
  (addShape win
            (make-instance 'Rectangle
                           :position (currentPosition (getInterface rectTool)))))

(defmethod handleMouseClick ((ovalTool OvalTool) &optional win)
  (addShape win
            (make-instance 'Oval
                           :position (currentPosition (getInterface ovalTool)))))

(defmethod handleMouseClick ((selectTool SelectTool) &optional win)
  (shapeSelected win (currentPosition (getInterface selectTool))))

(defmethod currentPosition ((mouseInterface MouseInterface))
  5)    ;; always simulate position 5

(defmethod addShape ((drawWindow DrawWindow) shape)
  (addShape (getShapes drawWindow) shape))

(defmethod addShape ((shapeList ShapeList) shape)
  (setf (shapes shapeList) (cons shape (shapes shapeList))))

;; --------------------------------------------------------------------------
;; Handle the selection of a shape indicated by the selected position pos
;;
(defmethod shapeSelected ((drawWindow DrawWindow) pos &optional canvas)
  (shapeSelected (getShapes drawWindow) pos (getCanvas drawWindow)))

(defmethod shapeSelected ((shapeList ShapeList) pos &optional canvas)
  (do ((sl (shapes shapeList) (cdr sl)))
      ((hasPosition (car sl) pos) (selected (car sl) canvas))))

(defmethod hasPosition ((shape Shape) pos)
  (= pos (getPosition shape)))

(defmethod selected ((shape Shape) canvas))

(defmethod selected ((rect Rectangle) canvas)
  (print "Displaying Rectangle on Canvas")
  (values))

309

310

Appendix B. Code Examples for the Window System

(defmethod selected ((oval Oval) canvas) (print "Displaying Oval on Canvas") (values))

;;---------------------------------------------------------------------------
;; Initial Object
;;
(setf *los*
      (cons (make-instance 'Rectangle :position 1)
            (cons (make-instance 'Rectangle :position 3)
                  (cons (make-instance 'Oval :position 5)
                        nil))))

(setf *dw1*
      (make-instance 'DrawWindow
        :canvas (make-instance 'Screen
                  :inputTool (make-instance 'SelectTool
                               :interface (make-instance 'MouseInterface)))
        :shapes (make-instance 'ShapeList :shapes *los*)))

(setf *dw2*
      (make-instance 'DrawWindow
        :canvas (make-instance 'Screen
                  :inputTool (make-instance 'RectTool
                               :interface (make-instance 'MouseInterface)))
        :shapes (make-instance 'ShapeList :shapes *los*)))

B.2 Window System in C++

// ----------------------------------------------------------------------
// Class definitions for class dictionary graph ORIGINAL
//
// Forward declarations, needed because the classes refer to one another.
// Boolean, TRUE, and FALSE are assumed to be defined in a header not shown.
class Screen;
class ShapeList;
class Shape;
class Position;
class Tool;
class MouseInterface;

class DrawWindow {
protected:
  Screen    *canvas;
  ShapeList *shapes;
public:
  DrawWindow( Screen *, ShapeList * );
  void handleMouseClick();
  void shapeSelected( Position *pos );
  void addShape( Shape *shape );
};

class Screen {
protected:
  Tool *inputTool;
public:
  Screen( Tool * );
  void handleMouseClick( DrawWindow * );
};

class Tool {
protected:
  MouseInterface *interface;
public:
  Tool( MouseInterface * );
  virtual void handleMouseClick( DrawWindow * ) { /* empty */ }
  virtual Position *getPosition();
};

class RectTool : public Tool {
public:
  RectTool( MouseInterface * );
  void handleMouseClick( DrawWindow * );
};

class OvalTool : public Tool {
public:
  OvalTool( MouseInterface * );
  void handleMouseClick( DrawWindow * );
};

class SelectTool : public Tool {
public:
  SelectTool( MouseInterface * );
  void handleMouseClick( DrawWindow * );
};

class MouseInterface {
public:
  Position *getPosition();
};

class Position { };

class List {
public:
  virtual void shapeSelected( Position *, Screen * ) { }
};

class EmptyList : public List { };

class ShapeList : public List {
protected:
  Shape *firstShape;
  List  *restShapes;
public:
  ShapeList( Shape *, List * );
  void shapeSelected( Position *, Screen * );
  void addShape( Shape * );
};

class Shape {
protected:
  Position *position;
public:
  Shape( Position * );
  virtual void shapeSelected( Screen * ) = 0;
  virtual Boolean hasPosition( Position * );
};

class Rectangle : public Shape {
public:
  Rectangle( Position * );
  void shapeSelected( Screen * );
};

class Oval : public Shape {
public:
  Oval( Position * );
  void shapeSelected( Screen * );
};

// ----------------------------------------------------------------------
// Constructors
//
DrawWindow::DrawWindow( Screen *c, ShapeList *s )
{
  canvas = c;
  shapes = s;
}

ShapeList::ShapeList( Shape *s, List *r )
{
  firstShape = s;
  restShapes = r;
}

Screen::Screen( Tool *iT ) { inputTool = iT; }

Tool::Tool( MouseInterface *mi ) { interface = mi; }

RectTool::RectTool( MouseInterface *mi ) : Tool( mi ) { }

OvalTool::OvalTool( MouseInterface *mi ) : Tool( mi ) { }

SelectTool::SelectTool( MouseInterface *mi ) : Tool( mi ) { }

Shape::Shape( Position *pos ) { position = pos; }

Rectangle::Rectangle( Position *pos ) : Shape( pos ) { }

Oval::Oval( Position *pos ) : Shape( pos ) { }

// ----------------------------------------------------------------------
// Handle a mouse click in the drawing window
//
void DrawWindow::handleMouseClick()
{
  canvas -> handleMouseClick( this );
}

void Screen::handleMouseClick( DrawWindow *win )
{
  inputTool -> handleMouseClick( win );
}

Position *Tool::getPosition()
{
  return interface -> getPosition();
}

Position *MouseInterface::getPosition()
{
  /* return position of mouse at click */
}

void RectTool::handleMouseClick( DrawWindow *window )
{
  Position *pos = getPosition();
  window -> addShape( new Rectangle(pos) );
}

void OvalTool::handleMouseClick( DrawWindow *window )
{
  Position *pos = getPosition();
  window -> addShape( new Oval(pos) );
}

void SelectTool::handleMouseClick( DrawWindow *window )
{
  Position *pos = getPosition();
  window -> shapeSelected( pos );
}

// ----------------------------------------------------------------------
// Handle the selection of a shape indicated by the selected position pos
//
void DrawWindow::shapeSelected( Position *pos )
{
  shapes -> shapeSelected( pos, canvas );
}

void ShapeList::shapeSelected( Position *pos, Screen *canvas )
{
  if ( firstShape -> hasPosition( pos ) )
    firstShape -> shapeSelected( canvas );
  else
    restShapes -> shapeSelected( pos, canvas );
}

void Rectangle::shapeSelected( Screen *canvas )
{
  /* send messages to canvas to highlight the rectangle */
}

void Oval::shapeSelected( Screen *canvas )
{
  /* send messages to canvas to highlight the oval */
}

Boolean Shape::hasPosition( Position *pos )
{
  return( position == pos ? TRUE : FALSE );
}

void DrawWindow::addShape( Shape *shape )
{
  shapes -> addShape( shape );
}

void ShapeList::addShape( Shape *shape )
{
  /* Add a new shape to the ShapeList */
}
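The C++ listing leaves several bodies as comments (the mouse position, adding a shape, highlighting a shape). The following is a minimal, self-contained sketch of how a click might travel from DrawWindow through Screen to the active Tool and back to the window. It is not the thesis code but a simplified variant under stated assumptions: Position wraps an integer, the mouse always reports position 5 (as in the CLOS listing), shapes are kept in a std::vector instead of the List hierarchy, and shapeSelected returns a label rather than drawing. The names runDemo, setInputTool, shapeCount, and selection are hypothetical helpers introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Assumption: a position is a plain integer wrapper.
struct Position { int value; };

class Shape {
public:
    explicit Shape(Position p) : position(p) {}
    virtual ~Shape() {}
    // Returns a label instead of sending drawing messages to a canvas.
    virtual std::string shapeSelected() const = 0;
    bool hasPosition(Position p) const { return position.value == p.value; }
private:
    Position position;
};

class Rectangle : public Shape {
public:
    explicit Rectangle(Position p) : Shape(p) {}
    std::string shapeSelected() const override { return "Rectangle"; }
};

class Oval : public Shape {
public:
    explicit Oval(Position p) : Shape(p) {}
    std::string shapeSelected() const override { return "Oval"; }
};

// Assumption: the mouse always reports position 5, like the CLOS listing.
class MouseInterface {
public:
    Position getPosition() const { return Position{5}; }
};

class DrawWindow;

class Tool {
public:
    explicit Tool(const MouseInterface *m) : mouse(m) {}
    virtual ~Tool() {}
    virtual void handleMouseClick(DrawWindow &) {}
protected:
    Position getPosition() const { return mouse->getPosition(); }
private:
    const MouseInterface *mouse;
};

class OvalTool : public Tool {
public:
    explicit OvalTool(const MouseInterface *m) : Tool(m) {}
    void handleMouseClick(DrawWindow &win) override;
};

class SelectTool : public Tool {
public:
    explicit SelectTool(const MouseInterface *m) : Tool(m) {}
    void handleMouseClick(DrawWindow &win) override;
};

class Screen {
public:
    explicit Screen(Tool *t) : inputTool(t) {}
    void setInputTool(Tool *t) { inputTool = t; }   // hypothetical helper
    void handleMouseClick(DrawWindow &win) { inputTool->handleMouseClick(win); }
private:
    Tool *inputTool;
};

class DrawWindow {
public:
    explicit DrawWindow(Screen *c) : canvas(c), lastSelected("none") {}
    void handleMouseClick() { canvas->handleMouseClick(*this); }
    void addShape(std::unique_ptr<Shape> s) { shapes.insert(shapes.begin(), std::move(s)); }
    void shapeSelected(Position pos) {
        lastSelected = "none";
        for (const auto &s : shapes)
            if (s->hasPosition(pos)) { lastSelected = s->shapeSelected(); return; }
    }
    std::size_t shapeCount() const { return shapes.size(); }
    const std::string &selection() const { return lastSelected; }
private:
    Screen *canvas;
    std::vector<std::unique_ptr<Shape>> shapes;
    std::string lastSelected;
};

void OvalTool::handleMouseClick(DrawWindow &win) {
    win.addShape(std::unique_ptr<Shape>(new Oval(getPosition())));
}

void SelectTool::handleMouseClick(DrawWindow &win) {
    win.shapeSelected(getPosition());
}

// Simulates two clicks: one with the oval tool, one with the select tool.
std::string runDemo() {
    MouseInterface mouse;
    OvalTool ovalTool(&mouse);
    SelectTool selectTool(&mouse);
    Screen screen(&ovalTool);
    DrawWindow window(&screen);

    window.handleMouseClick();            // oval tool adds an Oval at position 5
    assert(window.shapeCount() == 1);

    screen.setInputTool(&selectTool);
    window.handleMouseClick();            // select tool finds the shape at position 5
    return window.selection();
}
```

Calling runDemo() from a main function exercises both paths of the listing: the drawing path (handleMouseClick ending in addShape) and the selection path (handleMouseClick ending in shapeSelected). Holding the shapes in std::unique_ptr is a design choice of this sketch so that the window owns the shapes the tools create.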
