Rethinking Software Updating; Concepts for Improved Updatability

Dan Österberg, Johan Lilius
TUCS - Turku Centre for Computer Science
Department of Computer Science, Åbo Akademi University
Lemminkäisenkatu 14, FIN-20520 Turku, Finland

TUCS Technical Report No 550
September 2003
ISBN 952-12-1213-6
ISSN 1239-1891
Abstract

While there exists a fair number of partial solutions to enabling dynamic updating in arbitrary applications, none of them have proved to be superior. This paper does not suggest "yet another" dynamic updating system that no one will use in the end anyway. Instead, we dig deep into the heart of software development, and examine what dynamic updating and static evolution are all about. We identify bottlenecks and sources of complications, and come up with suggestions as to how these can be fundamentally solved. We present two novel concepts that could improve the updatability and adaptability of applications: the sequence model for dealing with code, and entity-oriented programming for dealing with data and object-orientation. In order to demonstrate these, as well as tie in even more improvements for updatability and general programming flexibility, we introduce an updatable programming language. Some previous work in the field has developed dynamic updating systems by refining an existing programming language. Our approach is not just "our own" implementation of that idea; it is different and novel because it constructs the whole programming language from the ground up based on updatability demands. To ensure that this language is useful enough, we make it superficially look like popular object-oriented languages. This paper is a revision of the Master's thesis "Rethinking Software Updating; Concepts for Improved Updatability".
Keywords: runtime updating, dynamic reconfiguration, dynamic code replacement, on-the-fly program modification, on-line software version change, software hot swapping, programming languages, object-orientation, updatability
TUCS Laboratory Embedded Systems Laboratory
Contents

1 Introduction
  1.1 Dynamic Updating
    1.1.1 Problem Domain
    1.1.2 Redundant Hardware
    1.1.3 Application Specific Dynamic Updating
    1.1.4 State Transfer
    1.1.5 Dynamic Linking
  1.2 Objectives
    1.2.1 Previous Work
    1.2.2 Goal
    1.2.3 Content Outlining
2 Aspects of Dynamic Updating
  2.1 Characteristics of a Dynamic Software Updating System
    2.1.1 Requirements
    2.1.2 Functionality
    2.1.3 Modification Type
  2.2 Update Timing and Validity
    2.2.1 Validity
    2.2.2 Invoke Model vs. Interrupt Model
3 Existing Concepts and Primitives
  3.1 Classification of Existing Building Blocks
    3.1.1 Low Level Primitives
    3.1.2 Programming Languages
    3.1.3 Including State
  3.2 Updating Data Structures
    3.2.1 Dealing with Modified Data Definitions
    3.2.2 Version Barrier
    3.2.3 Global Update
    3.2.4 Passive Partitioning
    3.2.5 Active Partitioning
    3.2.6 Sub-Typing and Type-Checking
    3.2.7 Adaptable Objects
    3.2.8 Dealing with Multiple Simultaneous Versions of Objects
  3.3 Updating Subroutines
    3.3.1 Dealing with Modified Subroutines
    3.3.2 Active Subroutines
    3.3.3 Converting Call Stacks that Contain Modified Subroutines
  3.4 Updating High Level Units
    3.4.1 Classes
    3.4.2 Processes
  3.5 Existing Implementation Strategies
    3.5.1 Global Update and the Interrupt Model
    3.5.2 Dynamic Re-linking and the Invoke Model
    3.5.3 Program Restructuring
4 Concepts for Improved Updatability
  4.1 Improving Updatability
    4.1.1 What is Wrong?
    4.1.2 State Machines
    4.1.3 Reducing Complexity
    4.1.4 State Machines vs. Imperative Languages
  4.2 The Updating Aspect
    4.2.1 Aspect-Oriented Programming
    4.2.2 Adaptive Programming
  4.3 The Sequence Model
    4.3.1 Control Flow Graphs
    4.3.2 Basic Concepts
    4.3.3 Avoiding Recursion
    4.3.4 Building Blocks
    4.3.5 Performing Updates
    4.3.6 Summary
  4.4 Entity-Oriented Programming
    4.4.1 Localized Functionality vs. Localized Modifications
    4.4.2 Basic Concepts
    4.4.3 Class Hierarchy
5 The Updatable Virtual Architecture
  5.1 Introducing Uva
    5.1.1 Background
    5.1.2 The Naming Issue
    5.1.3 Overview
  5.2 General Issues
    5.2.1 Data Types
    5.2.2 Data Type Naming
    5.2.3 Sequence Chains
    5.2.4 Classes, Features and Entities
  5.3 The Uva Source Language
    5.3.1 Various Issues
    5.3.2 Declaring Classes and Features
    5.3.3 Declaring Chains
    5.3.4 Special Instructions
  5.4 Uva Viewed from Above
    5.4.1 Regarding Chain Blocks
    5.4.2 Uva vs. Java
6 Discussion
  6.1 Updatability Evaluated
    6.1.1 Fibonacci Terms
    6.1.2 The Dining Philosophers
  6.2 Final Words
    6.2.1 Summary
    6.2.2 Conclusion and Contribution
    6.2.3 Future Work
A Appendix A
  A.1 Glossary
  A.2 Definitions
  A.3 Abbreviations
B Appendix B
  B.1 Pseudo Code Semantics
C Appendix C
  C.1 Polygon Rendering
  C.2 Fibonacci Terms
  C.3 The Dining Philosophers
References
1 Introduction

Software updating has been exercised from the very birth of the software industry, but its importance is still ever increasing. Not only is existing software extended and bugs fixed, but more and more demands are put upon update automation. Ease of software evolution is critical for any software company, and dynamic updating is a prerequisite for many modern application areas for mobile and embedded systems. So far, developers have been content with designing their systems with updating in mind, thus adding restricted support for evolution and dynamic updating. However, more general and disciplined methods for automatically achieving flexibly and safely updatable software are in high demand.

This paper will give an overview of software updating, but also introduce new concepts and new ways of thinking that could possibly increase the updatability of future applications. It will focus on dynamic updating, while keeping in mind the demands put upon evolution. Many existing software updating systems fail to deliver what they promise largely because they rely too heavily on existing software development concepts. This paper does not claim to do any better. However, the goal of this paper is to examine these concepts and propose alternatives. By opening our minds to new ideas and fresh approaches, we will unlock the door to application development, and in the blink of an eye be... rethinking software updating.
1.1 Dynamic Updating

As of today, there is no well established, widely adopted, "correct" term or definition for the act of dynamically updating applications. This paper calls the act dynamic software updating or dynamic updating, as it seems to be the most commonly used term in more recent work. In literature, a broad range of other terms have also been used, including dynamic software reconfiguration, dynamic code replacement, runtime updating, on-the-fly program modification, on-line software version change and software hot swapping. The terms dynamic, runtime, on-line etc. reflect the fact that updating is performed while the software is executing. In practice, a dynamic updating system is required to keep the application running during and after the update, and make sure that the interruption caused by updating does not drastically disturb normal execution.

Here, updating refers to replacing part or all of the code of an application, and mapping the current state to correspond to this change. Usually, updating implicitly means upgrading the application to a newer version (containing bug fixes, optimizations and new features), but the code replacement can also be more general. Furthermore, application code can be replaced to accomplish reconfiguration not explicitly enabled in software, or to change some application behavior that is not directly linked to the software version (such as temporarily activating a Christmas theme for some server during the Christmas holidays).

The earliest research about dynamic updating was made roughly thirty years ago, and ever since, several different approaches, theories and updating systems have been proposed. In spite of this, no real breakthrough has been made. The main challenges still remain, and no "best solution" has been found. However, a greater understanding of the problem domain, the popularity of modern programming languages such as Java, and the demands on modern software systems have led to an increased interest in the
area, and some of the new efforts are quite promising in terms of usability. See section 1.2.1, "Previous Work" for a brief summary of previous research in this field.

1.1.1 Problem Domain

There are basically two motivations for using dynamic updating. The first one is to avoid interruptions in a running application. Traditionally, when a system needs to be updated, it is first shut down, then updated and finally restarted. This can be annoying for any home user, but is disastrous for high availability systems such as telephone switches, where denial of service and downtime are unacceptable and costly.

The second motivation is to preserve the state of a running application / system. The current state can be the result of a long-term execution and difficult or impossible to re-establish. If the system is shut down, some of the state - such as active documents - can usually be saved, but socket connections would be lost, as would input from the user and a large part of the internal state. This is a domain where dynamic updating can provide functionality that simply is not possible with traditional techniques.

Dynamic updating is also a prerequisite for a number of brand new application areas. Currently, the safety and flexibility of dynamic updating systems typically are not high enough to make such applications really useful. But in the future, mobile devices could automatically distribute and apply update patches. They could distribute software extensions that they have for example generated or learned using some A.I. algorithm. Applications could - when required for compatibility reasons - switch version by reconfiguring themselves, and so on.

In a survey about the cost of downtime in the year 2001 [ERA01], 54 % of all participating companies stated that each hour of downtime would cost the company more than $50K, and 8 % said that each hour would cost over $1M. Of all the companies, 4 % estimated that the survival of the company would be at risk if the downtime lasted less than one hour, and 39 % suspected that downtime lasting up to one day would put the survival of the company at risk. Although downtime caused by static updating most often would not be counted in hours, the survey still shows that minimizing downtime is becoming increasingly important for more and more companies. Network databases commonly require an annual downtime of less than one minute, and certain systems have such strict requirements that they permit at most a few seconds of annual downtime. As a concrete example, the switch Lucent 5ESS-2000 Mobile Switching Center (MSC) achieves a downtime as low as 10 seconds per year. The original specifications for FAA's Advanced Automation System required even less than three seconds of annual downtime, but the project turned out to be one of the biggest fiascos in software history, and was abandoned / restructured in 1994 while being several years behind schedule [Per97].

1.1.2 Redundant Hardware

Non-stop systems have high demands on availability and fault tolerance. Denial of service (DoS) due to crashes or updating simply is not an option, and therefore redundant hardware is usually used for such systems. Redundant hardware can be used to detect and fix errors by having the same operations executed in parallel, on separate hardware and usually also using different software algorithms. Redundant hardware can also be used to enable a system to be shut down and updated while an identical system is kept running. The
system that was shut down would then be restarted and at some point be allowed to take over control again - either gradually or by synchronizing the state of the two running systems. This is essentially a form of dynamic updating, but it is hardware based and both expensive and inflexible. However, it is currently the only widely adopted general dynamic updating strategy. To mention a few examples, Ericsson's AXE switches synchronize redundant hardware to provide high availability, and Sun's operating system Solaris has a Dynamic Reconfiguration feature that can be used together with redundant hardware for e.g. Sun's Fire 3800-6800 Servers.

1.1.3 Application Specific Dynamic Updating

Dynamic updating functionality is built into some software, enabling part of the application to be updated on the fly. This dynamic updating is, however, not based on any general dynamic updating algorithm, but instead closely connected to the particular application. The types of updates that can be handled are usually rather restricted, and only certain components and plug-ins conforming to predefined interfaces are updatable. This is fine in many cases, but we would really want a more general scheme where arbitrary changes to an application can be handled, and done so with as little application-specific code as possible.

1.1.4 State Transfer

One general strategy for performing dynamic updating is something that could be seen as the software variant of the redundant hardware approach. A new process running the updated version of the application is started and executed in parallel with the process executing the old version. The state of the old process will then either be incrementally transferred to the new process, or at some suitable point atomically mapped to the state of the new process. When the states are completely synchronized, the old process can be terminated and the new process is allowed full control. What is positive with this approach is that it avoids some of the most difficult aspects of dynamic updating, such as updating active subroutines and updating when the stack is nonempty. The downside is a huge performance loss during updating, difficulties in synchronization and problems related to re-performing all initializations. If, for example, user input is required at application startup, the new process would need the same input, but this request cannot be granted since the updating process should be transparent to the user. Performing the state transfer correctly and verifying validity after the transfer is also a complicated matter, especially if the code undergoes larger changes. (The relation between sub-states can be difficult but vital to model correctly.) Finding a suitable point where synchronization can be performed is not trivial either. State transfer can be made possible - and prove very useful - for applications designed from the very beginning with such updating in mind, but it scales poorly to arbitrary applications.

One possible way to implement state transfer is to identify the persistent state, that is, the smallest part of the full state that absolutely must be transferred [Hic01]. The rest of the state is called ephemeral, and may differ in the old and new processes. If the state can be divided into these two substates, then only the persistent state needs to be converted, and state synchronization becomes a bit easier to perform.
However, although part of the state can be ignored, for the update to be transparent and valid, the persistent state is likely to contain most of the full state. Identifying which parts of the state should be persistent is also an error-prone task.
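To make the persistent / ephemeral split concrete, the following Java sketch shows one way an old process could hand its persistent state over to a new one. This is a minimal sketch under heavy assumptions: the hypothetical application's persistent state is just a counter and a user list, serialization is used as the transfer channel, and all names (PersistentState, capture, restore) are invented for this illustration rather than taken from any system discussed in this paper.

    import java.io.*;
    import java.util.List;

    // A minimal sketch of state transfer, assuming a hypothetical
    // application whose persistent state is small and serializable.
    // Only the persistent state is transferred [Hic01]; the ephemeral
    // state (caches, sockets, thread state) is rebuilt by the new process.
    class PersistentState implements Serializable {
        long requestsServed;      // must survive the update
        List<String> activeUsers; // must survive the update
        // Deliberately absent: open sockets, caches, timers - ephemeral.
    }

    class StateTransfer {
        // Old process: write the persistent state out at a quiescent point.
        static void capture(PersistentState state, File channel)
                throws IOException {
            try (ObjectOutputStream out =
                    new ObjectOutputStream(new FileOutputStream(channel))) {
                out.writeObject(state);
            }
        }

        // New process: read the old state and continue from it. Any mapping
        // between old and new state layouts would be applied here.
        static PersistentState restore(File channel)
                throws IOException, ClassNotFoundException {
            try (ObjectInputStream in =
                    new ObjectInputStream(new FileInputStream(channel))) {
                return (PersistentState) in.readObject();
            }
        }
    }

Even in this toy setting, deciding that the user list is persistent while, say, a cache is not, is exactly the error-prone identification step described above.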
1.1.5 Dynamic Linking

Dynamic linking - including linking executables with shared libraries, dynamically loading Java classes, and loading plug-ins - is a very simple way of enabling dynamic updating. In itself, dynamic linking only enables extension of an application, but if the loaded modules can also be unloaded, then the most fundamental prerequisites for dynamic updating are fulfilled. The Java language specification [GJS96] supports garbage collecting and unloading of classes, but not all JVMs - particularly not older versions - implement this (optional) feature. Unloading requires that certain (conservative) conditions are met, and we cannot guarantee or assume that unloading will be triggered. Automatic unloading such as this is much like a specialized feature that automates some task but that the developer cannot rely on. The Linux kernel allows modules to be loaded and unloaded on demand [SM96], and browsers and many other applications allow various kinds of plug-ins to be loaded and unloaded.

However, dynamic linking suffers from two restrictions. First of all, the dynamically loaded modules usually have a fixed interface, through which they provide all their functionality. This means that their interface must be fixed beforehand, restricting the set of possible updates. Secondly, a module in use cannot be unloaded and therefore cannot be updated either. This is a huge restriction and simplification, which essentially results in the same updating approach as traditional updating: the part of the application that should be updated is shut down, then updated and finally restarted. (Compare this to terminating, updating and restarting an application while keeping the OS running.) Java - and similar dynamically extensible programming languages - only suffers from the latter restriction but, unfortunately, this restriction is the source of most complications.
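The loading and unloading mechanics are easy to demonstrate in Java. The following sketch assumes a hypothetical plug-in class demo.Plugin compiled into a local plugins/ directory; it is standard java.net.URLClassLoader usage, not a technique specific to any updating system. Note how it exhibits both restrictions discussed above: the plug-in is only usable through whatever interface was fixed beforehand, and it must be completely out of use - no live references - before its class can even potentially be unloaded and replaced.

    import java.net.URL;
    import java.net.URLClassLoader;

    // A sketch of "updating" via dynamic linking, assuming a hypothetical
    // plug-in class demo.Plugin compiled into the local plugins/ directory.
    public class PluginReloader {
        public static void main(String[] args) throws Exception {
            URL[] path = { new URL("file:plugins/") };

            // Load version 1 through a dedicated class loader.
            URLClassLoader loader = new URLClassLoader(path);
            Object plugin = loader.loadClass("demo.Plugin")
                                  .getDeclaredConstructor().newInstance();
            System.out.println(plugin); // use the plug-in

            // "Unload": drop every reference to the loader and its classes.
            // Only then may the JVM garbage collect and unload the old code -
            // and nothing guarantees that it actually will.
            plugin = null;
            loader.close();

            // Load version 2 (a recompiled class file) through a fresh loader.
            loader = new URLClassLoader(path);
            plugin = loader.loadClass("demo.Plugin")
                           .getDeclaredConstructor().newInstance();
            System.out.println(plugin); // new code, but a brand new instance:
                                        // the state of the old plug-in is lost
        }
    }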
1.2 Objectives

1.2.1 Previous Work

Much research has been conducted in the field of dynamic software updating, and various implementations of such updating systems have been made. Because of the complexity of the problem domain, none of these are perfect all-rounders, and no future implementation is likely to achieve such a status either. None of these implementations have been widely adopted in themselves, but the underlying ideas form a solid base for deployment of updatable applications. This paper will not describe these systems in detail, as they have been listed and compared in various other papers [Ajm02] [Hic01] [VB02] [SF93]. A brief summary is nevertheless given below.

The Dynamic Modification System (DYMOS) [Lee83] was a pioneering system, capable of updating units as small as loop bodies - which is much more than can be said about the average modern updating system. It is not only an updating system, but an entire updating environment, including a compiler, editor and interpreter. Because of this integration, it can take advantage of knowledge about the source code, compilation artifacts and execution data. As a whole, the system is rather powerful, but not practical enough. It requires the source code to be available, the tight integration is impractical, it is not flexible enough, and it is bound to a specific programming language (StarMod), whose syntax is even altered in order to support dynamic updating.

The Procedure-Oriented Dynamic Updating System (PODUS) [SF93] takes advantage of segmented virtual memory algorithms for providing version information
and implementing indirect references. It is a fairly efficient system, but incapable of updating active procedures (see Active method, Glossary 2). Dynamic C++ Classes [HG98] modifies the structure of arbitrary applications to enable dynamic updating. This includes adding extra indirection classes for separating the interfaces from the implementations. Hicks [Hic01] relies heavily upon standard dynamic linking, and proposes a simple and efficient solution that requires applications to be designed in a specific way.

In recent years, several implementations targeting Java have been developed. This not only underlines the popularity of Java, but also shows that people have realized that Java is well suited for many dynamic updating tasks. The fact that Java applications are run on top of a "virtual machine", and the well thought-out specifications for this JVM, are the main factors that make Java an attractive platform. Most implementations extend the JVM to enable dynamic updating. The Java Distributed Runtime Updating Management System (JDRUMS) [DH01] and the Dynamic Virtual Machine (DVM) [MPGBB00] follow this approach. Other solutions exist, such as Dynamic Updating through Swapping of Classes (DUSC) [ORH02], which is a tool that modifies the structure of arbitrary applications, much like Dynamic C++ Classes.

1.2.2 Goal

This paper goes as far as to define a full-scale programming language (chapter 5, "The Updatable Virtual Architecture"), which is specified in more detail in [Öst03]. However, the goal of this paper is not that language, but instead the discussion that precedes it. The goal is to open the reader's eyes to alternative programming styles and programming concepts that can potentially improve software updatability. We feel that dynamic updating and evolution should be a more natural part of programming, but this should not be strived for "at any cost". Established programming languages and their concepts should not be blindly disregarded, and in an ideal world, improved updatability could be accomplished via transparent extensions of existing primitives. Since this is hardly the case in reality, the least we can do is try to minimize the distance between the new and existing concepts, and this paper does so from the perspective of a Java programmer.

Taking Java as a starting point, we will mainly focus on object-oriented languages, but keep the discussion more general whenever possible. When a more concrete environment needs to be discussed, we will turn to Java bytecode for embedded systems. Contrary to most other papers targeting embedded systems, we will not assume that our target applications are high-performance systems with lots of hard real time requirements, but rather that they are arbitrary user applications e.g. in modern cellular phones and PDAs (Personal Digital Assistants). There are several reasons for this conscious choice. First of all, including strict real time requirements does not give much freedom for designing the updating system and puts many restrictions upon it, resulting in a less flexible, less general, less user friendly updating system. Secondly, the target environment would most likely be a virtual machine running higher level applications on top of a real time system in a mobile device. Then the application itself would not have many real time constraints, because they would all be handled by the lower level operating system and the virtual machine.
It can be seen as somewhat contradictory to develop dynamic updating techniques that target Java and similar interpreted languages when many application areas require high-performance systems with hard real time constraints and low response time.
However, these systems often need redundant hardware anyway to guard against hardware faults and bugs. We feel that a new generation of applications and an increasing demand for dynamic updating in normal applications justify this approach.

The ultimate goal, set far ahead in the future, is to make general dynamic updating a natural part of any application, preferably with no additional development overhead. Segal and Frieder claim that "any program can be so poorly written that it cannot be dynamically updated" and that systems for dynamic updating are mainly intended to be used for some well-accepted, well-understood program structure - not arbitrary program structures [SF93]. This paper claims otherwise, and therefore tackles a number of issues where existing dynamic updating systems usually fail to impress. Poorly written programs might require extra work from the developer who is writing the update patch, but even so, specifying and applying an update patch should still be a fully reasonable task.

1.2.3 Content Outlining

In chapter two, we will more thoroughly define and characterize dynamic updating. In chapter three, we will examine existing concepts and consider how well they support software updating. Having - in chapter two - explained what demands are put upon dynamic updating, we will discuss what problems arise and how some of them can be solved. We will conclude the chapter by presenting a few existing dynamic updating systems that deal with the issues discussed. In chapter four, we will approach the problem from another perspective. Knowing what makes dynamic updating difficult, we will incrementally introduce new concepts that could potentially support updating better than the existing concepts already discussed. We will introduce the sequence model for structuring code, and entity-oriented programming for dealing with objects. In chapter five, we will give a brief introduction to a programming language called the Updatable Virtual Architecture (UVA). This language is built upon the new concepts we have described, but still closely follows popular programming practices, and has many similarities with popular programming languages and Java in particular. More detailed specifications for this language can be found in [Öst03]. In chapter six, we evaluate the usefulness of the theory we have explained, summarize the contribution of this paper, and discuss what future work needs to be done.
2 Aspects of Dynamic Updating

In this chapter, we will discuss what characterizes a good dynamic software updating system. We will define update validity and consider how to safely apply update patches. In chapter 3, "Existing Concepts and Primitives", we will identify the building blocks in traditional software applications and discuss their relevance in terms of dynamic updating. We will mention various strategies and solutions for performing software-based dynamic updating at different levels of granularity. We will consider the different aspects of - and complications arising from - dynamic software updating. In chapter 4, "Concepts for Improved Updatability", we will consider how we could avoid or solve some of these problems if we were not bound to existing programming language concepts.
2.1 Characteristics of a Dynamic Software Updating System

To clarify what a dynamic software updating system really is, what demands are put upon it, and what aspects to consider when designing such a system, some common features are listed below.

2.1.1 Requirements

We need some notion of desired and required functionality in order to design useful dynamic updating systems and evaluate existing ones. Hicks [Hic01] identifies four criteria that any practical dynamic updating system should satisfy. These criteria are listed and briefly explained below.

The flexibility of the updating system is determined by the range of updates that the system is capable of handling. An ideal dynamic updating system would be able to update a(n updatable) program in an arbitrary way. Timing constraints - that is, when an update is allowed - also affect the flexibility of the updating system.

Robustness is an important criterion in any system, and in particular in a dynamic updating system, since the very idea of using such a system in the first place is to keep the application running - to avoid shutting down (or crashing) it. Hicks further identifies five properties that improve the robustness of an application: safety (e.g. syntactic safety and security integrity), completeness (all changes should be addressed somehow), well-timedness, simplicity and rollback-enabling (the ability to discard the update and revert to the old version if an error is detected after updating).

The efficiency of a dynamic updating system is critically important when used in hard real time systems, but perhaps the least important criterion when used by an end user to update the user's favorite desktop application.

Ease of use is an important criterion since the usefulness of a system is tightly connected to how easily and transparently it can be applied. A complex system requiring lots of mechanical work simply is not useful - regardless of how good it is. The ideal system would be fully automated, taking the old and new version of the application as input and giving a patch that can be applied "on the press of a button" as output. Dynamic updating systems cannot be fully automated, since it is impossible to e.g. "guess" how arbitrary state transfer should be made. However, applying a patch can be fully automated, making the actual updating process transparent to the user. The
static process of creating a patch can also be largely automated via various tools, easing the developer's workload.

Hicks mentions that other criteria could also be considered - such as deployability, portability and elegance - but that he has found the four criteria above to be the most important ones. This paper considers deployability to be a fifth important criterion, similar to flexibility but applied at the application level. An ideal dynamic updating system would be applicable to any (existing or future) application, while a system with poor deployability would require the application to be built in a strictly specified way, taking updating into account already in the development process. (Hicks' system would, for example, have a deployability in between these two extremes.) Poor deployability not only prevents existing legacy systems from being updated, but also complicates the design and implementation of new software, and will usually - as a side effect - degrade the flexibility.

Segal and Frieder [SF93] define the characteristics of a practical dynamic software updating system in a slightly different - and perhaps less structured - way. In essence, most - but not all - of these characteristics are similar to the ones given by Hicks, but grouped in another way. Listing them as well gives a more complete overview and understanding of the requirements put upon dynamic updating systems.

First of all, the updating system should preserve program correctness. This applies both during updating and normal execution, and is similar to the robustness criterion listed by Hicks. If the updating system preserves program correctness well, updating can be performed with little or no human assistance, making updating transparent and the system easy to use. Segal and Frieder also argue that this correctness could be extended to account for time, in the sense that e.g. a slow update might break the timing constraints in a real time system, affecting program correctness.

The updating system should minimize human intervention. This is partly accomplished if program correctness is automatically and transparently preserved.

There should be support for low level program changes. This requirement is similar to the flexibility requirement listed by Hicks, but only deals with various independent, low level modifications, such as changes to some subroutine interface or to a data type definition. Support for higher level modifications, such as changes at the architectural level, is taken into account in the requirement to support code restructuring. Modifications that would fall into this category include addition of new modules, removal of existing modules, changes to module relations etc.

The updating system should also be able to update distributed applications. The motivation for this requirement is that many programs that use dynamic updating are distributed by nature. The updating strategy is required to scale to large distributed programs. (Hicks does not mention this requirement.)

The dynamic updating system should not require special-purpose hardware. This is quite obvious since we are focusing on software-based updating systems.

Finally, the updating system should not constrain the language and environment. An ideal system would be applicable to any programming language, environment and application.
The most important point in this requirement is, however, not language independence, but rather, as Segal and Frieder put it: "An updating system must not force programmers to write code or call operating system primitives in a radically different manner .../... Ideally, updating systems should tolerate a variety of
programming styles." This requirement is largely similar to the deployability requirement mentioned above.

2.1.2 Functionality

The required and desired functionality of a dynamic updating system partly depends on the design of the system as well as on the target environment. To give an overview of what tasks a dynamic updating system has to carry out, some of the most common functionalities are listed below. This list is largely the same as the one given in [DH01], and some of the tasks have already been indirectly mentioned in the requirements section above.

- Dynamic code replacement, allowing static code to be dynamically updated.
- Code restructuring, allowing subroutine interfaces to change.
- Data restructuring, allowing data type definitions to change.
- Reconfiguration at different levels of granularity, implying that variables, data types, subroutines, objects, modules, threads etc. can all change.
- Separation between updating and application development.
- Transparency, so that the updating system is easy to use and does not need aid from the user.
- State preservation, requiring that the application can be kept running without losing data.
- Update verification and validation, ensuring both statically and at runtime that the update patch can be safely applied.
- Well-timed updates, ensuring that updating is only allowed at certain points, and retaining update validity at runtime. Quiescence is a common special case which ensures that updating only occurs when the component being replaced is not active. In an object-oriented environment, the component could for example be a method or a class, of which a method is said to be active if it is on the calling stack of any thread and a class is said to be active if it contains any such active methods. In addition, entirely safe updating of classes requires that the classes do not have any referenceable instances either.
- Support for multi-threading.
- Support for distributed systems.
- Support for multiple simultaneous versions. This is not always required, but often more or less a requirement for distributed systems where synchronization would typically be too difficult and inefficient.
- Hardware independence.
- Design and development method independence, enabling arbitrary applications to be updated. This ideal is usually not achieved in traditional dynamic updating systems.
- Programming language independence. In most dynamic updating systems, this is not achieved either.
- Accommodation to real time constraints.

2.1.3 Modification Type

There are a number of distinguishable modification types that a dynamic software updating system should be capable of handling. A first, rough classification of these can
be made by differentiating between whether it is the source code or some external data (such as an image or a configuration file) that has been modified. The latter case is trivially handled, since it only requires that the modified data is reloaded, but the former case is what dynamic software updating is all about. Source code modifications can be further grouped according to their relevance in a dynamic updating system, as is done below.

- Extending functionality by adding modules, classes, functions, methods, fields etc. This kind of modification is almost irrelevant to a dynamic updating system, since the new functionality is not reachable from the old version of the code. An environment supporting dynamic extension could perhaps detect the new functionality and make use of it, but such a scenario is also conflict free, since it is essentially the same as if the functionality was "always there". There is no conflict, because no references to the new functionality exist. The only thing that the updating system really needs to do is to initialize the new functionality and make sure that more memory is allocated where needed and old references are kept the same. Doing so might, in some environments, require extra work, but can quite easily be implemented for example in a JVM.

- Modifying existing functionality by editing the code, changing the data types of variables (e.g. changing an integer variable to a float) etc. In principle, this can be supported by providing state mappings for all the changed functionality (a sketch of such a mapping follows below), but in practice, relying only on state mappings is usually too cumbersome to be useful. One solution is to require quiescence, as described when discussing "Well-timed updates" in section 2.1.2, "Functionality". Doing so avoids most conflicts, but at the same time seriously reduces the flexibility of the updating system. One requirement that this kind of modification (and dynamic updating in general) is assumed to fulfill, is that the modified functionality still performs the same tasks as it did before the update. The internal algorithms - as well as the results - may change, as long as all code using this changed functionality is correspondingly and simultaneously updated.

- Changing the internal structures that provide the functionality is perhaps the most complicated high level modification type possible. Changing the internal structure refers to anything from changing the relationship between data types or changing subroutine interfaces to heavily modifying the class hierarchy or architectural design. These modifications can be very heavyweight, since all the modules that are involved must be updated appropriately, and each modification must be handled somehow (see Robustness in section 2.1.1, "Requirements") - typically using some state mapping. A large, complex update is also likely to require a lot of work from the developer, since many state mappings etc. will be needed, and it will be difficult or impossible for automated methods to correctly guess these mappings.
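To illustrate the kind of state mapping referred to above, consider a hypothetical class whose price field changes from an int (in cents) in the old version to a double (in euros) in the new one. The Java sketch below shows the developer-supplied mapping an updating system would invoke for every live instance; all class and method names are invented for this example.

    // A sketch of a developer-supplied state mapping for a modified data
    // definition. The classes are hypothetical: version 1 stores the price
    // as an int in cents, version 2 as a double in euros.
    class ItemV1 {
        String name;
        int priceInCents;
    }

    class ItemV2 {
        String name;         // unchanged, copied as-is
        double priceInEuros; // type and unit changed
    }

    class ItemStateMapping {
        // An updating system would call this once for every live ItemV1
        // instance, then redirect all old references to the new object.
        static ItemV2 map(ItemV1 old) {
            ItemV2 updated = new ItemV2();
            updated.name = old.name;
            updated.priceInEuros = old.priceInCents / 100.0;
            return updated;
        }
    }

An automated tool can guess the trivial copy of the unchanged field, but the cents-to-euros conversion is precisely the part that only the developer can specify.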
2.2 Update Timing and Validity The correctness of an update depends not only on how the update patch is created and how the updating system is designed, but also largely on the time when it is applied. The thing that distinguishes dynamic updating from static updating is that dynamic updating has to deal with the state of the application, and this state changes over time. Updating should be applied in an update-safe state, but guaranteeing that a state is update-safe is usually too complex a task. Performing updating in a non-update-safe state might lead to an unreliable state after the update, with incorrect behavior or a crash as the outcome. In an object-oriented application, the state can be grouped into two categories: a) live objects and b) active methods (Glossary 2). Modifications made to
methods are more problematic than modifications made to class and instance fields, and cause problems both for live objects and active methods.

2.2.1 Validity

To be able to ensure that an update is safe and valid, we will have to decide upon what we mean by valid. The most widely adopted definition of validity is the one given by Gupta [Gup94]. It says that replacing program π in process P with π′ at a given instance in time is a valid update if P is guaranteed to reach a reachable state of π′ in a finite amount of time. By reachable state, we refer to all states reachable from the initial state of an application. Once a reachable state has been reached, all following states will also be reachable.

In this paper, we feel that the definition of validity that Gupta gives is either too strict or too inexact. If code and data are updated separately - as is the case in every existing dynamic updating system - the converted data may either be such that the current execution would not have produced it, or even such that it could not have been produced by the new version (and thus is not a reachable state). If the old version of an application creates an array containing numbers 0 through 10, and the new one an array containing numbers 1 through 10, then the first entry can simply be discarded when updating. But say, for example, that the old version reads 10 strings, sorts them and adds them to a list. If the new version only reads 5 strings, then we have a problem. We can discard 5 of the strings, but we do not know which ones, since their order changed. And by doing so, 5 user inputs would be lost anyway. We can probably let there be 10 strings in the list, but then we are not in a reachable state of the new version.

It is unrealistic to require that, after an update, the application should behave exactly as if the new version had been run in the first place. If we further consider I/O, we notice that states are dependent on previous I/O requests, and we cannot perform arbitrary updates if we stick to such a requirement. Hence, we will instead require that the persistent state is transferred, and errors are avoided. Our new definition of validity then becomes: Replacing program π in process P with π′ at a given instance in time is a valid update if P is guaranteed to reach a reachable state in some permutation of π′ in a finite amount of time. A permutation of π′ is π′ run from some modified reachable state (and thus having its own reachable states), but behaving as intended. (Definition 6)

This definition could be seen as the weakest possible definition of validity, allowing the application to behave unexpectedly for a while, as long as it at some point starts behaving like the new version of the application. Any other definition would be more strict and make guaranteeing validity harder. Gupta further shows that determining whether an update to a sequential program is valid is undecidable in the general case. In other words, no algorithm can tell us whether an update is valid or not, and we will need to provide safety checks and mechanisms for detecting and dealing with invalid updates. It is also important to keep in mind that the simpler the updating strategy is, the easier it is to reason about, and the smaller the risk of applying an unsafe update patch. Unnecessary complexity should therefore be avoided. Invalid updates can be the result of any of the following.
- An ill-formed patch, most likely resulting from a mistake made by the developer, or by neglecting the completeness requirement (see Robustness in section 2.1.1, "Requirements"). If the updating system is complex and poorly automated, then patches are likely to be unreliable.
- Bad timing, which is difficult to reliably identify, but can have disastrous and entirely unexpected consequences, such as data loss, confidentiality degradation, crashes etc. It might be possible to detect some of these corruptions instantly, but others may remain undetected for a long while. The updating system should at least attempt to improve the odds of updating safely, and it can do so by using constraints, control points and more.
- Malicious code, which can be provided in the form of a patch that is meant to crash the system, open a back door, or perform some other ill-intended action. The updating system should not in itself be a security risk, and should a) not allow "anyone" to apply updates b) be able to detect and reject harmful patches.

As mentioned previously, we cannot generally guarantee that an update is valid, and must thus be able to detect faults after an update patch has been applied. This detection typically only involves catching uncaught exceptions and assuming that they were caused by a faulty update. The user could also be given the option to manually invalidate an already applied update patch. Once the update has been detected to be invalid, some kind of rollback can be used to revert to the old version, at some instance of time before updating. For maximum robustness, this usually implies that the whole state is reverted to some state prior to updating. Such an implementation is memory consuming, since not only old version code must be stored, but also the full state, which can possibly be huge. In addition, anything that happened after the update will be "forgotten", and possibly result in data loss. In the case of a server, communication with clients will probably also be messed up, and so on.

Illustration 1: Rollback in a distributed system. The vertical lines represent nodes and the circles locally backed up states. Diagonal arrows visualize communication between nodes, and the time axis runs downwards. If an error is detected in node C at time T_C,3, all grayed lines would have to be rolled back. If a global state was synchronously backed up, this avalanche effect could be avoided.

For a distributed system, one-step independent rollback is not enough. If all nodes were updated and the update was detected to be faulty, it would make sense to revert to the old version for all contributing nodes. Even if this was not the case, we would at least have to apply rollback recursively to all nodes that have
communicated with the rolled-back node after the point of rollback. We define that the global state of a distributed system contains the local states of all contributing nodes (see section 3.1.3, "Including State"). The avalanche rollback effect possibly following a local rollback can be avoided if the global state - rather than the local states independently - is backed up. This requires a certain degree of synchronization and might be difficult to accomplish in large distributed systems.

2.2.2 Invoke Model vs. Interrupt Model

Approaches to dealing with timing in dynamic updating systems can roughly be categorized as following either the invoke model or the interrupt model [Hic01]. The invoke model is simpler and more straightforward, but not as widely adopted. It basically forces the designer to hard-code one or more points in the application where updates can (hopefully) be safely applied. Each time one of these update points is encountered, the system will check if there are any updates pending. Normally, there will be none, and execution will immediately continue. However, if an update request has been made, the system will apply all pending update patches before continuing execution. Following this model, arbitrary applications cannot be updated, and those applications that should be made updatable must be developed with updating in mind from the very beginning. Updating multi-threaded or distributed applications can also be problematic. The advantage of this method is its simplicity. It a) is much easier to implement b) makes reasoning about update-validity a whole lot easier c) can - correctly implemented - partially solve problems such as updating long-duration active methods (Glossary 2).

The interrupt model is more automated and flexible than the invoke model, but thereby also more complex - sacrificing implementation simplicity and perhaps some robustness and reliability. In this model, update patches can (theoretically, but not necessarily in all implementing systems) be applied to any application. In addition, they can be applied at any time (interrupting the normal execution), provided that some runtime check determines that it is safe to perform updating at that particular instance in time. Theoretically, this is a superset of the invoke model and can find all update points in an application. However, in practice, the runtime check is much simplified and rather conservative. A good runtime check would never allow updates to be applied at bad times, but would do so without being overly conservative. In an object-oriented environment, the conditions for when updating is allowed can be that the methods to be updated are not active, that the whole class is inactive, or that some more detailed requirement is met. In DYMOS [Lee83], the patches contain when-conditions stating update-relations between procedures (DYMOS was designed for a variant of Modula, and therefore uses procedures). These conditions are written in the form update P, Q when P, R, S idle, stating that procedures P, R and S must be inactive when P and Q are updated. These conditions allow the developer to specify arbitrary updating-relations between the modules being updated, but not at finer grained levels. If, say, R is executing an infinite loop, these conservative conditions will never allow the update to be applied. Perhaps even worse, relying on the developer to correctly identify these relations is both time consuming and error-prone.
One could say that the invoke model tells when updates are allowed for sure, while the interrupt model expresses conditions for when updates are not allowed. One could also note that the invoke model fixes an update point that must be valid for any possible future update. If that update point is poorly chosen, then certain updates might not be
possible without using tweaks such as updating twice, with the intermediate update doing preparations for the real update.
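In code, the invoke model amounts to little more than a hard-coded check. The Java sketch below shows a hypothetical server loop with a single update point; the Updater interface is an assumption made for this illustration, not the API of any system discussed in this paper.

    // A sketch of the invoke model with a hypothetical Updater interface.
    // Updates can only be applied between loop iterations, when the loop
    // body is not active, which keeps reasoning about validity simple.
    class Server {
        void run() {
            while (true) {
                // Hard-coded update point, chosen by the developer as
                // (hopefully) update-safe. Normally no patch is pending
                // and execution continues immediately.
                if (Updater.patchPending()) {
                    Updater.applyPendingPatches(); // replace code, map state
                }
                handleOneRequest();
            }
        }

        void handleOneRequest() { /* normal application work */ }
    }

    // Hypothetical updating-system plumbing, stubbed out for the sketch.
    class Updater {
        static boolean patchPending() { return false; }
        static void applyPendingPatches() { /* apply queued patches */ }
    }

The fixed update point also makes the weakness above tangible: every future update must be applicable at this one point, or resort to tricks such as the two-step update just mentioned.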
3 Existing Concepts and Primitives

The static "building blocks" of software applications are the primitives of the programming language that the application's source code is written in. The dynamic building blocks are lower level machine instructions and hardware primitives. If an interpreter runs the application, these building blocks might be the same, but even if this is not the case, they are still related. The programming language controls what kind of lower level code will be produced for the application, but is itself designed based on available low level primitives.

John von Neumann (1903-1957) described the well-known von Neumann architecture, upon which virtually all computers are based. According to this architecture, a computer should include four parts: an arithmetic-logic unit (ALU), a control unit, memory and an input/output unit. The architecture also specifies that control flow, access to memory etc. should be sequential and not parallel. Modern computers are still tightly bound to this architecture, and both low and high level primitives are based on it. This chapter will discuss what implications this has upon dynamic updating.
3.1 Classification of Existing Building Blocks

In order to discuss how updating is performed, we need to identify what primitives and concepts we have at our disposal. Different primitives are likely to need different handling when it comes to dynamic updating, so we will classify them accordingly. Most of the primitives described below should be well-known to the reader, but it is important to describe them in some detail, because we will have to keep in mind what fundamentals we deal with when we go on to consider dynamic updating.

3.1.1 Low Level Primitives

Every task that is performed by a computer is executed inside a process. Sometimes, multiple processes (several smaller tasks) co-operate to perform some larger task. The processes may be executed one at a time, sequentially on one single processor, or in parallel on multiple processors. Regardless of the inner workings of these processes, they always contain certain private parts, and certain common parts which they share with all the other processes. A thread is often called a light-weight process, and can either be implemented as being part of a process (e.g. a Java thread inside the JVM process), or itself be an external process (e.g. a Linux thread). The main difference between a thread and a process is that the thread intuitively shares data with its owning process while a process does not. Context switches between threads might sometimes be faster than between processes.

Each process / thread contains data, code, a constant pool, a program counter and some input / output state. Data can further be split into statically allocated data, registers, a heap, and one or more stacks. Processes communicate through inter-process communication (IPC) primitives, such as message passing, semaphores and shared memory.
Illustration 2: Low level primitives - an abstraction enforced by hardware.
To perform its tasks, a process executes the code it loaded when it was created. Part of the code can be shared between processes (dynamically linked library functions), but typically, no part of the code may be overwritten during execution. The constant pool contains constants (strings in particular) that the code can reference. Together with the constant pool, the code defines the full static image of the process (application in general), and contains all information necessary to describe any possible behavior. The rest of the low level primitives define the dynamic image of the process (application).

The program counter (PC) is the part of the dynamic image (state) that declares what code is currently being executed. Normally the PC is automatically incremented after processing one instruction, resulting in linear execution of the code, but special go-to and go-sub(routine) instructions can relocate the PC to virtually any part of the loaded code. Statically allocated data is - unlike constants - variable value data, but is always accessed at the same address, and will never be removed from that location. Global variables are the most common statically allocated data. Registers are a special kind of statically allocated data. They are memory slots that provide fast access, and are used to store intermediate results during execution. Some registers are reserved for a special purpose, such as serving as a stack pointer, while others can be used freely. In stack based machines, such as JVMs, a stack serves the same purpose as registers do in register machines. The heap is used for long-term storage during execution. Globally accessible allocated memory in procedural environments, and objects in object-oriented environments, would typically be stored in it. The heap is the data storage primitive that most resembles physical memory. Stacks are used for short-term storage. A fixed amount of memory is typically reserved for each stack, and anything that is pushed onto a stack must be popped from it in the reverse order. Stacks should therefore only be used as temporary storage, but they are a powerful, efficient and natural solution for storing states that must later be re-established. Such a state can be the local state of a subroutine before calling another subroutine, or the register values before calling an interrupt handler.

The input / output (I/O) state is not as clearly defined as the other primitives, because a) there are so many different types of I/O devices, and b) I/O is shared between processes. The I/O state consists of e.g. handles to opened files and communication ports, system clocks, pending I/O interrupts, input buffers containing data that can be read (e.g. from the keyboard) and output buffers containing data that has been written but not yet sent to any I/O device. Part of the I/O state is process-specific while the other part is not. While a normal file handle is process-specific, processes are forced to compete if they, for example, want to read input from the keyboard. It can be quite a challenge for a dynamic updating system to take the I/O state into account.
3.1.2 Programming Languages

Based on the von Neumann architecture and hardware level primitives, higher level primitives have been invented. Larger high level concepts (object-orientation, parallelism etc.) include some subset of these primitives and some philosophy for reasoning about applications. Based on these concepts and primitives, a large number of programming languages with slightly different features have been proposed. The first programming languages were low level and closely bound to the hardware architecture (machine language look-alike assembly languages). Later on, the level of abstraction was raised and the coupling to machine language loosened somewhat (high level imperative languages). More recently, the level of abstraction has been raised yet again, and some programming languages have been designed solely based on some high level concept, without paying much attention to hardware constraints (functional programming, logic programming, parallelism, persistence, distribution etc.). Applications written in these languages must still be converted to machine language at some point, and are typically slower because of the overhead work that "simulating" the desired functionality requires.

Illustration 3: Traditional programming languages - a conceptual abstraction.

The most popular programming languages as of today are imperative or object-oriented, and still rather tightly bound to the low level primitives. We will consider the primitives that these use to be the units that a dynamic updating system should be able to update. Imperative programming (procedural programming) is the programming style of both the low level machine languages and higher level languages such as C and Pascal. Any high level addition is essentially only syntactic sugar with the sole purpose of making code more readable or easier to write or maintain. This includes structured declarations, grouping of data, code blocks, and loops instead of ad-hoc control flow. Since the machine language of virtually any computer is imperative, any program written in any programming language can be converted to an imperative program. Object-oriented programming could be seen as a subset of imperative programming, but is conceptually different. Conceptually, objects are independent entities with the capability to communicate with each other. In practice, objects are only
a grouping of data and subroutines (called methods in object-orientation terms), with restrictions on how these may be accessed. Communication can be implemented using for example message passing (as in Smalltalk), but direct invocation (calling the subroutines) is the most common, imperative-like implementation. The source code and architectural design of an object-oriented application is typically quite different from a corresponding imperative application. However, this difference is mostly caused by the higher level view, and is not really a practical difference. In the case of dynamic updating, what can be said about imperative programming almost always holds for object-oriented programming as well.

An important concept in most programming languages is the separation of fixed value (code and constants) and variable value (data) elements. Code and constants are considered to be static and unmodifiable, but data can be altered (and often also created and destroyed) at runtime. Furthermore, data is often accessed only via data types and data structures. (Languages such as C allow arbitrary pointers, but most valid access still points directly or indirectly to some allocated memory region.) This makes data handling more disciplined and unified than when only using low level facilities for accessing data.

3.1.3 Including State

State is a runtime image. It can be the runtime image of a subroutine call, of an application, of the whole operating system or of an entire network. The state solely and fully describes everything that "is" at a given time. Reconstructing the state of an application would have it continue executing in exactly the same way as it did when the state was captured (assuming that external I/O is also replayed in the same order). Since code is static and does not change, it does not have to be included in the state description, but everything else contributes to the state.

The subroutine state could be called the process local state (but should not be confused with the local state, defined below). It includes all the subroutine-local data, the PC inside the subroutine, the values of registers etc. If another subroutine is called, this state is usually memorized by being pushed onto the stack. Typically, no subroutine can alter the local state of another subroutine, but subroutines can still alter the behavior of each other by altering the process global state. This is the part of the full process state that is not local to any particular subroutine. It includes heap allocated data, the I/O state and the calling stack. The local state is the full state in one single computer. It contains the process states of all active processes (alternatively, all processes that are participating in executing some application), as well as information shared between processes - such as the system clock. The global state is the full state in an entire network. It is the super-state of all other states, and contains the local states of all network nodes (alternatively, all network nodes participating in some distributed computation). The global state of a single-threaded, non-distributed application would only include the process state of the process executing the application and the non-process-specific local state.

The global state represents the dynamism in running applications; if it did not change, then runtime updating would not be dynamic. One way to make updating easier is to reduce the impact that global state has upon an application.
By using binding times that are as late as possible, flexibility increases, since the link between static code and dynamic state is blurred and made less important. Generally speaking, early binding is efficient, while late binding is flexible.
"Conceptual primitives"
"Low level primitives" Code block
constant pool code
dynamically addressed
(variable value)
static data registers I/O PC stack heap
instr functions uctio ns
dependency
p roce ss st at da ta local data
e
Data block statically addressed
ns tio i n i class / data def structures
(fixed value)
external state
global data execution location
Illustration 4: Mapping programming language concepts onto low level primitives. As denoted in the figure, the process state is dependent on constant declarations, but still separated from them. For example, direct addresses in the data block might point to constant data which has been loaded and stored to a dynamic location. But no constant information needs to be included in the state, since the application name and version completely define such information.
Binding time decisions that can be made are, among others, dynamic vs. static linking and early vs. late binding of subroutines. For example, the object-oriented programming language Smalltalk uses late binding of method signatures, so at every invocation, the correct method is first resolved and only then invoked. This effectively means that the implementation of individual methods can be changed at almost any time. Binding can also be postponed by using indirect references to reduce the number of fixed addresses. This way, anything pointed to by such a reference (subroutine, variable, data structure etc.) can be replaced. This can also be accomplished by book-keeping pointer assignments. This book-keeping can then be used to update pointers appropriately when some unit has been replaced. Book-keeping can potentially save memory, since all references need not be indirect, but it is more complex and error-prone, and also most likely less efficient. If neither indirect references nor book-keeping is used, relinking must be performed when a unit is replaced.

Illustration 5: Linking modification type to state alteration. Solid arrows show requirement relations between modification types, and dashed arrows "might require" relations between modifications and sub-states.

Even if the application is relinked, pointers still pose
a problem, and the presence of a garbage collector (GC) requires either indirect references or book-keeping anyway.

Different types of modifications affect different subsets of the global state. If the affected subset is large, then the effects of the modification are wide-spread. We define that the complexity - in dynamic updating terms - of a modification is not determined by the scale of the modification, but by the effect it has upon the global state (Definition 2). Illustration 5 visualizes how general modification types affect the global state, and how the modification types are related to each other. We note that, in itself, changing a data structure definition or the data type of a variable only affects the data stored as this data type. But we will most likely also have to modify the code - to correctly handle the changed data type. (This change might in some high level languages be transparent in source code and only be seen in compiled binary code.) Code modification is rather heavyweight, and a trivial change of a data type might have a much larger indirect impact than one would assume.

Having introduced the underlying concepts and primitives, we will in the next sections discuss updating at different levels of granularity.
3.2 Updating Data Structures

A traditional programming language contains two fundamentally different units - data and code. The structure of both code and data follows different rules depending on what language is used, but in essence, any modification can always be treated as affecting either or both of these two units. The tasks of updating them are fundamentally different, so it seems reasonable to deal with them separately. This section deals with updating changed data types and modified data structure definitions.

3.2.1 Dealing with Modified Data Definitions

The requirement for completeness (see Robustness in section 2.1.1, "Requirements") implies that any modification to data definitions and variable types must be addressed. Since the new version of the application must be type-correct, new code will already have addressed this modification. When executing old code, the old data types must be used, and the new ones used when new code is executed. The remaining problem is hence converting old data to new data as execution of new code begins. Some dynamic updating systems also demand that we can go back to executing old code from time to time, and new data must then be converted back to old data. This reverse conversion can usually be dealt with in exactly the same way as the natural conversion from old to new data. Besides converting existing data, switching from one version to another might require that some data structures or variables are discarded and some new ones are initialized. This is a simple scenario, because detecting when this should be done is the only difficulty that it imposes. For data that needs to be converted, the conversion can be as simple as casting an integer variable to a floating point variable, but equally well as difficult as restructuring both the inheritance hierarchy and the internal contents of a complex object. Converting (subroutine-)local data is generally easier than converting global data that can be accessed from multiple locations. If both old and new code is to be frequently accessed, it might pay off to allow both the old and the
new version of an object to co-exist. This might be the case when large distributed systems are to be updated and update synchronization is tricky.
Illustration 6: Conversion policies for migrating data structures and objects. Some objects cease to exist before t_update, some are created after t_update, some are converted at time t_update, and some remain unconverted.
Vandewoude and Berbers [VB02] focus on dynamic updating for embedded systems - as does this paper - and briefly consider three different data structure conversion policies suitable for such systems. All data structures can be converted at once (called global update below), making updates time consuming and having them interrupt the application for the duration of the update. Data structures can also be left unconverted (called partitioning below), requiring that different versions of data structures can co-exist and will be properly accessed and used. Updating on demand (called incremental global update below) will convert data structures no sooner than when they are referenced, avoiding temporarily halting the application. Since embedded systems often have real-time constraints, this approach, which leads to unpredictable object dereferencing times, is not favored by Vandewoude and Berbers. In addition to what is argued by Vandewoude and Berbers, we note that updating all data structures at once generally does not scale well to distributed systems, and we would also like to avoid interruptions in program execution. Leaving data structures unconverted is not tempting either, since certain data structures will likely exist for the entire duration of running the application - i.e. will never be converted. We would also like to avoid having update requests pending, and would prefer to have them served in a short amount of time. Therefore, none of these conversion policies entirely fulfills our needs, and we might have to settle for some compromise. A more detailed analysis of data structure conversion policies, identifying policies named version barrier, global update, passive partitioning and active partitioning, can be found in [MPGBB00], [HG98], [Dug01] and [DW02]. All these conversion policies are visualized in Illustration 6 and described below. For the sake of simplicity, we will talk about objects and classes instead of data structures and data structure definitions, but the theory applies equally well to non-object-oriented applications.

3.2.2 Version Barrier

The simplest and least flexible approach is to allow a class to be updated only at time intervals when there are no accessible instances of it. This is the most conservative approach possible. Since no state needs to be converted, this scenario is identical to statically replacing or dynamically reloading the class. It is easy to replace unused classes, because doing so avoids the actual problem with dynamic updating and basically only simulates traditional static updating. This approach is so inflexible that, using it, hardly any practical application could undergo even slightly larger dynamic updates.

3.2.3 Global Update

This approach could be called the dual of the version barrier, taking the approach of converting every single object of an updated class. As with the version barrier approach, this approach also prevents multiple versions of objects from co-existing. One possible implementation is to use proxy objects for all updatable classes, and atomically or incrementally update their implementation pointers when an update is performed. This is the approach taken in Dynamic C++ Classes [HG98]. An advantage of this proxy indirection is that library methods can safely be updated; the disadvantages are that the proxy objects degrade performance and that only classes prepared for updating can ever be updated.
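The proxy indirection can be sketched as follows in Java - our own illustration (the cited system does this in C++); Account, AccountV1 and AccountProxy are invented names:

    // Clients hold only proxies; an update atomically swaps the implementation.
    interface Account { void deposit(int amount); int balance(); }

    class AccountV1 implements Account {
        private int balance = 0;
        public void deposit(int amount) { balance += amount; }
        public int balance() { return balance; }
    }

    class AccountProxy implements Account {
        private volatile Account impl = new AccountV1();  // current implementation
        public void deposit(int amount) { impl.deposit(amount); }  // one extra indirection per call
        public int balance() { return impl.balance(); }
        void update(Account newImpl) { impl = newImpl; }  // performed for every proxy during global update
        // A real system would also convert the old implementation's state into newImpl here.
    }

The volatile field makes the swap atomic with respect to concurrent callers; the per-call indirection is exactly the performance cost mentioned above.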
Global update is often compared to garbage collection. It resembles garbage collection in the sense that both essentially function in the same way - finding objects that should be converted / deleted and then converting / deleting them. As with garbage collection, several implementations of global update can be considered. To avoid interrupting the application while finding and converting objects - which might be a time consuming operation - incremental global update can be implemented (compare with incremental garbage collection algorithms). This is the implementation choice in, among others, [MPGBB00] and [DH01]. Global update is generally not suited for updating distributed systems, because it requires that all nodes participating in the distributed computation are synchronized, negotiate when to update and then halt until informed that all nodes have completed updating. This is true also for incremental global update, because the time of updating must still be synchronized, and only one version of each object may exist simultaneously. Such a synchronization would not only cause a significant interruption in program execution, but would also be difficult to negotiate. Even the simpler scenario of finding a suitable updating point in a non-distributed but multi-threaded application is difficult, and may lead either to infinitely postponed updates (if, at all times, at least one thread is in a non-updatable state) or to deadlocks (if, in an attempt to avoid infinitely postponed updates, threads are locked independently when becoming updatable). If update timing is flexible enough, synchronization is only a minor difficulty, but with strict update timing constraints, global update is not an option as the conversion policy in a flexible dynamic updating system with support for multi-threaded or distributed applications.

3.2.4 Passive Partitioning

Instead of unconditionally converting all objects, some or all of them can simply be left unconverted, resulting in the co-existence of objects instantiated from different versions of classes. Passive partitioning converts no objects at all, but uses updated classes to instantiate new objects. This reduces the need for complicated synchronization and is therefore well suited for distributed systems. Implementing passive partitioning is, however, more difficult than implementing global update. The problem is to ensure that objects and methods of different versions can inter-operate correctly and in a type-safe manner. Typically, objects would have to be converted on-the-fly from the old version to the new one and vice versa, depending on where and how they are used. The update might never be fully completed either, if at least one instance of an old class remains referenceable. This implies that both versions of the code must always be kept in memory, and on-the-fly conversions must always be enabled. Both the performance degradation and the memory overhead might be quite substantial. Type renaming is one way of enabling multiple versions of data definitions (classes in an object-oriented system) to co-exist. Using type renaming instead of type replacing, the different versions are distinguished, and in some sense treated as different type definitions. This can generally be accomplished with direct re-linking [Hic01], and in the case of an interpreted language such as Java, also by including some version field in the identity of types.
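Type renaming can be sketched as follows - our own, purely illustrative Java rendering with invented names:

    // Both versions coexist as distinct types; old code keeps using Point_v1,
    // new code uses Point_v2, and a converter bridges the two on the fly.
    class Point_v1 { int x, y; }
    class Point_v2 { double x, y, z; }

    class PointConversion {
        static Point_v2 toV2(Point_v1 p) {
            Point_v2 q = new Point_v2();
            q.x = p.x;
            q.y = p.y;
            q.z = 0.0;   // default value for the field added in the new version
            return q;
        }
    }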
3.2.5 Active Partitioning

Active partitioning is almost identical to passive partitioning, with the only difference being that existing objects are allowed to be converted when a certain condition is met. This partitioning is called active, because it allows the developer (or the user) to actively select which objects to convert, and when those conversions can safely be performed. This is a rather flexible approach, but it increases the work-load for creating patches. Identifying good partitioning strategies can also be very tricky and complex, since these depend on both syntax and context, often resulting in too conservative or unreliable updates.

3.2.6 Sub-Typing and Type-Checking

Strictly speaking, global update, passive partitioning and active partitioning cannot as such be safely applied to running applications. The reason for this is illuminated in the following pseudo code example, which is adopted from [Dug01] and introduced in Illustration 7.

    T x = T.get();       // 1) Retrieve an instance of T.
    ...                  // 2) An update patch is applied here. T.get() is
                         //    modified to return an instance of a subclass of T.
    T y = T.get();       // 3) Retrieve an instance of the subclass.
    y.calculate(x);      // 4) Perform some calculations.
Illustration 7: An example showing code executed before and after updating.
Consider that the original method T.calculate(T) performs some arbitrary calculation, and as a design decision, the developers have decided to switch to using subclass TSub instead of T in the new version of the application. Illustration 8 shows how the subclass TSub overrides the implementation of that method. This is a simple update - only method get() is modified - and perfectly legal, since the new version of the application would work just as intended. However, the program will crash when executing y.calculate(x), since the contextual usage of the return value of method get() changed, and no conversion was performed to compensate for that.

    define TSub.calculate(T t)
        TSub tsub = (TSub)t;
        ...
    end

Illustration 8: Overridden method for subclass TSub (which is not updated).

If the contextual usage of some variable, class or instance field changes, we can address this change while converting the data type (and forcing it to be converted regardless of whether it has been physically modified or not). The problem here is that the contextual change lies at the origin of the data structure (return value or creation) and not at its reference. In the example above, we might perhaps be able to address the contextual change inside the method, but if x is stored to some global variable, it will escape the method. Any future code execution might rely on TSub instances having been stored, and crash the program when this is not the case. We would basically be forced to keep track of where each variable has been created, but not even that is enough. If, for example, x was cloned, then the clone would have to be considered as being created in a correct context, but the object being cloned would still invalidate this context.
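For concreteness, the scenario of Illustrations 7 and 8 can be rendered as the following runnable Java program - our own sketch, where the static instance field merely simulates the effect of the patch on T.get():

    class T {
        static T instance = new T();       // what get() returns; swapped by the "update"
        static T get() { return instance; }
        void calculate(T t) { /* some arbitrary calculation */ }
    }

    class TSub extends T {
        @Override void calculate(T t) {
            TSub tsub = (TSub) t;          // assumes every T is by now a TSub
            // ...
        }
    }

    class CrashDemo {
        public static void main(String[] args) {
            T x = T.get();                 // 1) old context: a plain T
            T.instance = new TSub();       // 2) simulate the patch: get() now yields TSub
            T y = T.get();                 // 3) an instance of the subclass
            y.calculate(x);                // 4) throws ClassCastException: x is not a TSub
        }
    }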
One could argue that this problem is slightly theoretical and lacks practical importance, but it is still a serious problem. Since the source of the problem is a contextual and not a syntactical modification, it is difficult to handle. We could of course prevent classes with referenceable instances from being updated, but then we would effectively be using the version barrier as the object conversion policy (see section 3.2.2, "Version Barrier"). Several existing dynamic updating systems, such as JDRUMS [DH01] and DVM [MPGBB00], actually seem to suffer from this safety threat. The source of the problem is the semi-loose typing. In a dynamically typed language, the data type can change arbitrarily, and in an object-oriented language, references declared to refer to a certain type can actually refer to subtypes as well. This loose typing allows contextual changes to be hidden - without altering the syntax. If stricter typing was required, contextual changes would never go unnoticed, but then object-orientation would not be the same either.

Having identified sub-typing as the source of the problem, we consider what solutions we have at our disposal. We can use stubs to convert old version instances to new version instances whenever a version mismatch occurs. This was discussed in relation to passive partitioning for converting objects. But now, instead of using version tags attached to classes, the developer would mark a class (the old version, T) to be contextually replaced with another class (the new version, TSub). The approach does not work equally well here, since TSub might not have entirely replaced T, but perhaps only in some context (such as return values from method T.get()). If so, not all objects of type T should be converted. In addition to the version tag, another tag could be used to identify in what context each instance was created, and based on that be converted or not. Identifying the points where contextual class replacing occurs between two versions of an application could be semi-automated using some diff tool. In practice, however, such a solution with sufficient expressiveness is likely to be inefficient, complex and cumbersome.

We notice that, instead of casting the parameter t to type TSub, we could redefine the type of t to be TSub in the subclass method. This is theoretically valid, but not supported by traditional programming languages such as C++ and Java. We note that we in fact want the argument to be of the same type as the instance that invokes the method. We call such methods binary methods, where the term binary comes from the fact that the method acts on two objects of the same type - the instance on which the method is invoked and the argument. The method would be called binary even if it took more than one argument of that same type. Several problems are related to binary methods, and [Bru95] gives a good overview of that problem domain. In our own domain, we can take advantage of work done on binary methods when we consider something that could be called method type specialization [CHC94] [DW02]. What we consider most crucial is avoiding runtime errors (program crashes) caused by the dynamic updating. Therefore, if the conflicts described above could be statically type-checked, we could at least guarantee safe updates (with respect to the described problem) and reject conflicting ones.

    T x = new TSub();
    T y = new T();
    x.calculate(y);

Illustration 9: An example that causes a runtime error when method type specialization is used.

Method type specialization addresses related issues by relaxing the typing rules for method arguments and return values. Instead of declaring the type of argument t as T in method T.calculate(T t), it can be declared as ThisType, being treated as the type of the class implementing the method. This way, the method signature
stays the same, but in class TSub - which overrides the method - returned objects may be instances of TSub or one of its subclasses, but not of T or any other superclass. This helps, but we still cannot statically catch all type mismatches, since reference variables may dynamically be assigned instances of subclasses. The code in Illustration 9 (executed after applying the update described above) would pass a static type-check and fail at runtime, when attempting to invoke x.calculate(y). (Contrary to the example above, this failure could be detected already on invocation, before actually calling the method.) It can in fact be argued that there is no subtype relationship between TSub and T, although TSub is a subclass of T. According to [DW02], this is one of the reasons for the lack of support for method type specialization in object-oriented languages. In addition, this does not solve all our problems, since it only works when the required return or argument type is the declaring class, and not in more general situations.

Multi-methods, a.k.a. generic functions [Bru95] [CL95], are another solution to dealing with binary methods. A multi-method is a collection of method bodies associated with one single method name. Hence, the methods can be seen as different methods with one single method signature. Traditionally, object-oriented systems support dynamic dispatching for virtual methods, choosing the correct method implementation based on the class of the instance upon which the method was invoked. This is done by performing a simple look-up in the virtual method table (v-table) associated with that instance. Multi-methods are polymorphic in all their arguments, which means that the runtime types of the actual parameters are also considered when deciding the most appropriate method body. The self reference and the method arguments are actually considered to be equally privileged, and in a pure multi-method-based language, the method is a global (non-instance) method that dispatches method bodies solely based on the types of the actual parameters. This is referred to as multiple dispatching, in contrast to single dispatching, where instance methods are used and the self reference is privileged. In a multiple dispatching system, all methods could basically be defined outside the class, leaving classes with no other purpose than to store data. This could be seen as violating the concept of object-orientation, but should perhaps rather be seen as a separation between code and data declarations. After all, methods would still be indirectly associated with classes. Multi-methods provide precise control over how different combinations of argument types are handled. This can be a problem because of the blow-up in the number of methods needed to cover all combinations. Still, in solving the problem we focus on, this is not likely to be a big issue, due to the restricted usage of - and need for - multi-methods.

3.2.7 Adaptable Objects

Yet another solution, called adaptable objects, is described in [Dug01] and [DW02]. It could be seen as a cross between stubs and method type specialization, and does not treat the old and new version types as distinct types. Adaptable objects can be seen as a generalization of sub-typing in object-oriented systems (where the typing constraint of a reference is relaxed to include subclasses of the actual class).
Subtype relation rules are introduced when a new type replaces an old type, declaring that the new type is to be considered a subtype of the old type (Tnew <: Told).
Illustration 11: A subset of a state machine for calculating Fibonacci numbers. This subset demonstrates how the second term in the Fibonacci series is calculated. The result is stored in variable a, and intermediate results in b. Dashed lines show state transformations that we - according to the algorithm - would perform, had the value of n been another.
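As a concrete, executable rendering of this machine, the following Java sketch is our own (the ASM rules themselves are given in Illustration 12 below); we model the unknown-input marker "?" as -1, and printing the result is our addition:

    import java.util.Scanner;

    class FibonacciASM {
        int a = 0, b = 1;
        int n = -1;                  // -1 stands for the "?" (no input yet) marker
        boolean err = false;

        // Fire the single rule whose guard matches the current state.
        void step(Scanner input) {
            if (n == -1) {                        // [n=?] / n := read input
                n = input.nextInt();
            } else if (n == 0) {                  // [n=0] / reset and await new input
                System.out.println("a = " + a);   // our addition: show the result
                a = 0; b = 1; n = -1;
            } else if (n > 0) {                   // [n>0] / a:=b || b:=a+b || n:=n-1
                int oldA = a;                     // the rule fires "in parallel",
                a = b;                            // so remember the old value of a
                b = oldA + b;
                n = n - 1;
            } else {                              // negative input: flag an error
                System.out.println("error");
                err = true;
            }
        }

        public static void main(String[] args) {
            FibonacciASM m = new FibonacciASM();
            Scanner in = new Scanner(System.in);
            while (!m.err) {
                m.step(in);
            }
        }
    }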
Any useful state machine is finite, which means that both the set of states and their contents are finite. External input makes the execution possibilities infinite, but the machine itself is still finite. A state can be seen as an image of the memory in a computer, with a PC saying what memory location is currently being executed. A state machine containing such states, and given an initial state (including the PC value), would then execute the application just like a regular computer would.

To make state machines useful in practice, their syntax must be simplified. An abstract state machine (ASM) [Bor98] is a type of state machine for which a simplified textual syntax is used, and we will use ASMs in the continued discussion. They are defined as having an arbitrary state definition and a fixed set of rules (conditional state transformations). (See Definition 1 for an alternative and more complete definition.) Hence, the rules are fixed, but the state changes. The rules can refer to external input and current state values, and describe how the state is transformed.

    Initial state:
      a: 0
      b: 1
      n: ?
      err: false

    Rules:
      [n=?] / n:=read input
      [n=0] / a:=0 || b:=1 || n:=?
      [n>0] / a:=b || b:=a+b || n:=n-1

Illustration 12: An ASM calculating Fibonacci numbers.

If a modification affects only part of the whole application, updating could perhaps be delayed until that part is no longer executed. We could also ask if the updated ASM must immediately act as if the new version had been run in the first place. This is of course desired, but perhaps not necessary in situations when mappings are difficult to come up with. According to our definition of validity (Definition 6), instead of delaying updating as suggested above, we could allow intermediate calculations to be incorrect if we know that we will shortly end up in a valid state anyway.

    PATCH 1:
      Mapping ASM Rules:
        [n>0 ^ a>1] / a:=b-a || b:=a || n:=n+1
      Modified Rules:
        [n>0] / a:=a+1 || b:=b*(a+1) || n:=n-1

    PATCH 2:
      Mapping ASM Rules:
        [a>0] / a:=0 || b:=1 || n:=n+a
      Modified State Definition:
        Remove a
      Modified Rules:
        [n=0] / b:=1 || n:=?
        [n>0] / b:=b*n || n:=n-1

Illustration 13: An example of two patches for the Fibonacci ASM described in Illustration 12. The first patch converts the ASM to an ASM which calculates factorials instead of Fibonacci numbers, and the second patch optimizes the factorial algorithm. In both patches, the mapping ASM is used to undo calculations that have already been made and enable the new algorithm to be run with the given input. This requires one single state transformation for patch 2, but multiple state transformations for patch 1. If input was remembered, patch 1 could also be mapped with one single state transformation.

One of the good things with state machines is that they make no concrete difference between data and code. An ASM simplifies state machines by providing an alternative to code, namely fixed rules. Both the good and the bad thing with these rules is that they do not encode any execution structure (hence, what we would call code). This is good, because then we do not complicate updating or anything else for that matter, but less good, because we will then have one large set of rules which are active at all times. This is a serious information overload, resulting in inefficient code that is too complex to overview. It would be better if we could stick to restricted sets of changing rules. This leads us to the edge of separating code from data.

To better understand the concepts of rules, code and data, let us consider what defines code. Code is (usually) static, constant instructions that define the behavior of an application.
Code could be seen as location-specific rules that are structured according to sequential execution, embed location and prevent conflicting rules. An ASM can simulate code execution in two ways:

1) By including code and a PC in the state definition, declaring a rule for each machine instruction and at each step executing the instruction pointed to by the PC. This is a direct simulation of how a computer executes code (and was already mentioned earlier, in the introduction to state machines). It results in a small set of rules, but an unnecessarily large state, of which part is treated as read-only (and therefore does not even fit into the definition of state).

2) By including a PC in the state definition, and using the PC in all guardians. This means that we need rules for each and every line of code. To reduce the number of
distinguishable states, all consecutive code that can be executed in parallel is grouped together under one single guardian.

We do not want to settle for either one of the code simulations described above. But we will keep in mind the second alternative of grouping together parallel statements (that will in practice be executed sequentially). One way to avoid traditional code and still reduce the number of rules in an ASM would be to build up applications from several small ASMs, and at appropriate times switch between these. This approach closely resembles approaches that use sub-modules, and the state definition could either be common to all ASMs or be extended to include local data.

4.1.4 State Machines vs. Imperative Languages

Let us have a look at how the Fibonacci example could be implemented in an imperative language. Illustration 14 shows both a high level and a corresponding low level implementation for all three versions of the Fibonacci example program. Changes between versions are written in bold face (except removed lines, which are not visualized), and from that, we can see that only very small fragments of the code change. But looking at the low level code, we note that line numbers change more frequently. Where the ASM only had to manipulate the variables when applying an update, an imperative approach must also take the PC into account. This of course means that every line of code in the old version must be mapped to some line in the new version.

Illustration 14: High level and low level implementations of all three versions of the Fibonacci example program.

    PATCH 1:
      MACRO undo(): while a>1 do tmp:=a; a:=b-a; b:=tmp; n:=n+1 end
      PC 08-09  Mapping: undo()                                New PC: 08
      PC 10     Mapping: a:=tmp; undo()                        New PC: 08
      PC 11     Mapping: n:=n-1; if n>0 then undo(); fi        New PC: 07
      PC 12     Mapping: if n>0 then undo(); fi                New PC: 07

    PATCH 2 (REMOVE a):
      PC 02-07  (no mapping)                                   New PC: PC-1
      PC 08     Mapping: n:=n+a; b:=1                          New PC: 07
      PC 09     Mapping: n:=n+a-1; b:=1                        New PC: 07
      PC 10     Mapping: n:=n-1; if n>0 then n:=n+a; b:=1; fi  New PC: 06
      PC 11     Mapping: if n>0 then n:=n+a; b:=1; fi          New PC: 06
Illustration 15: The patches for the pseudo code of the Fibonacci example. The PC mappings specify how the old version is transformed to the new version depending on what line of low level code would be executed next if the old version was kept running. Specifying these mappings is an error-prone task. It also requires a lot of work, although simplifications are used. These simplifications include using a macro for duplicate code in patch 1, specifying the same mapping for multiple lines, and referencing the old PC value.
But this is not the only implication. The mappings that we provide will also be dependent on the PC. Since our atomic units are now statements and not groups of multiple state transformations, we will also have to provide mappings for intermediate calculations. As can be seen from the patch descriptions in Illustration 15, the patches are larger and more complicated than the patches written for the ASM. It is also quite obvious that for slightly more realistic applications, the risk of producing incorrect mappings is imminent.

Compared to the ASM, the imperative programs are longer and more complicated, but better represent the sequential algorithms that they implement. The thread of control is clearly visible, but this is as much a curse as it is a blessing. The exact thread of control fixes the structure based on execution instead of updatability. It forces any reasoning about the application to be made in terms of this structure, and complicates any task that does not benefit from such a structure. Whereas any patch can be applied to an ASM by first having it execute a mapping ASM and then replacing its rules and its state definition, potentially every line of code might require a different mapping when updating an imperative application. This makes updating fragmented and more complex, simply because the enforced division of mappings serves a purpose other than easy updatability (i.e. structured, sequential execution). Additional structure, such as separated local and global variables, the stack, registers etc., further complicates updating. These have been left out in the example, but must be taken into account in real applications. Mappings for data that is accessible from multiple locations (such as global variables and pointer targets) might conflict with mappings that are performed somewhere else but target the same data. Temporary storage in the stack and the registers must also be accounted for in the mappings. This can be a problem in compiled languages, because different compilers or slightly modified code can produce
widely different compiled code that can be hard to relate to each other. In these cases, it might not help to have both the source code and the compiled code for both versions, because the clearly visible modifications made to the source code can be tricky to relate to the differences in the compiled code.
4.2 The Updating Aspect

The problem we are facing - software having poor support for dynamic updating - could be described more generally in the context of adaptive and aspect-oriented programming.

4.2.1 Aspect-Oriented Programming

Aspect-oriented programming (AOP) [KLMMLLI97] is a direct reaction to conflicts of interest in software development. Software is designed and expressed in terms of functionality, but functionality might not be the only criterion that an application must fulfill. Aspects such as efficiency, optimization of memory usage, efficient network bandwidth usage, efficient distributed computing, real-time constraints, synchronization, and in our case updatability, can all be vitally important as well. The first two aspects can - to some degree - be accounted for by a sufficiently advanced optimizing compiler, but the other aspects would traditionally have to be dealt with indirectly, nested inside the rest of the (functional) code.

There are two reasons why multiple aspects might cause conflicts of interest. Firstly, different views or models are suited for different aspects. For example, subroutines, classes etc. express functionality, and not (for example) real-time constraints. In spite of this, all aspects are described using the same (functional) view. Secondly, the aspects are typically independent of - but still tightly bound to - each other. For example, if the memory or bandwidth usage changes, then the functionality will also change. This is called cross-cutting. If we modeled different aspects in their own, more suitable views, we would get a clearer overview of those aspects. We would have full control over them, and be able to both inspect and modify them as we please. In the Fibonacci example, patch 1 was more difficult to apply than patch 2 because of something that could be seen as a form of cross-cutting. The grouping of distinguishable states was incompatible between versions 1.0 and 1.1 of the Fibonacci program, because not only the implementation, but also the algorithm changed. Generally, aspects that cross-cut - and thereby conflict - complicate matters more than the individual aspects would do together.

In practice, an AOP application consists of a component program, one or more aspect programs and an aspect weaver. One or more (arbitrary) programming languages can be used to write the individual programs. The component program implements components and hence defines the functionality, while the aspect programs each model one single aspect. The aspect weaver is used to transform and combine all programs into one single executable, either at compile time (CT weaving) or at runtime (RT weaving). [KLMMLLI97] Development using AOP is in practice nothing fancy, and only involves transforming programs based on rules defined by various aspects. Hence, a compiler that optimizes an application for efficiency or memory usage - as mentioned previously - is itself an aspect weaver.
Typical implementations of AOP [AspectJ] provide join points as locations where transformations can be made. Typical join points are subroutine calls and data access. Point cuts select join points based on some criteria, such as pattern matching on the names of called subroutines, active subroutines, control flow and more. At each join point selected by some point cut, advice is inserted before, after or around the join point code. This advice is simply code that can do some pre- or post-processing, alter arguments, and so on. In addition to this functionality, the class hierarchy can also be altered somewhat.
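For a flavor of what advice at a call join point does, the following is not AspectJ but a plain-Java approximation using dynamic proxies - Service, Advice and before are invented names:

    import java.lang.reflect.InvocationHandler;
    import java.lang.reflect.Proxy;

    interface Service { String process(String input); }

    class Advice {
        // Wraps target so that advice runs before every call through iface.
        @SuppressWarnings("unchecked")
        static <T> T before(T target, Class<T> iface, Runnable advice) {
            InvocationHandler h = (proxy, method, args) -> {
                advice.run();                         // the "before" advice
                return method.invoke(target, args);   // proceed with the original call
            };
            return (T) Proxy.newProxyInstance(
                    iface.getClassLoader(), new Class<?>[] { iface }, h);
        }
    }

    // Usage:
    //   Service s = Advice.before(realService, Service.class,
    //                             () -> System.out.println("before process()"));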
The idea of modeling aspects separately is intriguing, but difficult to realize well in practice. Implementations such as AspectJ and AspectC++ allow code to be automatically inserted at various locations, but are still bound to defining aspects in a functional context. They are also too simple and neither expressive nor general enough. In theory, modeling an updatability aspect in AOP would be the solution to our problem, but in practice, this is easier said than done - especially since section 2.1.1, "Requirements" declared that updatability requires flexibility, robustness, efficiency, ease of use and deployability...

4.2.2 Adaptive Programming

Adaptive programming (AP) is a refinement or specialization of object-oriented programming. It tries to improve support for software evolution by raising the abstraction level of cross-object communication, and can be seen as taking an adaptiveness aspect into account in AOP. Hence, it can be combined with other object-oriented development methods. An application can be made adaptable using localization and information hiding, and hence tackles the problems discussed in section 4.1.1, "What is Wrong?". Adaptiveness is achieved by following the Law of Demeter (LoD, Glossary 18). The most straightforward approach to doing so is simply restricting data access to the class that defines the data, and requiring classes to access data from other classes indirectly through methods. This enables internal implementations to be easily replaced, but also restricts usage relations to one single level of depth. Hence, a class only has knowledge of its superclasses (but not even their internal data) and the classes it directly uses (being its closest neighbors). If the object model changes, then the change is local and only affects nearby classes. Another approach to following the LoD is using AP and the Demeter Method [Lie96]. In this approach, a class dictionary specifies a partial class structure, or rather rules for class structures. Infinitely many class structures can satisfy each class dictionary, and a customization is used to specify one such concrete structure. Communication between classes is specified in terms of the class dictionary, so the customization can be modified freely as long as it satisfies the class dictionary. Propagation patterns, a.k.a. traversals, enable this abstraction. They are the building blocks of class dictionaries, and they specify loosely how classes reach other classes. Instead of specifying exactly how one class reaches another, we use traversal directives like "from class A to B" or "from class A to B, via C". Such directives usually make sense, because most of the time, we intuitively know that the classes are connected "somehow". We know that a body can access a hand, a company an employee etc., but we can implement these relations in many different ways, using various sets of collaboration classes, different class hierarchies, and so on.

Updatability is closely related to adaptability. If the adaptability of an application is improved, then so is the updatability. However, the main concern of adaptability is
static modification and evolution of applications. Adaptability most certainly also affects dynamic updating, but updatability is a larger concept.

As we go on to further simplify and specialize abstract state machines, we have the choice of separating code from data or not. Simplifications could be made in both cases. If we decided not to separate code from data, we could for example define a system with a main control flow (a brain) that stimulates small abstract state machines (cells) with a blurred separation of code and data to change state and perform certain actions. We could potentially provide quite good support for replacing and restructuring, perhaps allow interleaved parallel execution between cells, and so on. However, the abstraction level of such an approach would be high, and the link to low level primitives very vague. Worse yet, writing applications in such a language would be radically different from what developers are accustomed to, and algorithms might be more difficult to formulate. Therefore, we decide to separate code from data instead, and that is what we will do in the next two sections.
4.3 The Sequence Model

If we are to update a state machine, we can let the developer provide mappings for a subset of all states present in the old version but not in the new one. These mappings are trivial, and of the form "map state A to state B". When an update request is made, we postpone updating until the current state is updatable. An updatable state is one that either exists in the new version, or has a mapping defined. Writing an update patch and performing updating are both trivial (but cumbersome) tasks.

A state machine is impractical, because state is an unorganized concept with a virtually infinite number of combinations. Separating code from data simplifies matters, as does introducing new concepts such as subroutines and stacks. All such simplifications group together large numbers of states according to functionality, which is the most important aspect, since software is designed in terms of functionality. The act of updating is bound to become more complex when we group states in this way, and as we will see, we end up with almost the same concepts as those that are used today. However, keeping in mind what we have learned so far, we will view them from a slightly different perspective. Updating can be kept at least similar to updating state machines if a) non-modified code can be mapped with ease, b) modified code can be fairly easily mapped, and c) data can be converted separately from code. This section will focus on the first two requirements.

4.3.1 Control Flow Graphs

A control flow graph (CFG) is a classic visualization of imperative code, or of algorithms in general. A CFG is simple and easy to understand, but sufficiently low level to only make assumptions about algorithms, and not about high level concepts. A CFG consists of a number of computation and selection blocks, executed in a well defined, sequential order. Subsets of CFGs can further be grouped into black boxes that hide the internal control flow and thus raise the abstraction level. CFGs closely resemble state machines with data removed from the state. While a state machine selects what state to go to next based on some input, and optionally writes some output, a CFG selects what block to go to next based on some data, and optionally modifies some other data. In other words, a CFG could be seen as a state machine with data accessed as I/O. The only thing we lose by using CFGs instead of state machines is the direct link between code and data. After
applying an update patch, the data can potentially be such that it could not have been created by the new CFG. In practice, this is not a problem if the update patch is safe and well formed, and our definition of validity (Definition 6) actually allows this kind of mismatch.

The discussion above applies to CFGs with one-level scope (always having exactly one execution environment). If subroutines are not allowed, then the CFG has a one-level scope, and if they are, then the scope of the CFG can be made one-level by inlining the subroutines. (Inlining recursive subroutines can result in infinite CFGs, but this does not pose any theoretical problem.)

Illustration 16: A CFG contains two types of blocks - one for computation and one for selection. The selection blocks are used to simulate if-, switch-, while- and for-statements.

The shortcoming of one-level scopes is that they prevent code reuse and basically only allow loops to be defined. But we need subroutines to avoid duplicate code altogether. And subroutines are simply too practical for us not to support them. So let us consider what makes CFGs with multi-level scope more difficult to update than CFGs with one-level scope. Execution starts in the main scope, and goes down one level every time a subroutine is called or a "begin-end" code block is entered. In one sense, this aids updating by keeping modifications localized. But the problem is that these scopes overlap, and we get a whole hierarchy of active scopes. We note that a) how to update is no longer single-handedly defined by the current execution location, and b) we cannot update if some super-scope has also been modified. The first problem is unavoidable, since it is a direct result of code reuse - code can be called from multiple locations - and we will just have to deal with it somehow. The second problem can be avoided if the overlapping scopes are split so that exactly one scope is always active. This does not mean that we use a standard one-level scope, because we can still have multiple subroutines active. But the scope would end when a subroutine is called, and the next one start when returning from it. This is the initial idea behind the Sequence Model that we are about to describe.

4.3.2 Basic Concepts

We find that the grouping of code into subroutines (a.k.a. functions, procedures and methods), as is done in traditional imperative programming languages, is not satisfying. One motivation is the well known fact that subroutines ought to be short, but still always end up being long and complex. Quoting Koopman in [Koo89]: "If a procedure contains more than seven distinct operations, it should be broken apart .../... the human mind can only grasp two or three levels of nesting of ideas within a single context. This strongly suggests that deeply nested loops and conditional structures should be
arranged as nested procedure calls..."

Illustration 17: The dilemma with multi-level scopes. The main scope is at level 1, scopes B and F are at level 3, and so on. Modifications to F are local and only affect F itself. But modifications to A affect B, C and D and are thus poorly localized.

Instead of subroutines, we decide to use sequences of code. A sequence closely resembles a subroutine, but should be seen as a slightly more general and light-weight unit. We know that code must be executed sequentially, and also that subsets of CFGs can be grouped together. We notice that the main benefit of grouping code together is code re-usage, but the multi-level scopes resulting from calling re-usable code degrade updatability.

We now define a sequence to be any collection of sequential code, having one single entry point and multiple exit points, and for which every instruction is executed at most once (Definition 4). According to this definition, sequences are not allowed to call other sequences, and loops are not considered to be sequences. However, loop bodies, single statements, code blocks etc. can be treated as sequences. We remember that CFGs contain two types of blocks - computational and selective. We could for example group together consecutive computational blocks and consider them to be one single sequence. If we wanted to, we could also group together all blocks that are related to a selective block (such as a loop body or an "if-else" code block). To enable code reuse, we introduce the concept of sequence chains (or simply chains) and define these to be chains of sequences and sequence chains (Definition 5). By defining sequences and sequence chains as different concepts, we support code reuse through sequence chains, but still only allow one-level scopes. Much like subroutines, multiple sequence chains may be simultaneously active, but unlike subroutines, sequence chains are more of an abstract concept, with the non-overlapping sequences being the base unit in our sequence model. This difference might seem like a trifle, but in fact represents a significantly different way of thinking. Illustration 18 tries to point out the important distinction between subroutines (or code blocks in general) and sequences. Sequence chains can also be seen as a generalization of code reuse. Arbitrarily small sequences can be reused (in contrast to only being able to reuse subroutines), and code reuse in loops is not distinguished from code reuse implemented through recursion.

In practice, we might still want to restrict the visibility scope of a sequence to one single sequence chain. Readability can be improved and name clashes avoided by restricting the scope of visibility to modules, hierarchies of sequence chains, and so on. One such example would be to allow chains to be declared inside other chains, as supported by e.g. Pascal. This is only syntactic sugar and does not affect the updatability. By viewing code through the concept of sequences rather than subroutines, we can better reason about update timing. Updating is possible between any two consecutive sequences. As mentioned earlier, having a modified subroutine lower down in the call stack is a problem, but we solve it by instead using sequences and sequence chains,
which makes sequence chains independently updatable. Updating need not be delayed any longer than until the currently executing chain is updatable. When the current chain becomes updatable, we simply perform updating. When we return from a chain, the caller is always in an updatable state. After having begun updating, the old version will never again be needed, and code updating can either be performed all at once, or incrementally when returning from chains. This is similar to incremental global update for data structures.

For this to be possible, we require that every chain call which is made from a modified chain is considered to be an update point. Every update point must then specify a mapping from the old version to the new one. Several update points can often share the same mapping, and many mappings can be trivial or even empty. Sometimes, an active subroutine might no longer exist in the new version, and whenever that happens, instead of performing update mappings, we must perform discard mappings and unwind the call stack. A subset of our update points would hence also require discard mappings, but these could most of the time be left empty. We only allow iteration through sequence chain calls, so infinite or long-lived loops can naturally and quickly be updated. We may further require that tasks that can halt execution - such as listening to a socket, reading input or sleeping - are also considered to be update points if they are located inside a modified sequence chain. The penalty of requiring so many update points is of course that the developer would have to provide quite a few update point mappings. But many of these would be trivial, and the task could be largely automated. The overall benefit is that we have actually managed to make update timing and synchronization a triviality. And we have also provided a partial solution for the difficult problem of updating active subroutines.

The observant reader should realize that sequences can largely - especially in low level code - be simulated using subroutines. The subroutines only need to be split into as small parts as necessary for every part to conform to the definition of a sequence. In spite of this similarity, the concepts of sequence and sequence chain are novel, and force subroutines to be viewed from a non-traditional angle.

Illustration 18: Subroutines vs. sequences. Subroutines are the base unit in imperative code, but the corresponding sequence model primitive - sequence chain - is more like an abstract concept. Instead, the sequential control flow - sequence - is the key unit, and its elements - chain blocks - are the concrete units of code.
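To make the simulation point above concrete, here is a minimal Java sketch - all names invented for illustration, not part of the sequence model proper - in which each sequence becomes a tiny straight-line method, and looping happens only by repeatedly calling a chain:

    // Hypothetical sketch: simulating sequences with minimal subroutines.
    // Each method body is a sequence: one entry point, straight-line code,
    // and every instruction executed at most once per call.
    final class FibChains {
        int a, b, n;                  // data shared by the chains

        void prepare(int input) {     // sequence: initialization
            a = 0;
            b = 1;
            n = input;
        }

        boolean calc() {              // sequence: one loop iteration
            int tmp = a;
            a = b;
            b = b + tmp;
            n = n - 1;
            return n > 0;             // "continue" vs. "break" end point
        }

        int run(int input) {          // chain: loops the calc sequence
            prepare(input);
            while (calc()) { }        // each iteration boundary is a potential update point
            return a;
        }
    }

Every boundary between two method calls is then a place where an update could, in principle, be applied.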
4.3.3 Avoiding Recursion
As suggested in section 3.3.3, "Converting Call Stacks that Contain Modified Subroutines", recursion can seriously degrade updatability. In a scenario where an update patch is applied when 10,000 instances of a modified power function are on top of the stack, we would have to perform mappings for all those 10,000 instances. Many recursive calls are short-lived, and one could argue that updating could be delayed until the recursion finishes. But this breaks the requirement for how update points must be selected, and unexpected delays might prevent updating. If we could avoid recursion altogether, we would have both straightforward control flow (iteration is allowed only through chain calls) and straightforward call flow. This would make active subroutines more manageable. In addition, avoiding recursion also has the potential of saving memory, but this obviously depends on how much overhead memory is spent on recursion removal.

Modern compilers support a primitive form of recursion removal through tail-recursive optimization (TRO) [BW01] [Cli98]. Such optimization transforms recursion to normal iteration, which is essentially what any recursion removal approach must do. However, this optimization only applies to tail-recursive subroutines, which are subroutines for which the (only) recursive call is the last instruction. Hence, a power function returning x * power(x, y-1) could be optimized, but the same is not true for one returning multiply(x, power(x, y-1)) and neither for one returning power(x, y/2) * power(x, (y+1)/2).
define p(x, y)
    if y = 0 then return 1
    elif y = 1 then return x
    else return mp(x, y/2, x, (y+1)/2)
end

define mp(x1, y1, x2, y2)
    return p(x1, y1) * p(x2, y2)
end

(In the figure, these definitions are surrounded by traces of the call stack and of the recursion stack for the call p(5, 6).)
Illustration 19: An example of indirect recursion. Subroutine p is an abbreviation of power and mp an abbreviation of multiply power. In this example, p(5, 6) is called in order to calculate 5^6. Using normal recursion, the columns represent the call stack. It contains at most seven stack frames, and e.g. four when executing mp(5, 1, 5, 2). Grayed arrows show where execution will continue after stack unwinding. Using recursive returns, the call stack never contains more than one stack frame, and subroutines are called in the order denoted by the solid arrows. In this scenario, the grayed text above each subroutine presents the content of the recursion stack before calling the subroutine. The recursion stack never contains more than one occurrence of ret, which represents the return value that is to be calculated. When all subroutines have been called, the recursion serving chain will have calculated the value 15625, which is then returned to the caller of p(5, 6). This illustration reveals that recursive returns support indirect recursion as long as every return instruction that is involved in recursion is treated as a recursive return.
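The evaluation traced in the illustration can be mimicked in plain Java with an explicit work stack in place of call-stack recursion; a minimal sketch (hypothetical code, not the actual recursion serving mechanism):

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Hypothetical sketch of the recursion stack idea from Illustration 19:
    // p(x, y) is evaluated with a loop and an explicit stack of postponed
    // sub-calls, so the call stack never grows with the recursion depth.
    final class RecursionStackDemo {
        static long p(long x, long y) {
            Deque<long[]> pending = new ArrayDeque<>();   // postponed p(x, y) calls
            long result = 1;                              // accumulated product
            pending.push(new long[] { x, y });
            while (!pending.isEmpty()) {
                long[] call = pending.pop();
                long cx = call[0], cy = call[1];
                if (cy == 1) {
                    result *= cx;                         // base case contributes a factor
                } else if (cy > 1) {                      // mp: postpone two sub-calls
                    pending.push(new long[] { cx, cy / 2 });
                    pending.push(new long[] { cx, (cy + 1) / 2 });
                }                                         // cy == 0 contributes factor 1
            }
            return result;                                // p(5, 6) == 15625
        }
    }

The general technique behind this is described next.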
A general solution to recursion removal is to use a recursion stack to record "postponed" operations. This solution, discussed in [WB99], pushes the first recursive call encountered and all following instructions to a local stack. Local variables used in these instructions are substituted with fixed values. The subroutine is then exited, and another subroutine is immediately called. The called subroutine pops and executes instructions - one by one - from the local stack. The first instruction will be a call to the recursive subroutine, and that call might push more instructions to the local stack. When there are no instructions left on the recursion stack, the helper subroutine ends, and execution continues in the caller of the recursive subroutine. This approach to recursion removal can transform any recursive algorithm to an iterative one, but will waste an amount of memory relative to the number of postponed operations. In order to be practical, the subroutine should therefore finish shortly after making the first recursive call. This is often the case, since recursive subroutines commonly return shortly after calling themselves one or more times.

As a conclusion, we suggest that recursive returns should be used. This novel concept is in practice a return instruction that is followed by a fixed sequence of linear (no branching is allowed) instructions. When executing a recursive return, a normal return instruction followed by these instructions is first pushed onto a thread-local recursion stack. The chain then "returns", but immediately calls a recursion serving chain. This chain pops and executes all instructions on its recursion stack. During this execution, the original chain might be called again and then push new instructions to the recursion stack. When the recursion stack is finally empty, the recursion serving chain returns with the same return value and side effects as the original recursive chain would have produced. As shown in chapter 5, "The Updatable Virtual Architecture" and [Öst03], this can be implemented much more efficiently than by pushing every single instruction to the stack.

The concept of recursive returns is similar to the concept of continuations [HD90]. Mainly functional programming languages - Scheme, SML and more - support continuations, but these will not be further discussed here. Indirect recursion - as the result of, for example, chain a calling b calling a - can be implemented simply by using recursive returns in all the chains that participate in the recursion. The chain calls must then share the same recursion stack, but this will of course be the case for single-threaded recursion.

4.3.4 Building Blocks
An implementation of the sequence model would require a set of supported chain blocks, much like instructions in imperative programming languages. Most of these would be sequence blocks, containing one single sequence. Some would be call blocks, calling another sequence chain, and possibly performing some pre- or post-action. Some would be end-point blocks, specifying the end of a sequence chain. Every block would have one single entry point, but might have multiple exit points (just like sequences). In the following, we will list a set of basic chain blocks, typically needed by any implementation.

A computational sequence block encapsulates one or more computational instructions. These include mathematical and logical operations, setting the value of variables etc. This block has one single exit point.
A selection sequence block contains one single selection instruction. Selection instructions correspond to conditional jumps in traditional low level languages and "if" and "switch" statements in traditional high level languages. This block has two or more exit points, of which one is the "default" exit point.
The continue end-point block is used to return "with success" from a sequence chain. In case the chain was called as a loop, the loop would then continue. Has no exit point.

The break end-point block is used to return "with failure" from a sequence chain. In case the chain was called as a loop, the loop would then be broken and execution continue after the loop call. Has no exit point.

A standard call block calls another sequence chain. It has one exit point.

A loop call block implements recursion-less looping of any sequence chain. It evaluates some condition before or after having called the chain, and calls the chain again as long as the condition is evaluated to true. Has one exit point.

Illustration 20: A sample of what the chain blocks could look like. The arrows represent entry and exit points. Grayed, dashed exit points represent optionally defined exit points, such as if a call returns with failure or if an exception is thrown. Multiple such exit points can typically be defined. A call block with a failure exit point is the same as a test call block.

In addition to the basic chain blocks listed above, more specialized blocks could be supported. The purpose of the specialized blocks is either to provide a more user-friendly programming interface or to force sequences to end before executing some special code. As discussed previously, a sequence must end before calling a sequence chain, and preferably also before performing tasks that might block or delay execution, such as accessing I/O.

A parallel sequence block is similar to a computational sequence block, but behaves as if every instruction was executed in parallel. A compiler can add block-specific temporary variables to enable this seemingly parallel behavior. It is useful e.g. for swapping the value of two variables, and closely resembles state transition in a state machine. Has one single exit point.

A test call block is similar to a standard call block, but has two exit points. If the called sequence chain returned using a continue end-point block, then the "success" exit point is followed, otherwise execution follows the "failure" exit point.

A recursive return end-point block would implement the recursive return concept discussed in section 4.3.3, "Avoiding Recursion". A recursive return block is followed by one short branch containing both recursive calls and normal instructions. The code in this branch is to be pushed onto the recursion stack, and then iteratively popped by a general recursion serving chain.

An I/O call block is used to read input, write output, or trigger an I/O service. It is similar to native method invocation in Java, and has one or two exit points. If two exit points are used, then one of them defines the behavior on error or time-out.

A communication sequence block adds built-in support for distributed computing. It could implement multiparty interaction [EFK89], in which each process participating in the named interaction specifies a set of variables that it will reveal to the other processes. All processes are first synchronized, and then execute their own
interaction code, modifying variables based on the values of the variables revealed by the other processes. This block has one or two exit points, and if two exit points are used, one of them defines the behavior on error or time-out.

An allocation call block is used for instantiating new objects in an object-oriented implementation and, more generally, for allocating memory. In the case of instantiating objects, the class constructor will be invoked after allocation. This block has one single exit point.

Each chain block can have temporary variables that are allocated on the stack when entering the block, and removed upon exiting. Each block can also take input and produce some output. If desired, each block could be a completely independent black box - much like a filter in a pipe-and-filter software architecture [SG96]. Doing so would satisfy the aspiration for localization well, but is not very well suited for describing algorithms. Many blocks are likely to need access to the same common data in a mixed order. Allowing local data to be associated with sequence chains is perhaps the most reasonable solution, but when allowing that, we once again take a step towards existing concepts. However, this should not make updating any more difficult in practice, and by defining loop bodies in separate sequence chains, we still get finer grained localization than when using traditional subroutines.

A potentially powerful additional feature would be to define sequences to be atomic. This would mean that once a sequence has started executing, it will not be interrupted. Some exceptions to this rule could be allowed, but the bottom line is that context switching between threads is only performed when no thread executes intermediate calculations. This would provide a built-in, easy to use mutual exclusion facility and still avoid deadlocks. As a side note, update synchronization would be even easier if context switching was only allowed at update points. The reason that all of this is possible with sequences and none of it with subroutines is that sequences only perform quick and simple tasks and can never block.

In practice, context switching is commonly implemented using external interrupts (signals). These can typically be disabled and enabled at will, using instructions such as CLI (CLear Interrupt flag) and STI (SeT Interrupt flag) [KK94]. Disabled interrupt requests are not discarded but postponed, so these very fast instructions are exactly what we need. We can minimize the imposed overhead and make update synchronization automatic simply by allowing context switching only at potential update points (such as chain calls, I/O requests etc.). Such a solution can even be trivially applied to traditional subroutines. There might be some pitfalls, such as making sure that non-context switching interrupts and runtime exceptions (referencing null pointers, indexing outside of array boundaries etc.) are handled correctly. But all these issues should be manageable.

Letting multiple versions co-exist is generally complex and heavy-weight. Using sequences, update timing is trivial in single-threaded applications, so global update is therefore favored. Synchronizing processes in multi-threaded applications and distributed systems poses no threat, because each process can be made updatable very quickly, and updating is not delayed for long. IPC and RPC instructions are compulsory update points just like chain calls, so processes cannot communicate with other processes before becoming updatable.
But threads can share information more freely, and if the application has critical sections, update synchronization between threads could lead to deadlocks. If mutual exclusion is restricted to atomic sequences, as described above, deadlocks are completely avoided while both distributed and multithreaded applications are still well supported.
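A minimal single-process sketch (invented names, not the sequence model's actual runtime) of the idea that context switches and update synchronization can share the same points:

    // Hypothetical sketch: threads may only yield - and only begin updating -
    // at potential update points such as chain calls and I/O requests.
    final class ChainRuntime {
        private static final Object LOCK = new Object();
        private static boolean patchPending = false;

        // Called by every thread at each chain call or I/O request.
        static void updatePoint() throws InterruptedException {
            synchronized (LOCK) {
                while (patchPending) {
                    LOCK.wait();          // the thread is updatable; park it here
                }
            }
            Thread.yield();               // the only place a context switch may occur
        }

        // The updating system brackets the patch with these two calls.
        static void beginUpdate() {
            synchronized (LOCK) { patchPending = true; }
            // (a real runtime would now wait until every thread is parked in updatePoint)
        }

        static void endUpdate() {
            synchronized (LOCK) {
                patchPending = false;
                LOCK.notifyAll();         // resume all parked threads
            }
        }
    }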
If desired, we can also guarantee that computational sequences are treated as atomic transactions. This means that the sequences either succeed or have no effect, i.e. do not mess things up. This functionality can be provided by, at compile time, adding temporary variables and rearranging code in such a way that only temporary variables are allowed to be modified until all code that can throw exceptions has been executed. This could slightly improve application robustness, but perhaps not enough to justify the overhead (which would not be large on average, but could be substantial for some sequences).

There are a few optional features that can aid both the developer and an automated tool in writing mappings for update points. A trace stack could be used in such a way that every branch would push an ID for the selected path. A loop call would push the loop count, and hence update this value every iteration. This does, of course, waste some memory, but not much, since sequence chains are typically short, and traces for iteration and recursion are not remembered. Compared to the overall memory usage and memory availability in modern applications, this stack memory consumption can usually be neglected. For this minimal overhead, we get the exact control flow of every active sequence chain, and this information can then be used to advantage when writing mappings.

To be able to provide fully correct mappings, we would also need to know the values of variables before they are overwritten. Local variables have statically defined initial values, and are thus trivially handled. An initial value pool with a statically calculated size can be associated with each chain call. The initial values of all arguments and global variables that might be modified are stored in this pool, as are all return values (from called chains and I/O) that might be lost (by overwriting the variables that originally stored the return values). Together with the trace stack, the initial value pool would enable exact update mappings. It would also greatly improve the quality of mappings suggested by some automated tool.

Both the trace stack and the initial value pool can be implemented efficiently because no instruction can ever be executed more than once (unlike in traditional programming languages). If recursion is only allowed through recursive return blocks, the stack overhead will truly be negligible, and the performance overhead is the only reason not to use these two features. As becomes clear in chapter 5, "The Updatable Virtual Architecture" and [Öst03], the overhead for an almost complete implementation can be nearly insignificant.
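Before moving on, a minimal sketch (invented names) of the bookkeeping the trace stack and initial value pool would require for one chain call:

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical sketch of per-chain-call bookkeeping: the trace stack
    // records the control flow taken, and the initial value pool preserves
    // values that an update mapping might otherwise find overwritten.
    final class ChainCallRecord {
        private final Deque<Integer> traceStack = new ArrayDeque<>();
        private final Map<String, Object> initialValuePool = new HashMap<>();

        void branchTaken(int pathId)  { traceStack.push(pathId); }  // selection block
        void loopEntered()            { traceStack.push(0); }       // loop call block
        void loopIterated()           { traceStack.push(traceStack.pop() + 1); }

        // Store a value the first time it is about to be overwritten or lost.
        void recordInitial(String key, Object value) {
            initialValuePool.putIfAbsent(key, value);
        }

        Object initialValue(String key) { return initialValuePool.get(key); }  // ivp(x)
    }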
4.3.5 Performing Updates
We have already, on several occasions, pointed out that the sequence model is closely related to the traditional subroutine model and that it can largely be simulated by it. But we should also point out that it has many similarities with the second alternative to simulating code in an ASM, as described in section 4.1.3, "Reducing Complexity". (Have a look at Illustration 22 for an example of sequence model code.) Having said that, we go on to discussing how sequence chains can be updated.

The mapping order is of importance when the call stack contains modified chains below the top-most chain. The mapping order can either be bottom-up or top-down. Bottom-up (or replay) mapping is exact and capable of producing results that are identical to having executed the new version in the first place. It requires that all chains that are higher up in the call stack are replayed, and would be the same as restarting active subroutines (see section 3.3.2, "Active Subroutines") if only the first mapping had also been a replay. The problems with this approach are the same as with the approach of restarting active subroutines.
Illustration 21: Mapping order matters. Sequence chains B and D have been modified and must thus be updated. Chain E is on the top of the stack when performing the update. Arrows 1.1 through 1.4 show the mappings (and replays) required for bottom-up order mapping, and arrows 2.1 and 2.2 the mappings needed for top-down order mapping.
Top-down mapping is more inexact than bottom-up mapping and can produce results that are not the same as if the new version had been run all along. As argued when discussing validity (section 2.2.1, "Validity"), this is not necessarily a problem, and in the rare cases when it is, the developer could decide how to perform mappings based on the current call stack. In order to fulfill the requirement that we must be able to map non-modified code with ease (section 4.3, "The Sequence Model"), bottom-up mapping is not an option.

There is a tight connection between updating sequences and implementing an updatability aspect in AOP. Every transition from one sequence to the next is a join point. A well-defined pointcut selects some of these join points to be update points. The mapping associated with an update point is the advice for that pointcut. As a conclusion, we realize that our updating approach is essentially dynamic AOP.

A thread is updatable if a) the currently executing sequence has not been modified, b) an update point has been reached, or c) the thread is suspended (sleeping, waiting for I/O etc.) at an update point.

Applying an update patch involves the following steps:
1) An update patch is uploaded to the updating system on one of the network nodes running the application.
2) The updating system distributes the patch to all nodes and processes that are participating in distributed computing.
3) Distributing the patch might be time consuming, so an initial synchronization is made, with the purpose of informing all processes when every process has received the whole patch.
4) Until now, the application has executed without halting, but perhaps a bit slower, since the updating system has executed in parallel. But from this point on, every thread is blocked whenever it becomes updatable, and when all threads in a process are updatable, that process informs the others that it is ready to commence updating.
5) When all processes have informed that they are ready to start updating, we have performed a second synchronization. This synchronization is fast, because threads become updatable very quickly. Now each process applies the update patch, reloads code, updates references and performs mappings. At this time, the user might detect a small interruption in application execution, but the interruption can be minimized by moving reloading etc. to step 2 and, if needed, using incremental instead of instantaneous global update.
6) When a process has finished updating, it informs the other processes that it is ready to start executing again.
7) When all processes are ready to continue execution, we have performed a third synchronization, and can continue executing the application in all processes.
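The three synchronizations can be pictured as barriers; a compressed sketch in Java, with threads standing in for the distributed processes and the actual node-to-node messaging left out:

    import java.util.concurrent.CyclicBarrier;

    // Hypothetical sketch of the update protocol's three synchronizations.
    final class UpdateProtocol {
        private final CyclicBarrier patchReceived;   // sync 1: everyone holds the whole patch
        private final CyclicBarrier readyToUpdate;   // sync 2: all threads blocked, updatable
        private final CyclicBarrier updateApplied;   // sync 3: everyone finished updating

        UpdateProtocol(int parties) {
            patchReceived = new CyclicBarrier(parties);
            readyToUpdate = new CyclicBarrier(parties);
            updateApplied = new CyclicBarrier(parties);
        }

        // Run by each participating process (here: thread).
        void participate(Runnable applyPatchLocally) throws Exception {
            patchReceived.await();        // steps 1-3: patch distributed everywhere
            readyToUpdate.await();        // steps 4-5: block until all are updatable
            applyPatchLocally.run();      // reload code, update references, perform mappings
            updateApplied.await();        // steps 6-7: resume only when all are done
        }
    }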
[Illustration 22 (figure): the original Fibonacci program expressed as sequence chains - a main chain, a prepare chain (var: a, b, n) that sets a=0 and b=1 and reads n before looping chain calc, and a calc chain (use: a, b, n) computing a=b, b=b+a, n=n-1 - together with Patch 1 and Patch 2. Each patch lists its modified sequences, its compulsory update points ("loop calc") and its mappings, one of which uses ivp("read n") to recover the originally read value of n.]
Illustration 22: The original Fibonacci program, expressed in sequence model terms, along with patches for both updates made in the Fibonacci example (Illustration 12 and Illustration 13). In the syntax given here, chains are allowed to share variables. The special function ivp(x) returns the value associated with variable or instruction x in the initial value pool. At first glance, the benefit of using the sequence model might not be obvious, and the implementation and patches might appear complex. However, much is trivial and can be automated. In particular, the patches are much more straightforward than the pseudo code patches in Illustration 15. The big gain comes from updating being inherently supported. Updating is more local, controlled, flexible and safe. Patches are straightforward and complete.
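In such a setting, an update point mapping need be little more than a function from old-version state to new-version state; a hypothetical Java rendering, reusing the ChainCallRecord sketch from section 4.3.4:

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical sketch: a mapping transforms the interrupted chain's
    // old-version variables into its new-version variables.
    interface UpdateMapping {
        Map<String, Object> map(Map<String, Object> oldState, ChainCallRecord record);
    }

    final class Mappings {
        // In the spirit of Patch 2's mapping for chain calc: b restarts at 1,
        // and n absorbs a, which the new chain no longer uses.
        static final UpdateMapping CALC = (oldState, record) -> {
            Map<String, Object> newState = new HashMap<>(oldState);
            newState.put("b", 1);
            newState.put("n", (Integer) oldState.get("n") + (Integer) oldState.get("a"));
            newState.remove("a");
            return newState;
        };
    }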
4.3.6 Summary
To get a better understanding of how the sequence model relates to traditional imperative code, consider the example in section C.1, "Polygon Rendering". This example shows a sample implementation for coloring a polygon by, for every horizontal line, finding two points on the polygon border, and then drawing the line in between these two points. This is an efficient algorithm for filling polygons, and the example shows a Java implementation and a pseudo code sequence model implementation.

The main disadvantage with imperative code is tight coupling. Tiny modifications result in large subsets of affected code. Another disadvantage is that it is less controlled than sequence model code, and infinite loops etc. can slip by unnoticed, resulting in patches that
can never be applied. A third disadvantage is that version mappings and state transformations can be more difficult to perform correctly than in the sequence model. Minor modifications to imperative code can easily affect the compiled code in such a way that the line numbers of almost every other line change. If code is added or removed, then the line numbers of subsequent code will certainly change, as is the case if, for example, error checks were added to the polygon rendering example. In sequence model code, all such modifications would be local. While developing sequence model applications, we could trivially apply patches for e.g. adding or removing debugging code, and could hence debug and develop a running application - without losing the current state every time something needs to be tested.

Development practices often strive to split applications into modules and classes, the goal being high modularization and abstraction that hides away internal details and allows units to be replaced as plug-in black boxes. The sequence model largely does the same thing, i.e. enables the contents of subroutines to be more easily restructured and replaced. If imperative code is modified, then not only that code, but the whole subroutine is indirectly modified. This means that a modified unit will be active for the whole duration of that subroutine - which includes the duration of all called subroutines. Unless active subroutines are updatable, even a tiny modification would prevent updating as long as the surrounding subroutine is executing. In sequence model code, only directly affected, short-lived chain blocks would be modified.

A skeptic might argue that subroutines in themselves are units that are small enough, and that we only need to declare precisely enough update points to prevent active methods from infinitely postponing updating. And there is certainly some realism in such skepticism; in many cases, neglecting active subroutines is the simplest and smartest way to go about it. But doing so will not always do, since active subroutines must at times be interrupted and updated. And - as the skeptic might have suggested - adding update points only to suitable locations in those methods is a very unreliable activity. In imperative code, it can be difficult to get an overview of the call graph of subroutines and thus find every control flow that requires an update point. Remembering that our goal is to develop disciplined means for reliable and flexible updating, we are not too keen on accepting such a solution. It is not very unified, but rather seems like an ad-hoc way of supporting dynamic updating.

Using the sequence model, we also get superior update synchronization, which allows us to update large distributed systems, apply several update patches in sequence, jump between one version and another, and so on. If we do not need all this, we can go about it just as for imperative code and only define mappings where we consider them to be absolutely necessary. Even then, we will benefit from the sequence model and its inherent support for code modification and update point declaration. In short, we can dynamically update imperative code, but we need something like the sequence model in order to get a unified, truly flexible, reliable and highly controlled updating environment.
Updating is made easier in the environment we have described because:
– Non-modified chain blocks are not directly affected by modifications in other blocks, unlike unmodified instructions in traditional imperative languages, for which at least the line numbers would change.
– For the same reason, chain blocks can be removed, added and relocated without directly affecting other blocks.
– Active chains do not degrade updatability much. Instead, active chain blocks degrade updatability about as much as active subroutines do. Although many chains can be simultaneously active, only one chain block is active at a time. Furthermore, each chain block is a separate unit, which is very short-lived and cannot halt. Because of this, active units are more easily avoided and handled than in imperative code.
– Large and complex chains are largely avoided (unlike when using subroutines). This improves software design and makes both functionality and modifications more local.
– Developers are encouraged to write small units of code. (Commonly, developers are recommended to keep subroutines short, but almost exclusively tend not to do so.) They are also encouraged to think in terms of functionality rather than implementation, in the sense that, when writing code, they add loop statements and chain calls where needed, and only afterwards write the implementation for the looped and called chains.
– Loops are considered equally important as functions, so a) long-duration loops can be broken, b) code reuse is generalized and no instruction is executed twice, and c) there is no unnecessary data sharing between loops and enclosing code.
– Required update points guarantee easy update synchronization and updating "anywhere", without requiring too much overhead work.
– Together with a trace stack and initial value pool, we have full control of the control flow, and can provide exact mappings.
– Mutual exclusion and safer context switching are made possible through the use of atomic sequences.
– Chain blocks can have temporary (localized) data.
– Patches can contain the incremental patches for multiple versions, and these can easily be applied step-by-step because of the easy synchronization.
The penalty for these improvements is performance loss. The abstraction level is higher than that of traditional imperative languages, and performance will thus degrade when converting to a lower abstraction level. Transitions between chain blocks might require some overhead work, and iterative algorithms would be as inefficient as their recursive equivalents. However, some programming languages implement iteration only through recursion - typically functional languages, such as Lisp and SML - and are still fairly efficient. Hence, declaring all loops as sequence chains could also be made efficient enough. Even better performance can be achieved by breaking sequence model rules in a controlled way, as suggested in chapter 5, "The Updatable Virtual Architecture" and [Öst03].
4.4 Entity-Oriented Programming The previous section discussed how code could be structured to improve updatability. We mentioned that one of the means to accomplish such structuring is to keep update dependency between code and data to a minimum. In this section, we will more thoroughly discuss this, as well as other issues related to updating data. A flexible updating system would allow classes to replace each other, functionality to be moved from one class to another, inheritance to change and usage relations to be modified. And any such modification should affect the rest of the system as little as possible. We summarize that good updatability for data is achieved if a) data can be converted
independently of code, b) arbitrary restructuring of data is allowed, and c) modifications to and restructuring of data have only minimum impact on code and unmodified data. Existing updating systems do not satisfy these requirements very well. For example, object-oriented updating systems typically have very limited support for changing class hierarchies. This says more about object-orientation than about the updating systems; the object-orientation concept actually supports updatability rather poorly.

4.4.1 Localized Functionality vs. Localized Modifications
Object-orientation localizes functionality by grouping code and data together according to functionality. This means that, in the code, data and other code are accessed at fixed, explicit locations. Class hierarchies further relate classes to each other. From a statical point of view, this is very good, because it minimizes scope, groups according to functionality, provides good modularization, allows internal implementations to be replaced, allows reasoning at higher abstraction levels, and so on.

Illustration 23: The desire to reduce the amount of redundant information as well as make all functionality accessible causes a conflict of interest. The global ideal would be to include all functionality ever needed, but no more. On a finer granularity level, it would be even better to only include as much functionality as is locally required, but the local need must then include the union of the needs of used services (called subroutines).

However, this explicit localization only works well in a non-evolving environment. From a dynamical, flexible point of view, there are too many fixed relations, and every functionality is bound somewhere. The functionality is localized, but as a result, modifications are not. Internal implementations may change, but even a tiny, local modification to some external part (method signature, inheritance hierarchy, usage relation, data or method location etc.) can have a large, global impact. If possible, we would like to have good localization for both functionality and modifications.

Adaptive programming provides a solution that only applies to static evolution. The workload of developing new versions of applications can be reduced with the aid of AP, but compiled applications are still standard object-oriented applications. Hence, AP does not improve runtime updatability. But the basic idea behind AP is good, and we can take it as a starting point. The problem with traditional object-orientation is that interfaces are created implicitly from classes. This fixes data and methods to be located in some particular class, and also includes mostly redundant information such as class hierarchy. If we instead provide independent intermediate interfaces, we get looser coupling and a system that is much more flexible when it comes to evolutionary changes. These interfaces would externally seem to be independent and unrelated to the underlying classes, and thus enable almost arbitrary modifications to classes. Instead of having only one single view of class definitions interleaved with code, we would have two independent views - a class view and an interface view. Classes would still contain both
code and data, but code could only access these through the interface view. Independent interfaces could in principle be implemented in the same way as in AP, but the runtime overhead would be huge, and unaccustomed developers might also find writing AP propagation patterns a bit repugnant. One of the main difficulties in implementing independent interfaces is doing so without losing useful object-oriented sub-typing features. One possible solution is to use more generalized sub-typing rules, as is done in our proposed programming style, entity-oriented programming (EOP), which is described in the following.

Illustration 24: Localization in object-orientation. Classes contain code and data, and code accesses these through interfaces. In traditional object-orientation, an interface is a subset of a class, and is hence implicitly bound to it. This enables localized functionality but results in poorly localized modifications. If we used independent interfaces that are not directly bound to classes, we would have a form of indirect referencing that better supports both forms of localization.
4.4.2 Basic Concepts
The problem we have identified in object-orientation is insufficient information hiding, as discussed in section 4.1.1, "What is Wrong?". Code encodes redundant information, and that makes both code and data modifications tricky to handle. In aspect-orientation terms, there is a cross-cutting or conflict of interest between the design and updatability aspects. When looking at application design, we want all functionality to be available and want it to be well structured and clearly localized. But when considering updatability, we want to remove redundant functionality and redundant structuring. Clearly, there must be some compromise that both aspects can settle for, and Illustration 23 tries to visualize just that. As it turns out, we do not need to settle for any compromise at all.

Let us start by clarifying some typing terminology. Strict typing (strong typing) implies that all variables have an explicitly declared type, and type-checking will be used to guarantee that they are only accessed as this type. Loose typing (weak typing) means that the data type of variables may change over time, and no strict type-checking will be made. Static typing means that type-checking is performed statically, at compile time, while dynamic typing means that it is performed at runtime. Scripting languages typically use loose, dynamic typing, while serious programming languages use strict, static typing and thus produce semantically checked, safer code.

If we allow loose typing, then the data structures behind variables can change arbitrarily as long as the functionality we use is present. Hence, loose typing embeds no redundant information, but still enables all existing functionality to be used. Clearly, we would want the flexibility of loose typing, and still retain the safety of strict typing. But safety surely is not the same as strict typing. If we use loose, static typing, then we get the benefit of loose typing, but can statically verify that the application contains no typing errors and can thus be safely run. Apart from safety, strict typing also makes code easier to read, since the reader then knows more about each variable: what kind of data it represents, what it is used for etc. In contrast, a developer who wants to call a subroutine from some API that
uses loose typing would have to read some human-written documentation for that subroutine in order to find out what kind of arguments the subroutine accepts.

We feel that, instead of allowing "any" data type, and instead of forcing a particular data type, we should declare exactly what functionality we require. Sub-typing should be based on functionality instead of some invented ad hoc class hierarchy. We should somehow declare what data type is required, so that anyone reading the code knows what kind of variables can be used. But the declared data type should not contain any redundant information, and any imaginable compatible data type should be a valid subtype. This is accomplished by the use of universal polymorphism and sub-typing through interfaces rather than data types. A superset (extended) interface is considered to be a subtype and can be used anywhere that the subset interface can be used. (Keep in mind that superset here corresponds to subtype / subclass and subset to supertype / superclass!)

One of the reasons why this has not been done in the first place is that the number of possible interfaces is huge, and virtually explodes as new functionality is added. According to combinatory mathematics, the number of interfaces for a data type with n accessible functions is 2^n - 1. It is much easier and more manageable to group fixed sets of functionality together, and that is what classes do. While it is natural to declare an object to be an instance of a certain class, we realize that it is impractical to declare each variable to have a type like "interface containing functionA, functionB, ..., functionZ". And equally well, it is impractical to invent names for every single interface. A natural solution is to automate typing, and that is what we will do. Every variable is assigned to be a loosely typed entity.
Illustration 25: Different views of sub-typing. The leftmost figure shows classic data types with sub-typing through extension. The middle figure shows the same data types as sets and sub-typing through supersets. The rightmost figure shows a subset of all interface types that can be parsed from the data types. Any subset of the full set of functionality forms a valid interface, so combinatory mathematics gives us 2^n - 1 interfaces for a data type with n accessible functions. (For example, the second data type, with its three functions getX, getY and getZ, admits 7 interfaces, while the fourth and most extended data type, with 8 functions, admits 2^8 - 1 = 255.)
Entity, according to the Oxford English Dictionary, is "being, existence, as opposed to non-existence; the existence, as distinguished from the qualities or relations, of anything". Hence, data is distinguished based on relations, but can be "anything" and is not fixed to any detailed specification. Entity is a more vague, abstract term than class and object. We allow entities to be named and hence assumed to be of the same type, but the visibility of such names is just the scope where they are declared (class, method). This data type naming policy could be called silent, because any name will do, and that name need not be introduced or defined anywhere. Entity names are only used to make code more readable and to tell that multiple entities have the same type. Unnamed entities are assumed to have their own, individual type. In practice, writing code using entities would be very similar to writing object-oriented code. Every object declaration would be prepended with the entity mark #. As
an example, the object declaration Rectangle rect could be rewritten as a silently named entity #Rectangle rect, or as an unnamed entity # rect. During compilation, the required interface for every entity would be generated and preferably inserted into the source code as auto-generated comment lines. More importantly, the required interface would be included in the compiled code and also statically type-checked based on the data type of all objects that can be stored to each entity variable.

In EOP, type checking is static, but types still need to be runtime managed. This can be accomplished in two distinguishable ways. Compiled interfaces could be sorted lists containing the hash codes of each method and variable signature. This way, type-checking could be performed simply by comparing two such lists, and the v-table of each object would be a hash table with signatures as keys. Method indexes, inheritance etc. could thus change arbitrarily as long as the required functionality remains. Support for method renaming could be provided simply by adding an invisible stub whose hash code in practice directly points to the renamed method.

A more efficient implementation is to treat entities as "runtime" classes. Objects would then be created from statically defined classes, but accessed through runtime entities. Each interface would get its own entity structure through which compatible objects are accessed, and classes would be mapped to these entities. This would allow an implementation whose only overhead is one extra reference indirection. Instead of accessing an instance variable or method based on its index in the defining class, it would be accessed based on its index in the compatible entity, which points to an index in the defining class. This second approach to dynamic type management is more thoroughly explained in section 5.2.4, "Classes, Features and Entities", while describing an updatable programming language.

Each instance method reveals what interface it requires from the instances it is invoked upon, and each variable reveals what interface stored objects must implement. During compilation, the required interface of method arguments and local variables could be calculated as the union of the required interfaces of every statement involving that variable. For an object invoking a method, the required interface is the interface required of instances the called method is invoked upon. For an object used as an actual parameter in a method invocation, the required interface is the recursively calculated required interface for that formal parameter. The required interfaces for class and instance variables can be calculated by a preprocessor in two ways. Either the interface is the union of all functionality ever used on objects stored to that field, or it is the intersection of the functionality of all stored data types. The second alternative is easier to calculate, but will commonly produce too large interfaces (i.e. interfaces that include redundant information) and is exactly as inflexible as standard object-orientation. Clearly, the first alternative is to be preferred. One thing to keep in mind is that alias names must be taken into account when calculating these interfaces.

Because we use highly restricted interfaces, the LoD is not fully as important as in standard object-orientation, and we could perhaps allow instance fields to be accessed directly. If so, these would be considered to be part of the interface as well, and dealt with in the same way as methods.
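A minimal sketch (hypothetical types) of the first strategy - a compiled interface as a sorted list of signature hashes, with compatibility checking as a linear inclusion test over two sorted lists:

    import java.util.Arrays;

    // Hypothetical sketch: an interface compiled to a sorted array of
    // signature hash codes. A data type satisfies a required interface
    // iff its own list contains every required hash (supersets are
    // subtypes in EOP).
    final class EntityInterface {
        private final long[] hashes;   // sorted signature hashes

        EntityInterface(String... signatures) {
            hashes = Arrays.stream(signatures)
                           .mapToLong(s -> s.hashCode() & 0xffffffffL)  // stand-in hash
                           .sorted()
                           .toArray();
        }

        // Merge-style subset test: every required hash must appear in 'provided'.
        boolean isSatisfiedBy(EntityInterface provided) {
            int i = 0;
            for (long h : provided.hashes) {
                if (i < hashes.length && h == hashes[i]) {
                    i++;
                }
            }
            return i == hashes.length;
        }
    }

An entity variable such as #Rectangle rect would then carry one such required-interface list, and any object whose class provides a superset list could be stored to it.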
Class methods and data do not degrade updatability much by being fixed to specific classes, because they can freely be relocated in between accesses. Hence, they are accessed as in traditional object-orientation. Objects are allowed to access both their own and their superclasses' fields and methods directly, but cannot access other objects except through entity variables. This means that, unlike in for example C++ and Java, access to objects of the same class is equally restricted as access to objects of other classes. More than a few experts would consider this to be a positive side effect...
Entity-orientation supports updating better than object-orientation because all redundant information is removed. This means that:
– Data is more loosely connected to the code.
– The class hierarchy can be changed much more easily.
– Objects can be replaced with any other object that also contains the required functionality.
– Modifications are more local.

4.4.3 Class Hierarchy
Since sub-typing in EOP is not dependent on class relations, the only need for inheritance is code and data reuse. Thinking in terms of interfaces, classes would be collections of reusable functionality, and inheritance would not be needed at all. This would solve the controversial question of whether or not to allow multiple inheritance [VRB98]. In traditional OOP, a professor class could be seen as inheriting from both a researcher and a teacher class. In EOP, the professor class would be treated as an entity with researching and teaching skills. Specifying every functionality separately would result in very fragmented and confusing class hierarchies. And anyway, there are usually undeniable similarities between classes belonging to a given class hierarchy. A professor is most certainly a human, as are both a researcher and a teacher. Hence, a practical solution is to support traditional single inheritance, and multiple inheritance indirectly through common features.
Illustration 26: Multiple inheritance vs. features. Features solve the problems haunting multiple inheritance, and fit elegantly into EOP. The rounded boxes show instances of class Pegasus, and possible conflicts are written in bold face.
A feature is similar to an interface in Java, but includes an implementation. Multiple inheritance faces difficulties such as naming conflicts, repeated inheritance issues and obscurity. Repeated inheritance means that two or more superclasses of a class have some common base class, and data in that class must then be shared (virtual inheritance) or duplicated. If duplicated, changes will only affect one of the copies, which is hardly the intent. If shared, then unexpected side effects may occur. Obscurity means that 64
incorrect virtual methods are dispatched, as the result of method dispatching being designed for single inheritance. These problems have haunted multiple inheritance, and many experts strongly advise not to use it. Features implicitly avoid all the problems mentioned above.

Another way to look at features is by comparing them to automated delegation [VRB98]. In automated delegation, a class may declare to forward invocations on some class to one of its objects which has a compatible data type. Such an approach is only a hack to get the work done, and features do about the same thing in a more natural way. A feature may contain private, internal data that is not accessible from outside. Such data does not cause naming conflicts. In addition, a feature might require to share data with other features and declaring classes. A feature does not contain external data, but instead requires that declaring classes have some minimum interface (of both data and methods). Hence, duplicate data and naming conflicts are once again avoided.

Using EOP, features are natural components in class hierarchies. Inheritance relations can be designed as normal, but when the need for code reuse is identified, some functionality can be moved away from some class, into a feature. This resembles the well-established act of moving common functionality to superclasses, inserting superclasses if needed. Because we are using EOP, and thereby sub-typing through interfaces, moving functionality between features and classes is completely transparent to all but the affected classes. In other words, we get both high reusability and good updatability. We can also define new classes very quickly by gathering together a set of (possibly unrelated) features. Potentially, we could even allow objects to be instantiated from runtime-generated, on-the-fly-assembled classes.
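Features closely resemble what later appeared in Java 8 as interfaces with default methods: a shared implementation plus a required minimum interface. The Pegasus example of Illustration 26, approximated in that style (hypothetical code, not USL):

    // Hypothetical approximation of a feature: Flight requires wings from
    // its declaring class and contributes a fly() implementation, without
    // multiple inheritance of state.
    interface Flight {
        Wings wings();                      // required minimum interface

        default void fly() {                // the feature's shared implementation
            wings().flap();
        }
    }

    final class Wings {
        void flap() { /* beat the wings */ }
    }

    class Animal {
        int x, y, z;
        void move() { /* walk */ }
    }

    class Horse extends Animal {
        void gallop() { /* gallop */ }
    }

    class Pegasus extends Horse implements Flight {
        private final Wings wings = new Wings();
        @Override public Wings wings() { return wings; }  // satisfies the requirement
    }

Since fly() lives in the feature and carries no duplicated state, the repeated inheritance and naming conflicts of true multiple inheritance never arise.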
5 The Updatable Virtual Architecture
This chapter presents partial specifications for a programming language which is suitable for writing applications with inherent support for evolution and dynamic updating. By presenting such a language, we essentially demonstrate how to apply the theory discussed in the previous chapter at a more detailed and practical level. Many of the design decisions made in these specifications have little impact on updatability, and other solutions could be proposed as well. But the language specified is still intended to be a practical, fully functional, full-scale programming language and not just an educational implementation of the concepts discussed. This chapter only scratches the surface of the specifications, and leaves out low-level details altogether. A more detailed and lower-level specification can be found in [Öst03].
5.1 Introducing Uva
5.1.1 Background
The obvious disadvantage with using a special programming language for writing updatable applications is that the resulting deployability will be poor. (See section 2.1.1, "Requirements".) There are basically two alternatives to using a special-purpose programming language. One is to use static AOP to produce updatable applications, and thus automatically transform existing applications so that updatability is taken into account. This would be done by automatically embedding some of the concepts introduced in the previous chapter, such as adding update checks at every updatable pointcut. With such a solution, no underlying updating environment is needed, and every application can update itself - without forcing the developer to take updating into account. But this duplicates updating-related functionality for each application, and it would be better to gather updating functionality below the application layer. The other alternative is hence to add updating support to the platform on which applications are run; this is most suitable for interpreted languages, such as Java. It also allows existing applications to be updated without any porting whatsoever. A JVM could be extended with a Jini interface for providing and distributing patches in a distributed system, and we would have good deployability.

Although we could apply some of the ideas from sequence model theory and extend a JVM to support more flexible updating than most existing dynamic updating systems - such as allowing active methods to be updated - we decide not to do so in this paper. Existing programming languages are not designed for building updatable applications, and simply do not provide enough freedom for making flexible updatability extensions. On the other hand, odd programming languages falling far from mainstream languages are not easily adopted. An updating system relying on such a language would truly have poor deployability and hardly be of much practical use. But if the language instead strongly resembles traditional object-oriented languages, using it in place of some existing language is merely a trifle.

For this reason, we define a three-level model. At the highest, source code level, the language looks very much like C++ and Java. At interpreter level, most of the update-friendly concepts we have described are supported, and some overhead is unavoidable. Since we want our language to be efficient enough to be useful, the third and lowest level is optimized and less updatable code. The optimizations remove overhead and
speed up code by neglecting or removing update-related details wherever possible. The key point is that we can automatically and transparently move between any of these three levels. Optimized code can virtually be made to run as fast as code for any other virtual machine based language. The virtual machine only needs to guarantee that, whenever needed, it can switch back to running code at the intermediate level without any delay noticeable by the user. The most important level is without doubt the intermediate level, since the intermediate level specifications are what actually make applications more (or less) updatable.

It is out of scope for this paper to describe every detail as thoroughly as should be done for a complete programming language and environment - even if we leave out low-level details. The language we specify has its roots in Java, so the specification takes Java as a starting point, and focuses on differences relative to the Java [GJS96] and JVM [LY96] specifications. Details that are left out or inadequately explained are either irrelevant or the same as in Java. The reader is assumed to have at least some basic knowledge about Java, but should be acquainted with the inner workings of a JVM to fully understand the specification presented here and in [Öst03].

5.1.2 The Naming Issue
As already suggested, what we are about to describe is more than just a programming language. It is an entire environment, named the Updatable Virtual Architecture, or Uva in short. (The abbreviation should be treated as a name and thus not written with uppercase letters.) Uva is not just a source code programming language, and neither a virtual machine, nor an updating system, but all these together. This entire environment essentially describes a virtual architecture with inherent support for updating, which motivates the name Uva. Furthermore, uva means grape in Latin, Italian, Spanish and Portuguese, and is on rare occasions also used in English. If Java is coffee, then Uva is high-quality wine. While coffee is tasty only when fresh and newly brewed, wine literally improves with time, which symbolically speaks of better support for evolution, bug-fixing and updating. This can be difficult to achieve in practice, but even small improvements would give a major advantage over traditional techniques, where updating typically adds complexity and degrades code quality. The Oxford English Dictionary defines uva as "a grape or raisin; a grape-like fruit", and the Webster Dictionary as "a small pulpy or juicy fruit containing several seeds and having a thin skin, as a grape". The observant reader should also notice that the English pronunciation resembles that of Java.

In this chapter, we will give partial specifications for Uva source code (high level code) written in the Uva Source Language (USL). In [Öst03], we give full specifications for Uva bytecode (assembly code) run in the Uva Virtual Machine (UVM). The reader should keep in mind that the lower level specifications are much more important than the high level specifications. The same goes for Java, although the common man thinks of Java as being just a C-like language.

5.1.3 Overview
To summarize, Uva is entity-oriented and built upon the sequence model. It is stack-based, platform independent, and supports multi-threading. Pointers are not supported, and indirect reference handles are used instead. All objects are allocated on the heap, and a garbage collector is used to deallocate objects that cannot be referenced.
handling is supported both via exceptions and by the capability to return "failure" from arbitrary sequence chains. Recursion is allowed only from chain blocks specially defined to be reentrant. The UVM can either interpret or partially JIT compile the bytecode it runs. It supports dynamic loading and reloading of classes, and has built-in support for applying update patches. Both class definitions and individual sequence chains can be replaced without apparently affecting the environment. This requires indirection that degrades performance, but optimizations can be used to work around most of the overhead. Any optimization is allowed, with the only requirement that, on a longer (in machine instruction terms) time scale, the UVM must "appear" to function according to the specifications. Modest use of optimizations that temporarily break specification rules is allowed and even encouraged, as long as the optimized code is executed as if it were atomic, and the optimizations are externally undetectable. Especially hot spot optimizations, like the ones performed in the Java HotSpot Virtual Machine [HotSpot], are encouraged. Uva does not support all features that are made possible in the sequence model. Things like atomic transactions and parallel blocks are left out, since the source code view hides away the underlying chain blocks. As far as updatability is concerned, this loss is irrelevant. The specifications given here are based on Uva version 1.0, which is a slightly revised version of the original Uva that appeared in the Master's thesis "Rethinking Software Updating; Concepts for Improved Updatability".
5.2 General Issues
This section provides general specifications that are common for the whole Updatable Virtual Architecture. These specifications largely define the Uva environment, but do not provide details about things such as how code is written or executed.
5.2.1 Data Types
Variables are dynamically typed, but statically type checked. The data type of a variable is either a primitive data type or a reference type. From a storage point of view, there are nine primitive data types. Four of these are used for signed integer values - int8, int16, int32 and int64, being 8, 16, 32 and 64 bit, respectively. Two are used for unsigned integer values - uint8 and uint16 - and two store floating point values according to the IEEE (Institute of Electrical and Electronics Engineers) standard 754: float32 for 32 bit single precision and float64 for 64 bit double precision. The remaining type, bool, is used to store boolean values, but is in practice just an 8 bit integer value. It is strictly typed, but used solely for the purpose of readability. From an execution point of view, there are only two primitive (runtime) data types: int and float. Each sequence chain is declared to have a word size of either 32 or 64 bit, and the bit-length of types int and float is always the same as the current word size. This means that all chain arguments, local variables, intermediate stack data etc. have a bit-length equaling the word size of the chain they are used in. As is shown in [Öst03], data sharing between chains is arranged so that incompatible word sizes between those chains do not raise conflicts. In a 64-bit computer, the effective word size could be chosen to always be 64 bit.
Results from arithmetic expressions are automatically cast to floating point if at least one operand was floating point, but the bit-length always equals the current word size. When stored to global and instance variables, these values are automatically - and typically without issuing any warning - cast to the type of that variable, even if the casting causes an overflow or a loss of precision. The reason for generously allowing casting, as well as for using only two distinguishable primitive data types in the first place, is to simplify data type exchanges. As long as every variable is used as intended, all casts will also be valid. Code that uses some variable should only need to be modified when the relevant interface to that variable changes, and not just because its internal representation changed. All int types are entirely interchangeable, as are both float types. Any int can also be cast to any float, and vice versa, but changing data type between int and float affects the bytecode, so ints and floats are not transparently interchangeable. All primitive data types also have alias names that are the same as the names of the corresponding data types in Java. These alias names are only used for signatures, and the already mentioned names are otherwise preferred, because they underline that the variables are interchangeable and that their size only affects memory storage. The alias names and signatures are listed in Illustration 27. Variables of reference type are references (pointers) to objects, and have the same size as a native pointer. If this size is greater than 32 bit, then the effective word size is chosen to always be 64 bit. Objects can only be accessed via references, and are thus dynamically heap allocated. Arrays are special kinds of objects, which are treated like normal objects, but declare an extended interface and contain some additional attributes, such as a length and a content data type. Any object "looking like" an array - that is, providing a compatible interface - can be used in place of an array. This means that arrays are accessed like normal objects and not with special opcodes, as in Java bytecode. The performance loss can be made up for with runtime optimizations. Strings are compatible with arrays, but are in fact immutable arrays. An immutable array is an array for which overwriting array elements is a valid operation but has no effect. In Uva source code, strings are in some respects treated as primitive data types. Only strings may be assigned to string variables, and assigning non-string objects to string variables will first have them cast to strings, even if they provide a compatible interface. This is unlike the usual policy of only looking at interface compatibility. However, strings are treated exactly as normal objects in the UVM. In addition to having different data types, each variable can be classified as one of the following three distinguishable kinds: local, class or instance variable. Local variables are local to some sequence chain, or shared between a set of sequence chains. Class variables are global variables that are defined in - and accessed from - some class. Class and instance variables are often referred to as fields, and they are read-write protected by default, meaning that they can be read and modified (only) by the class itself and its subclasses. They can optionally be declared as private to the class or public to all classes. Unlike in Java, classes in the same package may not access each other's protected parts.
If a variable is declared to be protected or public, it must always be declared as either read-only or both readable and writable. The write protection only applies to the actual access modifier, so a public read-only variable can still be written to by code that could access it if it were protected. This is a powerful safety mechanism that is not found in traditional programming languages.
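To make the access policy concrete, here is a minimal, hypothetical var block (the declaration syntax itself is specified in section 5.3.2, and the class and field names are invented for illustration):

class Account
    var
        private int64 balance            // readable and writable only by Account itself
        protected int32 limit            // read-write for Account and its subclasses (the default)
        protected{r} static int32 count  // subclasses may read, but only Account may write
        public{r} string owner           // anyone may read; only protected-level code may write
        public{rw} string note           // anyone may read and write
end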
type     size  alias   signature  description
int8     1     byte    B          signed integer
int16    2     short   S          signed integer
int32    4     int     I          signed integer
int64    8     long    J          signed integer
uint8    1     ascii   A          unsigned integer
uint16   2     char    C          unsigned integer
float32  4     float   F          floating point
float64  8     double  D          floating point
bool     1     bool    Z          boolean value
string   p     -       #string;   Unicode string
#n       p     -       #n;        any reference type, silently named n
#[t      p     -       [t         array containing type t
#=       p     -       =          a type interface-equal to the declaring class
#>       p     -       >          a type interface-equal to or -extending the declaring class

operation    required chain          applies to
=, ==, !=    int cmp(#=)             all types
(int)        int toInt()             all types
(string)     string toString()       all types
a[i]         #n get(int i)           arrays
a[i] = v     void put(int i, #n v)   arrays
a.length()   int length()            arrays
Illustration 27: A summary of Uva data types. The uppermost table lists the data types that variables can be declared as. The size is expressed in bytes. Size p stands for the size of a native pointer. All primitive data types have alias names, and all data types have signatures. The bottommost table shows required chains that implement certain operations. Type #n in get(...) and put(...) is in practice replaced by the type of the array content.
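As a sketch of the required array interface listed above, any class providing compatible get, put and length chains can - under the interface compatibility policy - be used wherever an array is expected. The class below is purely hypothetical:

class ConstantRow
    var
        protected int32 value
        protected int32 size
    public int get(int i)
        return value
    public void put(int i, int v)
        return               // immutable, like a string: overwriting is valid but has no effect
    public int length()
        return size
end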
The entity type of a reference variable can be described by its required interface. This interface is the set of instance chains and instance variables that must be accessible from objects that are compatible with the entity. In OO, objects are said to be instances of classes, but in EO, objects are instead said to be compatible with entities.
5.2.2 Data Type Naming
Normal reference variables must be typed with the silent naming policy, and written as # followed by an arbitrary name. Hence, general unnamed entity variables are prohibited. The scope of a silent name is the whole class or feature in which it is defined. String variables must be declared to have type string, but will in low level code be converted to normal entity variables with type #string. Array classes are named [t, where t is the array content data type, and this name is used when instantiating an array. Since arrays are treated as normal objects, they are accessed as entities. They may optionally use their own special declaration, #[t, whose sole purpose is revealing that compatible objects must provide an interface that lets them be used as arrays. Arrays can equally well be declared as normal reference variables, and non-arrays with a compatible interface may be declared as arrays. Array entities declare their runtime content type, such as #[int for a one-dimensional array containing int values,
#[[#color for a two-dimensional array containing objects that are silently named color, and #[[[int for a three-dimensional array containing int values. Array classes declare their storage content type, such as [[[int16 or [#[[int (where the content is a two-dimensional array entity with a storage content type). Variables can be specially typed #= and hence be unnamed. The entity of such variables is the self-entity, which contains the full interface of the defining class. The self-entity is used internally to compare classes and entities, but also enables typing according to the class hierarchy to be simulated. The motivation for such a capability is that a class might need to do something with its own (or compatible) instances, and should not allow other entities to be used even though they contain the interface required by that particular action.
5.2.3 Sequence Chains
Chains are restricted to returning at most one value, and hence act either as procedures (no return value) or functions (one return value). A stack based implementation could equally well allow multiple return values and validate that both the returning chain itself and its callers always push and pop the correct number of return values. Disallowing this feature implies that envelope entities must be used whenever multiple values are to be returned. But this is the way developers are accustomed to reasoning about subroutines, and we want to provide a language that developers can easily absorb. Sequence chains can be nested at two levels. The outer level chains are public, protected or private, and can be called from anywhere, as long as the access policy permits invocation. While public chains may be invoked by any code, protected chains may only be invoked by subclasses, and private chains only by the class itself. Inner level chains are declared inside an outer level chain, and may only be called from the outer chain and the other inner chains of the same outer chain. This is similar to procedures declared inside procedures in e.g. Pascal. Inner level chains are always static, and they share the local variables and arguments of the outer chain. An inner chain of an instance outer chain thereby acts as an instance chain although it is static. Nesting is restricted to two levels in order to avoid unnecessary complexity, and there are two reasons why nesting is permitted in the first place: improved code structuring and efficient variable sharing. We get better code structuring by grouping small chains together, which leads to improved readability and - as a side effect - a reduced naming scope. Closely connected chains potentially degrade updatability, so some skepticism towards local variable sharing is justified, but on the other hand, global variables and object data are shared anyway. Any chain may also contain an optional precondition that is especially useful for loops but also well suited as a general guardian. If a chain containing a precondition is called, the code inside the precondition block is first executed, and the precondition then checked. If the precondition is not satisfied, the chain breaks. Otherwise the actual chain body is executed. Non-loop calls may declare a "failure" branch that is taken if the chain call breaks instead of returns. A chain that breaks must still return some value, since catching a failure is optional. More advanced error handling is supported through the use of exceptions.
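Before turning to exceptions, here is a hypothetical sketch of two-level nesting; the inner chain is implicitly static but shares the outer chain's locals and self reference, as described above (all names are invented):

class Text
    public string wrap(string s)
        var
            int pos
        entry
            ...
            format()          // call to the inner chain
            ...
        void format()         // inner level chain; shares s, pos and the self reference
            ...
end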
Any chain block may declare to catch any number of exceptions, and must then also specify chain blocks for handling each of those exceptions. Exceptions are caught based on fixed class names, which perhaps is not entirely what would be expected from an entity-oriented programming language. However, exceptions are used
for identification of some error, and their class structure is not likely to change. Efficiency and automatic exception specialization through class inheritance are good enough reasons for abandoning any thought of somehow naming exceptions indirectly. Every chain may declare a "finally" branch that will be executed regardless of whether exceptions are thrown or not, but individual chain blocks cannot declare such branches. Loop calls and recursion indirectly include an invocation counter, and loops may optionally declare an iteration restriction. Loops can explicitly be restricted to, for example, doing at most 10 iterations, but also to, for example, doing at least 5 iterations. Until the minimum number of iterations has been performed, the chain precondition (if present) will not be checked, and breaks will effectively be treated as normal returns. Recursion can be assigned an upper bound by including the invocation count in the precondition. Recursive calls are only allowed inside reentrant chain blocks, which in practice are implementations of recursive return blocks. Reentrant chain blocks are not allowed to exit with failure, but may of course throw exceptions. As in Java, thread synchronization is accomplished by entering monitors that are guarded by some key. At any time, only one thread can own a given key, so a thread inside a monitor must exit before another thread can enter. Deadlocks when updating are largely avoided because of the way compulsory update points are selected. There are no special I/O instructions, so - just like in Java - all I/O operations are performed by native sequence chains. Event handling could be supported in very much the same way as in Java, but is not further discussed here. Chain signatures contain the signature of every formal parameter. Chain overloading is allowed as long as, for every two overloaded chains, either a) the signature differs in the number of arguments or in primitive data types, or b) the required interface of every formal reference parameter in one of the chains extends or equals that of the other. This restriction on overloading is necessary, since it must always be decidable which chain is the "best fit" for a given set of arguments. Most chains are not overloaded, and dispatching them typically only involves looking up the chain signature and calling it. However, if the chain is overloaded, the chain to dispatch is selected not only based on the instance data type, but also on the data types of the actual parameters. Since chain signatures declare entities, or "required interfaces", instead of fixed data types, dispatching overloaded chains becomes similar to multiple dispatching with multi-methods. (See Multi-methods in section 3.2.6, "Sub-Typing and Type-Checking".) The best fit can be detected by examining entity compatibility for conflicting arguments, but such comparison results in performance loss. Still, this overhead only applies to the rare chains that are overloaded and have conflicting arguments. More advanced structures for implementing multiple dispatching are discussed in e.g. [ABGR01]. The return type of a chain is - just as for methods in C++ and Java - not included in its identity signature. This means that code which calls a chain without using its return value will consider the chain to have return type void, but will accept chains with any return type. As explained in [Öst03], this approach does not leave junk on the data stack, as would be the case in Java.
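A minimal, hypothetical sketch of overloading rule b): assume silent entities #Shape and #Sprite, where the interface required of #Sprite extends the interface required of #Shape. The following pair is then legal, since the best fit is always decidable:

class Canvas
    public void draw(#Shape s)     // requires e.g. paint()
        ...
    public void draw(#Sprite s)    // requires paint() plus e.g. animate(); extends #Shape
        ...
end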
Chains are automatically analyzed and marked to be atomic if they
a) only call (directly or indirectly via other chains) non-native chains or native chains that can themselves be guaranteed to be atomic,
b) do not use thread synchronization, and
c) execute at most x instructions, where x is some fixed integer value.
The third and last requirement implies, among other things, that loops must be explicitly constrained. The virtual machine will check that all control flows satisfy all of these requirements. In this context, atomic means that the chain is relatively short, can be executed fairly
quickly, and cannot block. Optimizations can then make use of this information to inline code, convert loops to optimized chains that perform the loop multiple times before returning, and so on. Both optimized and unoptimized versions of (short) sequence chains can be stored in memory. If an update is pending, the execution engine must respond by fairly quickly switching back to running unoptimized code, but "fairly quickly" only means that a possible delay must be so short that it goes unnoticed by the user. As already mentioned, each chain is declared to use a word size of either 32 or 64 bit, and the chosen word size can be transparently changed whenever needed. The default word size is 32 bit, so chains that require a larger bit depth must be declared to be 64 bit. Since chains are commonly shorter than traditional subroutines, making all variables 64 bit if one single variable needs to be 64 bit should not be seen as too much of a problem. An implementation can either choose to always use 64 bit words but only use the lower 32 bit whenever the word size is 32 bit, or choose to only waste as much memory as is necessary. In the latter case, the stack must be converted when the word size of an active chain changes. Since word size only affects the stack and local variables, always allocating 64 bits for local variables typically does not waste too much space, and is probably the implementation strategy to recommend. If the word size changes from 64 to 32 bit, nothing needs to be done, and if it changes from 32 to 64 bit, only the high 32 bits need to be zeroed. Analysing a chain block and comparing it with other chain blocks is a lot easier than analysing and comparing entire subroutines. This means that we can rather easily detect modified, new and rearranged chain blocks, and also automatically generate most update patches.

Illustration 28: An example illustrating how chain blocks are rearranged in an updated chain (the figure shows versions 1.0 and 1.1 of a chain a). (Note that the two chains are horizontally flipped.) Grayed blocks represent blocks that are not found in the old chain, i.e. are modified or new. Dashed arrows show how the chain must be mapped if it is active. If the arrow is labeled "map", then mapping code is required, otherwise mapping the PC will do. In addition to block-specific mappings, the whole chain gets a mapping that will map local variables etc. and perhaps do mapping that is dependent on the exact execution trace.
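To illustrate the word size declaration discussed above (the word{...} chain modifier itself is introduced in section 5.3.3), a chain whose accumulator would overflow 32 bits can simply be declared 64 bit, leaving the rest of the code untouched; the class and chain names here are invented:

class Stats
    public word{64} int sum(#[int values)   // int is 64 bit inside this chain
        ...
end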
5.2.4 Classes, Features and Entities
Classes are named in the same way as in Java, and the fully qualified name of a class or feature is hence its package name followed by its short name. Class my.own.SmallClass would have package my.own and name SmallClass. Contrary to Java, the fully qualified name is almost never used, but in analogy with Java, classes are still filed according to the package they belong to. The example class would be stored in a directory my/own/ and have filename SmallClass.class. Classes are accessed solely based on their short name, but packages decide class visibility. Only classes in imported packages can be seen, so name clashes can be avoided this way, while packages are still free to be renamed, and classes still free to move between packages, without affecting updatability. Neither class names nor package names are used for accessing instance variables and chains, and only the short class name is used when accessing class (static) variables and chains. Accessing code only sees imported packages, so the developer can thus avoid importing conflicting packages. If a conflict still seems unavoidable, class functionality must be divided among multiple classes that each import a conflict-less set of packages.

Illustration 29: The class hierarchy for some of the most fundamental classes and features, along with a subset of the chains that they contain. Class uva.lang.Object (with chains int toInt(), string toString(), #= clone(), int cmp() and #class getClass()) is at the root, and is extended by uva.lang.Throwable (in turn extended by uva.lang.Error and uva.lang.Exception), uva.lang.Class, uva.lang.Thread and uva.lang.String. Feature uva.lang.Serialize declares void writeObject(uva.io.ObjectOutputStream) and void readObject(uva.io.ObjectInputStream).
Every class must extend some other class, and at the top of the inheritance hierarchy is class uva.lang.Object. This class contains a minimum interface that all objects provide, but the implementations may be overridden in subclasses. For every class that is loaded, an instance of uva.lang.Class is created. This instance is used for static chain synchronization, reflection and more. Feature uva.lang.Serialize is provided by many of the standard API classes, and strings are instances of class uva.lang.String. Each executing thread has an instance of uva.lang.Thread associated with it. All exception classes must be subclasses of uva.lang.Throwable, and are typically subclasses of either of its subclasses uva.lang.Error and uva.lang.Exception. Any variable (primitive type or object) can be duplicated and also cast to an int or a string. The integer representation, for example, can be used both as a hash code and as a numeric identifier. For primitive data types, duplication and casting to an int are internally implemented, and casting to a string is performed by creating a new string from the primitive data type. For objects, instance chain int toInt() is used for casting to an int, chain string toString() for casting to a string, and chain #= clone() for duplication.
Unlike Java, clone() is a public chain, so no java.lang.Cloneable interface is needed. In Uva, entities are as important as classes. A class or feature statically defines data and functionality - in other words, a static interface. An object is an instantiation of one single class, which means that it indirectly contains definitions from one or more classes and features. An entity defines a runtime interface that objects are mapped to. The distinction between these three units is important, as the following example attempts to illustrate. Consider that we have classes House, Umbrella, Shield, Tree and Jacket. These are all conceptually very different things. However, when it starts raining, instances of all of them can be used as shelter. As humans, we do not regard a jacket as an umbrella, but might still intuitively pull it over our heads when it starts pouring. To deal with information, our minds strive to simplify matters, and therefore group similar objects together - into classes. These classes only contain information relevant to the group: a jacket is a piece of clothing, used when it is chilly outside. All redundant information is left out. But an actual instance of a jacket is much more than just a piece of clothing, and can be used in countless imaginative ways. The concepts of class and entity are like two aspects in AO, representing an abstract definition and a concrete usage definition. All classes statically declare what entities they use. These entities correspond to the entity names of reference variables used in the class, and contain automatically generated interfaces. At runtime, every entity declared by any class is created, and all compatible classes are mapped to them. In practice, this means that only known entities are dealt with, but that these are dealt with efficiently. For example, whenever traditional object-orientation would invoke a virtual chain, an entity chain is invoked instead. In the case of virtual chain invocation, there is no overhead whatsoever; instead of looking up the chain in the v-table of the class, it is looked up in the entity table (e-table) of the entity. In the case of non-virtual chains and instance data, access must still go through the e-table, and thus has one more reference indirection compared to the object-oriented variant. A single dereferencing is still a very low overhead, and the gain is superior flexibility.
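The shelter example could look roughly as follows in source code. The sketch is hypothetical, and the interface of the silently named entity #Shelter is generated automatically from its use in class Hiker:

class Jacket
    public void pullOver()
        ...
end

class Umbrella
    public void pullOver()
        ...
end

class Hiker
    var
        #Shelter s      // any object whose interface contains void pullOver() is compatible
    public void rainStarts()
        s.pullOver()
end

Instances of both Jacket and Umbrella are compatible with #Shelter, although neither class mentions the other, and no common superclass or interface declaration is needed.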
5.3 The Uva Source Language
This section gives a partial specification for the USL. Appendix C contains source code for both Java and Uva, and should be read alongside this chapter as a practical demonstration of what is described here.
5.3.1 Various Issues
The high level view in Uva is Uva source code, which is written in the Uva Source Language (USL). It corresponds to Java source code, and is used to hide away chain blocks and other aspects that are irrelevant in terms of programming and problem solving. Uva source code looks much like a combination of Python and Java source code, and could well be compiled to, for example, Java bytecode or native executables. However, it does contain certain constructs that support Uva better than other languages would. One minor, but utterly important, issue is how Uva source code is grouped into chain blocks. The order and grouping of chain blocks must be unambiguous, but the internal contents may be compiled in any semantically correct way. This is
important, since chain blocks are not explicitly declared in source code, but low level code must still be linked to the source code so that modifications can be correctly detected and update patches easily created. The bit-length of integer and floating point constants is defined by the word size of the defining chain, so Java-like 64 bit constants, such as 123L, are neither needed nor supported. Constants are considered to be written in hexadecimal form if they have prefix 0x, in binary form if they have prefix 0b, and in octal form if they have neither of these but still prefix 0. The only valid boolean constants are true and false, and constants surrounded by single quotes (e.g. 'a') are converted to the 7 bit ASCII number representing the quoted character. Characters can also be specified as '\0abc' where abc is an octal ASCII value, as '\0xab' where ab is a hexadecimal ASCII value, or as '\uabcd' where abcd is a hexadecimal 16 bit Unicode value. The commenting style is the same as that in C++, with two additional comment markers. Comments starting with //* and ending with *// are virtually identical to /* and */ comments, but intended for temporarily commenting away large portions of code - much like #if 0 in C. Comments starting with /# and ending with #/ are treated as auto-comments and are typically generated by the compiler. They are not to be edited manually, because they will be removed and regenerated when recompiling. Semicolons are not used to separate statements, so statements must instead be written on separate lines. In addition, blocks of code are not declared using some begin- and end-block notation, but must instead be visualized using indentation. This is more an influence of Python than of Java. However, classes and features are ended with the keyword end. Extra indentation is not allowed where a code block is not needed, but line feeds are still allowed after infix operators, parentheses etc., and subsequent lines must then be indented more than the first line and at least as much as the previous line. Word parametrization is written like something{param}, and is used among other things for declaring the word size of chains and the access policy of class and instance fields.
5.3.2 Declaring Classes and Features
Classes and features are written in text files that must be named *.uva. The first two statements in such a file must be version v and package name, but these may be declared in any order. Here, name is the package name for all classes and features in the file, and v a 32 bit version number in the format major.minor.build. The major and minor version numbers are unsigned 8 bit integers, and the build number is a 16 bit unsigned integer. If classes or features from other packages are to be referred to, the packages for these classes must be imported as import package_name. Unlike Java import statements, specific classes cannot be imported this way, and wild-cards are not allowed in the package name. Each class declaration starts with class C, where C is the name of the class. Classes are public by default, so if the class should not be visible from other packages, this declaration is preceded by the modifier protected. If the class contains any abstract chains, then it must be - and otherwise it may be - preceded by the modifier abstract. If the class has some other superclass than uva.lang.Object, this is declared using extends S, where S is the superclass. This must be written on the same line as the class name.
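Putting the file-level rules together, a minimal source file could begin as follows (the package, class and import names are invented for the example):

version 1.0.0
package my.own

import gfx2d

protected abstract class SmallClass extends BaseClass
    ...
end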
Features are extensions of classes, and are declared just like classes. The class declaration ends with the keyword end, but before that, an arbitrary number of outer chains and declaration blocks are declared. Each declaration block may
be declared at most once, and for classes these are var and provides. For features, these are var and requires. All these declaration blocks are explained below. If the class provides some common feature, this is declared in declaration block provides. This block lists all features provided, and if a feature requires some interface or variables to exist in the providing class, these can optionally be given alias names. Alias names are used to map names in the providing class to names in the feature. They are declared as a comma separated list of feature_name = class_name, and written as a parametrization of the provided feature. If a feature requires some interface to be present in providing classes, then it declares these interfaces in declaration block requires. This block lists the interface (including variables and chains) that all providing classes must satisfy. In instance chains declared and implemented by the feature, the variables and chains declared by this interface can be accessed freely, with the same access policy as for subclasses of the providing class. Variables and constant fields are declared in declaration block var. Variables must declare a data type and a name. Constants are declared with modifier const, and class variables with modifier static. Every variable is read-write protected by default, but may be assigned some other access modifier. To give a few examples, int32 i is a protected read-write instance variable, protected{r} static int32 j a protected read-only class variable, and public{rw} int32 k a publicly accessible read-write instance variable. Furthermore, variables may declare an initialization by appending = expr to their declaration, where expr is an expression evaluated to the type of the variable. Constant variables must declare an initialization.

version 1.0.27
package gfx2d

/#################################/
/## #Point
/##   matches: Point, Line
/##   interface: void move(#Point)
/##              int getX()
/##              int getY()
/## #Color
/##   matches: Color
/##   interface: void getRGB()
/#################################/
class Line extends Point
    provides
        ColorAltering
        Rotation{axis=this, p=p2}
    var
        protected{r} static int count
        #Point p2
        #Color c

    public void draw()
        ...
end // Line

/#################################/
/## #Point
/##   matches: Point, Line
/##   interface: void move(#Point)
/##              int getX()
/##              int getY()
/#################################/
feature Rotation
    requires
        #Point axis
        #Point p

    public void rotate(float angle)
        ...
end // Rotation

Illustration 30: A sample Uva source file containing one class and one feature.
5.3.3 Declaring Chains
As already mentioned, chains are declared amongst the declaration blocks in a class or feature. An outer level sequence chain is declared as [modifiers] ret_type name([args]). Valid modifiers include static for non-instance chains, native for chains with a native implementation (similar to using JNI in Java), abstract if
subclasses are to provide an implementation, asm if the code is written in Uva assembler format instead of Uva source code format, synch if they are synchronized, and word{32|64} if they declare another word size than the default. For outer level chains, the default word size is 32. If the synch modifier is present, then the chain is guarded by a monitor. If the chain is static, then the key to this monitor is the instance of uva.lang.Class associated with the class, and if the chain is an instance chain, the self reference is the key. In addition to these modifiers, access modifiers public, protected or private may be used, but they may not be parametrized. An inner level chain is declared like an outer level chain but is indented one step, is always static, and may not declare access modifiers nor modifier abstract. It inherits its word size from the outer chain, but may override this by declaring a word size of its own, and compilation must then make sure that shared local variables will be accessed and updated correctly. Inner chains that have an instance outer chain share the self reference with the outer chain. Every chain may contain a declaration block called throws. If present, this block declares the class names of exceptions that the chain may throw. However, calling chains need not catch or declare to throw these exceptions, and the compiler will only issue a warning if they do not. The reason for this is to better support code reconfiguration.

Illustration 31: Access modifiers in Uva. The circles reveal what access modifiers classes, chains and fields can have. (Chains can be public, protected or private; classes public or protected; fields public{rw}, public{r}, protected{rw}, protected{r} or private.) Surrounding circles have looser access restrictions than contained circles. What a class can access is determined based on whether it is a subclass of the declaring class and whether it belongs to the same package as the declaring class or not.

Any chain that neither declares modifier native nor abstract contains a chain body, and may further declare blocks named var, entry, pre, catch and finally. Local variables are declared in block var, use runtime data types (that hide away the bit-length), and are not allowed to have any modifiers. Block entry contains code that is executed when the chain is called. Block pre is a precondition block that declares code like the entry block, but in addition contains a selection - as in the selection statement described in the next section. It is written like pre(bool_expr), pre(int_expr) selection or pre(ref_expr) selection, and will typically be executed before the entry block. The precondition is then checked, and if it is not satisfied, the chain breaks. Otherwise execution continues in the entry block. Multiple catch blocks may be declared, and each one is written like catch exception e, where exception is the exception class name and e a reference to its instance. These blocks contain normal code that can also access the temporary local variable e. Block finally contains code that is executed prior to exiting from the chain.
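A hypothetical chain using these blocks might look as follows (the chain, variable and exception class names are invented, and the exact layout is sketched from the descriptions above):

public synch int withdraw(int amount)
    var
        int result
    pre(amount > 0)
    catch ArithmeticException e
        ...
    finally
        ...
    entry
        ...
        return result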
Definitions:
  field_access:  public[{r | rw}] | protected[{r | rw}] | private
  class_access:  public | protected
  chain_access:  public | protected | private
  field_type:    int8 | int16 | int32 | int64 | uint8 | uint16 | float32 | float64 |
                 bool | string | #= | #> | #silent_name
  var_type:      int | float | bool | string | #= | #> | #silent_name
  ret_type:      var_type | void
  chain_mod:     [chain_access] [static] [synch] [word{32 | 64}]
  chain_args:    var_type [a1][, var_type [a2][...]]

Uva source code file syntax:
  version x
  package p
  [import p1 ...]                                    // imported packages
  [class_access] [abstract] class | feature c [extends s]
      [provides provided ...]                        // features
      [requires required interface ...]              // features only
      [var
          [field_access] [static] [const] field_type v [= expr]
          ...]                                       // class and instance variables
      [[chain_mod] native | abstract ret_type g([chain_args])
          [throws exception1 ...]]                   // outer level chain without body
      [[chain_mod] [asm] ret_type h([chain_args])
          [throws exception1 ...]                    // outer level chain with body
          [var
              data_type v1 [= expr]                  // local variables
              ...]
          [pre ...]                                  // optional precondition
          [catch exception1 e ...]                   // exception handlers
          [catch exception2 e ...]
          ...
          [finally ...]
          [entry ...]                                // actual chain implementation starts here
          [[chain_mod] [native | abstract | asm] ret_type inner_h([chain_args]) ...]
                                                     // inner level chain
          ...]
  end

Illustration 32: A brief outline of the syntax of a Uva source file. Brackets specify optional elements and | "either-or" elements. The definitions shown in this illustration are merely used to make the syntax outline more readable. Most modifiers and blocks may be declared in any order, but indenting must follow the indenting illustrated here.
5.3.4 Special Instructions
Apart from arithmetic and logical operations, storing values, casting data types and other familiar instructions, a few instructions are unlike those in traditional imperative languages. Selection is used in place of "if-else" and "switch" instructions. Boolean selection is written as select(bool_expr) ... else ..., where the else branch is optional. If the boolean expression is evaluated to true, then the code following the selection is executed, else the code in the else branch (if present) is executed. Integer and reference selection is written as select(expr) selection ... else selection ... else ..., and can contain zero or more else branches. Selections are lists of integer values / references, or comparisons of these separated by boolean operators. To clarify matters, a typical selection is simply an integer / reference to which expr is compared. But it can also be a comparison like >value.

A chain declared with modifier asm does in fact contain code that has already been split into chain blocks. The syntax of assembler chains will not be described here, but Illustration 33 gives an idea of how Uva assembler chains are written, and thereby also of how Uva bytecode relates to Uva source code.

entry
    str = str.trim()
    select(str[0])
        'h' || 'H'
            showHelp()
            return
        else 'q' || 'Q'
            throw new AbortException()
        else
            call parse(str)
catch Exception e
    System.out.println("Invalid input")

entry -> entry1
    ald str
    call_entity trim()
entry1 -> entry2
    ts_top
    ast str
    ald str
    iconst_0
    call_entity get(int)
entry2 -> entry3
    ts_top
    sel
    ipush 'h'
    je char_h
    ipush 'H'
    je char_h
    ipush 'q'
    je char_q
    ipush 'Q'
    je char_q
    nxt
entry3 -> return \ catch Exception -> error
    ald str
    call_static parse(string)
char_h -> return
    call_static showHelp()
char_q -> char_q1
    new AbortException()
char_q1
    ts_top
    athrow
error -> return
    ald System.out
    aldc "Invalid input"
    call_entity println(string)

Illustration 33: A code sample written both as non-assembler and assembler code. The assembler code (the second listing) is generated from the high level code (the first listing). The generated code is only one possible transformation, and many other transformations could be considered.

Most chain blocks declare a "next chain block", and all chain blocks end with some instruction that breaks the chain block. Each chain block must further be given a name, of which entry, pre, finally and catch are predefined names (and serve the same purpose as in non-assembler chains). In addition, chain blocks return and break are also predefined, and will simply return / break from the chain without any return value.
Uva bytecode looks very much like Java bytecode. Although the bytecode is - in written text - typically much longer than the high level code, and substantially less readable, it is actually clearer and simpler in terms of updatability. Each chain block is a separate unit that cannot halt or delay execution. As far as updating is concerned, a chain block - rather than an instruction - can be seen as the smallest accessible unit. Furthermore, the control flow can be explicitly derived from the bytecode, and seen as a linear chain of chain blocks (or a linear sequence of instructions). The only sensible way to update the high level code in Illustration 33 while it is executing is to define some point where updating is allowed, and provide a transformation mapping for that update point. There is nothing wrong with such an approach, but it is certainly not an automated approach, and it is haunted by overhead work and unreliable updates. Looking at the chain blocks, we deal with smaller units of code, and can more precisely identify what has been modified and what has not. Update points can be selected automatically, and the patch can be automatically generated for most small scale modifications. The good thing is of course that Uva supports both the low level and the high level view, and can thus benefit from all of their advantages.
5.4.2 Uva vs. Java
What we have described in this chapter is not only implementation details for the theory presented earlier. Although the specification is incomplete, and the reader is highly recommended to continue reading in [Öst03], this chapter still illustrates something of greater importance: how improved updatability can be tied into a high level language that looks much like the popular programming languages of today. Not only does Uva look like Java, but its specifications are actually such that these two languages are highly compatible. Uva is not - and could not be - just an extension of Java, but a UVM could well be an extension of a JVM. In fact, the UVM described in [Öst03] smoothly runs both Java and Uva code, and Uva classes can transparently access Java classes. This allows the huge Java API to be used in Uva applications, while the Uva part still retains its benefits over Java. Tools for Uva can also be developed more quickly, since existing tools for Java can be modified and extended. The class file format closely resembles that of Java, and garbage collection, synchronization, multi-threading and more can be implemented in exactly the same way as in Java. The base API class hierarchy is largely compatible, including classes uva.lang.Object, uva.lang.Class and uva.lang.Thread and their usage. Uva strings are compatible with Java strings if the UVM simply adds an empty chain for put(int, int) and a link from get(int) to charAt(int). Java arrays can also be used in Uva if the UVM adds corresponding get, put and length chains that execute the correct Java bytecode instructions. Good compatibility also means that porting is easier. Java classes can fairly easily be converted to Uva classes, and vice versa. Any new technology that requires some odd programming language in order to work has a serious drawback. But if the required language is this compatible with a language as popular as Java, the drawback is no longer critical. Software companies could write their applications in Uva and bundle them with the company's own UVM. The application would use the consumer's Java API classes, so this scenario is almost the same as when developing Java applications.
6 Discussion
In this final chapter, we summarize what has been described in the previous chapters, and discuss what our contributions to the field of dynamic updating are. The presented theory is evaluated, and future work is suggested. A reader wanting to learn more about Uva should continue reading in [Öst03], which is a natural continuation of this work. Things already realized or accomplished in that technical report are not listed as suggestions for future work.
6.1 Updatability Evaluated
In this section, we discuss what effects a number of modifications have on both Java and Uva implementations of a few simple applications, with source code listed in Appendix C.
6.1.1 Fibonacci Terms
We can modify the Fibonacci application so that it uses java.io.PrintWriter instead of java.io.PrintStream. These two classes contain almost identical methods, but their class hierarchies are java.lang.Object -> java.io.OutputStream -> java.io.FilterOutputStream -> java.io.PrintStream and java.lang.Object -> java.io.Writer -> java.io.PrintWriter. This means that the classes share no other superclass than java.lang.Object, which all Java classes share. Hence, Java API classes duplicate code quite heavily. In Uva, features could be used to eliminate this dilemma. Regarding dynamic updating, we can say the following:
– In Java, we will be forced to manually modify 3 lines of code in two methods, and 6 more lines will be indirectly affected (each use of variable out). In Uva, only one line - the instantiation instruction - is affected.
– In Java, almost all line numbers in the two modified methods will change if we e.g. needed or wanted to use an extra parameter during instantiation. Recompilation might further rearrange line ordering and even exchange instructions, as the result of optimization or compilation strategy. This means that we must specify PC mappings for every update point, and some of those can be difficult to identify.
– In Java, the modified methods represent about 90 per cent of the whole code; in Uva, only 20 per cent.
– In Java, the modified methods contain two loops, call one recursive method and call several methods that might delay execution or throw exceptions. If active methods are not updatable - as in many existing updating systems - updating can never be performed, since the bottommost method is modified. If instead requiring update points using sequence model criteria, we would need 24 update points. These could perhaps be reduced to 20 if the shortest methods were declared to be atomic. In Uva, we would need 7 update points, of which none would require any mapping, and hence no effort from the developer.
A reasonable update is to start using java.math.BigInteger objects instead of normal integers, which overflow already at term 47. The BigInteger class is not included in the class API for J2ME CLDC, which is the platform we target. To enable this update, we take the missing classes from the J2ME CDC API classes. The classes we need are Number and Comparable from the java.lang package, and
BigInteger, MutableBigInteger and SignedMutableBigInteger from the java.math package. From these, we remove all methods that use other classes not found in the CLDC API. In the new version of the Fibonacci application, four lines have changed in the high level code. Method term is now declared as static BigInteger term(BigInteger a, BigInteger b, int count), the two calls to this method are written as term(BigInteger.ONE, BigInteger.ONE, ...), and the recursive call is written as term(b, a.add(b), count - 1). This has the following effects on dynamic updating:
– In Java, some low level line numbers will change for sure, so mapping the PC cannot be avoided.
– In Java, the two modified methods represent roughly 85 per cent of the code, but in Uva, the three modified chains represent only just over 20 per cent of the code.
– If providing update points in Java, we will need 23 such points, and at least some of these need some mapping. In Uva, we will only need 6 update points, of which none requires any mapping.
– In Java, we might need to update multiple stack frames for the recursive method term. In this case, the mapping is simple, but nevertheless more complex and time consuming than if the method was not recursive.
– If we would settle for using 64-bit integers instead of switching directly to BigInteger, the only modifications we would have to make to the Uva code are to declare that the word size of term, parseAndCalculate and printIntermediate is 64. In Java, the bytecode would have to be seriously altered in many locations.
6.1.2 The Dining Philosophers
Consider that we would like to change the welcoming text that is written immediately after calling Philosopher.run().
– In Uva, this means that chain block notify("joined the happy diners") has been modified. Apart from application startup, this unit will never be active. Compulsory update points are not even needed, but if used, their sole purpose is to declare when updating may start.
– In Java, the whole method run() is considered modified, and every thread will always have a modified method on its call stack. Therefore, our dynamic updating system must have support for updating active methods, and we will need to synchronize all threads and carefully reason about where we can safely define one or more update points.
6.2 Final Words
6.2.1 Summary
This paper contributes to the development of improved techniques for dynamic updating in several ways. In it, we have identified many of the problems and tasks that a dynamic updating system must tackle, and presented a summary of different approaches to building systems that are capable of doing so. We clarified some concepts and described how they relate to dynamic updating. We discussed the basic building blocks available
in software engineering, and reasoned about how we could modify these to better support dynamic updating. We introduced two novel concepts - the sequence model and entity-oriented programming - that could potentially improve the updatability of future applications. Based on our discussion, other concepts could be invented as well. Not only did we describe the principal idea behind these new concepts, but we also explained a number of less significant primitives that could further aid and strengthen them. We also tied all these concepts together and showed how they can be utilized when specifying unified and intuitive programming languages. We showed how such a programming language can be made efficient, resemble popular object-oriented languages, and still retain good updatability.
6.2.2 Conclusion and Contribution
Taking the Updatable Virtual Architecture as the measure of our inventions, we consider how well we met the requirements for a dynamic updating system (see section 2.1.1, "Requirements"). We conclude that we achieved very high flexibility, high robustness, and virtually as good efficiency as in Java. Writing code for the programming language requires some, but not much, overhead work, and resembles the act of writing code for popular programming languages. Generating and applying update patches are largely automated tasks, so the ease of use is definitely satisfying. The deployability is the weakest link, but not even that is entirely poor. Supporting only a special purpose programming language would suggest that the deployability is poor, but the close link to Java and easy porting improve the deployability somewhat. The language is also more than a "special purpose" language, and could basically replace Java. In section 2.2.2, "Invoke Model vs. Interrupt Model", we explained how runtime checks that determine when updating can be performed are usually conservative or unreliable. Uva has exact, safe, simple and controlled update checks that enable updating to begin quickly and reliably. We explained how compulsory update points are selected in order to enable this behavior. There will often be many compulsory update points, but as a matter of fact, this does not mean that updating is cumbersome. Many of these update points only declare that updating may begin at that location, and do not need any mapping at all. Regarding the performance of Uva, we note that long-lived chains are typically not performance critical, because they are seldom executed. This means that they do not need to be optimized, and can memorize the control flow, initial values etc. Short-lived chains might potentially be executed frequently, and should therefore be optimized to achieve better performance. This would mean that they neglect memorizing control flow etc. After receiving an update request, all threads will switch to an "update ready" state, in which they should run non-optimized code. They have more than enough time to accomplish this, because the patch must first be loaded and prepared, and all threads then synchronized. Since only short-lived chains are optimized, all threads will soon be executing non-optimized code. This causes a temporary (but unnoticeable) slowdown, but enables good updatability and high performance to co-exist in Uva. Uva addresses many issues related to software evolution, updating and flexibility in general. Compared to most languages, Uva has unchallenged support for changes to the class hierarchy, package renaming and rearranging, class renaming, class replacing and more.
It also allows objects to be replaced with compatible objects in a uniquely powerful manner, and subroutines to be updated fairly easily.
In Uva, chain blocks do not share data on the stack, and because of this, certain algorithms cannot be compiled quite as efficiently as they could be for normal stack machines. The gain is that state memorization - as discussed in section 4.1.1, "What is Wrong?" - is reduced and updating is better supported. We also suggested a refined object-oriented access policy for accessing class and object fields. Not only did we stick to dynamic updating, but we also showed that EOP must be considered a serious alternative to AP, since they both achieve similar goals, but EOP does so in a much more efficient and user-friendly way. We suggested an alternative to multiple inheritance, which solves most of the problems haunting conventional multiple inheritance.
6.2.3 Future Work
First of all, the concepts that we have introduced are new and might have flaws. Version 1.0 of Uva has detailed, but not complete, specifications, and has not yet undergone enough testing. First and foremost, we must further develop, test and criticize the work presented in this paper. Sequence model theory and entity-orientation must be further analyzed, and the detailed specifications for Uva refined. We could consider implementing a pure AOP solution that would utilize, for example, sequence model theory. Such an approach could introduce some of the benefits of Uva to standard Java applications. If Uva proves useful and fully functional, then a revision of Uva could add support for e.g. distributed computing and real-time constraints. Such a revision could also make minor adjustments to the Uva specification and define a more complete standard API. Efficient optimizations for Uva are critical, and effort should be made to develop optimization techniques that are as efficient as possible. The interesting question is of course: how fast can we make it? As a follow-up to the previous suggestion for future work, the performance of an optimized UVM could be compared to the performance of a state-of-the-art JVM. Apart from Uva, more unconventional programming languages with support for the sequence model and entity-orientation could be invented. Such languages could for example support multiple return values and explicit chain block declarations. They could be graphical languages, or perhaps text based languages with support for viewing and designing graphical representations as well.
Appendix A
A.1 Glossary
Glossary 1,
active method: See Glossary 2.
Glossary 2,
active subroutine: A subroutine that is on the calling stack of one or more executing threads. This means that this subroutine is either currently executing, or has executed to some point, called another subroutine, and will continue executing when that subroutine finishes.
Glossary 3,
active class: A class that contains one or more active methods.
Glossary 4,
adaptive programming: A specialization of object-oriented programming, where propagation patterns are used instead of direct referencing.
Glossary 5,
agile software process: A software development process that encourages fast decisions and rapid implementation rather than exhaustive planning. The idea is to reduce risk and quickly be able to adapt to changes. Extreme programming is perhaps the most well known agile software process.
Glossary 6, aspect-oriented programming: A development practice where different aspects - such as efficiency and updatability - are modeled separately and then combined using an aspect weaver.
Glossary 7, binary method: A method that takes one or more instance arguments of the same type as the instance upon which the method was invoked. These methods are called binary because they take (at least) two such arguments, the self reference being one of them. This cannot be expressed in traditional object-oriented programming languages (C++, Java): subclasses either alter the method signature (identity) or cannot prevent incorrect arguments - instances of the superclass and of other subclasses - from being passed in. (See the sketch below.)
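The following Java fragment is our own illustration of the binary-method problem, using hypothetical Point and ColorPoint classes:

    class Point {
        int x, y;
        boolean sameAs(Point other) { return x == other.x && y == other.y; }
    }

    class ColorPoint extends Point {
        int color;
        // We would like "boolean sameAs(ColorPoint other)", but changing the
        // argument type changes the signature (an overload, not an override).
        // Keeping the signature means any Point may still be passed in, so a
        // runtime check is needed.
        boolean sameAs(Point other) {
            if (!(other instanceof ColorPoint)) return false;
            ColorPoint cp = (ColorPoint)other;
            return super.sameAs(cp) && color == cp.color;
        }
    }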
Glossary 8, bytecode: Code with an instruction size - excluding operands - of one byte. It is usually the "machine language" of some virtual processor, such as the Java Virtual Machine in the case of Java bytecode.
Glossary 9, continuation: An abstract entity representing the current state in the form of all instructions to be computed in the future.
Glossary 10, context switching: The act of freezing one thread or process and activating another one. A prerequisite for parallelism in a single-processor environment.

Glossary 11, dynamic typing: Denotes that type checking is performed dynamically instead of statically at compile time. Has the benefit of knowing "more" than the compiler knew, and hence allows certain typing rules not possible in static typing, but carries the penalty of being less reliable and less efficient.

Glossary 12, entity table: The entity-oriented variant of a virtual table. For maximum efficiency, classes can contain both a virtual table and an entity table.

Glossary 13, ephemeral / transient state: The opposite of persistent state - the subset of the full state that is temporary and need not be preserved during software updating.

Glossary 14, global state: The persistent state of an entire network of communicating nodes. Vice versa, the global state also affects the persistent state of individual nodes. Synchronization is usually needed to determine the global state.

Glossary 15, garbage collection: The act of implicitly freeing non-referenceable memory. This is usually done automatically, and is thus transparent to the user.

Glossary 16, Hungarian notation: A variable naming convention originally invented by Charles Simonyi at Microsoft. If followed, variables get prefixes that reveal what type they have, such as strName for strings and iVersion for integers. In many cases, this makes code easier to read and certain bugs easier to find.

Glossary 17, inter-process communication: The means by which processes communicate with each other. In practice, this stands for message passing, shared memory, pipes and semaphores.

Glossary 18, Law of Demeter: A coding convention for object-oriented programming. It says that objects should have only limited knowledge about each other and the object model. Each object should only talk to its "friends", and the motivation is to keep implementation details localized, ease maintenance and reduce information overload. Adaptive programming is one solution, but the most common solution is to restrict access over object boundaries to methods. This way, the instance fields of other objects cannot be accessed, and an object only has direct knowledge about its immediate neighbors. (See the sketch below.)

Glossary 19, loose typing: A type checking policy where variables are allowed to change data type dynamically. Mostly used in scripting languages.

Glossary 20, multiple dispatching: A more general form of method dispatching. The method to invoke is selected based on the method signature and all of its arguments. (The self reference is not privileged over other arguments.)

Glossary 21, persistent state: The subset of the full state that is required to survive crashes, process migration etc. Should be kept intact by any software updating system.
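As a small illustration of the Law of Demeter (Glossary 18), consider the following Java sketch with hypothetical Customer, Wallet and Shop classes:

    class Wallet {
        double money;
        double take(double amount) { money -= amount; return amount; }
    }

    class Customer {
        private Wallet wallet = new Wallet();
        Wallet getWallet() { return wallet; }                     // invites violations
        double pay(double amount) { return wallet.take(amount); } // Demeter-friendly
    }

    class Shop {
        void checkout(Customer c, double price) {
            // c.getWallet().take(price);  // violation: talks to a "stranger"
            c.pay(price);                  // talks only to an immediate neighbor
        }
    }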
Glossary 22, polymorphism: Usually refers to polymorphism through inheritance, which means that subclasses polymorph the superclass and can override its behavior.

Glossary 23, primitive data type: A data type with built-in functionality. Also called basic data type, and commonly refers to integers, floating points, characters, booleans and other simple data types.

Glossary 24, quiescence: The requirement that a module must be inactive if it is to be replaced. If quiescence is required, active methods, for example, cannot be updated.

Glossary 25, reachable state: A state that is reachable from the initial state of the executing application. Any state following a reachable state is also a reachable state.

Glossary 26, reentrancy: The ability to have the same code executed simultaneously in different threads (i.e. to be re-entered). Both recursion and reentrancy require that each call to a subroutine allocate memory for a new set of local variables.

Glossary 27, refactoring: The act of restructuring and heavily modifying source code in order to produce a better design or more maintainable and readable code. Has traditionally been seen as an undesired act resulting from inadequate planning, but is becoming widely accepted, particularly in conjunction with agile software processes.

Glossary 28, reverse Polish notation: The same as postfix notation. An expression notation in which the operands precede the operator. The infix expression 4 + 5 would in postfix be 4 5 +.

Glossary 29, silent naming: An entity-oriented policy, where the data types of variables are given a name, but that name may be anything and is not strictly type checked.

Glossary 30, single dispatching: The normal instance method dispatching policy, invoking the method on some object (the self reference). The method to invoke is selected based on the type of that object and the signature of the method. (See the sketch below.)

Glossary 31, static typing: Denotes that type checking is performed statically, at compile time. Has the benefit of detecting typing errors as early as possible, but is not flexible enough to support all programming styles and languages.

Glossary 32, strict typing: A type checking policy where the data type of variables is strictly type checked, and may not change. Commonly used in serious programming languages, since it results in more reliable code.
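The difference between single and multiple dispatching (Glossary 20 and 30) can be seen in the following Java sketch, with hypothetical Shape, Circle and Renderer classes:

    class Shape { }
    class Circle extends Shape { }

    class Renderer {
        void draw(Shape s)  { System.out.println("shape");  }
        void draw(Circle c) { System.out.println("circle"); }
    }

    class Demo {
        public static void main(String[] args) {
            Shape s = new Circle();
            // Java selects among overloads using the static type of the
            // argument, so this prints "shape". A multiple dispatching
            // language would use the runtime type and select draw(Circle).
            new Renderer().draw(s);
        }
    }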
Glossary 33, tail-recursive optimization: An optimization where (usually) the compiler removes recursion from recursive subroutines that need not compute anything after the recursive call. Allows certain recursions to be as efficient as iteration. (See the sketch below.)

Glossary 34, universal polymorphism: A generalization of polymorphism, in which the polymorphism applies to an infinite number of types that share some common feature.

Glossary 35, Uva: A programming environment including a programming language and a virtual machine. Could be seen as a revision of Java, with improvements for - among other things - updatability.

Glossary 36, Uva source language: The programming language used for writing Uva software. It is the link between the developer and the Uva environment, and corresponds to the Java high level language.

Glossary 37, virtual method table: An array of pointers to all virtual methods in a class. Each virtual method is assigned a fixed index in this table, and objects call virtual (instance) methods using these indexes. Each class defines its own v-table, thus enabling polymorphism by inheritance. In Java, all methods are virtual (but superclass methods can still be called using non-virtual invocation). In C++, methods are non-virtual by default, because invocation is then more efficient.

Glossary 38, virtual machine: An abstract, non-physical computer that only exists in the form of a specification and software implementations.
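To illustrate tail-recursive optimization (Glossary 33), the following Java sketch of our own shows a tail-recursive method and the loop that a TRO-capable compiler could reduce it to; javac and the JVM do not perform this optimization themselves, so the transformation is written out by hand:

    class Factorial {
        // Tail-recursive: nothing remains to be computed after the recursive call.
        static long fact(long n, long acc) {
            if (n <= 1) return acc;
            return fact(n - 1, n * acc);
        }

        // The iteration that tail-recursive optimization would effectively produce.
        static long factLoop(long n) {
            long acc = 1;
            while (n > 1) {
                acc *= n;
                n--;
            }
            return acc;
        }
    }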
A.2 Definitions

Definition 1, abstract state machine: An ASM M is a finite set of rules for guarded state updates. Executing one step of M at a given state A will result in another state A', where the state updates for all guards that evaluated to true have been performed in parallel. Executing an ASM means executing it step by step, until no guards evaluate to true anymore.

Definition 2, complexity of a modification: The complexity of a modification is not determined by the scale of the modification, but by the effect it has upon the global state.

Definition 3, difficulty of mapping: The difficulty of specifying a mapping between two software versions is directly dependent on how closely related those two versions are.

Definition 4, sequence: Any collection of sequential code, having one single entry point and multiple exit points, and for which every instruction is executed at most once.
Definition 5, sequence chain: A chain of sequences and sequence chains. (Definitions 4 and 5 are illustrated in the sketch below.)

Definition 6, validity: Replacing program Π in process P with Π' at a given instant in time is a valid update if P is guaranteed to reach a reachable state in some permutation of Π' in a finite amount of time. A permutation of Π' is Π' run from some modified reachable state (and thus having its own reachable states), but behaving as intended.
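To make Definitions 4 and 5 concrete, the following Java fragment - our own illustration, under our reading of the definitions - marks the sequences in a small subroutine:

    static int sumTo(int n) {
        int sum = 0;               // sequence S1: single entry point, every
        if (n < 0) return 0;       // instruction executed at most once; this
                                   // return is one of S1's exit points
        for (int i = 1; i <= n; i++) {
            sum += i;              // loop body S2: itself a sequence, but it
        }                          // may run many times, so the loop as a
                                   // whole is not a sequence - S1, S2 and S3
        return sum;                // form a sequence chain; S3 ends the routine
    }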
A.3 Abbreviations

ALU:      Arithmetic-Logic Unit
AOP:      Aspect-Oriented Programming
AP:       Adaptive Programming
API:      Application Programming Interface
ASM:      Abstract State Machine
CISC:     Complex Instruction Set Computer
CDC:      Connected Device Configuration
CFG:      Control Flow Graph
CLDC:     Connected Limited Device Configuration
DoS:      Denial of Service
e-table:  Entity table
EOP:      Entity-Oriented Programming
GC:       Garbage Collector
GNU:      GNU's Not Unix (recursive acronym)
HN:       Hungarian Notation
IEEE:     Institute of Electrical and Electronics Engineers
IPC:      Inter-Process Communication
Jini:     Jini Is Not Initials (anti-acronym reflecting the fact that officially, it is not an abbreviation)
JIT:      Just-In-Time (JIT compiler, JIT compilation)
JVM:      Java Virtual Machine
LoD:      Law of Demeter
OOP:      Object-Oriented Programming
OS:       Operating System
PC:       Program Counter
PDA:      Personal Digital Assistant
PVM:      Portable Virtual Machine
RISC:     Reduced Instruction Set Computer
RPC:      Remote Procedure Call
RPN:      Reverse Polish Notation
SML:      Standard Meta Language
TRO:      Tail-Recursive Optimization
USL:      Uva (see the next abbreviation) Source Language
Uva:      Updatable Virtual Architecture
UVM:      Uva (see the previous abbreviation) Virtual Machine
v-table:  Virtual method table
VLIW:     Very Long Instruction Word
B Appendix B

B.1 Pseudo Code Semantics

Pseudo code is by nature inexact, and should also be so. The power of pseudo code is easy readability, expressiveness and platform independence. Nevertheless, even for pseudo code, part of the semantics must be exact and fully understood by the reader. All such semantics is listed below, and - as far as possible - established notations are used.

Variables:
var          A local variable.
T.var        A class variable declared in class T.
t.var        An instance variable found in object t.
type var     Declaration of a new variable with a given type.
(type)var    A variable, cast to a given type.
Subroutines:
f(...)       A local subroutine.
T.f(...)     A static method declared in class T.
t.f(...)     An instance method accessed from object t.
Expressions:
val          Evaluates to the given fixed value.
var          Evaluates to the value of the given variable.
f(...)       Evaluates to the value returned after calling subroutine f.
(expr)       Evaluates to the expression in parenthesis.
expr+expr    Evaluates to the sum of two evaluated expressions.
expr-expr    Evaluates to the difference between two evaluated expressions.
expr*expr    Evaluates to the product of two evaluated expressions.
expr/expr    Evaluates to the quotient of two evaluated expressions.
Conditions:
true                   Always true.
false                  Always false.
expr                   The result of an expression that must be evaluated to true or false.
expr=expr              Evaluates and compares two expressions for equality.
expr<expr, expr≤expr   Evaluates two expressions and compares if the left hand side is less than (or equal to) the right hand side.
expr>expr, expr≥expr   Evaluates two expressions and compares if the left hand side is greater than (or equal to) the right hand side.
(cond)                 Evaluates to the condition in parenthesis.
¬cond                  Negation. The logical inverse of an evaluated condition.
cond ∧ cond            Logical And. True if both conditions are true.
cond ∨ cond            Logical Or. True if either one (or both) of the conditions are true.
Statements:
expr          Calculates an expression.
cond          Calculates a condition.
var:=expr     Assigns a value to a variable, by calculating an expression.
stmt;stmt     Executes two statements sequentially.
stmt||stmt    Executes two statements in parallel.
[cond]/stmt   Shorthand conditional statement. Evaluates a condition and executes a statement if evaluated to true.
Additional High Level Statements:
if cond then stmt fi          Evaluates the condition and executes the statement if evaluated to true.
if ... elif ... else ... fi   Shortened version of nested conditional statements, where only one of the statements will be executed.
while cond do stmt end        Performs the specified statement as long as the condition evaluates to true. (Re-evaluates the condition only after having executed the whole statement.)
new T(...)                    Instantiates class T and calls its constructor with the given actual parameters.
define f(...) stmt end        Defines a macro or subroutine.

Additional Low Level Statements:
jmp line                      Jumps unconditionally to the given line.
jc cond, line                 Evaluates the condition and jumps to the given line if it is true.
jnc cond, line                Evaluates the condition and jumps to the given line if it is false.
Other:
var:val
type=

C Appendix C

C.1 [Scan-line polygon fill: Java implementation and Uva chain-based port]
C.2 Fibonacci Terms

This is a standalone application that computes Fibonacci terms, and is hence a real-life implementation of the example discussed in chapter 4. It was originally written for J2ME,
and has been directly ported to Uva, which is specified in chapter 5 and [Öst03]. This means that the port uses the J2ME API and is mainly intended for comparing Uva to Java.

C.2.1 J2ME Implementation

package fibonacci;

import java.io.*;
import javax.microedition.io.*;

class Fibonacci {

    /**************************************************************************/
    /* Main method
    /**************************************************************************/
    public static void main(String args[]) {
        System.out.println(
            "Fibonacci demonstration:\n" +
            "========================\n\n" +
            "Enter which term (>= 1) in the Fibonacci series that you want to\n" +
            "calculate. Add an initial plus sign to have every intermediate term\n" +
            "calculated as well. Write 'Q' or press Ctrl+D or Ctrl+C to exit.");

        StreamConnection  sc  = null;
        InputStream       is  = null;
        InputStreamReader in  = null;
        OutputStream      os  = null;
        PrintStream       out = null;
        String            strLine;
        try {
            // Communication port "stdio" is not officially supported in J2ME CLDC,
            // but is used in PJVM to access standard input and output.
            sc  = (StreamConnection)Connector.open("comm:stdio");
            is  = sc.openInputStream();
            in  = new InputStreamReader(is);
            os  = sc.openOutputStream();
            out = new PrintStream(os);

            while (null != (strLine = prompt(in, out))) {
                boolean bIntermediate = false;
                strLine = strLine.trim();
                if (strLine.length() > 0) {
                    switch (strLine.charAt(0)) {
                        case '+':
                            bIntermediate = true;
                            strLine = strLine.substring(1);
                            break;
                        case 'q':
                        case 'Q':
                            return;
                    }
                }
                try {
                    int count = Integer.parseInt(strLine);
                    if (bIntermediate) {
                        for (int i = 1; i < count; i++) {
                            out.println(i + ": " + term(1, 1, i));
                        }
                    }
                    out.println("Term " + count + " equals " + term(1, 1, count));
                }
                catch (NumberFormatException nfe) {
                    out.println("Invalid input, use: [+]positive_integer");
                    continue;
                }
                catch (IllegalArgumentException iae) {
                    out.println("Invalid input, use: [+]positive_integer");
                    continue;
                }
            }
        }
        catch (IOException ioe) {
            System.err.println("An input / output error occurred:" + ioe);
        }
        finally {
            try {
                if (null != is) is.close();
                if (null != sc) sc.close();
            }
            catch (IOException ioe2) { }
        }
    }

    /**************************************************************************/
    /* Prompt for input
    /**************************************************************************/
    private static String prompt(Reader in, PrintStream out) throws IOException {
        out.print(">");
        int iChar = in.read();
        if (iChar < 0) {
            return null;
        }
        String strLine = "" + (char)iChar;
        while (0 ...

C.2.2 Uva Implementation

...
                > 0
                    select(strLine[0])
                        '+'
                            bIntermediate = true
                            strLine = strLine.substring(1)
                        else 'q' || 'Q'
                            throw new AbortException()
                parseAndCalculate(strLine, bIntermediate)
            catch NumberFormatException nfe
out.println("Invalid input, use: [+]positive_integer") return catch IllegalArgumentException iae out.println("Invalid input, use: [+]positive_integer") return void parseAndCalculate(string strCount, bool bIntermediate) var int count entry count = Integer.parseInt(strCount) select(bIntermediate) loop printIntermediate() < count out.println("Term " + count + " equals " + term(1, 1, count)) void printIntermediate() entry // ic = the invocation count out.println(ic + ": " + term(1, 1, ic)) /******************************************************************************/ /* Prompt for input /******************************************************************************/ protected static string prompt(#Reader in, #Printer out) throws IOException var int iChar string strLine = "" entry out.print(">") iChar = in.read() select(iChar) < 0 return null loop concatenate() > 1 && < 1000 return strLine
    void concatenate()
        pre(iChar) >= 0 && != '\n'
            iChar = in.read()
        entry
            strLine += (char)iChar

    /**************************************************************************/
    /* Calculate a fibonacci term
    /**************************************************************************/
    protected static int term(int a, int b, int count)
        entry
            select(count)
                < 1
                    throw new IllegalArgumentException("Variable count (=" + count + ") must be >= 1!")
                else 1
                    return a
                else
                    reentrant return term(b, a + b, count - 1)

end // Fibonacci
C.3 The Dining Philosophers

This standalone application is a solution to the classic synchronization problem formulated by Edsger W. Dijkstra (1930 - 2002). The Java implementation has been directly ported to Uva (specified in chapter 5 and [Öst03]); the port thus uses the Java API and is mainly intended for comparing Uva to Java.
C.3.1 Java Implementation

package philosophers;

import java.util.*;

// Main class
public class DinnerTable {
    protected static boolean[] forks;
    protected static Object[]  keys;
    private static Random      rand = new Random();
    /**************************************************************************/
    /* Main method
    /**************************************************************************/
    public static void main(String args[]) {
        System.out.println(
            "Dining Philosophers demonstration:\n" +
            "==================================\n");

        int n = 1;
        if (args.length != 1) {
            System.out.println("Needs exactly one integer parameter!");
            return;
        }
        try {
            n = Integer.parseInt(args[0]);
        }
        catch (Exception e) {
            System.out.println("Needs an integer parameter!");
            return;
        }

        System.out.println("Press Ctrl + C to exit.");
        forks = new boolean[n + 1];
        keys = new Object[n + 1];
        keys[0] = forks; // may be anything
        for (int i = 1; i