Uniform Versioning: The Change-Oriented Model

Bjørn P. Munch, Jens-Otto Larsen, Bjørn Gulla, Reidar Conradi
Norwegian Institute of Technology, Trondheim, Norway
(Detailed address: Dept. of Computer Systems and Telematics, Norwegian Institute of Technology (NTH), N-7034 Trondheim, Norway. Email: [email protected].)

Even-André Karlsson
EP Telecom Q-Labs, Lund, Sweden
(Detailed address: EP Telecom Q-Labs, IDEON, S-22370 Lund, Sweden. Email: [email protected])

April 22, 1993

Abstract

This paper discusses aspects of the Change Oriented Versioning (COV) model, which provides a uniform version concept rather different from the more traditional models. Its main goal is to give users and administrators a more flexible, yet simpler, way to manage changes in large and complex sets of data.

In addition to a description of our model, we here add some simple guidelines about how we think it will be used, and describe some of our recent work, not previously presented. We also discuss briefly its use in our work on cooperating transactions. One recent breakthrough in COV is that we have now designed and implemented a versioning “engine” which is independent of the data model being used.

1 Introduction

The introduction of databases into new application areas like office information systems (OIS), computer aided design and software engineering environments has introduced some new and interesting requirements on database technology. These requirements are mainly connected to the increased size and complexity of the stored objects. The relationships between objects are also more complex in these applications. For these applications it could be more natural to focus on the changes done to a set of objects as a logical entity in itself, rather than just on the resulting objects. A version of a set of objects will then be denoted by a set of changes. The advantage of this view can be exemplified for OIS, where a set of objects could be a set of interrelated documents.

These documents are often changed by a number of people through their lifetime, and the logical change will often consist of many physical changes to related objects and relationships. A group of physical changes constituting a logical change could represent e.g. a functional enhancement, a comment, a suggestion... There is also a need to recreate the documents as they were, and to include and exclude logical changes on an individual basis.

Making logical changes to these documents can take several weeks, and there is a need to allow parallel work. There must also be a possibility to reconcile this work at a later stage, and to decide who shall see the changes while they are in progress. During their work, each user needs to be able to establish a workspace (sub-database), and to have a consistent view of the whole set of documents. Some kind of cooperation between the different sub-databases should be supported.

Thus we will try to focus on the logical change as a primary concept in the evolution of the database, and regard the resulting objects as a byproduct of applying these logical changes.

The rest of this article is organized as follows. Section 2 explains the basic concepts of change orientation, i.e. option, ambition, choice and visibility. Section 3 gives some guidelines about how we think COV will be used, and how its use can be made easier. Section 4 explains the independence of COV from the data model, and gives a brief overview of our database implementing it. Section 5 describes one application of COV: cooperating transactions. Section 6 lists the implementation history. Section 7 discusses related work that has some similarity to ours, and section 8 gives a short conclusion and indicates some further work.

2 Basic Concepts

Change Oriented Versioning, or COV for short, was first introduced in [Hol88], and is an alternative to the more traditional versioning principle, which we may call Version Oriented Versioning (or VOV for short).

Instead of explicitly registering versions of objects, COV will register the logical changes, and versions are the result of applying a set of such changes. The changes are orthogonal
to object selection, i.e. they apply to the whole database, not to individual objects. (Footnote: We use “object” here in a very general meaning, as any identifiable data item.)

Note the word logical above; there is no direct mapping between logical changes, which modify the properties of objects, and the physical changes that were actually performed to implement them. This is central to the COV “philosophy”.

2.1 Definitions

Option. An option is a boolean variable representing a logical change. This logical change focuses on the actual properties of the objects as they exist in the database. An option value TRUE means the change is to be included, FALSE means it is to be excluded. An option may also be left unspecified.

An option has global scope, i.e. it is applicable to the database as a whole. New options may be added at any time.

The set of selectable options forms what we call the version space, which has one dimension per option.

Choice (for reading). In order to see a particular version of the database, the user or application will give a truth value to the relevant options, defining the logical changes that are to be included/excluded. One of the major advantages of COV is that this is only done once for the whole database, and then COV will create a view consisting of the corresponding version of each individual object.

This is a central point of COV. We establish a mapping which goes like this:

  $\text{version selection} \rightarrow \text{object selection}$

While traditional systems typically have:

  $\text{object selection} \rightarrow \text{version selection}$

In other words, we set up a global, one-version view (a “slice”) of the complete database contents, before we start selecting the set of data to work with. The main advantage of this method is that object selection is much simpler, since there is only one version of everything to choose from.

A complete choice is a choice which includes enough option settings to uniquely identify one version of every item in the database. A non-complete choice can be made complete by assigning an implicit interpretation of FALSE to those options not explicitly mentioned. A complete choice is then a particular grid point in the version space.

The choice can be seen as a read filter through which one particular version slice of the database can be seen. Note that this not only decides what version of objects to see, but also which objects exist in the first place.

Ambition (for writing). A physical change to the database will be marked with an ambition, which is also a set of option/value bindings, as the choice is. These options indicate for which versions of the database the physical change is to apply. All versions that include the option bindings given by the ambition will be modified. The fewer options that are specified, the more general the change will become.

An ambition is equivalent to a (connected) region within the version space, having one dimension for every option not specified. Including an extra option setting in the ambition cuts it in half. We usually draw the ambitions in two-dimensional Venn diagrams when making illustrations.

It will usually be best in practice not to make changes immediately available upon each write operation. The writes may be performed within a long-duration transaction, and be made available when the transaction commits; see Section 5.1. However, COV itself does not depend upon any such facility.

The ambition and choice together constitute the version context; every access to the database is done within such a context.

These three concepts are really all that is needed to define the fundamental principles of COV. However, the workings of COV may be a little easier to grasp if we also introduce the notion of visibility:

Visibility. This is a logical expression over options, and is attached to every single fragment in the database. Given the option/truth value bindings of a choice, a visibility should evaluate to either TRUE or FALSE. This value indicates whether the fragment is to be included in the view, or not.

A choice will in effect map to a subset of the visibilities: those which evaluate to TRUE. The database view consists of all fragments whose visibility is in this subset.

Updates and deletions work by modifying this visibility, insertions by adding a visibility to the fragment.

2.2 Doing Physical Changes

We will here give a brief description of how physical changes can be performed in a COV database.

$C$ is a choice, $V$ is a visibility, $A$ is an ambition and $F$ is a database fragment of some kind. We use the general term “fragment” as a common denominator for all kinds of “data items”, so that our description is independent of data model.

The ambition is, as noted, a list of option settings, and so is the choice. The choice must be within the ambition, which we could express like this:

  $C \Rightarrow A$   (1)

What this really means is that the choice we have when doing modifications must include at least all the option settings specified in the ambition. Otherwise, we will simply not see what we do.

For a fragment to be visible, its visibility must be true given
the choice, i.e. the choice must be within the visibility:

  $C \Rightarrow V$   (2)

Or, with the choice implicitly expanded:

  $C \wedge C' \Rightarrow V$   (3)

where $C'$ is the conjunction of the negations of all options not specified in the choice. The latter form is the one we use; it removes possible ambiguities.

When we change our fragments in the database, we want our modifications to be applied for all choices (versions) within our ambition. We denote a fragment $F$ with visibility $V$ as $(V, F)$. There are three different forms of write operations:

1. Deletion. A fragment is to be deleted.

  $(V, F) \rightarrow (V \wedge \neg A, F)$   (4)

2. Insertion. A new fragment is to be inserted.

  $\rightarrow (A, F_{new})$   (5)

3. Update. A fragment is to be updated/replaced.

  $(V, F_{old})^{+} \rightarrow (V \wedge \neg A, F_{old})^{+} \cup (A, F_{new})$   (6)

(Footnote: The single new fragment with the ambition as visibility is a simplification; in principle, a more elaborate visibility expression should be used. The implementation details making this simplification possible are outside the scope of this paper.)

The $+$ in equation 6 indicates there may be more than one “old” version of the fragment $F$.

One can see from the above equations that a later read from the database will “see” these changes if and only if the new choice includes the option bindings of (and therefore implies) the ambition $A$. One can also see that if the “same” fragment is only allowed to be inserted once, more than one version of it will never be visible for the same (complete) choice; we will not get ambiguities in version selection. (Footnote: We may allow a re-insertion, if preceded by a deletion.)

Note that these equations should not be taken too literally; COV will behave as if it happened this way, though the implementation might be (in fact, is currently) a bit different.

2.3 COV as a Filter

As mentioned in Section 2.1, COV can be seen as a two-way filter, where the choice affects the read operations, and the ambition affects the write operations.

We try to illustrate this in figure 1. On the left is a set of versions of the same fragment, each with an attached visibility. On the right is a read, followed by a write (more specifically, an update), of this same fragment. The filters are placed in the middle.

[Figure 1: COV seen as a filter. Versions of the same fragment, with visibilities v1 to v6, meet the choice filter (read) and the ambition filter (write); the visible fragment is the one with vi = TRUE, and the new value is stored with visibility vx = ambition.]

The choice filter is the easiest to explain. If the choice is complete (or in other words, the filter is “narrow” enough), at most one version of the fragment will pass through, and will be read. In the example, this is fragment number 4. The choice filter will also strip the fragment of its attached visibility, as this is not part of the data.

The update operation will use the ambition filter, which is wider than the choice filter. The new value of the fragment passes through, and picks up a visibility on its way; this visibility is modified by the ambition. The path is indicated by a “full” arrow. As mentioned in section 2.1, some of the changes may be delayed until commit of the transaction doing the write; the end result will be as described here.

In addition, some of the old versions have to be modified, as defined in equation 6. These are the versions that we can “see” through at least one possible choice within the ambition. In addition to version 4, which we also see through the current choice filter, versions 3 and 6 can be seen in this example. The arrows indicate that they will be touched.

2.4 Uniform Versioning

The major strong point of COV is its ability to version everything through one operation, namely the setting of the choice. This means the user does not have to apply different mechanisms to different kinds of data, or to perform version selection at several points.

One example of the successful use of uniform versioning is our own Process Management prototype [COWL91], [CJM+92]. Here, COV provides versioning not only of normal data (including source code), but also of metadata such as type properties, and of the processes themselves. (Footnote: This is made simpler since our database stores type representations in “normal” objects, but it will work with any representation.)
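
The read filter of Section 2.3 and the three write operations of Section 2.2 can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the EPOSDB-II implementation: the names `CovStore`, `Version`, `within`, `exclude` and `ambition_as_visibility` are hypothetical, a visibility is modelled as a Python predicate over a choice, and unspecified options are read as FALSE, as in the expanded form used above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Bindings = Dict[str, bool]               # option name -> TRUE/FALSE
Visibility = Callable[[Bindings], bool]  # evaluated against a choice


def within(choice: Bindings, bindings: Bindings) -> bool:
    """True if the choice implies every binding; options missing from
    the choice are read as FALSE (the implicit expansion of the choice)."""
    return all(choice.get(opt, False) == val for opt, val in bindings.items())


def ambition_as_visibility(ambition: Bindings) -> Visibility:
    """Visibility given to a newly written fragment: the ambition itself."""
    return lambda choice: within(choice, ambition)


def exclude(vis: Visibility, ambition: Bindings) -> Visibility:
    """Build V AND NOT A, as used by deletion and update."""
    return lambda choice: vis(choice) and not within(choice, ambition)


@dataclass
class Version:
    visibility: Visibility
    value: str


class CovStore:
    """Hypothetical toy store: fragment identity -> versions of that fragment."""

    def __init__(self) -> None:
        self.fragments: Dict[str, List[Version]] = {}

    def read(self, key: str, choice: Bindings) -> Optional[str]:
        # Choice filter: keep versions whose visibility is TRUE, drop the visibility.
        hits = [v.value for v in self.fragments.get(key, []) if v.visibility(choice)]
        if len(hits) > 1:
            raise RuntimeError("ambiguous version selection")
        return hits[0] if hits else None

    def delete(self, key: str, ambition: Bindings) -> None:
        for v in self.fragments.get(key, []):
            v.visibility = exclude(v.visibility, ambition)

    def insert(self, key: str, value: str, ambition: Bindings) -> None:
        self.fragments.setdefault(key, []).append(
            Version(ambition_as_visibility(ambition), value))

    def update(self, key: str, value: str, ambition: Bindings) -> None:
        self.delete(key, ambition)          # old versions: V AND NOT A
        self.insert(key, value, ambition)   # new version: A


store = CovStore()
store.insert("greeting", "hello", {})                  # empty ambition: all versions
store.update("greeting", "hello, world", {"gui": True})
print(store.read("greeting", {}))               # hello
print(store.read("greeting", {"gui": True}))    # hello, world
```

Expressing update as a deletion followed by an insertion under the same ambition keeps at most one version visible for any complete choice, which is the property the text relies on to avoid ambiguous version selection.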

Just as the set of available objects may be different for different choices, so may the set of available types and processes/process models.

3 How to Use COV

In section 2.1, COV is defined through three concepts: Option, Choice and Ambition. These three, and the definitions given for them, are really all the user should have to know about the versioning.

How, then, is the user going to use these? We envision COV being used in a multi-user environment with large amounts of data, and where “our” user may want to do some modifications. Whether the main data repository is maintained by a centralized DBMS, or is distributed, is not important here, but we will assume that everything is subjected to COV versioning.

The user will have to make decisions like:

- When to introduce a new option.
- How to keep track of the options, and what they mean.
- How to decide what version choice to use.
- Before starting a job, how to set a correct ambition for it.

3.1 Options

An option denotes some property of the product, that it may or may not have. When specifying a choice, the user basically selects among these properties: which ones (s)he wants, does not want, or doesn’t care about. Options should be created with this in mind.

Typical examples for software could be options for various operating systems/hardware platforms, graphical interface or not, optimized for speed or not, national language support, etc.

Enhancements and fixes may also be designated by options, so that it is possible for the user to decide whether or not (s)he wants each of them included. In fact, any such enhancement that it should be possible to selectively include/exclude must be identified by an option.

Over time, this will obviously lead to a larger and larger number of options for the user to select from. We realize this problem, and are working on solutions that will reduce the complexity as perceived by the user; see Sections 3.2 and 8.2.

To aid the user in his/her selections, some additional information may be attached to each option, like when and by whom it was introduced, and what its intended semantics is. In the simplest form, it could be a simple text.

3.2 High Level Descriptions

In [GKY91], a number of mechanisms for more high level version descriptions are described, all of which are intended to make the version selection easier and/or more correct for the user. These include:

Validities. Used either to attach status properties to certain versions, or to “freeze” them, making further changes impossible. The former can be used to let the user explicitly ask for, e.g., module-tested versions. The latter will prevent changes to already released versions.

Constraints. These specify restrictions on the combinations of certain options. The most useful ones are mutual exclusiveness, and implications. See also Section 8.2.

Preferences. These are weights (positive or negative) attached to options, indicating how much you want/don’t want them. They are used by the heuristics to guide the search for a choice.

Aggregates. These are simply names attached to other version descriptions, to create higher level structures.

Defaults. These are settings attached to particular projects or tasks. These are functions of the environment, rather than of the COV system itself.

3.3 Ambitions

It is important to understand what the ambition is all about, as setting it correctly is important for keeping the database consistent with regard to the intended semantics of the options.

Ideally, most ambitions should consist of only one option setting (through knowledge about dependencies between options, more may be added behind the scenes by the system; see Sections 3.2 and 8.2). This option identifies the functionality to be implemented by the change we are about to do.

However, the possibility of specifying more than one option in the ambition is central to COV. You will need to do this whenever the combination of two changes needs some extra modifications to be semantically correct.

For example, you may have extended a program to provide a graphical interface in addition to a textual one, identified by the option gui. Independently of this, you add some local language support, identified by option langu. When you later specify a choice of gui ∧ langu, you will see a program with both changes merged. You may perhaps notice that the graphical part does not yet have local language support.

This is natural, since both changes had to be done from the base version of the program. The way to fix it is also quite natural: You need to do some additional changes for the combined version; you do it with an ambition of gui ∧ langu.

After these extra changes have been completed, they will automatically be included in any version created by setting both
options to TRUE. The strong point of COV is that this happens completely transparently to the user, and at no additional cost. The new changes become an integrated part of the versioned database, and do not have to be merged explicitly when the corresponding versions are read at a later date.
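
The gui/langu scenario can be traced with a short sketch. It is a hedged illustration only: the `Fragment` class, the `view` function and the fragment names `messages` and `dialog` are hypothetical, each version simply records the ambition it was written under together with the ambitions of later overriding writes, and unset options are read as FALSE.

```python
from typing import Dict, List, Optional, Tuple

Bindings = Dict[str, bool]  # option name -> TRUE/FALSE


def within(choice: Bindings, bindings: Bindings) -> bool:
    """True if the choice includes all bindings; unset options count as FALSE."""
    return all(choice.get(opt, False) == val for opt, val in bindings.items())


class Fragment:
    """Hypothetical fragment holding every version written to it."""

    def __init__(self) -> None:
        # (ambition written under, ambitions of later overriding writes, value)
        self.versions: List[Tuple[Bindings, List[Bindings], str]] = []

    def write(self, value: str, ambition: Bindings) -> None:
        for _, overridden_by, _ in self.versions:
            overridden_by.append(ambition)      # old visibility: V AND NOT A
        self.versions.append((ambition, [], value))

    def read(self, choice: Bindings) -> Optional[str]:
        for written, overridden_by, value in self.versions:
            if within(choice, written) and not any(
                    within(choice, a) for a in overridden_by):
                return value
        return None


def view(db: Dict[str, Fragment], choice: Bindings) -> Dict[str, str]:
    """One-version slice of the whole database for a given choice."""
    slice_ = {name: frag.read(choice) for name, frag in db.items()}
    return {name: value for name, value in slice_.items() if value is not None}


db = {"messages": Fragment(), "dialog": Fragment()}
db["messages"].write("English messages", {})                        # base version
db["dialog"].write("graphical dialog, English labels", {"gui": True})
db["messages"].write("localized messages", {"langu": True})

both = {"gui": True, "langu": True}
before_fix = view(db, both)   # merged, but the dialog is still in English

# The extra change for the combination is written under the ambition gui AND langu.
db["dialog"].write("graphical dialog, localized labels", both)
after_fix = view(db, both)
gui_only = view(db, {"gui": True})   # unaffected by the combined fix
```

In this sketch the fix written under the two-option ambition is picked up by every later choice that sets both options, while a choice of gui alone still sees the English dialog and a choice with neither option sees no dialog at all.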

3.4 Choice

In addition to the restriction on the choice given in equation 1, the user may want to add additional option bindings to define what version to “see” out of all the possible versions affected by the changes. First of all, the version descriptions mentioned above, in section 3.2, may be of help. We can also give some general rules about options to include in a choice.

The user may have different objectives. If you want a particular version for delivery, then the validities mentioned above will be useful, to ensure that a stable version is selected. In addition, you should of course include (if appropriate) options for the particular environment the product is intended for, and for any additional facilities requested.

On the other hand, if some new functionality (or combination of functionalities) is to be implemented, you should be careful about adding any other options to your choice, as you may lose some of the generality of your solution.

While COV always will be able to merge any combination of changes for you, the results cannot be semantically guaranteed. You may want to use the validities to mark those combinations that have been tested, and fixed if necessary.

4 Data Model Independent COV

The basic COV principles can be applied to any kind of data model, but the practical consequences will vary. Two important questions that have to be answered before you can apply COV within a data model are:

- What are the “things” (here called fragments) that may exist in different versions? What has identity? The latter is important, as we cannot update something without some kind of identity, so that we can talk about different versions of the “same”.

- What is the versioning granularity? This is not the same question as the above; we now ask not what will exist in different identifiable versions, but how small changes we can separate out, and combine together.

To take an object oriented model as an example: If the granularity is at the level of the whole object, then a change would in effect be a complete replacement of the object. If we then do two independent changes to an object, each modifying a different attribute, and then attempt to combine these two changes, we will end up with only one of them as a result. On the other hand, if the granularity is a single attribute, we will achieve the probably desired result: an object having both attributes modified. This is because each change is now smaller in extent, and they do not overlap.

In addition to these basic questions, there may be some design choices possible within each data model.

It is important to note that it is possible to build a generic COV engine which works without knowledge of the data model and the answers to the questions mentioned here. The latest revision of our own EPOSDB-II indeed has such a COV engine embedded. Its only current requirement is that data be retrieved in fixed-size records from basic access functions.

4.1 Case: the EPOS DB

Our database is a single server process with clients communicating over RPC. We will give a brief description of the design, to illustrate how COV is embedded in the system.

In addition to versioning, the EPOSDB-II provides long, non-serializable transactions, as detailed in section 5.1.

4.1.1 Overall Design

The layer structure of the EPOSDB-II server is shown in figure 2. We include all of the layers for completeness, but will only describe those relevant for our context here.

[Figure 2: The layering in the server, and the client. Client: Application, (PM Layer), Client (RPC). Server: RPC, Data Model, Transactions, Versioning (COV), Low Level Record Handling, C-ISAM, UNIX.]

The lack of layer independence was the main problem of the implementation in the first EPOSDB-II. In the current implementation, however, we have achieved this goal to a large degree.

4.1.2 The Layers

Low Level Storage. This layer handles the basic read/write of records in files. All access to the data files will go through the interface of this layer, to ensure consistency. Cursor
management is performed here. Data is handled in the form of records, where each layer interprets only its “own” parts.

Versioning Layer. The versioning layer acts as a COV filter (as described in section 2.3), both for reads and for writes. All normal data access goes through this filter.

It acts as a pure COV filter, in that it functions independently of transaction or data model; it sees data only as records with a visibility.

Transaction Layer. The transaction layer filters data by transaction “ownership”, filtering out irrelevant records when reading and putting written records into the correct transaction sub-database.

The transaction manager has only a vague idea that versioning is taking place. For the transaction manager, the ambition and choice are only “gray pointers” which are passed on to the versioning layer. The versioning layer, in return, does not really understand “transactions” as such, but treats the ambition/choice purely as a COV context. This de-coupling contributes to the independence of the layers.

Data Model Layer. This layer knows the details of the EPOSDB-II data model. It understands the difference between entities and relationships, the meaning of OIDs, and the principles of subtyping and inheritance. It also knows how to map attributes into flat records.

It “knows” about the existence of transactions, and that versioning is taking place, but has no idea about how these are implemented and what kinds of algorithms and internal structures are being used.

Only the data model layer will have access to the internal representation of the type descriptors. In fact, the lower layers do not even know that there are types, only that there are different data files with records in them.

The Process Model Layer. On top of the data model layer, in the client, there is a “soft” extension of the data model to accommodate process modeling. This consists of an object-oriented SPELL language [CJM+92] to express procedures, triggers and tasks, and related translators and interpreters. The PM layer is not part of the EPOSDB-II proper, and applications can also be written without using it.

4.2 Versioning Related to the Data Model

The EPOSDB-II has an ER-based [Che76] data model with subtyping, and can also store (versioned) text files in long-field objects. The entities (objects) have identity, and can therefore be updated. The versioning granularity is not as small as single attributes, but rather the set of attributes defined at a single subtype level.

The ER schema is represented in the database in the form of typedescriptor objects and related ER types corresponding to the data model components. This is a useful data model feature, and has little relation to the versioning.

Our relationships do not have identity, and hence cannot be updated, only inserted/deleted. For the same reason, the versioning granularity has to be the whole relationship instance, including all attributes.

Of course, a relationship cannot exist unless the related entities exist (and have acceptable types). Since the generic COV engine cannot know about such things, we let the data model part of our implementation take care of this added restriction. For this use (as well as others), the data model layer internally maintains an explicit instance-of relation between object and type; this is subject to COV versioning just like any other data.

Finally, text files are objects with identity, while their contents are identity-less lines. Hence, text lines cannot be directly updated. (Footnote: In principle, we would like to have some identity on the lines too so that we could update them. This is in our future plans.) Unlike relationships, they do have a strict ordering. This order is maintained by controlling the I/O flows through the COV engine.

5 Cooperating Transactions

5.1 Transaction Model

EPOSDB-II offers a nested transaction model with long, non-serializable transactions. (Footnote: The term transaction is used for lack of a more appropriate one. Serializable, “ACID” transactions are not suitable in this domain, see e.g. [BK88].) Transactions may survive several application sessions, and may be represented by auxiliary persistent objects in the database. (Footnote: Such transaction objects are not maintained by the database itself, though; it has its own internal representation.) These contain a configuration description (which consists of a version binding and a product description), exchange protocols against other transactions (Section 5.2), and auxiliary information such as change request, user identity, comments, date etc.

Transactions are started as sub-transactions of existing ones, and a “perpetual” root transaction forms the base of the transaction hierarchy. The transactions are user-controlled: they may be started and terminated interactively. A user (client process) can only be working within one transaction at a time.

All database operations must be performed within the context of a long transaction. This means that all relevant project information is uniformly subject to COV. All this information constitutes a visible sub-database or database version. A sub-database is a “private” universe, with transparent versioning of all accessed data.

Thus, a transaction represents a consistent set of logical and physical changes to the database. A transaction is also a “container” for a version context (ambition + choice).


If the requested instances are not present in the current transaction, the request is passed up the transaction hierarchy. Upon commit, all changes are transferred (made visible) to the parent transaction. The changes then become visible to its sub-transactions, unless the same instances have been redefined locally. Such sub-transactions are siblings of the committing transaction, and will normally be kept isolated until each other’s commit. If the parent receives conflicting updates from committing sub-transactions, it will be responsible for reconciliation – see Section 5.2.
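
The lookup and commit rules in this paragraph can be sketched as follows. This is a minimal illustration under assumptions, not the EPOSDB-II transaction manager: the `Transaction` class and its members are hypothetical, versioning is left out entirely, and on a clash the sketch keeps the parent's value and queues the late update for reconciliation, which is only one of the possible policies.

```python
from typing import Dict, List, Optional, Tuple


class Transaction:
    """Hypothetical long transaction owning a private sub-database."""

    def __init__(self, name: str, parent: Optional["Transaction"] = None) -> None:
        self.name = name
        self.parent = parent
        self.local: Dict[str, str] = {}          # instances defined in this sub-database
        self.committed_by: Dict[str, str] = {}   # instance -> child that committed it
        self.conflicts: List[Tuple[str, str, str]] = []  # (instance, child, value)

    def begin(self, name: str) -> "Transaction":
        """Start a sub-transaction of this transaction."""
        return Transaction(name, parent=self)

    def write(self, key: str, value: str) -> None:
        self.local[key] = value

    def read(self, key: str) -> Optional[str]:
        tx: Optional["Transaction"] = self
        while tx is not None:                    # pass the request up the hierarchy
            if key in tx.local:
                return tx.local[key]
            tx = tx.parent
        return None

    def commit(self) -> None:
        if self.parent is None:
            raise ValueError("the root transaction is perpetual")
        parent = self.parent
        for key, value in self.local.items():
            other = parent.committed_by.get(key)
            if other is not None and other != self.name and parent.local[key] != value:
                # Assumption: keep the parent's value and queue the clash for reconciliation.
                parent.conflicts.append((key, self.name, value))
            else:
                parent.local[key] = value
                parent.committed_by[key] = self.name
        self.local.clear()


root = Transaction("root")
root.write("spec", "draft 0")
a, b = root.begin("a"), root.begin("b")
a.write("spec", "draft A")
b.write("spec", "draft B")
a.commit()                    # the parent now holds "draft A"
print(b.read("spec"))         # draft B (redefined locally in b)
b.commit()                    # conflicting update, left for the parent to reconcile
print(root.conflicts)         # [('spec', 'b', 'draft B')]
```

Siblings that have not redefined an instance locally see the committed value through the parent, since their reads fall through to it.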

The sub-databases and their workspaces must, of course, be kept sufficiently synchronous. Such high-level cooperation policies are added by the PM layer, “enveloping” all basic I/O operations. We are now working on a more generalized transaction layer inside EPOSDB-II to better support such the required low-level mechanisms, e.g. improved locking control, adding triggers etc.

The EPOSDB-II will be providing a flexible locking scheme (currently being developed), including non-restrictive read and write locks, similar to the lock types in ObServer [HZ87].

6

Implementation History

The implementation of COV has gone through four incarnations:

5.2 Cooperating Model and Mechanisms

The first [Mun89] was a simple tool set for versioning of text files. It handled the visibilities in a very na¨ıve way, which made it unsuited for systems of some complexity.

As mentioned, a transaction is the fundamental work environment in EPOSDB-II , and its life-time may span several application sessions. Since transactions may exist without connected clients, they can be viewed as persistent database processes. Thus they form a natural base for defining the communication system between the database users.

The second [Mun90] was EPOSDB-I , using the same underlying algorithms for versioning as the first tools, but now extended to an ER database. The current incarnation, EPOSDB-II , has much improved COV algorithms, has a more extensible data model, and a more general interface. EPOSDB-II has recently gone through a major internal redesign from scratch.

It was mentioned briefly that a parent transaction may receive conflicting updates from its sub-transactions. This may happen in spite of the parent trying to distribute work properly to avoid such clashes. But in case of conflict, the parent may 1) ignore the updates from all but one sub-transaction, 2) rollback itself (!), 3) itself try to reconcile (i.e. merge) the changes, or 4) choose to delegate to a new sub-transaction to perform reconciliation.

The “freeing” of COV , also explained in section 4, is an important breakthrough, and makes it much easier to develop the ideas and the implementation further.

In addition, the mechanisms described in section 3.2 have However, conflict resolution applied afterwards is not always been implemented in Prolog [GKY91]. The PM dethe best solution. A better one is planned cooperation among scribed in section 4.1.2 have also been implemented in Prosub-transactions, partly sharing data in overlapping versions. log [CJM 92]. In this way, one reconciled version will be committed from all sub-transactions.

Two sub-transactions have overlapping configurations, and thus possibilities for conflicts, if they have both overlapping ambitions (otherwise they are “variants”) and overlapping product structures. In this case, suitable protocols must be established for negotiation and exchange of shared and changed information [CM91]. Each update operation may then cause a notification to be sent to the affected sibling transactions, for further negotiation and possible propagation of the changes. [LMC92] defines a set of mechanisms to enable:

- Notifications or Messages: Asynchronous notifications about events in the database.

- Object Propagation: A framework for propagating changed and new objects to designated ongoing transactions, without making these objects publicly available.

- Protocols for Negotiation and Propagation: Each pair of cooperating transactions will have an agreed policy for how to negotiate and handle conflicting updates, e.g. busy vs. lazy updates, logical vs. physical sharing etc.

7 Related, Similar Work

The traditional way of handling a versioned system of source code is to use conditional compilation to handle variants (e.g. different operating systems, different devices, optional features), and a tool like SCCS [Roc75] or RCS [Tic85] to handle revision chains (e.g. bug-fixes, mandatory enhancements).

While conditional compilation flags may be likened to our options, their use is much more restricted, as they are language specific and only work on source code. The programmer also has to maintain them manually. On the other hand, they do give potentially the same freedom (or even more) as COV in deciding for what versions to include a particular piece of code.

The major problems of this approach are the use of two unrelated mechanisms for versioning, with little possibility for combinations, and the confusion that can arise from extensive use of conditional compilation; all variants are visible to the programmer at the same time.

In an early work Belady and Merlin [BM77] attempted to devise a formal model for evolving software systems, and discuss how some actual systems used within IBM can be viewed as conforming to this model. An individually selectable unit (similar to a COV option) may implement some functionality or repair other units, and may span several modules. Since no visibility concept associated with each fragment is used, a complex global function expressing both fragment selection and composition must be represented. Different forms of this function and the formidable problems of redefining it when new units are added are discussed at some length. The model presented is mainly a model of software as seen by the user or installation manager, not the developers.

In the PIE system [GB80] a layered network is used to allow alternative software designs to coexist. Layers identify functional changes and may span several modules. A layer is either an alternative of some other layers (placed in the same context) or independently selectable. However, since the versioning granularity is Smalltalk methods, some of the advantages of combining individual functional changes are lost.

The work reported in [Kru84, SBK88] is primarily concerned with editing and maintaining individual, multi-version text files. Concerning versioning models, the papers contain notions very similar to those in COV. Specifically, a version is defined as a collection of fragments, where each fragment is associated with a conditional expression (visibility), such as (SYSTEM = UNIX) & (TIME 1986). SYSTEM and TIME are multi-valued variables, called dimensions, being somewhat similar to options in COV. There is also a notion comparable to ambition which is called an edit set, but no validity concept. In their approach dimensions seem to be used to classify a set of existing versions, as opposed to our options which describe a space of potential versions.

Aide-de-Camp (ADC) [adc90] is a database system tailored for use in software development. Their versioning is based on change sets (csets), where a cset is a set of changes over all the files (as well as other data) in the database. A version is a base version plus a specific collection of csets. ADC handles the merging automatically. There is also a facility for transferring individual csets across databases.

There are clear similarities between this approach and our COV. The csets are similar to our options in that they can be combined in arbitrary ways, and a version is selected for the whole database in one operation, before we start picking components.

ADC distinguishes between “plastic” versions, which can be modified by adding/removing csets, and “installed” ones, which are fixed. Something similar could be achieved in COV by the use of version descriptions and validities; see section 3.2.

The fundamental difference is that ADC, though it gives much more freedom than traditional systems, still is based on the concept of an explicit, physical change. COV, despite its name, is not; the options refer to logical changes, which may be implemented by any number of physical changes. The example in section 3.3 would require an additional cset in ADC, and a rule to ensure it is included in a version if and only if the first two csets are included.

8 Conclusion and Further Work

8.1 Conclusion

Change Oriented Versioning is an alternative to the more traditional models. Its main advantages are:

- Version selection is done once, and uniformly for all data, instead of being applied to each individual object.

- Merging of changes comes “for free”, but extra changes to a merge can still be added in a transparent way.

- The versioning can be applied to data in any form, as long as it can be split into fragments.

It is our hope that a purely logical description of properties will be easier to use than version numbers.

More work is under way to find out how well COV will scale up for large and complex systems, both regarding performance and usefulness.

8.2 Further Work

Of directly COV related work in progress, we can mention the cooperating transactions already described in some detail in section 5, various optimization techniques (too technical to be presented here), and in addition:

Better Option Support. In our implementation, we have a flat option structure, i.e. all options are “equal” and there are no restrictions on their combinations. We are considering ways to structure this in order to reduce the complexity as perceived by the user. We recognize at least two kinds of semantic relationship between options:

- Mutual exclusion. Only one of a set of options may be TRUE. This set can then be treated as a multi-valued option instead.

- Dependency. An option is not useful unless a specific other option is TRUE. For example, there is no need to consider SunOS unless you’ve already selected UNIX.

Knowledge about these relationships (which have to be given by the person defining each new option) will help us significantly reduce the number of choices the user needs to make.

As mentioned in section 3.2, we have implemented a prototype of a high level abstraction which includes relationships like these. However, we hope to improve not only user friendliness, but also performance, by “pushing” these down into the COV implementation.

We may also want to bind options in some way to product structure, since it will often be the case that a particular option only makes sense, or is intended to be used, for a particular piece of a larger product.
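As a rough illustration of how knowledge about mutual exclusion and dependency between options could reduce the number of choices put to the user, consider the following sketch. It is not the EPOS prototype: the function names and the data layout (a list of exclusive groups and a dependency table) are assumptions made for this example, and `VMS` and `Debug` are hypothetical options added next to the UNIX and SunOS example.

```python
def check_choice(choice, exclusive_groups, depends_on):
    """Return a list of violations for a set of options chosen as TRUE."""
    problems = []
    for group in exclusive_groups:
        chosen = sorted(group & choice)
        if len(chosen) > 1:
            problems.append("mutually exclusive: " + ", ".join(chosen))
    for option in sorted(choice):
        required = depends_on.get(option)
        if required is not None and required not in choice:
            problems.append(f"{option} requires {required}")
    return problems


def options_to_ask(all_options, choice, exclusive_groups, depends_on):
    """Options still worth presenting to the user, given what is already TRUE."""
    remaining = []
    for option in sorted(all_options - choice):
        required = depends_on.get(option)
        if required is not None and required not in choice:
            continue  # dependency not satisfied: no need to consider it yet
        if any(option in g and g & choice for g in exclusive_groups):
            continue  # another option of the same exclusive group is already TRUE
        remaining.append(option)
    return remaining


# Hypothetical option space; only UNIX and SunOS come from the text.
ALL = {"UNIX", "VMS", "SunOS", "Debug"}
EXCLUSIVE = [{"UNIX", "VMS"}]     # behaves like one multi-valued option
DEPENDS = {"SunOS": "UNIX"}       # SunOS is only useful once UNIX is TRUE

print(options_to_ask(ALL, set(), EXCLUSIVE, DEPENDS))     # ['Debug', 'UNIX', 'VMS']
print(options_to_ask(ALL, {"UNIX"}, EXCLUSIVE, DEPENDS))  # ['Debug', 'SunOS']
```

Before UNIX is selected, SunOS is not offered at all; once UNIX is TRUE, its exclusive alternative disappears from the list and SunOS becomes available.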

Experimentation on Text. We are developing a tool that reads source code (currently in C or C++) which is stored under RCS control and uses conditional compilation, then extracts all the “versions” it can from this and stores them under COV control in EPOSDB-II. This is a rather specific tool; it is intended to show us how COV would scale up if it were used instead of more traditional alternatives. See also the intro to Section 7.

This experimentation will also include better ways to do COV on text files, such as detection of merge conflicts, language-sensitive fragmentation, more intelligent diff handling, etc., all of which will contribute to better results for the user.

References

[adc90] Software Maintenance & Development Systems, Incorporated, Concord, Massachusetts. Aide-de-Camp, Product Overview, 7.2 edition, 1990.

[BK88] Naser S. Barghouti and Gail E. Kaiser. Implementation of a knowledge-based programming environment. In Proc. of the 21st Annual Hawaii International Conference on System Sciences, pages 54–63, Hawaii, USA, January 1988.

[BM77] L. A. Belady and P. M. Merlin. Evolving parts and relations - a model of system families. Technical Report RC 6677, IBM Thomas J. Watson Research Center, Yorktown Heights, New York, 1977. Also in M. M. Lehman and L. A. Belady (Eds.): Program Evolution - Process of Software Change, Academic Press, 1985, pp. 221–236.

[BSS84] David R. Barstow, Howard E. Shrobe, and Erik Sandewall, editors. Interactive Programming Environments. McGraw-Hill Book Company, New York, 1984. ISBN 0-07-003885-6.

[Che76] P. P.-S. Chen. The Entity-Relationship Model - Towards a Unified View of Data. ACM Trans. on Database Systems, 1(1):9–36, March 1976.

[CJM+92] Reidar Conradi, M. Letizia Jaccheri, Cristina Mazzi, Amund Aarsten, and Minh Ngoc Nguyen. Design, use, and implementation of SPELL, a language for software process modeling and evolution. In J.-C. Derniame (ed.): Proc. from EWSPT’92, Sept. 7–8, Trondheim, Norway, Springer Verlag LNCS 635, pages 167–177, September 1992.

[CM91] Reidar Conradi and Carl Chr. Malm. Cooperating Transactions against the EPOS Database. In Peter H. Feiler (Ed.): “Proceedings of the 3rd International Workshop on Software Configuration Management” (SCM3), Trondheim, 12–14 June 1991, 166 p. ACM Press Order no. 594910, pages 98–101, June 1991.

[COWL91] Reidar Conradi, Espen Osjord, Per H. Westby, and Chunnian Liu. Initial Software Process Management in EPOS. Software Engineering Journal (Special Issue on Software process and its support), 6(5):275–284, September 1991.

[GB80] Ira P. Goldstein and Daniel G. Bobrow. A layered approach to software design. Technical Report CLS80-5, Xerox Palo Alto Research Center, Palo Alto, CA, December 1980. Also in [BSS84], 1984, pp. 387–413.

[GKY91] Bjørn Gulla, Even-André Karlsson, and Dashing Yeh. Change-Oriented Version Descriptions in EPOS. Software Engineering Journal, 6(6):378–386, November 1991.

[Hol88] Per Holager. Elements of the Design of a Change Oriented Configuration Management Tool. Technical Report STF44-A88023, 95 p., ELAB, SINTEF, Trondheim, Norway, February 1988.

[HZ87] M. F. Hornick and S. B. Zdonik. A Shared Segmented Memory System for an Object-Oriented Database. ACM Transactions on Office Information Systems, 5(1), January 1987.

[Kru84] Vincent Kruskal. Managing multi-version programs with an editor. IBM Journal of Research and Development, 28(1):74–81, January 1984.

[LCD+89] Anund Lie, Reidar Conradi, Tor M. Didriksen, Even-André Karlsson, Svein O. Hallsteinsen, and Per Holager. Change Oriented Versioning in a Software Engineering Database. In Walter F. Tichy (Ed.): Proc. of the 2nd International Workshop on Software Configuration Management, Princeton, USA, 25–27 Oct. 1989, 178 p. In ACM SIGSOFT Software Engineering Notes, 14(7), pages 56–65, November 1989.

[Lie90] Anund Lie. Versioning in Software Engineering Databases. Technical Report 1/90, EPOS TR 95, ISBN 82-7119-155-1, DCST, NTH, Trondheim, Norway, January 1990. 166 p. (PhD thesis NTH 1990:2).

[LMC92] Jens-Otto Larsen, Bjørn P. Munch, and Reidar Conradi. Cooperating transactions in the EPOS software engineering database. In Proc. 3rd ERCIM Database Research Group Workshop on Updates and Constraints Handling in Advanced Database Systems, Pisa, Italy, pages 61–67, September 1992. Also EPOS TR 157, NTH, 31 March 1992, 6 p.

[Mun89] Bjørn Munch. Eofil, et verktøy for endringsorientert fillagring. Technical Report 24/89, EPOS TR 82, DCST, NTH, Trondheim, Norway, May 1989. 32 p. + Appendix with 60 p. (Student Project Work, in Norwegian).

[Mun90] Bjørn Munch. Change Oriented Versioning. Technical Report 28/90, EPOS TR 110, 120 p., DCST, NTH, Trondheim, Norway, June 1990. (MSc thesis).

[Roc75] Mark J. Rochkind. The Source Code Control System. IEEE Trans. on Software Engineering, SE-1(4):364–370, 1975.

[SBK88] N. Sarnak, B. Bernstein, and V. Kruskal. Creation and Maintenance of Multiple Versions. In [Win88], pages 264–275, 1988.

[Tic85] Walter F. Tichy. RCS - A System for Version Control. Software - Practice and Experience, 15(7):637–654, 1985.

[Win88] Jürgen F. H. Winkler, editor. Proc. of the ACM Workshop on Software Version and Configuration Control, Grassau, FRG, Berichte des German Chapter of the ACM, Band 30, 466 p., Stuttgart, January 1988. B. G. Teubner Verlag.