Databasr i - Quretec

13 downloads 27272 Views 1MB Size Report
an ob.ject-oriented modeling tech- nique. 'l'he basic idea .... robustness of' the logical data structure, i.e., the classes. RDBMS and ...... ments for an electronic design auto- mation .... 1250 Sixth Avenue, San Diego, CA 92101, U.S.A.. (619) 699- ...
AnObject-Oriented Relational Databasr i

A relational DBMSand an object-orientedprogramminglanguagecan be combinedto yield a surprisingly effective OO-DBMS for many applications. Wllllam J. Premerlanl, Michael R. Blaha, James E. Rumbaugh, and Thomas A. Varwlg

oriented layer that keeps relevant data in memory. Locking and update protocols are built into this layer. One achieves quick access to data, while storing it’in a database between executions. The 00 layel hides the database from applications; applications are unaware that they are receiving DBMS services. ‘The programmer sees an objectoriented language with certain predefined operations that allows ob,jects to be retrieved f’rom and stored in a database between program executions. The contribution of this article is the realization that a checkout mechanism can be used to combine a KDBMS and OOPL into a robust and ef‘ficient 00-DHMS. Our OODBMS is not a complete system. It lacks commonly expected features such as extensible data types, management of behavior as well as data, and inheritance. Nevertheless, OUI approach to an OO-DBMS can satisfy many applications. Related

Many people recognize the shortcomings of current database management systems (DBMS) [‘L, 9, IO] a11cl conventional programming languages [4, 81. DRMS and progranlniing languages each provide a distinct viewpoint on data and applications. .l‘he two viewpoints are not well integrated, although each has its own strengths and weaknesses. Kelational DBMSs (KDHMSs) have a firm theoretical base aind satisfy many applications. KDEMSs, however, Iack important features needed fin. ad~~anced applications, such as abstract data types, c0111plex integrity constraints, and versioning. I’rogimiiining languages, such as I’ascal and C+ +, provide abstract data types, structured control coiistructs, and the ability to write complex algorithms, but lack data

persistence across executions and concurrent access to data. Objected-oriented (00) design provides a uniform paradigm for both database design and program code design. 00 data models prorelational databases that duce match real-world applications and avoid the normalization problems often associated with relational database design [3]. 00 programming languages (0OPL.s) improve code reuse, code maintenance, and modularity [8]. We describe a technique for constructing an object-oriented DBMS (00-DBMS) from existing technology and a small amount of humanwritten code. .l‘he existing technology is an KDBMS, an OOPL, and an ob.ject-oriented modeling technique. ‘l‘he basic idea is to buffel the database with an object-

Work

Two different areas of research relate to this article: OO-DBMS and database checkouticheckin. 00-DEMS

The term “OO-DBMS” is not well defined; it means different things to different people. We will define 00-DBMS as the intersection of database and OOPL technology. The ambiguity in the term OODBMS is largely a result of whethet one emphasizes the database or OOPL side. OO-DBMSs are just starting to emerge in the commercial and research worlds. The technology is immature, however, and suffers from lack of standards, poor perf‘ormance, and unresolved design issues, much the same as KDBMSs did a decade ago. One Of’ the most important fea-

tutu of an ()()-DBMS is the implicit assumptions that (1) the systern is oriented toward operations on individual &jec,ts and (2) the programmer can expect these to is notable perform well. This RDBMSs typically mainly because perform badly for single-object operations and navigation between objects [6]. There are two basic approaches to OO-DBtMS: extending a relational DBMS and extending an 00 programming language. One can implement an OODBMS by extending an RDBMS. One extends the relational model with new data types, operators, and access methods. An explicit goal is to minimize changes to the relational model. This approach definitely adds to the functionality provided by an RDBMS. This type of 00-DBMS integrates well with existing relational databases and provides for smooth flow of data between engineering and business applications. Potential disadvantages are performance and robustness limitations. Even an augRDBMS may not be mented capable of efficient operations on individual objects. PlOSTGRES [23] and EXODUS [5] illustrate this approach. Another approach to an OODBMS is extending an OOPL. Database functionality (persistence, authorization, concurrency) is provided as needed for individual objects. An extended OOPL efficiently navigates individual objects and has no inherent limits on functionality. An extended OOPL, however, must demonstrate reliable management of large quantities of data. One must also develop a theoretical base as Codd provided for RDBMSs. Gemstone [ 171, Vbase [ 11, and ORION [14] adopt this approach. At this point in time, it is not clear which approach is best. The choice would seem to depend on the application. The 00-DBMS described in this article adopts the viewpoint of extending an OOPL. Our approach combines the maturity and robustness of’ the -

KDBMS with the perf‘ormance of the 00 language ill an open architecture that makes it convenient to intcrf&c with other languages. Also, you can have both an 00 and a relational interface to the same database. This allows you to write 00 programs that access data that already exists in an RDBMS as well as continue to use conventional RDBMSs access to the same data. Datalmse

checkout

The notion of database checkout has been discussed [ 1 1, 151. A portion of the database is locked for exclusive use and copied into RAM. All subsequent read and write traffit is then executed against the KAM copy. Once work is finished, the data is checked back into the database and made available for general use. The checkouticheckin mechanism provides fast interactive access and eliminates most locking overhead. We will refer to this checkout/checkin mechanism as database shadowing.

An ordinary database transaction locks its data for a few seconds. In contrast, a checkout scheme may lock data f‘or hours, days, and even weeks. Thus the checkout mechanism provides support for long transactions. Long transactions are often needed in a design environment. SoQtwaee subsystems Our 00-DBMS is constructed from an object-oriented data modeling and an protocol, an RDBMS, OOPL. The exact subsystems chosen are not important for the success of the shadowing technique, which we use to interface the programming language to the database. Object-oriented

Modeling

We have been using the Object Modeling Technique (OMT) [3, 161 for our work. Any type of 00 data model would suffice for our shadowing technique, but our discussions will f’ocus on the OMT for the sake of clarity. The OMT specifies logical data structure, i.e., the classes

of objects in the database and their rek&rships to one another. OMT diagrams are straightforward to map into KDBMS data definition (DDL) commands and 00 code declarations. Figure 1 shows a sample OMT diagram. Some aspects of OMT diagrams are discussed next. A class describes a set of object instances with similar structure and behavior. Each class has a name, a set of attributes (also called instance variables, fields, or properties) that hold state values of the object, and a set of operations that an object is subject to. Object classes are denoted by boxes. The name of the class, its attributes, and its methods are drawn in the box. The listing of attributes and methods may be suppressed, as in Figure 1, depending on the level of detail desired. A relationship connects two (or more) classes. An object class may inherit some of its structure and behavior from a superclass; the subclass is a refinement of the SUperc&zss. This relationship among classes is called generulizution. Generalization (inheritance) is indicated by lines and a triangle ,fanning-out from the superclass to the subclasses. An association relationship connects two or more object instances and is indicated by ordinary lines. Special symbols at the ends of an association line indicate the multiplicity of the association (how many objects are related to a given object). A solid circle indicates zero or more; a hollow circle means zero or one; a straight line without a terminator denotes exactly one. Relationcrl

DSMS

~RDRMSI

The KDBMS must provide data persistence, concurrency control, transactions, and a programming language interface. A query language or query-by-forms capability is unimportant for many engineering analysis problems because access to complex, structured data is controlled by the 00-DBMS front end. The 00-DBMS described here hides the details of the RDBMS and removes some of the

arbitrary restrictions. The RDBMS may be regarded as a lower-level resource invisible to the end user. ProgrammIng

Orr/ect-Oriented

MoPL) The object-oriented programming language must provide objects of different classes and the ability to send messages to objects to invoke a class-dependent method. Inheritance of methods is not required by the algorithm presented here, although any reasonable 00 languagc will have it. Each class describes a set ofo+jects with identical structure. A class maps into an RDBMS table; each object instance maps into one row of a table. An ID generated by the 00-DBMS serves as the primary key of each object. Some relationships into map RDBMS tables. The other relationship are stored in object tables as buried pointers.

Language

00-DBMS Architecture

System

The programmer views the OODBMS as an object-oriented lan-

I

guage with persistence. The language includes specific operations about instances of classes and relationships. The 00-DBMS has both compile-time and run-time components. ~~~,“~~U~‘mP”e~t’me Figure 2 summarizes 00-DBMS compile-time architecture. OMT diagrams and supplemental files are provided as input. The supplemental files contain minor details that OMT diagrams lack, such as attribute data types and permissibility of nulls. This input is operated on by a conversion process. The conversion output is a runtime programming interface (00 language subroutines) and DDL commands to generate RDBMS tables. Initially the conversion process was partially automated. We used AWK’ to create the programming interface from a few hand-coded templates and manually prepared tables that described object classes and relationships. Then we gener-

Connects-to

Connecton

ated RDBMS tables. We created one table for each object class. For relationships we had the choice of creating explicit tables or embedding object IDS as foreign keys. We indicated our decision on a case-bycase basis by the presence or absence of a relationship name on the OMT diagrams. Some decision factors were performance, proliferation of tables, and the likelihood of future changes. Reference [3] contains more details on the mapping process. Later, we drew OMT diagrams with a general purpose graphics editor. The graphics editor produces graphics output in an ASCII markup language for which we have a BNF description. We used LEX and YACC? to compile the BNF definitions. Then we wrote software to recognize connectivity on OMT diagrams and generate programming interface subroutines and DDL statements. Our long-term solution to this

Pin

Bus Segment

’ A\\VK

is ;I Unix

“I.I:S

mrd

Yi\(X:

1001. ;II‘C’ Unix

tools.

I

Circuit Segment 1 Bus Pin FIGURE

q. Part of the OH1 Data Model for Electric Circuit Application

+

FIGURE

CCMM”WlCATlCWSCFT”EACM/November

IY9O/Vd

33. No II

2.

00-DBMS Programming Interface

00-DBMSCompile-lime Architecture

101

Buffered

.

’ /

\

issue is an OM.1‘ diagram editor. A custom editor provides active support for the semantics of OMT diagrams. The OMT edil.or checks for duplicate names and simplifies the drawing of OMT diagrams.

face, later). These routines access buffered objects through an 00 language. The shadowing routines relational database the access through the KDBMS programming interface (usually cursors on tables).

00-Di3MS Architecture

Key

Figure 3 summarizes 00-DBMS run-time architecture. There are two modes of open ation: buf’f’ered and concurrent. To access buffered objects, the programmer first loads one or more sections into memory in a checkout protocol. Concurrent objects are accessed without using the protocol. In either case, the programmer deals ,wil.h objects and relationships by calling the approinterface functions. The priate mechanics of database interaction are hidden. In many cases, the operations are recorded directly into memory, and updating of the database is deferred. We define datuthse shrrdourityg as this mode of transparbuffered database access ent, on objects and through operations relationships. The 00-DBMS programming interface is supported by internal shadowing routines (see section on 00-DBMS Programming Inter-

\

\

\

\

3. OO-DBMSRun-lime Arthitethire

FIGURE

Run-time

Concurrent

Design

Criteria

Performance was OUI‘ most important design criterion for the OODBMS. The shadowing mechanism improves interactive performance by using the 00 language to search memory and by deferring database writes. Most read operations can be performed without accessing the database. The 00-DBMS eliminates redundant database writes. For example, an object that is inserted and then deleted requires no database activity. The shadowing mechanism intercepts intermediate activity and only applies the net result to the database. The initial delay upon loading sections into memory was tolerable for our purposes. Considerable elision is expected under normal application usage. In a typical session, a user will concentrate on a few areas of a design, making repeated revisions of the same object before committing the results to the database.

Another design goal was to increase programmer productivity. The 00-DBMS eases the burden of using a database. To a large extent the programmer can think in terms of the 00 language and forget about database interface details. 00 languages also provide robust libraries of tested code and facilitate code reuse. The OO-DBMS reduces programming errors since the programmer does not become confused by the mismatch between programming languages and database languages. Instead of using linked lists, trees, and hash tables, one operates on objects and relationships. Objects and relationships provide a simple, uniform programming paradigm. Extensibility was another design criteria. Additional functions were added as the need arose, and future additions are expected. For example, we plan to incorporate propagation of operations among objects [al]. Key

Design

Decisions

Concurrent versus buffered access The 00-DBMS supports two types of database access-co)lcurre)lt and

Concurrent data has global scope and is visible to all users. Concurrent data is accessed directly from the database and locked for the shortest possible amount of time. System data for sections (discussed later) is an example of concurrent data. In contrast, buffered data is private to a single user and is locked by a user for the duration of an interactive session. Buffered data is loaded into memory, operated upon, and then saved back to the database. Buffering reduces database traffic and improves application performance. Each database table is either concurrent or buffered. Both categories look alike to the programmer; both categories support the same operations and have the same syntax. The 00-DBMS keeps track of which objects are buffered and which are concurrent. The only difference between the two categories is the time at which changes are committed to the database and the degree of concurrency. Updates on concurrent tables are immediately applied to the database. Updates on buffered tables are accumulated in memory for later writing upon an explicit save command or at the end of a session. In practice, only a few tables contain data that must be shared among concurrent users and must be assigned to the concurrent category. The other tables can be buffered.

buffered.

Sections A section is a subset of the database

that can be independently manipulated. Each instance of an object or relationship belongs to a single section. Each section contains zero or more data items from each buffered database table. In other words, sections partition the data instances and cut across all the buffered classes and relationships in the data model. An application must lock a section before accessing its data. This is called checking out a section. An application may check out one or more sections. Other users cannot read or write to checked-out sec-

CCMM”YICAT,CNSCFTWEAC1CM/Nove~~~bcr

1990iVol.33,

tions. The notion of a section only applies to buffered data. Concurrent data cannot be checked out and can always be accessed by any application. Database versus nondatabase definitions

There are two kinds of class and relationship definitions in the OOdatabase or nondatabase. DBMS: Database classes and relationships are handled both by the 00 language and the KDBMS and may be saved in the database. Nondatabase definitions are handled only by the 00 language and may not be saved. Both kinds are useful. There is no point in creating database tables for nondatabase objects. Modifying nondatabase definitions only requires recompilation of the application, while the database must be rebuilt if any database definitions are changed. Programmers use database definitions sparingly while nondatabase definitions are used freely. Persistence

Object instances are either persistent or trunsient. Persistent objects remain in the database beyond the life of a program execution. Transient objects are newly created objects stored only in memory and disappear when an application program terminates. Database objects may be transient or persistent and may be converted from transient to persistent. Nondatabase objects are transient only. Transient objects may temporarily violate database integrity rules. For example, the copy operation may create a transient object with a primary key that matches the key of a persistent object. This may also occur when the transient copy is going to be further updated before ultimate database insertion. A transient object that satisfies database integrity may be converted to a persistent object with an INSERT command (see section on Operations on Objects). IDS for

object

references

An ID [I31 is an arbitrary

No 11

handle

for referring to an object. Every object has a unique ID. IDS are automatically generated by the OODBMS and are not subject to user update. ID allocation involves concurrency issues in a multiuser environment. Because of their stability, IDS are particularly useful for object references. Contrast this with the RDBMS scenario where the value of a primary key changes and all foreign keys that refer to it must be updated. All object database tables have an ID as the primary key. All relationship database tables use one or more IDS from participating objects as the primary key. All IDS are 32 bits long because the system is being run on a 32-bit machine. Section IDS use only I6 bits and pad the remainder. Object IDS have I6 bits to identify the section and another I6 bits to resolve identity within a section. One benefit of this ID allocation scheme is that once a section is locked, object IDS can be assigned within that section without consulting the database. Another benefit is that if database storage of records within tables by IDS can be ordered, objects and relationships for a given section will cluster within the tables. This improves the efficiency of section loading and section saving. oo-DBMS DragrammIng InterSace The 00-DBMS provides operations on objects, relationships, and sections. The programmer executes an operation by invoking the function for the operation on the appropriate class. The function in turn calls internal routines for performing buffered insert, delete, update, load, or save database operations (see section on the Internal Buffering Mechanism). The following functions cover insertion, deletion, updating, and retrieval of objects and relationships from memory and the database. The functions provide the same functionality that would ordinarily be provided by a cursor interface to a database. These func-

103

tions have applications. Operations

been

on

sufficient

for

our

Ohrjerts

The 00-DBMS includes functions to perform the fi)llowing operations for each object class. NEW (classname) Creates and returns a new empty transient object of the given class. Issues a creulp request to the buff‘ering system. After an object is created, data may be entered with a PUT operation. An object is transient until an INSERT is performed. PUT (attributename, object, value) Fills in an attribute by overwriting previous cements. Issues an update request to’ the buffering system. GET (attributename, opject) Returns an attribute value from an object. No action is required by the buffering system. RETRIEVE (classname, keyname, valuel, ) Returns a set of persistent objects whose keys match the sequence of’ values. Transient objects are list ignored. A key is a predefined of attribute names used to select objects. Any number of keys may he definecl per ollject class. RETRIEVE-ALL (classname) Retrieves all persistent objects in a class. All concurrent objects are found; bufferccl objects are found only for sections that have been loaded. INSERT (object) Converts a transient object into a persistent one. lsslles an insert request to the buffering system. An error is raised if the object violates database integrity. DELEI‘E (object) Destroys an object. issues a drlete request to the buf‘fering system. An error is raised if the object belongs to any relationship. COPY (object) Returns a transient copy of the object, including its Idata. Issues a create request to the buffering system. COPY is equivalent to NEW followed by many GETS and PUTS.

Operations

on

Relationships

The 00-DBMS includes functions to perform the following operations for each relationship with an explicit table. INSER’I‘

(relationshipname, obj I, obj2) Adds a pair of persistent objects to a relationship. Issues an zrrsert request to the buffering system. DELETE (relationshipname, obj I,

ol?P) Deletes an object pair from a relationship. Issues a delete request to the huf’f’ering system. DELETEI (relationshipname, objl) Deletes all object pairs from a relationship where o/j1 is the first object in the pair. Issues a delete request to the buffering system for each pair. DELE’I‘E-2 (relationshipname, obj2) Deletes all object pairs from a relationship where oDj2 is the second object in the pair. Issues a dc4etp request to the buffering system for each pair. RE’I‘RIEVEI (relationshipname, objl) Searches relationship to find object pairs in which objl is the first object. Returns a set of objects. REl‘KlEVE-2 (relationshipname, obj2) Searches relationship to find object pairs in which oDj2 is the second object. Returns a set of’ ob,jects. objl, ‘I‘ES’I’ (relationshipname, obj2) ‘l‘ests whether an object pair is a member of the relationship. Returns a boolean value.

memory aborts previous changes for the section. SECTION-SAVE (section) Commits changes for a given section to the database. Save requests are issued to the buffering system for objects and relationships requiring insertion, dele. tion, or updating.

Internal Mechanism

BuiQeelng

The programmer accesses data by first locking and loading one or more sections into memory. As the data is loaded, the 00-DBMS builds data structures used to search memory for objects and relationships. Subsequent operations are performed in memory. Ultimately, changes made to a section are discarded or saved to the database. Each buffered object has a state. When a section is loaded, all of its objects are in the PERSISTENT state. Each time an internal operation is performed on an object, its state is checked to cletermine what action to take and what the new state should be. Figure 4 shows a state transition diagram. States are sliown as uppercase names; actions are the lowercase names next to arcs. EuffeHng

States

‘I‘KANSIENT Refers to transient objects. A newly created or copied object has this state. An insert operation ch;mgcs the state to INSERT. Objects in this state are ignored during a save operation. Update operations 011 objects in this state leave the state &h;mged. Ob1 jects deleted f.rom this state are discardecl. Operations on Sections l’E:KSlS’l‘Eh7 The 00-DBMS includes functions Refers to ur~modified persistent for controlling the buffering of secobjects. Objects loaded from the tions. database are put in this state. An update operation f’rom this state SECTION-LOAD (section) 1,ocks a given database section sets the state to UPDATE. A deand loads it into memory. An lete operation sets the state to DEIKI‘E. Objects in this state error occurs if the section is alare iCgiiored during a save operaready loaded by another process. tioil because they are already up Reloading a section already in

Save

0

Database

Update

Update, Save

(New) 0

Load

FIGURE

to-date in ~hc database. An insert from this state is an error. DIcl.E’l‘E Kef’ers to prrsistellt objects that must be deleted from the database. A save operation deletes objects in this state froRi the database and discards the inmemory copy. Any other opcraLions are errors, INSEKI‘ Kcfers to persistent objects that have been created during the current session but not yet written to the database. A save inserts objects that are in this state into the clatabase and sets their state to I’EKSISTENT. Updates to objects in this state do not change their state. Delete simply discards the object. An insert from this state is an error. UI’DAI‘E Kctcrs to persistent objects that have been modified since being

4.

Buffering AlgOritltfN States

loaded into memory. A save operation updates these objects in the database and sets their state to I’EKSIS’I‘ENT. Updates to objects in this state do not change their state. Delete sets the state to DFLEI‘E. An insert from this state is an error. Internal

Buffering

Operations

The database buff‘ering system is driven by internal II&, savr, imrrt, rlrlcte, o-e&, and ufhte operations. These internal operations and the state information recorded with each buffered object are used to buffer database writes. Each internal operation on aii object may change the state of the object. Each operation is immediately applied to memory, and flags are set on the object so that the changes can eventually be written to the database. load (section)

I.oads a section of the database. Searches all database tables fol rows belonging to the section and creates objects in memory containing the data. save (section) Brings the database up-to-date with a section in memory. Updates the database with information from objects in the insert, update, or delete state. Objects in the persistent or transient state are ignored. Objects in the delete state are discarded and their IDS recovered for reuse. insert (object) Makes an object persistent. This operation is applied to objects in the Lransient state. Performs bookkeeping chores such as ID generation. delete (object) If the object is in the insert state, it is discarded. If the object is persistent, it is placed in the de-

lete state for eventual removal from the database. create (class) Creates a new object of the given class and places the object in the transient state. update (object) Updates object .attributes and places the object in the update state for eventual writing to the database. OF the algoritm~m We now show that ,applying database updates via our buffering scheme yields the same results as immediate database update. Our demonstration only .applies to persistent objects and relationships, since transient and local data are not written to the database. We begin by defining consistency conditions.

(3)

Correctness EuHering

Condition 1: Each database record corresponds to an object or relationship record in memory. Each record in memory has a state that correctly describes the difference between what is in the database and what should be in the database. Condition 2: Each database record is uniquely identified by a primary key. With one exception, no two persistent records in memory may have the same primary key. The one exception is that a record that has been deleted and then inserted may appear twice, once in the insert state and once in the delete state. Assumptions We make the following tions. (1) (2)

(4)

(5)

tion followed by a deferred deletion cancels out, leaving no trace in the buffering system.) Deletions must be processed before insertions, in case there is a duplicate record with the same primary key. For example, primary keys for relationships are formed from the primary keys of the objects being related. If a member of a relationship is deleted and then inserted, there will be two records in memory with the same primary key. Insertions must check persistent records for uniqueness of the primary key. Modification of the primary key is not allowed for persistent records. This is not a problem, since all primary keys are IDS.

Assertion 1 Under consistency conditions, a save operation correctly updates the database. All objects and relationship database tables use IDS as the primary keys. IDS are assigned by the OODBMS and are never changed. The primary key for a record provides a one-to-one mapping between memory and the database. The state of each memory record indicates what actions must be performed to synchronize the database. Records in the PERSISTENT state require no action because they are already up-to-date. Records in the UPDATE state have their nonprimary key attributes updated in the database. Records in the INSERT state are added. Records in the DELETE state are removed.

assump-

Sections are lock.ed for a single user. If there are two records with the same primary key, it is an indication that there was a deferred deletion followed by a deferred insertion and not vice versa. (A deferred inser-

Assertion 2 Consistency conditions hold immediately after a load operation. Data is copied from the database into memory, so the two copies are the same. Each memory record has the same unique primary key that it had in the database. Each record is In the PERSISTENT state, indicating that it agrees with the database.

Assertion 3 Each internal buffering operation preserves consistency. In this section phrases such as “deferred insert” are used to refer to an internal buffering operation. This will distinguish an internal buffering operation from the eventual database action. A deferred insert is only allowed on records in the TRANSIENT state. Part of the deferred insertion process is a check for uniqueness of the primary key. Since the object must belong to a section and since the section ID is part of the key, this check can be made without consulting the database. Records in the INSERT, PERSISTENT, and UPDATE states are checked, and duplication of their primary keys is blocked. On the other hand, since records in the DELETE state are invisible to the programmer, their primary keys may be reused. A deferred delete is allowed fo. in the PERSISTENT, records UPDATE, or INSERT states. Records in the INSERT state are not written in the database and can be simply discarded. Records in the PERSISTENT or UPDATE state have corresponding information in the database and are placed in the DELETE state for later removal. A deferred update is allowed for records in all states except the DELETE state. With the exception of the TRANSIENT state, modification of the primary key is not allowed. Modification of the primary key of a record in the TRANSIENT state is allowed because such records are temporary. Induction By induction, since each individual buffering operation preserves consist.ency, a series of buffering operations also preserves consistency. Siclirectional

Linkage

The 00-DBMS must quickly map from the database primary key to the 00 language pointer and vice versa. Some possible implementation techniques are:

(1) (2)

pair of tables + hashing pair of trees.

We used the container class relntionsl~if, of our 00 language, which was implemented with a pair of tables plus hashing. Hashing algorithms and table lookup are provided by 00 language libraries. Integrity

Checking

The 00-DBMS incorporates rudimentary integrity checking. The 00-DBMS enforces the uniqueness of primary keys and candidate keys and nonnull specifications. We also provide some support for enumeratlon types and range checking. Since the 00-DBMS is essentially a layer that wraps around an RDBMS, we could include more thorough integrity support in the future.

jects. One can navigate DSM objects in a manner similar to navigating RDBMS tables. DSM can automatically enforce certain constraints, such as existence dependencies between objects. DSM has a rich library of predefined classes. Applketion

Experience

We have used the 00-DBMS described in this article to support an editor for electric circuit design. The circuit eclitor stores its data in a

AuullCatlon oi the OO-DBMS Choice of SuEsystems

so#tware

Our implementation of the OODBMS was built on top of DSM, MIMER, and the Object Modeling Technique. The 00-DBMS could be ported to another RDBMS by rewritin,g the database interface. The 00-DBMS could be ported to another 00 language by emulating the DSM relation feature [19]. We chose the MIMERa DBMS for reasons unrelated to its technical merits. MIMER is more or less a conventional SQL-like relational DBMS [7. 181. As with its competitors, MIMER implements certain aspects of the relational model well, yet contains arbitrary implementation restrictions on others. For example, a secondary index is restricted to a single attribute; it cannot be composite. We chose the Data Structure Manager (DSM) [ 19, 221 as our 00 language because of its technical features and in-house availability. DSM is a full-functionality 00 language developed at GE. DSM runs on top of the C language. The most noteworthy DSM feature is its support for relationships among ob-

COMMUNlCAllOWSOCT”EACM/Nowmbcr

1990,“0,.33,

database for access by other programs such as mathematical simulators. The object-oriented layer allows the circuit editor to receive database services such as data persistence and concurrent access, yet it still responds in real time. Our electric circuit application supports interactive graphical editing and requires a fast response to keep pace with the user. We designed the 00-DBMS sporadically over the course of a year. Once we had completed the design, it took three months to implement the 00-DBMS. The extensive DSM library was largely responsible for the short implementation time.

No.,,

The full electric circuit application had fourteen full pages of OMT diagrams. There were 82 object classes and 45 relationships, yielding 108 database tables. Programmers wrote several thousand lines of code for programming interface function templates. The performance of the resulting 00-DBMS was sufficient to support an interactive electrical circuit editor running on a MicroVAXa computer. The. OODBMS keeps pace with interactive mouse movement. An RDBMS by itself cannot support real-time performance because of I/O delay and commancl-processing overhead. It takes the 00-DBMS one second to perform a mixed sequence of several thousand object and relationship operations m memory, running on a MicroVAX II@. The same sequence of operatious would take more than a minute if they were applied directly to the database. This methodology is currently being used for another interactive graphical application. Most of the effort required to adapt it to this new application is the preparation of a new data model. Concluslons We have described an approach to implementing an object-oriented DBMS (00-DBMS). One can take relational DBMS existing an (RDBMS) and hide it beneath an object-oriented programming language. The buffering algorithm yields fast interactive performance while storing objects in a database. Our work demonstrates that it is possible to build an object-oriented DBMS on top of a relational DBMS and still get good performance. This approach combines the best features of’ both KDBMS and 00 programming languages. RDBMSs have a sound theoretical base and work well for business applications. Commercial products have robust concurrency, ,journaling, and rollback facilities. At the same time we obtain the benefits of using objects to abstract an application. The programming language allows com-

107

plex algorithms to be written that would be hard to express in an RDBMS, without frequent access of the RDBMS within the algorithm. This approach nninimizes the

amount

of new code Ithat must be

written, since it uses existing software subsystems. An 00 design methodology is the “glue” that binds together the RDBMS and 00 language. Our “OC)-DBMS” lacks the functionality of a full system. Nevertheless, our appr0ach to an OO-DBMS is quite effective for

many applications. In our applications, we have found that 00 programming improves programme’r productivity,

relative languages approach

to conventiona. like

Pascal

procedural and

C. Our

to an 00-,DBMS enables one to reap these productivity gains while using an RDBMS. q References 1. Andrews. T., and Harris, C. Combining language and database advances in an object-oriented develIn envir,onnient. opmcrlt Procrrdiug.< (q OOPSLA ‘87 Conference. (Oct. 4-8, Orlando, FL.) ACM/ SIGPLAN, New York, 1988, pp. 142-152. 2. Atkinson, M.P., and Buneman, O.P. Types and persistence in database programming languages. ACM Comput. Surv. ZY, 2 (June 1987), 103-190. 3. Blaha, M. R., Premerlani, W.J., and Rumhaugh. J. E. Relational database design using an ohjectoriented methodology. Commr~n. ACM 31, 4 (Apr. 1988), 414-427. 4. Booth, G. Object-oriented development. IEEE Trans. Softw. Eng. SE12. 2 (Feb. 1986), 21 l-221. 5. Carey, M., et al. The architecture of the EXODUS extrnsihle DBMS. In Proceeding5 of the 1986 Intrrnalional Workshop on Object-.Oriented Databa.w Systp,ns (Sept. 23-26, Pacific Grove, Calif.). ACMISIGMOD, New York, 1986. pp. 52-65.

6. Cattell, R.G.G., and Rogers, T.R. Combining object-oriented and relational models of clata. In Proceedings of the 1986 International WorkDalabase .st1op on Objw-Orwnted .S@,ns (Sept. 23-26, Pacific Grove, Calif). ACMISIGMOD, New York, 1986, pp. 212-213.

108

7. Codd. E.F. A relational model of data for large shared data hanks. (,‘ommun. ACM 13. 6 (June 1970).

377-387. 8. Cox. B. J. Object-Orienlrd Propam miug: Au Euolurionary Apponch. Addison-Wesley, Reading, Mass., 1986. 9. Dayal, U., et al. PROBE-A research project in knowledgeoriented database systems: Preliminary analysis. Tech. Rep. CCA-8503. Computer Corporation of America, Cambridge, Ma., 1985. 10. Gadient. A.J. Functional requirements for an electronic design autoenvironment integration mation frame\cork. In Comi)inl 85: Fir-St Intrrwtioncd Conference 00 Computer Aided Technologie,y (Sept. IO- 12, IEEE-G, Montreal, Canada). Washington, D.C.. 1985, pp. 34%

York, 1987, pp. 192-202. 17. hlaier, D., Stein, J., Otis, A., and Purdy, A. Development of an ohjcct-oriented DBMS. In Pr-orrrdiugx of OOPSLA ‘86 CO~IJ~~CIICP (Sept. 29Oct. 2, Portland, Oreg.). ACM/ SIGPLAN, New York, 1986. pp.

354. 11. Haskin.

R.L., and Lorie, R.A. On extending the functions of a relational database system. In Proceediqs of SIGMOD ‘82 International Cor~wence on Ma?ragemenl of Dala Fla.). ACM/ (June 2-4, Orlatldo, SIGMOD, New York, 1982, pp. 207-212. 12. Kernighan, B. W.. and Pike, R. The UNIX Progl-ammiug Enuironmen~. Prentice-Hall, Englewood Cliffs, N.J.. 1984. 13. Khoshafian, S. N., and Copeland, G. 1’. Object identity. In Proceedings oJOOPSLA ‘86 Collfirence (Sept. 29Or.). ACM/ Oct. 2, Portland, SIGPLAN, New York, 1986, pp. 406-4 16.

14. Kim, W. et al. Integrating an ohjcctoriented programming system with a database system. In Proceedings of OOPSLA ‘88 Con@rence (Sept. 2530, San Diego, Calif.). ACM/ SIGPLAN, New York, 1988, pp. 142-152. 15. Klahold, P., et al. A transaction model supporting complex applications. In Procredinp of SIGMOD ‘85 International Conference on Mowagement oJ Data (Mq 28-31, AUS& Tex.). ACMISIGPLAN, NPW York,

1985, pp. 388-401. 16. Loomis, M.E.S., Shah, A.V.S., and Rumhaugh. J.E. An object modeling technique for conceptual design. In Proceedings of the Eurofwu~ Conference on Object-Oriented Programming (June 15-17, Paris, France). Lecture Notes in Computer Science, 276. Springer-Verlag, New

November

472-482. 18. MIMER

Infin-mation Systems AB, Uppsala, Sweden. 19. Rumhaugh. J.E. Relations as semantic constructs in an ohjectoriented language. In Proceedings of OOPSLA ‘87 Confrreuce (Oct. 4-8, Orlando, Fla.). ACM/SIGPLAN. New York, 1987. pp. 466-481.

20. Rumhaugh, J. E. Controlling propagation of operations using attrihutes on relations. In Proceedinp of OOPSLA ‘58 Conferewe (Sept. 2530, San Diego, (ialif.). ACM/ SIGPLAN, New York, 1988. pp.

2X5-296. 21. Rumhaugh

J. et al. Objecf-Orienfed Modeling and Drsip. Prentice-Hall. Englewood Cliffs. N.J., 1991. 22. Shah, A. et al. DSM: An ohjectrelationship modeling language. In Proceeding7 of OOPSLA ‘89 Collferrnce (Oct. l-6. New Orleans, La.). ACM/SIGPLAN, New York, 1989, pp. 191-202. 23. Stonehraker, M., and Rowe, L. The design of POSrGRES. In Proceedings of SIGMOD ‘86 Infrrnalional Confwrnce on Management of Data (May 28-30, Washington, D.C.). ACMISIGMOD, New York, pp.

340-355. CR Categories and Subject Descriptors: D.2.2 [Software Engineering]: Tools and Techniques; D.2. IO [Software Engineering]: DesignmelhodoloLgie?ps; H.2.1 [Database Management]: Logical Design; H.2.4 [Database Management]: Systems General Terms: Design, Performance Additional Key Words and Phrases:

Database checkout, database performance, database shadowing, engineering database application, entity-relationship modeling, long transaction, object-oriented datahase, relational database About the Authors: MICHAEL R. BLAHA

is a computer scientist at GE’s Corporate Research and Development Center in Schenectady, New York. His research interests include engineering database management and complex data modeling. Email: [email protected]

199Ohb1.33,

No.ll/COMMUNICATIOWSOFT”EACM

J. PREMERLANI is a computer scientist at GE’s Corporate Kesearch and Development Center. His include objectinterests I-CSC~tl-Ch oriented methodologies and applications of databases to engineering applications. Email: [email protected]

WILLIAM

E. RUMBAUGH is a computer scientist at GE’s Corporate Research and Dcvelopmcnt Center and is working ml object-oriented methodologies for software design and their imple-

mentation as practical systems for applications. Email: rumbaugh~crd.ge.cc,m A. VARWIG is a senior software design engineer at Cadence in San Diego. His research interests include tools for printed circuit hoard design including automatic placement and routing. Email: [email protected]

THOMAS

.JAMES

Author’s

Present

Address:

M. K. Hlaha, W. J. Premerlani, and J. E. Kmnbaugh, General Electric Company, Corporate Research and Development,

NEW DIRECTIONS COMPUTING AND

IN COMMUNICATIONS

m Journalof VisualCommunication andImageRepresentation EDITORS-IN-CHIEF

Yehoshua Y. Zeevi Technion-lsrue/ Institute of Technology, Ha+

and CAlP Center, Rutgers University, Piscatawuy, New jersey

T. Russell Hsing Bell Communications Research, Morristown,

New lersey

The Journal of Visual Communication and Image Representation publishes papers on the state-of-the-art of visual communication and image representation with emphasis on novel technologies and theoretical work in this multidisciplinary area of pure and applied research. The field of visual communication and image representation is considered in its broadest sense and covers both digital and analog aspects as well as processing and communication in biological visual systems. Volume 2 (1991),4 issues

ISSN1047.3203

In the USA and Canada:$128.00

All other countries:$154.00

Journalof Paralleland DistributedComputing Kai Hwang

Howard Jay Siegel

EDITOR-IN-CHIEF FOR SPECIAL ISSUES AND INVITED PAPERS University of Southern California, Los Angeles

EDITOR-IN-CHIEF FOR SUBMITTED RESEARCH PAPERS Purdue University, West Lafayette, Indiana

This international journal is directed to researchers, engineers, educators, managers, programmers, and users of computers who have particular interests in parallel processing and/or distributed computing. The Journal of Parallel and Distributed Computing publishes original research papers and timely review articles on the theory, design, evaluation, and use of parallel and/or distributed computing systems. The journal features special issues devoted to specific topics such as: parallel architectures and algorithms; algorithms for hypercube computers; parallelism in computer arithmetic; concurrent hypercube computations; frontiers of massively parallel computation; and languages, compilers, and environments for parallel programming. ISSN0743.7315

Volumes11-13 (19911,12 issuer

In the USA and Canada:$279.00

All other cowfries:$346.00

Sample copies and privileged personal rates are available upon request. For more information, pleasewrite or call:

ACADEMIC

PRESS, INC.,

1250 Sixth Avenue,

San Diego,

Journal Promotion Department CA 92101, U.S.A. (619) 699-6742

All prices are in U.S. dollars and me subject to change without notice. Circle

# 15 on Reader

Flll,~ Service

Card

109