Using Graph Grammars for Building the Varlet Database Reverse Engineering Environment

Jens H. Jahnke, Albert Zündorf
AG-Softwaretechnik, Fachbereich 17, Universität Paderborn, Warburger Str. 100, D-33098 Paderborn, Germany; e-mail: [jahnke|zuendorf]@uni-paderborn.de

Abstract

This paper reports on the usage of graph grammar theory and systems for building the Varlet database reverse engineering environment. The Varlet environment supports the analysis of legacy relational database systems, the translation of the relational schema into a conceptual object-oriented schema, the interactive enhancement of the conceptual schema, and the translation of relational data into the resulting object-oriented database. The construction of the Varlet environment exploits the Progres prototyping mechanism. In a previous publication we showed how to employ Triple Graph Grammars to generate a flexible translation mechanism from the relational to the object-oriented data model. However, due to the semantic gap between the relational and the object-oriented data model, there are many possible equivalent translations. Thus, this paper describes an extension of the previously described approach by providing so-called redesign transformations, which allow the enhancement of the target data model by fully exploiting higher-level object-oriented concepts. Here, we use graph grammar theory to prove important properties of redesign transformations (e.g. losslessness).
1. Introduction

A crucial factor for the competitiveness of today's companies is an efficient infrastructure for information management. Such an infrastructure often demands merging business data with engineering data from application domains like computer-aided design (CAD), computer-integrated manufacturing (CIM), or software engineering (SE). Complex dependencies between data stored in different systems (CODASYL, relational, object-oriented) might exist and should be maintained. There are various techniques to achieve the desired integration, e.g. gateways, federated database systems, and database migration; [Rad95] discusses these approaches with their specific benefits and drawbacks.

A prerequisite for using any of these techniques is a complete technical documentation of the information systems which are to be integrated. Unfortunately, such documentation is lacking in many industrial database applications that have evolved over several generations of programmers. These so-called legacy database applications (LDA) impede the establishment of an integrated information management. Thus, database reverse engineering aims to recover an up-to-date conceptual design of an LDA's static data structure. This can be a delicate problem, because the schema catalogs of older database systems provide only information about the physical (low-level) representation of data. For example, older relational database management systems (RDBMS) do not provide means to explicitly represent relationships between database tables (foreign-key constraints) [BCN92] in their schema definition. Abstract modeling concepts like inheritance, aggregation, and n-ary associations cannot be expressed in the relational data model. Additionally, many physical schemas of LDAs comprise optimizations that make it even harder to grasp the real semantics of the data structure.

The first activity in the data reverse engineering process is to analyze all available sources of information about the LDA in order to yield a semantically annotated database schema. Then, the annotated schema is transformed into an equivalent object-oriented (OO) conceptual schema. Like the analysis task, this transformation has to be done interactively, because, in general, there are many possible conceptual schemas for a given LDA. Both activities are performed incrementally in an intertwined reverse engineering process.
Tool support is especially important to cope with the complexity of reverse engineering LDAs that comprise several hundred thousand lines of code and maintain a vast amount of data. However, the development of environments that support the described process is a demanding task. Currently available tools are insufficient, because they impose waterfall-like, strictly phase-oriented processes. In many cases, their transformation rules are hard-coded in a low-level programming language and cannot be modified or extended easily.

To build a database reverse engineering tool which overcomes these problems, we follow an approach proposed by [Nag96]. This means we use abstract syntax graphs and structure-oriented editors specified with graph grammars [Ros97] for building relational and conceptual schema editors. To transform relational into conceptual graph schemata and to keep these schemata consistent during incremental editing, we adapted Triple Graph Grammars (TGG) invented by [LS96]. This allowed the development of a set of TGG schema transformation rules specifying the translation from relational to conceptual schemata, cf. [JSZ96]. In [Hol97] we developed a compiler that facilitates the derivation of incremental forward and backward transformation tools from such TGG specifications. This approach was used to derive the first version of our Varlet database reengineering environment.

Experience with this first prototype showed that our TGG specification soon grew excessively in order to describe all possible (and reasonable) schema translations. To overcome this problem, in the current version of the Varlet environment we combine a simple TGG-based schema mapping with a set of conceptual schema redesign transformations defined by graph rewriting rules. We first use the TGG to translate the annotated relational schema into an initial OO schema. Then we employ redesign transformations to enhance this initial translation to fully exploit the higher-level object-oriented concepts. In addition, this approach makes it easier to incorporate other data models for legacy database systems, e.g. the hierarchical or network data model.

In this paper we report on our experiences with the application of graph grammars for developing Varlet as a flexible and incremental database reverse engineering tool. Major benefits of using graph grammars in Varlet are their high level of abstraction and their rich theory. In Section 2 we illustrate the desired functionality by a small reverse engineering sample scenario that is used throughout this paper. In Section 3 we employ graph grammar theory to determine important properties of schema redesign transformations, e.g. the impact of transformations on the information capacity of transformed schemas. Finally, in Section 4 we summarize our results and compare our approach with related work.
2. From Relational Schemata to Equivalent OO Schemata

A sample scenario

The analysis starts with the extraction of the LDA's schema catalog. This will reveal at least the existing tables together with their attributes. Schema catalogs of newer database applications may additionally contain information on candidate keys and referential integrity constraints. Figure 1 shows a screenshot of the Varlet Analysis Tool, which displays a detail of a relational document information system. Here, foreign keys are represented as directed edges between tables, and attributes that belong to candidate keys are set in bold face.

As stated in the introduction, a fully automatic schema extraction will normally fail to detect all relevant information and to recover the higher-level design concepts of a schema. Thus, the analysis tool supports the reverse engineer in retrieving further semantic information about the legacy schema, by inspecting the application code, stored procedures, and event handling procedures and by inspecting the data itself. The analysis results are used to further annotate the relational schema. For example, the equal sign "=" in the triangles of the foreign key from table Version to ProductSpec (cf. Figure 1) denotes that the reverse engineer assumes an inclusion dependency (IND) over columns doc_id and id in both directions. Such a foreign key with an inverse IND imposes the cardinality constraint that for every docid in ResearchReport there exists a tuple in Version with the same value in column id and vice versa. Furthermore, in our example, the analysis reveals that there are two different variants of tuples in table Version, and that the legacy schema comprises an optimization structure for a many-to-many relationship between tables Keywords and ProductSpec that consists of five foreign keys (doc0-4) [PB94].
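To make the notion of a foreign key with an inverse inclusion dependency concrete, the following Python sketch checks whether two table columns are mutually inclusion dependent. It is only an illustration on toy data; the function and the sample tuples are ours and are not part of the Varlet analysis tool.

```python
# Illustrative sketch (not Varlet code): checking a bidirectional
# inclusion dependency (IND) between two table columns.

def inclusion_dependency(rows_a, col_a, rows_b, col_b):
    """True if every value of col_a in rows_a also occurs in col_b of rows_b."""
    values_b = {row[col_b] for row in rows_b}
    return all(row[col_a] in values_b for row in rows_a)

# Toy extensions of the example tables (column names as in Figure 1).
version = [{"id": 1}, {"id": 2}]
product_spec = [{"doc_id": 1}, {"doc_id": 2}]

forward = inclusion_dependency(version, "id", product_spec, "doc_id")
inverse = inclusion_dependency(product_spec, "doc_id", version, "id")

# The foreign key together with its inverse IND holds only if both
# directions hold; this induces the cardinality constraint described above.
print(forward and inverse)  # True for this toy data
```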
Figure 1. Varlet Environment, Relational Analysis View

A detailed description of how this information can be extracted is beyond the scope of this paper; it can be found in other contributions [JH98, JSZ97, FV95, And94, PB94, PKBT94, SLGC94].

Once the relational schema has been retrieved and enriched by our analysis tool, it is transformed into an equivalent OO schema. This transformation must exploit all semantic annotations and has to employ the additional expressive power of the OO data model as far as possible. Figure 2 shows a screenshot of the Varlet (Re)Design Tool, which displays the initial transformation of our sample relations. The knowledge about the different variants of tuples in table Version was used to create an inheritance hierarchy for class Version with two new subclasses (ArchiveVersion and OnlineVersion)3. The two occurring relational implementations of many-to-many relationships, i.e. the join table Xref and the optimization structure between tables Keywords and ProductSpec, are mapped to associations between the corresponding classes. Furthermore, knowledge about not-null constraints and inverse INDs in the relational schema is used to determine the cardinality constraints of OO associations.
Figure 2. Varlet Environment, (Re)Design View

3. The names of new classes cannot be determined by the transformation tool but have to be added by the user.
Triple Graph Grammars

Transforming the relational schema into a conceptual schema is a central task in database reengineering. Within this task one frequently faces unforeseen situations like denormalized relational schemas and various kinds of optimization structures. Later on, the reengineer stepwise enhances the initial OO schema, exploiting additional concepts like aggregation and inheritance. This frequently triggers further analysis and annotation of the original legacy schema. Many existing database reverse engineering tools are not able to deal with such situations because their transformation process is hard-coded in a general-purpose programming language. On the other hand, the problem of keeping the two schemas consistent is a well-known application for so-called Triple Graph Grammars, originally introduced by Lefering and Schürr [LS96] and further developed in [JSZ96, Hol97]. Thus, in our approach, the mapping between the relational and the OO data model is defined by an adaptable set of schema mapping rules defined by a Triple Graph Grammar (TGG). Using TGGs enables us to specify schema mapping rules on a very high level of abstraction and to adapt the schema mapping easily to new situations. Due to lack of space, we refer to [JSZ96] for details concerning the usage of TGGs for schema mapping.
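As a rough intuition for the kind of initial, still rather "relational" translation described in the next section, the following Python sketch performs a naive table-to-class mapping: tables become classes, columns become attributes, and annotated foreign keys become associations. The function and its data layout are our own illustration, not the incremental translator that Varlet derives from the TGG specification.

```python
# Naive illustration (not the TGG-derived translator): map an annotated
# relational schema to an initial OO schema.

def initial_oo_schema(tables, foreign_keys):
    """tables: {table: [column, ...]};
    foreign_keys: [(src_table, src_col, dst_table, dst_col), ...]"""
    classes = {name: {"attributes": list(cols), "associations": []}
               for name, cols in tables.items()}
    for src, src_col, dst, dst_col in foreign_keys:
        # Every annotated foreign key becomes an association between
        # the two corresponding classes.
        classes[src]["associations"].append((dst, f"{src_col} -> {dst_col}"))
        classes[dst]["associations"].append((src, f"{dst_col} <- {src_col}"))
    return classes

# A toy fragment of the sample schema from Figure 1.
tables = {"Version": ["id", "variant"], "ProductSpec": ["doc_id", "title"]}
fks = [("Version", "id", "ProductSpec", "doc_id")]
print(initial_oo_schema(tables, fks)["Version"])
```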
3. Schema Redesign Transformations

Due to the semantic gap between the two data models, the OO schema resulting from the initial schema transformation is still quite "relational". Basically, tables become classes, table attributes become class attributes, and inclusion dependencies are turned into associations. In order to exploit the additional OO concepts like aggregation (i.e. complex attribute types) and inheritance, the initial OO translation of the legacy schema can be restructured interactively using the Varlet (Re)Design tool. To achieve this, Varlet provides a number of schema redesign transformations defined by graph rewriting rules. To illustrate the application of redesign transformations in Varlet, Figure 3 shows a second screenshot of the (Re)Design tool that displays a restructured version of the initial OO schema shown in Figure 2: it contains a new generalization Document for classes ProductSpec and ResearchReport, class ProductSpec has been split into two associated classes ProductSpec and Product, class Version has been aggregated into class ProductSpec, class Keywords has been transformed into a complex attribute of class ProductSpec, and artificial keys have been removed.

Figure 3. Redesigned Object-Oriented Schema

In the following, we underline our ideas with semi-formal definitions and propositions and a proof sketch. Our theory is based on the double-pushout approach, cf. [Ros97]. Thus, we employ "edge-object" graphs consisting of sets of node and edge identifiers, source and target functions for edges, and node and edge labeling functions. In addition, our nodes are attributed. Within this paper we model databases as graphs consisting of a database schema representation (upper part of Figure 4) and a representation of the current database extension (lower part of Figure 4). We use nodes of type Class to represent classes. Associations are represented by two association stub nodes mutually connected via assoc edges. An association stub node represents the role that the corresponding class has in that association.
Figure 4. (Abstract Syntax) Graph Representing Parts of an OO Schema and Its Database Extension

We use nodes of type 1-RefT to represent "exactly-one" roles, nodes of type 0-1-RefT to represent "at-most-one" roles, and nodes of type SetRefT to represent "many" roles. AttrT nodes represent the attributes of classes. Within graph rewriting rules we use type ClassPropT as a super type for all kinds of class properties. All class properties are connected to their class via a has edge. All database schema nodes carry a name attribute. Nodes of type Obj represent objects in a database extension. Object properties are represented via 1-Ref nodes (exactly-one links), SetRef nodes (many links), etc. Each node in the database extension has exactly one inst_of edge connecting it to the corresponding database schema element. For database graphs a large number of consistency constraints must hold that are out of the scope of this paper. As one example, we demand that a 1-Ref node has exactly one outgoing link edge connecting it to another link stub node. (Note that SetRef nodes may have no or multiple outgoing link edges.)

Redesign Transformation Rules

Formally, a schema redesign transformation rule T is defined as a pair of (double-pushout) graph rewriting rules T = (s, i), where s is called the structure transformation rule and i is the instance mapping rule of T. The structure transformation rule represents a function s: S_T → S* that is defined on the subset S_T ⊂ S* of all schemas S* that satisfy the precondition of T, i.e. which contain a valid match for the left-hand side of s. It replaces a given source schema S ∈ S_T by a target schema S'. Consequently, the instance mapping rule i: µ(S) → µ(S') converts valid database extensions of the original schema into valid extensions of the target schema S'. Here, µ(S) denotes the information capacity of a given schema S, which is defined as the set of all valid database states (or instances) of S. In addition, s must be contained in i, i.e. there exists the inclusion morphism id: s → i. Later on, s will serve as a subrule for the amalgamation of i to a parallel graph rewriting rule (cf. [Tae96]) applied to a graph schema together with its extension(s).

Figure 5 shows an example redesign transformation rule, which consists of the structure transformation rule (upper part of Figure 5) and the instance mapping rule (whole Figure 5). Note that we use a Progres-like notation, cf. [Ros97, Nag96], to represent our graph rewriting rules. However, we employ the double-pushout approach. Thus, the figure shows only the left graph L and the right graph R of the represented graph rewriting rule. The gluing graph K consists of the common parts of L and R. The morphisms l: K → L and r: K → R are the obvious inclusions. Applied to a class c, the structure transformation rule SplitClass generates a new class 5 with name newName and with a 1:1 association to the original class. Consequently, the instance mapping rule migrates objects of the original class to linked pairs of objects that comply with the new schema.

Figure 5. Preserving Redesign Transformation SplitClass(c: Class, newName: String); precondition: ∀ c2 ∈ Class : c2.name != newName
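To make the graph representation of Figure 4 more tangible, the following Python sketch models schema and extension nodes together with their has, assoc, inst_of, and link edges. It is a deliberately simplified illustration of the data structure described above, not the attributed graph schema actually specified in Progres; the names Node and Database are ours.

```python
# Simplified Python model of the database graphs sketched in Figure 4
# (our own illustration; Varlet specifies these graphs in Progres).

class Node:
    """A graph node: schema nodes (Class, 1-RefT, SetRefT, AttrT, ...)
    or extension nodes (Obj, 1-Ref, SetRef, ...)."""
    def __init__(self, kind, name=""):
        self.kind = kind
        self.name = name        # database schema nodes carry a name attribute
        self.has = []           # properties owned by a class or an object
        self.assoc = []         # partner association stub(s)
        self.inst_of = []       # extension node -> corresponding schema node
        self.link = []          # link edges between reference stubs

class Database:
    """A database graph: the schema part plus its current extension."""
    def __init__(self):
        self.schema = []
        self.extension = []

# A tiny database graph: class Version with one attribute type and one object.
db = Database()
version = Node("Class", "Version")
variant_t = Node("AttrT", "variant")
version.has.append(variant_t)
db.schema += [version, variant_t]

obj = Node("Obj")
obj.inst_of.append(version)     # exactly one inst_of edge per extension node
db.extension.append(obj)
```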
Execution of Redesign Transformation Rules

The parallel execution of a redesign transformation rule is defined by fully synchronized partial coverings [Tae96], where the instance mapping rules are fully synchronized by the (contained) structure transformation rule. In addition, we consider only injective matches for the structure transformation rule and each application of the instance mapping rule. Informally, a redesign transformation rule is executed by first finding a match for (the left-hand side of) its structure transformation rule. Then we look for all possible extensions of the structure rule match to matches for the instance mapping rule. Thereby, we construct an amalgamated graph rewriting rule. The execution of the amalgamated graph rewriting rule executes the structure transformation rule only once, while the "instance" parts apply to all occurrences. For Figure 5 this means that we first look for a match for class c (which is actually passed as a parameter). Then we look for all objects in the current database extension that comply with node 2, i.e. that are connected to the match of c via inst_of edges. The resulting amalgamated rule generates a new class 5 with name newName and with a 1:1 association to the original class. In addition, all instances of the original class are transformed to linked pairs of objects that comply with the new schema.

In order to be applicable, our sample redesign transformation rule SplitClass (Figure 5) has to fulfill the parallel gluing condition [Tae96, Def. 4.2.10]. The parallel gluing condition requires that the amalgamated rule fulfills the dangling edge condition and the identification condition. Here, the dangling edge condition requires that the deletion of nodes must not create dangling edges, and the identification condition demands that edges or nodes that match the same host graph element have to be preserved.

A redesign transformation rule aims to specify a modification of database schemas including the migration of the current extension. However, we want to provide general redesign transformations that apply to a database schema independently of its current extension. Thus, whenever the structure transformation rule is applicable, we require that the instance mapping rule is able to migrate the data. Therefore, we introduce the following definition:

General Redesign Transformation Rules

For a valid redesign transformation rule T we require that for any database schema S and any of its extensions µ(S) the amalgamated rule must fulfill the parallel gluing condition. We call this condition the general parallel gluing condition. A redesign transformation rule that fulfills the general parallel gluing condition is called a general redesign transformation rule. Our example redesign transformation rule SplitClass fulfills the general parallel gluing condition since the rule does not delete any element; thus its amalgamated applications do not delete anything either and thereby trivially fulfill the parallel gluing condition.
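Continuing the toy model from the sketch above, the following Python function mimics the amalgamated execution of SplitClass: the structure part is applied once to the schema, and the instance part is applied to every object of the matched class. The function and variable names are our own; this is an illustration of the execution scheme, not the Progres rule of Figure 5.

```python
# Sketch of SplitClass executed as an amalgamated rule on the toy model
# (structure part once, instance part for every object of class c).

def split_class(db, c, new_name):
    # Precondition of SplitClass: no class with the new name exists yet.
    assert all(n.name != new_name for n in db.schema if n.kind == "Class")

    # Structure transformation rule (applied exactly once):
    # create the new class and a 1:1 association to the original class.
    new_class = Node("Class", new_name)
    ref_c, ref_new = Node("1-RefT"), Node("1-RefT")
    ref_c.assoc.append(ref_new)
    ref_new.assoc.append(ref_c)
    c.has.append(ref_c)
    new_class.has.append(ref_new)
    db.schema += [new_class, ref_c, ref_new]

    # Instance mapping rule (applied to all matching objects):
    # migrate every object of c to a linked pair of objects.
    for o in [n for n in db.extension if n.kind == "Obj" and c in n.inst_of]:
        partner = Node("Obj")
        partner.inst_of.append(new_class)
        r1, r2 = Node("1-Ref"), Node("1-Ref")
        r1.inst_of.append(ref_c)
        r2.inst_of.append(ref_new)
        r1.link.append(r2)
        r2.link.append(r1)
        o.has.append(r1)
        partner.has.append(r2)
        db.extension += [partner, r1, r2]
    return new_class

split_class(db, version, "VersionData")   # "VersionData" is a made-up name
```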
Database reverse engineering aims at recovering conceptual designs that reflect the semantics of LDAs. Thus, a major issue is to prove the equivalence of the initial schema with its resulting counterpart. One possibility to achieve this is to produce the resulting schema by applying a sequence of information-preserving transformations to the initial schema. The notion of information preservation can be defined based on the losslessness of schema transformation rules, cf. [Hai91, BCN92]:

Definition 1: Losslessness of General Redesign Transformation Rules
Let T = (s, i) be a general redesign transformation rule and let T⁻¹ = (s⁻¹, i⁻¹) be constructed by swapping the left- and right-hand sides of s and i. T is called lossless (reversible) iff the associated mapping T: µ(S) → µ(S') is injective and
1. T⁻¹ is a total function on T(µ(S)), and
2. T⁻¹ ∘ T = id on µ(S).
This means that any extension δ ∈ µ(S) of a schema S that has been transformed by T can be rebuilt by applying T⁻¹ to the transformed instance. We call a given schema transformation rule T = (s, i):
• information-augmenting (IA) if T is lossless,
• information-preserving (IP) if T is symmetrically lossless, i.e. T and its inverse transformation T⁻¹ are lossless.

For parallel graph rewriting rules we aim to employ static criteria to prove the above properties. Here, we were inspired by [HW95]. We need the following proposition:

Proposition 1: Losslessness Criterion 1
Let T = (s, i) be a database schema transformation and let T⁻¹ = (s⁻¹, i⁻¹) be constructed by swapping the left- and right-hand sides of s and i. If for all H, G with H = T(G) it holds that
1. H contains only one match for s⁻¹, and
2. H does not contain any match for i⁻¹ other than those constructed by T,
then T is lossless.

Proof sketch: An amalgamated graph rewriting rule is a double-pushout rule, too. Thus, the amalgamated rule is reversible, i.e. the amalgamated rule applied to the match it created during the forward execution results in the original graph, cf. [Tae96]. However, applying T⁻¹ to H we have to construct the amalgamation anew. During the (forward) application of T on G, each execution of i has created an "original" match for i⁻¹. Each of these original matches provides the "original" match for s⁻¹. If our graph contains no other match for s⁻¹ and i⁻¹, the application of T⁻¹ results in exactly the same amalgamation as the one used for the forward transformation. This implies that T⁻¹ is a total function on T(µ(S)) and T⁻¹ ∘ T = id.

Proposition 2: Losslessness Criterion 2
Criterion 1 holds iff there exists a subgraph sg of the right-hand side Ri of instance mapping rule i such that for all H, G with H = T(G) it holds that
1. H contains only one match for s⁻¹,
2. for any two matches m1, m2 of Ri in H, m1|sg = m2|sg implies m1 = m2, and
3. for any match msg of sg in H there exists an "original" match m' of Ri in H such that m'|sg = msg.
Condition 2 of Criterion 2 demands that sg determines the whole match of Ri. Condition 3 says that there is no match for sg other than an "original" one.

Proof: If Criterion 1 holds, sg = Ri fulfills the requirements of Proposition 2: Ri fulfills condition 2 trivially, and if sg equals Ri, condition 3 of Criterion 2 equals condition 2 of Criterion 1. Conversely, assume Criterion 2 holds but there exists a match mx of Ri different from any original match of Ri. The additional match mx contains a match msg for sg. Due to Criterion 2 there exists an original match m' of Ri such that mx|sg = m'|sg, i.e. mx and m' have msg in common. Thus, condition 2 of Criterion 2 implies that mx = m'. This contradicts our assumption. Thus, Criterion 2 implies Criterion 1.
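To make Definition 1 concrete on the toy model, the sketch below adds an inverse of the split_class function from the previous sketch and checks that a forward application followed by the inverse restores the original database. This is a round-trip test on a single example under our simplified model, not a proof of losslessness.

```python
# Round-trip check for losslessness on the toy model (cf. Definition 1).

def split_class_inverse(db, new_class):
    """Undo split_class: remove the new class, its association stubs,
    the partner objects, and the connecting 1-Ref stubs."""
    ref_new = new_class.has[0]            # the only property of the new class
    ref_c = ref_new.assoc[0]
    owner = next(n for n in db.schema if n.kind == "Class" and ref_c in n.has)
    owner.has.remove(ref_c)
    for n in (new_class, ref_new, ref_c):
        db.schema.remove(n)

    removed = []
    for o in [n for n in db.extension
              if n.kind == "Obj" and new_class in n.inst_of]:
        r2 = o.has[0]                     # the partner object's only property
        r1 = r2.link[0]
        orig = next(n for n in db.extension if n.kind == "Obj" and r1 in n.has)
        orig.has.remove(r1)
        removed += [o, r1, r2]
    for n in removed:
        db.extension.remove(n)

# Forward transformation followed by the inverse restores the original graph.
db2 = Database()
c = Node("Class", "ProductSpec")
db2.schema.append(c)
o = Node("Obj")
o.inst_of.append(c)
db2.extension.append(o)
before = (len(db2.schema), len(db2.extension), list(c.has))

nc = split_class(db2, c, "Product")       # cf. the split shown in Figure 3
split_class_inverse(db2, nc)
assert (len(db2.schema), len(db2.extension), list(c.has)) == before
```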
Properties of SplitClass

These criteria are handy enough to show the losslessness of our example schema transformation rules easily. In the following we will show that the subgraph sg' consisting of nodes 5 and 6 of the right-hand side of SplitClass together with the connecting inst_of edge fulfills losslessness Criterion 2:

Condition 1: Node 5 carries a name that identifies the match for node 5 uniquely. Since the corresponding class has just been created, it can have only one property, which is matched by node 4. Thus, H contains only one match for s⁻¹.

Condition 2: Assume we have a match msg' for sg'. The match for node 5 is the new class that has been created by applying SplitClass to the start database. This just created class contains only one property, the corresponding 1-RefT association stub. This 1-RefT node serves as the only possible match for node 4 of SplitClass's right-hand side and the connecting has edge. In turn, the match for node 4 determines the match for the partner association stub 3 and the connecting assoc edges uniquely. Similarly, the object matched by node 6 has only a single link stub attached, which determines the match for node 8 and the connecting has edge. The match for node 8 determines the match for node 7, the connecting link edges, and the inst_of edges from 7 and 8 to 3 and 4, respectively. Thus, the match for nodes 5 and 6 determines the whole match.

Condition 3: Assume there exists a match msg' for sg' that has not been created by SplitClass. The class c5 matched by msg' has a unique name that did not exist before the execution of SplitClass, cf. SplitClass's precondition. Since c5 has just been created by SplitClass, the match for the inst_of edge connecting node 6 and node 5 must have been created by SplitClass, too. Since SplitClass connects only just created objects to the new class c5, the match for node 6 must have been created by SplitClass. Thus, the match for sg' is part of an original match m', which contradicts our assumption. Thus, condition 3 holds.

Together, the three conditions prove that SplitClass is lossless.

Losslessness of SplitClass⁻¹: First we have to show that SplitClass⁻¹ fulfills the general parallel gluing condition. The structure rule s⁻¹ of SplitClass⁻¹ is applicable only if the match for node 5 contains no other property than the one matched by node 4; otherwise s⁻¹ would violate the dangling edge condition. Therefore, any object o matched by node 6 has only a single property, which is matched by node 8. Thus i⁻¹ will not violate the dangling edge condition either. Different matches for i⁻¹ cannot overlap in nodes 6 or 8, since a match for node 6 or 8 determines the entire match (thus, all matches overlapping in 6 or 8 are equal). Thus, the identification condition cannot be violated either. Hence, the general parallel gluing condition holds for SplitClass⁻¹.

To prove the losslessness, we use Criterion 1. The match for s is already determined by the parameter. Assume there exists a match mx of the left-hand side of SplitClass in a graph G that has been created by applying SplitClass⁻¹ to a graph H, and that mx is not an "original" match m'. Since class c has been provided as a parameter, the match cc of class c is common to mx and m'. For our assumption to hold, mx and m' must differ in their match for node 2. In mx, node 2 applies to an object o2 that belongs to class cc. According to the semantics of 1:1 associations, in H object o2 must have had a unique neighbor object o6 that could have served as a match for node 6 in the application of SplitClass⁻¹. This application of SplitClass⁻¹ would have created a match m' for the left-hand side of SplitClass.
This match m’ would be equal to mx’. This contradicts our assumption and all together proofs that SplitClass-1 is lossless. Thus, SplitClass is a general, information preserving redesign transformation rule. Properties of MoveProperty Figure 6 shows another general, information-preserving schema transformation rule: MoveProperty is used to transfer a certain class property prop (attribute or reference) from class 3 to class 4 via a total 1:1 association. MoveProperty fulfills the general parallel glueing condition: Informally, each match of the instance part of MoveProperty applies to a different pair of database objects that are connected via a 1-1 link where one of the objects holds the property to be moved to the other. There is no way to overlap two different matches at all. Thus, different matches do not interfere and the general parallel glueing condition holds. Condition 1 of Criterion 2 holds since the parameters via and prop determine the match of s and s-1,
Condition 1 of Criterion 2 holds since the parameters via and prop determine the match of s and s⁻¹ uniquely. We will use nodes prop, 4, 6 and 8 and the connecting edges as subgraph sg'. Obviously, sg' identifies any match of MoveProperty⁻¹.

Condition 3: Assume there exists a "foreign" match msg for sg'. msg matches node 8 to an object o8 that is an instance of the class c4 matched by node 4. Before the application of MoveProperty no instance of class c4 owned a property of type prop. Thus, the match for nodes 6 and 8 and the connecting has edge must have been created by an "original" match m'. Since msg and m' overlap in nodes 6 and 8 and the connecting has edge, they overlap in nodes prop and 4 and their connecting edges, too. This contradicts our assumption and thereby proves the losslessness of MoveProperty. Symmetric arguments imply the losslessness and general applicability of MoveProperty⁻¹.

Properties of SetCardToMany

We have shown that the general redesign transformation rules SplitClass and MoveProperty are information-preserving. An example of an information-augmenting redesign transformation (SetCardToMany) is presented in Figure 7. It transforms an x:1 association between two classes into an x:many association. The losslessness of SetCardToMany can be shown using the same techniques as above. On the other hand, SetCardToMany⁻¹ violates the general dangling edge condition, since there may exist database extensions where node 10 matches a SetRef with more than one outgoing link. Thus, redesign transformation SetCardToMany is information-augmenting.

Figure 7. Augmenting Redesign Transformation SetCardToMany(ref_t: 1-RefT)

All three redesign transformations presented up to now are so-called primitive transformations. The user can define more complex redesign transformations by specifying macros for concatenations of primitive transformations. An example of such a complex transformation is SplitAttrs (cf. Figure 3),
which is a concatenation of the primitive transformation SplitClass followed by several applications of MoveProperty. SplitAttrs is information-preserving because SplitClass and MoveProperty are information-preserving.
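Continuing the toy model, a macro like SplitAttrs can be sketched as a plain concatenation of the primitive transformations defined above. The class, attribute, and parameter names below, as well as the Attr node type for attribute values, are invented for illustration.

```python
# SplitAttrs as a concatenation of SplitClass and repeated MoveProperty.

def split_attrs(db, c, new_name, props):
    new_cls = split_class(db, c, new_name)
    via = new_cls.has[0].assoc[0]      # the 1-RefT stub owned by c
    for prop in props:
        move_property(db, via, prop)
    return new_cls

# Usage: split an attribute of ProductSpec off into a separate class.
db3 = Database()
spec = Node("Class", "ProductSpec")
keywords_t = Node("AttrT", "keywords")
spec.has.append(keywords_t)
db3.schema += [spec, keywords_t]

o = Node("Obj")
o.inst_of.append(spec)
attr = Node("Attr")                    # assumed node type for attribute values
attr.inst_of.append(keywords_t)
o.has.append(attr)
db3.extension += [o, attr]

split_attrs(db3, spec, "KeywordSet", [keywords_t])
```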
4. Conclusions

The Varlet environment exploits the power of graph grammars in various ways. First, we generate structure-oriented editors for different kinds of database schemas following the Progres prototyping approach [Nag96]. The richness of graph grammar theory allows the introduction of information-preserving redesign transformations and provides "handy" static criteria that facilitate the proof of such properties. In addition, graph grammar theory allows us to derive powerful data translation tools from a mapping specification, which may exploit inherent parallelism for the translation of database extensions.

The Varlet environment comprises about 235,000 lines of (C, C++, and Tcl/Tk) code (LOC). We have developed the 195,000 LOC core parts of Varlet using Progres. The Progres specification consists of about 18,000 LOC.
Acknowledgements

Many thanks to Gabi Täntzer, Anika Wagner, and especially to Reiko Heckel for their private lessons in double-pushout theory, proofreading, discussions, and recommendations.

References

[And94] M. Andersson. Extracting an Entity Relationship Schema from a Relational Database through Reverse Engineering. In Proc. of the 13th Int. Conference on the Entity Relationship Approach, Manchester, pages 403-419. Springer, 1994.
[FV95] C. Fahrner and G. Vossen. Transforming Relational Database Schemas into Object-Oriented Schemas according to ODMG-93. In Proc. of the 4th Int. Conf. on Deductive and Object-Oriented Databases, 1995.
[Hai91] J.-L. Hainaut. Entity-generating schema transformations for entity-relationship models. In Proc. of the 10th Entity-Relationship Conference, San Mateo, 1991.
[Hol97] Jens Holle. Ein Generator für integrierte Werkzeuge am Beispiel der objekt-relationalen Datenbankschemaintegration. Master's thesis, Universität Paderborn, Fachbereich 17, July 1997.
[HW95] R. Heckel and A. Wagner. Ensuring consistency of conditional graph grammars - a constructive approach. In Proc. of SEGRAGRA '95, "Graph Rewriting and Computation", Electronic Notes in Theoretical Computer Science, 2, http://www.elsevier.nl/locate/entcs/volume2.html.
[JH98] J. H. Jahnke and M. Heitbreder. Design recovery of legacy database applications based on possibilistic reasoning. In Proc. of the 7th IEEE Int. Conf. on Fuzzy Systems (FUZZ'98), Anchorage, USA. IEEE Computer Society, May 1998.
[JSZ96] J. H. Jahnke, W. Schäfer, and A. Zündorf. A design environment for migrating relational to object oriented database systems. In Proc. of the 1996 Int. Conference on Software Maintenance (ICSM'96). IEEE Computer Society, 1996.
[JSZ97] J. H. Jahnke, W. Schäfer, and A. Zündorf. Generic fuzzy reasoning nets as a basis for reverse engineering relational database applications. In Proc. of the European Software Engineering Conference (ESEC/FSE), number 1302 in LNCS. Springer, September 1997.
[LS96] M. Lefering and A. Schürr. The IPSEN Book, chapter Specification of Integration Tools. Springer, Berlin (LNCS 1170), 1996.
[Nag96] M. Nagl, editor. The IPSEN Book, volume 1170 of LNCS. Springer, Berlin, 1996.
[PB94] W. J. Premerlani and M. R. Blaha. An approach for reverse engineering of relational databases. Communications of the ACM, 37(5):42-49, May 1994.
[PKBT94] J. Petit, J. Kouloumdjian, J.-F. Boulicaut, and F. Toumani. Using queries to improve database reverse engineering. In Proc. of the 13th Int. Conference on the Entity Relationship Approach, Manchester, pages 369-386. Springer, 1994.
[Rad95] Elke Radeke. Federation and Migration among Database Systems. PhD thesis, University of Paderborn, Dept. of Mathematics and Computer Science, 1995.
[Ros97] Grzegorz Rozenberg, editor. Handbook of Graph Grammars and Computing by Graph Transformation. World Scientific, Singapore, 1997.
[Sch97] A. Schürr. Handbook of Graph Grammars and Computing by Graph Transformation, volume 1, chapter Programmed Graph Replacement Systems, pages 479-541. World Scientific, Singapore, 1997.
[SLGC94] O. Signore, M. Loffredo, M. Gregori, and M. Cima. Reconstruction of ER schema from database applications: a cognitive approach. In Proc. of the 13th Int. Conference on the Entity Relationship Approach, Manchester, pages 387-402. Springer, 1994.
[Tae96] Gabriele Taentzer. Parallel and Distributed Graph Transformation: Formal Description and Application to Communication-Based Systems. PhD thesis, TU Berlin, Fachbereich 13, 1996.