Selection of Reverse Engineering Methods for Relational Databases

Lurdes Pedro-de-Jesus, Pedro Sousa
IST / INESC, R. Alves Redol, 9, 1017 Lisboa Codex, Portugal
{Lurdes.Jesus, Pedro.Sousa}@inesc.pt

Abstract
The problem of choosing one method to use in the reverse engineering of existing relational database systems is not a trivial one. On the one hand, methods have different input requirements. On the other hand, each legacy system has its particular characteristics that restrict the available information. Our experience has identified different information types, with different degrees of availability and reliability, which can be found in existing systems. In this paper we present a short description of several reverse engineering methods and propose a classification model. The characterisation of the methods is organised into five categories: input, assumptions, output, methodology and main contributions. The classification model is based on the input requirements of the method, namely: attribute semantics, attribute name coherency, data instances, application code, candidate keys, 3NF, inclusion dependencies and human input. We analyse the applicability of each method to existing database systems, as well as the possibility of applying different methods to different parts of the system.

Key words: database reverse engineering; relational model; legacy systems

1 - Introduction
Information systems (ISs) supported by relational databases are nowadays common. Some ISs are old and no longer adequate to the current needs they are intended to serve. Many ISs have been constructed using conceptual modelling, others have not. In both cases, it is very common to find systems that are ill-designed and poorly documented. Even when organisations have upgraded their hardware and software infrastructures, the scenario remains unchanged. In fact, we have found information systems supported by recent versions of Relational Database Management Systems which still exhibit the whole set of characteristics typical of ancient systems.

There are several motives that require the identification of the conceptual schema of an existing IS, such as maintenance, migration between database paradigms or platforms, and integration with new software to be developed. As defined by Chikofsky and Cross [Chikofsky90], reverse engineering is “the process of analysing a subject system to identify the system’s components and their relationships and create representations of the system in another form or at a higher level of abstraction”. Database Reverse Engineering (DBRE) deals with the tasks of understanding legacy databases and extracting their design specifications (domain semantics) [Chiang97]. In recent and well-designed systems, DBRE requires the analysis of the database catalogue and the examination of specific code, such as triggers used to define constraints that could not be expressed in the database catalogue. But even in these systems, most constraints are spread over the many applications that access the database. In older systems, the scenario may be much more diversified. Therefore, in all cases, DBRE requires the analysis of the database catalogue, data mining of the database, interaction with users (programmers and analysts with knowledge of the system) and, to some extent, comprehension of the application source code.

Today, a wide range of DBRE methods is known. Each of them exhibits its own methodological characteristics, produces its own outputs, and requires specific inputs and assumptions. In order to select the most adequate method according to the available information, we adopted a classification mechanism based on the required inputs and assumptions.

The structure of this paper is the following. Section 2 describes the classification framework and the classified methods, and presents the application of the framework to the methods. Section 3 discusses, for each method, its practical application to real-world projects, based on the sketched framework and the needed information sources. Section 4 refers to future work and, finally, section 5 presents the conclusions.

2 - Methods Classification
The need to compare the known DBRE methods in order to apply them to real legacy databases has led us to make a list of items on which the methods differ and that are relevant to the selection. That list constitutes the framework described next. We then present a short description of each selected method and finally apply the framework to these methods.

2.1 - The Framework
The items of the framework reflect the input and the assumptions required by each method; they do not reflect the type or the quality of the output.

A. Semantic knowledge
A.1. Attribute semantics. The method assumes that the semantic meaning of each attribute is known. This requirement implies knowledge of the application domain.
A.2. Attribute name coherency. Even if the actual meaning of attributes is not known, name coherency is assumed: names of attributes follow some known policy, and synonym and homonym conflicts, if any, are solved by a process external to the method. This item is only relevant when the semantic meaning of each attribute is not assumed to be known.
B. Data
B.1. Data is used. Database instances are one of the inputs. Possible values for this item are yes/no.
B.2. No error on data. If data is used, it is assumed that database values contain no errors. In this case, results from queries can be used blindly to reject hypotheses.
C. Code
C.1. Code is used. Code is one of the inputs. Possible values for this item are yes/no.
C.2. No error on code. If code is used, it is assumed that the application source code contains no bugs and no contradictions; that is, information extracted from one program does not contradict information extracted from other programs.
D. Candidate keys or key functional dependencies. It is assumed that the candidate keys are known. This is equivalent to assuming that the functional dependencies whose determinant is a candidate key are known. It is also assumed to be known which candidate key is the primary key.
E. Foreign keys or key-based inclusion dependencies. The foreign keys, or at least the key-based inclusion dependencies, are known.
F. Non-key functional dependencies or 3NF. Either the functional dependencies whose determinant is not a candidate key are known, or 3NF relations are required. In this framework, all the functional dependencies valid for the domain are considered, not only the “relevant” ones; here, “relevant” means that the functional dependencies that do not imply update anomalies should be ignored. So we consider that 3NF truly implies that relation-schemes represent single object-sets, in order to cope with the more restrictive cases. For more details about this question see [Markowitz90, Ullman80].


G. Non-key-based inclusion dependencies. Inclusion dependencies whose attributes are non-key.
H. Cases in which human input is required. The method identifies the cases where human input is required.
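Several of these items reduce to simple queries once data instances are available (item B.1). As an illustration only (table and column names are hypothetical, and the helpers are ours, not part of any of the surveyed methods), the following Python/SQLite sketch tests a candidate-key hypothesis (item D) and a key-based inclusion dependency (item E) against the data:

```python
import sqlite3

def is_candidate_key(conn, table, columns):
    """Item D: columns form a candidate key only if no two rows agree on them."""
    cols = ", ".join(columns)
    rows = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    distinct = conn.execute(
        f"SELECT COUNT(*) FROM (SELECT DISTINCT {cols} FROM {table})"
    ).fetchone()[0]
    return rows == distinct

def inclusion_dependency_holds(conn, src_table, src_col, dst_table, dst_col):
    """Item E: every non-null value of src_table.src_col occurs in dst_table.dst_col."""
    violations = conn.execute(
        f"SELECT COUNT(*) FROM {src_table} "
        f"WHERE {src_col} IS NOT NULL AND {src_col} NOT IN "
        f"(SELECT {dst_col} FROM {dst_table} WHERE {dst_col} IS NOT NULL)"
    ).fetchone()[0]
    return violations == 0

# Hypothetical usage against an existing database file:
conn = sqlite3.connect("legacy.db")
print(is_candidate_key(conn, "employee", ["emp_no"]))
print(inclusion_dependency_holds(conn, "employee", "dept_no", "department", "dept_no"))
```

As item B.2 makes explicit, such queries can be used blindly only when the data is error-free; and even then, a violation count of zero never proves a constraint, it merely fails to reject it.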

2.2 - The Methods
To characterise each method, we present a brief description organised by the following categories:
• Input - the types of information that must be supplied in order to apply the method
• Assumptions - the assumptions required by the method (for instance, a 3NF schema)
• Output - the results produced by the method
• Methodology - the major steps followed
• Main contribution - the main contribution of the method.

Chiang et al.’s Method [Chiang93, Chiang94, Chiang95]
This method requires, as input, the data instances and the relation-schemes, including primary keys. Optionally, the user may supply some inclusion dependencies. The assumptions made are third normal form (3NF), consistent naming of attributes and no errors in key attribute values. The output is an Extended Entity-Relationship (EER) model. The methodology has three major steps: (i) the classification of relations and attributes, based on the relation-schemes and their primary keys; (ii) the generation of inclusion dependencies, using heuristics based on the classified relations and attributes, and their verification against data instances; and, finally, (iii) the identification of EER components using a list of rules.

The main contributions of this work are the generation of inclusion dependencies and the complete justification of the transformations applied to the existing database to produce the resulting EER model. The detailed description of the method in [Chiang94] strongly supports its application in actual DBRE cases. It should be noted that, in each step, the cases in which human input is required are clearly identified [Chiang95].
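The generation step (ii) can be illustrated with a sketch: under the consistent-naming assumption, a non-key attribute carrying the name of another relation's single-attribute primary key suggests an inclusion dependency, and each hypothesis is then verified against the data instances (for example with a query like the inclusion-dependency check sketched in section 2.1). The data structures and the exact heuristic below are our simplification, not the published rules:

```python
def propose_inclusion_dependencies(schemas, primary_keys):
    """schemas: {table: [column, ...]}; primary_keys: {table: [pk column, ...]}.
    Returns hypotheses of the form (src_table, src_col, dst_table, dst_col)."""
    hypotheses = []
    for table, columns in schemas.items():
        for dst, pk in primary_keys.items():
            if dst == table or len(pk) != 1:
                continue  # illustrative heuristic: single-attribute keys only
            for col in columns:
                # Same name as another relation's primary key, and not part
                # of this relation's own primary key: propose table.col in dst.pk.
                if col == pk[0] and col not in primary_keys.get(table, []):
                    hypotheses.append((table, col, dst, pk[0]))
    return hypotheses

schemas = {"employee": ["emp_no", "name", "dept_no"],
           "department": ["dept_no", "dept_name"]}
pks = {"employee": ["emp_no"], "department": ["dept_no"]}
print(propose_inclusion_dependencies(schemas, pks))
# [('employee', 'dept_no', 'department', 'dept_no')]
```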

Johannesson’s Method [Johannesson94]
This method requires, as input, the relation-schemes, functional dependencies and inclusion dependencies. The inclusion dependencies have to be divided into two subsets: the generalization-indicating ones must be separated from the others. Relations are assumed to be in 3NF. The output is a conceptual schema described as a pair made of a language L and a set IC of typing, mapping and generalization constraints. The methodology has four steps: (i) split the relation schemes that correspond to more than one object; (ii) add extra relation schemes to handle the occurrence of certain types of inclusion dependencies; (iii) collapse the schemes of the relations that correspond to the same object type; and, finally, (iv) apply the schema mapping that converts a relational schema into a conceptual schema.

Johannesson’s method is based on the well-established concepts of relational database theory. It is very complete in terms of the description of the reverse engineering steps, but has the drawback of needing all keys and inclusion dependencies. Note also that at the beginning of step (iv), when the mapping is ready to be done, the input information is already “clean and neat”, so the mapping process is simple and automatic, with each relation scheme giving rise to an object type.
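The simplicity of step (iv) can be made concrete: once the schema is “clean and neat”, every relation scheme becomes one object type, non-key attributes become typing constraints, and each inclusion dependency becomes a mapping or a generalization constraint according to the subset it was placed in. The following data structures are our illustration, not Johannesson's formal language L and constraint set IC:

```python
from dataclasses import dataclass, field

@dataclass
class ObjectType:
    name: str
    attributes: list = field(default_factory=list)   # typing constraints
    references: list = field(default_factory=list)   # mapping constraints
    generalizes: list = field(default_factory=list)  # generalization constraints

def map_to_conceptual(schemas, keys, ig, ia):
    """schemas: {relation: [columns]}; keys: {relation: [key columns]};
    ig / ia: generalization-indicating and other inclusion dependencies,
    each given as (src_relation, src_cols, dst_relation, dst_cols)."""
    types = {r: ObjectType(r, [c for c in cols if c not in keys[r]])
             for r, cols in schemas.items()}
    for src, _, dst, _ in ig:     # a subtype object points at its supertype
        types[src].generalizes.append(dst)
    for src, cols, dst, _ in ia:  # attributes mapping into another object type
        types[src].references.append((cols, dst))
    return types
```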

Markowitz et al.’s Method [Markowitz90]
This method requires, as input, the relation-schemes, key dependencies and key-based inclusion dependencies, i.e. referential key constraints. Relations are assumed to be in Boyce-Codd normal form (BCNF), considering only the “relevant” functional dependencies (see the description of item F in the classification framework above). So, in our classification, this method is classified as not assuming 3NF. The output is an EER model. The methodology has four steps. The first transforms relational schemas into a form appropriate for identifying EER object structures; the transformations consist of (i) folding relations that represent the same object-set, by removing certain inclusion dependencies that correspond to a directed cycle in the inclusion dependency graph, (ii) splitting relations that correspond to several object-sets, by looking at single-attribute foreign keys that are allowed to have null values, and (iii) removing redundant foreign keys. The second step examines the relation-schemes, functional dependencies and inclusion dependencies obtained after the transformations, in order to detect whether they satisfy a set of properties. The third step determines the type of object-interaction, such as weak-entity-set and specialization, for each inclusion dependency. The fourth step derives a candidate EER schema using some mapping rules; finally, the quality of the generated EER schema is examined.

This method is very demanding on the input. In spite of not requiring that relation-schemes represent single object-sets, nor coherency of names, it requires all key functional dependencies and key-based inclusion dependencies; dependencies are not even a commonly available type of input. We consider that the main contributions of this work are the independence from attribute names and the formalisation of the mappings between schemas.
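The folding transformation of step (i) can be illustrated: relations linked by a directed cycle of key-based inclusion dependencies represent the same object-set and are candidates for merging. The sketch below (a deliberately naive, quadratic computation over an ad hoc graph encoding of ours, not the paper's formalism) finds such groups:

```python
def fold_groups(tables, inclusion_deps):
    """inclusion_deps: iterable of (src_table, dst_table) key-based pairs.
    Returns groups of mutually reachable relations, i.e. relations lying
    on a common directed cycle of the inclusion dependency graph."""
    adj = {t: set() for t in tables}
    for src, dst in inclusion_deps:
        adj[src].add(dst)

    def reachable(start):
        seen, stack = set(), [start]
        while stack:
            for nxt in adj[stack.pop()] - seen:
                seen.add(nxt)
                stack.append(nxt)
        return seen  # contains `start` itself only if it lies on a cycle

    groups, done = [], set()
    for t in tables:
        if t in done:
            continue
        group = {t} | {u for u in reachable(t) if t in reachable(u)}
        done |= group
        if len(group) > 1:
            groups.append(group)
    return groups

# Two relations referencing each other's keys fold into one object-set:
print(fold_groups(["person", "client"],
                  [("person", "client"), ("client", "person")]))
```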

Navathe and Awong’s Method [Batini92 (pp. 328-340)]
This method only requires as input the relation-schemes, although with several assumptions: (i) 3NF or BCNF relations are required; (ii) coherency of attribute names: there are no ambiguities in foreign keys and there are no homonyms; and (iii) all candidate keys must be specified. The output is an EER model and the methodology has the following steps. First, relations are processed and classified with human intervention, so that the assumptions are satisfied. Then, the classified relations are mapped based on their classification and their key attributes. Finally, the special cases of non-classified relations are handled on a case-by-case basis. The method’s goal is to resolve the most common situations rather than to claim exhaustiveness.

This method is driven by the detection of relationships, which are discovered by observing the primary and candidate key attributes, referential constraints and inclusion dependencies. The method has the drawback of requiring semantic input early in the process. This requirement enables the achievement of name coherency, the determination of foreign keys, and the selection of the most adequate primary key among the candidate keys.
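The flavour of the classification step can be conveyed by a sketch that compares a relation's primary key with the primary keys of the other relations. This is a rough approximation, with hypothetical names and none of the book's special cases:

```python
def classify_relation(pk, other_pks):
    """pk: primary key columns of the relation being classified;
    other_pks: {other_relation: [pk columns]} for the remaining relations."""
    embedded = [r for r, opk in other_pks.items() if set(opk) <= set(pk)]
    if not embedded:
        return "STRONG_ENTITY"   # the key owes nothing to other relations
    borrowed = set().union(*(set(other_pks[r]) for r in embedded))
    if borrowed == set(pk):
        return "RELATIONSHIP"    # key composed entirely of other relations' keys
    return "WEAK_ENTITY"         # key partially borrowed, partially its own

pks = {"order": ["order_no"], "product": ["prod_no"]}
print(classify_relation(["order_no", "prod_no"], pks))                       # RELATIONSHIP
print(classify_relation(["order_no", "line_no"], {"order": ["order_no"]}))  # WEAK_ENTITY
```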

Petit et al.’s Method [Petit94, Petit96]
This method requires, as input, the relation-schemes, with unique and not-null constraints, data instances and code. The assumptions identified in our framework are not mandatory in this method. The output is an EER model. The methodology first proposes inclusion dependencies using relation-schemes, database instances and equi-join queries found in the code. Then, the method proposes relevant non-key functional dependencies – the ones not derivable from the candidate keys and related to hidden objects – using relation-schemes, candidate keys, and the inclusion dependencies not rejected by the user. Then it uses the (1NF) schema and the sets of keys, functional dependencies, inclusion dependencies and hidden objects to obtain a 3NF schema, i.e. it applies a decomposition normalization algorithm. Finally, the mapping from the relational schema into an EER one is done.

It should be noted that, among the analysed methods, Petit et al.’s is the only one that includes the normalization of the relational schema. However, hidden objects may persist, either by choice of the user or because they are not revealed by equi-joins.
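The first step, harvesting equi-join predicates from application code as hints of inclusion dependencies, can be approximated with a simple scan. The regular expression below is a deliberate simplification (it ignores table aliases, nesting and host-language string assembly) and is our illustration rather than the published algorithm:

```python
import re

# Matches predicates of the form  t1.col1 = t2.col2  inside SQL text.
EQUI_JOIN = re.compile(r"(\w+)\.(\w+)\s*=\s*(\w+)\.(\w+)")

def equi_join_hints(source_code):
    """Each cross-table pair hints that an inclusion dependency holds in
    one direction or the other, to be validated later against the data."""
    hints = set()
    for t1, c1, t2, c2 in EQUI_JOIN.findall(source_code):
        if t1.lower() != t2.lower():
            hints.add(((t1, c1), (t2, c2)))
    return hints

code = "SELECT * FROM employee, department WHERE employee.dept_no = department.dept_no"
print(equi_join_hints(code))
# {(('employee', 'dept_no'), ('department', 'dept_no'))}
```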

Premerlani and Blaha’s Method [Premerlani94, Blaha98]
This method requires, as input, the relation-schemes and data. The assumptions identified in our framework are not mandatory in this method. The output is an Object Modelling Technique (OMT) model. The methodology has the following steps. First, an initial object model is prepared by representing each relation as a tentative class. Then, the user should look for candidate keys using the clues described. Next, the user should determine foreign key groups using, again, the clues described. Finally, the refinement of the OMT schema is progressively done by the user, based on given guidelines that include querying the data.

Premerlani and Blaha’s method is characterised by allowing the automation of only small steps, due to the high level of human input required. It is intended to process some tricky representations, providing guidelines for coping with design optimisations and unfortunate implementation decisions. This method gives several clues on how and where the user should look for the types of information needed.
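One of the clues for candidate keys (uniqueness in the data, combined with suggestive attribute names) lends itself to a small automated aid. The scoring below and the SQLite helper are our invention, meant only to show the kind of support the method's user-driven steps admit:

```python
import sqlite3

def candidate_key_clues(conn, table, columns):
    """Return (column, weight) pairs for single columns whose data looks key-like."""
    clues = []
    for col in columns:
        non_null, distinct, nulls = conn.execute(
            f"SELECT COUNT({col}), COUNT(DISTINCT {col}), "
            f"SUM(CASE WHEN {col} IS NULL THEN 1 ELSE 0 END) FROM {table}"
        ).fetchone()
        if non_null == distinct and not (nulls or 0):
            # Unique and never null: a candidate-key clue. A suggestive
            # suffix strengthens the clue (a purely heuristic weighting).
            weight = 2 if col.lower().endswith(("id", "no", "code")) else 1
            clues.append((col, weight))
    return sorted(clues, key=lambda clue: -clue[1])
```

As in the method itself, the output is only a ranked list of suggestions; the user still decides which clue, if any, designates a real candidate key.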

Signore et al.’s Method [Signore94]
This method requires, as input, the relation-schemes and code. The assumptions identified in our framework are not mandatory in this method. The output is an Entity-Relationship model. The methodology has three phases. The first phase is the identification of primary keys: the code is searched for candidate key indicators, and each candidate key is assigned its frequency of usage. At the end of this phase, each relation must have a primary key or, at least, a hypothesis for the primary key and possible candidate keys. The second phase is the detection of the indicators of synonyms and referential key constraints; these indicators are obtained from SQL instructions in the code. The third and last phase is the conceptualization: using the four types of indicators found – schema, primary key, SQL and procedural indicators – the conceptual model is derived.

Notice that this method is based on clues. The clues are adopted to cope with unusual implementation techniques, optimisation choices, poor Data Definition Language usage and code errors, among others. So, in the conceptualisation phase, potential concepts are identified based on suitable combinations of indicators.
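The first phase can be sketched as counting, per relation, how often each combination of columns is equality-constrained in a WHERE clause, a typical procedural indicator of a key. The two regular expressions are crude stand-ins for the paper's pattern catalogue and would miss many real statements:

```python
import re
from collections import Counter

STATEMENT = re.compile(r"FROM\s+(\w+).*?WHERE\s+([^;]+)", re.IGNORECASE | re.DOTALL)
EQUALITY = re.compile(r"(\w+)\s*=\s*(?::\w+|\?|'[^']*'|\d+)")  # col = host var / ? / literal

def key_indicator_frequencies(source_code):
    """Maps (table, equality-constrained column set) to its usage count;
    the most frequent combinations become primary-key hypotheses."""
    freq = Counter()
    for table, where in STATEMENT.findall(source_code):
        cols = frozenset(m.group(1).lower() for m in EQUALITY.finditer(where))
        if cols:
            freq[(table.lower(), cols)] += 1
    return freq

sql = "SELECT name FROM employee WHERE emp_no = :id;"
print(key_indicator_frequencies(sql))
# Counter({('employee', frozenset({'emp_no'})): 1})
```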

2.3 - The Classification
Table 1 depicts the application of our framework to the methods of the previous section. In many cells we insert references to the numbered notes that follow the table and explain our values.


Table 1 – Application of the framework to the selected methods

A.1. Semantic meaning of attributes. Chiang: required to classify weak entity and specific relationship relations. Johannesson: not required. Markowitz: not required. Navathe: used to solve A.2. Petit: not required. Premerlani: not assumed to be known (1). Signore: not assumed to be known.

A.2. Attribute names coherency. Chiang: name coherency on key attributes is assumed. Johannesson: not assumed; the meaning of names isn’t used. Markowitz: not assumed; the meaning of names isn’t used. Navathe: coherency achieved by the user (2). Petit: problem considered but not resolved. Premerlani: problem considered but not resolved. Signore: problem considered and solved (3).

B.1. Data. Chiang: yes. Johannesson: no. Markowitz: no. Navathe: no. Petit: yes. Premerlani: yes. Signore: no.

B.2. No error on data. Chiang: no error on key attribute values. Petit: no, not assumed. Premerlani: problem not mentioned. Others: not applicable.

C.1. Code. Chiang: no. Johannesson: no. Markowitz: no. Navathe: no. Petit: yes. Premerlani: no. Signore: yes.

C.2. No error on code. Petit: problem not considered. Signore: problem partially solved (4). Others: not applicable.

D. Candidate keys or key FDs. Chiang: yes, for primary keys; not all alternate keys need to be known (5). Johannesson: yes, primary keys and all functional dependencies. Markowitz: yes, key dependencies. Navathe: yes, must be specified (6). Petit: yes, included in the input. Premerlani: no, they are determined (7). Signore: no, not assumed to be known (8).

E. Foreign keys or key-based inclusion dependencies. Chiang: no, not assumed to be known (9). Johannesson: yes, key-based inclusion dependencies (10). Markowitz: yes, key-based inclusion dependencies. Navathe: not assumed to be known, but dependent on the resolution of A.2 (11). Petit: no, they are derived. Premerlani: not assumed to be known (12). Signore: not assumed to be known, but dependent on the resolution of A.2 (13).

F. Non-key FDs or 3NF (14). Chiang: yes, 3NF assumed. Johannesson: yes, 3NF assumed (15). Markowitz: no, 3NF not assumed (16). Navathe: yes, 3NF assumed. Petit: no, they are derived (17). Premerlani: not assumed. Signore: no.

G. Non-key-based inclusion dependencies. Chiang: not assumed to be known, but allowed (18). Johannesson: yes, used in the “inclusion dependency splitting” transformation. Markowitz: not used. Navathe: not required. Petit: no, they are derived. Premerlani: not required. Signore: no.

H. Cases in which human input is required. Chiang: clearly identified (19). Johannesson: identified (20). Markowitz: not applicable; human input is not required. Navathe: identified (21). Petit: identified (22). Premerlani: identified (23). Signore: identified: at the end of each step, to give final decisions.


Notes to Table 1:
1. Nevertheless, semantics is one of the inputs to their process: after some automatically performed procedure, as at the end of steps 5 and 6 in [Premerlani94], semantic understanding is required to refine or confirm the results obtained.
2. Name coherency is left to the user before applying the methodology. The ambiguities caused by synonyms and homonyms have to be removed by renaming the attributes. Renaming is also necessary to solve ambiguities in references.
3. Homonyms are detected by definition, i.e. by looking for attributes belonging to different relations, identified by the same name but defined on different domains (it is not clear how that is known), and are solved with the dot notation (tablename.attributename). Synonyms are detected by SQL indicators taken from the analysis of SQL statements.
4. Considering that problem, the method requires user intervention to get final decisions based on the generated clues, as mentioned for candidate keys, for example.
5. Information about candidate keys is requested only when: (1) there are ambiguities in the classification of relations, and (2) the user specifies inclusion dependencies between non-key attributes.
6. The user needs to specify all candidate keys for each relation and may need to substitute a candidate key for the current primary key.
7. From (1) unique indexes, and (2) automated scanning of data and semantic knowledge.
8. The authors have identified a list of patterns to be looked for in the code, leading to a set of proposed keys. Only keys referred to by at least one WHERE clause in an SQL statement will be detected. The frequency of usage (in the code) of each proposed key is assigned to it. At this point, user intervention is required in order to decide which of the proposed keys will be considered candidate keys and which will be elected primary key.
9. Referential integrity constraints are generated. As in Premerlani’s method, foreign key groups are formed by achieving name coherency. However, afterwards Chiang’s method is able to proceed automatically, while Premerlani’s method still requires user input.
10. The set of inclusion dependencies must be divided into two subsets: IG, the generalization-indicating inclusion dependencies, and IA, the others.
11. After the renaming of attributes, the referential constraints should become obvious.
12. Step 3 in the process determines foreign key groups: groups of attributes within which foreign keys may be found. Merely foreign key groups, instead of foreign keys, are determined because of generalization: with generalization, the patterns of superclass-subclass, disjoint or overlapping, and multiple inheritance are not immediately apparent.
13. (1) Explicitly declared foreign keys; (2) based on the synonyms detected and the primary keys already determined. The uncertainty of primary keys is transferred to foreign keys; confirmation is obtained by looking at referential integrity constraint checks in the code.
14. Until now, we have not found methods in which non-3NF relations were an excluding demand. The methods that require 3NF are applicable if a non-3NF relation exists: decomposition can be made in the EER model.
15. This work has the particularity of discussing the cases in which even a relation schema in 3NF corresponds to several object types.
16. It does not require that relation-schemes represent single object-sets, but it requires some normalization.
17. This method elicits the functional dependencies relevant for the restructuring: those which influence the way attributes should be restructured. Irrelevant hidden objects may persist.
18. Not to be used in the steps of the mapping process; it is stated that they should simply be noted as additional knowledge to be used later during the methodology.
19. In [Chiang95] those cases are clearly identified:
• indicate the primary key for relations with multiple possible primary key attributes
• classify weak entity relations and specific relationship relations
• optionally propose inclusion dependencies between non-key attributes
• optionally change the names of identifying relationships for weak entity types
• specify the proper type of an inclusion relationship for two entity types having not only the same key but also the same set of data instances for their keys
• provide names for binary relationships identified by foreign keys
• determine the exact participating entity types for relationships identified by relationship relations and foreign keys, whenever inclusion dependencies cannot clarify the ambiguities
• specify new entity types for general key attributes of specific relationship relations
• assign non-key attributes of relations with foreign keys
20. (i) To decide whether a relation scheme containing several keys corresponds to one or several object types; (ii) to decide whether an inclusion dependency where both sides are keys indicates a generalization constraint or a 1-1 attribute.
21. Two cases of user input: (i) pre-process the relations to solve A.2, and (ii) treat the special cases that were not translated. The method does not claim exhaustiveness.
22. Human input is required (i) to reject automatically generated inclusion dependencies that are not already rejected by the data instances, and (ii) to reject automatically generated functional dependencies or to decide about the conceptualization of a hidden object.
23. Human input is required, but it is not clear when. Throughout all phases, except the first one, which prepares an initial object model, semantic knowledge about the domain can or must be used, at least at the point where no more can be decided without human intervention: (i) when determining candidate keys, to interpret suggestive patterns of data; (ii) when determining foreign-key groups, to resolve homonyms and synonyms (as some guidelines for automated procedures are given, it is presumed that human input will be required only when no more guidelines are applicable); and (iii) when following the clues listed in the method.


3 - Applying DBRE Methods to Existing Systems
The applicability of DBRE methods is essentially restricted by information availability. On the one hand, complete knowledge of the relational schema, such as functional or inclusion dependencies, is usually unavailable. On the other hand, the available information is incomplete: there are always tables without known primary keys, and there are sets of attributes with synonym and homonym conflicts to be resolved. On top of this, there are other factors, such as erroneous code and data, that must be taken into consideration.

We found out that:
• Markowitz’s, Johannesson’s and Navathe’s methods only use schema information, requiring far more knowledge about the relational schema than can be obtained from the available sources of information. We could envisage the use of these methods when no data exists, when the data is far from representative, or as a second step of a composed DBRE process.
• Chiang’s method can be applied if the database data is mostly correct and representative, and if a coherent naming policy was followed or there is enough information describing attribute correspondences. Data is required to be representative in order to filter out the large number of inclusion dependency hypotheses generated by the heuristics. Knowledge about attribute names is required to resolve naming conflicts among key attributes.
• Premerlani’s method is based on data and on the users’ knowledge. Users must have sufficient knowledge about the application domain to be able to proceed whenever sufficiently sound conclusions cannot be extracted from the data.
• Petit’s method can be applied whenever code and data are representative: the former to allow enough indicators to be extracted and the latter to allow their validation.
• Signore’s method can be applied whenever the code is representative and users have sufficient knowledge about the system details and the application domain to decide whether hypotheses are accepted or rejected.

As mentioned before, it is relatively simple to get a partial set of the information for most of the types, but it is very difficult to get the complete set of information of a single type. Therefore, one must choose the methods that are best suited to the information available. In practice, we do not select one single method, but rather combine several methods. Since methods have well-defined steps, each with a clear contribution to the overall ER schema, in most cases we produce a combination of steps of different methods according to the information available. The most successful combination is the usage of Chiang’s classification for handling the relations that implement strong and weak entities and regular relationships, because it is easier to resolve synonyms and homonyms for attributes of primary keys than for any other attributes. The queries of Petit’s method against the database are used to detect generalizations among entities (either strong or weak). We use an extension of Signore’s patterns to find indicators of embedded foreign keys in the source code and to detect relationships based on embedded foreign keys; its description is outside the scope of this paper and is not included herein.

4 - Future Work
Having in mind a more complete classification of DBRE methods, the one presented herein should be extended along its two dimensions: first, to include more methods, such as [Hainaut93], supported by the DB-MAIN CASE environment [Henrard98], or MeRCI [C-Wattiau96]; and second, to include more characteristics relevant to the classification, such as the impact of incomplete information or the quality of the conceptual schemas produced.

We envisage an ideal DBRE method that is able to look for the necessary information among the available sources and to process it regardless of its type. Furthermore, such an ideal method should also use all redundant information as a means to increase the confidence in the results.

5 - Conclusions
Transforming relational schemas into conceptual schemas is easily done in the presence of all the concepts of relational theory (see for instance [Johannesson94]). However, database reverse engineering turns out to be a non-trivial problem when facing a lack of information. Legacy systems have their own intrinsic characteristics, and it is not easy to know all the relational concepts: relation schemes, primary keys, alternate keys, functional dependencies and inclusion dependencies. The availability of information can vary. On the other hand, existing DBRE methods, or contributions to some particular problem, cover different cases of required input.

In order to apply known methods in actual DBRE projects, eventually as a basis for the work, a selection based on the available information must be made. Aiming at that selection, a classification of a set of representative and algorithmic DBRE methods has been made and presented herein. At one extreme, some of the methods are more concerned with the reverse mapping of the logical data schema of a legacy database into a conceptual schema, in the presence of a complete relational schema, as in [Markowitz90] and [Johannesson94]. At the other extreme, we can find a robust method that intends to deal with tricky conceptual-to-relational transformations and ill-designed databases [Premerlani94]. In the middle, we can find the other methods, combining algorithms, clues and indicators and using different sources of information.


References
[Batini92] Carlo Batini, Stefano Ceri, Shamkant B. Navathe, “Conceptual Database Design – An Entity-Relationship Approach”, Benjamin/Cummings, 1992.
[Blaha98] M. Blaha and W. Premerlani, “Object-Oriented Modelling and Design for Database Applications”, Prentice-Hall, 1998.
[Chiang93] Roger H.L. Chiang, Terrence M. Barron, Veda C. Storey, “Performance Evaluation of Reverse Engineering Relational Databases into Extended Entity-Relationship Models”, in Proc. of the 12th International Conference on Entity-Relationship Approach, R. Elmasri and V. Kouramajian (Eds.), Arlington, Texas, USA, pp. 336-352, Dec. 1993.
[Chiang94] Roger H.L. Chiang, Terrence M. Barron, Veda C. Storey, “Reverse engineering of relational databases: Extraction of an EER model from a relational database”, Data & Knowledge Engineering 12, pp. 107-142, Elsevier Science, 1994.
[Chiang95] Roger H.L. Chiang, “A knowledge-based system for performing reverse engineering of relational databases”, Decision Support Systems 13, pp. 295-312, North-Holland, 1995.
[Chiang97] Roger H.L. Chiang, Terrence M. Barron, Veda C. Storey, “A framework for the design and evaluation of reverse engineering methods for relational databases”, Data & Knowledge Engineering 21, pp. 57-77, Elsevier Science, 1997.
[Chikofsky90] Elliot J. Chikofsky, James H. Cross II, “Reverse Engineering and Design Recovery: A Taxonomy”, IEEE Software, vol. 7, n. 1, pp. 13-17, Jan. 1990.
[C-Wattiau96] Isabelle Comyn-Wattiau and Jacky Akoka, “Reverse Engineering of Relational Database Physical Schemas”, in Proc. of the 15th International Conference on Conceptual Modeling, B. Thalheim (Ed.), Cottbus, Germany, pp. 372-391, Oct. 1996.
[Hainaut93] J-L. Hainaut, C. Tonneau, M. Joris, M. Chandelon, “Transformation-Based Database Reverse Engineering”, in Proc. of the 12th International Conference on Entity-Relationship Approach, R. Elmasri and V. Kouramajian (Eds.), Arlington, Texas, USA, pp. 353-372, Dec. 1993.
[Henrard98] J. Henrard, V. Englebert, J-M. Hick, D. Roland, J-L. Hainaut, “Program understanding in databases reverse engineering”, submitted to the International Workshop on Program Comprehension, 1998.
[Horowitz92] Susan Horowitz, Thomas Reps, “The Use of Program Dependence Graphs in Software Engineering”, in Proc. of the 14th International Conference on Software Engineering, Melbourne, Australia, May 1992.
[Johannesson94] Paul Johannesson, “A Method for Transforming Relational Schemas into Conceptual Schemas”, in Proc. of the 10th International Conference on Data Engineering, Rusinkiewicz (Ed.), pp. 115-122, Houston, IEEE Press, 1994.
[Markowitz90] Victor Markowitz and Johann A. Makowsky, “Identifying Extended Entity-Relationship Object Structures in Relational Schemas”, IEEE Transactions on Software Engineering, vol. 16, n. 8, Aug. 1990.
[Petit94] J-M. Petit, J. Kouloumdjian, J-F. Boulicaut, F. Toumani, “Using Queries to Improve Database Reverse Engineering”, in Proc. of the 13th International Conference on Entity-Relationship Approach, Lecture Notes in Computer Science, vol. 881, pp. 369-386, Manchester, UK, Dec. 1994.
[Petit96] J-M. Petit, F. Toumani, J-F. Boulicaut, J. Kouloumdjian, “Towards the Reverse Engineering of Denormalized Relational Databases”, in Proc. of the 12th International Conference on Data Engineering, New Orleans, Louisiana, USA, IEEE Press, Feb. 1996.
[Premerlani94] William J. Premerlani and Michael R. Blaha, “An Approach for Reverse Engineering of Relational Databases”, Communications of the ACM, vol. 37, n. 5, May 1994.
[Signore94] Oreste Signore, Mario Loffredo, Mauro Gregori, Marco Cima, “Using Procedural Patterns in Abstracting Relational Schemata”, in Proc. of the 13th International Conference on Entity-Relationship Approach, Lecture Notes in Computer Science, vol. 881, Dec. 1994.
[Ullman80] Jeffrey D. Ullman, “Principles of Database Systems”, Computer Science Press, 1980.

