A Three-Layer Model for Schema Management in Federated Databases

Mark Roantree1, John Murphy1, John Keane2

1 School of Computer Applications, Dublin City University, Dublin, Ireland.
2 Department of Computation, UMIST, Manchester, UK.

Abstract

This paper describes our use of object technology to provide a framework for interoperability between databases. We are particularly interested in controlling the effects on the federation of schema modification in local databases. We describe two informal models for federated database design. The abstract model describes the different metadata objects in the federation and how they relate to each other. The service model is used as a framework architecture for federated database design. The architecture is designed to ensure that the integration of local databases allows the federation to control its view of local schema changes. We have used CORBA's distributed object technology to implement a prototype based on the service model described in this paper.

Keywords: CORBA, Federated Databases, Distributed Objects, Canonical Data Model, Metadata.

1 Introduction

A federated database system (FDBS) is a collection of autonomous database systems which cooperate to provide a combined view of individual data stores. We have based our work on the five-level schema architecture described in [SL90] and will use the terminology adopted in [HM85, SL90] when describing the architecture. One issue facing federated databases is how to incorporate schema changes made to local databases into the federated view of data. We propose a layered framework using CORBA's distributed object technology [MZ95, OHE96] which minimises the effect of a local schema modification on a federation of local databases.

A typical federation allows local databases (LDBs) to interoperate with other LDBs even though they may use different data models. This is achieved by converting each LDB format into a canonical data model (CDM) format. We have chosen an object-oriented data model as our CDM, as we concur with the conclusions reached in [SCG91], which describes the expressive qualities required of a CDM. The conversion process involves the creation of a new schema representation called the component schema [SL90], which is expressed in the format of the CDM but contains mappings to local schema attributes. A separate conversion process is required for each type of data model; each involves a different set of issues, which are discussed elsewhere by ourselves [Roa96, HRM96] and others [PBE95, CS91, LM91, FHM91]. Once the component schema has been generated, various export schemas are derived on top of each component schema, similar to deriving views on traditional data models. Federated schemas are constructed using mappings to attributes in export schemas.

The paper is structured as follows: §2 presents our framework architecture in detail at both an abstract and an implementation level; §3 describes how we insulate the middleware component from schema changes at the local database level; §4 provides details on the development of a healthcare prototype; finally, §5 provides conclusions and describes further work.

Proceedings of The Thirtieth Annual Hawaii International Conference on System Sciences ISBN 0-8186-7862-3/97 $17.00 © 1997 IEEE

1060-3425/97 $10.00 (c) 1997 IEEE

2 The Framework Architecture

We view a federated schema as a collection of attributes. An attribute may be single-valued or, in the case of complex attributes, multivalued. Complex attributes can be viewed as objects which contain a group of single-valued attributes. This object hierarchy is reduced to a flat collection of attributes for the purpose of mapping federation attributes to attributes in export schemas, which in turn map to attributes in component schemas and finally to attributes in local schemas. The process of collapsing the object hierarchy is necessary as our model permits mapping at attribute level only. It does not allow a more abstract level of mapping such as object-to-object or object-to-relation mapping.

[Figure 1: Three Layer Architecture. The Federation Layer and the Component Layer together form the Middleware Component; the Integration Layer lies below them.]

We have identified three types of metadata objects which we have called principal objects. They relate to the metadata stored at each layer in the three-layer model in figure 1 and are called federation metadata, component metadata and integration metadata objects. We will describe the objects later and explain why we treat them as principal objects.

In figure 1, the Federation Layer manages federation metadata and interfaces with the Component Layer. The Component Layer manages component metadata and interfaces with the Integration Layer. The architecture is layered so that schema changes in local databases do not affect the top two layers in the architecture, which are effectively the middleware component [Ber96]. We are presently concerned with the bottom four layers in the 5-layer architecture previously mentioned [SL90]. The Federation Layer contains federated schemas (federation metadata) and the Component Layer contains export schemas (component metadata). The Integration Layer deals with data model conversion and contains the local and component schemas (integration metadata).

The relationship between principal objects is at an abstract level. For example, we say that a federation schema contains links to a number of export schemas. For the service model we need a more specific relationship. For this purpose we have identified supplementary objects for each layer. Each principal metadata object comprises one or more supplementary objects. Supplementary objects provide the fine attribute-to-attribute mapping required between schemas in different layers.

2.1 The Abstract Model

Our abstract model deals with metadata objects in each layer of the framework and the relationships between them. We describe the principal and supplementary metadata objects for each layer (figure 3 contains a table of metadata objects). We do not address implementation issues here but instead discuss the metadata objects in an abstract manner. In the Service Model we examine the behaviour of each metadata object, which will lead to a specification for an implementation of metadata schema classes.

2.1.1 The Federation Layer

The federation schema object has been defined in the Federation Layer. We have called this a Principal Federation Metadata object (PFM). Since a federated schema is a collection of mappings to attributes in export schemas, a Federation Metadata object must interface with Component Metadata objects (export schemas) in a many-to-many fashion. We use the object model notation of the Fusion Method [Col94] to graphically represent the metadata objects and the relationships between them (figure 2). In our service model we discuss how we model these metadata objects.

The many-to-many relationship is explained by the fact that a federated schema object contains links to many export schema objects. In addition, an export schema object may be referenced by many federated schema objects. For example, a federated schema Fed1 may be constructed from n export schemas E1 to En, while the export schema E1 may be referenced by federated schemas Fed1, Fed2 and Fed3.

A federated schema contains n links to export schemas. The links provide the fine level of attribute-to-attribute mapping required for specific access to local database items. These attributes are called Supplementary Federation Metadata objects (SFM). They also have a many-to-many relationship with Supplementary Component Metadata objects (SCM). For example, an attribute city in one federated schema may have links to city attributes in several export schemas. Examples are provided in §2.2 where we discuss our service model. In addition, export schema objects may be referenced by many federation schema objects, which demonstrates a 1-to-many relationship in the opposite direction. For example, the attribute city in one export schema object may be included in several federated schema objects.
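The many-to-many relationship between SFM and SCM objects can be illustrated with a small sketch. It is not taken from the prototype: the link table, the schema names beyond Fed1 to Fed3 and E1, and the attribute home_city are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical attribute-level links: (federated schema, attribute) -> export attributes.
sfm_links = {
    ("Fed1", "city"): [("E1", "city"), ("E2", "town")],
    ("Fed2", "city"): [("E1", "city")],
    ("Fed3", "home_city"): [("E1", "city")],
}

# Invert the links to see which federated attributes reference each export attribute.
referenced_by = defaultdict(set)
for fed_attr, export_attrs in sfm_links.items():
    for export_attr in export_attrs:
        referenced_by[export_attr].add(fed_attr)

# Schema-level view of the same relationship (PFM to PCM).
exports_of = defaultdict(set)
for (fed, _attr), export_attrs in sfm_links.items():
    exports_of[fed].update(schema for schema, _name in export_attrs)
```

Reading the table in one direction gives the export attributes behind a federated attribute; reading the inverted table gives every federated attribute that depends on a given export attribute.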

2.1.2 The Component Layer

The Component Layer contains export schemas which we have called Principal Component Metadata objects (PCM). An export schema contains links to a single component schema. The links are called Supplementary Component Metadata objects (SCM). A many-to-one relationship exists between objects in the component layer and objects in the integration layer. For example, a typical Principal Integration Metadata object (PIM) may have many Principal Component Metadata objects (PCM) derived from it in the same way as a component schema has many views or export schemas [SL90]. For instance, a component schema C1 may contain patient demographics, laboratory results, x-ray images and blood sample information. The owner of the local database may wish to export these four subsets of data to four different groups of users. The federated architecture permits this through the definition of four separate export schemas. This provides a many-to-1 relationship between the export schemas (PCM objects) and the component schema (PIM object).
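As a rough sketch of the C1 example, the following code derives four export schemas from one component schema. The attribute names and the function derive_export_schema are hypothetical; only the four data subsets and the schema name C1 come from the text.

```python
# Hypothetical contents of component schema C1, grouped by the four data subsets.
component_c1 = {
    "demographics": ["name", "dob", "city"],
    "lab_results": ["test_code", "result"],
    "xray_images": ["image_id"],
    "blood_samples": ["sample_id", "blood_group"],
}

def derive_export_schema(component_name, component, groups):
    """Map each export attribute to exactly one attribute of one component schema."""
    return {
        f"{group}.{attr}": (component_name, f"{group}.{attr}")
        for group in groups
        for attr in component[group]
    }

# One export schema per user group: four PCM objects, all linked to the single PIM object C1.
exports = {group: derive_export_schema("C1", component_c1, [group]) for group in component_c1}
```

Every link in every export schema points at C1, which is the many-to-1 relationship described above.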

2.1.3 The Integration Layer

The Integration Layer contains component schemas which contain direct mappings to each local database in a 1-to-1 fashion. This relationship is not present in the abstract model as it deals only with the middleware component. A fuller description of this layer is provided in our service model.

2.2 The Service Model

Our Service Model defines the architecture of the system, the design of primary and supplementary templates (i.e. classes) and a description of some of the services required. We discuss federation metadata at a fairly abstract level as we have not yet specified our canonical data model for the federation. This effort has been based on adding extensions to the ODMG data model [Cat95]. However, since we have chosen an object model as our CDM and CORBA as a middleware platform we can provide some examples of class definitions in our descriptions.

[Figure 2: Metadata Objects and Cardinalities of Relationships. Each principal object (PFM, PCM, PIM) contains one or more supplementary objects (SFM, SCM, SIM respectively). PFM and PCM objects are linked many-to-many, as are SFM and SCM objects; PCM objects are linked many-to-one to a PIM object, and SCM objects many-to-one to a SIM object.]

Although the primary and supplementary objects for each layer are discussed separately, they have many common characteristics. At a basic level all primary objects are schemas and all supplementary objects are mappings. Thus, at an implementation level we have one schema class which is the base class, and three schema classes which are derived from the base class.

One of the primary characteristics of our service model is its adaptability to change. The attributes of local schemas cannot be directly represented in each of our federation schemas. Instead we provide generic links to each attribute in local databases. Some of these issues are discussed in §2.4. Earlier work based entirely on metadata modelling and integration issues in the LIOM project can be found in [RHCM96]. Here, we focus on a generic middleware architecture which is insulated from local schema changes.

Metadata Object Type        Acronym   Entity
Principal Federation        PFM       Fed. Sch.
Principal Component         PCM       Exp. Sch.
Principal Integration       PIM       Comp. Sch.
Supplementary Federation    SFM       Fed. Attr.
Supplementary Component     SCM       Exp. Attr.
Supplementary Integration   SIM       Comp. Attr.

Figure 3: Framework Objects

2.2.1 The Federation Layer

A federated schema contains an identifier attribute and a collection of mappings. Our service model requires a direct mapping from a federation attribute to a component attribute, and some type and conversion information. This is supplied by each SFM object. The attribute and mapping information is encapsulated within each SFM object together with any semantic information. This allows us to view an interface only and hide the mapping and conversion information on the inside. Thus, we have a federated schema object (PFM object), containing an identifier and several attribute objects (SFM objects). The characteristics of an attribute object are:

    Attribute identifier in this Federation
    Attribute identifier in Export Schema 1
    Type and Semantic information for attribute 1
    ...
    Attribute identifier in Export Schema n
    Type and Semantic information for attribute n

The type and conversion information is specific to each federation. For example, a patient's height could be required in cm at the federation level but actually be stored as inches at the local database level. Approaches to the inclusion of semantic information [CS91] have been covered in the past. As it requires human interaction we feel it should take place at lower levels, i.e. when the local database model is converted to the canonical data model. However, we treat all canonical model schemas, i.e. component, export and federation, as uniform, which allows manipulation of attributes and storage of type and semantic information in each schema. The inclusion of semantic information can be handled by modelling attributes as methods. For example, the attribute attr1 is accessed only through its interface GetAttr1.

As previously mentioned, CORBA objects are used to represent CDM schemas. Federated schemas together with component and export schemas will all derive from the base FedSchema class. A portion of this class is given below where the schema contains an identifier and a generic list of attributes.

    interface FedSchema {
        STRING SchemaName;
        attribute sequence<SchemaAttribute> AttrList;
    };

As discussed in §2.1 the attributes may contain either 1-to-1 or 1-to-many links to other attributes. Schema attributes in our model contain a sequence of links to other attributes: this sequence may be a list of 1 or n attribute maps.

    interface SchemaAttribute {
        STRING identifier;
        INTEGER MapCount;
        attribute sequence<AttributeMap> MapList;
    };

    struct AttributeMap {
        TYPE type;
        STRING SchemaID;
        STRING AttributeId;
    };
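To make the role of type and conversion information concrete, the following sketch mirrors SchemaAttribute and AttributeMap as Python data classes and applies the height example. The convert field, the attribute names height_cm and height_in, and the schema names E1 and E2 are assumptions added for illustration; the interfaces above do not define them.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class AttributeMap:
    type: str            # type expected at the federation level
    schema_id: str       # schema holding the target attribute
    attribute_id: str    # attribute name inside that schema
    # Hypothetical conversion hook; the identity function means "no conversion needed".
    convert: Callable[[float], float] = lambda value: value

@dataclass
class SchemaAttribute:
    identifier: str
    map_list: List[AttributeMap] = field(default_factory=list)

    @property
    def map_count(self) -> int:
        # Derived from the list so the count cannot drift out of step with it.
        return len(self.map_list)

# A federated height in cm, linked to one source in inches and one already in cm.
height = SchemaAttribute("height_cm", [
    AttributeMap("FLOAT", "E1", "height_in", convert=lambda inches: inches * 2.54),
    AttributeMap("FLOAT", "E2", "height_cm"),
])
```

Callers see only the federated attribute; whether a particular source needs unit conversion stays hidden inside its map entry.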



2.2.2 The Component Layer

The Component Layer contains export schemas. An export schema object contains an identifier attribute and a collection of mappings. Unlike the mappings in a federated schema, export attributes map to a single attribute in a single component schema. In this case the SchemaAttribute object in §2.2.1 will contain a sequence (list) of 1 AttributeMap object. SCM objects have the following attributes:

    Attribute identifier in this Export Schema
    Attribute identifier in Component Schema
    Semantic and Type information

One of the main reasons for deriving export schemas is to restrict access to the component schema or to export information to be shared with specific groups or users. Thus, security details are encapsulated in the PCM and SCM objects. For example, we may restrict access to an entire export schema object or to a selection of attribute objects inside the schema. Our current prototype does not have security features implemented but it is intended to encapsulate security information inside the interface to schema attributes in the same manner as the encapsulation of semantic information mentioned in §2.2.1.

This layer together with the Federation Layer forms the middleware component and must be insulated from schema modifications at the local database level. This is accomplished by the application of certain rules in the Integration Layer.

2.2.3 The Integration Layer

The Integration Layer contains both component schemas (PIM objects) and local schemas. PIM objects are modelled in identical fashion to PCM objects (see §2.2.2) whereas local schemas are expressed in the native data model of the participating database. The integration process involves local data model to canonical data model conversion. The process differs for each data model and involves issues such as legacy system interoperability [Ber96, HRM96], semantic and structural heterogeneity of data types [SL90] and issues concerned with proprietary software applications. Interoperability issues for applications in our test environment are discussed in [Roa96]. We concern ourselves with the layer above this integration process and assume we can browse a data dictionary of each participating database. We provide an example of this in §4 where we describe our prototype.

In our implementation, the Component Schema class contains the behaviour to browse a generic data dictionary for the purpose of constructing (populating) a component schema object. A degree of human interaction will generally be required for naming of attributes and inclusion of semantic information for component schema attributes.

It is a function of the Integration Layer to update its component schemas after local schemas have been modified. However, we believe that the local database should maintain autonomy and ignore the fact that schema modifications can have an effect on the federation. Instead, the component schema can verify itself against its mapped local schema. We propose that a component schema object has a mechanism to refresh its mappings. As previously stated, neither data model conversion nor component schema refresh operations can be fully automated as both require the inclusion of local attribute names and possibly some semantic information. For example, some implementations could have a federated data dictionary which contains a list of common terms for the federation. Each component schema must select one of these common terms to refer to each attribute in the local database schema. Environments such as healthcare could have a standard terminology dictionary for all healthcare data.

2.3 Reusing Federation Schemas

Our model also permits the reuse of federated schemas in that they may be used to generate other federation schemas. This relationship is modelled in the same fashion as a federation-to-export schema relationship although this is not shown in figure 2. Let us assume that a federation schema Fed1 has been created with an attribute called city which has links to attributes with similar semantics in four export schemas. The attribute city in Fed1 can be reused by another federation in much the same way as federations use attributes in export schemas.
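One way to picture reuse is to treat a link to a federated attribute like any other link and follow it until export attributes are reached. The sketch below does this; the link table, the second federation Fed2, the export schema names E1 to E4 and the function resolve are hypothetical.

```python
# Hypothetical link table: (schema, attribute) -> list of (schema, attribute) targets.
links = {
    ("Fed2", "city"): [("Fed1", "city")],  # reuse of a federated attribute
    ("Fed1", "city"): [("E1", "city"), ("E2", "town"), ("E3", "city"), ("E4", "add")],
}

def resolve(schema, attribute):
    """Follow links until reaching attributes that have no entry in the link table."""
    targets = links.get((schema, attribute))
    if targets is None:
        # No further links recorded here: treat this as an export attribute.
        return [(schema, attribute)]
    resolved = []
    for target_schema, target_attribute in targets:
        resolved.extend(resolve(target_schema, target_attribute))
    return resolved
```

With this table, resolving city in Fed2 yields the same four export attributes as resolving city in Fed1. The sketch assumes the links contain no cycles.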

2.4 Integration Issues

To participate in the federation, a local schema is transformed into the canonical data model to construct a component schema. We have modelled component schemas as CORBA servers with BuildSchema and RefreshSchema methods as part of the interface. Neither process can be completely automated due to the heterogeneous nature of the participating data models. The BuildSchema process is used when constructing a component schema for the first time. Once a component schema has been created the DeriveExportSchema method is used to create export schemas using the component schema. As stated earlier, both component and export schemas exist as a series of attribute-to-attribute links between schemas in different layers. The RefreshSchema method is used to modify a component schema after a change has been made to the local database.

By modelling schemas (see §2.2.2) as generic lists of SchemaAttribute objects we have ensured that our schema objects are adaptive to change. Attributes can be added, deleted or modified without rebuilding the schema. At an implementation level, this means that we do not have to rebuild our CORBA server each time a local schema change is propagated to component and export schemas.

3 Schema Modifications to Local Databases

There are three basic types of modifications that can be performed on database schemas:

1. addition of new metadata
2. modification of existing metadata
3. deletion of metadata.

These modifications can be at the class level, e.g. changing the name of a class, or at attribute level, e.g. adding a new attribute. The modifications conform to the taxonomy of schema evolution in object-oriented database systems [Ban87]. Let us examine the implications of these modifications to a local database which is participating in a federation using our framework architecture.

The federated schema in figure 4 contains an attribute called city. It is mapped to an attribute called city in various export schemas (Component Layer) which in turn are mapped to attributes in component schemas (Integration Layer). In one component schema there is a mapping to an attribute called Address3 in the local database.
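The three modification types can be told apart by comparing two snapshots of a local data dictionary. The following is a minimal sketch, assuming each snapshot is a plain mapping from attribute name to type; the function classify_changes and the sample data are illustrative only.

```python
def classify_changes(old, new):
    """Compare two {attribute name: type} snapshots of a local data dictionary."""
    added = sorted(set(new) - set(old))
    deleted = sorted(set(old) - set(new))
    # An attribute present in both snapshots counts as modified when its type differs.
    modified = sorted(name for name in set(old) & set(new) if old[name] != new[name])
    return {"added": added, "modified": modified, "deleted": deleted}

old_dictionary = {"Address3": "STRING", "age": "STRING", "name": "STRING"}
new_dictionary = {"Address4": "STRING", "age": "INTEGER", "name": "STRING"}
changes = classify_changes(old_dictionary, new_dictionary)
```

A name-based comparison like this reports a renamed attribute as one deletion plus one addition, which is one reason a refresh cannot be fully automated.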

3.1 Adding new metadata

The storage of new information involves the addition of metadata to the data dictionary of the local database. For example, in a healthcare environment, the inclusion of new laboratory tests is a frequent occurrence. Let us assume that in a local database a need has arisen to include the country portion of an address. A new attribute Address4 is added to its data dictionary together with semantic information. This has no effect on the component schema in the Integration Layer or on either of the higher layers. They are unaware of the fact that new information exists. Similarly, if an entire object is added, it has no effect on existing federations. It represents a new collection of attributes which are not being accessed by existing federations.

[Figure 4: Attribute Mapping. An attribute city in the Federation Layer maps to export attributes (city, town, add) in the Component Layer, which map through conversion and mapping in the Integration Layer to local attributes (address3, city, town, add3) in databases DB1 to DB4.]

Because we use attribute-to-attribute mappings, we must refresh the integration process to rebuild the component schema object and create an access path to the new data. Since the component schema object acts as an interface to the local database, a revised interface will show the new information which has been added to the local database. Subsequent export schema and federation schema objects can create links to new data. We are exploring methods for automatically updating export schema objects. For example, if an object in the component schema is called Patient Demographics and an export schema object contains attribute links to all attributes in the Patient Demographics object, then we suggest to the DBA that this export schema object should be updated to map to this new information.

It is clear that the addition of new data does not pose difficulties for the component or federation layers as existing links are not affected. However, links to new, and possibly relevant, information must be created. In our model, it is not the responsibility of the local database to broadcast this new information. Instead this is performed by ComponentSchema objects during routine VerifySchema operations.
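The suggestion to the DBA described above could be computed along the following lines. This is a sketch only: the function suggest_export_updates, its argument shapes and the sample schemas are assumptions, not part of the prototype.

```python
def suggest_export_updates(component_objects, export_schemas, new_attributes):
    """Suggest export schemas that should gain links to newly added attributes.

    component_objects: {object name: set of attributes before the addition}
    export_schemas:    {export schema name: set of component attributes it links to}
    new_attributes:    {object name: set of attributes added by the refresh}
    """
    suggestions = []
    for obj, added in new_attributes.items():
        previous = component_objects.get(obj, set())
        for export, linked in export_schemas.items():
            # Only schemas that already exported the whole object are candidates.
            if previous and previous <= linked:
                suggestions.append((export, obj, sorted(added)))
    return suggestions

component_objects = {"PatientDemographics": {"name", "Address3"}}
export_schemas = {"E1": {"name", "Address3"}, "E2": {"name"}}
new_attributes = {"PatientDemographics": {"Address4"}}
suggestions = suggest_export_updates(component_objects, export_schemas, new_attributes)
```

Here E1 linked to every attribute of the object and is suggested for update, while E2 exported only part of it and is left alone. The result is a list of suggestions for a person to review, not an automatic change.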


3.2 Modifying existing metadata

The modification of metadata may involve a simple attribute name change or, in some cases, a change to the data type of an attribute. Let us assume that the attribute Address3 is changed to City-Address in the local database. Before a refresh process is run, a query for this attribute would return NO LINK since the required attribute no longer exists. This is because a component schema object verifies that no modification has been made to an attribute since the last mapping information was recorded. Detecting an invalid attribute causes a request for refresh to be issued. After the component schema object has been refreshed, its link points to the attribute now called City-Address. No modification to any export schema object is necessary. They map to the same attribute although the name has been modified in the local database.

However, suppose an attribute age in the local database schema is changed from STRING to INTEGER and the component schema object has not been refreshed. Before a query is passed to the local database, a check is made on each requested attribute to ensure that the attribute name and type match the information in the component schema object (this is part of the same verification procedure discussed in the above example). If they differ an INVALID LINK is returned. After the refresh operation the type information in the component schema object is updated and subsequent retrievals for this attribute will operate as normal. The Component and Federation Layers will see an INVALID LINK before the refresh operation but will be unaffected after it. They are unaffected by the type change because export schemas check the type of attributes before the retrieval of data. This type information is stored in the component schema.

Thus, where a federation schema object is returned a NO LINK or INVALID LINK message for any attribute in the query, a request for a refresh is issued and the query resubmitted. Note that the query will still be processed and attributes with invalid links are ignored where possible. For example, if a query requests patient demographics for all patients living in Dublin and in one case the attribute age has an invalid link, then the system will still return all remaining demographic attributes in response to the query.
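The verification step can be sketched as follows. The status strings NO LINK and INVALID LINK come from the text; the functions verify and run_query, the shape of the mapping records and the sample data are assumptions made for illustration.

```python
NO_LINK, INVALID_LINK, OK = "NO LINK", "INVALID LINK", "OK"

def verify(mapping, local_schema):
    """mapping is (local attribute name, recorded type); local_schema is {name: type}."""
    local_name, recorded_type = mapping
    if local_name not in local_schema:
        return NO_LINK
    if local_schema[local_name] != recorded_type:
        return INVALID_LINK
    return OK

def run_query(requested, component_schema, local_schema, row):
    """Return values for attributes with valid links and flag whether a refresh is needed."""
    result, needs_refresh = {}, False
    for attribute in requested:
        mapping = component_schema[attribute]
        if verify(mapping, local_schema) == OK:
            result[attribute] = row[mapping[0]]
        else:
            needs_refresh = True  # the invalid attribute is skipped, the query continues
    return result, needs_refresh

# Component schema recorded before Address3 was renamed and age changed type.
component_schema = {"city": ("Address3", "STRING"), "age": ("age", "STRING"), "name": ("name", "STRING")}
local_schema = {"City-Address": "STRING", "age": "INTEGER", "name": "STRING"}
row = {"City-Address": "Dublin", "age": 42, "name": "A. Byrne"}
result, needs_refresh = run_query(["name", "city", "age"], component_schema, local_schema, row)
```

The query still returns the name attribute, and the refresh flag tells the caller to issue a refresh and resubmit.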

3.3 Deleting metadata

For this example, let us assume that the entire address details have been placed into the Address attribute and attributes Address1, Address2 and Address3 have been deleted. The Address attribute constitutes new information and is handled by the process in §3.1. However, existing export schema objects will have links to the Address1, Address2 and Address3 attributes through the component schema object. Queries on these attributes will return NO LINK. When a request for an attribute is passed through the component schema object, it first verifies that the corresponding attribute in the local schema exists and has not changed type. Where it does not exist, a NO LINK is returned.

After the refresh process the attributes deleted from the local schema are not removed from the component schema. They remain in place with a NO LINK mapping as export schema objects do not check for their existence. At some stage, a cleanup operation will remove these NO LINK attributes when it can be verified that no export schema object makes reference to them. This rule ensures that all export schemas have valid links to their component schemas.
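The retention and later cleanup of NO LINK attributes might look like the sketch below. The functions refresh and cleanup, the component attribute names addr1 and addr2, and the data shapes are hypothetical.

```python
NO_LINK = "NO LINK"

def refresh(component, local_schema):
    """component is {component attribute: local attribute or NO_LINK}.

    Attributes whose local counterpart has been deleted stay in place, marked NO_LINK.
    """
    return {
        attribute: (local if local in local_schema else NO_LINK)
        for attribute, local in component.items()
    }

def cleanup(component, export_schemas):
    """Remove NO_LINK attributes that no export schema still references."""
    referenced = set().union(*export_schemas.values())
    return {
        attribute: local
        for attribute, local in component.items()
        if local != NO_LINK or attribute in referenced
    }

component = {"addr1": "Address1", "addr2": "Address2", "name": "name"}
local_schema = {"name", "Address"}          # Address1 and Address2 have been deleted
refreshed = refresh(component, local_schema)
cleaned = cleanup(refreshed, {"E1": {"addr1", "name"}})
```

After cleanup, addr2 is gone because nothing references it, while addr1 survives as a NO LINK entry because export schema E1 still points at it.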

3.4 Data Manipulation

Given that we have provided a federated schema, we must enable users to interrogate (query) the federated database and potentially change (perform transactions upon) it. At this stage of the model's development many of these issues are still under investigation. Querying the federated database is easier than, but addresses similar issues to, the provision of transactions. [MY95] and [BGMS95] address many of the issues of specific concern here.

3.4.1 Querying

As we have indicated earlier, we regard the object-oriented (o-o) data model as being sufficiently expressive for our purposes. Expressive characteristics such as classification, generalisation and specialisation, aggregation and decomposition, and extensibility as found in o-o models make them suitable candidates as canonical models [SCG91]. Therefore, querying on the federated database will be in an o-o query language; such queries are termed federated queries.

In order to accommodate the federated system, we wish to cause as few changes to the LDBs as possible (termed design autonomy in [BGMS95]). This is because the underlying LDBs are running production systems. This requirement is the converse of the general requirement that changes in the LDB will not cause changes to the higher layers.


We note that the application of primary interest (see §4) includes a legacy system that is currently running near full capacity. Hence we have to consider performance-related issues, i.e. that as far as possible federated queries do not interfere with local queries.

Each LDB will be based potentially on a different data model; hence a major issue to address is the mismatch between what an o-o data model and its associated query language will allow and what is allowed by an underlying relational LDB. There are obvious problems with matching recursive o-o queries on to relational models without recursion. Queries involving transitive closure highlight this difficulty. For example, in diagnosing illness, it may be important to search for illnesses related to the one suspected, and, in turn, the illnesses related to these related illnesses, etc. Such a query corresponds to a transitive closure operation which can be expressed easily in an o-o framework but not in a relational one. It is not clear how to handle this issue (see the discussion in [MY95]); in the short term, the intention is to make use of the programming language front-ends provided by relational systems to handle this mismatch.

To process a federated query it is necessary to [BGMS95]:

1. Modify federation-schema names into component schema names.
2. Decompose the query into a number of sub-queries; this continues until each sub-query only references data in one component schema. Essentially the intention is to enable each sub-query to run on a single LDB. This raises the issue of what to do with a sub-query A which uses data both from component schema A (LDB-A) and from sub-query B which works on component schema B (LDB-B). In this case, there has to be data transfer between LDB-A and LDB-B, i.e. the results of sub-query B.
3. When a query is decomposed in this manner, it needs further translation from the component schema into the schema provided by the LDB.
4. Create a post-processing query which combines the results of the sub-queries.

The major issue at this stage is deciding where the post-processing is carried out. The intention is to translate back from the local schema to the component schema when necessary at the component layer. The results of this translation will then be combined into the overall result from the query at the federation layer.
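Steps 1, 2 and 4 above can be sketched for the simple case where sub-queries do not depend on each other. The mapping table FED_MAP, the schema names C1 and C2, and the functions decompose and combine are hypothetical; step 3 (translation into each local schema) and the data transfer between dependent sub-queries are left out.

```python
from collections import defaultdict

# Hypothetical mapping: federated attribute -> [(component schema, component attribute)].
FED_MAP = {
    "city": [("C1", "city"), ("C2", "town")],
    "age": [("C1", "age")],
}

def decompose(federated_attributes):
    """Steps 1 and 2: rename attributes and group them into one sub-query per component schema."""
    sub_queries = defaultdict(dict)
    for federated in federated_attributes:
        for schema, component_attribute in FED_MAP[federated]:
            # Remember the federated name so results can be translated back later.
            sub_queries[schema][component_attribute] = federated
    return dict(sub_queries)

def combine(results, sub_queries):
    """Step 4: translate each sub-query row back to federated names and concatenate."""
    combined = []
    for schema, rows in results.items():
        rename = sub_queries[schema]
        combined.extend({rename[key]: value for key, value in row.items()} for row in rows)
    return combined

sub_queries = decompose(["city", "age"])
results = {"C1": [{"city": "Dublin", "age": 30}], "C2": [{"town": "Cork"}]}
answer = combine(results, sub_queries)
```

The rows in results stand in for whatever each LDB returns; in this sketch the combination is a plain concatenation, whereas a real post-processing query may need joins or aggregation.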

In the healthcare context, there are federation queries producing statistical summary information that must run across all local databases and that require current information. In this case, some of the detail may involve one sub-query processing data from more than one other sub-query, in which case we have the problem mentioned above.
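The transitive closure query on related illnesses discussed earlier in this section can be handled in a front-end program by repeating a non-recursive lookup until no new results appear. The sketch below shows that loop; the function related_closure, the callback fetch_related and the sample relation are assumptions, with the callback standing in for one ordinary query against the LDB.

```python
def related_closure(start, fetch_related):
    """Collect every illness reachable from `start` through the related-illness relation.

    fetch_related(illness) stands in for a single non-recursive query against the LDB.
    """
    seen, frontier = set(), [start]
    while frontier:
        current = frontier.pop()
        for related in fetch_related(current):
            if related not in seen and related != start:
                seen.add(related)
                frontier.append(related)
    return seen

# Hypothetical related-illness relation, including a cycle back to the start.
RELATED = {"A": ["B"], "B": ["C", "A"], "C": []}
closure = related_closure("A", lambda illness: RELATED.get(illness, []))
```

Each pass issues one lookup per newly found illness, so the number of local queries grows with the size of the closure, which matters for an LDB already running near capacity.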

3.4.2 Transactions

Transactions cause an additional problem to querying in that there is a requirement to ensure that the effect of a federated transaction involving updates to more than one LDB is seen in each LDB or in none. With updates of data items within transactions it appears necessary to address issues such as two-phase commit. We note that the provision of such a protocol violates the design autonomy mentioned above. This highlights the difficulties: we have existing LDB production systems which we do not wish to compromise by requiring significant change; at the same time we wish to regard the sum of the LDBs as being a coherent federation; and further we do not wish changes in the LDBs to impact on the upper layers.

Work in this area is still preliminary but we envisage borrowing the memory coherency models from logically shared but physically distributed parallel systems, for example [LW95]. In particular, we wish to investigate generalising the refresh protocol described in §2 to enable updates to data held in the LDBs to be handled in a similar way to updates to metadata.

It is possible to envisage large transactions that will cause performance difficulties if handled in the standard ACID way. Ways to relax the serialisability requirements need to be addressed in order to allow more concurrency and thus higher performance. We also need to address the issues to do with transaction management in a heterogeneous environment [BGMS95].

Query optimisation is important for efficiency considerations. We note the potential conflicts between optimisation at the federated layer and the potentially different optimisations at the LDB [BGMS95]. Of particular interest is how we map changes in the behaviour of the LDB (load) and thus its optimisation policies. These changes should ideally dynamically impact on the optimisation policies at the federation layer. Again the aim is to propagate changes that have major impact, here on system performance.
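The all-or-none requirement is what two-phase commit provides. The following is a textbook-style sketch, not the protocol of the prototype (which the text describes as still under investigation); the Participant class and its can_commit flag are stand-ins for an LDB and its vote.

```python
class Participant:
    """Stand-in for one LDB; can_commit simulates its vote in phase 1."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"

def two_phase_commit(participants):
    """Phase 1 collects votes; phase 2 commits everywhere or rolls back everywhere."""
    votes = [participant.prepare() for participant in participants]
    if all(votes):
        for participant in participants:
            participant.commit()
        return True
    for participant in participants:
        participant.rollback()
    return False
```

Note that every LDB must expose prepare, commit and rollback operations to the coordinator, which is exactly the kind of change to production systems that conflicts with design autonomy. The sketch also ignores failures of the coordinator and of the network.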

Proceedings of The Thirtieth Annual Hawaii International Conference on System Sciences ISBN 0-8186-7862-3/97 $17.00 © 1997 IEEE

1060-3425/97 $10.00 (c) 1997 IEEE

4 A Healthcare Prototype

We have specified and partially constructed a prototype using the framework described in this paper, based on a subset of the healthcare environment at St. James' Hospital in Dublin. Four applications are required to share information for the Genito-Urinary Medicine clinic. They range from small PC-based applications to a large Laboratory Information System legacy database. A full description of these applications, their hardware and software platforms, and methods for interoperating with each application is given in [Roa96, RHCM96].

As stated in our discussion on the integration layer for the service model, an integration process is required for each data model to participate in the federation. This integration process differs for each data model. We also stated that our ComponentSchema object was generic and that it assumed that all data model schemas could be browsed for the purpose of building the component schema. This involves the construction of an interface which lies between the component and local schemas.

So far we have integrated one of the data models into the federation. The Pharmacy System [Roa96] has a relational data model which we have browsed using ODBC functionality [SCS95] to query the local schema and construct a component schema. It is this ODBC-style of interface which needs to be constructed for a local database to participate in the federation. We also assume that the interface can be mapped easily to an object-oriented data model. For the relational model this has been relatively straightforward.

An interface has also been constructed for a second application. The Laboratory Information System (LIS) is a legacy database based on the MUMPS platform. An interface has been built using the M programming language [Shu96] to read the data dictionary and pass queries to the LIS. The application and its interface are discussed in more detail in our work with the LIOM project [RHCM96].
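The schema browsing described above for the relational case can be sketched as follows. This is an illustration only, under several assumptions: it uses Python's built-in sqlite3 catalog in place of the ODBC catalog functions used in the prototype, the table drug_order and its columns are invented, and the functions browse_local_schema and build_component_schema are hypothetical names. Each local table becomes a class, and each attribute records a mapping back to the local column it came from.

```python
import sqlite3


def browse_local_schema(conn):
    """Return {table: [(column, declared_type), ...]} read from the catalog."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    schema = {}
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema


def build_component_schema(local_schema):
    """Map each local table to a class whose attributes point at local columns."""
    component = {}
    for table, columns in local_schema.items():
        class_name = "".join(part.capitalize() for part in table.split("_"))
        component[class_name] = {
            column: {"type": declared_type, "maps_to": f"{table}.{column}"}
            for column, declared_type in columns
        }
    return component


# Hypothetical local schema standing in for a relational LDB.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE drug_order (order_id INTEGER, patient_no TEXT)")
component = build_component_schema(browse_local_schema(conn))
```

Because the component schema is built purely from what the catalog reports, the same two steps can be rerun after a local schema change to regenerate the mappings.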
Our entire framework has been built using Orbix2 [Ion95], an implementation of the CORBA specification. A component schema exists as a CORBA server whose interface is effectively the component schema. Using our application tool for building export schemas we can browse the component schema to derive export schemas. By using CORBA we were able to ignore certain distribution, naming and location issues and concentrate on interoperability issues.
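Deriving an export schema by browsing a component schema can be pictured with the following sketch. It makes simplifying assumptions: the component schema is a plain in-process Python object rather than a CORBA server, and the names ComponentSchema, class_names, attributes and derive_export_schema are hypothetical and do not reflect the interface of our prototype. The export schema exposes only a chosen subset of attributes, each mapped to an attribute of the component schema, much as a view is derived in a traditional data model.

```python
class ComponentSchema:
    """Hypothetical in-process stand-in for the component schema server."""

    def __init__(self, classes):
        # classes has the shape {class_name: {attribute: local_mapping}}
        self._classes = classes

    def class_names(self):
        """List the classes that can be browsed."""
        return sorted(self._classes)

    def attributes(self, class_name):
        """Return a copy of the attributes of one class."""
        return dict(self._classes[class_name])


def derive_export_schema(component, selection):
    """Build an export schema exposing only the selected attributes.

    selection has the shape {class_name: [attribute, ...]}.
    """
    export = {}
    for class_name, wanted in selection.items():
        available = component.attributes(class_name)
        missing = [name for name in wanted if name not in available]
        if missing:
            raise KeyError(f"{class_name} has no attributes {missing}")
        # Export attributes refer to component attributes, never to local columns.
        export[class_name] = {name: f"{class_name}.{name}" for name in wanted}
    return export
```

Since the export schema refers only to component schema attributes, a change to a local column name needs to be absorbed in the component schema alone.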

5 Conclusion

In this paper we have described our architecture for integrating autonomous database systems into a federation. We have designed our service model so as to minimise the effects of local database schema modifications on federated schemas. We have accomplished this by forcing the component schema objects to manage all change and to provide a consistent interface to the upper layers where export and federation schemas reside.

Our current work in this area involves automatic updating of export schemas when component schemas have been modified. Our plan is to minimise the effort of the DBA in this regard. We are also examining persistence issues for our data dictionary and have started the specifications for a data dictionary service for the federation. Finally, our major on-going work involves the formal specification of an object-oriented canonical data model for federated databases. Although our CDM is not fully specified, the work on federation metadata and architectures to handle local schema changes will provide valuable input into finalising our common data model.

Acknowledgements

The authors wish to acknowledge the early comments of Pierce Hickey on the framework architecture. We are also grateful to Fergus Kelledy for his proofreading of the final draft of the paper. Finally, we would like to express our thanks to the anonymous referees whose helpful insights contributed to the final draft of this paper.

References

[Ban87] J. Banerjee, H. Chou, J. Garza, W. Kim, D. Woelk, N. Ballou. Data Model Issues for Object-Oriented Applications. ACM Transactions on Office Information Systems, January 1987.
[Ber96] P. Bernstein. Middleware: A Model for Distributed System Services. Communications of the ACM, Vol. 39, No. 2, Feb 1996.
[BGMS95] Y. Breitbart, H. Garcia-Molina and A. Silberschatz. Transaction Management in Multidatabase Systems. In Modern Database Systems, W. Kim (Ed.), pp. 551-572, ACM Press, 1995.
[Cat95] R. Cattell. The Object Database Standard: ODMG 1.2. Morgan Kaufmann, 1995.
[Col94] D. Coleman, P. Arnold, S. Bodoff, C. Dollin, H. Gilchrist, F. Hayes, P. Jeremaes. Object-Oriented Development: The Fusion Method. Prentice Hall, 1994.
[CS91] M. Castellanos and F. Saltor. Semantic Enrichment of Database Schemas: An Object Oriented Approach. 1st International Workshop on Interoperability in Multidatabase Systems, IEEE Press, 1991.
[FHM91] D. Fang, J. Hammer and D. McLeod. The Identification and Resolution of Semantic Heterogeneity in Multidatabase Systems. 1st International Workshop on Interoperability in Multidatabase Systems, IEEE Press, 1991.
[HM85] D. Heimbigner and D. McLeod. A Federated Architecture for Information Management. ACM Transactions on Office Information Systems, Vol. 3, No. 3, 1985.
[HRM96] P. Hickey, M. Roantree, J. Murphy. Architectural Issues for Integrating Legacy Systems using CORBA2 in the LIOM Project. To be published in 3rd International Conference on Object Oriented Information Systems, Springer, 1996.
[Ion95] IONA Technologies Ltd. Orbix 2 Programming Guide. Iona Technologies, 1995.
[LM91] Q. Li and D. McLeod. An Object-Oriented Approach to Federated Databases. 1st International Conference on Interoperability in Multidatabase Systems, IEEE Press, 1991.
[LW95] D.E. Lenoski and W-D. Weber. Scalable Shared-Memory Multiprocessors. Morgan Kaufmann Publishers, 1995.
[MY95] W. Meng and C. Yu. Query Processing in Multidatabase Systems. In Modern Database Systems, W. Kim (Ed.), pp. 573-591, ACM Press, 1995.
[MZ95] T. Mowbray and R. Zahavi. The Essential CORBA: System Integration Using Distributed Objects. Wiley, 1996.
[OHE96] R. Orfali, D. Harkey and J. Edwards. The Essential Distributed Objects Survival Guide. Wiley, 1996.
[OMG95] The Object Management Group. The Common Object Request Broker Architecture: Architecture and Specification. Object Management Group, Framingham, 1995.
[PBE95] E. Pitoura, O. Bukhres and A. Elmagarmid. Object Orientation in Multidatabase Systems. ACM Computing Surveys, Vol. 27, No. 2, June 1995.
[Roa96] M. Roantree. Interoperability Issues for Healthcare Data Models. LIOM Report No. LIOM-DCU-96-03, Dublin City University, 1996.
[RHCM96] M. Roantree, P. Hickey, J. Cardiff, J. Murphy. Metadata Modelling for Healthcare Applications in a Federated Database System. Proceedings from the workshop on Trends in Distributed Systems in Aachen, Germany, LNCS, Springer, 1996.
[SCG91] F. Saltor, M. Castellanos and M. Garcia-Solaco. Suitability of Data Models as Canonical Models for Federated Databases. SIGMOD Record, Vol. 20, No. 4, 1991.
[SCS95] R. Signore, J. Creamer and M. Stegman. The ODBC Solution. McGraw-Hill, 1995.
[Shu96] D. Shusman. Programming with M. Database Development, Dr. Dobbs Sourcebook, Jan/Feb 1996.
[SL90] A. Sheth and J. Larson. Federated Database Systems for Managing Distributed, Heterogeneous, and Autonomous Databases. ACM Computing Surveys, Vol. 22, No. 3, September 1990.
