A Pattern-Based Framework for Database Reusability
BeiHu Wang, Xiaodong Liu and Jon Kerridge School of Computing Napier University Edinburgh, Scotland UK {t.wang, x.liu,
[email protected]}
Abstract
The development of database application systems will benefit from high reusability because similar design circumstances recur frequently in database development. This paper presents a framework for design reuse in database environments. The framework proposes a reuse process that supports the particular features of the database domain. Data-intensive design patterns are used to record reusable artefacts in both data schema and processing. An XMI-based pattern definition language is developed; XMI was chosen because it is a standard information exchange format that is popular among software communities.
Keywords: software reuse, design patterns, database development, XMI, XML.
1 Introduction
A database is the main building block of any modern information system. As an essential component, the database must be carefully designed to contain all the information required by the user. Today, we are faced with an increasing demand for more complex database applications. This rapid growth has stimulated the need for more efficient and easier-to-use tools and techniques for database design and development [24].
One potential solution is to apply reuse technology to the development of database systems. Reuse has been researched with success as a key technology in software engineering for decades [15]. However, reuse in the database domain has not received enough attention. Until now, most reuse research has addressed only the reuse of general software systems [22]. Approaches and tools supporting reusability in data-intensive systems, such as relational or object-oriented databases, are urgently needed.
In software engineering, reusable components have been known and practised almost since the first programs were written. Reuse has been considered a means of overcoming the software crisis. It can be applied at the level of code (algorithms and data structures), design (system decompositions) and requirements models (domain-specific concepts), as well as human resources [9]. Software reuse has brought substantial improvements in efficient, high-quality and low-cost software design and development.
Reuse patterns, a rather new reuse technology of growing interest, have shown the ability to collect communal wisdom about good solutions to problems and to reuse those solutions in the future development of software systems [20]. In “Understanding and Using Patterns in Software Development” [5], Dirk Riehle and Heinz Züllighoven gave a broadly applicable definition of the term “pattern”: “A pattern is the abstraction from a concrete form which keeps recurring in specific non-arbitrary contexts.”
We believe that an approach that combines pattern technology with consideration of the particular features of database systems will help to improve reuse in database development. In this paper we propose a pattern-based framework suitable for database reuse. The framework covers the whole reuse process with particular consideration for the database domain. The process phases include pattern mining, pattern definition, pattern repository, and pattern reuse. An XMI-based pattern definition language is introduced, chosen for the rich expressiveness and popularity of XML and XML Metadata Interchange (XMI). The pattern definition language is object-oriented and specific to the database domain. A pattern example is given, and conclusions are drawn at the end of the paper.
2 Database Reusability
2.1 Feasibility and Importance
The primary motivation for database reuse is to reduce the time and effort required to develop a database system. Because the quality of a software system is enhanced by reusing quality software artefacts, reuse also reduces the time and effort required to maintain software [3]. Similarly, database reuse can influence, first of all, logical and physical database design, and also database maintenance. Our literature investigation shows that similar design circumstances recur frequently in database developments [26][12][6]. Another finding is that these design circumstances can generally be structured and call for very typical implementation solutions. These features provide a promising foundation for applying pattern technology to achieve a high degree of reuse during development.
2.2 Lack of Approaches and Technologies
Reuse in general-purpose software development, including pattern technologies, has been researched successfully for many years. Several successful pattern reuse projects are summarised as follows:
• The SDL-Pattern Approach [16]. The system is designed to support all phases of a reuse process and the accompanying improvement cycle by providing adequate functionality based on pattern techniques.
• XML Patterns [10]. Four patterns are developed to deal with a general problem: getting distributed data onto a Web site.
• Using Patterns to Design Rules in Workflows [7]. This is an approach to flexible workflow design based on rules and patterns, developed in the framework of the WIDE project. Rules allow a high degree of flexibility during workflow design by modelling exceptional aspects of the workflow separately from the main activity flow.
• Patterns at AG Communication Systems [18]. At AG Communication Systems, software and process solutions have been captured and made available for collaboration and reuse as documented patterns.
• Patterns in industrial automation at Siemens [1]. Various operating divisions at Siemens are investigating the effectiveness of using patterns to improve their software production and related activities.
• Design patterns at Motorola [19]. Motorola has several independent efforts investigating the use of design patterns for system development.
Further mature pattern approaches are documented in a series of books [4][25][11][8]. However, approaches and technologies for database reuse have been inadequately addressed, and both researchers and practitioners urgently need them. The goal of the “MetaBase project” [23] is to show that reuse ideas can be applied to the database field, specifically to conceptual modelling. A model derived for one enterprise could be used in future projects for more or less similar enterprises. This argument results in the meta data model repository, MetaBase, which enables the reuse of models or submodels from previous projects in the current design.
2.3 Problems in Database Reuse
Reuse is difficult throughout computer science, because useful abstractions for large, complex reusable software artefacts are themselves typically complex. Because the development of a database system comprises not only data-intensive application design but also data content and structure design, abstraction for reusable database artefacts is far more difficult than for general software systems. The challenge is embodied in the following aspects.
• More complex reuse components. Database reuse should reuse the data schema and the application programs, and the data as well.
• Data-crucial. No data may be lost during evolution. In software evolution, a very difficult yet feasible step is to reuse the old database system’s design and data.
• Data-centred programs. Database application programs are based on data schemas.
3 The Framework
Based on a study of the features of database systems, a framework for reuse in these systems is presented in this section. The framework describes a suitable process for the reuse of design patterns in the database domain, the improvement cycle of database design patterns, and a database pattern definition language.
3.1 Database Reuse Process
Figure 1 shows the proposed process of design pattern reuse in database applications. The process consists of two closely related parts: pattern reuse, which focuses on the reuse of available design patterns in database application development, and pattern mining, which focuses on developing new patterns or improving existing patterns through the development process.
Figure 1 Database Reuse Process
Pattern reuse
We advocate requirement analysis as the starting point of pattern reuse. For database applications, requirements should cover two aspects: data requirements, which describe the requirements on the data schema/architecture, and processing requirements, which describe the requirements on data processing/transactions.
The next stage of pattern reuse is scenario analysis, an extension of requirement analysis. In this stage both data and processing requirements are decomposed and represented with data-intensive use cases, an extension of UML use cases with database application features and rigorous descriptions.
The selection of suitable design patterns happens at the architectural design stage. Based on the requirements expressed in data-intensive use cases, the architectures of the data schema and the processing software are developed. Pattern qualification is done in parallel with architectural design: the selection of design patterns should comply with the architecture of the database application, and the architecture may in turn be adjusted to fit certain qualified patterns. Design patterns are selected from the pattern repository constructed during pattern mining. All patterns are described in the XMI-based pattern definition language. During pattern adaptation, the selected patterns are adjusted to fit the database application architecture better. The final product of architectural design is the architectural model, which validates the requirements and contains qualified design patterns as reuse units.
The qualified design patterns are then applied in the detailed design stage; that is, their design solutions are introduced into the system design. For database applications, the qualified design patterns may contain design solutions for data schema and transactions. After refinement, the final design is validated.
In the implementation stage, the data schema implementation solutions and the transaction implementation solutions from the qualified patterns are applied to the system implementation, which may include data schema, storage and application code.
Pattern mining
During requirements analysis, useful domain knowledge can be extracted with domain analysis techniques, and domain models are built or enhanced. The purpose of domain analysis is to identify the information used in developing systems in a domain, capture this information, and organize the knowledge into a form that is reusable when creating new systems [14]. This phase supports systematic, large-scale database reuse by capturing both the commonalities and the variability of database systems within a domain, in order to improve the efficiency of database development and maintenance. The results of the analysis, collectively referred to as a domain model, are captured for reuse in the current domain. The development of new design patterns and the evolution of existing patterns may happen at several stages of database application development.
Pattern mining is an activity that involves many elements: domain models as the basis; structured requirements and architectural designs from practical database application developments as input; and quality data schema and transaction implementations as pattern solutions.
Pattern definition focuses on the initial documentation of a developed or evolved pattern. The author of a pattern needs to decide what constitutes the pattern, what makes it reusable by others, and what domain it serves, and hence whether it qualifies as a pattern at all. Design patterns are defined in an XMI-based Pattern Definition Language (PDL) and stored in the database design pattern repository. XMI is used because it is a widely accepted format for information exchange among software development groups and available tools, and because integrating other techniques, such as formal methods, with XMI makes it possible for a PDL description to be rigorous and at the same time easy to understand.
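The interplay between the two halves of the process, with mining storing patterns and reuse selecting them, can be sketched as follows. This is a minimal in-memory illustration in Python, not the repository design of the framework: the class names `PatternEntry` and `PatternRepository`, the field names, and the example domains are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class PatternEntry:
    """One stored pattern; all field names are illustrative assumptions."""
    title: str
    domain: str
    database_model: str   # e.g. "relational" or "object-oriented"
    pdl_document: str     # the XMI-based PDL text, kept opaque here


@dataclass
class PatternRepository:
    """In-memory stand-in for the database design pattern repository."""
    entries: list = field(default_factory=list)

    def store(self, entry: PatternEntry) -> None:
        # Pattern mining side: an evolved pattern replaces its older version,
        # a new pattern is simply added.
        self.entries = [e for e in self.entries if e.title != entry.title]
        self.entries.append(entry)

    def select(self, domain: str, database_model: str) -> list:
        # Pattern reuse side: return candidates for pattern qualification.
        return [e for e in self.entries
                if e.domain == domain and e.database_model == database_model]


repo = PatternRepository()
repo.store(PatternEntry("composer", "music", "relational", "<pattern/>"))
repo.store(PatternEntry("course", "education", "relational", "<pattern/>"))
candidates = repo.select("music", "relational")
```

Selection here is a plain equality filter; a real qualification step would also compare the candidate against the architectural model.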
3.2 The Improvement Cycle of Database Patterns
Although reusing database patterns is a promising way of transferring existing database design knowledge into new projects, developing such artefacts is expensive and time consuming. In addition, patterns are expected to reflect the latest state of the art, as reuse benefits depend strongly on the quality of the reuse artefacts [16]. This calls for a systematic and efficient instrument for detecting and capturing the improvement potential of the database design patterns and of the reuse process. In accordance with our pattern-based database reuse process, continuous quality improvement applied to the database pattern approach supports the incremental evolution of database design patterns, which is driven by practical experience and triggered by user demand. This approach is organized around a database pattern reuse repository, which stores the reuse patterns and relates them to further experience elements that are essential for the efficient packaging of suggested improvements. Figure 2 gives the details of our proposed improvement cycle of database design patterns.
Figure 2 Database Design Pattern Improvement Cycle
3.3 Database Pattern Definition Language
If a pattern is a recurring solution to a problem in a context given by some forces, then a pattern definition language is a collection of such solutions which, at some scale, work together to resolve a complex problem into an orderly solution according to a pre-defined goal. There is not yet a standard PDL, though several projects are working in this area: HTML 2.0 Pattern Language [13], Formal Method Pattern Language [17], and Pattern Markup Language [21], which is an XML-based format for describing software patterns. However, none of these approaches has been accepted by database pattern designers, because they are hard to understand, not well formatted, or not easy to exchange.
In this paper, we propose a PDL based on XMI for the rigorous description of database design patterns. The advantage of XMI is that it is a widely accepted format for information exchange among software development groups and available tools, and that integrating other techniques, such as formal methods, with XMI makes it possible for a PDL description to be rigorous and at the same time easy to understand.
XML Metadata Interchange (XMI) provides open information interchange among the major types of application development environments [2], such as:
• Design tools, including object-oriented UML tools such as Rational Rose and Select Enterprise.
• Development tools, including integrated development environments such as JBuilder and Visual C++.
• Databases, data warehouses and business intelligence tools, including IBM DB/2 and Oracle/8i.
• Software assets, including program source code (C, C++, and Java) and CASE tools such as TakeFive’s SniFF+.
• Repositories.
• Reports, report generation tools, documentation tools, and web browsers.
An XMI-based PDL that takes special account of database application features will benefit pattern designers through rigorous pattern description, easy pattern propagation, and high reusability. As the XMI-based PDL is an extension of XML, it can be visualised in most Web browsers and edited with any text editor. An example of a pattern definition is given in section 4 to show the sample structure and syntax of the XMI-based PDL.
4 A Pattern Example
In this section, an example of a database design pattern defined in the XMI-based PDL is given. The description of a database design pattern, stored in the database pattern repository, normally consists of the following contents:
• Title. Pattern titles tend to be noun phrases describing attributes such as:
  - Design decision
  - Name of the main objects of the pattern
  - Function
• Intent. What the pattern achieves.
  - Domain
  - Database model
• Problem. This item describes when to apply the pattern. It explains the problem and its context. Sometimes the problem includes a list of conditions that must be met before it makes sense to apply the pattern.
• Solution.
  - Reusable data schema
  - Reusable data processing
  - Implementation of the schema and processing
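The contents listed above map naturally onto a small record type. The following Python sketch is one possible in-memory shape for such a description; the class names, the field names, the `preconditions` list and the `applies` helper are assumptions made for illustration, and the example values are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Solution:
    data_schema: str        # reusable data schema
    data_processing: str    # reusable data processing
    implementation: str     # implementation of the schema and processing


@dataclass
class PatternDescription:
    title: str
    intent: str
    domain: str
    database_model: str
    problem: str
    solution: Solution
    # Conditions that must be met before applying the pattern makes sense.
    preconditions: list = field(default_factory=list)

    def applies(self, satisfied: set) -> bool:
        # The pattern is applicable only when every listed condition holds.
        return all(c in satisfied for c in self.preconditions)


composer = PatternDescription(
    title="composer",
    intent="manage composer data and feed a report",
    domain="music",                       # hypothetical value
    database_model="relational",          # hypothetical value
    problem="report data must be read from and written to a database",
    solution=Solution("COMPOSER, COMPOSITION", "GetData, PutData", "SQL"),
    preconditions=["report component exists"],
)
```

Keeping the preconditions as data lets a selection tool test applicability mechanically instead of relying on a reader to interpret the problem text.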
The PDL defines two sets of rules that provide open interchange and leverage the capabilities of XMI: PDL DTD generation and PDL document generation. A PDL DTD provides a means by which an XML processor can validate the syntax and the semantics of an XML document. Syntax specifies which utterances are legal; semantics specifies what legal utterances mean. PDL DTD generation is used to specify an interchange format, and PDL document generation creates documents that use a given PDL DTD. A PDL DTD involves the standard UML DTD, the Database Schema DTD, and the Pattern Special DTD. A PDL DTD sample is shown below:
[PDL DTD sample: markup not recoverable from the source]
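The division of labour between document generation and validation against a fixed structure can be illustrated with a short Python sketch. Python's standard library does not validate against a DTD, so the check below is a hand-written stand-in, not DTD validation; the element names `pattern`, `title`, `intent`, `problem` and `solution` are assumptions based on the contents listed at the start of this section, not the actual PDL vocabulary.

```python
import xml.etree.ElementTree as ET

# Assumed element names; the real PDL DTD may differ.
REQUIRED = ("title", "intent", "problem", "solution")


def generate(fields: dict) -> str:
    """Document generation: emit one <pattern> with a child per field."""
    root = ET.Element("pattern")
    for name, text in fields.items():
        ET.SubElement(root, name).text = text
    return ET.tostring(root, encoding="unicode")


def missing_elements(document: str) -> list:
    """Hand-rolled structural check standing in for DTD validation."""
    root = ET.fromstring(document)
    return [name for name in REQUIRED if root.find(name) is None]


doc = generate({
    "title": "composer",
    "intent": "manage composer data",
    "problem": "composer records feed a report",
    "solution": "COMPOSER and COMPOSITION tables",
})
incomplete = generate({"title": "composer"})
```

A validating XML processor would perform the same kind of check declaratively from the DTD, and would also enforce element order and content models.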
A PDL document involves five major elements: title, intent, question, solution, and related patterns. A sample is given as follows; the database system is called “Composer System”. From practical development, a pattern named the composer design pattern was derived. The composer design pattern includes several classes, for example Report, ProjectClient and ProjectServer. For conciseness, just one composer class is discussed here. The composer class includes a standard set of methods. GetData gets data from the composer database to pass to the report. PutData puts data from the report into the database.
[PDL document sample; element markup not preserved, text content follows]
composer patterns
composer system
this system aims to manage composer
composer database input, show composer report
…
composer
…
COMPOSER COMP_NO COMP_IS COMP_NO COMP_IS COMP_TYPE
COMPOSITION C_NO C_IN C_NO COMP_DATE C_TITLE C_IN
…
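The GetData and PutData methods described for the composer class can be sketched against an in-memory SQLite database. This is an illustration only: the column names COMP_NO, COMP_IS and COMP_TYPE come from the sample, but their types, the choice of COMP_NO as key, the Python method names `get_data` and `put_data`, and the example rows are assumptions.

```python
import sqlite3


class Composer:
    """Hypothetical composer class with the two methods named in the text."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self.connection = connection
        # Column types and the primary key are assumed, not from the paper.
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS COMPOSER ("
            "COMP_NO INTEGER PRIMARY KEY, COMP_IS TEXT, COMP_TYPE TEXT)"
        )

    def put_data(self, comp_no: int, comp_is: str, comp_type: str) -> None:
        # PutData: move data from the report into the database.
        with self.connection:  # commits on success, rolls back on error
            self.connection.execute(
                "INSERT OR REPLACE INTO COMPOSER VALUES (?, ?, ?)",
                (comp_no, comp_is, comp_type),
            )

    def get_data(self) -> list:
        # GetData: fetch data from the database to pass to the report.
        cursor = self.connection.execute(
            "SELECT COMP_NO, COMP_IS, COMP_TYPE FROM COMPOSER ORDER BY COMP_NO"
        )
        return cursor.fetchall()


composer = Composer(sqlite3.connect(":memory:"))
composer.put_data(2, "B", "classical")
composer.put_data(1, "A", "baroque")
rows = composer.get_data()
```

Because both methods go through the schema, the class is an instance of the data-centred programs discussed in section 2.3: changing the COMPOSER table changes the code.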
Because XSL Transformations (XSLT) can transform an XML document for display in a Web browser, the composer pattern can be shown in a browser as follows:
[Browser rendering of the composer pattern; layout not preserved, text content follows]
Composer system
this system aims to manage composer
composer database input, show composer report
composer ........ composer ........ cop ........ GetData ........ PutData ........
course
courseno courseno cname cdate
.........
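The idea of turning a pattern document into a browser-viewable page can be illustrated without an XSLT processor. The Python sketch below is a stand-in for XSLT, not XSLT itself: it walks a small XML document with the standard library and emits an HTML table. The element names `pattern`, `title` and `intent` are assumptions; the text values are taken from the sample above.

```python
import xml.etree.ElementTree as ET
from html import escape


def render_html(document: str) -> str:
    """Render each top-level child element as one HTML table row."""
    root = ET.fromstring(document)
    rows = [
        f"<tr><th>{escape(child.tag)}</th><td>{escape(child.text or '')}</td></tr>"
        for child in root
    ]
    return "<table>" + "".join(rows) + "</table>"


page = render_html(
    "<pattern><title>Composer system</title>"
    "<intent>this system aims to manage composer</intent></pattern>"
)
```

An XSLT stylesheet would express the same mapping declaratively, which keeps the presentation rules outside the application code.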
5 Conclusions
Based on the finding that structural design circumstances recur frequently in the development of database applications, we conclude that raising reusability in database applications would greatly improve the efficiency of development. The fact that most reuse approaches and tools have concentrated on general software systems, with little emphasis on reuse in data-intensive environments, triggered the research in this paper. The proposed framework aims to use design patterns to facilitate database application development with improved reusability. To become a reuse methodology, the current framework needs further work, such as detailed pattern mining criteria and rules and a detailed pattern definition language. However, the current framework is already valuable for the general process of design pattern reuse and mining in database environments. Initial investigations have shown that an XMI-based pattern definition language is very helpful in improving reusability because of its wide acceptance among software engineers, approaches and tools, and because it combines a rigorous internal representation with a concise, easily understandable external format.
Design patterns in the framework span a spectrum of abstraction levels, including requirement interface, architectural and design solutions, and implementation solutions. These patterns are the reusable blocks for building a new database system. Existing expertise and idioms of database design recorded in these patterns are then presented as the starting point for the design of a new database application. Design patterns are described in the XMI-based pattern definition language and stored in the database pattern repository. The pattern definition language is a semi-semantic pattern language. The features inherited from XMI make the design patterns easier to understand, to exchange and to propagate across software communities.
References
[1] Beck, K., Industrial Experience with Design Patterns, IEEE proceeding of ICSE-18, 1996.
[2] Brodsky, S., XMI Opens Application Interchange, IBM, 1999.
[3] Brooks, No Silver Bullet: Essence and Accidents of Software Engineering, Computer 20, vol 4, 1987.
[4] Coplien, J.a.D.S., Pattern Language of Program Design, Addison-Wesley, 1995.
[5] Dirk Riehle, H.Z., Understanding and Using Patterns in Software Development, Theory and Practice of Object Systems 2, 1996(1): p. 3-13.
[6] Egyhazy, C.J., From Software Reuse to Database Reuse, International Journal of Software Engineering and Knowledge Engineering, vol 10, 1998: p. 227-249.
[7] Fabio Casati, S.C., Using Patterns to Design Rules in Workflows, IEEE Transactions on Software Engineering, vol 26, 2000.
[8] Foote, B., N. Harrison, Pattern Language of Program Design 4, Addison-Wesley, 1999.
[9] John D. McGregor, D.A.S., Object-Oriented Software Development: Engineering Software For Reuse, Van Nostrand Reinhold, 1992.
[10] Maria Laura Ponisio, G.R., XML Patterns, the PLoP 2001 Conference, 1999.
[11] Martin, R., D. Riehle, Pattern Language of Program Design 3, Addison-Wesley, 1998.
[12] Martins, A.J.B., A., Applying Delegation to OO Logical Design: Towards Reuse, OO Databases, Simpósio Brasileiro de Banco de Dados, 1997.
[13] Orenstein, R., An HTML 2.0 Pattern Language, http://www.anamorph.com/docs/patterns/default.html, 1995.
[14] Prieto-Diaz, R., Domain Analysis: An Introduction, in Software Engineering Notes, 1990. 15(2): p. 47-54.
[15] Rada, R., Software Reuse, Intellect Ltd, 1995.
[16] Raimund L. Feldmann, B.G., An ORDBMS-based Reuse Repository Supporting the Quality Improvement Paradigm -- Exemplified by the SDL-Pattern Approach, The Proceedings of the: Technology of Object-Oriented Languages and Systems, 2000.
[17] Rajeev R. Raje, S.C., elePUS -- A Language for Specification of Software Design Patterns, Proceedings of ACM SAC'2001, 2001.
[18] Rising, L., Reuse at AG Communication Systems = Patterns, MultiUse Express, vol 4, 1996.
[19] Schmidt, D.C., Using Design Patterns to Develop Reusable Object-Oriented Communication Software, Communications of the ACM, vol 38, 1995.
[20] Sommerville, I., Software Engineering, 6th ed., Addison-Wesley, 2001.
[21] Suzuki, J., UML Exchange Format & Pattern Markup Language, http://www.yy.ics.keio.ac.jp/~suzuki/project/uxf/index.html#papers, 2000.
[22] Tatjana Welzer, M.D., Similarity Search in Database Reusability -- a Support on Efficient Design of Conceptual Models, Vaasa: Vaasan yliopisto, 2000.
[23] Tatjana Welzer, B.S., Reuse in the Sense of the Rapid Database Prototyping, BADEN1996, 1996.
[24] Tatjana Welzer, B.S., Reuse Database Components, the Patterns From MetaBase Repository, HAWAIIAMS98, 1998.
[25] Vlissides, J., J. Coplien, Pattern Language of Program Design 2, Addison-Wesley, 1996.
[26] Williams, P., A New Database Direction, IT Week 28th, 2001.
[27] Yacoub, S.M., Pattern-Oriented Analysis and Design (POAD): A Methodology for Software Development, Computer Engineering, West Virginia University: Morgantown, 1999.