FUNCTIONAL ORIENTED TO OBJECT ORIENTED APPROACH IN SOFTWARE REENGINEERING (A case study of Student Information System, Usmanu Danfodiyo University Sokoto)
BY
Bello Alhaji Buhari B.Sc (UDUS 2000) M.Sc/SCIE/49130/2005-06
A THESIS SUBMITTED TO THE POSTGRADUATE SCHOOL, AHMADU BELLO UNIVERSITY, ZARIA, NIGERIA IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE AWARD OF MASTER OF SCIENCE IN COMPUTER SCIENCE
DEPARTMENT OF MATHEMATICS, FACULTY OF SCIENCE, AHMADU BELLO UNIVERSITY, ZARIA
JULY, 2009
DECLARATION
I declare that the work in the project thesis entitled “FUNCTIONAL ORIENTED TO OBJECT ORIENTED APPROACH IN SOFTWARE REENGINEERING (A CASE STUDY OF STUDENT INFORMATION SYSTEM, USMANU DANFODIYO UNIVERSITY SOKOTO)” has been performed by me in the Department of Mathematics under the supervision of Dr. D.N. Choji. The information derived from the literature has been duly acknowledged in the text and a list of references provided. No part of this thesis was previously presented for another degree or diploma at any University.
______________________
_________________
____________
Name of Student
Signature
Date
CERTIFICATION
This thesis entitled Functional Oriented to Object Oriented Approach in Software Reengineering (A Case Study of Student Information System, Usmanu Danfodiyo University Sokoto)
by Bello Alhaji Buhari meets the
regulations governing the award of the degree of Master of Science of Ahmadu Bello University, Zaria and is approved for its contribution to knowledge and literary presentation.
_____________________________ Dr. D. N. Choji Chairman, Supervisory Committee
___________________ Date
_________________________ Dr. O. S. Adewale Minor Supervisor
_______________________ Date
___________________________ External Examiner
__________________________ Date
__________________________ Prof. G. U. Garba Head of Department of Mathematics
___________________________ Date
_______________________ Prof. S.A. Nkom Dean, Postgraduate School
___________________________ Date
ACKNOWLEDGEMENT
All praise and gratitude goes to Almighty Allah for sparing my life and guiding me all through to this moment. Peace and blessing of Allah be upon the Holy Prophet Muhammad, his family and companions. Blessing of Allah also be upon the men of God in every century till hereafter.
I have no words to thank my parents for their endless concern and support throughout my life. May Allah reward them with paradise.
My gratitude goes to my supervisor, Dr. D.N. Choji, for his patience and guidance throughout this research thesis. I also wish to acknowledge Mr. A.A. Obiniyi for his encouragement and guidance toward the success of this thesis and of my studies in general. I also acknowledge Dr. O. S. Adewale, my minor supervisor, Professor Sahalu B. Junaidu, Mr. Ajibade and all members of the department.
I want to use this opportunity to acknowledge Dr. Aminu A. Ibrahim, who has been guiding and advising me toward success throughout my life.
ABSTRACT
Most organizations stand at a crossroads of competitive survival, a crossroads created by the information revolution that is now shaping (and shaking) the world. This thesis explores reengineering principles in changing the Usmanu Danfodiyo University Sokoto Student Information System from a functional-oriented offline system to an object-oriented online system. It changes the system from the DOS platform to the Windows platform, from offline to online, from Dbase V to object-oriented PHP (PHP 5), and from a flat file (Dbase V) to a relational database (MySQL).
TABLE OF CONTENTS

Chapter One: General Introduction
1.1 Background of the study ----- 1
1.2 Motivations of the study ----- 5
1.3 Problems analysis ----- 6
1.4 Objectives of the study ----- 7
1.5 Research methodology ----- 8
1.6 Chapter scheme ----- 9

Chapter Two: Literature Review
2.1 Introduction ----- 11
2.2 Reengineering ----- 12
2.3 Reverse engineering ----- 13
2.4 Software reengineering and reverse engineering approaches ----- 14
2.4.1 Source-to-source translation approaches ----- 15
2.4.2 Object recovery and specification approaches ----- 17
2.4.3 Incremental approaches ----- 22
2.4.4 Component-based approaches ----- 24
2.5 New research trends ----- 28
2.6 Software reengineering process model ----- 36
2.7 Technologies used in designing the student information system ----- 41

Chapter Three: Reverse Engineering
3.1 Introduction ----- 50
3.2 Reverse engineering to understand processing ----- 51
3.2.1 Creating a data flow diagram ----- 52
3.3 Reverse engineering to understand data ----- 54
3.3.1 Creating an entity relationship diagram ----- 55
3.4 Reverse engineering to understand user interface ----- 59
3.4.1 Creating a state transition diagram ----- 60

Chapter Four: Restructuring and Forward Engineering
4.1 Introduction ----- 62
4.2 Code restructuring ----- 63
4.2.1 Creating flowcharts ----- 66
4.3 Unified modeling approach to object oriented analysis ----- 71
4.3.1 Use-cases ----- 72
4.3.2 Class-responsibility-collaborator (CRC) modeling ----- 76
4.3.3 Defining structure and hierarchies ----- 86

Chapter Five: Summary, Conclusion and Recommendation
5.1 Introduction ----- 88
5.2 Summary ----- 89
5.3 Conclusion ----- 90
5.4 Recommendation ----- 92

References ----- 94
Appendix A
Appendix B
LIST OF TABLES
Table 1: The attributes of student object ----- 55
Table 2: The attributes of course object ----- 55
Table 3: The attributes of examination object ----- 55
Table 4: CRC for student class ----- 83
Table 5: CRC for fees class ----- 84
Table 6: CRC for course class ----- 84
Table 7: CRC for course_registration class ----- 85
Table 8: CRC for course_result class ----- 85
LIST OF FIGURES
Fig. 1: Reverse engineering process ----- 14
Fig. 2: Timeline of research on reengineering ----- 29
Fig. 3: Software reengineering process model ----- 35
Fig. 4: The reverse engineering process model ----- 48
Fig. 5: Level 0 DFD for student information software ----- 51
Fig. 6: Level 1 DFD for student information software ----- 51
Fig. 7: Level 2 DFD that refines the process: process result ----- 52
Fig. 8: ERD for student information software (without cardinality and modality) ----- 57
Fig. 9: ERD for student information software (with cardinality and modality) ----- 57
Fig. 10: STD for student information software ----- 60
Fig. 11: Online registration module flowchart ----- 66
Fig. 12: Result entry module flowchart ----- 67
Fig. 13: Online result checking module flowchart ----- 68
Fig. 14: Fees report module flowchart ----- 69
Fig. 15: Departmental course report module flowchart ----- 70
Fig. 16: A high level use-case diagram ----- 74
Fig. 17: A low level use-case diagram for the submit forms function ----- 74
Fig. 18: A low level use-case diagram for the makes query function ----- 75
Fig. 19: A low level use-case diagram for the enters data function ----- 75
Fig. 20: A low level use-case diagram for the create report function ----- 75
Fig. 21: A class diagram for generalization/specialization with composite aggregate ----- 87
GLOSSARY
Data flow diagram: a graphical representation that depicts information flow and the transforms that are applied as data move from input to output.
Data processing: refers to a class of programs that organize and manipulate data, usually large amounts of numeric data.
Forward engineering: the process of applying software engineering principles, concepts, and methods to re-create an existing application.
Legacy system: a computer system that has been in operation for a long time, and whose functions are too essential to be disrupted by upgrading or integration with another system.
Reverse engineering: for software, the process of analyzing a program in an effort to create a representation of the program at a higher level of abstraction than source code.
System evolution: the accommodation of changes after the system has been delivered to its users.
CHAPTER ONE: GENERAL INTRODUCTION

1.1 Background of the Study
It is not uncommon to find that many of the data processing systems in use today have been in operation for as long as 10 to 30 years. Although their original developers may not have expected their products to provide useful service for so many years, it is now understood that long lifetimes are an inherent property of many business systems. Of great significance is the fact that any non-trivial system must change over the course of its lifetime. Long lifetimes represent a considerable period of time with potentially extensive changes to accommodate.
System evolution is the accommodation of changes after the system has been delivered to its users. Specification, development, validation, and evolution are all activities fundamental to any process model. Evolution, however, is often seen as an adjunct to the other activities.
Lehman and Belady (1985) conducted a series of studies on system evolution and produced five observations, as follows: a. The first law states that change is inevitable. Systems operate in dynamic environments, and should a system remain static, it will not continue to serve its users’ real needs.
b. The second law states that as a system is changed, its structure is degraded. Additional costs, over and above those of implementing the change, are required to reverse the effects of structural degradation. c. The third law suggests that for large systems, program evolution is largely independent of management decisions because of organizational factors. d. The fourth law shows that changes to resources, such as staffing, have imperceptible effects on evolution. For example, productivity may not increase by assigning new staff to a project, because of the additional communication overhead. e. In the fifth law, Lehman suggests there is a limit to the rate at which new functionality can be introduced, such that if too much new functionality is introduced in any one release, a new release will be required fairly quickly to correct the errors introduced in the previous one.
Legacy systems are older software systems that remain vital to an organization. Many software systems that are still in use were developed many years ago, using technologies that are now obsolete. These systems are still essential to the normal functioning of the business or organization.
As mentioned above, systems must change in order to remain useful. Changing legacy systems is often expensive for the following reasons: a. Different parts of the system were implemented by different teams. b. The system might be using an obsolete programming language.
c. The system documentation is often out-of-date. d. The system structure might have been corrupted by many years of maintenance. e. Techniques to save space or increase speed at the expense of understandability might have been used.
Evolutionary systems, by themselves, do not address the legacy problems. What is needed is a way of transforming legacy systems into evolutionary systems. The three options for transforming a legacy system into an evolutionary system are: a. Continued maintenance: This approach involves continued maintenance of the system according to the traditional develop-and-maintain process model. However, it is sensible to employ some evolutionary system concepts, such as starting to manage system evolution records. b. Re-engineering: Re-engineering is the systematic transformation of an existing system into a new form to realize quality improvements in operation, system capability, functionality, performance or evolvability at a lower cost. c. Replacement: The existing system is replaced with a new system developed from scratch. Replacement may be necessary where re-engineering is not technically viable or where a radical change to the business process is required.
In this thesis the second alternative is preferred, because it has some key advantages. These advantages are: a. Lower costs: Evidence from a number of US projects suggests that re-engineering an existing system costs significantly less than new system development. b. Lower risks: Re-engineering is based on incremental improvement of a system, rather than radical system replacement. The risk of losing critical business knowledge, which may be embedded in a legacy system, or of producing a system that does not meet its users’ real needs, is drastically reduced. c. Better use of existing staff: Existing staff expertise can be maintained, and extended to accommodate new skills during re-engineering. The incremental nature of re-engineering means that existing staff skills can evolve as the system evolves, with less of the risk and expense associated with hiring new staff. d. Revelation of business rules: As a system is re-engineered, business rules that are embedded in the system are rediscovered. This is particularly true where the rules govern exceptional situations. e. Incremental development: Re-engineering can be carried out in stages, as budget and resources are available. The operational organization always has a working system, and end users are able to adapt gradually to the re-engineered system as it is delivered in increments.
Approaches to software re-engineering, and their use, are investigated in this thesis. This knowledge is applied to an existing legacy system, which processes student results.
1.2 Motivations of the Study
Software reengineering is being used to recover legacy systems and allow their evolution (Jacobson and Lindstrom, 1991). It is performed mainly to reduce maintenance costs and to improve development speed and system readability.
However, the importance of reengineering goes beyond such technical aspects. Legacy systems represent much of the knowledge produced and maintained by an organization, which it cannot afford to lose. Thus, reengineering allows this knowledge to be reused, in the form of reconstructed code and documentation.
The essence of software re-engineering is to improve or transform existing software so that it can be understood, controlled, and used anew. The need for software re-engineering has increased greatly, as the Usmanu Danfodiyo University Sokoto Student Information System has become obsolescent in terms of its architecture, the platform on which it runs, and its suitability and stability to support changing needs. Software re-engineering is important for recovering and reusing existing software assets, putting high software
maintenance cost under control, and establishing a base for future software evolution.
The demand for reengineering has been growing significantly over the years. The need for different business sectors to adapt their systems to the Web or to use other technologies is stimulating research into methods, tools and infrastructures that support the evolution of existing applications. Most Nigerian universities have now implemented online registration and result checking systems.
In conclusion, the points above are what motivated me to select the research topic, functional oriented to object oriented approach in software reengineering, and to select the Usmanu Danfodiyo University Sokoto Student Information System as the case study.
1.3 Problems Analysis
The combination of dated processes, techniques and technology, coupled with the long lifetimes over which legacy systems have been considerably changed, results in systems which suffer from several problems. These problems are: a. High maintenance costs: Legacy systems are generally associated with high maintenance costs. The root cause of this expense is the degraded structure that results from prolonged maintenance. Systems with contrived
structures are invariably complex, and understanding them requires considerable effort. b. Obsolete hardware and software: Many systems operate with obsolete hardware and/or software. For business critical applications, this is clearly undesirable, as component failures cannot be corrected. Organizations should strive to avoid this situation by assessing vendors of hardware and software for survivability. Where this situation does arise, the system should migrate to technology that is not only currently supported, but is expected to be supported over the anticipated lifetime of the system. c. Poor documentation: It is common for many legacy systems to be poorly documented, if documented at all. Documentation includes specifications, design and test documents, and user documents. Over several years, documentation typically becomes out of step with the system it documents. It is often the case that quick bug fixes are not documented. d. Poor understanding: Inconsistent documentation is misleading and adds confusion to the system understanding exercise. Where documentation is non-existent, maintainers must resort to the source code of legacy systems. In extreme scenarios, the source code is missing too, and the system must be understood by studying its behavior.
1.4 Objectives of the Study
a. The first objective is to provide knowledge on the extent to which it is possible to reuse software in a re-engineering process.
b. Examination of tools and techniques that allow for a smooth transformation of the student information system in Usmanu Danfodiyo University, Sokoto to one that uses more appropriate and robust advanced technologies. c. Improve maintainability of the student information system by re-designing the system with more appropriate functional modules and explicit interfaces. d. Migrate to newer technology. This involves migrating to a newer operating system/platform, database management system and programming language. e. To change the students’ information system from the DOS operating system to the Windows operating system, changing the system from a functional-oriented to an object-oriented architecture, from the functional-oriented programming language Dbase V to the object-oriented programming language PHP 5, migrating from the Dbase V database to the MySQL relational database, and from offline to online (a minimal code sketch of this migration follows below).
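To make the last objective concrete, the sketch below shows how a student record, formerly manipulated by free-standing Dbase V procedures over a flat file, might be expressed as a PHP 5 class. This is a minimal illustrative sketch, assuming hypothetical class and field names, and not the actual system design.

<?php
// Minimal sketch (hypothetical names): a student record as a PHP 5 class
// instead of free-standing procedures over a flat Dbase V file.
class Student {
    private $regNumber;
    private $name;
    private $courses = array();

    public function __construct($regNumber, $name) {
        $this->regNumber = $regNumber;
        $this->name = $name;
    }

    // Behaviour that was formerly a separate procedure now lives with
    // the data it manipulates.
    public function registerCourse($courseCode) {
        $this->courses[] = $courseCode;
    }

    public function getCourses() {
        return $this->courses;
    }
}

$s = new Student('0211107033', 'A. Musa');
$s->registerCourse('CS101');
print_r($s->getCourses());
?>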
1.5 Research Methodology
Traditionally, software engineering techniques have attempted to improve the process of new software development. These efforts have produced structured analysis/design, object-oriented analysis/design, Computer-Aided Software Engineering (CASE), etc. But what does an organization do with its legacy (i.e., existing) software created prior to the adoption of these wonderful new
methodologies? Legacy software still needs to be maintained even though its quality, performance, reliability, and maintainability are deteriorating.
Reengineering tools can capture design information from otherwise indecipherable software (i.e. spaghetti code). Unstructured software can be structured. And software/data can be ported to new languages, configurations, or platforms.
This thesis explores reengineering principles in changing the Student Information System from a functional-oriented offline system to an object-oriented online system. It changes the system from the DOS platform to the Windows platform, from offline to online, from Dbase V to object-oriented PHP (PHP 5), and from a flat file (Dbase V) to a relational database (MySQL).
1.6 Chapter Scheme
Chapter one is a general introduction. It involves background of the study, motivation of the study, problems analysis, objectives of the thesis, scope and limitation, research methodology and chapter scheme.
Chapter two is literature review. It consists of introduction, reengineering, reverse engineering, software reengineering and reverse engineering approaches, new research trends, software reengineering process model and technologies used in designing the Student Information System.
Chapter three is reverse engineering. It consists of introduction, reverse engineering to understand processing, reverse engineering to understand data and reverse engineering to understand user interface.
Chapter four is restructuring and forward engineering. It consists of introduction, code restructuring and unified modeling approach to object oriented analysis.
Chapter five is summary, conclusion and recommendation. It consists of introduction, summary, conclusion and recommendation.
CHAPTER TWO: LITERATURE REVIEW

2.1 Introduction
Most organizations stand at a crossroads of competitive survival, a crossroads created by the information revolution that is now shaping (and shaking) the world. Information Technology has gained so much importance that companies and governments must either step in, taking advantage of modern information processing technology, or lag behind and be bypassed. The point is that most of the critical software systems that companies and government agencies depend upon were developed many years ago, and maintenance is not enough to keep these systems up to date with technological changes.
Even worse than deprecation, Lehman and Belady (1985) empirically showed that, if no improvement is made, maintenance degrades the software quality and, therefore, its maintainability. As the amount of maintenance activity increases, the quality deteriorations, called aging symptoms (Visaggio, 2001), become heavier to manage, turning the legacy system into a burden that cannot be thrown away, and consuming vital resources from its owner.
To solve this problem, Visaggio (2001) presents some experimental evidence showing that the reengineering process can decrease some aging symptoms. However, even after reconstructing the legacy system, there is no guarantee that the new system will be more maintainable than before. For instance, object-
orientation, often considered the “solution for the software maintenance” (Bennett and Rajlich, 2000), created new problems for the maintenance (Ossher and Tarr, 1999) and should be used with caution to assure that the maintenance will not be more problematic than in traditional legacy systems.
Current reengineering methods and approaches do not address these problems, leaving the decisions up to the software engineers. This is especially true in the Reverse Engineering phase.
2.2 Reengineering
In general, reengineering is a way to achieve software reuse and to understand the concepts underlying the application domain. Its usage makes it easier to reason about reuse information in analysis and design documents which are not always available in legacy systems.
According to the literature (Sneed, 1995; Olsem, 1998; Bennett and Rajlich, 2000; Bianchi et al., 2003), reengineering is usually performed with one of the following objectives: a. Improve maintainability: maintenance efforts can be reduced through reengineering by producing smaller modules that are easier to maintain. The problem is that it is not simple to measure the impact of this activity. In fact, it may take years before the maintenance effort reductions can be
observed, making it difficult to determine whether these benefits were achieved through reengineering or for other reasons; b. Migration: reengineering can be used to move the software to a better or less expensive operational environment. It can also convert old programming languages into new programming languages, with more resources or higher flexibility; c. Achieve greater reliability: the reengineering process includes activities that reveal potential defects, such as redocumentation and testing, making the system more stable and trustworthy; and d. Preparation for functional enhancement: decomposing programs into smaller modules improves their structure and isolates them from each other, making it easier to change or add new functions without affecting other modules.
2.3 Reverse Engineering
Reverse engineering should produce, preferably in an automatic way, documents that help software engineers in understanding the system. Over the last ten years, reverse engineering research has produced a number of capabilities for analyzing code, including subsystem decomposition (Umar, 1997), concept synthesis (Biggerstaff et al., 1994), design, program and change pattern matching (Gamma et al., 1995; Stevens and Poley, 1998), analysis of static and dynamic dependencies (Systa, 1999), object-oriented metrics (Chidamber and Kemerer, 1994), and others. In general, these
approaches have been successful in treating the software at the syntactic level to address specific information needs and to span relatively narrow information gaps. The reverse engineering process is illustrated in Fig. 1 (Sommerville, 1996).
Fig. 1: Reverse engineering process (Sommerville, 1996)
The process starts with the analysis phase. During this phase, the legacy system may be analyzed by tools to obtain information regarding the system (design, specifications or architectural diagrams). The software engineers work at the code level to retrieve this information, which should then be stored, preferably as high abstraction level diagrams. In industrial-scale systems, reverse engineering is usually applied through semi-automated approaches using Computer-Aided Software Engineering (CASE) tools, which are used to analyze the semantics of the legacy code. The result of this process is sent to restructuring and rebuilding tools and to the forward engineering activities to complete the reengineering process.
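As a small illustration of the analysis phase, the sketch below builds a crude inventory of the functions defined in a set of source files, the kind of raw information a CASE tool would collect before producing higher-level diagrams. It is a minimal sketch, assuming a hypothetical source directory, and it only recognizes straightforward definitions.

<?php
// Minimal analysis-phase sketch: inventory the function definitions in a
// hypothetical directory of legacy source files.
function buildFunctionInventory($dir) {
    $inventory = array();
    $files = glob($dir . '/*.php');
    if ($files === false) {
        $files = array();
    }
    foreach ($files as $file) {
        foreach (file($file) as $lineNo => $line) {
            // Crude pattern: only catches straightforward definitions.
            if (preg_match('/function\s+(\w+)\s*\(/', $line, $m)) {
                $inventory[] = array(
                    'function' => $m[1],
                    'file'     => basename($file),
                    'line'     => $lineNo + 1,
                );
            }
        }
    }
    return $inventory;
}

print_r(buildFunctionInventory('legacy_src')); // hypothetical directory
?>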
2.4 Software Reengineering and Reverse Engineering Approaches
A great number of techniques and methods have been proposed to face the software reconstruction problem. This section presents a survey of these approaches, with the objective of defining a common set of requirements that should be addressed in an effective approach. There are several research trends in reverse engineering, but five stand out: i. Source-to-Source Translation; ii. Object Recovery; iii. Specification; iv. Incremental Approaches; and v. Component-Based Approaches.
These trends are described in the following sections.
2.4.1 Source-to-Source Translation Approaches
Essentially, all program translators (both source-to-source translators and compilers) operate via transliteration and refinement. The source program is first translated from the source language into the target language on a statement-by-statement basis. Various refinements are then applied in order to improve the quality of the output. Although acceptable in many situations, this approach is fundamentally limited in the reengineering context due to the low quality of the output produced: it tends to be insufficiently sensitive to global features of the source program and too sensitive to irrelevant local details.
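The idea of transliteration can be made concrete with a toy sketch. The two statement forms below are invented, Dbase-flavored examples, and the mapping is deliberately naive: each statement is rewritten in isolation, which is exactly why this approach misses global features of the source program.

<?php
// Toy statement-by-statement transliterator for two invented
// Dbase-flavored statement forms; real translators handle full grammars.
function transliterate($stmt) {
    // REPLACE field WITH "value"  becomes  $record['field'] = "value";
    if (preg_match('/^REPLACE (\w+) WITH (".*")$/', $stmt, $m)) {
        return '$record[\'' . $m[1] . '\'] = ' . $m[2] . ';';
    }
    // ? expr  becomes  echo expr;  (the Dbase print command)
    if (preg_match('/^\? (.+)$/', $stmt, $m)) {
        return 'echo ' . $m[1] . ';';
    }
    // Each statement is handled in isolation; no global view exists.
    return '// untranslated: ' . $stmt;
}

echo transliterate('REPLACE name WITH "Aisha"') . "\n";
echo transliterate('? total') . "\n";
?>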
The transformation process is controlled by dividing it into a number of phases, where each phase applies transformations selected from a small set. The transformations within each set are chosen in such a way that conflicts among
transformations will not arise. The Lisp-to-Fortran translator proposed by Boyle and Muralidharan (1984) is based on the transformational approach. The translator handles a subset of Lisp which does not include hard-to-translate features, such as the ability to create and execute new Lisp code. Since readability is not a goal of this translator, the readability of the output is abandoned in favor of producing reasonably efficient FORTRAN code. This translator is perhaps better thought of as a Lisp-to-FORTRAN compiler rather than a source-to-source translator.
The main advantage of source-to-source translation is that it is faster than traditional approaches, and it requires less expensive manual effort. However, it is clear that there are still some significant limitations in the quality of the output that is produced by typical translators. This can be clearly seen in source-to-source translation, where human intervention is typically required in order to produce acceptable output.
Waters (1988) presents an alternative translation paradigm: abstraction and reimplementation. In this paradigm, the source program is first analyzed in order to obtain a programming-language-independent understanding of the computation performed by the program as a whole. Based on this understanding, the program is reimplemented in the target language. In contrast to the standard translation approach of transliteration and refinement, translation via analysis and reimplementation utilizes an in-depth understanding of the source program, which makes it possible for the translator to generate the target code without being constrained by irrelevant details.
2.4.2 Object Recovery and Specification Approaches
The buzzword of the late 1980s and early 1990s was the Object-Oriented (OO) paradigm. The OO paradigm offers some desirable characteristics, which significantly help to improve software reuse. It was the predominant software trend of the 1990s, and it was expected to enhance maintainability, reduce the error rate, increase productivity and make the data processing world a better place to live (Meyer, 1997).
The idea of applying OO reverse engineering is that in a simple way and with limited efforts the software engineer can derive a model from an existing system. With a model, one can better reason about a change to be performed, its extent, and how it shall be mapped onto the existing system. Moreover, the derived model can serve as a basis for a future development plan.
The first relevant work involving the OO technology was presented by Jacobson and Lindstrom (1991), who applied reengineering to legacy systems that were implemented in procedural languages, such as C or COBOL, obtaining OO systems. They stated that reengineering should be accomplished in a gradual way, because it would be impractical to substitute a whole
working system for a completely new one (which would demand many resources). They consider three different scenarios: i. Changing the implementation without changing functionality; ii. Partial changes in the implementation without changing functionality; and iii. Changes in the functionality.
Object-orientation was used to accomplish this modularization. Jacobson and Lindstrom (1991) used a specific CASE tool and defend the idea that reengineering processes should focus on tools to aid the software engineer. They also state that reengineering should be incorporated as a part of the development process, instead of being a substitute for it.
Gall and Klöch (1994) propose heuristics to find objects based on Data Store Entities (DSEs) and Non-Data Store Entities (NDSEs), which act primarily over tables representing basic data dependency information. The tables are called (m,u)-tables, since they store information on the manipulation (m) and use (u) of variables. With the help of an expert, a basic assistant model of the OO application architecture is devised for the production of the final generated Reverse Object-Oriented Application Model (ROOAM).
Yeh et al. (1995) propose a more conservative approach based not just on the search for objects, but directed towards the finding of Abstract Data Types
(ADTs). Their approach, called OBAD, is encapsulated by a tool, which uses a data dependency graph between procedure and structure types as a starting point for the selection of ADT candidates. The procedures and structure types are the graph nodes, and the references between the procedures and the internal fields are the edges. The set of connected components in this graph forms the set of candidate ADTs.
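The graph step of this OBAD-style recovery can be sketched briefly: procedures and structure types are nodes, references are edges, and each connected component of the graph is a candidate ADT. The example data below is invented for illustration.

<?php
// Sketch: connected components of a procedure/structure reference graph;
// each component is one candidate ADT (example data invented).
function connectedComponents($nodes, $edges) {
    $adj = array_fill_keys($nodes, array());
    foreach ($edges as $e) {
        list($a, $b) = $e;
        $adj[$a][] = $b;
        $adj[$b][] = $a;
    }
    $seen = array();
    $components = array();
    foreach ($nodes as $n) {
        if (isset($seen[$n])) {
            continue;
        }
        $stack = array($n);   // depth-first search from an unseen node
        $comp = array();
        while ($stack) {
            $v = array_pop($stack);
            if (isset($seen[$v])) {
                continue;
            }
            $seen[$v] = true;
            $comp[] = $v;
            foreach ($adj[$v] as $w) {
                $stack[] = $w;
            }
        }
        $components[] = $comp;
    }
    return $components;
}

$nodes = array('StudentRec', 'addStudent', 'printStudent', 'CourseRec', 'addCourse');
$edges = array(
    array('addStudent', 'StudentRec'),
    array('printStudent', 'StudentRec'),
    array('addCourse', 'CourseRec'),
);
print_r(connectedComponents($nodes, $edges)); // two candidate ADTs
?>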
Wilkening et al (1995) presented a process for legacy systems reengineering using parts of their implementation and design. The process begins with preliminary source code restructuring, to introduce some improvements, such as the removal of non-structured constructions, “dead” code (e.g., code that is not accessible from any of the system’s starting points) and implicit types. The purpose of this preliminary restructuring is to produce a program that is easier to analyze, understand and restructure. Then, the produced source code is analyzed and its representations are built at a high abstraction level. Those representations are used in the subsequent restructuring steps, redesign and redocumentation, which are repeated as many times as necessary in order to obtain a fully restructured system. Then, reimplementation and tests are performed, finalizing the reengineering process.
According to Wilkening et al (1995), the transformation of procedural programs to an OO structure presupposes that the programs are structured and modular, otherwise they cannot be transformed. A program is said to be
structured if it has no GOTO-like branches from one code segment to another, for instance, and is said to be modular if it is divided into a hierarchy of code segments, each one with a single entry and a single exit. The segments, or procedures, should correspond to the elementary operations of the program, and each operation should be reachable by invoking it from a higher level routine. The other prerequisite for a modular program is that there must be a procedural calling tree for the subroutine hierarchy, so that all subroutines are included as part of the procedure which calls or performs them.
Another fundamental work in OO reengineering is presented in Sneed (1996), which describes a tool-aided reengineering process to extract objects from existing COBOL programs. Sneed emphasizes the predominance of the object technology, mainly in distributed applications with graphic interfaces, questioning the need to migrate legacy systems to that technology. He identifies obstacles to OO reengineering, such as object identification, the procedural nature of most legacy systems, code redundancy and the arbitrary use of names.
Like Wilkening et al (1995), Sneed assumes as prerequisites for OO reengineering the structuring and modularization of programs. Sneed also advocates the need for the existence of a system call tree, to identify procedure calls within the system. Sneed’s OO reengineering process is composed of five steps: i. Object selection, ii. Operation extraction, iii. Feature inheritance, iv. Redundancy elimination and v. Syntax conversion.
Step (i) must be performed by the user, optionally supported by a tool. Step (ii) extracts operations performed on the selected objects, replacing the removed code segments with calls to their respective operations. Step (iii) creates attributes in the objects, to represent the data accessed by the operations removed in step (ii). Step (iv) merges similar classes and removes redundant classes. Finally, step (v) converts the remaining classes into Object COBOL, in a straightforward conversion process.
The transformation of procedural programs into OO programs is not trivial. It is a complex, multi-step, m:n transformation process which requires human intervention in defining what objects should be chosen. Although there are already some commercially available tools to help the task of reverse engineering, a still bigger effort is needed to face the complexity of the existing systems, not only because of their size, but also due to their intrinsic complexity.
2.4.3 Incremental Approaches
The approaches presented in sections 2.4.1 and 2.4.2 involve the entire system reconstruction as an atomic operation. For this reason, the software must be frozen until the process execution has been completed; in other words, no changes are possible during this period. In fact, if a change or enhancement is introduced, the legacy system and its replacing candidate become incompatible, and software engineers would have to restart the whole process. This situation causes an undesirable loop between the maintenance and the reengineering process.
To overcome this problem, several researchers have suggested wrapping the legacy system, and considering it as a black-box component to be reengineered. Due to the iterative nature of this reengineering process, during its execution the system will include both reengineered and legacy components, coexisting and cooperating in order to ensure the continuity of the system. Finally, any maintenance activities, if required, have to be carried out on both the reengineered and the legacy components, depending on the procedures impacted by the maintenance.
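The wrapping idea can be pictured with a short sketch: a shared interface lets an untouched legacy routine and its reengineered replacement coexist behind the same calls while the iterative process runs. All names below are hypothetical, and the legacy routine is represented by a stand-in function.

<?php
// Sketch: legacy and reengineered components coexisting behind one
// interface during iterative reengineering (all names hypothetical).
function legacy_compute_gpa($regNumber) { return 2.8; } // legacy stand-in

interface ResultProcessor {
    public function computeGPA($regNumber);
}

class LegacyResultWrapper implements ResultProcessor {
    public function computeGPA($regNumber) {
        // Delegates to the untouched legacy routine, treated as a black box.
        return legacy_compute_gpa($regNumber);
    }
}

class ReengineeredResultProcessor implements ResultProcessor {
    public function computeGPA($regNumber) {
        return 3.5; // placeholder for the new object-oriented logic
    }
}

// Callers depend only on the interface, so components can be swapped
// one at a time as the reengineering proceeds.
function printGPA(ResultProcessor $p, $regNumber) {
    echo $p->computeGPA($regNumber) . "\n";
}

printGPA(new LegacyResultWrapper(), '0211107033');
printGPA(new ReengineeredResultProcessor(), '0211107033');
?>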
The first important iterative process was proposed in (Olsem, 1998). According to Olsem, legacy systems are formed by four classes of components (Software/Application, Data Files, Platforms and Interfaces) that cannot be dealt with in the same way. The incremental reengineering process
proposed by Olsem uses different strategies for each class of components, reducing the probability of failure during the process.
Another important contribution from Olsem’s work is his proposal of two ways to perform incremental reengineering: with re-integration, in which the reconstructed modules are reintegrated into the legacy system, and without reintegration, in which the modules are identified, isolated and reconstructed, maintaining the interface with the modules that were not submitted to the process through a mechanism called “Gateway”.
Bianchi et al (2003) presented an iterative model for reengineering aged legacy systems. The proposed process is similar to Olsem’s; however, the Bianchi et al. iterative model focuses on components as the result of the reengineering. Its characteristics are: the reengineering is gradual, i.e., it is iteratively executed on different components (data and functions) in different phases; and, during the execution of the process, legacy components, components currently undergoing reengineering, reengineered components, and new components, added to the system to satisfy new functional requests, must coexist in harmony.
In another work, Zou and Kontogiannis (2003) propose an incremental source code transformation framework that allows procedural systems to be migrated to modern object oriented platforms. Initially, the system is parsed and a high
level model of the source code is extracted. The framework introduces the concept of a unified domain model for a variety of procedural languages such as C, Pascal, COBOL, and FORTRAN. Next, to keep the complexity and the risk of the migration process into manageable levels, a clustering technique allows the decomposition of large systems into smaller manageable units. A set of source code transformations allow the identification of an object model from each unit. Finally, an incremental merging process allows the amalgamation of the different partial object models into an aggregate composite model for the whole system.
There are several benefits associated with iterative processes: by using the “divide-and-conquer” technique, the problem is divided into smaller units, which are easier to manage; the outcomes and return on investment are immediate and concrete; the risks associated with the process are reduced; errors are easier to find and correct, not putting the whole system at risk; and it guarantees that the system will continue to work even during execution of the process, preserving the maintainers’ and users’ familiarity with the system (Bianchi et al., 2000).
2.4.4 Component-Based Approaches
Currently, on top of object-oriented techniques, an additional layer of software development, based on components, is being established. The goals of “componentware” are very similar to those of object-orientation: reuse of
software is to be facilitated and thereby increased, and software shall become more reliable and less expensive (Lee et al., 2003).
As was discussed in previous sections, Component-Based Development (CBD) is not a new idea: McIlroy proposed using modular software units as early as 1968 (McIlroy, 1969). The extraction of reusable software components from entire systems is an attractive idea, since software objects and their relationships incorporate a large amount of experience from past development. It is necessary to reuse this experience in the production of new software (Caldiera and Basili, 1991).
Among the first research works in this direction, Caldiera and Basili (1991) explore the automated extraction of reusable software components from existing systems. Caldiera and Basili propose a process that is divided into two phases. First, some candidates from the existing system are chosen and packaged for possible independent use. Next, an engineer with knowledge of the application domain analyzes each component to determine the services it can provide. The approach is based on software models and metrics. According to Caldiera and Basili, the first phase can be fully automated, “reducing the amount of expensive human analysis needed in the second phase by limiting analysis to components that really look worth considering”.
Years later, Neighbors (1996) presented some informal research, performed over a period of 12 years, from 1980 to 1992, with interviews and the examination of legacy systems, in an attempt to provide an approach for the extraction of reusable components. Although the work does not present conclusive ideas, it gives several important clues regarding large systems. According to Neighbors, the architecture of large systems is a trade-off between top-down functional decomposition and bottom-up support of layers of Application Programming Interfaces (APIs) or virtual machines. Therefore, attempts to partition a system according to only one of these approaches will not succeed. A better partitioning approach is based on the concept of sub-systems, which are encapsulations convenient to system designers, maintainers and managers. The subsequent step, which may be performed manually or automatically, comprises their extraction into reusable components.
Another work involving software components and reengineering may be seen in (Alvaro et al., 2003), where they presented a software reengineering CASE environment based on components, called Orion-RE. The environment uses software reengineering and Component-Based techniques to rebuild legacy systems, reusing the available documentation and the built-in knowledge that is in their source code. A software process model drives the environment usage through the reverse engineering, to recover the system design, and forward engineering, where the system is rebuilt using modern technologies,
such as design patterns, frameworks, CBD principles and middleware. Alvaro et al. observed some benefits in the reconstructed systems, such as a greater degree of reuse and easier maintenance and also observed benefits due to the automation achieved through the CASE environment.
Lee et al. (2003) proposed a similar approach, in which they present a process to reengineer an object-oriented legacy system into a component-based system. The components are created based upon the original class relationships, which are determined by examining the program source code. The process is composed of two parts: (i) creation of basic components with composition and inheritance relationships between constituent classes and (ii) refinement of the intermediate component-based system using metrics proposed by Lee et al., which include connectivity strength, cohesion, and complexity. Finally, the approach proposed by Lee et al. is based on a formal system model, which reduces the possibility of misunderstanding a system and enables operations to be correctly executed.
These four approaches are examples of the current trend in reverse engineering research, as observed by Keller et al (1999). Component-Based approaches are being considered in reverse engineering, mainly due to their benefits in reuse and maintainability. However, a complete methodology to reengineer legacy systems into component-based systems is still lacking, but this lack is not restricted to reverse engineering. As can be seen in (Bass et al.,
2000), the problems faced when considering Component-Based approaches in reengineering are only a small subset of the problems related to Component-Based Software Engineering in general. While these problems remain unsolved, reengineering may never achieve the benefits associated with software components.
2.5 New Research Trends
Fig. 2 summarizes the survey presented in Section 2.4. In summary, the first works focused on source-to-source translation, without worrying about the readability and quality of the generated products. Afterwards, with the appearance of OO technology, there was an increasing concern over the quality of source code and documentation. However, its appearance also introduced problems related to paradigm change, since most legacy systems were procedural. To reduce these problems, incremental approaches were proposed as alternatives to give more flexibility to the processes, allowing the coexistence of legacy and reengineered systems.
Fig. 2: Timeline of research on reengineering and reverse engineering (Eduardo et al., 2007)
Component-Based approaches do not follow this evolution (source-to-source → object-orientation → incremental approaches) and have been sparsely researched over the years. This may be explained by the recent nature of CBD and the problems associated with it, inhibiting researchers in their efforts. Although the isolated efforts never formed a real trend in the reengineering research area, this scenario is currently changing, as can be seen recently with the concentration of efforts in this direction.
Currently unresolved issues include reducing the dispersal of functionality and increasing modularity, which assist in the maintenance and the evolution of the reengineered systems. Decomposing a system into small parts is a good start towards achieving these benefits. Unfortunately, some requirements, such as exception handling or logging, are inherently difficult to decompose and isolate. Object-orientation techniques and Design Patterns (Gamma et al., 1995) may reduce these problems, but they are not enough (Kiczales et al., 1997).
Aspect-Oriented Software Development (AOSD) came up in the 1990s as a paradigm aimed at achieving the “separation of crosscutting concerns (or aspects)”, through code generation techniques that combine (weave) aspects into the logic of the application (Kiczales et al., 1997). According to aspect-oriented concepts, aspects modify software components through static and
dynamic mechanisms. The static mechanisms are concerned with the addition of state and behavior to the classes, while the dynamic mechanisms modify the semantics of these at execution time. These aspects are implemented as separate modules, so that it is transparent how aspects act and how they relate to other components (Filman and Friedman, 2000).
Currently, with AOSD technologies being adopted and extended, new challenges and innovations start to appear. AOSD languages, such as AspectJ and AspectS, contributions from several research groups and the recent integration with application servers, such as JBOSS and BEA’s WebLogic, demonstrate the potential of AOSD in solving real problems, including those pursued in reengineering.
Investigations of AOSD in the literature have involved determining the extent to which it can be used to improve software development and maintenance, along the lines discussed by Bayer (2000). AOSD can be used to reduce code complexity and tangling (intertwined code in a confused mass); it also increases modularity and reuse, which address the main problems currently faced by software reengineering. Thus, some works that use AOSD ideas in reengineering may be found in the recent literature.
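PHP has no native aspect weaver, but the flavor of separating a crosscutting concern can be sketched with a decorator: in the invented example below, the logging concern lives in one module instead of being tangled through every class. This is only an approximation of what AOSD languages such as AspectJ automate through weaving.

<?php
// Sketch: confining the crosscutting logging concern to one decorator,
// a plain-PHP approximation of what aspect weaving automates.
interface Registration {
    public function register($regNumber, $courseCode);
}

class BasicRegistration implements Registration {
    public function register($regNumber, $courseCode) {
        // Core business logic only; no logging tangled in.
        return true;
    }
}

class LoggingRegistration implements Registration {
    private $inner;

    public function __construct(Registration $inner) {
        $this->inner = $inner;
    }

    public function register($regNumber, $courseCode) {
        // The crosscutting concern lives in this one module.
        error_log("register($regNumber, $courseCode) called");
        return $this->inner->register($regNumber, $courseCode);
    }
}

$reg = new LoggingRegistration(new BasicRegistration());
$reg->register('0211107033', 'CS101');
?>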
In the case study of Kendall (2000), existing object-oriented models are used as the starting point for reengineering with aspect-oriented techniques. In this
work, Kendall did not describe the reengineering process, the techniques, the mechanisms or the steps followed in full detail. He just reported comparative results between the object-oriented code and the aspect-oriented code generated by the reengineering. The use of AOSD in this case study reduced the overall module count (number of classes and methods) and Lines Of Code (LOC): there was a reduction of 30 methods and 146 lines of code in the aspect-oriented system.
Sant’anna et al. (2003) present some metrics for comparisons between aspect-oriented and object-oriented designs and implementations, which may serve to evaluate the product of the reengineering process.
Lippert and Lopes (2000) present a study that points out the ability of AOSD to facilitate the separation of the exception detection and handling concern. The case study involved examining and reengineering a framework built in Java, using AspectJ (Kiczales et al., 2001).
An approach to retrieve the knowledge embedded in an object-oriented legacy system using AOSD is presented in Garcia et al. (2005). This approach, called the Phoenix approach, aids in the migration from object-oriented code, written in Java, to a combination of objects and aspects, using AspectJ. It uses aspect mining in order to identify possible crosscutting concerns in the OO source code and extracts them through refactoring into new aspect-oriented code.
Next, the aspect-oriented design is retrieved through software transformations and may be imported into a CASE tool, becoming available at higher abstraction levels. The retrieved information constitutes important knowledge that may be reused in future projects or in reengineering.
An evaluation was conducted to demonstrate the usefulness of the reengineering process. By following Phoenix, it could be verified that AOSD brings several important benefits to software development. The way the aspects are combined with the system modules allows the inclusion of additional responsibilities without compromising the clarity, maintainability and reusability of the code, while providing reliability.
In the last two decades, software engineering has focused on application modeling, analysis, simulation, and semi-automated implementation of larger and larger software systems. Recently, the diffusion of personal wireless devices and their highly distributed, heterogeneous, and mobile nature have raised new challenges that permeate the entire software engineering life cycle. Some research involving personal wireless devices has been developed in many directions. This new set of challenges is more appropriately characterized as “programming in the small and many” (Medvidovic et al., 2003).
The spread of personal wireless devices has raised the need to port existing (legacy) desktop applications to this new environment. Unfortunately, these applications are often too large. Limited processing power and storage capacity affect the possibility of performing complex computations and working with large datasets, preventing mobile devices from accessing a wide range of enterprise technologies and resources.
One important work is presented by Canfora et al. (2004), called the Thin Client aPplications for limiTed dEvices (TCPTE) approach. In this work, they present an approach based on the thin client model, implemented through a framework which enables the seamless migration of Java desktop applications to mobile devices with limited resources. The main benefit related to the use of TCPTE is the semi-automatic reengineering of legacy Java AWT applications for porting to personal wireless devices.
In the last decade, many methods and tools have been proposed to port legacy applications toward new environments through the migration of their user interfaces (UIs) (Canfora et al., 2004). The best known approaches can be classified into: output analysis, code analysis and wrapping, and translation.
These three methods have been primarily used to migrate legacy systems using character based UIs directly onto graphical user interfaces (GUIs) (Merlo et al., 1995; Aversano et al., 2001) or onto an abstract description for a
successive implementation on GUIs, WEB or WAP UIs (Kapoor and Stroulia, 2001), or to migrate GUIs from one toolkit to a new one (Moore and Moshkina, 2000).
With the translation approach (Paulson, 2001), the legacy application code is transformed in order to employ the primitives supported by the target platform.
Finally, with the library substitution approach, the legacy application code does not need to be modified because the GUI framework is replaced at linking time.
Far too often, architecture descriptions of existing software systems are out of synchronization with the implementation. If so, they must be reconstructed, but this is a very challenging task.
2.6 Software Reengineering Process Model

Reengineering is defined as the examination and alteration of software to reconstitute it in a new form, and includes the subsequent implementation of the new form. The reengineering paradigm shown in Fig. 3 is a cyclical model. This means that each of the activities presented as part of the paradigm may be
revisited. For any particular cycle, the process can terminate after any one of these activities. It consists of six steps.
Fig. 3: Software reengineering process model (Pressman, 2001)

a. Inventory Analysis

Every software organization should have an inventory of all applications. The inventory can be nothing more than a spreadsheet model containing information that provides a detailed description (e.g., size, age, business criticality) of every active application. By sorting this information according to business criticality, longevity, current maintainability, and other locally important criteria, candidates for reengineering appear. Resources can then be allocated to candidate applications for reengineering work.
It is important to note that the inventory should be revisited on a regular cycle.
The status of applications (e.g., business criticality) can change as a function of time, and as a result, priorities for reengineering will shift.

b. Document Restructuring
Weak documentation is the trademark of many legacy systems. But what do we do about it? What are our options? i. Creating documentation is far too time consuming. If the system works, we’ll live with what we have. In some cases, this is the correct approach. It is not possible to re-create documentation for hundreds of computer programs. If a program is relatively static, is coming to the end of its useful life, and is unlikely to undergo significant change, let it be! ii. Documentation must be updated, but we have limited resources. We’ll use a “document when touched” approach. It may not be necessary to fully redocument an application. Rather, those portions of the system that are currently undergoing change are fully documented. Over time, a collection of useful and relevant documentation will evolve. iii. The system is business critical and must be fully redocumented. Even in this case, an intelligent approach is to pare documentation to an essential minimum.
Each of these options is viable. A software organization must choose the one that is most appropriate for each case.
c. Reverse Engineering
The term reverse engineering has its origins in the hardware world. A company disassembles a competitive hardware product in an effort to understand its competitor's design and manufacturing "secrets." These secrets could be easily understood if the competitor's design and manufacturing specifications were obtained. But these documents are proprietary and unavailable to the company doing the reverse engineering. In essence, successful reverse engineering derives one or more design and manufacturing specifications for a product by examining actual specimens of the product.
Reverse engineering for software is quite similar. In most cases, however, the program to be reverse engineered is not a competitor's. Rather, it is the company's own work (often done many years earlier). The "secrets" to be understood are obscure because no specification was ever developed. Therefore, reverse engineering for software is the process of analyzing a program in an effort to create a representation of the program at a higher level of abstraction than source code. Reverse engineering is a process of design recovery. Reverse engineering tools extract data, architectural, and procedural design information from an existing program.
d. Code Restructuring
The most common type of reengineering (actually, the use of the term reengineering is questionable in this case) is code restructuring. Some legacy systems have relatively solid program architecture, but individual modules
were coded in a way that makes them difficult to understand, test, and maintain. In such cases, the code within the suspect modules can be restructured.
To accomplish this activity, the source code is analyzed using a restructuring tool. Violations of structured programming constructs are noted and code is then restructured (this can be done automatically). The resultant restructured code is reviewed and tested to ensure that no anomalies have been introduced. Internal code documentation is updated.
e. Data Restructuring
A program with weak data architecture will be difficult to adapt and enhance. In fact, for many applications, data architecture has more to do with the long-term viability of a program than the source code itself.
Unlike code restructuring, which occurs at a relatively low level of abstraction, data restructuring is a full-scale reengineering activity. In most cases, data restructuring begins with a reverse engineering activity. Current data architecture is dissected and necessary data models are defined. Data objects and attributes are identified, and existing data structures are reviewed for quality.
When data structure is weak (e.g., flat files are currently implemented, although a relational approach would greatly simplify processing), the data are reengineered.
Because data architecture has a strong influence on program architecture and the algorithms that populate it, changes to the data will invariably result in either architectural or code-level changes.
f. Forward Engineering
Forward engineering, also called renovation or reclamation, not only recovers design information from existing software, but uses this information to alter or reconstitute the existing system in an effort to improve its overall quality. In most cases, reengineered software reimplements the function of the existing system and also adds new functions and/or improves overall performance.
2.7 Technologies Used in Designing the Student Information System
The student information system designed is an online information system. It enables students to complete their registration online. It also enables MIS staff to enter student results and produce all needed reports online. It is designed using PHP (Hypertext Preprocessor), Apache and MySQL.
PHP, Apache, and MySQL are all part of the open source group of software programs. The open source movement is basically a collaboration of some of the finest minds in computer programming. By allowing the open exchange of
information, programmers from all over the world contribute to make a truly powerful and efficient piece of software available to everyone. Through the contributions of many people to the publicly available source code, bugs get fixed, improvements are made, and a "good" software program becomes a "great" one over time.
a. Apache
Apache acts as our Web server. Its main job is to parse any file requested by a browser and display the correct results according to the code within that file. Apache is quite powerful and can accomplish virtually any task that you, as a Webmaster, require.
The version of Apache used is 2.0.52. The features and server capabilities available in this version include the following:
i. Password-protected pages for a multitude of users.
ii. Customized error pages.
iii. Display of code in numerous levels of HTML, and the capability to determine at what level the browser can accept the content.
iv. Usage and error logs in multiple and customizable formats.
v. Virtual hosting for different IP addresses mapped to the same server.
vi. DirectoryIndex directives to multiple files.
vii. URL aliasing or rewriting with no fixed limit.
It can be used to host a Web site for the general public or a company-wide intranet, or simply to test your pages before they are uploaded to a secure server on another machine.
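To make features v and vi above concrete, the fragment below is a hypothetical httpd.conf excerpt; the IP address, host name, paths and log locations are assumptions for illustration only.

    # Feature v: a virtual host bound to one IP address on the server
    <VirtualHost 192.168.0.10:80>
        ServerName sis.example.edu
        DocumentRoot /var/www/sis
        # Feature vi: DirectoryIndex directives naming multiple candidate files
        DirectoryIndex index.php index.html default.htm
        # Feature ii: a customized error page
        ErrorDocument 404 /errors/notfound.html
        # Feature iv: a usage log in a customizable format
        CustomLog /var/log/apache/sis_access.log combined
    </VirtualHost>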
b. PHP
PHP is a server-side scripting language that allows your Web site to be truly dynamic. PHP is a recursive acronym for PHP: Hypertext Preprocessor. Its flexibility and relatively small learning curve (especially for programmers who have a background in C, Java, or Perl) make it one of the most popular scripting languages around. PHP's popularity continues to increase as businesses and individuals everywhere embrace it as an alternative to Microsoft's ASP language and realize that PHP's benefits most certainly outweigh the costs.
According to Zend Technologies, Ltd., the central source of PHP improvements and designers of the Zend Engine, which supports PHP applications, PHP code can now be found in approximately 9 million Web sites.
PHP was conceived in 1994 and was originally the work of one man, Rasmus Lerdorf. It was adopted by other talented people and has gone through four major rewrites to bring us the broad, mature product we see today.
PHP is an Open Source product, which means you have access to the source and can use, alter, and redistribute it all without charge. PHP originally stood for Personal Home Page, but the name was changed in line with the GNU recursive naming convention and now stands for PHP: Hypertext Preprocessor.
The current major version of PHP is 5. This version is a complete rewrite of the underlying Zend Engine and brings some major improvements to the language.
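As a minimal illustration of what "server-side scripting" means, the hypothetical page below mixes HTML with PHP; the PHP runs on the server and only its HTML output reaches the browser. The page title and text are assumptions for illustration.

    <html>
    <head><title>Student Information System</title></head>
    <body>
    <?php
    // Executed on the server; the browser receives only the generated HTML.
    $today = date('j F Y');
    echo "<p>Welcome to the Student Information System.</p>";
    echo "<p>Today is $today.</p>";
    ?>
    </body>
    </html>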
i. Some of PHP’s Strength Some PHP’s main competitors are Perl, Microsoft ASP.NET, Java Server Pages (JSP), and ColdFusion. In comparison to these products, PHP has much strength, including the following:
Performance
PHP is very efficient. Using a single inexpensive server, you can serve millions of hits per day, and if you use a large number of commodity servers, your capacity is effectively unlimited. Benchmarks published by Zend Technologies (http://www.zend.com) show PHP outperforming its competitors.
Database Integration
PHP has native connections available to many database systems. In addition to MySQL, you can directly connect to PostgreSQL, mSQL, Oracle, dbm, FilePro, Hyperwave, Informix, InterBase, and Sybase databases, among others. PHP 5 also has a built-in SQL interface to a flat file, called SQLite.
Using the Open Database Connectivity Standard (ODBC), you can connect to any database that provides an ODBC driver. This includes Microsoft products and many others.
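A minimal sketch of PHP's native MySQL connectivity using the mysqli extension bundled with PHP 5; the host, credentials, database and table names here are assumptions for illustration.

    <?php
    // Connect to the MySQL server (connection details are assumed).
    $db = new mysqli('localhost', 'sis_user', 'secret', 'sis');
    if (mysqli_connect_errno()) {
        die('Connection failed: ' . mysqli_connect_error());
    }
    // Run a query and walk through the result set.
    $result = $db->query('SELECT adm_no, name FROM student');
    while ($row = $result->fetch_assoc()) {
        echo $row['adm_no'] . ' ' . $row['name'] . "\n";
    }
    $db->close();
    ?>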
Built-in Libraries
Because PHP was designed for use on the web, it has many built-in functions for performing useful web-related tasks. You can generate GIF images on the fly, connect to web servers and other network services, parse XML, send e-mail, work with cookies, and generate PDF documents, all with just a few lines of code.
Cost
PHP is free. You can download the latest version at any time from http://www.php.net for no charge.
Ease of Learning PHP
The syntax of PHP is based on other programming languages, primarily C and Perl. If you already know C or Perl, or a C-like language such as C++ or Java, you will be productive using PHP almost immediately.
Object-Oriented Support
PHP version 5 has well-designed object-oriented features. If you have learned to program in Java or C++, you will find the features (and generally the syntax) that you expect, such as inheritance, private and protected attributes and methods, abstract classes and methods, interfaces, constructors, and destructors. You will even find some less common features, such as built-in iteration behavior. Some of this functionality was available in PHP versions 3 and 4, but the object-oriented support in version 5 is much more complete.
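The fragment below sketches some of these PHP 5 object-oriented features (visibility modifiers, constructors, an abstract method, inheritance) using class names invented for illustration.

    <?php
    abstract class Person {
        protected $name;                       // protected: visible to subclasses only
        public function __construct($name) {   // constructor
            $this->name = $name;
        }
        abstract public function describe();   // abstract: subclasses must implement
    }

    class Student extends Person {             // inheritance
        private $admNo;                        // private: visible to this class only
        public function __construct($name, $admNo) {
            parent::__construct($name);
            $this->admNo = $admNo;
        }
        public function describe() {
            return "$this->name ($this->admNo)";
        }
    }

    $s = new Student('Aisha Bello', '0211105032');
    echo $s->describe();   // prints: Aisha Bello (0211105032)
    ?>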
Portability
PHP is available for many different operating systems. You can write PHP code on free UNIX-like operating systems such as Linux and FreeBSD, on commercial UNIX versions such as Solaris and IRIX, or on different versions of Microsoft Windows. Well-written code will usually work without modification on a different system running PHP.
Source Code
You have access to PHP's source code. With PHP, unlike commercial, closed-source products, if you want to modify something or add to the language, you are free to do so. You do not need to wait for the manufacturer to release patches. You also don't have to worry about the manufacturer going out of business or deciding to stop supporting a product.
Availability of Support
Zend Technologies (www.zend.com), the company behind the engine that powers PHP, funds its PHP development by offering support and related software on a commercial basis.
c. MySQL
MySQL (pronounced My-Ess-Que-Ell) is a very fast, robust, relational database management system (RDBMS). A database enables you to efficiently store, search, sort, and retrieve data. The MySQL server controls access to your data to ensure that multiple users can work with it concurrently, to provide fast access to it, and to ensure that only authorized users can obtain access. Hence, MySQL is a multi-user, multithreaded server. It uses Structured Query Language (SQL), the standard database query language worldwide.
MySQL has been publicly available since 1996 but has a development history going back to 1979. It is the world’s most popular open source database and has won the Linux Journal Readers’ Choice Award on a number of occasions.
A complete list of features can be found at the MySQL website (www.mysql.com); some of the most popular features of MySQL are as follows:
i. Features of MySQL
Multiple CPUs usable through kernel threads.
Multi-platform operation.
Numerous column types cover virtually every type of data.
Group functions for mathematical calculations and sorting.
Commands that allow information about the databases to be easily and succinctly shown to the administrator.
Function names that do not affect table or column names.
A password and user verification system for added security.
Up to 32 indexes per table permitted; this feature has been successfully implemented at levels of 60,000 tables and 5,000,000,000 rows.
International error reporting usable in many different countries.
ii. Some of MySQL's Strengths
MySQL's main competitors are PostgreSQL, Microsoft SQL Server, and Oracle. MySQL has many strengths, including the following:
Performance
MySQL is undeniably fast. You can see the developer’s benchmark page at http://web.mysql.com/benchmark.html. Many of these benchmarks show MySQL to be orders of magnitude faster than the competitors. In 2002, eWeek published a benchmark comparing five databases powering a web application. The best result was a tie between MySQL and the much more expensive Oracle.
Low Cost
MySQL is available at no cost under an Open Source license or at low cost under a commercial license. You need a license if you want to redistribute MySQL as part of an application and do not want to license your application under an Open Source license. If you do not intend to distribute your application or are working on free software, you do not need to buy a license.
Ease of Use
Most modern databases use SQL. If you have used another RDBMS, you should have no trouble adapting to this one. MySQL is also easier to set up than many similar products.
Portability
MySQL can be used on many different UNIX systems as well as under Microsoft Windows.
Source Code
As with PHP, you can obtain and modify the source code for MySQL. This point is not important to most users most of the time, but it provides you with excellent peace of mind, ensuring future continuity and giving you options in an emergency.
Availability of Support
Not all Open Source products have a parent company offering support, training, consulting, and certification, but you can get these benefits from www.mysql.com.
CHAPTER THREE: REVERSE ENGINEERING
3.1 Introduction
The reverse engineering process is represented in Figure 4. Before reverse engineering activities can commence, unstructured (“dirty”) source code is restructured so that it contains only the structured programming constructs. This makes the source code easier to read and provides the basis for all the subsequent reverse engineering activities. The core of reverse engineering is an activity called extract abstractions. The engineer must evaluate the old program and from the (often undocumented) source code, extract a meaningful specification of the processing that is performed, the user interface that is applied, and the program data structures or database that is used.
Fig. 4 The reverse engineering process model (Pressman, 2001)
3.2 Reverse Engineering to Understand Processing
The first real reverse engineering activity begins with an attempt to understand and then extract procedural abstractions represented by the source code. To understand procedural abstractions, the code is analyzed at varying levels of abstraction: system, program, component, pattern, and statement.
The overall functionality of the entire application system must be understood before more detailed reverse engineering work occurs. This establishes a context for further analysis and provides insight into interoperability issues among applications within the system. Each of the programs that make up the application system represents a functional abstraction at a high level of detail.
Information is transformed as it flows through a computer-based system. The system accepts input in a variety of forms; applies hardware, software, and human elements to transform it; and produces output in a variety of forms. Input may be a control signal transmitted by a transducer, a series of numbers typed by a human operator, a packet of information transmitted on a network link, or a voluminous data file retrieved from secondary storage. The transform(s) may comprise a single logical comparison, a complex numerical algorithm, or a rule-inference approach of an expert system. Output may light a single LED or produce a 200-page report.
A data flow diagram is a graphical representation that depicts information flow and the transforms that are applied as data move from input to output. A rectangle is used to represent an external entity; that is, a system element (e.g., hardware, a person, another program, etc) or another system that produces information for transformation by the software or receives information produced by the software. A circle (sometimes called a bubble) represents a process or transform that is applied to data (or control) and changes it in some way. An arrow represents one or more data items (data objects). All arrows on a data flow diagram should be labeled. The double line represents a data store—stored information that is used by the software.
3.2.1 Creating a Data Flow Model
The data flow diagram enables the software engineer to develop models of the information domain and functional domain at the same time. As the DFD is refined into greater levels of detail, the analyst performs an implicit functional decomposition of the system, thereby accomplishing the fourth operational analysis principle for function. At the same time, the DFD refinement results in a corresponding refinement of data as it moves through the processes that embody the application.
Again considering the Student Information System product, a level 0 DFD for the system is shown in Fig. 5. The primary external entities (boxes) produce information for use by the system and consume information generated by the system. The input entities are Student, Course and Exam. The labeled arrows represent data objects or data object type hierarchies. The input data objects include student data, course data and exam data. The level 0 DFD is now expanded into a level 1 model, as shown in Fig. 6.
Fig. 5 Level 0 DFD for Student Information Software
Fig. 6 Level 1 DFD for Student Information Software
The processes represented at DFD level 1 can be further refined into lower levels. For example, the process result process can be refined into a level 2 DFD, as shown in Fig. 7.
Fig. 7 Level 2 DFD that refines the process result process
3.3 Reverse Engineering to Understand Data
Reverse engineering of data occurs at different levels of abstraction. At the program level, internal program data structures must often be reverse engineered as part of an overall reengineering effort. At the system level, global data structures (e.g., files, databases) are often reengineered to accommodate new database management paradigms (e.g., the move from flat file to relational or object-oriented database systems).
Data modeling answers a set of specific questions that are relevant to any data processing application. What are the primary data objects to be processed by the system? What is the composition of each data object and what attributes describe the object? Where do the objects currently reside? What are the relationships between each object and other objects? What are the relationships between the objects and the processes that transform them?
To answer these questions, data modeling methods make use of the entity relationship diagram. The ERD enables a software engineer to identify data objects and their relationships using a graphical notation. In the context of structured analysis, the ERD defines all data that are entered, stored, transformed, and produced within an application.
The entity relationship diagram focuses solely on data (and therefore satisfies the first operational analysis principle), representing a "data network" that exists for a given system. The ERD is especially useful for applications in which data and the relationships that govern data are complex. Unlike the data flow diagram, data modeling considers data independent of the processing that transforms the data.
3.3.1 Creating an Entity Relationship Diagram
The entity relationship diagram enables a software engineer to fully specify the data objects that are input and output from a system, the attributes that define the properties of these objects, and their relationships. Like most elements of the analysis model, the ERD is constructed in an iterative manner.
The data model consists of three interrelated pieces of information: the data object, the attributes that describe the data object, and the relationships that connect data objects to one another.
A data object is a representation of almost any composite information that must be understood by software. By composite information, we mean something that has a number of different properties or attributes. A data object can be an external entity (e.g., anything that produces or consumes information), a thing (e.g., a report or a display), an occurrence (e.g., a telephone call) or event (e.g., an alarm), a role (e.g., salesperson), an organizational unit (e.g., MIS department), a place (e.g., a warehouse), or a structure (e.g., a file). The data objects in this case are: Student, Course and Examination.
Attributes define the properties of a data object and take on one of three different characteristics. They can be used to (1) name an instance of the data object, (2) describe the instance, or (3) make reference to another instance in another table. In addition, one or more of the attributes must be defined as an identifier—that is, the identifier attribute becomes a "key" when we want to find an instance of the data object. The attributes of Student, Course and Examination are shown in Table 1, Table 2 and Table 3 respectively.

Table 1 The attributes of the Student object
Adm No | Name | Address | State | LGA | Course of Study | Faculty | Dept

Table 2 The attributes of the Course object
Adm No | Session | Course Code | Course Title | Course Unit

Table 3 The attributes of the Examination object
Adm No | Session | Course Code | Course Grade
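A hedged sketch of how the three reverse-engineered objects of Tables 1 to 3 might be carried into the target relational database, with Adm No as the identifier ("key") attribute; the MySQL column names and types are assumptions for illustration, and $db is an open mysqli connection as in the earlier sketch.

    <?php
    // Assumed schema derived from Tables 1-3.
    $db->query("CREATE TABLE student (
        adm_no          VARCHAR(15) NOT NULL PRIMARY KEY,
        name            VARCHAR(60),
        address         VARCHAR(120),
        state           VARCHAR(30),
        lga             VARCHAR(30),
        course_of_study VARCHAR(60),
        faculty         VARCHAR(60),
        dept            VARCHAR(60))");
    $db->query("CREATE TABLE course (
        adm_no       VARCHAR(15) NOT NULL,
        session      VARCHAR(9),
        course_code  VARCHAR(10),
        course_title VARCHAR(80),
        course_unit  INT)");
    $db->query("CREATE TABLE examination (
        adm_no       VARCHAR(15) NOT NULL,
        session      VARCHAR(9),
        course_code  VARCHAR(10),
        course_grade CHAR(2))");
    ?>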
Data objects are connected to one another in different ways. Consider two data objects, Student and Course. A connection is established between Student and Course because the two objects are related. But what are the relationships? To determine the answer, we must understand the role of students and courses within the context of the software to be built. We can define a set of object/relationship pairs that define the relevant relationships. For example:
a. A Student registers Courses.
b. A Student attends lectures for Courses.
c. A Student sits for Examinations.
These objects can be represented using the ERD illustrated in Fig. 8.
Cardinality is the specification of the number of occurrences of one [object] that can be related to the number of occurrences of another [object]. Cardinality is usually expressed as simply 'one' or 'many.' For example, a wife can have only one husband (in most cultures), while a parent can have many children. Taking into consideration all combinations of 'one' and 'many,' two [objects] can be related as:
a. One-to-one (1:1)—An occurrence of [object] 'A' can relate to one and only one occurrence of [object] 'B,' and an occurrence of 'B' can relate to only one occurrence of 'A.'
b. One-to-many (1:N)—One occurrence of [object] 'A' can relate to one or many occurrences of [object] 'B,' but an occurrence of 'B' can relate to only one occurrence of 'A.' For example, a mother can have many children, but a child can have only one mother.
c. Many-to-many (M:N)—An occurrence of [object] 'A' can relate to one or more occurrences of 'B,' while an occurrence of 'B' can relate to one or more occurrences of 'A.' For example, an uncle can have many nephews, while a nephew can have many uncles.
Fig. 8 ERD for Student Information Software (without cardinality and modality)
In this case, the relationships between Student and Course, and between Student and Examination, are one-to-many, as shown in Fig. 9.
Cardinality defines “the maximum number of objects that can participate in a relationship”. It does not, however, provide an indication of whether or not a particular data object must participate in the relationship. To specify this information, the data model adds modality to the object/relationship pair. The modality of a relationship is 0 if there is no explicit need for the relationship to occur or the relationship is optional. The modality is 1 if an occurrence of the relationship is mandatory.
Fig. 9 ERD for Student Information Software (with cardinality and modality)
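The one-to-many relationships of Fig. 9, and the mandatory modality on the Student side, can be expressed directly in the database; a minimal sketch, assuming the table names above and a storage engine that enforces foreign keys (e.g., InnoDB).

    <?php
    // Each Course or Examination row must reference an existing Student (1:N).
    $db->query("ALTER TABLE course ADD CONSTRAINT fk_course_student
                FOREIGN KEY (adm_no) REFERENCES student (adm_no)");
    $db->query("ALTER TABLE examination ADD CONSTRAINT fk_exam_student
                FOREIGN KEY (adm_no) REFERENCES student (adm_no)");
    ?>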
3.4 Reverse Engineering to Understand User Interfaces
Sophisticated GUIs have become de rigueur for computer-based products and systems of every type. Therefore, the redevelopment of user interfaces has become one of the most common types of reengineering activity. To fully understand an existing user interface (UI), the structure and behavior of the interface must be specified. Merlo and his colleagues (Merlo et al., 1993) suggest three basic questions that must be answered as reverse engineering of the UI commences: What are the basic actions (e.g., keystrokes and mouse clicks) that the interface must process? What is a compact description of the behavioral response of the system to these actions? What is meant by a "replacement," or more precisely, what concept of equivalence of interfaces is relevant here?
Behavioral modeling notation can provide a means for developing answers to the first two questions. Much of the information necessary to create a behavioral model can be obtained by observing the external manifestation of the existing interface. But additional information necessary to create the behavioral model must be extracted from the code.
It is important to note that a replacement GUI may not mirror the old interface exactly (in fact, it may be radically different). It is often worthwhile to develop new interaction metaphors.
3.4.1 Creating a State Transition Diagram
Behavioral modeling is an operational principle for all requirements analysis methods. The STD indicates what actions (e.g., process activation) are taken as a consequence of a particular event.
A state is any observable mode of behavior. For example, states for the student information software might be the reading-user-input state, the result-processing state, the displaying-user-input state, and the query-processing state. Each of these states represents a mode of behavior of the system. A state transition diagram indicates how the system moves from state to state.
A simplified state transition diagram for the Student Information Software is shown in Fig. 10. The rectangles represent system states and the arrows represent transitions between states. Each arrow is labeled with a ruled expression. The top value indicates the event(s) that cause the transition to occur. The bottom value indicates the action that occurs as a consequence of the event. Therefore, when the student information software is executed and it starts, the system moves from the reading input state to the result processing state, etc.
Fig. 10 STD for Student Information Software
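One simple way to carry an STD into code is a transition table; the sketch below is a hypothetical PHP fragment whose state and event names loosely follow Fig. 10.

    <?php
    // Each entry maps (current state, event) to the next state.
    $transitions = array(
        'reading_input'     => array('result_submitted' => 'processing_result',
                                     'query_submitted'  => 'processing_query'),
        'processing_result' => array('done' => 'displaying_output'),
        'processing_query'  => array('done' => 'displaying_output'),
        'displaying_output' => array('continue' => 'reading_input'),
    );

    function nextState($state, $event, $transitions) {
        // Remain in the current state if the event is not recognized.
        return isset($transitions[$state][$event])
            ? $transitions[$state][$event]
            : $state;
    }

    echo nextState('reading_input', 'result_submitted', $transitions);
    // prints: processing_result
    ?>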
CHAPTER FOUR: RESTRUCTURING AND FORWARD ENGINEERING
4.1 Introduction
Software restructuring modifies source code and/or data in an effort to make it amenable to future changes. In general, restructuring does not modify the overall program architecture. It tends to focus on the design details of individual modules and on local data structures defined within modules. If the restructuring effort extends beyond module boundaries and encompasses the software architecture, restructuring becomes forward engineering.
The forward engineering process applies software engineering principles, concepts, and methods to re-create an existing application. In most cases, forward engineering does not simply create a modern equivalent of an older program. Rather, new user and technology requirements are integrated into the reengineering effort. The redeveloped program extends the capabilities of the older application.
Object-oriented software engineering has become the development paradigm of choice for many software organizations. But what about existing applications that were developed using conventional methods? In some cases, the answer is to leave such applications "as is." In others, older applications must be reengineered so that they can be easily integrated into large, object-oriented systems.
We live in a world of objects. These objects exist in nature, in human made entities, in business, and in the products that we use. They can be categorized, described, organized, combined, manipulated, and created. Therefore, it is no surprise that an object-oriented view would be proposed for the creation of computer software—an abstraction that enables us to model the world in ways that help us to better understand and navigate it.
Object technologies lead to reuse, and reuse (of program components) leads to faster software development and higher-quality programs. Object-oriented software is easier to maintain because its structure is inherently decoupled. This leads to fewer side effects when changes have to be made and less frustration for the software engineer and the customer. In addition, object-oriented systems are easier to adapt and easier to scale (i.e., large systems can be created by assembling reusable subsystems).
4.2 Code Restructuring
Code restructuring is performed to yield a design that produces the same function but with higher quality than the original program. In general, code restructuring techniques (e.g., Warnier's logical simplification techniques (Warnier, 1974)) model program logic using Boolean algebra and then apply a series of transformation rules that yield restructured logic. The objective is to take "spaghetti-bowl" code and derive a procedural design that conforms to the structured programming philosophy.
Component-level design, also called procedural design, occurs after data, architectural, and interface designs have been established. The intent is to translate the design model into operational software. But the level of abstraction of the existing design model is relatively high, and the abstraction level of the operational program is low. The translation can be challenging, opening the door to the introduction of subtle errors that are difficult to find and correct in later stages of the software process.
It is possible to represent the component-level design using a programming language. In essence, the program is created using the design model as a guide. An alternative approach is to represent the procedural design using some intermediate (e.g., graphical, tabular, or text-based) representation that can be translated easily into source code. Regardless of the mechanism that is used to represent the component level design, the data structures, interfaces, and algorithms defined should conform to a variety of well-established procedural design guidelines that help us to avoid errors as the procedural design evolves.
The foundations of component-level design were formed in the early 1960s and were solidified with the work of Edsger Dijkstra and his colleagues (Bohm and Jacopini, 1966; Dijkstra, 1965; Dijkstra, 1976). In the late 1960s, Dijkstra and others proposed the use of a set of constrained logical constructs
from which any program could be formed. The constructs emphasized "maintenance of functional domain." That is, each construct had a predictable logical structure, was entered at the top and exited at the bottom, enabling a reader to follow procedural flow more easily.
The constructs are sequence, condition, and repetition. Sequence implements processing steps that are essential in the specification of any algorithm. Condition provides the facility for selected processing based on some logical occurrence, and repetition allows for looping. These three constructs are fundamental to structured programming—an important component-level design technique.
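In the implementation language of the reengineered system, the three constructs look as follows; a trivial PHP fragment with invented grade values, for illustration only.

    <?php
    // Sequence: steps executed one after the other.
    $total  = 0;
    $grades = array('A', 'C', 'B');

    // Repetition: loop over every grade.
    foreach ($grades as $grade) {
        // Condition: selected processing based on a logical test.
        if ($grade == 'A') {
            $total += 5;
        } else {
            $total += 3;   // simplified; a real grade-point scale would differ
        }
    }
    echo $total;   // prints: 11
    ?>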
The structured constructs were proposed to limit the procedural design of software to a small number of predictable operations. Complexity metrics indicate that the use of the structured constructs reduces program complexity and thereby enhances readability, testability, and maintainability. The use of a limited number of logical constructs also contributes to a human understanding process that psychologists call chunking. To understand this process, consider the way in which you are reading this page. You do not read individual letters but rather recognize patterns or chunks of letters that form words or phrases. The structured constructs are logical chunks that allow a reader to recognize procedural elements of a module, rather than reading the
design or code line by line. Understanding is enhanced when readily recognizable logical patterns are encountered.
Any program, regardless of application area or technical complexity, can be designed and implemented using only the three structured constructs. It should be noted, however, that dogmatic use of only these constructs can sometimes cause practical difficulties.
4.2.1 Creating Flowcharts
"A picture is worth a thousand words," but it's rather important to know which picture and which 1000 words. There is no question that graphical tools, such as the flowchart or box diagram, provide useful pictorial patterns that readily depict procedural detail. However, if graphical tools are misused, the wrong picture may lead to the wrong software.
A flowchart is quite simple pictorially. A box is used to indicate a processing step, a diamond represents a logical condition, and arrows show the flow of control. Figs. 11 to 15 show flowcharts for five modules of the Student Information System.
Fig. 11 is the online registration module. It enables students to complete both central and course registration online. The student logs in with his/her PIN and admission number. If the PIN exists and has not been used by another student, the student is allowed access to the registration pages. A new student can complete both initial registration and fees registration, but a returning student is allowed fees registration only.
Fig. 11 Online registration module flowchart
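A hedged sketch of the PIN check just described; the table and column names (pin, pin_no, used_by) are assumptions, not the system's actual schema.

    <?php
    $db  = new mysqli('localhost', 'sis_user', 'secret', 'sis');
    $pin = $db->real_escape_string($_POST['pin_no']);
    $adm = $db->real_escape_string($_POST['adm_no']);

    // Grant access only if the PIN exists and is unused, or is already
    // tied to this same admission number.
    $result = $db->query("SELECT used_by FROM pin WHERE pin_no = '$pin'");
    if ($row = $result->fetch_assoc()) {
        if ($row['used_by'] === null || $row['used_by'] === $adm) {
            // allow access to the registration pages
        } else {
            echo 'This PIN has already been used by another student.';
        }
    } else {
        echo 'Invalid PIN.';
    }
    ?>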
Fig. 12 is the result entry module. It enables MIS staff to enter student examination results into the MySQL database. MIS staff log in using a username and password. Staff are allowed to enter results for every student in the University, but only results for courses that a student has registered may be posted.
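The rule that results may be posted only for registered courses might be enforced as sketched below, again under the assumed schema; $db is an open mysqli connection as before.

    <?php
    $adm     = $db->real_escape_string($_POST['adm_no']);
    $code    = $db->real_escape_string($_POST['course_code']);
    $session = $db->real_escape_string($_POST['session']);
    $grade   = $db->real_escape_string($_POST['course_grade']);

    // Accept the grade only if a matching registration row exists.
    $check = $db->query("SELECT 1 FROM course
                         WHERE adm_no = '$adm' AND course_code = '$code'");
    if ($check->num_rows > 0) {
        $db->query("INSERT INTO examination (adm_no, session, course_code, course_grade)
                    VALUES ('$adm', '$session', '$code', '$grade')");
    } else {
        echo 'The student is not registered for this course.';
    }
    ?>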
Fig. 13 is the online result checking module. It allows students to check their results online. The student submits his/her admission number and password; if the password is correct, the student is allowed to submit a session. The student's initial information, together with all course and grade information for that session, is then displayed.
Fig. 12 Result Entry module flowchart
Fig. 14 is the fees report module. It enables MIS staff to produce a fees report for every department in the University. A staff member logs in using his/her username and password. By submitting a department and session, all students and their fees can be displayed, together with the total fees for the department.
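The departmental total maps naturally onto an SQL aggregate; a hypothetical fragment, assuming a fees table keyed by admission number.

    <?php
    $dept    = $db->real_escape_string($_POST['dept']);
    $session = $db->real_escape_string($_POST['session']);

    // Departmental total for one session, computed by the database.
    $result = $db->query("SELECT SUM(f.amount) AS total
                          FROM student s JOIN fees f ON f.adm_no = s.adm_no
                          WHERE s.dept = '$dept' AND f.session = '$session'");
    $row = $result->fetch_assoc();
    echo 'Total fees for the department: ' . $row['total'];
    ?>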
Fig. 15 is the departmental courses report module. It enables MIS staff to produce the list of all students registered for a particular course in a particular department. A staff member logs in using his/her username and password, and the list can then be produced for each course.
Fig. 13 Online Result Checking module flowchart
Fig. 14 Fees report module flowchart
Fig. 15 Departmental courses report module flowchart
When restructuring moves beyond standardization and rationalization, physical modifications to existing data structures are made to make the data design more effective. This may mean a translation from one file format to another, or in some cases, translation from one type of database to another.
4.3 Unified Modeling Approach to Object-Oriented Analysis
Over the past decade, Jacobson et al. (1999) have collaborated to combine the best features of their individual object-oriented analysis and design methods into a unified method. The result, called the Unified Modeling Language (UML), has become widely used throughout the industry.
UML allows a software engineer to express an analysis model using a modeling notation that is governed by a set of syntactic, semantic, and pragmatic rules.
In UML, a system is represented using five different "views" that describe the system from distinctly different perspectives. Each view is defined by a set of diagrams. The following views (Alhir, 1998) are present in UML:
a. User model view. This view represents the system (product) from the user's (called actors in UML) perspective. The use-case is the modeling approach of choice for the user model view. This important analysis representation describes a usage scenario from the end-user's perspective.
b. Structural model view. Data and functionality are viewed from inside the system. That is, static structure (classes, objects, and relationships) is modeled.
c. Behavioral model view. This part of the analysis model represents the dynamic or behavioral aspects of the system. It also depicts the interactions or collaborations between various structural elements described in the user model and structural model views.
d. Implementation model view. The structural and behavioral aspects of the system are represented as they are to be built.
e. Environment model view. The structural and behavioral aspects of the environment in which the system is to be implemented are represented.
4.3.1 Use-Cases
Use-cases model the system from the end-user's point of view. Created during requirements elicitation, use-cases should achieve the following objectives:
a. To define the functional and operational requirements of the system (product) by defining a scenario of usage that is agreed upon by the end-user and the software engineering team.
b. To provide a clear and unambiguous description of how the end-user and the system interact with one another.
c. To provide a basis for validation testing.
During OOA, use-cases serve as the basis for the first element of the analysis model.
Using UML notation, a diagrammatic representation of a use-case, called a use-case diagram, can be created. Like many elements of the analysis model, the use-case diagram can be represented at many levels of abstraction. The use-case diagram contains actors and use-cases. Actors are entities that interact with the system. They can be human users or other machines or systems that have defined interfaces to the software.
To illustrate the development of a use-case diagram, we consider the use-cases for the Student Information System. Two actors were identified: the student and the staff.
Fig. 16 depicts a high-level use-case diagram for the Student Information System, in which four use-cases are identified (represented by ovals). Each of the high-level use-cases may be elaborated with lower-level use-case diagrams. For example, Figs. 17, 18, 19 and 20 are use-case diagrams that elaborate the four functions. A complete set of use-case diagrams is created for all actors.
Fig. 16 A high-level use-case diagram
Fig. 17 A low-level use-case diagram for the submit forms function
Fig. 18 A low-level use-case diagram for the make query function
Fig. 19 A low-level use-case diagram for the enter data function
Fig. 20 A low-level use-case diagram for the create reports function
4.3.2 Class-Responsibility-Collaborator (CRC) Modeling
Once basic usage scenarios have been developed for the system, it is time to identify candidate classes and indicate their responsibilities and collaborations. Class-responsibility-collaborator (CRC) modeling (Wirfs-Brock et al., 1990) provides a simple means for identifying and organizing the classes that are relevant to system or product requirements. Ambler (1995) describes CRC modeling in the following way:
“A CRC model is really a collection of standard index cards that represent classes. The cards are divided into three sections. Along the top of the card you write the name of the class. In the body of the card you list the class responsibilities on the left and the collaborators on the right.”
In reality, the CRC model may make use of actual or virtual index cards. The intent is to develop an organized representation of classes. Responsibilities are
the attributes and operations that are relevant for the class. Stated simply, a responsibility is “anything the class knows or does” (Ambler, 1995). Collaborators are those classes that are required to provide a class with the information needed to complete a responsibility. In general, collaboration implies either a request for information or a request for some action.
a. Classes
Basic guidelines for identifying classes and objects were presented. To summarize, objects manifest themselves in a variety of forms: external entities, things, occurrences, or events; roles; organizational units; places; or structures. One technique for identifying these in the context of a software problem is to perform a grammatical parse on the processing narrative for the system. All nouns become potential objects. However, not every potential object makes the cut. Six selection characteristics were defined:
i. Retained information. The potential object will be useful during analysis only if information about it must be remembered so that the system can function.
ii. Needed services. The potential object must have a set of identifiable operations that can change the value of its attributes in some way.
iii. Multiple attributes. During requirements analysis, the focus should be on "major" information; an object with a single attribute may, in fact, be useful during design but is probably better represented as an attribute of another object during the analysis activity.
iv. Common attributes. A set of attributes can be defined for the potential object, and these attributes apply to all occurrences of the object.
v. Common operations. A set of operations can be defined for the potential object, and these operations apply to all occurrences of the object.
vi. Essential requirements. External entities that appear in the problem space and produce or consume information that is essential to the operation of any solution for the system will almost always be defined as objects in the requirements model.
A potential object should satisfy all six of these selection characteristics if it is to be considered for inclusion in the CRC model.
Firesmith (1993) extends this taxonomy of class types by suggesting the following additions:
i. Device classes model external entities such as sensors, motors, and keyboards.
ii. Property classes represent some important property of the problem environment (e.g., credit rating within the context of a mortgage loan application).
iii. Interaction classes model interactions that occur among other objects (e.g., a purchase or a license).
In addition, objects and classes may be categorized by a set of characteristics:
i. Tangibility. Does the class represent a tangible thing (e.g., a keyboard or sensor) or does it represent more abstract information (e.g., a predicted outcome)?
ii. Inclusiveness. Is the class atomic (i.e., it includes no other classes) or aggregate (it includes at least one nested object)?
iii. Sequentiality. Is the class concurrent (i.e., it has its own thread of control) or sequential (it is controlled by outside resources)?
iv. Persistence. Is the class transient (i.e., it is created and removed during program operation), temporary (it is created during program operation and removed once the program terminates), or permanent (it is stored in a database)?
v. Integrity. Is the class corruptible (i.e., it does not protect its resources from outside influence) or guarded (i.e., the class enforces controls on access to its resources)?
b. Responsibilities
Basic guidelines for identifying responsibilities (attributes and operations) were also presented. To summarize, attributes represent stable features of a class; that is, information about the class that must be retained to accomplish the objectives of the software specified by the customer. Attributes can often be extracted from the statement of scope or discerned from an understanding of the nature of the class. Operations can be extracted by performing a grammatical parse on the processing narrative for the system. All verbs become candidate operations. Each operation that is chosen for a class exhibits a behavior of the class.
Wirfs-Brock et al. (1990) suggest five guidelines for allocating responsibilities to classes:
i. System intelligence should be evenly distributed. Every application encompasses a certain degree of intelligence; that is, what the system knows and what it can do. This intelligence can be distributed across classes in a number of different ways. "Dumb" classes (those that have few responsibilities) can be modeled to act as servants to a few "smart" classes (those having many responsibilities). Although this approach makes the flow of control in a system straightforward, it has a few disadvantages: it concentrates all intelligence within a few classes, making changes more difficult, and it tends to require more classes, hence more development effort.
Therefore, system intelligence should be evenly distributed across the classes in an application. Because each object knows about and does only a few things (that are generally well focused), the cohesiveness of the system is improved. In addition, side effects due to change tend to be dampened because system intelligence has been decoupled across many objects.
To determine whether system intelligence is evenly distributed, the responsibilities noted on each CRC model index card should be evaluated to determine if any class has an extraordinarily long list of responsibilities; this indicates a concentration of intelligence. In addition, the responsibilities for each class should exhibit the same level of abstraction. For example, among the operations listed for an aggregate class called checking account, a reviewer notes two responsibilities: balance-the-account and check-off-cleared-checks. The first operation (responsibility) implies a complex mathematical and logical procedure; the second is a simple clerical activity. Since these two operations are not at the same level of abstraction, check-off-cleared-checks should be placed within the responsibilities of check-entry, a class that is encompassed by the aggregate class checking account.
ii. Each responsibility should be stated as generally as possible. This guideline implies that general responsibilities (both attributes and operations) should reside high in the class hierarchy (because they are generic, they will apply to all subclasses). In addition, polymorphism should be used in an effort to define operations that generally apply to the superclass but are implemented differently in each of the subclasses.
iii. Information and the behavior related to it should reside within the same class. This achieves the OO principle that we have called encapsulation. Data and the processes that manipulate the data should be packaged as a cohesive unit.
iv. Information about one thing should be localized with a single class, not distributed across multiple classes. A single class should take on the responsibility for storing and manipulating a specific type of information. This responsibility should not, in general, be shared across a number of classes. If information is distributed, software becomes more difficult to maintain and more challenging to test.
v. Responsibilities should be shared among related classes, when appropriate. There are many cases in which a variety of related objects must all exhibit the same behavior at the same time. As an example, consider a video game that must display the following objects: player, player-body, player-arms, player-legs, player-head. Each of these objects has its own attributes (e.g., position, orientation, color, speed) and all must be updated and displayed as the user manipulates a joystick. The responsibilities update and display must therefore be shared by each of the objects noted. Player knows when something has changed and update is required. It collaborates with the other objects to achieve a new position or orientation, but each object controls its own display.
c. Collaborations
Classes fulfill their responsibilities in one of two ways: a class can use its own operations to manipulate its own attributes, thereby fulfilling a particular responsibility, or a class can collaborate with other classes. Wirfs-Brock et al. (1990) define collaborations in the following way:
"Collaborations represent requests from a client to a server in fulfillment of a client responsibility. Collaboration is the embodiment of the contract between the client and the server. . . . We say that an object collaborates with another object if, to fulfill a responsibility, it needs to send the other object any messages. A single collaboration flows in one direction—representing a request from the client to the server. From the client's point of view, each of its collaborations is associated with a particular responsibility implemented by the server."
Collaborations identify relationships between classes. When a set of classes all collaborate to achieve some requirement, they can be organized into a subsystem. Collaborations are identified by determining whether a class can fulfill each responsibility itself. If it cannot, then it needs to interact with another class.
The CRC model index cards for the Student Information System classes are shown in the tables below.

Table 4 CRC for Student class
Class Name: Student
Class Type: Device class
Class Characteristics: tangible, atomic and concurrent
Responsibilities: Admission Number; Name; Password; Course of Study; Faculty; Department; Mode of Entry; Year of Entry; State; Local Govt.; . . .; Telephone of Sponsor; Initial_rec_entry()
Collaborations: none

Table 4 is the CRC card for the Student class. It is the superclass, a device class having the characteristics tangible, atomic and concurrent. It has the attributes admission number, name, password, course of study, faculty, department, mode of entry, year of entry, state, local govt. and all other personal details, and the operation Initial_rec_entry().

Table 5 CRC for Fees class
Class Name: Fees
Class Type: Property class
Class Characteristics: tangible, aggregate and sequential
Responsibilities: Amount; Level; Card No; Fees_entry(); Fees_report()
Collaborations: Has-knowledge-of Student class

Table 5 is the CRC card for the Fees class. It is a subclass of the Student class, a property class having the characteristics tangible, aggregate and sequential. It has the attributes amount, level and card number, and the operations Fees_entry() and Fees_report().

Table 6 CRC for Course class
Class Name: Course
Class Type: Device class
Class Characteristics: tangible, aggregate and sequential
Responsibilities: Course Code; Course Title; Course Unit; Courses_entry()
Collaborations: Has-knowledge-of Student class

Table 6 is the CRC card for the Course class. It is also a subclass of the Student class, a device class having the characteristics tangible, aggregate and sequential. It has the attributes course code, course title and course unit, and the operation Courses_entry().

Table 7 CRC for Course_registration class
Class Name: Course_registration
Class Type: Device class
Class Characteristics: tangible, aggregate and sequential
Responsibilities: Course Code; Course_reg_entry(); Dept_course_list()
Collaborations: Has-knowledge-of Course class

Table 7 is the CRC card for the Course_registration class. It is a subclass of the Course class, a device class having the characteristics tangible, aggregate and sequential. It has the attribute course code and the operations Course_reg_entry() and Dept_course_list().

Table 8 CRC for Course_result class
Class Name: Course_result
Class Type: Device class
Class Characteristics: tangible, aggregate and sequential
Responsibilities: Course Grade; Course_result_entry(); Student_result_report()
Collaborations: Has-knowledge-of Course_registration class; Has-knowledge-of Course class

Table 8 is the CRC card for the Course_result class. It is a subclass of the Course_registration class, having the characteristics tangible, aggregate and sequential. It has the attribute course grade and the operations Course_result_entry() and Student_result_report().
4.3.3 Defining Structure and Hierarchies
Once classes and objects have been identified using the CRC model, the analyst begins to focus on the structure of the class model and the resultant hierarchies that arise as classes and subclasses emerge. Using UML notation, a variety of class diagrams can be created. Generalization/specialization class structures can be created for identified classes.
To illustrate, consider the Student object defined for the Student Information System, shown in Fig. 21. Here, the generalization class, Student, is refined into a set of specializations: Fees and Course. The Course class can in turn be refined into a set of specializations: Course_registration and Course_result. The attributes and operations noted for the Student class and the Course class are inherited by the specializations of those classes. We have created a simple class hierarchy.
Fig. 21 A class diagram for generalization/specialization with composite aggregates
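A hedged PHP 5 sketch of the generalization/specialization hierarchy of Fig. 21; method bodies are stubs, and the names follow the CRC cards of Tables 4 to 8.

    <?php
    class Student {                              // generalization class
        protected $admNo;
        protected $name;
        // ... remaining personal details of Table 4
        public function initial_rec_entry() { /* store the initial record */ }
    }

    class Fees extends Student {                 // specialization of Student
        private $amount;
        private $level;
        private $cardNo;
        public function fees_entry()  { /* record a fees payment */ }
        public function fees_report() { /* produce a fees report */ }
    }

    class Course extends Student {               // specialization of Student
        protected $courseCode;
        protected $courseTitle;
        protected $courseUnit;
        public function courses_entry() { /* store a course definition */ }
    }

    class Course_registration extends Course {   // specialization of Course
        public function course_reg_entry() { /* register a student's course */ }
        public function dept_course_list() { /* list registrants per course */ }
    }

    class Course_result extends Course_registration {
        private $courseGrade;
        public function course_result_entry()   { /* post a grade */ }
        public function student_result_report() { /* produce a result report */ }
    }
    ?>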
CHAPTER FIVE: SUMMARY, CONCLUSION AND RECOMMENDATION
5.1 Introduction
Software re-engineering concerns the examination of the design and implementation of an existing legacy system and applying different techniques and methods to re-design and re-shape that system into hopefully better and more suitable software.
This process is by no means an easy task, since a legacy system may have come a long way from the state in which it was first conceived and implemented. Updates and the addition of new functionality may cause a lack of proper and updated documentation, especially if the system is entrusted to people not skilled in the re-engineering way of thinking. Also, maintainability may suffer as the source code becomes flooded with new ways of communication.
Software re-engineering involves one or more of the following activities: changing operating system or hardware platforms, moving to web services, migrating to a distributed environment, introducing middleware, changing to an object-oriented architecture, using modern programming languages, database migration, and special reasons (such as Y2K).
5.2 Summary
This thesis is titled Functional Oriented to Object Oriented Approach in Software Reengineering (A Case Study of MIS, Usmanu Danfodiyo University, Sokoto). It explores software reengineering principles and methods to change the current Student Information System from functional-oriented to object-oriented, from offline to online, from DOS-based to Windows-based, and from the Dbase V DBMS to the MySQL DBMS.
The current system is first reverse engineered in order to extract the necessary processing, data and usage information about the system. A data flow diagram is designed to depict information flow and the transformations that are applied as data move from input to output. In addition, an entity relationship diagram, representing the "data network" that exists for the system, is designed. This involves identifying data objects, their attributes and the relationships that exist between these data objects. Lastly, a behavioral model, called a state transition diagram, is designed. It indicates what actions (e.g., process activation) are taken as a consequence of a particular event.
Restructuring and forward engineering are performed after reverse engineering. Code restructuring is performed to yield a design that produces the same function but with higher quality than the original program. A flowchart for each restructured module is designed. Also, use-cases, which model the software system from the end-user's point of view, are designed. The main objectives
are: to define the functional and operational requirements of the system (product) by defining a scenario of usage that is agreed upon by the end-user and the software engineering team; to provide a clear and unambiguous description of how the end-user and the system interact with one another; and to provide a basis for validation testing. In addition, CRC modeling is performed; it identifies candidate classes and indicates their responsibilities and collaborations. Lastly, structures and hierarchies are defined for the classes, focusing on the structure of the class model and the resultant hierarchies that arise as classes and subclasses emerge.
In conclusion, an Online Student Information System has been designed and implemented. Some of the interfaces and reports are shown in Appendix A and Appendix B respectively.
5.3 Conclusion
It is not uncommon to find that many of the data processing systems in use today have been in operation for as long as 10 to 30 years. Although the original developers may not have expected their products to provide useful service for so many years, it is now understood that long lifetimes are an inherent property of many business systems. Of great significance is the fact that any non-trivial system must change over the course of its lifetime. Long lifetimes represent a considerable period of time with potentially extensive changes to accommodate.
Legacy systems are older software systems that remain vital to an organization. Many software systems that are still in use were developed many years ago using technologies that are now obsolete. These systems are still essential to the normal functioning of the business or organization.
Software reengineering is being used to recover legacy systems and allow their evolution (Jacobson & Lindstrom, 1991). It is performed mainly to reduce maintenance costs and to improve development speed and system readability.
The essence of software re-engineering is to improve or transform existing software so that it can be understood, controlled, and used anew. The need for software re-engineering has increased greatly, as the Usmanu Danfodiyo University Sokoto Student Information System has become obsolescent in terms of its architecture, the platform on which it runs, and its suitability and stability to support changing needs. Software re-engineering is important for recovering and reusing existing software assets, bringing high software maintenance costs under control, and establishing a base for future software evolution.
The demand for reengineering has grown significantly over recent years. The need for different business sectors to adapt their systems to the Web or to adopt other technologies is stimulating research into methods, tools and infrastructures that support the evolution of existing applications. Most Nigerian universities have now implemented online registration and result-checking systems. Usmanu Danfodiyo University Sokoto, being one of the best universities, should not be left behind.
5.4 Recommendation
Most organizations stand at a crossroads of competitive survival, a crossroads created by the information revolution that is now shaping, and shaking, the world. Information Technology has gained so much importance that companies and governments must either embrace modern information processing technology or lag behind and be bypassed. The point is that most of the critical software systems that companies and government agencies depend upon were developed many years ago, and maintenance alone is not enough to keep these systems up to date with technological change.
In general, reengineering is a way to achieve software reuse and to understand the concepts underlying the application domain. It makes it easier to recover and reuse the information held in analysis and design documents, which are not always available for legacy systems.
I recommend that the University implement the system developed in this research. Doing so will enable the University to overcome the problems it currently faces in student registration and to meet the technological needs of the society.
Finally, I recommend that further research be conducted on this work to expand its scope to cover all MIS activities, and to ensure that only the lecturer who taught a course is allowed to enter the results for that course.
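A minimal sketch of how that restriction might be enforced in the present PHP 5/MySQL setting is given below. It assumes a hypothetical course_allocation table recording which lecturer taught each course in each session; the table, column and function names are illustrative, not part of the implemented system.

<?php
// Returns true only if the logged-in lecturer was allocated the course
// for the given session, per the assumed course_allocation table.
function canEnterResults(mysqli $db, $lecturerId, $courseCode, $session) {
    $stmt = $db->prepare(
        "SELECT 1 FROM course_allocation
         WHERE lecturer_id = ? AND course_code = ? AND session = ?");
    $stmt->bind_param('sss', $lecturerId, $courseCode, $session);
    $stmt->execute();
    $stmt->store_result();
    $allowed = ($stmt->num_rows > 0);
    $stmt->close();
    return $allowed;
}

// Usage: reject a result-entry request from any other lecturer.
// if (!canEnterResults($db, $_SESSION['lecturer_id'], $code, $session)) {
//     die('You are not allocated to this course.');
// }
?>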
REFERENCES
Alhir, S. S. (1998): UML in a Nutshell. O'Reilly & Associates.
Alvaro, A. (2003): Orion-RE: A Component-Based Software Reengineering Environment. In: Proceedings of the 10th Working Conference on Reverse Engineering (WCRE), pp. 248–257. IEEE Computer Society Press.
Ambler, S. (1995): Using Use-Cases. Software Development, pp. 53–61.
Aversano, L. (2001): Migrating Legacy Systems to the Web. In: Proceedings of the 5th European Conference on Software Maintenance and Reengineering, Lisbon, Portugal, pp. 148–157. IEEE Computer Society Press.
Bass, L.; Buhman, C.; Dorda, S.; Long, F.; Robert, J.; Seacord, R.; Wallnau, K. (2000): Volume I: Market Assessment of Component-Based Software Engineering. Technical Note CMU/SEI-2001-TN-007, Carnegie Mellon University, Software Engineering Institute.
Bayer, J. (2000): Towards Engineering Product Lines Using Concerns. In: Workshop on Multi-Dimensional Separation of Concerns in Software Engineering (ICSE).
Bennett, K. H. and Rajlich, V. T. (2000): Software Maintenance and Evolution: A Roadmap. In: Proceedings of the 22nd International Conference on Software Engineering (ICSE), Future of Software Engineering Track, pp. 73–87. ACM Press.
Bianchi, A.; Caivano, D.; Marengo, V.; Visaggio, G. (2003): Iterative Reengineering of Legacy Systems. In: IEEE Transactions on Software Engineering, Vol. 29, No. 03, pp. 225–241.
Bianchi, A.; Caivano, D.; Visaggio, G. (2000): Method and Process for Iterative Reengineering of Data in a Legacy System. In: Proceedings of the 7th Working Conference on Reverse Engineering (WCRE), pp. 86–97. IEEE Computer Society.
Biggerstaff, T. J.; Mitbander, B. G.; Webster, D. E. (1994): Program Understanding and the Concept Assignment Problem. In: Proceedings of the 15th International Conference on Software Engineering (ICSE), pp. 482–498. ACM Press.
Bohm, C. and Jacopini, G. (1966): Flow Diagrams, Turing Machines and Languages with Only Two Formation Rules. CACM, Vol. 9, No. 5, pp. 366–371.
Boyle, J. M. and Muralidharan, M. N. (1984): Program Reusability through Program Transformation. In: IEEE Transactions on Software Engineering, Special Issue on Software Reusability, Vol. 10, No. 5, pp. 574–588, September.
Caldiera, G. and Basili, V. R. (1991): Identifying and Qualifying Reusable Software Components. In: IEEE Computer, Vol. 24, No. 02, pp. 61–71, February.
Canfora, G.; Di Santo, G.; Zimeo, E. (2004): Toward Seamless Migration of Java AWT-Based Applications to Personal Wireless Devices. In: Proceedings of the 11th Working Conference on Reverse Engineering (WCRE), pp. 38–47.
Chidamber, S. R. and Kemerer, C. F. (1994): A Metrics Suite for Object-Oriented Design. In: IEEE Transactions on Software Engineering, Vol. 20, No. 06, pp. 476–493.
Dijkstra, E. (1976): Structured Programming. In: Software Engineering, Concepts and Techniques (J. Buxton et al., eds.). Van Nostrand-Reinhold.
Dijkstra, E. (1965): Programming Considered as a Human Activity. In: Proceedings of the 1965 IFIP Congress. North-Holland Publishing Co.
Eduardo, A.; Alexandre, A.; Silvio, L. M. (2007): Component Reuse in Software Engineering (C.R.U.I.S.E.). Recife Center for Advanced Studies and Systems (C.E.S.A.R.), Federal University of Pernambuco.
Filman, R. E. and Friedman, D. P. (2000): Aspect-Oriented Programming is Quantification and Obliviousness. In: Workshop on Advanced Separation of Concerns (OOPSLA).
Firesmith, D. G. (1993): Object-Oriented Requirements Analysis and Logical Design. Wiley.
Gall, H. and Klösch, R. (1994): Program Transformation to Enhance the Reuse Potential of Procedural Software. In: Proceedings of the ACM Symposium on Applied Computing (SAC), pp. 99–104. ACM Press.
Gamma, E.; Helm, R.; Johnson, R.; Vlissides, J. (1995): Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley Professional Computing Series. Addison-Wesley.
Garcia, V. C.; Lucrédio, D.; Prado, A. F.; Almeida, E. S.; Alvaro, A.; Meira, S. R. L. (2005): Towards an Approach for Aspect-Oriented Software Reengineering. In: Proceedings of the 7th International Conference on Enterprise Information Systems (ICEIS), Miami, USA.
Jacobson, I.; Booch, G.; Rumbaugh, J. (1999): The Unified Software Development Process. Addison-Wesley.
Jacobson, I. and Lindstrom, F. (1991): Re-engineering of Old Systems to an Object-Oriented Architecture. In: Proceedings of Object-Oriented Programming Systems, Languages, and Applications (OOPSLA '91), pp. 340–350. ACM Press.
Kapoor, R. V. and Stroulia, E. (2001): Mathaino: Simultaneous Legacy Interface Migration to Multiple Platforms. In: Proceedings of the 9th International Conference on Human-Computer Interaction, New Orleans, LA, USA, Vol. 01, pp. 51–55. Lawrence Erlbaum Associates.
Keller, R. K.; Schauer, R.; Robitaille, S.; Pagé, P. (1999): Pattern-Based Reverse Engineering of Design Components. In: Proceedings of the 21st International Conference on Software Engineering (ICSE), pp. 226–235. IEEE Computer Society Press.
Kendall, E. A. (2000): Reengineering for Separation of Concerns. In: Workshop on Multi-Dimensional Separation of Concerns in Software Engineering (ICSE).
Kiczales, G.; Lamping, J.; Mendhekar, A.; Maeda, C.; Lopes, C.; Loingtier, J. M.; Irwin, J. (1997): Aspect-Oriented Programming. In: Proceedings of the 11th European Conference on Object-Oriented Programming (ECOOP), Vol. 1241, pp. 220–242. Springer-Verlag.
Lee, E.; Lee, B.; Shin, W.; Wu, C. (2003): A Reengineering Process for Migrating from an Object-Oriented Legacy System to a Component-Based System. In: Proceedings of the 27th Annual International Computer Software and Applications Conference (COMPSAC), pp. 336–341. IEEE Computer Society Press.
Lehman, M. M. and Belady, L. A. (1985): Program Evolution: Processes of Software Change. Vol. 27 of APIC Studies in Data Processing. Academic Press.
Lippert, M. and Lopes, C. V. (2000): A Study on Exception Detection and Handling Using Aspect-Oriented Programming. In: Proceedings of the 22nd International Conference on Software Engineering (ICSE), pp. 418–427. ACM Press.
McIlroy, M. D. (1969): Mass Produced Software Components. In: Software Engineering: Report on a Conference Sponsored by the NATO Science Committee, pp. 138–155. NATO Scientific Affairs Division, January.
Medvidovic, N.; Rakic, M. M.; Mehta, N.; Malek, S. (2003): Software Architectural Support for Handheld Computing. In: IEEE Computer, Vol. 36, No. 09, pp. 66–73.
Merlo, E.; Gagné, P. Y.; Girard, J. F.; Kontogiannis, K.; Hendren, L.; Panangaden, P.; De Mori, R. (1995): Reverse Engineering of User Interfaces. In: Proceedings of the Working Conference on Reverse Engineering (WCRE), Baltimore, pp. 171–178. IEEE.
Meyer, B. (1997): Object-Oriented Software Construction. Prentice-Hall, Englewood Cliffs, second edition.
Moore, M. M. and Moshkina, L. (2000): Migrating Legacy User Interfaces to the Internet: Shifting Dialogue Initiative. In: Proceedings of the 7th Working Conference on Reverse Engineering (WCRE), Brisbane, Australia, pp. 52–58. IEEE Computer Society Press.
Neighbors, J. M. (1996): Finding Reusable Software Components in Large Systems. In: Proceedings of the 3rd Working Conference on Reverse Engineering (WCRE), pp. 02–10. IEEE Computer Society.
Olsem, M. R. (1998): An Incremental Approach to Software Systems Reengineering. In: Journal of Software Maintenance, Vol. 10, No. 03, May/June, pp. 181–202.
Ossher, H. and Tarr, P. (1999): Using Subject-Oriented Programming to Overcome Common Problems in Object-Oriented Software Development/Evolution. In: Proceedings of the 21st International Conference on Software Engineering (ICSE), pp. 687–688. IEEE Computer Society Press.
Paulson, L. D. (2001): Translation Technology Tries to Hurdle the Language Barrier. In: IEEE Software, Vol. 34, No. 9, pp. 12–15.
Pressman, R. S. (2001): Software Engineering: A Practitioner's Approach. McGraw-Hill, New York, fifth edition.
Sant'Anna, C.; Garcia, A.; Chavez, C.; von Staa, A.; Lucena, C. (2003): On the Reuse and Maintenance of Aspect-Oriented Software: An Evaluation Framework. In: XVII Brazilian Symposium on Software Engineering.
Sneed, H. M. (1995): Planning the Reengineering of Legacy Systems. In: IEEE Software, Vol. 12, No. 01, pp. 24–34.
Sneed, H. M. (1996): Object-Oriented COBOL Recycling. In: Proceedings of the 3rd Working Conference on Reverse Engineering (WCRE), pp. 169–178. IEEE Computer Society Press.
Sommerville, I. (1996): Software Engineering. Addison-Wesley, fifth edition.
Stevens, P. and Pooley, R. (1998): Systems Reengineering Patterns. In: Proceedings of the ACM SIGSOFT 6th International Symposium on the Foundations of Software Engineering (FSE), Software Engineering Notes, Vol. 23, No. 06, pp. 17–23.
Systa, T. (1999): The Relationships between Static and Dynamic Models in Reverse Engineering Java Software. In: Proceedings of the 6th Working Conference on Reverse Engineering (WCRE). IEEE Computer Society Press.
Umar, A. (1997): Application (Re)Engineering: Building Web-Based Applications and Dealing with Legacies. Prentice Hall, Upper Saddle River, NJ.
Visaggio, G. (2001): Ageing of a Data Intensive Legacy System: Symptoms and Remedies. In: Journal of Software Maintenance and Evolution, Vol. 13, No. 5, pp. 281–308.
Waters, R. C. (1988): Program Translation via Abstraction and Reimplementation. In: IEEE Transactions on Software Engineering, Vol. 14, No. 08, pp. 1207–1228.
Warnier, J. D. (1974): Logical Construction of Programs. Van Nostrand-Reinhold.
Wilkening, D. E.; Loyall, J. P.; Pitarys, M. J.; Littlejohn, K. (1995): A Reuse Approach to Software Reengineering. In: Journal of Systems and Software, Vol. 30, No. 01-02, pp. 117–125.
Wirfs-Brock, R.; Wilkerson, B.; Wiener, L. (1990): Designing Object-Oriented Software. Prentice-Hall.
Yeh, A. S.; Harris, D. R.; Reubenstein, R. (1995): Recovering Abstract Data Types and Object Instances from a Conventional Procedural Language. In: Proceedings of the Second Working Conference on Reverse Engineering (WCRE), pp. 227–236. IEEE Computer Society Press.
Zou, Y. and Kontogiannis, K. (2003): Incremental Transformation of Procedural Systems to Object Oriented Platforms. In: Proceedings of the 27th Annual International Computer Software and Applications Conference (COMPSAC), pp. 290–295. IEEE Computer Society Press.
Appendix A
[Screenshots of the Online Student Information System interfaces.]
Appendix B
[Sample reports produced by the Online Student Information System.]