Distributed Component-Based Software Development: An Incremental Approach
Eduardo Santana de Almeida, Daniel Lucrédio, Antonio Francisco do Prado, Luis Carlos Trevelin
Federal University of São Carlos, Computing Department, São Carlos/SP, Brazil {ealmeida, lucredio, prado, trevelin}@dc.ufscar.br SUMMARY In spite of recent and ongoing research in the Component-Based Development (CBD) area, there is still a lack of patterns, approaches and methodologies that effectively support development “for reuse” as much as “with reuse”. Considering the accelerated growth of the Internet over the last decade, where distribution has become an essential non-functional requirement of most applications, the problem becomes even more pressing. This paper proposes an Incremental Approach that integrates the concepts of Component-Based Software Engineering (CBSE), frameworks, patterns, and distribution. The approach is divided into two stages: development “for reuse” and development “with reuse”. A CASE tool is the main mechanism for applying this process model, including support for code generation of components and applications. An experimental study evaluates the viability of the approach for distributed component-based software development and the impact of applying it to a software development project. KEY WORDS: Incremental Approach; Reuse; Components; Distribution; Patterns; MVCASE tool.
1. INTRODUCTION One of the most compelling reasons for adopting component-based approaches to software development, with or without objects, is the premise of reuse. The idea is to build software from existing components primarily by assembling and replacing interoperable parts. These components range from user-interface controls such as listboxes and HTML browsers, to components for database persistence or distribution. The implications for reduced development time and improved product quality make this approach very attractive [1]. Reuse is a variety of techniques aimed at getting the most from design and implementation work. The objective is not to reinvent the same ideas every time a new product is designed, but rather to capitalize on that work and deploy it immediately in new contexts. In that way, more products can be delivered in shorter times, maintenance costs are reduced because an improvement to one piece of design work will enhance all the projects in which it is used, and quality should improve because reused components have been well tested [1, 2]. In order to make reuse effective, it must be considered in all phases of the software development process [1, 2, 3, 4]. Therefore, Component-Based Development (CBD) must offer methods, techniques and tools that support component identification and specification at the problem domain level, as well as their design and implementation in a component-oriented language. Also, CBD must exploit interrelations among existing, previously tested components, aiming to reduce the complexity and cost of software development. Components are already a reality at the implementation level [4], and now this concept must be adopted in the earlier phases of the development lifecycle. In so doing, CBD principles and concepts should be consistently applied throughout the whole development process, and consistently followed from one phase to the next.
Current CBD methods and approaches, such as those discussed in [1, 5, 6], do not fully support the concept of components. As a result, components are handled mainly at the implementation and deployment phases, instead of being the focal point throughout the complete system lifecycle. These methods are significantly influenced by their Object-Oriented origins, trying to introduce CBD concepts mainly through standard Unified Modeling Language (UML) [7] concepts and notation [8, 9].
In this context, motivated by the ideas of reuse, component-based development and distribution, this work proposes and evaluates an Incremental Approach to support Distributed Component-Based Software Development (DCBD). In previous work [10, 11, 12] we introduced the approach and described our initial experience. This paper makes two novel contributions: - The use of the approach in a complete case study. - An experimental study that evaluates the viability of the approach for distributed component-based software development and the impact of applying it to a software development project. The paper is organized as follows. Section 2 presents the main mechanisms integrated into the approach. Section 3 describes the approach, followed by a case study showing its use. Section 4 discusses testing aspects of the approach. An experimental study in which the approach was applied is described and analyzed in Section 5. Related work is considered in Section 6, and, finally, Section 7 presents some concluding remarks and directions for future work. 2. MECHANISMS OF THE APPROACH After researching different methods, techniques and tools available in the literature [1, 5, 6, 13, 14, 15, 16], it was decided to use Catalysis [1] as the component development method; middleware [17] as an additional layer between client and server; and frameworks [18] and patterns [19] to facilitate component distribution and database access. These mechanisms were integrated into an approach that guides the software engineer through the software development process. To automate the development tasks, the MVCASE tool [20, 21] was extended to support the approach, on which component construction and reuse are based. 2.1.
Catalysis Method The Catalysis [1] method is divided into three levels: Problem Domain Definition, where the emphasis is on understanding the problem and specifying “what” the system must do to solve it; Components Specification, where the system’s behavior is described in a non-ambiguous way; and Components Inner Project, which defines “how” the specified requirements will be implemented. Catalysis is based on the principles of abstraction, precision and “plug-in” components. The abstraction principle guides the developer in the search for essential aspects of the system, sparing details that are not relevant to the context of the system. The objective of the precision principle is to detect errors and inconsistencies in modeling. The “plug-in” components principle supports component reuse to construct other systems [1]. Catalysis was integrated into the first stage of the proposed approach by mapping each step of the components development to one of those levels, as shown in Table 1.

Table 1. Catalysis versus first stage of the approach.

Catalysis Method               First Stage of the Approach
Problem Domain Definition      Define Problem
Components Specification       Specify Components
Components Inner Project       Design Components
(physical implementation)      Implement Components
2.2. Middleware Middleware differs from the usual two-layer client/server model, where the client locates and communicates directly with the required server. Middleware assumes the role of an additional layer between the client and the server in multilayer and distributed-object applications. Middleware is connectivity software that consists of a set of enabling services that allow multiple processes, running on one or more machines, to interact across a network [17]. Among the types of middleware [17], we emphasize the Object Request Broker, or simply ORB, in this work. An ORB is a technology that manages the communication and exchange of data between objects. In other words, the ORB provides interoperability in distributed object systems, allowing applications to be built from groups of objects that communicate with one another through it, hiding details of programming languages, operating systems, hardware and object location.
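The broker idea behind an ORB can be conveyed with a deliberately simplified, in-process sketch: servers register named objects, and clients invoke them by logical name only, never knowing where the object lives. The class and method names below (Broker, register, invoke) are illustrative only and are not part of any CORBA API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// A minimal in-process sketch of the broker role an ORB plays:
// objects are registered under names, and clients invoke them by name.
public class Broker {
    private final Map<String, Function<String, String>> objects = new HashMap<>();

    // "Server side": register an object (modeled here as a function) under a name.
    public void register(String name, Function<String, String> obj) {
        objects.put(name, obj);
    }

    // "Client side": invoke an operation through the broker, by name only.
    public String invoke(String name, String request) {
        Function<String, String> obj = objects.get(name);
        if (obj == null) {
            throw new IllegalArgumentException("Unknown object: " + name);
        }
        return obj.apply(request);
    }

    public static void main(String[] args) {
        Broker orb = new Broker();
        orb.register("ServiceOrder", id -> "SO-" + id + ": open");
        // The client is unaware of the object's location or implementation.
        System.out.println(orb.invoke("ServiceOrder", "42")); // SO-42: open
    }
}
```

A real ORB adds marshalling, network transport, and language-independent interfaces on top of this lookup-and-dispatch core.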
Thus, the approach uses middleware as an intermediary layer to allow the distribution of components across the network and to offer an infrastructure for location and communication. 2.3. Framework and Patterns To construct software components that are more reliable and easier to maintain and use, the approach uses a pattern-based [19] framework. A framework can be considered a reusable design for a system, or subsystem, which describes how the system is decomposed into a set of interacting objects or components [18]. The use of patterns in complex software systems allows existing, previously tested solutions to be reused, making systems easier to develop and maintain. Thus, to reduce the time and cost of developing components for a problem domain, a basic component framework (Persistence) was developed for the database access area. Reusing this component framework provides a guide for accomplishing data persistence in database systems. 2.4. MVCASE Tool CASE tools have been used with success in the design and redesign of systems. Among the various CASE tools, MVCASE [20, 21] stands out. It supports system specification using UML notation and has been extended to implement the approach discussed here. MVCASE supports code generation from the system specifications. The approach deals with the distribution and reuse of components by managing their deployment in a set of repositories for further reuse. MVCASE implements a three-tier architecture [17] to construct and place the components. The three tiers allow the software engineer to separate the client applications (thin clients) from the business rules and from the database services. The proposed approach uses MVCASE to build and reuse the components of a problem domain, using a middleware.
Once the components of a domain are specified, their code can be generated in an object-oriented language, specifically Java [22], and made available in a component repository for future reuse. 3. DISTRIBUTED COMPONENT-BASED SOFTWARE DEVELOPMENT: AN INCREMENTAL APPROACH The integration of the Catalysis CBD method, the principles of middleware, the component framework (Persistence) and the Distributed Adapters Pattern (DAP) [23] into the MVCASE tool defines an Incremental Approach that supports Distributed Component-Based Software Development (DCBD). The approach is divided into two stages, as shown in Figure 1, using SADT [24] notation. In the first stage, the approach starts from the requirements of a problem domain and produces components implemented in an object-oriented language (development “for reuse”). Once implemented, these components are stored in a repository. In the next stage, using a Trading [25] mechanism offered by MVCASE, the software engineer consults the repository for the available components of a given problem domain. After identifying the necessary components, applications that reuse them are developed (development “with reuse”).
Figure 1. Incremental Approach for DCBD.
Each stage of the approach is presented in detail next. 3.1. Development of Distributed Components In this stage, which corresponds to a pattern [26], the components of a problem domain are built in four steps: Define Problem, Specify Components, Design Components and Implement Components, according to Figure 2. The first three steps correspond to the three levels of Catalysis, as shown on the right side of Figure 2. In the last step, the physical implementation of the components is done.
Figure 2. Distributed Components Development Steps. Each step of the Development of Distributed Components stage is presented in detail next. To make the inputs, outputs and controls of each step easier to understand, the Service Order problem domain of a computer company is used as an example. Applications of the Service Order (SO) domain are divided into three main modules. The first, Customers, is responsible for registering customers and notifying them about a given service order. The second, Employees, is responsible for registering employees and controlling service order tasks. The third, Reports, is responsible for generating reports for the manager, such as consultations of accomplished and pending tasks, the service orders of a given customer, and the employees responsible for each task. 3.1.1. Define Problem In the first step, emphasis is placed on understanding the problem and specifying “what” the components must do to solve it. Initially, the requirements of the domain are identified, using techniques such as storyboards or mind-maps [1], aiming to represent the different situations and scenarios of the problem domain. Next, the identified requirements are specified in Collaboration Models, representing the collections of actions and the participant objects. Finally, the Collaboration Models are refined into a Use Cases Model. The first step is summarized in Figure 3, where a mind-map produced during the Service Order domain requirements identification is specified in a Collaboration Model and later refined and partitioned into a Use Cases Model, aiming to reduce complexity and improve understanding of the problem domain.
Figure 3. First Step: Define Problem. 3.1.2. Specify Components This step supports the second level of Catalysis, where the system’s external behavior is described in a non-ambiguous way. In the CASE tool, the software engineer refines the previous step’s specifications, aiming to obtain the components’ specifications. The step begins with the refinement of the problem domain models. The Model of Types is specified, according to Figure 4, showing the attributes and operations of each object type, without worrying about implementation. Still in this step, a data dictionary can be used to specify each type found, and the Object Constraint Language (OCL) [1] to detail the objects’ behavior without ambiguity.
Figure 4. Model of Types from the Specify Components step. Once identified and specified, the types are put together in Model Frameworks. Model Frameworks are designed at a higher level of abstraction, establishing a generic scheme that can be imported at the design level, with substitutions and extensions, in order to generate specific applications [1]. Figure 5 shows this model. The fact that the Model Framework is small, and thus narrowly focused, increases its reuse potential in a well-defined application domain, the Service Order domain in this case. In addition, conceived as a Model Framework, it is a reusable asset at the design level, intended to be customized for more specific applications down to the code component level [1]. Since a design represents many of the major decisions that go into finished code, the approach can specify frameworks at the design level and offers a process to refine these frameworks down to a set of interoperable code components.
Figure 5. Service Order Model Framework.
Figure 6. Service Order Framework Application.
The types whose names are written between brackets are defined as placeholders [1]. These types can be substituted in the specific application; the concept is similar to class extensibility in the object-oriented paradigm. The framework for Service Order can be reused in several applications of the domain. Figure 6 shows the Framework Application of the Service Order domain. In this framework, the placeholder types are substituted by concrete types. Still in this step, the Use Cases Model from the last step is refined through Interaction Models, represented by sequence diagrams [7], to detail the usage scenarios of the components in different applications of the problem domain. In summary, the activities of this step, accomplished by the software engineer in the MVCASE tool, include the specification of: a) the Model of Types; b) the Model Framework; c) the Framework Application; d) the Interaction Models, represented by sequence diagrams, based on the Use Cases Model. These models are used in the next step to obtain the components’ inner project. 3.1.3. Design Components In this step, the software engineer performs the components’ inner project, according to the third level of Catalysis, and specifies other non-functional requirements, notably: distributed architecture, fault tolerance, caching, and persistence. First, the Model of Types is refined into Classes Models, where the classes are modeled with their relationships, taking into consideration the component definitions and their interfaces. Figure 7 shows a portion of the Classes Model of the Service Order domain. The previous Interaction Models, represented by sequence diagrams, are refined to show design details of method behavior in each class.
Figure 7. Classes Model obtained from Model of Types.
Starting from the Classes Model, the Distributed Adapters Pattern (DAP) [23] is used to design the Components Model, where the organization of and dependencies between components are shown. The next section presents an overview of this pattern.
3.1.3.1. Distributed Adapters Pattern (DAP) The Distributed Adapters Pattern (DAP) was developed with the purpose of refining the distribution layer of distributed architectures. It is a combination of the Facade, Adapter, and Factory design patterns [19]. Indeed, well-known patterns for structuring distributed systems already exist; the Broker [27] and Trader [27] patterns are two examples. These are architectural patterns [27] and focus mostly on fundamental distribution issues, such as marshalling and message protocols. Therefore, they are mostly tailored to the implementation of distribution platforms, such as CORBA [25]. DAP builds on these fundamental patterns and provides a higher level of abstraction, making the distribution Application Programming Interface (API) transparent to both clients and servers [20]. DAP introduces a pair of object adapters [27] to achieve better decoupling of components in distributed architectures. The adapters basically encapsulate the API that is necessary for allowing distributed or remote access to business objects. In this way, the business layer of an application becomes independent of the distribution layer, so that changes in the latter do not impact the former [23]. There are two kinds of adapters: source adapters and target adapters. Roughly, the latter wrap server business objects in the places where they are located, and the former represent those objects in remote locations. In a typical interaction, a user interface object (a GUI, for instance) on one machine requests the services of a source adapter located on the same machine. The source adapter then requests the services of a corresponding target adapter residing on a remote machine. Finally, the target adapter requests the services of a Facade object co-located with it. Figure 8 illustrates this example [23].
Figure 8. DAP Interaction Model. 3.1.3.2. Applying DAP Figure 9 shows the design Components Model after the application of DAP. The components Source and Target abstract the business rules of the problem domain. The TargetInterface interface abstracts the behavior of the Target component in a distributed scenario. Neither the Source nor the Target component contains communication code. These three elements compose a distribution-independent layer. The main components are SourceAdapter and TargetAdapter. They are connected to a specific distribution API and encapsulate the communication details. SourceAdapter is an adapter that isolates the Source component from distribution code. It is located on the same machine as Source and works as a proxy to TargetAdapter. TargetAdapter is located on another machine, isolating the Target component from distribution code. SourceAdapter and TargetAdapter are usually located on different machines and do not interact directly; TargetAdapter implements the RemoteInterface used to connect with SourceAdapter.
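The DAP roles just described can be sketched in Java. This is an illustrative, in-process sketch only: the method name listOrders and the business logic are hypothetical, and in a real deployment SourceAdapter and TargetAdapter would run on different machines, with RemoteInterface backed by a distribution API such as CORBA or RMI rather than a direct call.

```java
// Distribution-independent layer: business behavior only, no communication code.
interface TargetInterface {
    String listOrders(String customer);
}

// Target business component (server side).
class Target implements TargetInterface {
    public String listOrders(String customer) {
        return "orders of " + customer; // hypothetical business logic
    }
}

// RemoteInterface: what SourceAdapter uses to reach TargetAdapter.
interface RemoteInterface {
    String remoteCall(String customer);
}

// TargetAdapter: wraps the Target where it lives, exposing it remotely.
class TargetAdapter implements RemoteInterface {
    private final TargetInterface target = new Target();
    public String remoteCall(String customer) {
        return target.listOrders(customer); // would unmarshal the request here
    }
}

// SourceAdapter: represents the remote Target locally; the GUI sees only
// TargetInterface and never touches the distribution API.
class SourceAdapter implements TargetInterface {
    private final RemoteInterface remote;
    SourceAdapter(RemoteInterface remote) { this.remote = remote; }
    public String listOrders(String customer) {
        return remote.remoteCall(customer); // would marshal the request here
    }
}

public class DapSketch {
    public static void main(String[] args) {
        // The GUI depends only on TargetInterface, not on the distribution layer.
        TargetInterface client = new SourceAdapter(new TargetAdapter());
        System.out.println(client.listOrders("Alice")); // orders of Alice
    }
}
```

The key property the sketch illustrates is that replacing the distribution technology touches only the two adapters, never Source, Target, or the GUI.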
Figure 9. Design Components Model after applying DAP. Once the components are designed, other non-functional requirements are specified. 3.1.3.3. Other non-functional requirements The adapters presented deal with basic distribution details and hide these details from the business and user interface code. The adapters may also handle additional non-functional behavior, which likewise should not affect the business and user interface code. In this step, we illustrate how the adapters may perform some of this additional behavior, which can be useful for implementing distributed applications. i. Fault Tolerance. The source adapters presented previously have no fault-tolerant behavior. If there is a communication error or if the server is unavailable, they simply raise a communication exception. Nevertheless, source adapters can also implement fault-tolerant behavior. If a source adapter receives a remote exception when interacting with the target adapter, it may implement the policy of trying to contact the target adapter again a certain number of times, or try to contact another target adapter representing a spare service. This policy, being implemented by the source adapter, is hidden from its client, a GUI for instance. ii. Caching. Some operations may return a considerable amount of data, of which only part is useful at any moment. Sending everything to the client at once is undesirable, since it may have a negative impact on network performance. One solution is to send a cache with part of the required data and to transfer more data every time a cache fault happens. A source adapter can implement this caching behavior. When a querying operation returns many entries, part of them are used to initialize a source adapter. The client of this adapter (a GUI, for instance) retrieves the entries from this adapter. When a fault happens in the source adapter, it contacts the target adapter to retrieve more entries.
This caching behavior is implemented in the source adapter and is transparent to the GUI. iii. Data Persistence. To facilitate database access, the software engineer can reuse components of the Persistence framework [28]. Figure 10 shows these components. The ConnectionPool component, through its IConnectionPool interface, manages the connections with the database used in the application. The DriversUtil component, based on the eXtensible Markup Language (XML), holds information about the supported database drivers, available through its IDriversUtil interface. The TableManager component manages the mapping of objects into database tables, making its methods available through the ITableManager interface. The FacadePersistent component, through its IPersistentObject interface, makes available the values that must be added to the database, passing them as parameters to the TableManager component.
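The retry policy described for fault-tolerant source adapters can be sketched as follows. The names (RemoteService, FlakyTarget, MAX_RETRIES) are illustrative, and the flaky target merely simulates a communication failure; the point is that the retry loop lives inside the source adapter and is invisible to its client, the GUI.

```java
// What the source adapter needs from the (remote) target adapter.
interface RemoteService {
    String call() throws Exception;
}

// Simulates a target adapter whose first two calls fail with a communication error.
class FlakyTarget implements RemoteService {
    private int failures = 2;
    public String call() throws Exception {
        if (failures-- > 0) throw new Exception("communication failure");
        return "ok";
    }
}

public class FaultTolerantSourceAdapter {
    private static final int MAX_RETRIES = 3;
    private final RemoteService target;

    public FaultTolerantSourceAdapter(RemoteService target) {
        this.target = target;
    }

    // Retries the remote call a fixed number of times before giving up;
    // a spare target adapter could be contacted here instead.
    public String request() {
        Exception last = null;
        for (int i = 0; i < MAX_RETRIES; i++) {
            try {
                return target.call();
            } catch (Exception e) {
                last = e;
            }
        }
        throw new RuntimeException("service unavailable", last);
    }

    public static void main(String[] args) {
        FaultTolerantSourceAdapter adapter =
            new FaultTolerantSourceAdapter(new FlakyTarget());
        System.out.println(adapter.request()); // ok (succeeds on the third attempt)
    }
}
```

The caching behavior described above fits the same structure: the adapter would hold a partial result set and contact the target adapter again only on a cache fault.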
Figure 10. Persistence Framework. In summary, the main artifacts and the sequence of the Design Components activities include: a) refining the Model of Types into Classes Models; b) refining the Interaction Models; and c) creating the Components Models. 3.1.4. Implement Components In this step, the software engineer defines the distribution technology and uses the MVCASE code generator to optimize the implementation tasks for the designed components. Here, CORBA was chosen for illustration, but other technologies such as RMI [22], JAMP [29] and JINI [22] can be used. Thus, for each component, the generator produces the stubs, skeletons and interfaces that make its services available. These components are customized by the software engineer and then stored in a component repository to be reused in future application development. Figure 11 shows the code generation process for a designed component in MVCASE.
Figure 11. Code generation in the MVCASE tool. Once the components of the problem domain are constructed, the software engineer moves to the second stage of the approach, where applications that reuse those components may be developed. 3.2. Development of Distributed Applications Figure 12 shows the steps for application development. It begins with the application requirements and proceeds with the normal development life cycle, which includes: Specify Application, Design Application and Implement Application.
The constructed components of the problem domain are available for reuse in the component repository. As in components development, MVCASE is the main mechanism guiding the software engineer during application development.
Figure 12. Development of Distributed Applications. For a better understanding of these steps, an example application is developed that registers, via the web, a customer of the Service Order (SO) domain, whose components were built in the previous stage. 3.2.1. Specify Application This step starts with understanding the problem and identifying the application requirements. Before the requirements specification starts in MVCASE, the software engineer, using its Trading mechanism, imports the components of the related problem domain, in this case SO, that are available in the repository and will be used in the application. Next, the requirements are specified in Use Cases and Sequence Diagrams. 3.2.2. Design Application The specifications from the last step are refined by the software engineer to obtain the application design. In this step, the non-functional requirements related to distributed architecture and data persistence are specified. Thus, continuing the modeling process, the software engineer specifies the application component model. In this case, the ServletAddCustomer component was built so that it reuses services from the SO domain components. Figure 13 shows the three components (shaded) reused in the Add Customer application.
Figure 13. Reused components in the Add Customer application.
Figure 14 shows the components diagram of the application design, where the components DriversUtil, ConnectionPool, FacadePersistent and TableManager, from the persistence framework, were added to deal with database access.
Figure 14. Design of the distributed application. Once the application design is complete, the software engineer prepares the environment for execution. 3.2.2.1. Prepare Environment To distribute an application, at least one platform needs to be chosen, and for this platform some configuration information must be provided, such as the location of a server, port numbers and others. Figure 15 shows how the distributed application is structured, through an Object Web [25] extension model, using a middleware, e.g. CORBA, to accommodate the component repositories. In the first layer, the client executes the application. Through HTTP (HyperText Transfer Protocol) communication, requests are sent to and responses received from Servlets available on the Web Server. In the second layer, the web server communicates with the Component Server, which makes available the problem domain components stored in a repository. This communication is done via the ORB. The database and its server are located in the third layer. Communication between the Component Server and the Database Server via JDBC (Java Database Connectivity) allows access to the database services.
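The three layers just described can be made concrete with an in-process sketch. In the real architecture the layers communicate over HTTP, the ORB, and JDBC; here each layer is a plain object so that the call flow (client request, web server, component server, database) is visible at a glance. All class and method names are illustrative, not the paper's actual generated code.

```java
import java.util.HashMap;
import java.util.Map;

// Third layer: stands in for the Database Server reached via JDBC.
class DatabaseServer {
    private final Map<String, String> customers = new HashMap<>();
    void insert(String id, String name) { customers.put(id, name); }
    String select(String id) { return customers.get(id); }
}

// Second layer: stands in for a domain component on the Component Server.
class ComponentServer {
    private final DatabaseServer db;
    ComponentServer(DatabaseServer db) { this.db = db; }
    String addCustomer(String id, String name) {
        db.insert(id, name);           // via JDBC in the real architecture
        return "added " + name;
    }
}

// First/second layer boundary: stands in for ServletAddCustomer on the Web Server.
class WebServer {
    private final ComponentServer components;
    WebServer(ComponentServer components) { this.components = components; }
    String handleAddCustomer(String id, String name) {
        return components.addCustomer(id, name); // via the ORB in the real architecture
    }
}

public class ThreeTierSketch {
    public static void main(String[] args) {
        // Wire the layers together and simulate one HTTP request from the client.
        WebServer web = new WebServer(new ComponentServer(new DatabaseServer()));
        System.out.println(web.handleAddCustomer("1", "Alice")); // added Alice
    }
}
```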
Figure 15. Extended Object Web Distributed Model. 3.2.3. Implement Application Lastly, based on the application design, the software engineer uses the MVCASE code generator and produces customized adaptations. Figure 16 shows part of the code generated for the Add Customer application.
Figure 16. Implementation of the Add Customer application. 4. TESTS OF COMPONENTS AND APPLICATIONS Testing is performed at three levels: individual components, consistency within a domain, and application development reusing components. i. Within a domain, each component is tested individually by building a simple test application around the component that allows the component’s functionality to be exercised. ii. Then, tests to verify the internal consistency between the components of a domain are performed. iii. The last form of testing is at the level of applications. Applications are submitted to a long series of tests before being taken into production. The Service Order domain, constructed using the approach, was tested at the three levels presented. The tests helped to correct imperfections in the different phases of the development process, driving improvements in each step and refining the integration of the mechanisms. 5. EXPERIMENTAL STUDY In order to determine whether the approach meets its proposed goals, an experimental study was performed. This section describes the steps of the experimental study and presents how these steps were performed. The plan of the experiment follows the model proposed in [30] and the organization adopted in [31]. The definition and planning steps presented in the following sections are described in the future tense, symbolizing the precedence of the plan over its execution. 5.1. Definition of the Experimental Study Object of study: the use of the approach in a distributed component-based software development project, for the evaluation of the behavior of the project. Purpose: to identify the viability of the approach for distributed component-based software development.
Quality Focus: to obtain the benefit from the use of the proposed approach, measured by the number of models, number of classes, number of components, number of groups for the classification of the components, number of applications, development time, and the users’ difficulties regarding comprehension and use. Perspective: the study will be developed from the researchers’ point of view, by evaluating the viability of the use of the approach and pondering its continuity. Context: the distributed component-based software development, defined in a laboratory with the requirements defined by the experiment definition staff, will be based on real-world projects. The study will be conducted as multiple tests over one single object.
Thus, we will be using a notation based on the Goal Question Metric (GQM) paradigm [32], which consists of:
Analyse the use of the incremental approach in distributed component-based software development
for the purpose of characterizing the viability of its use and the development continuity
with respect to the gains and difficulties of its use
from the point of view of the researcher
in the context of distributed component-based software development.
5.2. Planning of the Experimental Study Context: the objective of this study is to evaluate the viability of using the incremental distributed component-based software development approach. The subjects of the study will be requested to act as software engineers in the same project. One group of subjects will be trained to use the approach, while the other group will use knowledge previously acquired in industry or university. Training: the training of the subjects in the approach will be conducted in a classroom and a laboratory at the University. The training will be divided into four sessions, each of approximately five hours’ duration. In the first session, MVCASE will be presented and all of its functionalities demonstrated. In the second session, the main mechanisms used in the approach will be presented. In the third session, a case study will be used to exemplify the approach. Lastly, in the fourth session, the subjects will develop a small-sized project to apply the knowledge acquired during the training. Pilot Project: before the viability study itself, a pilot project with the same structure defined in the planning will be conducted. Only two subjects will execute this project. One of them will be trained in the use of the approach; the other will develop the project using only the knowledge acquired in industry or university. Both subjects will use the same material, which is described in this paper, and will be accompanied by the responsible researcher. Thus, the pilot project will be an observation-based study, serving to detect problems and improve the planned material before its use. Subjects: the subjects of the study will be software developers. The study will be executed in an academic environment. Instrumentation: each subject will act as a software engineer, the person responsible for the development of the project.
All subjects will receive a questionnaire (QT1) about their education and experience. The trained subjects will receive a kit [33] containing a case study, including all the steps of the approach, and the MVCASE tool to execute the proposal. The kit includes a second questionnaire (QT2) for evaluating the subjects' satisfaction with the approach. Both questionnaires can be found in [34].
Criteria: the quality focus of the study demands criteria that evaluate the gain obtained by the use of the incremental distributed component-based software development approach and the difficulties of its users. The gain will be evaluated quantitatively (number of models, number of classes, number of components, number of groups for the classification of the components, number of applications, and development time). The difficulties of the users will be evaluated using qualitative data from questionnaire QT2.
Null Hypothesis, H0: this is the hypothesis that the experimenter wants to reject. In this study, the null hypothesis states that the use of the approach for distributed component-based software development does not produce benefits that justify its use. According to the selected criteria, the following null hypotheses can be defined:
H0: µ(number of models, without approach) = µ(number of models, with approach)
H0: µ(number of classes, without approach) = µ(number of classes, with approach)
H0: µ(number of components, without approach) = µ(number of components, with approach)
H0: µ(number of groups for classification of the components, without approach) = µ(number of groups for classification of the components, with approach)
H0: µ(number of applications, without approach) = µ(number of applications, with approach)
H0: µ(development time, without approach) = µ(development time, with approach)
Alternative Hypothesis: this is the hypothesis in favor of which the null hypothesis is rejected. The experimental study aims to support the alternative hypothesis, contradicting the null hypothesis.
In this study, the alternative hypothesis determines that the subjects who use the distributed component-based software development approach will have superior results compared to the subjects who use an ad-hoc approach. According to the selected criteria, the following hypotheses can be defined:
H1: µ(number of models, without approach) < µ(number of models, with approach)
H2: µ(number of classes, without approach) < µ(number of classes, with approach)
H3: µ(number of components, without approach) < µ(number of components, with approach)
H4: µ(number of groups for classification of the components, without approach) < µ(number of groups for classification of the components, with approach)
H5: µ(number of applications, without approach) < µ(number of applications, with approach)
H6: µ(development time, without approach) > µ(development time, with approach)
Independent Variables: the main independent variable is an indicator of whether or not a subject used the incremental distributed component-based software development approach. The education and experience of the subjects, collected through questionnaire QT1, are also independent variables, which can be used in the analysis for the formation of blocks.
Dependent Variables: the dependent variables are: number of models, number of classes, number of components, number of groups for the classification of the components, number of applications, and development time.
Qualitative Analysis: the qualitative analysis aims to evaluate the difficulty of applying the proposed approach and the quality of the material used in the study. This analysis will be made through questionnaire QT2. This questionnaire is very important because it will allow us to evaluate the difficulties the subjects have with the use of the approach, to evaluate the provided material and the training material, and to enhance these artifacts in order to replicate the experiment in the future.
Random Selection: ideally, the subjects should be selected randomly from the set of candidates. The only restriction is that they must be software developers.
Balancing: during the execution of the study, the subjects will be divided into two groups of equal size (one that will use the incremental approach and one that will use an ad-hoc approach).
Internal Validity of the Study: the internal validity of the study is defined as the capacity of a new study to repeat the behavior of the current study, with the same subjects and the same object. The internal validity of this study depends on the number of subjects: the study is expected to have at least eight subjects to guarantee good internal validity, and obviously, the more subjects in the study, the better the internal validity. An issue that could influence the results of the study is the exchange of information among the subjects. In order to avoid this problem, the subjects will be told not to exchange information regarding the project.
External Validity of the Study: the external validity of the study measures its capacity for generalization [30], that is, the capability to repeat the same study in research groups other than the one in which it was applied. The external validity of this study is considered sufficient, since it aims only to evaluate the viability of applying the incremental distributed component-based software development approach. Once viability is shown, new studies can be planned in order to refine and enhance the approach.
Construct Validity of the Study: construct validity refers to the relation between the theory that is to be tested and the instruments and subjects of the study [30]. In this study, a relatively well-known and easily understandable problem domain was chosen, so that subjects experienced in a particular domain could not take advantage of that experience. This choice prevents previous experience from distorting the interpretation of the impact of the proposed approach.
Conclusion Validity of the Study: conclusion validity measures the relation between the treatment and the result, and determines the capability of the study to generate conclusions [30]. Conclusions will be drawn using a parametric statistical analysis procedure based on Student's t-distribution [30].
5.3. The Project used in the Experimental Study
The project used in the experimental study was the development of components and applications for an e-commerce domain, restricted to buying and selling operations through the Internet, as shown in Figure 17.
The New Time bookstore, specialized in mathematics, computer science, physics and chemistry books, wants its process of buying and selling books through the Internet to be automated. The bookstore needs software that provides the following functionality:
- Registration, alteration, exclusion and query of its customers;
- Registration, alteration, exclusion and query of its books, with the respective associated genres;
- Registration, alteration, exclusion and query of its suppliers;
- Purchase of books by previously registered customers. If the bookstore does not have the book in stock, a request is made to the suppliers and the purchase is set to waiting until there is a notification from the suppliers. Once the notification arrives, the purchase is completed;
- Emission of reports for the management, informing the books sold and the respective quantities;
- Emission of the purchase history of a given customer.
Figure 17. Description of the Development Project.
5.4. Instantiation of the Experimental Study
Selection of the Subjects: for the execution of the study, undergraduate students (Computer Science and Computer Engineering) and graduate (MSc) students (from the Distributed Systems Group) at the Federal University of São Carlos, Brazil, were selected. The subjects met the criterion previously stated, that is, they are software developers. The selected subjects represent a non-random subset of the universe of students at the university.
Randomization: the selection of the subjects for the study was not random, since the availability of the students was considered. Once the students were selected, the undergraduate students were assigned the incremental approach to accomplish the task, while the graduate students from the Distributed Systems Group would make use of their past experience from academia and industry.
Analysis Mechanisms: to evaluate the hypotheses of the study, mechanisms of descriptive statistics (e.g. the mean) will be used.
5.5. Execution of the Experimental Study
Realization: the experimental study was conducted as part of an undergraduate and an MSc course in Distributed Systems, during the second semester of 2002, at the Federal University of São Carlos. The time stipulated for execution was one month.
Training: the subjects who used the approach in the study were trained before the study began. The training comprised 20 hours, divided into four sessions of five hours each, for the undergraduate course.
Subjects: the subjects were the undergraduate students and the MSc students: six MSc students from the Distributed Systems Group and two undergraduate students. Five had development experience (from academia or industry), while the others had little or no experience. Table 2 presents an overview of the subjects' education and experience. Table 2.
Profile of the subjects of the Experimental Study.

ID | Used the Approach | Education   | Development Experience
1  | Yes               | MSc Student | University (3 projects)
2  | Yes               | MSc Student | Industry (3 projects)
3  | Yes               | BSc Student | University (2 projects)
4  | Yes               | BSc Student | University (1 project)
5  | No                | MSc Student | University (1 project)
6  | No                | MSc Student | University (3 projects)
7  | No                | MSc Student | Industry (3 projects)
8  | No                | MSc Student | University (1 project)
Study Cost: since the subjects of the experimental study were students from the Federal University of São Carlos, and the required equipment comprised the computers from the university laboratories, the cost of this project was basically the planning of the experiment.
Planning Cost: planning is, by far, the most costly step in the realization of an experimental project. The planning of the viability study of the DCBD approach took 78 days, from October to December 2002. In this period, three versions of the experimental plan were released.
5.6. Analysis of the Results of the Experimental Study
Training Evaluation: the training was applied to the subjects who used the incremental approach in the experiment. The training took twenty hours, as previously stated, and was delivered as a slide presentation. Upon completion, a small-sized use case example was given for practising the approach. One subject (ID = 4) requested further training, due to the variety of mechanisms in the approach.
Quantitative Evaluation: the quantitative analysis was divided into six independent parts: number of models, number of classes, number of components, number of applications, development time, and number of groups for the classification of the components. Each was performed using descriptive statistics.
Descriptive Statistics: once the necessary information was collected, the analysis could be performed. Firstly, the data from the subjects of the experiment were grouped. Tables 3 and 4 present the data from each subject using the incremental approach and an ad-hoc approach, respectively.
Table 3. Data from each subject using the incremental approach.

ID | Models | Classes | Components | Applications | Time    | Groups
1  | 17     | 21      | 15         | 19           | 12h 00m | 2
2  | 17     | 20      | 15         | 19           | 9h 26m  | 2
3  | 17     | 20      | 15         | 19           | 20h 46m | 2
4  | 17     | 20      | 15         | 19           | 20h 52m | 2
Table 4. Data from each subject using an ad-hoc approach.

ID | Models | Classes | Components | Applications | Time    | Groups
5  | 0      | 3       | 3          | 19           | 27h 15m | 1
6  | 0      | 3       | 3          | 19           | 26h 00m | 1
7  | 0      | 3       | 3          | 19           | 13h 50m | 1
8  | 2      | 3       | 3          | 19           | 41h 00m | 1
After grouping the data, the mean was computed for both datasets, as shown in Table 5. The standard deviation was not considered due to the small number of subjects, which would make it unreliable.
Table 5. Mean of the data from the two groups.

Mean                 | Models | Classes | Components | Applications | Time    | Groups
Incremental Approach | 17     | 20.25   | 15         | 19           | 15h 46m | 2
Ad-Hoc               | 0.5    | 3       | 3          | 19           | 27h 12m | 1
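As a sketch of the t-distribution-based analysis mentioned in Section 5.2, the development-time means can be recomputed from Tables 3 and 4 and a two-sample t statistic derived. This is only an illustrative reconstruction, not the authors' actual analysis: it assumes the garbled ad-hoc time entry reads 27h 15m, so the recomputed ad-hoc mean (27h 01m) differs slightly from the 27h 12m reported in Table 5.

```python
import math
from statistics import mean, variance

# Development times in minutes, taken from Tables 3 and 4.
incremental = [12 * 60, 9 * 60 + 26, 20 * 60 + 46, 20 * 60 + 52]
# Assumes the garbled entry for subject 5 is 27h 15m.
ad_hoc = [27 * 60 + 15, 26 * 60, 13 * 60 + 50, 41 * 60]

def welch_t(a, b):
    """Two-sample t statistic with unequal variances (Welch)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

print(f"incremental mean: {mean(incremental) // 60:.0f}h {mean(incremental) % 60:.0f}m")  # 15h 46m
print(f"ad-hoc mean:      {mean(ad_hoc) // 60:.0f}h {mean(ad_hoc) % 60:.0f}m")
print(f"t = {welch_t(incremental, ad_hoc):.2f}")
```

With only four subjects per group, |t| stays below the usual critical values, which is consistent with the paper's remark that the statistical tests were not conclusive.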
i. Models: The mean number of models (Mind-Maps, Collaboration Model, Use Case Model, Types Model, Model Framework, Framework Application Model, Classes Model, and Component Model) specified by the subjects who used the incremental approach (17) was far superior to that of the subjects who used an ad-hoc approach (0.5). This is due to the progressive development of these artifacts at each step of the approach. Among the subjects who used an ad-hoc approach, only one (ID = 8) specified the Use Case and Classes Models, even though he/she was one of the least experienced developers. The other ad-hoc subjects went directly to implementation decisions, ignoring the analysis tasks.
ii. Classes and Components: The mean for the subjects who used the incremental approach was also superior in the number of classes and components. Regarding the classes, a problem could be detected in the ad-hoc approach: the specified classes (3) were highly intertwined with code serving distinct purposes, such as code for distribution, exception handling, and database access, which makes it hard to distinguish analysis classes from design classes. With the incremental approach, on the other hand, the classes (approximately 20) had well-defined roles, because of the use of the DAP pattern for the business-rule and distribution classes and of the Persistence framework for the database access classes. The same analysis applies to the components.
iii. Applications and Development Time: Both groups had the same mean number of applications (19) reusing the components.
However, despite the number of steps to be followed in the incremental approach (four for the development of components and three for the development of applications), the mean time with the incremental approach (approximately 15 hours and 46 minutes) was far lower than with the ad-hoc approach (approximately 27 hours and 12 minutes), even taking into account the greater knowledge and experience of the ad-hoc subjects. This observation can be explained by the guidance given to the subjects on how to proceed with issues related to requirements specification, component design, distribution, and database access, and on how to develop applications reusing the components.
iv. Groups for the classification of the components: With the incremental approach it is possible to obtain, at the end of the development process, a clear distinction between two types of components: business components and infrastructure components (database access and distribution components), which makes maintenance tasks easier. On the other hand, due to the code intertwining in the ad-hoc approach, it turned out to be difficult to classify the components according to any criterion; thus, the components were classified into a single category: business components.
Conclusion: Even though the statistical tests were not conclusive, they indicated that the distributed component-based software development approach can be a viable aid for software engineers. However, besides further studies such as the one described in this section, improvements to facilitate the use of the proposed approach are necessary. Next, a qualitative evaluation that points to new research directions and refinements of the approach is described.
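Point ii above attributes the cleaner class structure to the DAP pattern, which keeps business-rule classes separate from distribution classes. The pattern itself is defined in [23]; the following is only a rough sketch of the underlying idea, with all names hypothetical and a trivial in-process call channel standing in for a real distribution mechanism such as RMI or CORBA.

```python
class BookStore:
    """Business class: pure domain logic, no distribution code."""
    def __init__(self):
        self._books = {}

    def register_book(self, isbn, title):
        self._books[isbn] = title
        return title


class ServerAdapter:
    """Server-side adapter: translates generic remote requests into
    calls on the business object, isolating it from the wire format."""
    def __init__(self, business):
        self._business = business

    def handle(self, operation, *args):
        return getattr(self._business, operation)(*args)


class ClientAdapter:
    """Client-side adapter: offers the business interface locally and
    forwards each call through the (here, in-process) channel."""
    def __init__(self, channel):
        self._channel = channel

    def register_book(self, isbn, title):
        return self._channel.handle("register_book", isbn, title)


# The application codes against ClientAdapter exactly as it would
# against BookStore; only the adapters know about distribution.
store = ClientAdapter(ServerAdapter(BookStore()))
print(store.register_book("85-123", "Calculus I"))  # prints "Calculus I"
```

Because the distribution concern lives entirely in the adapter pair, the business classes remain easy to identify and classify, which matches the separation observed in the experiment.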
Qualitative Evaluation: The results of the qualitative analysis of the usefulness of the proposed approach and of the quality of the material of the experimental study are presented next. This analysis is based on the answers to questionnaire QT2, which is presented in [34].
Usefulness of the Approach: All four subjects who used the incremental approach indicated that it was useful for the conclusion of the project. One subject (ID = 4) pointed out that the classes with the DAP pattern structure should already be available at the Component Design step, to avoid their repeated creation (by providing a template for reuse).
Quality of the Material: One of the subjects pointed out that the training was not enough for the application of the approach, and other subjects also asked for more training, especially the development of a complete project instead of a single use case. Furthermore, three subjects requested improvements to the Training Kit, particularly in the components and applications implementation step.
5.7. Lessons Learned
In an eventual replication of the experiment, some limitations of this first execution must be considered.
Training Improvement: It is suggested that a complete project be developed, step by step, by the subjects before the experiment itself. It is also suggested that more details be included about the implementation and configuration of components and applications, which were not totally clear in the first version presented here.
Instrumentation: During the project development, some subjects encountered problems with the MVCASE tool. Thus, in an eventual replication of the experiment, it is suggested that a larger pilot project be executed, in order to avoid problems during the experiment. Lastly, two subjects suggested on-line help for the tool, to aid the subjects in performing some tasks.
6. RELATED WORK
There are similar works in the literature. In this section, we present some of them, emphasizing their differences from and similarities to the proposed approach.
i. In [29], the authors describe an object-oriented distributed systems development strategy divided into three steps: Distributed System Specification, where the system is modeled using UML notation; Object Distribution, which defines the architecture of the distributed system based on the services and frameworks available in the JAMP (Java Architecture for Media Processing) platform; and Distributed System Implementation, where the implementation is realized based on the specifications of the previous steps. This strategy differs from the proposed approach in relying on the frameworks available in the JAMP platform, on which the whole strategy is supported. Furthermore, it works with objects, not components, decreasing reuse in application development.
ii. In [35], a strategy is presented for distributed component-based systems development that uses Aspect-Oriented Programming (AOP) to describe and implement the dependencies between components. The authors use the component-based system modeling process proposed in [13], extended with AOP at the interface and component specification and component implementation levels. Compared to [35], our work defines an incremental approach for distributed component-based software development that starts at the domain conception level, identifying its elements and relationships, and proceeds to their concrete materialization as software components. Moreover, the incremental approach allows not just code reuse but also the reuse of models at a high abstraction level (Model Frameworks). Furthermore, the approach systematically defines the development of applications reusing the developed components.
iii.
Rational Rose [15] and Objecteering/UML [16] are CASE tools that offer a great range of resources to support the whole distributed component-based software development process. These tools support system specification using UML notation and generate code from these specifications. The MVCASE tool differs from these tools in having an integrated component repository, allowing the storage of and search for components for later use in application development. Another resource available in the tool is a visual editor like the ones found in environments such as JBuilder, allowing quick application development. Lastly, being an academic tool, MVCASE is totally free.
7. CONCLUSION AND FUTURE WORK
The main contribution of this work is to propose an Incremental Approach for Distributed Component-Based Software Development that integrates several mechanisms, guiding the software engineer both in the development and in the reuse of components of a problem domain. The integration of these mechanisms in the MVCASE tool makes possible the development of components and applications on a distributed platform. Although these mechanisms exist in the literature, there is still a lack of methods and tools that integrate them and assist the software engineer in distributed component-based software development, automating part of its tasks. Additionally, we present the definition, planning, execution, and packaging of an experimental study that evaluates the viability of an incremental approach to aid distributed component-based software development. The qualitative and quantitative analyses, realized as part of the study, provide evidence that the subjects who used the incremental approach obtained better results than those who used an ad-hoc approach.
Another contribution of this work is the extension of the Object Web distribution model to address the use of a component server with its repositories, allowing greater reuse by applications, which is not treated in the original model. In future work, two directions are being researched for integration into the approach. The first is the use of Aspect-Oriented Programming (AOP) [36] to handle distribution and database persistence, as well as the insertion of other functions, such as concurrency control and security. The second direction involves the use of COTS components. Lastly, we intend to gather experience with other users of the approach and perform further experiments to verify whether the approach can be successfully applied to larger projects. There is also a need to investigate the performance of the distributed components and, finally, to explore their use on a larger scale.
ACKNOWLEDGEMENTS
The authors would like to thank the anonymous referees for their reviews and suggestions during this work. This work is supported by Fundação de Amparo à Pesquisa do Estado da Bahia (Fapesb), Brazil.
REFERENCES
1. D'Souza DF, Wills AC. Objects, Components, and Frameworks with UML: The Catalysis Approach. Addison-Wesley, USA, 1999.
2. Jacobson I, Griss M, Jonsson P. Software Reuse: Architecture, Process and Organization for Business Success. Addison-Wesley Longman, 1997.
3. Heineman GT, Councill WT. Component-Based Software Engineering: Putting the Pieces Together. Addison-Wesley, USA, 2001.
4. Szyperski C. Component Software: Beyond Object-Oriented Programming. Addison-Wesley, USA, 1998.
5. Jacobson I, et al. The Unified Software Development Process. Addison-Wesley, USA, 4th edition, 2001.
6. Perspective. Select Perspective: Princeton Softech's practical methodology for delivering next generation applications. The Active Archive Solutions Company, 2000. Accessed 10/06/2002, URL: http://www.princetonsoftech.com.
7. Rumbaugh J, et al. The Unified Modeling Language Reference Manual. Addison-Wesley, USA, 1998.
8. Stojanovic Z, Dahanayake A, Sol H. A Methodology Framework for Component-Based System Development Support. In EMMSAD'2001, Sixth CAiSE/IFIP8.1 International Workshop on Evaluation of Modeling Methods in Systems Analysis and Design.
9. Boertin N, Steen M, Jonkers H. Evaluation of Component-Based Development Methods. In EMMSAD'2001, Sixth CAiSE/IFIP8.1 International Workshop on Evaluation of Modeling Methods in Systems Analysis and Design.
10. Almeida ES, Bianchini CP, Prado AF, Trevelin LC. Distributed Component-Based Software Development Strategy. In PhDOOS'2002, The 12th Workshop for PhD Students on Object-Oriented Systems, Lecture Notes in Computer Science (LNCS), Springer-Verlag. In conjunction with the 16th European Conference on Object-Oriented Programming (ECOOP).
11. Almeida ES, Bianchini CP, Prado AF, Trevelin LC. Distributed Component-Based Software Development Strategy Integrated by MVCase Tool. In The Second Ibero-American Symposium on Software Engineering and Knowledge Engineering, 2002.
12. Almeida ES, Bianchini CP, Prado AF, Trevelin LC.
IPM: An Incremental Process Model for Distributed Component-Based Software Development. In The 5th International Conference on Enterprise Information Systems (ICEIS), ACM Press, Angers, France, 2003. To appear.
13. Cheesman J, Daniels J. UML Components: A Simple Process for Specifying Component-Based Software. Addison-Wesley, USA, 1st edition, 2000.
14. Atkinson C, et al. Component-Based Software Engineering: The KobrA Approach. In ICSE'2000, 22nd International Conference on Software Engineering, 3rd Workshop on Component-Based Software Engineering. ACM Press.
15. Rational Rose Tool. Rational, the software development company. Accessed 10/07/2001, URL: http://www.rational.com.
16. Objecteering/UML Tool. Objecteering Software. Accessed 10/07/2002, URL: http://www.objecteering.com.
17. Eckerson W, et al. Three Tier Client/Server Architecture: Achieving Scalability, Performance, and Efficiency in Client/Server Applications. Open Information Systems, 1995.
18. Johnson R. How To Develop Frameworks. In ECOOP'1996, 10th European Conference on Object-Oriented Programming, Tutorial Notes, 1996.
19. Gamma E, et al. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.
20. Almeida ES, Bianchini CP, Prado AF, Trevelin LC. MVCase: An Integrating Technologies Tool for Distributed Component-Based Software Development. In APNOMS'2002, The Asia-Pacific Network Operations and Management Symposium, Poster Session. Proceedings of IEEE.
21. Almeida ES, Lucrédio D, Bianchini CP, Prado AF, Trevelin LC. MVCase Tool: An Integrating Technologies Tool for Distributed Component Development (in Portuguese). In SBES'2002, 16th Brazilian Symposium on Software Engineering, Tools Session.
22. Horstmann CS, Cornell G. Core Java 2, Volume II: Advanced Features. Prentice Hall, 2002.
23. Alves V, Borba P. Distributed Adapters Pattern (DAP): A Design Pattern for Object-Oriented Distributed Applications.
In SugarLoafPLoP'2001, The First Latin American Conference on Pattern Languages of Programming.
24. Ross DT. Structured Analysis (SA): A Language for Communicating Ideas. IEEE Transactions on Software Engineering, 1977.
25. Orfali R, Harkey D. Client/Server Programming with Java and CORBA. John Wiley & Sons, Second Edition, 1998.
26. Almeida ES, Bianchini CP, Prado AF, Trevelin LC. DCDP: A Distributed Component Development Pattern. In SugarLoafPLoP'2002, The Second Latin American Conference on Pattern Languages of Programming, 2002.
27. Buschmann F, et al. Pattern-Oriented Software Architecture: A System of Patterns. John Wiley & Sons, 1996.
28. Yoder J, Johnson RE, Wilson QD. Connecting Business Objects to Relational Databases. In PLoP'1998, Pattern Languages of Programming.
29. Guimarães MP, Prado AF, Trevelin LC. Development of Object Oriented Distributed Systems (DOODS) using Frameworks of the JAMP Platform. In First Workshop on Web Engineering, in conjunction with the 19th International Conference on Software Engineering (ICSE), 1999.
30. Wohlin C, Runeson P, Höst M, Ohlsson C, Regnell B, Wesslén A. Experimentation in Software Engineering: An Introduction. Kluwer Academic Publishers, Norwell, 2000.
31. Barros MO, Werner CML, Travassos GH. An Experimental Study about Modelling Use and Simulation in Support of Software Project Management (in Portuguese). In the 16th Brazilian Symposium on Software Engineering, Gramado, Brazil, 2002.
32. Basili VR, Selby R, Hutchens D. Experimentation in Software Engineering. IEEE Transactions on Software Engineering, July 1986.
33. Almeida ES. Distributed Component-Based Software Development Strategy (in Portuguese). Tutorial Notes, Computing Department, Federal University of São Carlos, 2003.
34. Almeida ES. Distributed Component-Based Software Development Strategy (in Portuguese). Master's Thesis, Computing Department, Federal University of São Carlos, 2003.
35. Clement PJ, Sánchez F, Pérez MA. Modeling with UML Component-Based and Aspect-Oriented Programming Systems. In WCOP'2002, The 7th Workshop on Component-Oriented Programming. In conjunction with the 16th European Conference on Object-Oriented Programming (ECOOP).
36. Kiczales G, et al. Aspect-Oriented Programming (AOP). In ECOOP'1997, The 11th European Conference on Object-Oriented Programming. LNCS, Springer-Verlag.
37. Almeida ES. Service Order Domain Construction Report (in Portuguese). Federal University of São Carlos, 2002.
38. Emmerich W. Distributed Component Technologies and their Software Engineering Implications. In ICSE'2002, 24th International Conference on Software Engineering. ACM Press.
39. Ommering RV. Building Product Populations with Software Components. In ICSE'2002, 24th International Conference on Software Engineering. ACM Press.