SEC: A Search Engine for Component Based Software Development

Sofien KHEMAKHEM

Khalil DRIRA

Mohamed JMAIEL

ISET of Sfax DEP. of computer science P.B. 88A-3099 Sfax, Tunisia

LAAS-CNRS 7, Avenue du Colonel Roche 31077 Toulouse Cedex 4

University of Sfax Laboratory ReDCAD P.B. 3038 Sfax, Tunisia


ABSTRACT

The success of the component-based development (CBD) process relies on several factors, including the structure of the component repositories and the procedures used to compare the expected and the provided services when exploring component interfaces. Both functional and non-functional aspects should be considered. This paper presents a discovery ontology to organize components in a repository and an integration ontology to integrate a component into an application. In addition, we propose a search engine for CBD, called SEC, which uses the discovery ontology to automatically locate and present a list of software components that could be used in the current development situation. This search engine consists of a persistent, intelligent component which automatically generates a query from the developer's specification and indexes a repository of software components. SEC is not only suitable for discovering components, but is also able to automatically classify the selected components using the subsumption notion.

Categories and Subject Descriptors
D.2.13 [Software Engineering]: Reusable Software - Reusable libraries

Keywords
Software component, discover, ontology, non-functional aspect

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SAC'06 April 23-27, 2006, Dijon, France. Copyright 2006 ACM 1-59593-108-2/06/0004 ...$5.00.

1. INTRODUCTION

Component-based and service-oriented software architectures are likely to become widely used technologies in future distributed system development. Component reuse is a critical requirement for the development process of component-based and service-oriented software.

Components are developed as large, autonomous and customizable software units. The success of reuse is important and depends on the efficiency of the search procedure. The search step will become an important step in the development process. It may fail if the explored component repositories are not appropriately structured, or if the required service and the provided service are not appropriately compared.

During the development process, the developer is faced with a significant number of various component types. A software component repository with a clear structure is crucial for the effectiveness of the CBD approach: it allows the developer to easily search for and select the component which meets his/her needs. Several approaches have tried to improve software component classification in order to simplify the request specification and to obtain better results. In existing work, we distinguish five software component classification approaches.

The first approach, the ad hoc technique, also called the behavioral technique, is based on the exploitation of the results produced by executing the component. These results are collections of answers which describe the dynamic behavior of the component [24], [23], [3]. A relation of a behavioral nature must be used to classify the software components.

The second approach is based on the semantic characteristics of the component, each presented as a pair (attribute, value) [21], [25]. It represents the functional aspects of the software components. The identification of these characteristics and the component classification procedure are fixed and verified by a domain expert. The similarity between two components is measured by the number of common characteristics. The search is based on a syntactic comparison of the sets of characteristics.

The third approach uses facet classification. The facet classification approaches [5], [20], [26] represent the type of information used to describe the software components. Each facet has a name which identifies it and a collection of well-controlled terms, known as a vocabulary, to describe its aspects. For example, the facet component-type can have the following values: COM, ActiveX, javabean, etc. In the search procedure, the user query is specified by selecting a term for each facet.

The fourth classification technique is based on the lattice. The concept of lattice was initially defined by R. Wille [27]. It is the representation of a relation, R, between a collection of objects G (Gegenstände) and a collection of attributes M (Merkmale). The triplet (G, M, R) is called a concept. Wille considers each element of the lattice as a formal concept and the graph (Hasse diagram) as a relation of generalisation/specialisation. The lattice is seen as a hierarchy of concepts. Each concept is seen as a pair (E, I), where E is a subset of the application's instances (the extent) and I is the intent, representing the properties common to these instances. Ganter and Wille [11] and Davey and Priestley [6] apply the lattice technique to establish the relation between objects and their attributes.

The last approach applies the notion of ontology to describe and organize the components. An ontology is defined by Gruber as an explicit specification of a conceptualization, or as a formal explicit description of concepts (sometimes called classes) in a domain of discourse [18]. The properties of each concept, also called slots or roles, describe its characteristics and attributes. The restrictions applied to the slots are called facets. The objects of the classes constitute the knowledge base.

In the behavior-based technique, the comparison is made between the specified behavior and the behavior of each component. The search procedure therefore becomes very slow for a repository containing a significant number of components. In the facet technique, it is difficult to specify the query and to combine the right terms to describe the task. Moreover, this technique requires an understanding of the repository structure, of the terms, and of the meaning of each facet [4]. Software component classification problems can also appear when the component has many states: component behavior depends on its current state, which multiplies the possibilities for its classification. These problems do not arise in the ontology-based technique. In addition, the latter facilitates the merging of repositories that share the same ontology [9], as well as the insertion of components. This is not the case for the facet technique, where two repositories are merged manually by adding components one by one from one repository to the other.
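The pair (E, I) used by the lattice technique above can be illustrated with a minimal Python sketch. The context below (component names and attributes) is hypothetical and chosen only for illustration; the helper names `intent`, `extent` and `is_formal_concept` are assumptions, not part of the paper.

```python
# Hypothetical formal context (G, M, R): which component has which attribute.
# Component and attribute names are illustrative, not taken from the paper.
CONTEXT = {
    "MailSender": {"activex", "secure", "network"},
    "ChatClient": {"activex", "network"},
    "Logger": {"javabean"},
}


def intent(objects):
    """Return the attributes shared by every object in `objects`."""
    sets = [CONTEXT[o] for o in objects]
    if not sets:
        # By convention, the empty object set shares all attributes.
        return set().union(*CONTEXT.values())
    return set.intersection(*sets)


def extent(attributes):
    """Return the objects that have every attribute in `attributes`."""
    return {o for o, attrs in CONTEXT.items() if set(attributes) <= attrs}


def is_formal_concept(objects, attributes):
    """(E, I) is a formal concept when E and I determine each other."""
    return (extent(attributes) == set(objects)
            and intent(objects) == set(attributes))
```

In this toy context, the pair ({MailSender, ChatClient}, {activex, network}) is a formal concept, whereas ({ChatClient}, {activex, network}) is not, because MailSender also has both attributes.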
Current methods for component selection are based on one of the software component classification approaches presented above, which help the developer to discover the component that meets his/her needs. The majority of these approaches emphasize the functional aspect, while the non-functional aspect is generally neglected. In this paper we aim to define a unified and complete approach to improve the reuse of software components in CBD. Our approach encompasses the functional and the non-functional aspects at different stages: description, discovery and integration. To achieve our objective we develop two ontologies, one for component discovery and one for component integration. The first ontology describes the functional and the non-functional aspects of a component. It is used by the search engine SEC in order to select the suitable component for CBD. SEC is itself a software component that can be integrated in several development environments, such as C++, VB, Delphi and Visual Java. The second ontology describes the component's internal structure to facilitate its integration into the current work after its selection.

This paper is organized as follows: section 2 presents the ontology support for component description. Section 3 presents the design and the implementation of the SEC engine. In the conclusion, we suggest some perspectives related to this study.

2. ONTOLOGY SUPPORT FOR COMPONENT DESCRIPTION

Describing the semantics of components means expressing knowledge about the behavior, the functional aspect and the non-functional aspect of a component or component process. This knowledge comprises:

- Structural features: they specify the component's internal structure. The developer uses these features to determine whether interaction exists between the component's operations and the other components used to build the current project.

- Functional aspect: it identifies the functionalities the component is expected to provide, through many features. Some of these features are methods used to adapt the behavior of the component to its context; the adaptation is made by specializing and customizing. The other kind of features are used by the application-specific part of a component-based software. Generally this type of information is specified by the component's methods.

- Non-functional aspect: it specifies the component's constraints related to communication or computation. The non-functional aspect includes features such as performance, availability, reliability, security, adaptability and dependability. We distinguish static and dynamic categories of non-functional features. Static features, such as security-related constraints, do not change during component execution. Dynamic features, such as performance-related properties, depend on the deployment environment.

All of these features represent different and complementary views of a component. The feature set used to describe a component depends on the developer's action: discovery or integration. The discovery of a component is made by sending a query to the repository manager. Once a set of components has been selected, additional features are specified to select a component before integration. For the discovery action, the query must include functional and/or non-functional features. For the integration action, the structural features have to be specified.
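The rule stated above (discovery needs functional and/or non-functional features, integration needs structural features) can be sketched as a small check. This is only an illustration: the feature-kind labels and the function name `is_valid_request` are assumptions introduced here, not names used by SEC.

```python
# Feature kinds are hypothetical labels for the three aspects described above.
DISCOVERY_KINDS = {"functional", "non_functional"}
INTEGRATION_KINDS = {"structural"}


def is_valid_request(action, feature_kinds):
    """Check that a request carries the feature kinds its action requires."""
    kinds = set(feature_kinds)
    if action == "discovery":
        # Functional and/or non-functional features: at least one is needed.
        return bool(kinds & DISCOVERY_KINDS)
    if action == "integration":
        # Structural features have to be specified.
        return INTEGRATION_KINDS <= kinds
    raise ValueError(f"unknown action: {action!r}")
```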
Our approach is based on the following ontologies:
• The discovery ontology: it specifies the functional and non-functional features.
• The integration ontology: it describes the PSMs (Problem-Solving Methods) used to specify a component's structural features.

The purpose of the integration ontology is to decouple a component's functional features from its internal specification. Many approaches have proposed process-based languages such as BPEL (Business Process Execution Language) [2] and OWL [7]. These languages describe a component's internal structure using a predefined set of workflow-like patterns (sequence, parallel split, choice, etc.). However, these languages lack an explicit, declarative decoupling between a component's functional features (what) and its structural description (how) [10]. In the integration ontology we try to divide the process or component into tasks. Tasks are either solved directly (by means of primitive methods), or are decomposed into subtasks (by means of decomposition methods) whose interaction can be modelled as a workflow pattern [1]. As a result we obtain a set of PSMs which are mapped to knowledge elements. We use the Unified Problem-Solving Method Language (UPML) [8] to describe the components of PSMs [15] (task, method and adapter). In our present approach, we suppose that such PSMs are provided by component suppliers. In future work, we will address the problem of deducing such descriptions from component source code.

Figure 1: Discovery and integration ontologies

The discovery ontology describes the subject matter using the notions of concepts, instances, relations, and axioms [12].

- Concepts are organized in taxonomies through which inheritance mechanisms can be applied. A concept contains slots which are restricted by facets. In our case we specify a component as a concept that contains many slots, such as the performance slot, which describes the component's performance by one of three values (low, medium, high).

- Relations represent a type of interaction between concepts. They are formally defined as any subset of a product of n sets: R: C1 × C2 × ... × Cn. Examples of binary relations include subclass-of and connected-to. For example, the relation between component and method is a connected-to relation. The concept method contains the input types, output type, precondition and signature slots.

- Functions are a special case of relations in which the n-th element of the relationship is unique for the n-1 preceding elements. Formally, functions are defined as: F: C1 × C2 × ... × Cn-1 → Cn. Examples of functions are father-of and rank-of-a-component, which calculates the rank of a component depending on the "used rate" and the goodness of match. The "used rate" measures how often the component is used [14]. The goodness of match is related to the subsumption notion [17]. The subsumption idea has to be related to suitable matching notions between the provided components and the query specification in order to refine the selection result.

- Instances are used to represent elements.

The discovery ontology is developed with Protégé-2000 [13] and mapped through the Resource Description Framework, an XML-based language [16], which also represents the component descriptions in the repository.
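The notions of concept, slot, facet and taxonomy described above can be pictured with a minimal Python sketch. It is not the Protégé model itself: the classes `Slot` and `Concept`, and the restriction of a facet to an enumerated value set, are simplifying assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    name: str
    allowed_values: frozenset  # facet: a restriction on the slot's values


@dataclass
class Concept:
    name: str
    slots: dict = field(default_factory=dict)
    parent: "Concept | None" = None  # taxonomy link used for inheritance

    def add_slot(self, slot):
        self.slots[slot.name] = slot

    def all_slots(self):
        """Own slots plus those inherited from ancestor concepts."""
        inherited = self.parent.all_slots() if self.parent else {}
        return {**inherited, **self.slots}

    def accepts(self, instance):
        """True if every slot value of `instance` satisfies its facet."""
        slots = self.all_slots()
        return all(
            key in slots and value in slots[key].allowed_values
            for key, value in instance.items()
        )


# The performance slot from the text, restricted to three values.
component = Concept("Component")
component.add_slot(Slot("performance", frozenset({"low", "medium", "high"})))

# A hypothetical sub-concept that inherits the performance slot.
activex = Concept("ActiveXComponent", parent=component)
```

With these definitions, `activex.accepts({"performance": "high"})` holds through inheritance, while a value outside the facet, such as `"fast"`, is rejected.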
We have chosen XML because of its expressive power, which allows complex information structures to be represented easily. We use the emerging XML Schema standard as a metalanguage for specifying the syntax and structure of component descriptions. The XML schema consists of a set of core schema type definitions. Our XML document generally contains the following key concepts: Component, Interface, Dynamic NF Aspect, which represents the component's dynamic aspect, and Static NF Aspect, which represents the component's static aspect. Each concept contains many slots which describe component features. A slot generally contains the following key elements: mincardinality, maxcardinality, label, domain and range. As indicated in figure 2, the slot input has the mincardinality value "1" and the label "input". The domain represents the concept to which the slot belongs; in our case the concept of input is "interface". The range indicates the value type of the slot; in our case the input value(s) is/are one or more instances of the concept Type. These instances are: integer, double, float, date, string, class, etc.

Figure 2: Component description

The two ontologies are under development; we plan to elaborate and instantiate them in the cooperative work domain. The actual component description includes the functional aspect, which can be exploited automatically to generate the elements of the discovery ontology. However, the component supplier must supplement this description by adding the non-functional aspect.
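As an illustration of the slot elements listed above, the following Python sketch parses one slot description and checks its cardinality. Because figure 2 is not reproduced here, the XML layout (a `slot` element with one child per key element) is an assumption, as are the helper names `parse_slot` and `cardinality_ok`; only the element names mincardinality, maxcardinality, label, domain and range come from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical serialization of the "input" slot described in the text.
SLOT_XML = """
<slot>
  <label>input</label>
  <domain>interface</domain>
  <range>Type</range>
  <mincardinality>1</mincardinality>
</slot>
"""


def parse_slot(xml_text):
    """Read the key elements of a slot into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())

    def text(tag):
        node = root.find(tag)
        return node.text.strip() if node is not None and node.text else None

    min_card = text("mincardinality")
    max_card = text("maxcardinality")
    return {
        "label": text("label"),
        "domain": text("domain"),
        "range": text("range"),
        # A missing mincardinality is treated as 0, a missing
        # maxcardinality as unbounded (both are assumptions).
        "mincardinality": int(min_card) if min_card is not None else 0,
        "maxcardinality": int(max_card) if max_card is not None else None,
    }


def cardinality_ok(slot, values):
    """Check the number of values against the slot's cardinality bounds."""
    count = len(values)
    if count < slot["mincardinality"]:
        return False
    return slot["maxcardinality"] is None or count <= slot["maxcardinality"]
```

For the slot above, two input values of type string satisfy the cardinality constraint, whereas an empty list of inputs does not.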

3. DESIGN AND IMPLEMENTATION OF THE SEC ENGINE

Our system provides a persistent component, called SEC, that can be loaded into the development environment when a project is created. SEC is persistent because it is always ready to be executed by developers. It contains the search process and can access the repository of component descriptions. Once SEC is running, the developer specifies the query by selecting the appropriate criteria. An approximate comparison between the specified query and the descriptions of components in the component description repository is made by the compare query description() function. If the result is positive, the system uses the obtained description(s) as an index into the software component repository. The search component(ref component[]) function then retrieves the appropriate component(s), where ref component[] is the list of references to the components to retrieve. Next, the developer uses an application programming interface (API) to integrate the desired components into the current project. Finally, to facilitate component integration, the developer can use the integration ontology.

Figure 3: System architecture

Figure 4: Search step

The repository manager can also use SEC to manage the two repositories. He/she can add, modify and delete components and/or component descriptions.

SEC is the component responsible for locating components. It executes the specified query, then retrieves and presents the relevant components. SEC does not need to be loaded explicitly by developers in the development environment. In practice, the developer clicks on the SEC icon. Next, he/she chooses the non-functional and functional features which meet his/her needs. A dynamic query is then formulated and executed automatically in order to deliver the adequate component.

Figure 3 shows the development of a simple chat application using Visual BASIC. The Listbox component contains the identity of the recipient, and the Text component contains the content of the message. The developer needs an ActiveX component that contains a method with two input parameters of string type, corresponding to the address of the recipient and the message. This component must also ensure the security of the message and establish the communication with the mailing server. To find this component, the developer selects the following search criteria: method and communicating objects as functional properties, component type as external property, and security as static non-functional property. In the following step, a value is given to each selected criterion. For instance, the developer can associate the label "Send", or any synonym, with the criterion method. A query is then formulated automatically when the search button is clicked. The components resulting from the search are sorted according to their degree of similarity with the query. There are four degrees of similarity:

- Exact: if component C and request R are equivalent concepts, we call the match Exact; formally, C ≡ R. In other words, for each pair of corresponding elements of the request and the description, the types are identical.

- PlugIn: if request R is a sub-concept of component C, we call the match PlugIn; formally, R ⊑ C. In other words, for each element of the query there is a similar element in the component description.

- Subsume: if request R is a super-concept of component C, we call the match Subsume; formally, C ⊑ R. In other words, for each element of the component description there is a similar element in the query.

- Disjoint: otherwise, we call the match Disjoint; that is, C ⊓ R ⊑ ⊥. In other words, there is no element of the component description that corresponds to an element of the query.

The degrees of similarity are organized on a discrete scale. Exact matches with a high used rate are clearly preferable to those with a low used rate; PlugIn matches are considered the next best, and Subsume matches the third best. The developer can read the description of each component to understand the details of its functionality. The developer then loads the adequate component into the chat application.
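The four degrees of similarity and the ordering by used rate can be sketched in Python as follows. This is a simplified illustration, not the SEC implementation: subsumption between concepts is approximated by set inclusion over feature labels, following the "in other words" readings above, and partially overlapping feature sets are lumped into Disjoint following the "otherwise" wording. The names `Match`, `match_degree` and `rank`, and the feature labels, are hypothetical.

```python
from enum import IntEnum


class Match(IntEnum):
    # Ordered so that a larger value means a better match.
    DISJOINT = 0
    SUBSUME = 1
    PLUGIN = 2
    EXACT = 3


def match_degree(request, component):
    """Approximate the four degrees with set inclusion over features."""
    request, component = set(request), set(component)
    if request == component:
        return Match.EXACT
    if request <= component:   # every query element is in the description
        return Match.PLUGIN
    if component <= request:   # every description element is in the query
        return Match.SUBSUME
    return Match.DISJOINT      # "otherwise" case, including partial overlap


def rank(request, candidates):
    """Sort (name, features, used_rate) candidates, best match first.

    Ties on the match degree are broken by the used rate, highest first.
    Disjoint candidates are dropped from the result.
    """
    scored = [
        (match_degree(request, features), used_rate, name)
        for name, features, used_rate in candidates
    ]
    scored = [entry for entry in scored if entry[0] != Match.DISJOINT]
    scored.sort(key=lambda entry: (entry[0], entry[1]), reverse=True)
    return [(name, degree.name) for degree, _, name in scored]
```

For the chat example, a request such as `{"method:Send", "type:ActiveX", "security"}` would rank two Exact candidates by their used rate before any PlugIn or Subsume candidate.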
The retrieved reusable components are presented in the search result form in decreasing order of similarity. Each component is accompanied by its degree of similarity, name, access path and a short description. Developers who are interested in a particular component can load it into the tool bar. Finally, the developer can adapt it to the task or reuse it as it is in the program, depending on the degree of similarity. SEC also provides an interface for the repository manager in order to add and remove components and to modify component information.

We formulated many queries without non-functional features and noticed that in many cases the delivered component did not meet the developer's needs. The non-functional features thus play a decisive role in search quality. In the present implementation the repository is updated manually. Automating this step could be based on techniques such as introspection [19] or invariant detection [22].

We developed two ontologies, called the integration ontology and the discovery ontology, and a search engine called SEC. The integration ontology describes the component's internal structure in order to facilitate the integration of the desired component into the current project. The discovery ontology helps the developer to select the adequate component by using SEC. Compared to other retrieval systems, our approach is unique in the following aspects:
(1) It is a first attempt to base component location on a software component.
(2) The developed system can be integrated in many software development environments.
(3) The whole system (the software development environment, the application and the search tools) is software component-oriented.

4. CONCLUSION

Software component retrieval research has been advancing very quickly over the past few decades. Researchers have experimented with techniques ranging from probabilistic to learning-based ones. At each step, significant insights have been gained into how to design and implement more useful software component search systems. In this paper, we presented SEC, a persistent component for discovering a software component in a repository. It delivers reusable components and helps the developer to integrate the selected component into the current work. To ensure the success of these two tasks we developed two ontologies: a discovery ontology and an integration ontology. The present implementation of SEC does not yet integrate the ontology-based description technique. This point will be considered in future versions of this work. In future work, we also plan to compare our search engine with others and to extend our approach to the search and description of web services.


5. REFERENCES

[1] W. M. P. Van Der Aalst, A. H. M. Ter Hofstede, B. Kiepuszewski, and A. P. Barros. Workflow patterns. Distrib. Parallel Databases, 14(1):5–51, 2003.
[2] T. Andrews, F. Curbera, H. Dholakia, Y. Goland, J. Klein, F. Leymann, K. Liu, D. Roller, D. Smith, S. Thatte, I. Trickovic, and S. Weerawarana. Business process execution language for web services version 1.1. Technical report, March 2003.
[3] S. Atkinson and R. Duke. Behavioural retrieval from class libraries. In Proceedings of the Eighteenth Australasian Computer Science Conference, 17(1):13–20, 1995.
[4] B. Curtis. Cognitive issues in reusing software artifacts. Software Reusability, 2. ACM Press, New York, NY, 1989.
[5] E. Damiani, M. G. Fugini, and C. Bellettini. A hierarchy-aware approach to faceted classification of object-oriented components. ACM Transactions on Software Engineering and Methodology, 8(3):215–262, July 1999.
[6] B. A. Davey and H. A. Priestley. Introduction to lattices and order. Cambridge, UK: Cambridge University Press, 2nd edition, 1990.
[7] M. Dean et al. Web ontology language (OWL) reference version 1.0. World Wide Web Consortium (W3C), November 2002. www.w3.org/TR/2002/WD-owl-ref-20021112.
[8] D. Fensel et al. The unified problem-solving method development language UPML. Knowledge and Information Systems, 5(1):83–131, 2003.
[9] D. Fensel, D. L. McGuinness, E. Schulten, W. Keong, G. P. Lim, and G. Yan. Ontologies and electronic commerce. Intelligent Systems, IEEE, 16:8–14, Jan-Feb 2001.
[10] Asunción Gómez-Pérez, Rafael González-Cabero, and Manuel Lama. ODE SWS: A framework for designing and composing semantic web services. IEEE Intelligent Systems, 19(4):24–31, 2004.
[11] B. Ganter and R. Wille. Formale Begriffsanalyse. Mathematische Grundlagen, Berlin, Springer, 1996.
[12] T. R. Gruber. A translation approach to portable ontology specifications. Knowledge Acquisition, 5:199–220, 1993.
[13] Stanford Medical Informatics. The Protégé project. http://protege.standford.edu, 2001.
[14] S. Khemakhem, M. Jmaiel, A. Ben Hamadou, and K. Drira. Un environnement de recherche et d'intégration de composant logiciel. In Seventh Conference On Computer Sciences, Annaba, May 2002.
[15] Int'l J. Human-Computer Studies. Special issue on problem-solving methods, 49(4):305–313, 1998.
[16] O. Lassila and R. R. Swick. Resource description framework (RDF) model and syntax specification. W3C - World Wide Web Consortium, Cambridge, MA, W3C Recommendation, 22 February 1999.
[17] L. Li and I. Horrocks. A software framework for matching based on semantic web technology. Proc. 12th Int. World Wide Web Conf., World Wide Web Consortium, page 48, 2003.
[18] N. F. Noy and D. L. McGuinness. Ontology development 101: A guide to creating your first ontology. Stanford University, 2001.
[19] J. O'Neil and H. Schildt. Java Beans Programming from the Ground Up. Osborne McGraw-Hill, 1998.
[20] E. Ostertag, J. Hendler, R. Prieto-Diaz, and C. Braun. Computing similarity in a reuse library system, an AI-based approach. ACM Transactions on Software Engineering and Methodology, pages 205–228, July 1992.
[21] J. Penix and P. Alexander. Efficient specification-based component retrieval. Automated Software Engineering, 6(2):139–170, 1999.
[22] Jeff H. Perkins and Michael D. Ernst. Efficient incremental algorithms for dynamic detection of likely invariants. SIGSOFT Softw. Eng. Notes, 29(6):23–32, 2004.
[23] A. Podgurski and L. Pierce. Behaviour sampling: A technique for automated retrieval of reusable components. In Proceedings of the 14th International Conference on Software Engineering, pages 349–360, 1992.
[24] H. Pozewaunig and T. Mittermeir. Self classifying reusable components generating decision trees from test cases. International Conference on Software Eng. and Knowledge Eng., July 2000.
[25] N. S. Rosa, C. F. Alves, P. R. F. Cunha, J. F. B. Castro, and G. R. R. Justo. Using non-functional requirements to select components: A formal approach. In Fourth Workshop Iberoamerican on Software Engineering and Software Environment, San Jose, Costa Rica, April 2001.
[26] P. Vitharana, F. Mariam Zahedi, and H. Jain. Knowledge-based repository scheme for storing and retrieving business components: a theoretical design and an empirical analysis. IEEE Transactions on Software Engineering, 29(7):649–664, July 2003.
[27] R. Wille. Restructuring lattice theory: an approach based on hierarchies of concepts. In I. Rival, editor, Ordered Sets, pages 445–470, 1982.
