A Cooperative Classification Mechanism for Search and Retrieval Software Components
Taciana A. Vanderlei, Frederico A. Durão, Alexandre C. Martins, Vinicius C. Garcia, Eduardo S. Almeida, Silvio R. de L. Meira
Federal University of Pernambuco
Recife Center for Advanced Studies and Systems (C.E.S.A.R.)
{tav, fad2}@cin.ufpe.br, [email protected], {vcg, esa2, srlm}@cin.ufpe.br
literature reports [12], the main obstacle is retrieving a component that corresponds to the developer’s need, since there is a gap between the problem formulation in the developer’s mind and the component description in the repository. Moreover, studies [7][12][21] show that search engines must focus on intuitive, low-cost ways to classify and identify components, based on information that is familiar to software engineers. In this way, developers can find the appropriate components during software development without necessarily knowing the contents of the repository [21].
ABSTRACT
This paper presents the use of folksonomy concepts in a software component search engine as an alternative for improving the quality of search results, covering the work from specification to implementation. A case study was performed to evaluate its performance and viability. Additionally, a set of requirements for component search and retrieval with folksonomy is presented, along with the architectural and implementation aspects of the tool. The case study indicates that a suite of different search techniques performs better than the techniques used separately. The engine’s current version combines keyword, facet-based and folksonomy search techniques.
In this scenario, different techniques should be tried to address the aforementioned drawbacks. Folksonomy can be the key to a distributed classification system with low initial costs, since folksonomies are maintained by users. Folksonomy, a term combining “folk” and “taxonomy” [14], refers to a collaborative way in which information is categorized on the web. Instead of using a centralized classification scheme, users are encouraged to freely assign keywords of their choice (called tags) to pieces of data, in a process known as tagging.
Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval – query formulation, relevance feedback, retrieval models, search process.
General Terms
In this paper, an improved version of Maracatu [6], combining the original keyword and facet-based search with folksonomy, is discussed, and its specification, design, implementation and a case study are presented. This new feature improves component search precision, as will be seen in the formal case study performed to evaluate the feasibility of the Maracatu search engine. This version is being used in real projects.
Performance, Design, Experimentation, Human Factors, Usability.
Keywords
Folksonomy, search engine, cooperative classification.
1. INTRODUCTION
Software reuse is a research area in software engineering which aims to promote significant improvements in the productivity and quality of software applications through effort optimization. It deals with reusing existing artifacts, such as requirements, design, models, source code, and test plans, among others, to create new applications. However, there are several problems that must be resolved in order to achieve systematic reuse. One of these is the problem of component search and retrieval [17].
The remainder of this paper is organized as follows: Section 2 presents the background that motivates this work. Section 3 describes Maracatu requirements, architecture and implementation of the current stage of the engine. Section 4 presents a case study to evaluate the approach. Related works are presented in Section 5 and, finally, Section 6 presents some concluding remarks and directions for future work.
2. BACKGROUND
In order to promote reuse, some organizations maintain a large library of reusable software components, which requires an efficient method for retrieving components [17]. However, the majority of the automated retrieval methods are not completely satisfactory at finding relevant components. According to
Information search and retrieval is a key area for the success of reuse initiatives. According to Prieto-Diaz and Freeman [16], “To reuse a software component, you first have to find it”. However, this problem becomes more complicated when the components are distributed or very numerous. Thus, it is necessary to provide efficient and precise ways to find and retrieve them.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SAC’07 March 11–15, 2007, Seoul, Korea Copyright 2007 ACM 1-59593-480-4/07/0003…$5.00.
In the literature, several solutions have been proposed, such as facet-based classification [15], free-text indexing [13], and enumerative classification [15]. However, there is no satisfactory solution, since several factors such as cost, precision and recall must be considered. In addition to these factors, according to [21], human behavior must be taken into account when introducing tools that help developers find relevant components. On the other hand, some usability problems, as seen in [3], can also inhibit the adoption of reuse tools.
f. Database persistence. Tags related to a specific component and author should be stored in a persistent database for future reuse, so that the data can be accessed at any time by the tool.
In this context, the RiSE1 group developed Maracatu [6], a mechanism to search and retrieve components, integrated with the Eclipse environment. The mechanism combines a set of features for searching and retrieving software components. One of these is folksonomy, a cooperative classification scheme that uses concepts familiar to the users of the system.
g. High precision and recall. The engine should present high precision, retrieving the most relevant components; high recall, leaving few relevant components unretrieved; and high efficiency, which refers to the time and space required by a retrieval method [6].
In folksonomy, we should have three distinct data elements [22]: (i) the tag, pieces of information separate from, but related to, an object; (ii) a clear understanding of the object being tagged; and (iii) the identification of the user doing the tagging.
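The three data elements above can be sketched as a small record type, together with a lookup by tag that is optionally restricted to one user. This is a minimal illustration only: the class name, field names and component paths are hypothetical and are not taken from the Maracatu implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagging:
    """One folksonomy entry: who applied which tag to which object."""
    tag: str        # free keyword chosen by the user
    component: str  # identifier of the tagged object (hypothetical naming)
    user: str       # login of the user who did the tagging


def components_for_tag(taggings, tag, user=None):
    """Return the components carrying `tag`, optionally from one user only."""
    return sorted({
        t.component
        for t in taggings
        if t.tag == tag and (user is None or t.user == user)
    })


# Illustrative data; the paths are invented for this example.
taggings = [
    Tagging("driver", "net/SocketPool.java", "tav"),
    Tagging("driver", "db/JdbcDriver.java", "fad2"),
    Tagging("infrastructure", "net/SocketPool.java", "tav"),
]

print(components_for_tag(taggings, "driver"))
print(components_for_tag(taggings, "driver", user="tav"))
```

Keeping the user as part of every entry is what makes it possible to search either across all users or within the tags of a single author.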
3.2 Architecture
The previous version of Maracatu combines text mining with a facet-based classification scheme for search [6]. Java class source code, which Maracatu treats as a white-box component, can be indexed by a mechanism that accesses distributed CVS repositories of open projects, indexing and ranking these artifacts using the Lucene search engine [9].
These elements allow the users to tag information and objects in order to increase the recall. It is the most important aspect, because using a known vocabulary can help the future recovery of the tagged objects by different users.
Maracatu’s architecture is based on the client-server model, supported by Web Services technology [20] for message exchange between the subsystems. The Maracatu Service is responsible for indexing components as a background activity and for executing the user queries. The Eclipse plug-in is a Web Service client that communicates with the server in order to retrieve components according to developer queries.
3. MARACATU SEARCH ENGINE
This work is an extension of the Maracatu search engine [6] with the introduction of search based on folksonomy. This section presents the requirements, the architectural view and the implementation aspects of the new tool.
3.1 Requirements
Aiming to improve search precision, it was proposed to complement the Maracatu classification approaches with a folksonomy mechanism. Figure 1 shows the new Maracatu architecture, including the modules designed in the previous version, represented by the text mining and facet-based classification schemes detailed in [6]. The new modules are:
The requirements proposed for Maracatu are based on studies of existing solutions in the literature [12], discussions among RiSE members [7], empirical analyses of case tools that aid software development [6], and interaction with industry2. The specified requirements of Maracatu [6] using folksonomy are:
a. Integration with different search techniques. The mechanism should use the folksonomy technique combined with traditional classification schemes to improve the precision of the search.
b. Association of tags with components. Through this functionality, the user should be able to associate tags with components according to their domain.
c. Search by tags. It should be possible to discover all items from all users that match a specific tag [18]. Moreover, the engine should support the discovery of items tagged by specific users that match the tag.
d. User identification. The authors authorized to add tags to components should be registered and identified in the engine. This is necessary to record the user’s identity when a tag is registered and to identify groups of authors with the same understanding and knowledge of a domain, in order to improve the search.
Figure 1. Maracatu Architecture
e. Tag Cloud. The frequently used tags should be listed and emphasized with different colors, organized by relevance, to aid the search by tags.
1 http://www.rise.com.br/
2 Brazilian organization of information technology that currently has approximately 800 collaborators and is preparing to obtain CMMI level 3.
Folksonomy Classifier. This module is responsible for persisting the tags in a structured database. Thus, while the users are classifying the components, the tags used are stored in an XML archive together with the respective component and the related author.
Folksonomy Search. The search is performed through folksonomy usage and can be combined with the text mining and facet-based search techniques. The result of the folksonomy search is merged with the results of the other two search techniques, which use the Lucene search engine.
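The XML persistence performed by the Folksonomy Classifier could look roughly like the sketch below, which writes (component, author, tag) triples to an XML string and reads them back. The element and attribute names (`folksonomy`, `tagging`, `component`, `author`) are assumptions made for this example; the paper does not describe the actual file layout.

```python
import xml.etree.ElementTree as ET


def taggings_to_xml(taggings):
    """Serialize (component, author, tag) triples into an XML string."""
    root = ET.Element("folksonomy")  # hypothetical root element name
    for component, author, tag in taggings:
        entry = ET.SubElement(root, "tagging", component=component, author=author)
        entry.text = tag
    return ET.tostring(root, encoding="unicode")


def taggings_from_xml(xml_text):
    """Parse the XML produced above back into a list of triples."""
    root = ET.fromstring(xml_text)
    return [(e.get("component"), e.get("author"), e.text)
            for e in root.iter("tagging")]


# Illustrative data; component names are invented for this example.
data = [
    ("net/SocketPool.java", "tav", "infrastructure"),
    ("db/JdbcDriver.java", "fad2", "driver"),
]
xml_text = taggings_to_xml(data)
print(xml_text)
print(taggings_from_xml(xml_text))
```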
certain number of components using identical tags of the current user.
Component Rank. This module is responsible for ranking the search results according to the number of times the searched tags appear in the database (XML file) associated with the components. In case of a tie, the components are ranked according to how closely each component matches the query. This ranking applies only when the folksonomy search is used; otherwise the components are ranked according to the Lucene algorithm.
User Identification. This module is responsible for maintaining the list of users of the mechanism in an XML file on the server side. Any developer on the client side must be identified before tagging or performing a search. User registration requires a login and password.
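The Component Rank rule described above (order by tag occurrences, break ties by closeness to the query) can be sketched as follows. The function name and the `query_score` mapping are hypothetical; `query_score` stands in for whatever match score the text search provides, which the paper does not specify.

```python
from collections import Counter


def rank_components(taggings, searched_tags, query_score):
    """Order components by tag hits, breaking ties by query-match score.

    taggings:      iterable of (component, author, tag) triples
    searched_tags: tags entered by the user
    query_score:   dict component -> closeness to the query (assumed input)
    """
    wanted = set(searched_tags)
    hits = Counter(c for c, _author, tag in taggings if tag in wanted)
    # Sort by descending hit count, then descending score, then name
    # so that the result is deterministic.
    return sorted(hits, key=lambda c: (-hits[c], -query_score.get(c, 0.0), c))


# Illustrative data only.
taggings = [
    ("A.java", "tav", "driver"),
    ("A.java", "fad2", "driver"),
    ("B.java", "tav", "driver"),
    ("C.java", "tav", "driver"),
    ("C.java", "tav", "cache"),
]
scores = {"A.java": 0.2, "B.java": 0.4, "C.java": 0.9}
ranking = rank_components(taggings, ["driver"], scores)
print(ranking)
```

Here `A.java` comes first because two users tagged it with `driver`; `B.java` and `C.java` tie on one hit each, so the higher query score puts `C.java` ahead.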
3.3 Implementation
This section describes the new features implemented in the Maracatu plug-in3 to meet the requirements of the folksonomy mechanism. Figure 2 shows the association of tags with a specific component by an author.
Figure 3. Maracatu search components screen
The Tag Cloud area (4), presented on the right of the search screen, is another option to aid the search. Tags registered by the authors are displayed according to their relevance, in different color tones according to their popularity. When a tag is selected, the Tag Cloud works as a browser, leading to a collection of components in the text area (5) that are associated with that tag. Another option is to select several tags within the Tag Cloud using the CTRL key, filling the tag field to help the search. Selecting an item in the text area opens a pop-up menu with the options to download the component, visualize and generate the UML of the class, and tag the source code.
The current version of the search engine implementation contains 120 classes, divided into 57 packages, with 6688 lines of code (not counting comments). The previous version of Maracatu had about 106 classes, 55 packages and 3844 lines of code.
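One plausible way to derive the color tones of the Tag Cloud described in this section is to scale each tag's usage count linearly onto a small number of levels. The linear scaling, the function name and the number of levels are assumptions made for illustration; the paper does not state how Maracatu maps popularity to color.

```python
from collections import Counter


def tag_cloud(tags, levels=4):
    """Map each tag to a tone level from 1 (least used) to `levels` (most used)."""
    counts = Counter(tags)
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    span = high - low
    cloud = {}
    for tag, n in counts.items():
        if span == 0:
            cloud[tag] = 1  # all tags equally popular
        else:
            cloud[tag] = 1 + round((n - low) * (levels - 1) / span)
    return cloud


# Illustrative usage: one list entry per time a tag was applied.
used = ["driver"] * 5 + ["infrastructure"] * 2 + ["cache"]
cloud = tag_cloud(used)
print(cloud)
```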
Figure 2. Maracatu component classification screen
To insert tags for a specific component, the user indicates which component is to be tagged. Next, the class view is presented (1) and the text area (2) is enabled, either filled with the author’s existing tags for that component, ready to be updated, or empty, ready for new tags. Tags are separated by blank spaces.
4. CASE STUDY
Previous experiments with Maracatu indicated that combining the keyword and facet-based search techniques is better than using them separately [6]. In order to analyze the new version and demonstrate that folksonomy combined with other search techniques increases the precision of the results, a formal case study using the keyword, facet-based and folksonomy search techniques and their combinations was conducted to evaluate the viability of the mechanism.
Below the text area, lists present the Tags suggestions (relevant tags in the component), Top tags (most used tags) and Top author tags (most used author tags), intended to help the author choose the tags that best fit the component. Clicking the check box of a desired tag automatically adds it to the text area.
Figure 3 shows the new search mechanism, where the developer may choose combinations of folksonomy, facets and keywords, performing queries such as: retrieve all ‘Networking’ components that are identified by the author ‘tav’ as ‘infrastructure’ or ‘driver’ and developed for the ‘J2EE’ Platform in the ‘EJB’ Component Model. A list of authors with the same knowledge is shown at the right of the screen (3), to aid the choice of the author field in the search. These groups are determined by the users of the system who tagged a
3 The current version of the plug-in may be obtained on the project site http://done.dev.java.net
repository from three projects of the SourceForge4 developer site and two RiSE projects.
For the case study, three metrics were considered: recall, precision and f-measure. Recall is the number of relevant components retrieved over the number of relevant components in the database [8]. Precision is the number of relevant components retrieved over the total number of components retrieved. Recall and precision are the classic measures of the effectiveness of an information retrieval system. Ideally, a search mechanism should have good precision and good recall. To assess this, a mechanism can be evaluated through the f-measure, which is the harmonic mean of precision and recall [19]. The closer the f-measure is to 1.0, the better the mechanism is. But this will only occur if both precision and recall are high. If a mechanism has excellent precision but low recall, or excellent recall but low precision, the f-measure will be closer to zero, indicating that the mechanism does not perform well on one of these criteria.
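The three metrics defined above can be computed for a single query as in the sketch below. The function name and the example sets are illustrative only; the f-measure is the harmonic mean 2PR / (P + R), taken as zero when both precision and recall are zero.

```python
def precision_recall_f(retrieved, relevant):
    """Return (precision, recall, f_measure) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant components actually retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Illustrative query: 4 components retrieved, 3 relevant in the repository,
# 2 of the relevant ones were found.
p, r, f = precision_recall_f({"a", "b", "c", "d"}, {"a", "b", "e"})
print(f"precision={p:.4f} recall={r:.4f} f-measure={f:.4f}")
```

Because the harmonic mean is dominated by the smaller of the two values, a mechanism that retrieves almost everything (high recall, very low precision) still receives a low f-measure.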
The subjects in this study were 22 developers from C.E.S.A.R. who use Eclipse as their development environment. They were asked to add tags to 30 components of the repository. Next, one researcher from the RiSE group, who did not participate in the tagging phase, was given a set of ten queries and was asked to find all items in the repository relevant to each query with each search method (keywords, facet-based, folksonomy and their combinations). The expert for the known project was consulted in this last activity, helping to elaborate the queries, which were specific to its domain.
4.1 Methodology
The methodology adopted is based on existing methodologies for information retrieval, with a systematic experimental process [1][11]. Following the first experiment performed with Maracatu [6], the case study deals with real software projects to validate the proposed environment. To obtain a precise measure of recall, the experimenter needs to know exactly how many relevant components exist in the repository for each query; the methodology was therefore adapted by inserting a known project into the repository [6]. With this approach, the expected results for each query are a known subset of the artifacts contained in the known project, without requiring knowledge of the entire repository.
4.3 Analysis of the Case Study Results
Recall. Table 1 shows the recall results for the case study. For each approach, the table shows the average recall, the standard deviation and the variance for the ten queries performed.
Table 1. Recall of the case study
Approach              Recall   Std. Dev.   Variance
Keyword               0.7780   0.1574      0.0248
Facet                 0.8334   0.1619      0.0262
Folksonomy            0.4540   0.3399      0.1155
Keyword/Facet         0.7295   0.2480      0.0615
Keyword/Folkson.      0.4072   0.3160      0.0998
Facet/Folksonomy      0.5531   0.3587      0.1287
Kw./Facet/Folkson.    0.3768   0.3786      0.1434
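As a quick consistency check on Table 1, the variance column should equal the square of the standard deviation column up to rounding. The sketch below uses the reported values (written with decimal points); the function name and the tolerance are choices made for this example.

```python
# (standard deviation, variance) pairs as reported in Table 1
TABLE_1 = {
    "Keyword": (0.1574, 0.0248),
    "Facet": (0.1619, 0.0262),
    "Folksonomy": (0.3399, 0.1155),
    "Keyword/Facet": (0.2480, 0.0615),
    "Keyword/Folkson.": (0.3160, 0.0998),
    "Facet/Folksonomy": (0.3587, 0.1287),
    "Kw./Facet/Folkson.": (0.3786, 0.1434),
}


def max_variance_gap(rows):
    """Largest absolute difference between std_dev**2 and the reported variance."""
    return max(abs(sd ** 2 - var) for sd, var in rows.values())


gap = max_variance_gap(TABLE_1)
print(f"largest gap between std_dev^2 and variance: {gap:.6f}")
```

The gaps stay within the rounding error of four-decimal figures, so the two columns are mutually consistent.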
The study also considered values close to 50% for recall and close to 20% for precision to be satisfactory, since they come close to measurements made by other authors [4][21]. However, these values serve only as a reference and were not included in the hypotheses of the case study. The independent variables of the case study (the variables in a process that are manipulated and controlled when comparing different situations) were the keyword, facet-based and folksonomy search techniques and their combinations. The dependent variables (the variables under study, whose values are expected to change as an effect of changes in the independent variables) were recall, precision, and f-measure. The experimental analysis was based on the outcome of the dependent variables.
In the case study, the facet approach presented the highest recall, with an average of 83.34%, followed by the keyword, keywords + facets and facets + folksonomy approaches, all of them with averages above 50%, the value considered satisfactory for this study. The two best results also presented the lowest standard deviations, indicating considerable convergence of the data around their averages. The remaining approaches obtained values below the satisfactory average. However, they presented high standard deviations, which could have influenced the results.
The null hypotheses, i.e., the hypotheses that the case study aimed to reject, state that the use of folksonomy, alone or combined with the keyword and facet search methods, produces a less efficient search engine, i.e., one with lower precision, recall and f-measure than without it. By rejecting these null hypotheses, the study expected to favor the alternative hypotheses.
Precision. Table 2 shows the precision results for the case study. For each approach, the table shows the average precision, the standard deviation and the variance for the ten queries performed.
With the results of the case study, the effectiveness of the new search engine was expected to be measured against the previous one. The main expected result of the study was that the combination of the search techniques produces a more efficient component search and retrieval engine, adding the advantages of each approach.
Table 2. Precision of the case study
Approach              Precision   Std. Dev.   Variance
Keyword               0.4485      0.2890      0.0835
Facet                 0.0221      0.0211      0.0004
Folksonomy            0.5510      0.2607      0.0679
Keyword/Facet         0.5754      0.2887      0.0833
Keyword/Folkson.      0.6767      0.2429      0.0590
Facet/Folksonomy      0.5194      0.2055      0.0422
Kw./Facet/Folkson.    0.8083      0.2666      0.0711
In order to evaluate the usability of the search engine, identify the users’ preferences among the offered approaches and identify possible improvements to the proposed solution, questionnaires were delivered to the subjects at the end of the study.
4.2 Preparation of the Case Study
For the case study, 1168 Java source code files were used, distributed across five different indexed projects in the local
4 http://sourceforge.net/
ranked the approaches in order of preference: keywords + facets + folksonomy was ranked highest, followed by keywords + folksonomy, keywords + facets, facets + folksonomy, keywords, folksonomy and then the facets-only approach.
The results of the study indicate that the keyword + facets + folksonomy approach has the best results, with an average of 80.83%. Next comes the keyword + folksonomy approach, about thirteen percentage points below the top result. All of the approaches presented a satisfactory value, with an average above 20%, with the exception of the facet approach, with a precision of 2.21%. However, the standard deviation is too high for these results to be considered statistically conclusive, which may indicate that the averages could change drastically.
5. RELATED WORKS
A relevant work involving search engines and component retrieval is presented by Ye and Fischer [21]. They propose a new mechanism for locating components. This mechanism automatically locates and presents to the developer information about components that are relevant to the activities being performed at that moment, customized according to the knowledge and environment of the developer. The mechanism was designed, implemented and evaluated in a system named CodeBroker. Empirical evaluations have shown that this kind of strategy is effective in promoting reuse. From the functional point of view, Maracatu and CodeBroker are complementary. CodeBroker performs a more complex interaction with the user and is active, while Maracatu is passive, since it only responds after the user sends a request. On the other hand, the Maracatu backend (server side) has a more complex behavior, locating and collecting information on components from the network autonomously, without direct intervention from the user, and its local repository reflects the users’ domain, an important functionality provided by folksonomy.
F-measure. Table 3 shows the f-measure results for the case study. For each approach, the table shows the average f-measure, the standard deviation and the variance for the ten queries performed.
Table 3. F-measure of the case study
Approach              F-measure   Std. Dev.   Variance
Keyword               0.5276      0.2271      0.0516
Facet                 0.0414      0.0363      0.0013
Folksonomy            0.4265      0.2572      0.0661
Keyword/Facet         0.5993      0.2159      0.0466
Keyword/Folkson.      0.4585      0.2833      0.0803
Facet/Folksonomy      0.4393      0.1677      0.0281
Kw./Facet/Folkson.    0.4247      0.3562      0.1269
Another important work is Krugle5, a search engine that uses tags to identify code snippets. This work follows the same direction as our search technique, relating components to tags, but it does not identify the authors of the tags. Additionally, the engine is not integrated with the development environment. It provides only a web interface, which makes it harder for developers to use and inhibits access to it.
By analyzing the average f-measure values, the approach that obtained the best result in the case study was the keyword + facets approach, followed, in that order, by the keyword search, keyword + folksonomy search, facets + folksonomy search, folksonomy search, keywords + facets + folksonomy search and, finally, the facets search. Analyzing these results, we may immediately discard the facet approach, since it has a very low f-measure, making it the least efficient of the approaches analyzed in this case study. The results of the other approaches cannot be considered statistically guaranteed, although they could be true, since the standard deviation is high enough that the f-measure averages could change drastically.
Enterprises have also started blogging and experimenting with folksonomies. An example is IBM's Intranet, which serves 315,000 IBM employees worldwide in different languages and with multiple roles and information needs [10]. They started experimenting with folksonomy to keep information updated and organized according to their users’ personal way of accessing the system. Analyzing [7][12], we can observe that there are different ways to store and retrieve software components from repositories. However, our engine is probably the first one to search components using cooperative classification and reflecting the users’ domain, since we could not find reports in the literature of a search engine with the same characteristics.
4.4 Discussion
Due to the high standard deviation presented in the results, new studies are necessary to provide more evidence for confirming the hypothesis. However, although the statistical tests cannot be considered conclusive, since a number of factors can influence the outcome, the results of the case study indicate that folksonomy and its combination with the other approaches do not improve recall, but promise significant improvements in precision. The results practically reject the null hypothesis that the use of folksonomy, combined with the keyword and facet approaches, produces a search engine with lower precision than without it; however, this is not true for the recall and f-measure metrics.
Additionally, subjects who participated in the case study identified the need for changes in some aspects of the engine to improve its usability, including a better process for inserting tags into components without repetitive tasks. The subject, responsible for using each search method and their combinations to find items,
5 http://www.krugle.com/product/
6. CONCLUDING REMARKS AND FUTURE WORKS
Folksonomies are not the solution to every modern problem of classification, and they must not be considered alternatives to the traditional classification schemes librarians have designed over the years, but a complement to them. They should be applied only under the right circumstances (a propitious and stable environment, with users who are conscious of the importance of classifying artifacts adequately) and considering their own specific properties and their differences with respect to other classification schemes such as taxonomies and controlled vocabularies.
In this paper, we presented the use of folksonomy in Maracatu, a component search engine, covering the tool specification, design, implementation and evaluation. The case study indicated that this technique, combined with the keyword and facet approaches, can increase the precision of the results and reflect the users’ domain, minimizing the distance between what the developer wants and how components are described in the repository.
Future work includes issues concerned with usability and scalability, as pointed out by the case study. The incorporation of a semantic analyzer [2] for the tag vocabulary would be another important approach to study, as well as more specialized algorithms for component ranking that consider the dependencies of the source code, access control for the components, search by neighbor tags, and support for different IDEs and artifact types and formats.
7. ACKNOWLEDGMENTS
The authors would like to thank C.E.S.A.R., the Brazilian innovation institute that contributed to the evaluation of the mechanism.
8. REFERENCES
[1] Baeza-Yates, R., Ribeiro-Neto, B. Modern Information Retrieval. ISBN: 020139829X, 1st edition, ACM Press, 1999.
[2] Bruijn, J. Using ontologies. Technical report, Digital Enterprise Research Institute, 2003.
[3] Caldiera, G., Basili, V. Identifying and Qualifying Reusable Software Components. In the IEEE Computer, Vol. 24, No. 2, February, 1991, 61–71.
[4] Frakes, W. B., Pole, T. P. An Empirical Study of Representation Methods for Reusable Software Component. In the IEEE Transactions on Software Engineering, Vol. 20, No. 8, August, 1994, 617-630.
[5] Freitag, P. How to make a Tag Cloud. June, 2005, available at: http://www.petefreitag.com/item/396.cfm (Accessed on July 5, 2006).
[6] Garcia, V. C., Lucrédio, D., Durão, F. A., Santos, E. C. R., Almeida, E. S., Fortes, R. P. M., Meira, S. R. L. From Specification to the Experimentation: A Software Component Search Engine Architecture. In the 9th International Symposium on Component-Based Software Engineering (CBSE 2006), Lecture Notes in Computer Science (LNCS), Västerås, Sweden, 2006.
[7] Garcia, V. C., Lucrédio, D., Almeida, E. S., Fortes, R. P. M., Meira, S. R. L. Toward a Code Search Engine Based on the State-of-Art and Practice. 13th IEEE Asia Pacific Software Engineering Conference (APSEC), Component-Based Software Development Track, Bangalore, India, 2006.
[8] Grossman, D.A., Frieder, O. Information Retrieval: Algorithms and Heuristics. Second edn. Springer, Dordrecht, Netherlands, 2004.
[9] Hatcher, E., Gospodnetic, O. Lucene in Action. In Action series. Manning Publications Co., Greenwich, CT, 2004.
[10] IBM's Intranet and Folksonomy. March, 2005, available at: http://thecommunityengine.com/home/archives/2005/03/ibms_intranet_a.html (Accessed on April 17, 2006).
[11] Kitchenham, B., Pickard, P., Pfleeger, S. L. Case Studies for Method and Tool Evaluation. In the IEEE Software, Vol. 11, No. 4, July, 1995, 52-62.
[12] Lucrédio, D., Almeida, E. S., Prado, A. F. A Survey on Software Components Search and Retrieval. In Steinmetz, R., Mauthe, A., eds.: 30th IEEE EUROMICRO Conference, Component-Based Software Engineering Track, Rennes, France, IEEE/CS Press, 152–159.
[13] Maarek, Y. S., Berry, D. M., Kaiser, G. E. An Information Retrieval Approach for Automatically Constructing Software Libraries. In the IEEE Transactions on Software Engineering, Vol. 17, No. 8, August, 1991, 800-813.
[14] Mathes, A. Folksonomies – Cooperative Classification and Communication Through Shared Metadata. Computer Mediated Communication - LIS590CMC, December, 2004.
[15] Prieto-Díaz, R. Implementing faceted classification for software reuse. Commun. ACM 34, 5 (May 1991), 88-97.
[16] Prieto-Diaz, R., Freeman, P. Classifying Software for Reusability. In the IEEE Software, Vol. 4, No. 1, January, 1987, 6-16.
[17] Podgurski, A., Pierce, L. Retrieving Reusable Software by Sampling Behavior. ACM Transaction on Software Engineering and Methodology, Vol. 02, No. 03, July, 1993, 286-303.
[18] Quintarelli, E. Folksonomies: Power to the People. In the Proceedings of the 1st International Society for Knowledge Organization, UniMIB Meeting, Milão, Italy, ISKOI, Italy, June, 2005, available at: http://www.iskoi.org/doc/folksonomies.htm (Accessed on April 17, 2006).
[19] Jardine, N. and van Rijsbergen, C.J. The use of hierarchical clustering in information retrieval. Information Storage and Retrieval, 7:217-240, 1971.
[20] Stal, M. Web services: beyond component-based computing. Communications of ACM, Vol. 45, No. 10, 2002, 71–76.
[21] Ye, Y., Fischer, G. Supporting Reuse by Delivering Task-Relevant and Personalized Information. In the ICSE 2002 – 24th International Conference on Software Engineering, Orlando, Florida, USA, 2002, 513-523.
[22] Wal, T. V. Online Information Folksonomy Presentation Posted. Personal Infocloud, January, 2006, available at: http://www.personalinfocloud.com/folksonomy/index.html (Accessed on April 15, 2006).