
Falcons Concept Search: A Practical Search Engine for Web Ontologies


IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 41, NO. 4, JULY 2011

Correspondence

Yuzhong Qu and Gong Cheng

Abstract—Web ontologies provide shared concepts for describing domain entities and thus enable semantic interoperability between applications. To facilitate concept sharing and ontology reuse, we developed Falcons Concept Search, a novel keyword-based ontology search engine. In this paper, we illustrate how the proposed mode of interaction helps users quickly find ontologies that satisfy their needs and present several supportive techniques, including a new method of constructing virtual documents of concepts for keyword search, a popularity-based scheme to rank concepts and ontologies, and a way to generate query-relevant structured snippets. We also report the results of a usability evaluation as well as user feedback.

Index Terms—Indexing, ontology ranking, ontology search, snippet generation, virtual document.

Manuscript received November 13, 2008; revised April 29, 2009; accepted January 22, 2010. Date of publication May 2, 2011; date of current version June 21, 2011. This work was supported in part by the National Natural Science Foundation of China under Grants 60773106 and 60973024. This paper was recommended by Editor W. Pedrycz.
Y. Qu is with the State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China (e-mail: [email protected]).
G. Cheng is with the School of Computer Science and Engineering, Southeast University, Nanjing 210096, China (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSMCA.2011.2132705

I. INTRODUCTION

The Semantic Web is targeted at facilitating data integration across Web applications. Semantic Web data are formatted according to the Resource Description Framework (RDF), a triple/graph-based way to represent information. Furthermore, Web ontologies described in the RDF Vocabulary Description Language (RDFS) and the Web Ontology Language (OWL) provide shared concepts, i.e., classes and properties, for describing domain entities and thus enable semantic interoperability between applications. Semantic interoperability depends on reusing or extending existing ontologies when developing new applications. Therefore, ontology search becomes a fundamental service for application developers.

In recent years, several ontology search engines have been developed, some of which are still accessible [1]–[3]. Similar to traditional Web search engines, these systems accept keyword queries and return matched concepts and/or ontologies. However, for the returned results, they usually provide either only basic metadata (e.g., a human-readable name of each concept) or the entire related RDF description, neither of which helps users efficiently determine whether a returned concept or ontology satisfies their needs.

We developed Falcons Concept Search (http://ws.nju.edu.cn/falcons/conceptsearch/), a novel keyword-based ontology search engine, as part of the Falcons system. It retrieves concepts whose textual descriptions match the terms in the keyword query and ranks the results according to both query relevance and the popularity of concepts. Popularity is measured based on a large data set collected from the real Semantic Web. Each concept returned is associated with a query-relevant structured snippet, indicating how the concept is matched with the keyword query and also briefly clarifying its meaning. Meanwhile, the system recommends several query-relevant popular ontologies, which users can select to restrict the results to those in a specific ontology. Within such a mode of interaction, users can quickly compare ontologies and determine whether these ontologies satisfy their needs by checking query-relevant concepts as well as their contexts, i.e., structured snippets. The system also provides the detailed RDF description of each concept and a summary of each ontology on demand. A demonstration of the system is given in the following.

A. System Demonstration

Suppose that a user wants to describe some students studying at some university. The user submits the keyword query “student university” to the system and obtains the result page shown in Fig. 1. The bottom area presents the concepts returned. For each concept, the first line gives its name (label or local name) and type. The user can click on the name to browse its detailed RDF description. Below that, a structured snippet, consisting of the part of the RDF description of the concept that is matched with the terms in the keyword query, is presented to help the user quickly determine its relevance. Each piece of RDF description is followed by its provenance and is marked by stars if it comes from the RDF document retrieved by dereferencing the Uniform Resource Identifier (URI) of the concept. The URI is also presented below the snippet, followed by a number indicating how many RDF documents mention this concept, which is linked to a list of these documents for further browsing.

The top area of this page recommends nine ontologies. The user can select one of them, e.g., Semantic Web for Research Communities (SWRC), to restrict the search to that ontology. Then, as shown in Fig. 2, the concepts returned are filtered to include only those in the SWRC ontology. The user immediately finds that the SWRC ontology contains a “Student” class, a “University” class, and a “student” property, which are structurally related to each other in the ontology, as shown in their snippets. Consequently, the user decides to reuse this ontology; otherwise, the user could select other ontologies and compare them.

II. SYSTEM ARCHITECTURE

Fig. 3 presents the architecture of the system. The multithreaded crawler dereferences URIs with content negotiation (accepting only application/rdf+xml) and downloads RDF documents, which are then parsed by Jena (jena.sourceforge.net). The URIs newly discovered in these documents are submitted to the URI repository for further crawling. Initially, the URI repository is fed with seed URIs obtained from several online ontology repositories such as pingthesemanticweb.com and schemaweb.info, as well as by querying the Swoogle search engine and the Google search engine (for “filetype:rdf” and “filetype:owl”) with keyword queries randomly generated according to the category names of the Open Directory Project (dmoz.org). At the time of writing, 21.6 million well-formed RDF/XML documents have been downloaded and processed, containing 2.9 billion RDF triples. In the data set, 2 868 214 classes and 264 315 properties have been identified, coming from 12 467 ontologies.
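As a rough illustration of the content negotiation step, the following sketch builds an HTTP request that accepts only application/rdf+xml. It is not the crawler described above; the function name `build_rdf_request` and the example URI are hypothetical, and stripping the fragment identifier before dereferencing is an assumption about how a hash URI would be handled.

```python
from urllib.request import Request

RDF_XML = "application/rdf+xml"  # the only media type the crawler is said to accept


def build_rdf_request(uri: str) -> Request:
    """Build a GET request that asks the server for RDF/XML only."""
    # Assumption: a fragment identifier is never sent to the server,
    # so it is dropped to obtain the URI of the document to download.
    document_uri = uri.split("#", 1)[0]
    return Request(document_uri, headers={"Accept": RDF_XML})


# Hypothetical concept URI, used only for illustration.
req = build_rdf_request("http://example.org/onto#Student")
print(req.full_url, req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual download; the response body would then be handed to an RDF parser.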

1083-4427/$26.00 © 2011 IEEE


Fig. 1. First result page for the keyword query “student university.”

Fig. 2. First result page for the keyword query “student university” after the SWRC ontology is selected to filter the results.

Fig. 3. Architecture of Falcons Concept Search.


Each RDF triple in an RDF document, together with the document URI, forms a quadruple, which is stored in the quadruple store implemented on top of a MySQL database. The meta analysis component periodically computes several kinds of global information and writes them to the metadata database, e.g., which kind of entity (class/property/individual) a URI identifies and which concepts an ontology contains. Then, also periodically, the indexer updates a combined inverted index, which serves the proposed mode of user interaction, i.e., keyword search with ontology restriction.

This combined index consists of two inverted indexes implemented with Apache Lucene (lucene.apache.org). First, for each concept, a virtual document [4] is constructed, which consists of the terms extracted from its RDF description (cf. Section III). An inverted index, a classic information retrieval data structure, is built from terms in virtual documents to concepts to serve keyword search. Second, based on the metadata database, an inverted index is built from ontologies to the concepts they contain to serve ontology-based result filtering. Thus, for a keyword query with an ontology restriction, the concepts finally returned are obtained by intersecting the two result sets separately returned by these inverted indexes.

The ranking process (cf. Section IV-A) is also implemented based on Lucene. At indexing time, a popularity score is computed and attached to each concept. At search time, the popularity of concepts and the term-based similarity between virtual documents of concepts and the keyword query are combined to rank concepts. For each concept returned, a query-relevant structured snippet (cf. Section V) is generated from the data in the quadruple store. Meanwhile, several ontologies are recommended (cf. Section IV-B) based on top-ranking concepts.
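The combined index can be pictured with a small in-memory sketch. This is not the Lucene-based implementation; the class `CombinedIndex` and its methods are hypothetical, and treating a concept as matched when its virtual document contains any one of the keywords is an assumption, since the matching semantics are not specified here.

```python
from collections import defaultdict


class CombinedIndex:
    """Two inverted indexes: term -> concepts and ontology -> concepts."""

    def __init__(self):
        self.term_to_concepts = defaultdict(set)
        self.ontology_to_concepts = defaultdict(set)

    def add_concept(self, concept, ontology, terms):
        # Index every term of the concept's virtual document (lowercased).
        for term in terms:
            self.term_to_concepts[term.lower()].add(concept)
        # Record which ontology contains the concept.
        self.ontology_to_concepts[ontology].add(concept)

    def search(self, keywords, ontology=None):
        # Assumption: a concept matches if any keyword occurs in its virtual document.
        matched = set()
        for keyword in keywords:
            matched |= self.term_to_concepts.get(keyword.lower(), set())
        if ontology is None:
            return matched
        # Ontology restriction: intersect the two result sets.
        return matched & self.ontology_to_concepts.get(ontology, set())


index = CombinedIndex()
index.add_concept("swrc:Student", "swrc", ["Student", "subClassOf", "Person"])
index.add_concept("swrc:University", "swrc", ["University", "subClassOf", "Organization"])
index.add_concept("foaf:Person", "foaf", ["Person"])
print(sorted(index.search(["person"])))
print(sorted(index.search(["person"], ontology="foaf")))
```

Keeping the ontology index separate from the term index means that an ontology restriction only adds one set intersection to an ordinary keyword lookup.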
Moreover, for each concept requested, the browsing concepts component loads its RDF description from the quadruple store and presents it to the user. For each ontology requested, the browsing ontologies component loads ontology metadata from the quadruple store, loads the lists of classes and properties contained from the metadata database, and presents all of them to the user.

III. CONSTRUCTING VIRTUAL DOCUMENTS FOR KEYWORD-BASED CONCEPT SEARCH

Traditional Web search engines build an inverted index from terms in the contents of webpages to their URIs to serve keyword search. However, on the Semantic Web, a concept has no such contents but is described by RDF triples, from which we need to extract terms to construct its virtual document [4]. Existing ontology search engines [1], [2] usually extract terms from the local name of a concept and its associated literals such as the values of rdfs:label and rdfs:comment. However, it is insufficient to consider only literal-valued properties because, to describe a concept in RDF, we can not only attach literals but also relate it to other entities. For example, to annotate a concept with its creator information, we can either use a literal-valued property to attach the name of the creator to the concept or use an entity-valued property to relate the concept to a URI that identifies the creator. Therefore, entity-valued properties should also be considered.

Blank nodes are widely used in OWL ontologies for connecting a concept to other concepts, but they have no local names to be extracted. Therefore, inspired by CBD [5], for each concept c, we first identify its description graph, a subset of all available RDF triples, as follows: first, include in the subset all the RDF triples where the subject is c; then, recursively, for all the RDF triples included in the subset thus far having a blank node object, include in the subset all the RDF triples where the subject is the blank node in question and which have not been included in the subset.
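The description graph definition above can be sketched as a simple traversal. In this sketch, triples are plain (subject, predicate, object) tuples, blank nodes are strings starting with "_:", and the function names are hypothetical; all three are assumptions made for illustration. The six sample triples are reconstructed from the textual description of the SWRC example, not copied from the original figure.

```python
def is_blank(node):
    # Assumption: blank nodes are written as strings with the "_:" prefix.
    return isinstance(node, str) and node.startswith("_:")


def description_graph(concept, triples):
    """Collect the triples about `concept`, expanding blank-node objects recursively."""
    by_subject = {}
    for triple in triples:
        by_subject.setdefault(triple[0], []).append(triple)

    graph = []             # triples included so far, in discovery order
    included = set()       # guards against including a triple twice
    visited = {concept}    # subjects already expanded or queued (handles blank-node loops)
    frontier = [concept]
    while frontier:
        subject = frontier.pop()
        for triple in by_subject.get(subject, []):
            if triple in included:
                continue
            included.add(triple)
            graph.append(triple)
            obj = triple[2]
            # Only blank-node objects are followed; URIs and literals end the expansion.
            if is_blank(obj) and obj not in visited:
                visited.add(obj)
                frontier.append(obj)
    return graph


# Sample triples reconstructed from the text (an assumption about the figure's content).
TRIPLES = [
    ("swrc:University", "rdfs:subClassOf", "swrc:Organization"),
    ("swrc:Student", "rdfs:subClassOf", "swrc:Person"),
    ("swrc:Student", "rdfs:subClassOf", "_:b1"),
    ("_:b1", "rdf:type", "owl:Restriction"),
    ("_:b1", "owl:onProperty", "swrc:studiesAt"),
    ("_:b1", "owl:allValuesFrom", "swrc:University"),
]

for triple in description_graph("swrc:Student", TRIPLES):
    print(triple)
```

Because only blank-node objects are followed, the triple about swrc:University being a subclass of swrc:Organization stays out of the description graph of swrc:Student, even though swrc:University appears as an object there.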
For example, Fig. 4 illustrates an RDF graph sliced from the SWRC ontology, where the description graph of swrc:University includes only one RDF triple (swrc:University, rdfs:subClassOf, swrc:Organization), and the other five RDF triples are all included in the description graph of swrc:Student.

Fig. 4. RDF graph extracted from the SWRC ontology.

The virtual document of a concept consists of terms extracted from its description graph: For each entity identified by a URI in this graph, extract its local name and label; for each literal in this graph, extract its lexical form. For example, in Fig. 4, the virtual document of swrc:Student consists of the terms “Student,” “subClassOf,” “Person,” “type,” “Restriction,” “onProperty,” “studiesAt,” “allValuesFrom,” and “University.” It is worth noting that property names are also included so that the system can support more diverse keyword queries, e.g., swrc:Student can be retrieved by the keyword query “subclassof person.”

IV. RANKING

A. Concept Ranking

In the system, the ranking score of a concept c is concerned with two factors, i.e., its relevance to the keyword query q and its popularity

RankingScore(c, q) = TextSim(c, q) · Popularity(c)    (1)

which will be separately discussed in the following.

1) Query Relevance: On the one hand, as described in Section III, a virtual document is constructed for each concept. On the other hand, a keyword query can be treated as a short document. Thus, the problem of calculating the relevance of a concept to a keyword query can be transformed into the problem of calculating the similarity between two documents. We use the vector space model and the term frequency weight to represent documents, i.e., each document is represented as a vector where each component corresponds to the frequency of a term in the document. In particular, the weights of the terms extracted from the local name and label of the concept in question are additionally multiplied by 10.0, based on our previous experience of using virtual documents in ontology matching [6]. Then, weights are further refined by the well-known inverse document frequency measure, i.e., a higher weight is assigned to a term in a virtual document if the term occurs in fewer documents in the whole data set because such a term is considered to be a more distinctive feature. Finally, the relevance of a concept c to a keyword query q, TextSim(c, q), is defined as the cosine of the angle between the vector form of the virtual document of c and the vector form of q.

2) Popularity: Other than query relevance, existing approaches study ontology structures to evaluate concepts with structural measures such as PageRank-like algorithms [7], [8] or graph centrality [9]. However, they do not investigate the use of concepts in practice. When developing a new Web application, in order to maximize interoperability with other applications, one best practice is to reuse concepts that have been widely used by existing applications. Therefore, our system gives higher ranks to popular concepts.

For a concept c, let Docs(c) be the set of RDF documents where c is instantiated. A concept c is instantiated in an RDF document d if either c is a class and d contains an RDF triple whose predicate is rdf:type and whose object is c, or c is a property and d contains an RDF triple whose predicate is c. The popularity score of c is calculated as follows:

Popularity(c) = log(|Docs(c)| + 1) + 1.    (2)
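Equations (1) and (2) can be traced with a short sketch. Several details are assumptions rather than statements about the system: the exact inverse document frequency variant (here log(N/df) + 1), the base of the logarithm in (2) (natural log here), the use of raw term frequencies for the query vector, and all function names. Only the 10.0 multiplier and the overall shape of (1) and (2) come from the text.

```python
import math
from collections import Counter

NAME_BOOST = 10.0  # multiplier for terms from the concept's local name and label


def tf_idf_vector(terms, name_terms, doc_freq, n_docs):
    """Weighted term vector of a virtual document."""
    tf = Counter(t.lower() for t in terms)
    boosted = {t.lower() for t in name_terms}
    vector = {}
    for term, freq in tf.items():
        weight = freq * (NAME_BOOST if term in boosted else 1.0)
        # Assumption: one common idf variant; the variant used by the system is not given.
        idf = math.log(n_docs / doc_freq.get(term, 1)) + 1.0
        vector[term] = weight * idf
    return vector


def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def popularity(n_instantiating_docs):
    # Eq. (2); the natural logarithm is an assumption.
    return math.log(n_instantiating_docs + 1) + 1


def ranking_score(text_sim, n_instantiating_docs):
    # Eq. (1): query relevance multiplied by popularity.
    return text_sim * popularity(n_instantiating_docs)


doc = tf_idf_vector(["Student", "subClassOf", "Person"], ["Student"],
                    {"student": 1, "subclassof": 1, "person": 1}, n_docs=1)
query = {"student": 1.0}
print(ranking_score(cosine(doc, query), n_instantiating_docs=0))
```

The trailing "+ 1" in (2) keeps the popularity factor at 1 for a concept that is never instantiated, so such a concept is still ranked by its query relevance alone instead of being zeroed out.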

In the system, popularity scores are evaluated based on a large data set collected from the real Semantic Web, which includes not only conceptual-level RDF documents (ontologies) but also a lot of instance-level RDF documents. Therefore, it is possible to characterize the use of concepts in practice.

B. Ontology Recommendation

In the system, according to the proposed mode of user interaction, several ontologies are recommended so that the user can select one to filter the concepts returned. In Section IV-A, we have detailed the principle and method of ranking concepts. Now, we rank ontologies based on the ranking of concepts. For a keyword query, the ontologies that the concepts returned come from are regarded as candidates for recommendation. For each ontology candidate, its ranking score is evaluated by adding up the ranking scores of those concepts returned and contained in this ontology. Finally, up to nine top-ranking ontologies are recommended. The underlying criterion is that an ontology is more likely to be recommended if the concepts in the ontology that are matched with the terms in the keyword query are more popular on the Semantic Web.

V. GENERATING SNIPPETS

For each concept returned, the system presents a query-relevant structured snippet to show how the concept is matched with the keyword query. The snippet can help users quickly determine the relevance of a concept to their needs. In this section, we propose the notion of property description thread (PD-thread) as the basic unit of a snippet and then introduce a method of ranking PD-threads and selecting the top-ranking ones into the snippet.

A. PD-Thread: The Basic Unit of a Concept Snippet

The description graph of a concept is usually too large to be presented entirely, so a subgraph is extracted to form a snippet. However, an RDF triple is not suitable for being the basic unit of a snippet when blank nodes are involved. For example, as shown in Fig. 4, a single RDF triple (swrc:Student, rdfs:subClassOf, _b1), where _b1 is a blank node, gives almost no useful information. Several graph structures [10]–[12] have been proposed to cope with a similar problem called RDF graph decomposition. However, presenting such structures, which have a general topology, may take significant space in the result page, so users spend much time scrolling up and down to read the concepts before and after, which reduces the efficiency of result checking and comparison. Therefore, the basic unit of a snippet should be small but still meaningful when blank nodes are involved.

Based on this consideration, we propose the notion of PD-thread as the basic unit. In the description graph of a concept c, a PD-thread of c is a path in the graph identified as follows: The starting node of the path is c, and the ending node of the path is not a blank node, i.e., it is a URI or a literal; the internal nodes of the path, if any, are all blank nodes and are distinct from each other. The latter rule also avoids an infinite number of paths when there exist loops of blank nodes. For example, Fig. 4 contains four PD-threads of swrc:Student and one PD-thread of swrc:University.

Evidently, a PD-thread, as its name indicates, is a linear structure, so it can easily be presented within one line. In most cases, a PD-thread is a single RDF triple not containing blank nodes. Otherwise, for each blank node contained in a PD-thread, two RDF triples are included to detail its denotation, i.e., one in which it is the object and one in which it is the subject, which clarifies the meaning better than a single RDF triple can.

B. Generating Snippets by Ranking PD-Threads

In the system, a structured snippet of a concept consists of at most three of its PD-threads. Thus, the problem of generating snippets is transformed into a new problem: ranking PD-threads according to keyword queries. The ranking algorithm is outlined as follows:

1) Assign a ranking score to each PD-thread candidate.
2) Select the top-ranking candidate into the snippet.
3) If the desired number of PD-threads, which is three here, has not been reached, go back to Step 1).

The ranking score of a PD-thread is evaluated by its relevance to the keyword query. A virtual document is constructed for each PD-thread in order to calculate its query relevance, which includes the following: for each property labeled on the arcs of the path, use its local name and label, and for the ending node, use its local name and label if it is a URI or its lexical form if it is a literal. Finally, the ranking score of a PD-thread w.r.t. a keyword query is defined as the cosine of the angle between the vector form of the virtual document of the PD-thread and the vector form of the keyword query.

Further, for a multiterm query, the cosine measure may fail to create a snippet with good coverage of the terms in the keyword query and may lead to a sort of redundancy.
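The selection loop can be sketched as follows, together with the query-term damping that the example in the following text motivates. This is a sketch under assumptions: PD-threads are given as lists of already tokenized terms, raw term frequencies are used as weights, ties are broken by input order, and the names `select_snippet` and `DAMPED_WEIGHT` are hypothetical. The limit of three PD-threads and the value 0.001 are taken from the text.

```python
import math
from collections import Counter

MAX_THREADS = 3        # a snippet holds at most three PD-threads
DAMPED_WEIGHT = 0.001  # weight given to query terms that are already covered


def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_snippet(threads, query_terms, k=MAX_THREADS):
    """threads: dict mapping a PD-thread label to the terms of its virtual document."""
    query = {t.lower(): 1.0 for t in query_terms}
    remaining = {label: Counter(t.lower() for t in terms)
                 for label, terms in threads.items()}
    chosen = []
    while remaining and len(chosen) < k:
        # Steps 1) and 2): score every candidate and pick the top-ranking one.
        best = max(remaining, key=lambda label: cosine(remaining[label], query))
        chosen.append(best)
        # Damp the query terms covered by the selected PD-thread so that
        # still-uncovered terms dominate the next round.
        for term in remaining.pop(best):
            if term in query:
                query[term] = DAMPED_WEIGHT
    return chosen


# Hypothetical, pre-tokenized PD-thread virtual documents.
threads = {
    "A": ["student", "name"],
    "B": ["student", "id"],
    "C": ["student", "email", "address"],
    "D": ["university", "location", "city", "country"],
}
print(select_snippet(threads, ["student", "university"]))
```

Without the damping step, the three threads matching only "student" would be selected; with it, the thread matching "university" is picked in the second round.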
For example, for the keyword query “student university,” the three selected PD-threads in a snippet may all be matched with “student,” but none of them is matched with “university.” To deal with this, inspired by previous work on text summarization [13], after a PD-thread is selected into the snippet, the weights of the terms in the vector form of the keyword query that occur in the virtual document of this PD-thread are set to a very small number, i.e., 0.001 in the system. Consequently, in the next rounds, other unmatched terms in the query will dominate the scoring of the remaining PD-thread candidates, and the generated snippet is likely to cover more terms in the query.

VI. EXPERIMENTS

In this section, we first present the results of a preliminary usability evaluation of Falcons Concept Search and then report feedback collected from participants after the experiments.

A. Usability Evaluation

Other than Falcons Concept Search, Swoogle (http://swoogle.umbc.edu/) [1], as one of the most famous ontology search engines, was also evaluated as a reference. In total, 23 volunteers having experience of ontology-based research and development participated in the evaluation, including 4 senior researchers, 4 Ph.D. students, 14 master students, and 1 software engineer.

Participants were first given instructions on how to use each search engine. Then, they were asked to use each search engine individually to find the best ontology for modeling a university in order to build a university’s website. They could judge ontologies according to their own knowledge of this domain and this task; the only requirement was that the selected ontology must contain classes describing the concepts “university,” “student,” and “professor.” After finishing the task by using each search engine, participants immediately responded to the system usability scale (SUS) [14], consisting of ten five-point Likert items, each of which is a statement with a degree of agreement to be specified.

It is worth noting that the two search engines were based on different data sets, which might or might not have affected participants’ judgments of the systems’ usability. However, we did not take the difference of data sets into account in the evaluation, so the preliminary results presented should not be regarded as a comprehensive usability comparison.

TABLE I. AVERAGE RESULTS OF SUS

Table I summarizes the average degrees of agreement given by all the participants. Overall, Falcons Concept Search performed consistently better than Swoogle in all ten questionnaire items. In particular, significant differences (> 1.00) were observed in items one, three, four, seven, and eight, indicating that participants believed that Falcons Concept Search is much easier to use, and they significantly preferred this search engine. The SUS score, a composite measure of the overall usability of the system being studied with a range of 0–100, was also calculated based on the questionnaire results. Only three participants (13.0%) gave a higher SUS score to Swoogle than to Falcons Concept Search. The average SUS score of Falcons Concept Search is 71.63, which is significantly higher than that of Swoogle, 48.04.

B. User Feedback

We have also collected feedback about Falcons Concept Search from participants after the experiments. Generally, participants were impressed by the mode of user interaction provided, within which they could quickly obtain the concept list of each ontology and compare them. However, some participants suggested that more (than nine) ontologies should be recommended, and a brief description of each ontology could help them skip irrelevant ones more efficiently.
Participants also reported that query-relevant structured snippets assisted them not only in identifying query-relevant concepts but also in disambiguating concepts based on structurally related concepts. Some participants believed that presenting snippets in the form of a graph would be better. Moreover, one participant proposed that, when a great many concepts or ontologies were returned, clustering them would save users a large amount of time. Another interesting opinion given by several participants was that because practical ontology search engines serve not only researchers but also ordinary application developers, they should provide easily understandable interfaces and avoid purely academic functions or tips.
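For readers unfamiliar with how a SUS score in the 0–100 range is obtained from the ten five-point items used in Section VI-A, the standard scoring rule of the scale [14] can be sketched as follows. The function name `sus_score` and the sample responses are hypothetical; no participant data from the evaluation is reproduced here.

```python
def sus_score(responses):
    """Compute a SUS score from ten responses, each an integer from 1 to 5."""
    if len(responses) != 10:
        raise ValueError("SUS expects exactly ten item responses")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if not 1 <= response <= 5:
            raise ValueError("each response must be between 1 and 5")
        if item_number % 2 == 1:
            # Odd-numbered items are positively worded: contribution is response - 1.
            total += response - 1
        else:
            # Even-numbered items are negatively worded: contribution is 5 - response.
            total += 5 - response
    # The summed contributions (0 to 40) are scaled to the range 0 to 100.
    return total * 2.5


# Hypothetical questionnaire in which every item is answered with the neutral value.
print(sus_score([3] * 10))
```

Because odd and even items are scored in opposite directions, strong agreement with every statement does not give a high score; only agreement with the positively worded items and disagreement with the negatively worded ones does.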

VII. RELATED WORK: A COMPARISON OF ONTOLOGY SEARCH ENGINES

We compare existing ontology search engines by investigating how they implement several key components of a typical search engine, including query types, result presentation, and ranking. At the time of writing, we found five accessible ontology search engines, i.e., Falcons Concept Search, ONTOSEARCH2 (http://dipper.csd.abdn.ac.uk/OntoSearch/) [15], SQORE (http://research.shinawatra.ac.th:8080/sqore/SQORE.html) [3], Swoogle, and Watson (http://watson.kmi.open.ac.uk/) [2]. Table II summarizes a comparison of these systems. Other Semantic Web search engines are not included because either they are not accessible (e.g., OntoKhoj [16]) or they do not specifically provide concept search or ontology search (e.g., Sindice (http://sindice.com/) [17] and SWSE (http://swse.deri.org/) [18]).

TABLE II. COMPARISON OF ONTOLOGY SEARCH ENGINES

A. Query Types

All of the five search engines support keyword queries. ONTOSEARCH2 seems to support only single-term keyword queries. Falcons and Swoogle are implemented based on Lucene so that they fully support the Lucene query syntax, including phrase, wildcard, and Boolean queries. Swoogle and Watson also allow specifying finer constraints over various fields of data, such as matching keyword queries with only labels or comments of concepts. Swoogle and Watson enable users to specify whether they are searching for classes or properties when submitting queries, whereas Falcons provides this function as a secondary search. Falcons also enables users to filter the concepts returned by selecting an ontology from the recommended ones. Other than keyword queries, expert users can directly submit SPARQL queries to ONTOSEARCH2 and Watson. Falcons, SQORE, Swoogle, and Watson also provide application programming interfaces.

B. Result Presentation

Swoogle provides concept-level search and ontology-level search separately. For each concept returned, its associated literals are concatenated into a document, the part of which that is matched with the keyword query is presented as a snippet. Similarly, for each ontology returned, the names of the concepts in the ontology are concatenated into a document for generating snippets. Swoogle also provides plenty of metadata for each concept and ontology.

In Watson, each result is an ontology, associated with its file size and language. The concepts in the ontology matched with the keyword query are also presented, each followed by its label and comment. For each ontology, users can look up its metadata such as language.

ONTOSEARCH2 also presents both ontology URIs and matched concepts. Each ontology is associated with its ranking score. SQORE presents ontology URIs only.

In Falcons, each concept returned is associated with its label, type, a query-relevant structured snippet consisting of part of its RDF description together with its provenance, and a number indicating how many RDF documents mention the concept. Several ontologies are also recommended, with which users can filter the results. Users can also retrieve the detailed RDF description of each concept/ontology.

C. Ranking

In Swoogle, by default, the results are ranked according to a PageRank-like algorithm [1]. Keyword queries only affect whether a concept or an ontology is included in the results but do not participate in the ranking process. Users can also choose to rank ontologies by date or by size. SQORE employs WordNet to determine various semantic relations between terms in the keyword query and concept names in ontologies and allows users to quantify these relations in the ranking process [3]. ONTOSEARCH2 seems to use AKTiveRank [9] to rank ontologies. Falcons ranks concepts and ontologies according to their relevance to the keyword query and their popularity on the Semantic Web.

VIII. CONCLUSION

In this paper, we have introduced how to search concepts and ontologies with Falcons Concept Search and have detailed its design and implementation.
The system integrates concept-level search and ontology-level search by recommending ontologies and allowing users to filter concepts by ontology. For each concept returned, its label, type, and a query-relevant structured snippet are provided to help users quickly determine its relevance to their needs. Based on the concepts returned and their structured snippets, users can quickly learn the relevance and characteristics of an ontology and can also easily compare ontologies. The detailed RDF description of concepts and ontologies is also provided on demand.

The technical contributions of this paper include a mode of interaction that helps users quickly find desired concepts and ontologies as well as a supportive combined inverted index structure, a method of constructing virtual documents of concepts that includes the names of associated properties and related entities, a way to rank concepts and ontologies based on their popularity on the Semantic Web as well as their relevance to keyword queries, and a method of generating query-relevant structured snippets. We have also performed a usability evaluation and compared various aspects of five accessible ontology search engines.

User interaction is crucial to the usability of a search engine. In future work, we will investigate other query types besides keywords, e.g., controlled natural languages [19], and improve the method of snippet generation in order to better present ontology structures. It is also interesting to consider other metrics for ontology evaluation and recommendation.

ACKNOWLEDGMENT

The authors would like to thank W. Ge for his time and effort in implementing the system, H. Wu for his valuable suggestions during the development of the system, and all the participants in the experiments. The authors would also like to thank the editors and anonymous reviewers for their insightful comments and suggestions that improved this paper considerably.

REFERENCES

[1] L. Ding, T. Finin, A. Joshi, Y. Peng, R. Pan, and P. Kolari, “Search on the semantic web,” IEEE Comput., vol. 38, no. 10, pp. 62–69, Oct. 2005.
[2] M. d’Aquin, C. Baldassarre, L. Gridinoc, M. Sabou, S. Angeletou, and E. Motta, “Watson: Supporting next generation semantic web applications,” in Proc. IADIS Int. Conf. WWW/Internet, 2007, pp. 363–371.
[3] C. Anutariya, R. Ungrangsi, and V. Wuwongse, “SQORE: A framework for semantic query based ontology retrieval,” in Proc. 12th Int. Conf. Database Syst. Adv. Appl., 2007, pp. 924–929.
[4] C. Watters, “Information retrieval and the virtual document,” J. Amer. Soc. Inf. Sci., vol. 50, no. 11, pp. 1028–1029, Aug. 1999.
[5] Nokia, P. Stickler, CBD—Concise Bounded Description. [Online]. Available: http://sw.nokia.com/uriqa/CBD.html
[6] Y. Qu, W. Hu, and G. Cheng, “Constructing virtual documents for ontology matching,” in Proc. 15th Int. World Wide Web Conf., 2006, pp. 23–31.
[7] X. Zhang, H. Li, and Y. Qu, “Finding important vocabulary within ontology,” in Proc. 1st Asian Semant. Web Conf., 2006, pp. 106–112.
[8] G. Wu, J. Li, L. Feng, and K. Wang, “Identifying potentially important concepts and relations in an ontology,” in Proc. 7th Int. Semant. Web Conf., 2008, pp. 33–49.
[9] H. Alani and C. Brewster, “Metrics for ranking ontologies,” in Proc. 4th Int. EON Workshop, 2006, pp. 1–7.
[10] X. Zhang, G. Cheng, and Y. Qu, “Ontology summarization based on RDF sentence graph,” in Proc. 16th Int. World Wide Web Conf., 2007, pp. 707–716.
[11] L. Ding, T. Finin, Y. Peng, A. Joshi, P. P. da Silva, and D. L. McGuinness, “Tracking RDF graph provenance using RDF molecules,” in Proc. 4th Int. Semant. Web Conf. (Poster), 2005, pp. 1–14.
[12] G. Tummarello, C. Morbidoni, R. Bachmann-Gmür, and O. Erling, “RDFSync: Efficient remote synchronization of RDF models,” in Proc. 6th Int. Semant. Web Conf. 2nd Asian Semant. Web Conf., 2007, pp. 537–551.
[13] A. Nenkova, L. Vanderwende, and K. McKeown, “A compositional context sensitive multi-document summarizer: Exploring the factors that influence summarization,” in Proc. 29th Annu. Int. ACM SIGIR Conf. Res. Develop. Inf. Retrieval, 2006, pp. 573–580.
[14] J. Brooke, “SUS—A quick and dirty usability scale,” in Usability Evaluation in Industry, P. W. Jordan, B. Thomas, I. L. McClelland, and B. Weerdmeester, Eds. Boca Raton, FL: CRC Press, 1996, pp. 189–194.
[15] J. Z. Pan, E. Thomas, and D. Sleeman, “ONTOSEARCH2: Searching and querying web ontologies,” in Proc. IADIS Int. Conf. WWW/Internet, 2006, pp. 211–219.
[16] C. Patel, K. Supekar, Y. Lee, and E. K. Park, “OntoKhoj: A semantic web portal for ontology searching, ranking and classification,” in Proc. 5th Int. Workshop Web Inf. Data Manage., 2003, pp. 58–61.
[17] E. Oren, R. Delbru, M. Catasta, R. Cyganiak, H. Stenzhorn, and G. Tummarello, “Sindice.com: A document-oriented lookup index for open linked data,” Int. J. Metadata Semant. Ontologies, vol. 3, no. 1, pp. 37–52, Jan. 2008.
[18] A. Harth, A. Hogan, R. Delbru, J. Umbrich, S. O’Riain, and S. Decker, “SWSE: Answers before links!,” in Proc. Semant. Web Challenge, 2007, pp. 1–8.
[19] R. Valencia-García, F. García-Sánchez, D. Castellanos-Nieves, and J. T. Fernández-Breis, “OWLPath: An OWL ontology-guided query editor,” IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 41, no. 1, pp. 121–136, Jan. 2011.