
Using Fuzzy Ontology for Query Refinement in a Personalized Abstract Search Engine

Dwi H. Widyantoro and John Yen
Department of Computer Sciences, Texas A&M University
College Station, TX 77843-3112, USA
dhw7942, [email protected]

Abstract

Given a query term, PASS uses its knowledge about term associations to suggest a list of broader and narrower terms in addition to providing the search results based on the original query term. A term x is said to be broader than a term y if the semantic meaning of x subsumes or covers that of y. For example, fuzzy system is broader than fuzzy controller, while the latter term is broader than Lyapunov stability. The definition of a narrower term is the inverse of the broader term definition. The fuzzy ontology of term associations is created automatically using information obtained directly from a corpus. The construction of this ontology involves two steps: (1) creating a full ontology using the notion of the fuzzy narrower term relation, and (2) pruning the full ontology to eliminate excessive and unnecessary term associations.
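As a minimal sketch of this suggestion step, assume the ontology is stored as a mapping from (narrower term, broader term) pairs to membership degrees. The names `ONTOLOGY` and `suggest_terms` and the membership degrees are hypothetical and chosen only for illustration; they are not taken from PASS.

```python
from typing import Dict, List, Tuple

# Hypothetical fuzzy ontology: (narrower term, broader term) -> membership degree.
# The terms come from the example in the text; the degrees are made up.
ONTOLOGY: Dict[Tuple[str, str], float] = {
    ("fuzzy controller", "fuzzy system"): 0.8,
    ("lyapunov stability", "fuzzy controller"): 0.6,
}


def suggest_terms(query: str,
                  ontology: Dict[Tuple[str, str], float]) -> Dict[str, List[str]]:
    """Return broader and narrower terms for `query`, strongest association first."""
    broader = [(deg, b) for (n, b), deg in ontology.items() if n == query]
    narrower = [(deg, n) for (n, b), deg in ontology.items() if b == query]
    return {
        "broader": [term for _, term in sorted(broader, reverse=True)],
        "narrower": [term for _, term in sorted(narrower, reverse=True)],
    }


print(suggest_terms("fuzzy controller", ONTOLOGY))
```

In this toy ontology, the query "fuzzy controller" yields "fuzzy system" as a broader suggestion and "lyapunov stability" as a narrower one, mirroring the example in the text.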

Recommending alternate queries during information seeking activities is an important feature of a Web-based search engine because users often do not know the exact terms needed to locate the information relevant to their interests. This paper describes our approach to automatically constructing a fuzzy ontology that can be used to refine a user's query. The method has been incorporated in a domain-specific search engine, namely Personalized Abstract Search Services (PASS). Preliminary results suggest that the basic approach adopted, as well as the technique developed for practical use, is intuitively promising.

In the following section, we describe the background from which our approach departs. Section 3 presents our ontology-based approach for query refinement. We then briefly discuss how to use the fuzzy ontology for query refinement and how the technique might improve the effectiveness of information seeking activities. Section 5 describes the implementation of fuzzy ontology construction in the PASS search engine and shows some of its results based on the current PASS abstract collection. Related work on query refinement methods is discussed in Section 6, followed by conclusions in Section 7.

1. Introduction

Web-based search engines have become common tools for locating information in cyberspace. Typical search engines retrieve information based on keywords given by users and return the information found as a list of search results. In spite of their popularity, keyword-based search engines have a weakness: they often return a long list of search results in which many of the top-ranked results are irrelevant. This problem can be trivially avoided if users know exactly the right query terms to use. These terms are the ones that are unique, or at least very specific, causing the search engines to bring only the relevant information to the top of the result list. However, such query terms are often very hard to find, and in many cases they do not even exist. Finding the right query terms can therefore become an additional task during an information seeking activity. Unfortunately, this problem has not been widely addressed by most search engines currently available on the Web.

2. Background

The basic construct needed to build a fuzzy ontology of term associations is knowledge about the relation between two terms. This knowledge can be acquired from a thesaurus that is constructed either manually (e.g., WordNet [1]) or automatically [2, 3]. In this work, we focus on the construction of a fuzzy ontology of term associations derived from an automatically built, co-occurrence-based thesaurus. Furthermore, we are interested in exploiting narrower and broader term relations as building blocks in the fuzzy ontology construction.

This paper presents our strategy for refining a query term and describes how the strategy is applied in Personalized Abstract Search Services (PASS). Our approach is based on a fuzzy ontology of term associations.

2.1. Fuzzy Narrower Term Relation

Let $C = \{a_1, a_2, \ldots, a_n\}$ be a collection of articles, where each article $a_i$ is represented by a set of terms $\{t_1, t_2, \ldots, t_m\}$. Given a term $t_j$, the occurrence of $t_j$ in article $a_i$ is represented by $occur(t_j, a_i)$ and its membership value is $\mu_{occur}(t_j, a_i)$. The function $f$ that defines $\mu_{occur}$ is a general membership function that takes the frequency of occurrence of $t_j$ in $a_i$ as its argument. In the information retrieval community, the function $f$ can be viewed as a normalized within-document term weighting method.

Let $NT(t_i, t_j)$ denote that $t_i$ is narrower-than $t_j$. The membership degree of $NT(t_i, t_j)$, represented by $\mu_{NT}(t_i, t_j)$, is defined by

$$\mu_{NT}(t_i, t_j) = \frac{\sum_{a \in C} \mu_{occur}(t_i, a) \otimes \mu_{occur}(t_j, a)}{\sum_{a \in C} \mu_{occur}(t_i, a)} \qquad (1)$$

where $\otimes$ denotes a fuzzy conjunction operator. The equation above is equivalent to the fuzzy narrower term relation described in [4]. If we define $f$ as a binary function such that

$$f(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases}$$

then it can easily be shown that Equation 1 collapses into the same expression regardless of the choice of fuzzy conjunction operator.

The definition simply says that the membership degree that term $t_i$ is narrower-than term $t_j$ is the ratio between the number of co-occurrences of both terms and the number of occurrences of term $t_i$. Therefore, the more frequently terms $t_i$ and $t_j$ co-occur and the less frequently term $t_i$ occurs in documents, the higher the degree of confidence that $t_i$ is narrower-than $t_j$. A membership value of 1.0 is obtained when a term always co-occurs with another term. In contrast, the membership value of the narrower term relation between two terms that never co-occur is 0.

Similarly, the membership degree of the broader term relation $BT(t_i, t_j)$, represented by $\mu_{BT}(t_i, t_j)$, can be defined by

$$\mu_{BT}(t_i, t_j) = \frac{\sum_{a \in C} \mu_{occur}(t_i, a) \otimes \mu_{occur}(t_j, a)}{\sum_{a \in C} \mu_{occur}(t_j, a)} \qquad (2)$$

or

$$\mu_{BT}(t_i, t_j) = \mu_{NT}(t_j, t_i) \qquad (3)$$

suggesting that the computation of $BT$'s membership value can be derived directly from the corresponding $NT$'s membership value.
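As a minimal sketch of these definitions, the narrower-than degree can be computed as the ratio of co-occurrences to occurrences, and the broader-than degree by swapping the arguments. The sketch assumes the binary membership function (a term either occurs in an article or it does not) and uses min as the default fuzzy conjunction; the function names and the toy collection are hypothetical.

```python
from typing import Callable, List, Set


def mu_occur(term: str, article: Set[str]) -> float:
    """Binary membership (an assumption): 1.0 if the term occurs in the article."""
    return 1.0 if term in article else 0.0


def mu_nt(t_i: str, t_j: str, collection: List[Set[str]],
          conj: Callable[[float, float], float] = min) -> float:
    """Degree to which t_i is narrower-than t_j, summed over all articles."""
    num = sum(conj(mu_occur(t_i, a), mu_occur(t_j, a)) for a in collection)
    den = sum(mu_occur(t_i, a) for a in collection)
    return num / den if den else 0.0  # a term that never occurs gets degree 0


def mu_bt(t_i: str, t_j: str, collection: List[Set[str]],
          conj: Callable[[float, float], float] = min) -> float:
    """Broader-than degree, obtained by swapping the arguments of mu_nt."""
    return mu_nt(t_j, t_i, collection, conj)


# Toy collection (made-up data): each article is a set of terms.
articles = [
    {"fuzzy system", "fuzzy controller"},
    {"fuzzy system", "fuzzy controller"},
    {"fuzzy system"},
    {"fuzzy system", "membership function"},
]

# 2 co-occurrences / 2 occurrences of "fuzzy controller"
print(mu_nt("fuzzy controller", "fuzzy system", articles))
# 2 co-occurrences / 4 occurrences of "fuzzy system"
print(mu_nt("fuzzy system", "fuzzy controller", articles))
```

With binary memberships, passing the product `lambda x, y: x * y` as `conj` gives the same values as min, which illustrates why the choice of conjunction operator does not matter in that case.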





3. Fuzzy Ontology Construction

This section describes our approach to constructing a fuzzy ontology based on the narrower and broader term relations. The technique employed can also be considered ... In general, the fuzzy ontology construction can be grouped into two stages. The first stage creates a full ontology from fuzzy narrower term relations. In the second stage, the full fuzzy ontology is pruned by eliminating unnecessary relations.

3.1. Building Fuzzy Ontology from Fuzzy Narrower Term Relation

A fuzzy ontology can be constructed by first calculating the membership values of the two NT relations for each pair of distinct terms (e.g., $\mu_{NT}(t_i, t_j)$ and $\mu_{NT}(t_j, t_i)$). A set of tests is then applied to select the NT relation that will be incorporated in the fuzzy ontology. The selection process at this stage eliminates redundant, less meaningful, and unrelated term relations.
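The first step, computing both NT membership values for every pair of distinct terms, might look like the sketch below. It assumes binary term occurrence and an inverted index from each term to the set of articles containing it; `index` and `pairwise_nt` are hypothetical names, and the selection tests themselves are not shown.

```python
from itertools import combinations
from typing import Dict, Set, Tuple

# Hypothetical inverted index: term -> ids of the articles containing it (made-up data).
index: Dict[str, Set[int]] = {
    "fuzzy system": {1, 2, 3, 4},
    "fuzzy controller": {1, 2},
    "neural network": {5},
}


def pairwise_nt(index: Dict[str, Set[int]]) -> Dict[Tuple[str, str], float]:
    """Map (t_i, t_j) to the degree that t_i is narrower-than t_j, for both orders."""
    degrees: Dict[Tuple[str, str], float] = {}
    for t_i, t_j in combinations(sorted(index), 2):
        both = len(index[t_i] & index[t_j])  # number of co-occurrences
        # Assumption: binary occurrence, so each degree is a ratio of counts.
        degrees[(t_i, t_j)] = both / len(index[t_i]) if index[t_i] else 0.0
        degrees[(t_j, t_i)] = both / len(index[t_j]) if index[t_j] else 0.0
    return degrees


for pair, degree in sorted(pairwise_nt(index).items()):
    print(pair, degree)
```

Pairs whose degrees are 0 in both directions correspond to terms that never co-occur, so they would presumably be among the unrelated relations that the selection tests discard.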

3.1.1 Redundant Term Relation Elimination

[Algorithm listing not recoverable from the extracted text. The surviving fragments indicate a test comparing $\mu_{NT}(t_i, t_j)$ with $\mu_{NT}(t_j, t_i)$, a step that adds the selected $NT$ relation into the ontology, and a step 3, "Second stage", that iterates over the $NT$ relations in the ontology.]
