International Journal of Software Engineering and Its Applications Vol. 2 No. 4, October, 2008
Semantic Query Routing for Ontological Knowledge Chain
Rui Wang, Xueli Yu, Yingjie Li
College of Computer and Software, Taiyuan University of Technology
Abstract
With a large quantity of data and services available in a virtual organization, query routing is an efficient way to meet the knowledge requirements of the entities in a community. The entities are organized according to the semantic relationships among them, and a semantics-guided query routing method tries to return the most relevant answer to the querist. This paper proposes an extended query routing method that draws on the relationships between concepts in a domain ontology to mine the relationship between the knowledge the querist holds and the knowledge the querist wants. Through the Related Segmentation of Ontology algorithm, this paper explores a cognitive chain that helps the querist understand how the knowledge in the answer comes about and infer the answer the querist wants to know.
1. Introduction
There is a huge amount of data and services available on the Web, which gives users a chance to satisfy their knowledge needs. Meanwhile, the emergence of the Semantic Web promises context for knowledge and data engineering [7], which improves the organization of knowledge on the Web and makes it easier for a user to access what he needs. However, traditional search technologies cannot locate exactly the knowledge that a user needs, and cannot return a chain between the knowledge he wants and the knowledge he holds.
Ontology partition, a method for extracting relevant segments out of a large description logic ontology, increases tractability for both humans and computers. This technique takes advantage of the detailed semantics captured within an OWL ontology to produce highly relevant segments [1]. Although such a segment is small, it is focused and still contains enough information for a user to infer new knowledge from the original knowledge and to find the link between different pieces of knowledge.
In this paper we present the current results of our research in the RCCOKR (Research on Content-awareness and Context-awareness Ontology Knowledge Routing) project. The problem-solving method of this paper is limited to a small community with only a few nodes. Although such a community is too small to represent a real-world network, a large-scale network can be divided into many communities of small granularity. This paper tries to optimize the small community, which can pave the way for optimizing a larger one.
Our research on query routing focuses on making the best of the domain ontology, using the clues provided by the relationships between concepts in the ontology to find a target node that can answer the query. At the same time, our research explores a cognitive process for network thinking [16]. In this process, a knowledge chain is found that records how other knowledge relates the knowledge a user holds to the target knowledge the user wants, which helps the user understand how to infer the answer and what he should know before he reaches it.
This paper puts forward a new routing mechanism called Ontology Partition Based Routing (OPBR). The mechanism relies on the related segmentation of ontology method, which produces a concept chain by partitioning a description logic ontology, searching from an original concept to a target concept. The community on which this paper carries out query routing is organized following the relations of concepts in the domain ontology, so semantics is built into its operation. In addition, this paper uses Description Logic to prove the theorem and explain the models. The description logic follows the basic description language AL; the related background can be found in [8], [9].
The content of this paper is organized as follows. Section 2 describes the ontology partition method and introduces how it can be applied to query routing. Section 3 presents an abstract system model for semantic query routing. Section 4 details the OPBR algorithm for query routing. A simulation experiment and the corresponding evaluation are shown in Section 5. Section 6 discusses some difficulties for the widespread application of our idea.
2. Ontology Partition method
The ontology partition technique presented in this paper helps the user in two respects. First, it helps to create a custom ontology whose terms cover the knowledge related to the query. With the segmentation of the ontology, peers that emit the query can have a panorama of the knowledge they require. Second, and this is the most important idea of this paper, ontology partition can exploit the semantic links between ontology terms, through which a querist peer with local knowledge can be connected to other peers that have the target knowledge. This is what query routing needs. Accordingly, this paper explains the method with Description Logic (DL), the details of which can be found in [8].
2.1. Relations of concepts in ontology
A concept has three fundamental relationships, defined as follows:
Definition 1 (Super-class Relationship) A class C inherits the common properties of its super-class Sp, which serve as the fundamental properties of all the classes that belong to the same super-class. In DL, this is an “is a kind of” relationship, represented as:
C ⊆ ∃hasSuperClass.Sp
Definition 2 (Sub-class Relationship) A class C transmits its fundamental definition and its properties to its sub-classes Sb. The relationship between a class and its sub-class is a kind of subsumption. In DL, this is a “has a kind” relationship, represented as:
C ⊆ ∃hasSubClass.Sb
Definition 3 (Restrictions Relationship) Any role R, as a primitive binary predicate, denotes a binary relationship between individuals. In R(C1, C2), R is the restriction of C1, and C2 is a filler concept of R. R can be seen as a property of C1. In OWL, this relationship takes C1 as the Domain of an ObjectProperty and C2 as its Range. Therefore
C1 ⊆ ∃hasObjectProperty.C2
2.2. Related segmentation of ontology
Because of the three fundamental relationships between classes described above, we can find a semantic link from one class to another. During ontology partition, the algorithm traverses the ontology, setting out from the original concept C1 to seek the target concept C2, or searching in the inverse direction. In this paper this process is called dualistic ontology partition. The model for this partition is defined as a quadruple:
OP2 = <C1, C2, L, R>.
L represents the Related Partition Function (RPF):
L : C1 × C2 → R.
R is a Related Chain Ontology (RCO) generated by ontology partition. For generality, this paper extends the dualistic model to a multiple model, represented as a quintuple:
OPn = <S, T, LS, LT, R>
where S is an original concept, T is a set of target concepts {C1, ..., Cn−1}, LS : S × Ci → R, LT : Ci × Cj → R, and i, j ∈ [1, ..., n−1]. With this multiple model, a query that bears more than a single concept can be solved.
2.3. Rationality of ontology partition for query routing
A query issued by a querist falls into one of two sorts. One sort arises from the querist’s curiosity and has nothing to do with the querist’s own knowledge. The other is generated on the basis of the querist’s own knowledge and requires an answer related to the concepts or rules the querist holds. This paper mainly focuses on the latter sort of query, for which there is a lemma:
Lemma (Relationship) Q = {C1, ..., Cn} is the set of concepts extracted from the query. K(querist) represents the set of concepts that the querist holds. Assume that a query asked by a querist has a direct or indirect relationship with the knowledge the querist holds. Then the following formula holds:
K(querist) ⊆ ∃hasRelationwith.(C1 ∪ C2 ∪ ... ∪ Cn)
If a query relates to its producer’s knowledge, this relationship can be quantified.
Theorem (Chain Ontology and Route) If a query has something to do with the querist’s knowledge, then there exists a semantic route Rt (without considering direction) leading from a concept of the querist to one target concept T of the query in the domain ontology. Therefore, through ontology partition, a Chain Ontology can be attained.
Proof: In order to prove that there exists a semantic route between the knowledge of the querist and the query, two preconditions are declared as follows. First, all the concepts that the querist holds and that the query includes can be found
in the OWL document. Second, all these concepts and relationships can be expressed in Description Logic. The relationship R in this paper is limited to the three sorts discussed above. Therefore the relationships between concepts can be represented as:
$R \in \{isSubClassof,\ isSuperClassof,\ hasObjectproperty\}$.
Concept S comes from the querist’s knowledge, and concept C is one concept extracted from the query. Because S is not an isolated concept, there exist an atomic relation R and a concept $T_1$ that satisfy the following equation:
$S = \exists R.T_1$ (1)
Similarly, $T_1$ is a concept related to others in the ontology. Therefore a series of equations can be derived:
$T_1 = \exists R.T_2,\ \dots,\ T_{n-1} = \exists R.T_n,\quad n > 0$ (2)
From (1) and (2), another equation can be derived:
$S = \underbrace{\exists R.\exists R \cdots \exists R}_{n}.T_n$ (3)
According to the Lemma, we can deduce:
$S \subseteq \exists hasRelationwith.C$ (4)
From (3) and (4), a conclusion can be reached: there exists a finite n (n ≥ 0) such that $C = T_n$. To sum up, there exists a semantic chain $S \to T_1 \to \dots \to T_n$, $T_n = C$, from S to C.
Deduction: An ontology that includes two interrelated concepts S and C can be partitioned into a new ontology, which shows how concept S is related to concept C through a knowledge chain, and also shows the relationships among the pieces of knowledge on that chain.
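The chain S → T1 → ... → Tn = C of the theorem can be illustrated with a small search over relation triples. The following Python sketch is only an illustration under stated assumptions: the toy triples, the concept names and the function `related_chain` are hypothetical and do not come from the paper, and a plain breadth-first search stands in for the Related Partition Function L.

```python
from collections import deque

# Toy ontology as (subject, relation, object) triples; all names are hypothetical.
TRIPLES = [
    ("S", "hasObjectProperty", "T1"),
    ("T1", "hasSuperClass", "T2"),
    ("C", "hasObjectProperty", "T2"),
    ("T1", "hasSubClass", "X"),
]

def related_chain(triples, source, target):
    """Breadth-first search for a shortest chain of triples linking source to target.

    Links are followed in both directions, because the theorem ignores direction.
    Returns the list of triples on the chain, or None if no chain exists.
    """
    neighbours = {}
    for s, r, o in triples:
        neighbours.setdefault(s, []).append((o, (s, r, o)))
        neighbours.setdefault(o, []).append((s, (s, r, o)))
    queue, seen = deque([(source, [])]), {source}
    while queue:
        concept, chain = queue.popleft()
        if concept == target:
            return chain
        for nxt, triple in neighbours.get(concept, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, chain + [triple]))
    return None

# The returned triples form a small "chain ontology" relating S to C.
chain = related_chain(TRIPLES, "S", "C")
```

In this toy case the chain passes through T1 and T2 and leaves out the unrelated concept X, which is the sense in which the partition keeps only the segment relevant to the two concepts.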
3. Abstract system model
3.1. Organization
The system consists of three main parts: Routing Node (RN), User Node (UN), and System Domain Ontology Depository (SDOD). SDOD guides the construction of the system architecture according to the class hierarchy of its ontology. RNs are organized following the structure of classes in SDOD. RNs of the same level interconnect with each other. The relationships between RNs depend on the relationships between the classes that live in the RNs, as shown in Figure 1. In this figure, concept 1 lives in the R_table of RN B, its super-class concept 2 lives in RN A, and its sub-classes concept 3 and concept 4 are stored in C1 and C2 respectively. The filler concept of its restriction, concept 5, is in D. All the UNs with concepts in the R_table of RN B are linked under B.
Every root class in the ontology, except for the class “Thing”, occupies an RN exclusively, which is called a top RN. To meet the demands of system efficiency, some RNs are duplicated to speed up query routing. All the top RNs are connected following the structure of a decentralized P2P system [10], as shown in Figure 2. All the UNs are attached under the corresponding RNs according to the classes their knowledge belongs to. Before a UN joins the community, it is required
to compare its knowledge with the knowledge in the community, and to call a semantic matching function to compute which topic its knowledge is most semantically similar to. The design of the semantic matching function is described in our other paper [17].
Figure 1. Relationship between nodes
Figure 2. Topology of RNs
(1). Routing Node: An RN is in charge of pointing out the UN that has the answer to the query, or suggesting the next RN that can lead to the answer. The structure of an RN is modeled as a quadruple:
RN = <NID, R_table, matcher, Z_link>
where NID is the unique identity of the Routing Node. R_table is a routing table that helps guide the query message to its target node, which has the knowledge the querist requires. The R_table is a quadruple:
R_table = <CID, Sup_entry, Sub_entry, P_entry>.
CID is a unique class identifier; it indicates that this node links the UNs that have the detailed knowledge of the class. Sup_entry is an entry address of an RN whose CID is a super-class of this node’s CID. Similarly, Sub_entry points to its sub-class RNs. P_entry denotes the hasObjectproperty relationships (Section 2.1) between the CID of this node and the CID of the node its P_entry points to. Each entry item contains two parts: the address of the RN the link points to, and the ID of the concept this CID relates to. The size of the R_table depends on the efficiency of the whole system. An example R_table for node B in Figure 1 can be designed as in Figure 3:
Figure 3. A simple example of R_table
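As a rough illustration of the R_table structure, the following Python sketch encodes a possible table for RN B using only the placement of concepts described for Figure 1. The class names `Entry` and `RTableRow`, the helper `next_hops`, and the concrete identifiers are assumptions made for this sketch; the actual layout of Figure 3 may differ.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Entry:
    """One entry item: the address of the linked RN plus the related concept ID."""
    rn_address: str
    concept_id: str

@dataclass
class RTableRow:
    """One R_table row for a single CID hosted by this RN."""
    cid: str
    sup_entry: list = field(default_factory=list)
    sub_entry: list = field(default_factory=list)
    p_entry: list = field(default_factory=list)

# Hypothetical R_table of RN B, following the description of Figure 1:
# concept 1 lives in B, its super-class (concept 2) in A, its sub-classes
# (concepts 3 and 4) in C1 and C2, and its restriction filler (concept 5) in D.
r_table_b = {
    "concept1": RTableRow(
        cid="concept1",
        sup_entry=[Entry("A", "concept2")],
        sub_entry=[Entry("C1", "concept3"), Entry("C2", "concept4")],
        p_entry=[Entry("D", "concept5")],
    )
}

def next_hops(row):
    """Addresses of all RNs reachable from this row, super-class link first."""
    return [e.rn_address for e in row.sup_entry + row.sub_entry + row.p_entry]

hops = next_hops(r_table_b["concept1"])  # ['A', 'C1', 'C2', 'D']
```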
Matcher compares the topic concept of the message with the CIDs in the RN’s R_table. Z_link is a zonal link made up of four parts. The first is the Knowledge Link, which leads the message to the UNs that have knowledge about the CIDs in the RN. The second is the Super-class Link, which leads the message to the RNs that Sup_entry points to. The third, the Sub-class Link, leads to the RNs that Sub_entry points to. The last is the Property Link, which leads the message to the RN whose CID is a filler concept of the restriction of the related concept.
(2). User Node: A UN is a node with local knowledge. It is both a user and a supplier in the system. The UN is a triple:
UN = <UID, KB, Onto>.
UID is a unique user identity. KB is the file depository of the user. Onto is the local domain ontology for the user’s knowledge. The user can visit the ontology of other nodes to look for concepts of interest and to find the corresponding knowledge file in KB. In addition, R_link is a link that connects a UN with an RN.
3.2. Message model
There are two kinds of messages for communication between nodes in the system. One is the query message (QM), which is a sextuple:
QM = <QID, T, UID, Rt, NT, hop>.
In this model, QID is a unique query ID that ensures an RN does not handle a query it has already received, which helps to avoid cyclical traversal during query routing. QID also helps to gather the answers for the user if the query contains more than one concept. T is a target concept of the query. UID is the unique user ID generated when a user node joins the system. Rt dynamically records the route the QM has passed. NT is the node type, which dynamically denotes the relationship between the CID of the current node and that of the node the QM has just passed, with NT ∈ {Property, Super-class, Sub-class}. The other kind of message is the Answering Message (AM), which is a quadruple:
AM = <T, UID, QID, Rt>.
T is the concept that answers the query. UID denotes the user who holds the answer. QID indicates which query the answer is for. Rt stores the path from the querist to the answerer. In addition, a concept extracting function is needed to extract concepts from a query, which can be modeled as:
Φext : query → T = {C1, C2, ..., Cn}.
Relevant research can be found in [14], [11].
3.3. Domain ontology and extensibility
Before a community is formed around a certain interest, a domain ontology should be created to guide the semantic organization of files and documents in the community. This is reasonable and can be achieved with the participation of domain experts.
It is not necessary to build a domain ontology that contains all knowledge, which is not possible in the real world. At first, a small ontology can be built for basic purposes. With CO-ODE ∗ technology, the community can extend its existing ontology for specific uses as users’ knowledge requirements increase. The ontologies of the community can be stored and managed in a distributed way [15], which helps to solve the problem of storing a large ontology in the community. With the domain ontology, the community can be organized semantically.
3.4. Mechanism
When a query is generated by a user, it is first handled by the concept extracting function Φext. This yields the concept set of the query, and each concept Ci is encapsulated into a QM. The QM is sent to an adjoining RN, from which it can be transmitted to other RNs according to the relationship between the CID of this RN and those of the interconnected RNs. The RN compares Ci with the CIDs in its R_table. If there is a match, the target node is found, and an AM is produced to respond to the querist. Otherwise, the QM is transmitted to the next batch of RNs through the three links (Sup_entry, Sub_entry, P_entry). If the number of hops of a QM exceeds a predefined threshold, the maximum number of hops (TTL, whose choice will be discussed in our future work), the QM is abandoned and a failure message is returned. When all AMs and failure messages have come back, the querist node assembles the AMs into a complete answer for the query and obtains the knowledge along the recorded path Rt.
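The handling of a single QM at one RN can be sketched as follows. This is a minimal Python illustration, not the paper's implementation: the field names mirror the QM and AM tuples of Section 3.2, while the function `handle`, its return labels, the `holders` mapping from CID to the UID of a UN that has the knowledge, and the TTL value are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field

TTL = 4  # assumed threshold; the paper leaves the choice of TTL to future work

@dataclass
class QM:
    qid: str
    target: str                                  # T, the target concept
    uid: str                                     # UID of the querist
    route: list = field(default_factory=list)    # Rt, route recorded so far
    node_type: str = ""                          # NT, link type just followed
    hop: int = 0

@dataclass
class AM:
    target: str
    uid: str       # UID of the user who holds the answer
    qid: str
    route: list

def handle(rn_id, holders, links, qm, seen):
    """Process one QM at one RN.

    holders: CID -> UID of a UN under this RN (hypothetical lookup).
    links:   list of (link type, RN address) taken from the R_table entries.
    seen:    set of QIDs this RN has already handled.
    """
    if qm.qid in seen:                 # duplicate QID: do not forward again
        return ("drop", None)
    seen.add(qm.qid)
    if qm.hop > TTL:                   # hop limit exceeded: report failure
        return ("fail", None)
    route = qm.route + [rn_id]
    if qm.target in holders:           # matcher hit: answer the querist
        return ("answer", AM(qm.target, holders[qm.target], qm.qid, route))
    forwarded = [(addr, QM(qm.qid, qm.target, qm.uid, route, nt, qm.hop + 1))
                 for nt, addr in links]
    return ("forward", forwarded)

seen_b = set()
kind, result = handle("B", {"concept1": "u3"}, [("Super-class", "A")],
                      QM("q1", "concept1", "u7"), seen_b)
```

Here `kind` is "answer" and `result.route` records the single RN visited; a second QM with the same QID arriving at B would be dropped, which is the cycle-avoidance role the paper assigns to QID.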
4. Routing algorithm
4.1. Existence checking
Before query routing, one thing should be confirmed: whether the concept of the query exists in the system. Because the system is built under the guidance of SDOD, every concept that appears in the system has a corresponding concept in SDOD. Therefore, checking SDOD first for the concept C in the query tells the user what to expect. The checker is expressed in Description Logic as:
C ⊆ ∃isASubconception.⊤
where “⊤” stands for the highest-level concept in SDOD. In an OWL document, this concept is “Thing”, and finding a concept in an OWL document is easy. If the concept C passes the checker, the subsequent steps can be taken to get the route.
4.2. Necessary Consideration
During query routing, the algorithm may fall into cyclical routing because of Reciprocal Links between concepts in SDOD. For example, some classes in the ontology are related by inverse relationships, such as:
Earth ⊆ ∃isAPlanetof.SolarSystem
SolarSystem ⊆ ∃hasAPlanet.Earth.
∗ www.co-ode.org
When a QM reaches such a Reciprocal Link, it is returned to the RN it has just passed, and that RN does not transmit the QM any more because the QID appears twice in its memory. This measure effectively avoids infinite cyclical routing. Another cyclical routing condition may occur when the search moves several steps forward from a concept and then comes back to the origin. Because of the complex relationships between classes, there are many cross-links between different classes that make a route come back. However, since each RN remembers the QIDs of the QMs it has handled, a returning QM is not forwarded again, which avoids message flooding.
4.3. Heuristic rule
Rule 1. First check whether there is a super-class or sub-class relationship between the CID of the RN and the concept of the query. This step helps to avoid randomness in searching and to reduce the search space. Moreover, the super-class and sub-class relationships between two classes are closer than other relationships, and the two classes and the classes in the chain between them have the most in common.
Rule 2. If the two concepts do not have the relations in Rule 1, the algorithm traverses along the property relationship links to find the target, which is the second most important relationship between two classes.
Rule 3. Super-class searching and sub-class searching cannot directly follow each other while the algorithm seeks the target. Super-class searching extracts the common features of the two concepts, whereas sub-class searching explores the individuality of a concept. If these two kinds of searching follow each other closely, the resulting chain lowers the pertinence of the two concepts.
4.4. Ontology Partition based Routing Algorithm (OPBR)
In the algorithm, we only consider how a QM passes through the network to a UN that possesses the target concept the querist requires. A QM sets out from a UN. When the first RN receives this QM, it first checks whether its Super-node or Sub-node can satisfy the concept in the QM. If this step succeeds, super-class searching or sub-class searching is carried out by Traverse(Super-Node(RN)) for T or Traverse(Sub-Node(RN)) for T. Otherwise, the search follows the P_entry to explore new classes, and repeats the same course until the target concept is found. During the whole search, the algorithm makes as much use as possible of the semantic information of the knowledge in the system, which puts the algorithm under the guidance of semantics, matches the answer exactly, and finds the semantic route Rt. The details of the algorithm can be seen in Figure 4.
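The search order described above can be sketched in Python as follows. This is a simplified, centralized illustration under stated assumptions and is not the algorithm of Figure 4: the graph encoding, the function names `traverse` and `opbr`, and the default TTL are hypothetical. A pure super-class or pure sub-class traversal is tried first (Rule 1), property links are followed otherwise (Rule 2), the two hierarchy directions are never chained next to each other (Rule 3), and a visited set plays the role of the RNs' memory of QIDs.

```python
from collections import deque

def traverse(graph, start, target, kind):
    """Follow only `kind` links ('super' or 'sub'); return the chain to target or None."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], {}).get(kind, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

def opbr(graph, start, target, ttl=6):
    """Return one semantic route Rt from start to target, or None if none is found."""
    queue, visited = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        # Rule 1: try a pure super-class search, then a pure sub-class search.
        for kind in ("super", "sub"):
            tail = traverse(graph, node, target, kind)
            if tail is not None and len(path) + len(tail) - 2 <= ttl:
                return path + tail[1:]
        # Rule 2: otherwise explore new classes along property links.
        if len(path) - 1 < ttl:
            for nxt in graph.get(node, {}).get("prop", []):
                if nxt not in visited:   # avoids reciprocal and come-back routes
                    visited.add(nxt)
                    queue.append(path + [nxt])
    return None

# Hypothetical toy network: T1 has a reciprocal property link back to S.
toy = {
    "S": {"prop": ["T1"]},
    "T1": {"super": ["Top"], "prop": ["T2", "S"]},
    "T2": {"sub": ["C"]},
}
route = opbr(toy, "S", "C")  # ['S', 'T1', 'T2', 'C']
```

In a real deployment each step would be executed by a different RN exchanging QMs; the single-process search here only shows the order in which links are tried.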
Figure 4. Ontology Partition based Algorithm
4.5. Route Selection
While the algorithm runs, many routes may be found, but the best route should be selected for the user to get his answer. The best route follows three rules:
Rule 1. The route should contain as few nodes as possible, i.e. the route makes the semantic distance between the query and the answer as short as possible.
Rule 2. The route should produce little semantic loss on the chain between the concepts of the query and the answer. This rule makes the concepts in the chain relate the concept of the query closely to the concept of the answer.
Rule 3. The best route is selected by balancing the two rules above.
5. Experiment Evaluation
5.1. Algorithm complexity analysis
The complexity depends on how many nodes the algorithm visits. Accordingly, this paper starts the analysis from the case of a single hop. When a QM reaches an RN, it has three choices for its next hop: P_entry, Sup_entry, and Sub_entry. To simplify the computation of the average complexity, assume that the three choices are taken with equal probability on average, i.e.
$P(P\_entry) = P(Sup\_entry) = P(Sub\_entry)$
although the real condition may differ. Assume further that: (1) the number of elements in the P_entry of the R_table in every RN is α on average; (2) the number of elements in the Sub_entry in every RN is β on average. Sup_entry has only one element. Therefore the average number of new nodes in the next hop of a QM is:
$avg = 1 \cdot P(Sup\_entry) + \alpha \cdot P(P\_entry) + \beta \cdot P(Sub\_entry) = \frac{1}{3}(1 + \alpha + \beta), \quad avg > 1.$
In addition, the maximum number of hops of a message is TTL. To sum up, the maximum number of nodes that QMs of the same kind visit on average under the TTL limit is:
$N_{max} = 1 + avg + avg^2 + \dots + avg^{TTL} = \frac{avg^{TTL+1} - 1}{avg - 1}.$
Therefore, the complexity of the algorithm is:
$O\left(\frac{avg^{TTL+1} - 1}{avg - 1}\right) \approx O(avg^{TTL}).$
In the worst condition, avg = max(1, α, β).
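To make the growth concrete, the two formulas can be evaluated numerically. The Python sketch below simply transcribes the expressions for avg and N_max; the function names and the sample values α = 2, β = 3 and TTL = 3 are chosen here for illustration and do not come from the experiment.

```python
def average_branching(alpha, beta):
    """avg = (1 + alpha + beta) / 3 under the equal-probability assumption."""
    return (1 + alpha + beta) / 3

def max_visited_nodes(avg, ttl):
    """N_max = 1 + avg + avg^2 + ... + avg^TTL, using the geometric-series closed form."""
    if avg == 1:
        return float(ttl + 1)          # degenerate case: one new node per hop
    return (avg ** (ttl + 1) - 1) / (avg - 1)

avg = average_branching(alpha=2, beta=3)   # 2.0
n_max = max_visited_nodes(avg, ttl=3)      # 1 + 2 + 4 + 8 = 15.0
```

With these sample values, raising TTL from 3 to 10 increases N_max from 15 to 2047, which shows why the hop limit dominates the cost.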
5.2. Routing example and semantic relevancy of the chain for the original concept and the target one
This experimental evaluation runs the algorithm on an ontology called TerrorOnt * to provide an example of the process and result of the routing. Before the result is analyzed, an evaluation method is introduced to measure how closely a semantic chain relates two concepts. In this paper there are three kinds of links along the search path, and each link causes a different semantic loss: K1 represents the semantic loss of one step along a super-class link; K2 represents the semantic loss of one step along a sub-class link; and K3 represents the semantic loss of one step along a property link, with K1, K2, K3 ∈ (0, 1).
“TravelEvent” is assigned as the original concept and “Terrorist_Group” as the target concept. With the algorithm, 37 semantic chains are found. Because of space limitations, not all the chains can be listed here. One chain is used to explain how to use these chains to analyze the relationship between two concepts:
Figure 6. An example of semantic chain
* http://www.mindswap.org/2004/terrorOnt.owl
The semantic chain selected by the algorithm may help us discover useful relationships between concepts. Consider the chain in Figure 6. First, the chain checks the “Vehicle” used in the “TravelEvent”, and then sees whether the “Vehicle” has something to do with a “TerroristIncident”. Next, the “Person” related to the “TerroristIncident” is considered. In the next step, the chain examines what “Facility” was used in the past “TerroristIncident” and which “Organization” was involved in this event. The last step checks whether the “Organization” has characteristics related to “Terrorist_Group”. Through this checking process under the guidance of the chain, a doubtful “TravelEvent” may be identified, which may help a government to forecast a “TerroristIncident”. Although this thinking process does not seem very rational, it can assist humans in analyzing a great many complicated relationships in the real world with heuristic rules designed by humans. The whole checking course can be carried out automatically by a machine given the necessary information.
There are five property links and one sub-class link in this chain, hence the semantic loss of this chain can be computed as:
step 1: $loss_1 = K_3$
step 2: $loss_2 = (1 - K_3)K_3$
...
step 6: $loss_6 = (1 - K_3 + K_3^2 - K_3^3 + K_3^4 - K_3^5) \cdot K_2$
Therefore the semantic loss for this chain is $loss_6$. However, the semantic loss of the three links varies under different conditions, and how to compute it accurately is a challenge for our future work.
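The listed steps are consistent with the recurrence loss_i = (1 − loss_{i−1}) · K, where K is the loss factor of the i-th link and loss_0 = 0. That recurrence is inferred here from the pattern of steps 1, 2 and 6 rather than stated in the paper. The following Python sketch applies it to a chain of link types; the function name and the sample factor values are assumptions for illustration only.

```python
def chain_loss(links, k_super, k_sub, k_prop):
    """Semantic loss after following `links`, using loss_i = (1 - loss_{i-1}) * K_i.

    links is a sequence of 'super', 'sub' or 'prop' labels in chain order.
    The recurrence is inferred from the steps listed in the text.
    """
    factor = {"super": k_super, "sub": k_sub, "prop": k_prop}
    loss = 0.0
    for link in links:
        loss = (1 - loss) * factor[link]
    return loss

# Five property links followed by one sub-class link, as in the example chain.
example_chain = ["prop"] * 5 + ["sub"]
loss6 = chain_loss(example_chain, k_super=0.5, k_sub=0.5, k_prop=0.5)

# Closed form of step 6 with K2 = K3 = 0.5, for comparison.
k3, k2 = 0.5, 0.5
closed_form = (1 - k3 + k3**2 - k3**3 + k3**4 - k3**5) * k2
```

With all factors set to 0.5 both expressions give 0.328125, so the recurrence reproduces the closed form of step 6 for this choice of values.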
6. Discussion
Optimization in routing
If the algorithm of this paper is applied to a large community with thousands of millions of nodes, it faces a fatal problem called Combinatorial Explosion, which restricts the range of application of the algorithm. Because the heuristic rules of the algorithm cannot limit routing to only one choice at each step, when the number of nodes in the community is large enough, the number of nodes the algorithm visits increases exponentially. Thus a more advanced heuristic rule is needed to extend our algorithm to a large problem space. We plan to consider the semantic loss of the three relationship links (Super-class link, Sub-class link, and Property link), so that at each step the algorithm chooses to forward the QM along the link with the least semantic loss for the query.
Community dividing and granularity
Although a large community that contains as much knowledge as possible can satisfy as many user requirements as possible, the searching and responding time may be longer than users can tolerate. Therefore a large community should be divided into many smaller communities to improve the response speed. There should be a balance between response speed and the degree to which the user’s requirement is satisfied, and we will look for this balance by means of granular computing. There are two crucial problems to be solved.
One is how to decide the size of the granularity for dividing. The other is how to quantify the relationship between the size of the community and the routing algorithm. Besides, how to link small communities together to enhance the capacity of the community to satisfy users is another challenge for our future work.
7. Related Work
Currently, there are three main methods for query routing in network systems. The first approach is based on an indexed P2P system, the distributed hash table (DHT) [12]. This approach avoids using a central index for routing queries with certain keys. But without semantics, it cannot efficiently locate the peers that are semantically similar to the query. The second approach routes the query following the “small world” principle [4] from the social network model, but also without semantics. The third approach equips the P2P system with semantics, and organizes or interconnects the peers according to their semantic similarity. Paper [2] combines this method with the index approach, and relies on a Dynamic Short-cut Algorithm in INGA to route a query message; it makes use of the “small world” principle to guide the creation of its short-cut indices. A similar idea can be seen in [3], but it adopts the unstructured P2P network Gnutella as its underlying system. Another interesting idea comes from [5], which enriches peers with semantics: according to the ontologies of peers, a semantic matching algorithm is applied to build H-links between different peers, along which a query is forwarded to the neighbor with the most semantic similarity to the querist. In addition, [6] uses semantic mappings in a PDMS for its routing strategy.
Although these papers try to improve the efficiency of query routing in P2P systems, the semantic similarity, which is computed between the concepts of the query and the ontologies of other peers, means that the semantic links between peers cannot satisfy the various knowledge needs of the querist. If a query contains a concept that has no semantic similarity with the routing peers, an answer will never be found along this route.
Another deficiency lies in semantic loss during the routing process: because the choice of routing peer depends on the semantic similarity between the sending peer and the receiving peers, as the routing continues, the ontology of the routing peer has less and less semantic similarity with the ontology of the querist peer. Therefore the system should have a structure that reflects an overview of most knowledge in the system, as in our paper. This guides queries with different knowledge to be routed at the macro level, and can partly alleviate the message flooding of P2P systems.
To address the problems above, this paper builds the routing mechanism on the macro-knowledge of the system. The routing process is therefore not based on the semantic similarity between neighbors, but follows the abundant semantic information provided by the system, such as the R_table, to forward the query according to the classification its concept belongs to, which decreases the semantic loss during routing. In addition, because semantics lies in the organization of the system, query routing is not blind but guided by semantic clues along the way to the target.
8. Conclusion and future work This paper applies a new routing method, related partition of ontology, to find a route that accurately meets the querist’s query need. The route found is not merely a path by which the querist gets his answer, but a semantic chain that explains to the querist how his knowledge is linked with the target answer. This approach also suggests a way to make the machine think with Domain Ontology on the Semantic Web. Our future work will integrate autonomic computing into our research and build an intelligent multi-agent community based on the idea in this paper. ABLE (Agent Building and Learning Environment) [13] will be chosen to realize the autonomic system, and an evaluation method that quantifies the relevancy of the ontology chain between two concepts will be designed. In addition, we will continue to optimize our algorithm for wider use.
10. References
[1] Julian Seidenberg, Alan Rector, “Web Ontology Segmentation: Analysis, Classification and Use”, IW3C 2006, ACM, 2006.
[2] Alexander Loser, Steffen Staab, and Christoph Tempich, “Semantic Methods for P2P Query Routing”, MATES 2005, pp. 15-26.
[3] Yamini Upadrashta, “Semantic Social Routing in Gnutella”, thesis, University of Saskatchewan, Canada, 2005.
[4] Oskar Sandberg, “Distributed Routing in Small-World Networks”, ALENEX (2006), 2006.
[5] Silvana Castano and Stefano Montanelli, “Enforcing a Semantic Routing Mechanism based on Peer Context Matching”, C&O-2006, Italy, 2006.
[6] Federica Mandreoli, Riccardo Martoglia, “Using Semantic Mappings for Query Routing in a PDMS Environment”, SEBD 2006, Italy, 2006, pp. 56-63.
[7] Gottfried Vossen, Miltiadis Lytras, and Nick Koudas, “Editorial: Revisiting the (Machine) Semantic Web: The Missing Layers for the Human Semantic Web”, IEEE Transactions on Knowledge and Data Engineering, IEEE, 2007, pp. 145-148.
[8] Franz Baader, Deborah L. McGuinness, Daniele Nardi, Peter F. Patel-Schneider, The Description Logic Handbook: Theory, Implementation, and Applications, 2002.
[9] Frank Wolter and Michael Zakharyaschev, “Dynamic description logics”, Advances in Modal Logic 1998, 1998, pp. 431-446.
[10] Dejan S. Milojicic, Vana Kalogeraki, Rajan Lukose, “Peer-to-Peer Computing”, HP Laboratories Palo Alto, 2003.
[11] Eric Brill, Susan Dumais and Michele Banko, “An Analysis of the AskMSR Question-Answering System”, ACM Comput. Surv, EMNLP 2002, Philadelphia, 2002, pp. 257-264.
[12] S. Androutsellis-Theotokis and D. Spinellis, “A survey of peer-to-peer content distribution technologies”, 2004, pp. 335-371.
[13] Mark Meyer, “The features and facets of the Agent Building and Learning Environment (ABLE)”, http://www-106.ibm.com/developerworks/autonomic/library/ac-able1/, IBM, 2004.
[14] Nick Cercone, Lijun Hou, “From computational intelligence to Web intelligence”, IEEE Computer Society, 2002.
[15] Harrison, R., Chan, C.W., “Distributed ontology management system”, Electrical and Computer Engineering, 2005. Canadian Conference on, 1-4 May 2005, pp. 66-664.
[16] Melanie Mitchell, “Complex systems: Network thinking”, Artificial Intelligence, Elsevier Science, 2006, pp. 1194-1212.
[17] Li Wang, Yingjie Li, Wen Li, Yu Xing, Xinqi Wang, Xueli Yu, “The Semantic Matching of the Semantic Web Services”, The 2004 IEEE / WIC / ACM Workshop: KGGI 2004, 2004.
Authors Xueli Yu received her B.S. degree in computer science from the Department of Automating Control of Tsinghua University in China. Currently, she is the chief academic leader of the subject of Applied Computer and a doctoral advisor. She is an advanced member of the China Computer Federation and the third Executive Director of the Computer Education Research Association of Chinese Universities. She has achieved many accomplishments in research and education in Intelligent Information Processing on the Web, Software Architecture, Multi-media Technology on the Web, and related areas.
Rui Wang received his B.S. degree in computer science from Shandong University of Science and Technology in China. Since 2005, he has led the graduate research group working on the Natural Science Foundation of China project Research on Content-awareness and Context-awareness Ontology Knowledge Routing (No. 60472093) at Taiyuan University of Technology in China. His research focuses on the Semantic Web, Semantic Web services, multi-agent systems, and behavior analysis.