. The premises of rule3 are:
, . The conclusion of rule1 matches the second premise of rule3, because the triple: appears among the RDFS axiomatic triples and we assume that the user accepts it as a default. When rule1 triggers rule3, only the statement: which has already been deduced from the axiomatic triples can be inferred. The dependency between rule1 and rule3 is therefore meaningless and can be removed. Compared with the dependency table used in the Sesame algorithm, this optimization removes 40% of the dependency relationships.
4.3 The Reasoning Algorithm
Applying the optimized method to all the entailment rules (the RDFS entailment rules and the pD* entailment rules) yields a complete dependency table between all entailment rules. We do not list the complete table for reasons of space. Using it, we propose a forward-chaining reasoning algorithm. It consists of a simple loop that computes the pD* closure of an RDF graph G: the entailment rules are applied to G iteratively until no new statements can be derived. The detailed algorithm is as follows:
1. Initialize: all rules are recorded as triggered.
2. Read in the RDF graph G and all the axiomatic triples.
3. Begin the iteration.
4. For each rule, determine whether it was triggered in the last iteration. If its premises are matched by triples newly derived in the last iteration, apply the rule to graph G and record the rules triggered by it.
5. The iteration terminates when no new triple is added to G.
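The five steps above can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the function `closure`, the two example rules (the standard RDFS rules rdfs9 and rdfs11), and the small `triggers` table are assumptions made for the example, and the table shown is not the optimized dependency table of the paper.

```python
from typing import Callable, Dict, Set, Tuple

Triple = Tuple[str, str, str]
# A rule receives (all triples, triples new in the last round) and returns conclusions.
Rule = Callable[[Set[Triple], Set[Triple]], Set[Triple]]

SUBCLASS, TYPE = "rdfs:subClassOf", "rdf:type"

def rdfs11(g: Set[Triple], delta: Set[Triple]) -> Set[Triple]:
    """(a subClassOf b), (b subClassOf c) => (a subClassOf c)."""
    out = set()
    for t1 in g:
        for t2 in g:
            if t1[1] == SUBCLASS and t2[1] == SUBCLASS and t1[2] == t2[0]:
                if t1 in delta or t2 in delta:  # at least one premise must be new
                    out.add((t1[0], SUBCLASS, t2[2]))
    return out

def rdfs9(g: Set[Triple], delta: Set[Triple]) -> Set[Triple]:
    """(c subClassOf d), (x type c) => (x type d)."""
    out = set()
    for t1 in g:
        for t2 in g:
            if t1[1] == SUBCLASS and t2[1] == TYPE and t2[2] == t1[0]:
                if t1 in delta or t2 in delta:
                    out.add((t2[0], TYPE, t1[2]))
    return out

def closure(graph: Set[Triple], axioms: Set[Triple],
            rules: Dict[str, Rule], triggers: Dict[str, Set[str]]) -> Set[Triple]:
    g = set(graph) | set(axioms)        # step 2: read graph and axiomatic triples
    delta = set(g)                      # in the first round every triple counts as new
    active = set(rules)                 # step 1: all rules start as triggered
    while delta:                        # step 5: stop when nothing new was added
        new_triples: Set[Triple] = set()
        next_active: Set[str] = set()
        for name in sorted(active):     # step 4: only rules triggered last round
            derived = rules[name](g, delta) - g
            if derived:
                new_triples |= derived
                next_active |= triggers.get(name, set())
        g |= new_triples
        delta, active = new_triples, next_active
    return g

# Hypothetical usage: an illustrative dependency table for the two rules.
rules = {"rdfs9": rdfs9, "rdfs11": rdfs11}
triggers = {"rdfs11": {"rdfs11", "rdfs9"}, "rdfs9": {"rdfs9"}}
graph = {("Cat", SUBCLASS, "Mammal"), ("Mammal", SUBCLASS, "Animal"), ("tom", TYPE, "Cat")}
result = closure(graph, set(), rules, triggers)
```

Restricting each round to the rules recorded as triggered is what the dependency table buys: a rule whose premises cannot be matched by any newly derived triple is never evaluated in that round.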
A Reasoning Algorithm for pD*
Applying a rule to G means the following. We first search the triples newly derived in the last iteration for a triple that matches one of the premises of the rule. If one is found, we try to match the other premises in G. If all premises can be matched, the conclusion of the rule is deduced and added to G. For example, when rule rdfs2 is triggered, we first search all the triples derived in the last iteration for a triple that matches at least one of its premises. If this succeeds, we search G for triples that match the other premise. Each pair of triples matching the two premises yields a conclusion triple; all such pairs are found and the corresponding conclusions are derived. Finally, the rules triggered by rdfs2 are recorded for the next iteration. Using this algorithm, the pD* closure (Gp) of G can be computed in polynomial time. Whether G pD* entails an RDF graph H can then be reduced to checking whether Gp contains an instance of H as a subset or contains a P-clash. A P-clash is either a combination of two triples of the form , , or a combination of three triples of the form , , . Like the Sesame algorithm, this algorithm is guaranteed to terminate.
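The rdfs2 example can be sketched as below. The sketch assumes the standard RDFS form of rdfs2, with premises (p rdfs:domain c) and (s p o) and conclusion (s rdf:type c). The function name `apply_rdfs2` and the sample data are hypothetical, and the new triples `delta` are assumed to be already contained in `g`.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]
DOMAIN, TYPE = "rdfs:domain", "rdf:type"

def apply_rdfs2(g: Set[Triple], delta: Set[Triple]) -> Set[Triple]:
    """Apply rdfs2 so that at least one premise comes from the new triples."""
    derived: Set[Triple] = set()
    for (s, p, o) in delta:
        # Case 1: the new triple matches the first premise (p rdfs:domain c);
        # search G for triples that use that property.
        if p == DOMAIN:
            prop, cls = s, o
            for (x, q, _y) in g:
                if q == prop:
                    derived.add((x, TYPE, cls))
        # Case 2: the new triple matches the second premise (s p o);
        # search G for a domain declaration of its predicate.
        for (prop, q, cls) in g:
            if q == DOMAIN and prop == p:
                derived.add((s, TYPE, cls))
    return derived - g  # keep only conclusions not already in G

# Hypothetical data: either premise may be the newly derived one.
g = {("hasOwner", DOMAIN, "Pet"), ("rex", "hasOwner", "alice")}
from_new_fact = apply_rdfs2(g, {("rex", "hasOwner", "alice")})
from_new_schema = apply_rdfs2(g, {("hasOwner", DOMAIN, "Pet")})
```

Case 2 is deliberately not an `else` branch: a triple whose predicate is rdfs:domain can itself match the second premise when rdfs:domain has a declared domain.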
5 Test Results
The OWL Web Ontology Language Test Cases [2] is a W3C Recommendation. Because no dedicated pD* test cases exist, we test our algorithm on the positive entailment test cases of OWL. We select from [2] all the test cases that correspond to the vocabulary supported by pD*. The results are shown in Table 2. For each OWL test case, the symbol ‘—’ indicates that it is not a positive entailment test case; otherwise, there are two marks. The symbol “S” (or “U”) in the left position indicates that the underlying semantic condition of the test case is fully supported (or unsupported) by pD*, while the symbol “P” (“U” or “F”) in the right position indicates that our algorithm passes (does not support, or fails) the test case. In total, there are 37 positive entailment tests for the OWL vocabulary subset included in pD*. The underlying semantic conditions of 18 of these test cases are fully supported by pD*, and our algorithm passes 16 of them. Take the first test case of owl:allValuesFrom as an example. Our algorithm first applies the entailment rule rdfs9 to infer from and that holds. Then, applying entailment rule rdfp16, we infer from , < :a onProperty p>, < :a allValuesFrom c>, that holds. Since and hold in the premises, the conclusions of this test case all hold, so the test case is passed. The other passed test cases are similar. Of the 18 pD* test cases, the two that involve datatypes fail because our algorithm does not yet support datatype reasoning. These results show that our algorithm passes most of the test cases relevant to pD*.
H. Li et al.

Table 2. Test results

Positive Entailment Test     001  002  003  004  005  006  007
FunctionalProperty           S/P  S/F  U/U  U/U  U/U  —    —
InverseFunctionalProperty    S/P  S/F  U/U  U/U  —    —    —
Restriction                  —    —    —    —    —    U/U  —
SymmetricProperty            S/P  U/U  U/U  —    —    —    —
TransitiveProperty           S/P  U/U  —    —    —    —    —
allValuesFrom                S/P  —    —    —    —    —    —
differentFrom                U/U  U/U  —    —    —    —    —
disjointWith                 S/P  S/P  —    —    —    —    —
equivalentClass              S/P  S/P  S/P  U/U  —    U/U  U/U
equivalentProperty           S/P  S/P  S/P  U/U  U/U  S/P  —
inverseOf                    S/P  —    —    —    —    —    —
sameAs                       S/P  —    —    —    —    —    —
someValuesFrom               U/U  U/U  U/U  —    —    —    —
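The totals quoted in the text (37 positive entailment tests, 18 supported by pD*, 16 passed, 2 failed) can be checked against Table 2 with a short tally. The dictionary below transcribes the table, with `None` standing for a cell that is not a positive entailment test. The assignment of entries to columns 005 and 006 is an assumption about how the printed table is read; the totals do not depend on it.

```python
N = None  # not a positive entailment test ('—' in Table 2)

# One row per OWL feature, one cell per test number 001..007.
TABLE = {
    "FunctionalProperty":        ["S/P", "S/F", "U/U", "U/U", "U/U", N, N],
    "InverseFunctionalProperty": ["S/P", "S/F", "U/U", "U/U", N, N, N],
    "Restriction":               [N, N, N, N, N, "U/U", N],
    "SymmetricProperty":         ["S/P", "U/U", "U/U", N, N, N, N],
    "TransitiveProperty":        ["S/P", "U/U", N, N, N, N, N],
    "allValuesFrom":             ["S/P", N, N, N, N, N, N],
    "differentFrom":             ["U/U", "U/U", N, N, N, N, N],
    "disjointWith":              ["S/P", "S/P", N, N, N, N, N],
    "equivalentClass":           ["S/P", "S/P", "S/P", "U/U", N, "U/U", "U/U"],
    "equivalentProperty":        ["S/P", "S/P", "S/P", "U/U", "U/U", "S/P", N],
    "inverseOf":                 ["S/P", N, N, N, N, N, N],
    "sameAs":                    ["S/P", N, N, N, N, N, N],
    "someValuesFrom":            ["U/U", "U/U", "U/U", N, N, N, N],
}

cells = [c for row in TABLE.values() for c in row if c is not None]
total = len(cells)                                       # positive entailment tests
supported = sum(1 for c in cells if c.startswith("S/"))  # semantics supported by pD*
passed = sum(1 for c in cells if c.endswith("/P"))       # passed by the algorithm
failed = sum(1 for c in cells if c.endswith("/F"))       # supported but failed

print(total, supported, passed, failed)  # prints: 37 18 16 2
```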
To illustrate the performance of our algorithm, we use five different data sets to measure the loading time of three algorithms. The first is the exhaustive forward-chaining algorithm, which does not use the dependencies between rules; the second is our algorithm as discussed above; the third is a simpler algorithm in which some entailment rules are removed to improve performance. The reasoning results of this simple algorithm are guaranteed to be sound, but may be incomplete. The test results show that the data loading time of our algorithm, which uses the optimized dependencies between rules, is better than that of the exhaustive forward-chaining algorithm. Of the three algorithms, the simple algorithm performs best. In summary, our algorithm passes most of the test cases relevant to pD*, and its data loading time is better than that of the exhaustive forward-chaining algorithm. In addition, for users who want common results quickly and do not wish to wait for complete reasoning results, the simple algorithm is more suitable.
6 Conclusion and Future Work
In this paper, we have presented a forward-chaining reasoning algorithm that supports reasoning with pD*. Based on the premise that metaschema-level statements are usually absent from users’ RDF or OWL files, an optimization of the dependencies between entailment rules is applied to improve the algorithm’s performance. The test results show that its data loading time is better than that of the exhaustive forward-chaining algorithm. In addition, we provide an efficient approximate algorithm for users who want some correct results quickly but do not require the quick answers to be complete. The work reported in this paper can be seen as a first step towards a complete system for storing and querying Semantic Web data with pD* semantics. Much work remains in this direction, such as dealing with the consequences of delete operations and improving performance for scalability. How to solve these problems will be discussed in our future work.
Acknowledgments
The work is supported in part by the 973 Program of China under Grant 2003CB317004, the JSNSF under Grant BK2003001, and the EU FP6 Network of Excellence project Knowledge Web (IST2004-507842). We would like to thank our team members for their work on related experiments.
References
1. Broekstra, J., Kampman, A., Harmelen, F.: Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema. In Proc. of the first International Semantic Web Conference (2002), pp. 54-68.
2. Carroll, J.J., Roo, J.D. (Eds.): OWL Web Ontology Language Test Cases. W3C Recommendation 10 February 2004. http://www.w3.org/TR/2004/REC-owl-test-20040210/.
3. Guo, Y., Pan, Zh., Heflin, J.: An Evaluation of Knowledge Base Systems for Large OWL Datasets. In Proc. of the 3rd International Semantic Web Conference (2004), pp. 274-288.
4. Haarslev, V., Moller, R.: RACER system description. In Proc. of the Int. Joint Conference on Automated Reasoning (IJCAR 2001), volume 2083 of Lecture Notes in Artificial Intelligence, pp. 701-705.
5. Hayes, P. (Ed.): RDF Semantics. W3C Recommendation 10 February 2004. Latest version is available at http://www.w3.org/TR/rdf-mt/.
6. Horst, H.J.: Completeness, decidability and complexity of entailment for RDF Schema and a semantic extension involving the OWL vocabulary. Journal of Web Semantics 3 (2005), pp. 79-115.
7. Pan, J.Z., Horrocks, I.: RDFS(FA) and RDF MT: Two Semantics for RDFS. In Proc. of the 2nd International Semantic Web Conference (ISWC2003), pp. 30-46.
8. Patel-Schneider, P.F.: Building the Semantic Web Tower from RDF Straw. In Proc. of the 19th Int. Joint Conf. on Artificial Intelligence (IJCAI 2005).
9. Sirin, E., Parsia, B., Grau, B.C., Kalyanpur, A., Katz, Y.: Pellet: A Practical OWL-DL Reasoner. Submitted for publication to Journal of Web Semantics.
10. Tsarkov, D., Horrocks, I.: Efficient reasoning with range and domain constraints. In Proc. of the Description Logic Workshop (2004), pp. 41-50.