Secure Resource Description Framework: an Access Control Model

Amit Jain, Csilla Farkas
Center for Information Assurance Engineering, Department of Computer Science & Engineering, University of South Carolina, Columbia, SC 29208
ABSTRACT
In this paper we propose an access control model for the Resource Description Framework (RDF). We argue that existing access control models, like the ones developed for securing eXtensible Markup Language (XML) documents, do not provide sufficient protection for RDF. Our model is based on RDF data semantics and incorporates RDF and RDF Schema (RDFS) entailments. RDF protection objects are represented as RDF-patterns (triples). The flexible security granularity allows restrictions to be expressed on a single resource, property, or value, or on any combination of these. RDF-patterns are mapped to RDF and RDFS statements to determine their security requirements. We develop methods to assign security classifications to entailed statements and to detect unauthorized inferences. We propose a two-level conflict resolution strategy. Simple conflict resolution addresses the case in which more than one pattern can be mapped to the same RDF statement, resulting in conflicting classifications. Inference conflict resolution is applied to entailed statements to determine their security requirements and to handle the inconsistencies they generate.

Keywords: RDF, RDFS, Entailment, Access Control, Conflict Resolution
1. INTRODUCTION
The World Wide Web is rapidly changing from a Web used primarily by humans to a machine-readable and, lately, machine-understandable Web. The Resource Description Framework (RDF) [13] provides a machine-understandable description of resources, their properties, and their relationships, thus supporting inter-operation between applications. The RDF data model is syntax neutral. It describes data semantics using three object types: resources, properties, and values. Resources define “all things” being described by RDF expressions. Properties represent characteristics (attributes) of the resources. RDF statements link together specific resources with their named properties and values. RDF can be conceptualized as a directed labeled graph, where each node is a resource or a literal, and edges are properties. Only object nodes can be literals.

Several commercial and academic efforts target the development of RDF repositories. The TAP knowledge base project [8] is an academic project to build a semantic knowledge database in RDF. Kowari metastore [23], developed by Tucana Tech Inc., is an open source project that claims to store more than 350 million RDF triples. Semagix Inc. [20] is a commercial company building anti-money-laundering and national security applications based on semantic connectivity and association among data entities. Semagix Inc. uses an RDF metabase to store metadata and build ontologies. As these applications become more widely used, the need for an authorization framework for RDF increases.

Currently, XML is the most widely used syntax for representing RDF data. This leads naturally to the approach of using existing XML access control models to secure RDF. Several access control models have been developed for XML [2], [5], [14]. While these models provide fine-grained authorization frameworks, the identification of the protection objects is based on XML syntax and structure only. While this approach may be sufficient for XML documents, it is not satisfactory for RDF for several reasons. First, the same RDF statements can be represented in several different syntactic ways (see Figure 1 for an example). This requires a separate XML access control policy for each XML representation. Second, new RDF statements may be generated from the explicitly stored ones via RDF and RDFS entailments. Therefore, mechanisms that 1) assign security classifications to the newly generated statements and 2) check for unauthorized inferences need to be developed. Current access control models do not address these problems.

Finin et al. [17] proposed a policy-based access control model for RDF data in an RDF store. The model provides control over the different action modes possible on the RDF store, such as inserting a set of triples into the store, deleting a triple, and querying whether or not a triple is in the store. The authors define a set of policy rules, enforced by a policy engine, to reach the authorization decisions. Kaushik et al. [12] propose a constraint logic programming based policy language for securing full or partial ontologies. While their methods can be applied to RDF/S databases, they do not consider RDF/S entailments. To the
sl then
    Inference security violation;
    /* a higher security object could be entailed from objects with lower security classification */
    Generate Warning and I = I ∪ (t, sl′);
else
    S = S ∪ (t, sl);
/* Policy Verification */
if there is a ν from a pt to t such that ν : pti → t then
    create pair (t, sli) where sli is the security label of pti;
    if sli > sl then Security Violation;
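The policy verification step in the fragment above can be illustrated with a minimal Python sketch. It assumes a totally ordered set of labels (the `Label` enum and its three levels are hypothetical, not taken from the paper) and writes pattern variables with a leading `?`, as in Table 1. The helper names `match` and `policy_violations` are illustrative only; `match` plays the role of the mapping ν from a pattern to a triple.

```python
from enum import IntEnum
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]


class Label(IntEnum):
    """Hypothetical totally ordered labels (higher value = more restrictive)."""
    PUBLIC = 0
    CONFIDENTIAL = 1
    SECRET = 2


def is_variable(term: str) -> bool:
    """Pattern variables are written with a leading '?', e.g. ?x."""
    return term.startswith("?")


def match(pattern: Triple, triple: Triple) -> Optional[Dict[str, str]]:
    """Return the variable binding if the pattern maps to the triple, else None."""
    binding: Dict[str, str] = {}
    for p_term, t_term in zip(pattern, triple):
        if is_variable(p_term):
            # A repeated variable must bind to the same value every time.
            if binding.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return binding


def policy_violations(policy: List[Tuple[Triple, Label]],
                      triple: Triple,
                      assigned: Label) -> List[Label]:
    """Labels sl_i of mapped patterns that are strictly above the assigned label sl."""
    return [sl_i for pattern, sl_i in policy
            if match(pattern, triple) is not None and sl_i > assigned]


# Example policy built from two patterns of Table 1 (labels are assumptions).
policy = [
    (("?x", "studentOf", "USC"), Label.PUBLIC),
    (("John", "?x", "?y"), Label.SECRET),
]
violations = policy_violations(policy, ("John", "studentOf", "USC"), Label.CONFIDENTIAL)
```

A non-empty result corresponds to the "Security Violation" branch: some pattern requires a higher label than the one the triple currently carries.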
Table 1: RDF Patterns

Security Pattern | Interpretation | Example
[r, p, v] | All elements of the triple are specified as constants | [John, studentOf, USC]
[r, ?x, v] | Subject and object are specified as constants and property is a variable | [John, ?x, USC]
[r, p, ?x] | Subject and property are specified as constants and object is a variable | [John, studentOf, ?x]
[?x, p, v] | Property and object are specified as constants and subject is a variable | [?x, studentOf, USC]
[r, ?x, ?y] | Subject is specified as a constant and property and object are variables | [John, ?x, ?y]
[?x, p, ?y] | Property is specified as a constant and subject and object are variables | [?x, studentOf, ?y]
[?x, ?y, v] | Object is specified as a constant and subject and property are variables | [?x, ?y, USC]
[?x, ?y, ?z] | All elements are variables |

Theorem 2.2. Algorithm 2
1. generates a cover of all entailed RDF/S triples
2. generates only entailed RDF/S triples
3. is secure (i.e., the security label of a newly generated triple dominates the security labels of the triples used in the entailment)
4. is least restrictive (i.e., if the security label of a newly generated triple is l and the security labels of the triples used in the entailment are li, lj, then there is no security label l′ such that l′ ≥ li, l′ ≥ lj and l′ < l)
5. is conflict free

PROOF SKETCH:
1-2. Algorithm 2 uses a process similar to the Chase process described in [3] to apply entailment rules, represented as Horn clause constraints. Items 1-2 follow from the properties of pattern mapping and the Horn clause constraints.
3. The entailed triple is assigned a security label that is the least upper bound of the security labels of all the triples used, and hence dominates all of them.
4. Since the entailed triple has the least upper bound of the labels of the triples used, this is the lowest security label possible without any security violation.
5. If the generated pair is (t, sl), the algorithm checks for the existence of a pair (t, sl′) in the security cover such that sl′ > sl. This points to a security violation and a warning is generated, making the algorithm conflict free. The policy verification, which leaves no triple in the cover with a default security label, also contributes to the conflict-free property of the algorithm.

2.4 Conflict Resolution

We distinguish between two levels of conflict resolution.

Simple conflict resolution addresses the problem that several RDF-patterns might be mapped to a particular RDF/S statement. This could result in different security labels for the same RDF statement, which is clearly undesirable. In this case, we choose the most restrictive classification, or the least upper bound of the security labels, that can be assigned to the statement. We also require that subsuming patterns have less restrictive security classifications than the more specific, subsumed patterns. The rationale behind this policy is that general patterns can define access restrictions on a set of statements, while exceptions can be represented by the more specific patterns. Based on the “more restrictive takes precedence” resolution, the exception will be correctly classified at the higher level. Algorithm 1 addresses these issues.

The second level of conflict resolution, called Inference conflict resolution, addresses potential inconsistencies that occur due to newly entailed RDF/S statements. Table 2 shows the security classifications automatically assigned to the entailed statements. However, a security pattern from the policy may also be mapped to the newly generated statement. The following options can be evaluated:

1. The automatically generated security label is the same as the security label of the mapped RDF-pattern. This does not represent any security problem and the labeling is consistent.

2. The automatically generated security label dominates the security label of the mapped RDF-pattern. This does not represent any security problem, i.e., the statement can be entailed from statements that are classified higher than the new statement requires. The security label of the new statement should be changed to the label required by the security pattern mapping, i.e., the less restrictive label.
3. The automatically generated security label is dominated by the security label of the mapped RDF-pattern. This is a security violation via unauthorized inference. That is, an RDF statement can be inferred from statements that have lower security classifications than the inferred statement requires. In this case, inference channel removal is required and the security policy needs to be fixed.

Table 2: Inference Rules from W3C recommendation and security label assignment

SN | Rules | Security Label
rdf1 | ∀x, y, z (x, y, z) → (y, rdf:type, rdf:Property) | sl = sl1
rdf2 | ∀x, y, z (x, y, z) → (_:a, rdf:type, rdf:XMLLiteral), where z is a typed XML literal | sl = sl1
rdfs1 | ∀x, y, z (x, y, z) → (_:m, rdf:type, rdfs:Literal), where z is a plain literal and _:m is a blank node allocated to z | sl = sl1
rdfs2 | ∀x, y, z, z1 (x, y, z) ∧ (y, rdfs:domain, z1) → (x, rdf:type, z1) | sl = LUB(sl1, sl2)
rdfs3 | ∀x, y, z, z1 (x, y, z) ∧ (y, rdfs:range, z1) → (z, rdf:type, z1) | sl = LUB(sl1, sl2)
rdfs4a | ∀x, y, z (x, y, z) → (x, rdf:type, rdfs:Resource) | sl = sl1
rdfs4b | ∀x, y, z (x, y, z) → (z, rdf:type, rdfs:Resource) | sl = sl1
rdfs5 | ∀x, y, z (x, rdfs:subPropertyOf, y) ∧ (y, rdfs:subPropertyOf, z) → (x, rdfs:subPropertyOf, z) | sl = LUB(sl1, sl2)
rdfs6 | ∀x (x, rdf:type, rdf:Property) → (x, rdfs:subPropertyOf, x) | sl = sl1
rdfs7 | ∀x, y, z, z1 (x, y, z) ∧ (y, rdfs:subPropertyOf, z1) → (x, z1, z) | sl = LUB(sl1, sl2)
rdfs8 | ∀x (x, rdf:type, rdfs:Class) → (x, rdfs:subClassOf, rdfs:Resource) | sl = sl1
rdfs9 | ∀x, y, z (x, rdf:type, y) ∧ (y, rdfs:subClassOf, z) → (x, rdf:type, z) | sl = LUB(sl1, sl2)
rdfs10 | ∀x (x, rdf:type, rdfs:Class) → (x, rdfs:subClassOf, x) | sl = sl1
rdfs11 | ∀x, y, z (x, rdfs:subClassOf, y) ∧ (y, rdfs:subClassOf, z) → (x, rdfs:subClassOf, z) | sl = LUB(sl1, sl2)
rdfs12 | ∀x (x, rdf:type, rdfs:ContainerMembershipProperty) → (x, rdfs:subPropertyOf, rdfs:member) | sl = sl1
rdfs13 | ∀x (x, rdf:type, rdfs:Datatype) → (x, rdfs:subClassOf, rdfs:Literal) | sl = sl1
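The label assignment of Table 2 and the three inference conflict cases can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the labels form a total order (the `Label` enum is hypothetical), so the least upper bound reduces to the maximum; only rule rdfs9 is shown; and the function names and the returned strings are invented for the example.

```python
from enum import IntEnum


class Label(IntEnum):
    """Hypothetical totally ordered labels (higher value = more restrictive)."""
    PUBLIC = 0
    CONFIDENTIAL = 1
    SECRET = 2


def lub(*labels):
    """Least upper bound; for a total order this is simply the maximum."""
    return max(labels)


def rdfs9(instance_fact, subclass_fact):
    """(x, rdf:type, y) and (y, rdfs:subClassOf, z) entail (x, rdf:type, z).

    Each fact is a (triple, label) pair; the entailed triple is labelled
    LUB(sl1, sl2) as in Table 2. Returns None when the rule does not apply.
    """
    (x, p1, y), sl1 = instance_fact
    (y2, p2, z), sl2 = subclass_fact
    if p1 == "rdf:type" and p2 == "rdfs:subClassOf" and y == y2:
        return (x, "rdf:type", z), lub(sl1, sl2)
    return None


def inference_conflict(generated, required):
    """Compare the generated label with the label required by a mapped pattern."""
    if generated == required:
        return "consistent"                 # option 1
    if generated > required:
        return "relabel to pattern label"   # option 2: take the less restrictive label
    return "unauthorized inference"         # option 3: security violation


# Hypothetical facts used only for illustration.
entailed = rdfs9(
    (("John", "rdf:type", "GradStudent"), Label.PUBLIC),
    (("GradStudent", "rdfs:subClassOf", "Student"), Label.CONFIDENTIAL),
)
```

With these inputs the entailed triple carries the higher of the two premise labels; comparing that label with a stricter pattern label falls into the third case.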
3. IMPLEMENTATION
We have developed a prototype for RACL. The UML diagram in Figure 3 shows the high-level architecture of our implementation, and Figure 4 shows a screenshot of our RACL module. It depicts the three stages of the system, namely Policy to RDF DB Mapping, RDF Inferencing, and Consistency Checking & Conflict Resolution. We use Java J2SE 1.5.0 as the development platform and Java Swing to create the user interface. Jena 2.1 [4], developed by HP, is used as the RDF API to provide a programming environment with Java. Jena provides the ability to access and modify RDF and RDFS models. JESS [7] is used as the rule engine to implement the RDF/S inferencing. After pattern mapping is done, the JESS engine saves the security-labeled triples (security objects) as JESS facts. The RDF/S entailment rules, written as JESS rules, are fired by the inferencing engine. Several JESS functions generate the security labels of the entailed statements from the security labels of the statements used in the entailment. The entailed triples are saved as JESS facts. We have successfully implemented the first and second stages of the prototype and are currently working on the third stage. We are working on finding the policy conflicts by mapping the security policy to the entailed database, and on integrating these modules. We are also experimenting with large RDF data sets stored in native RDF databases such as Kowari [23].
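The inferencing stage can be pictured as forward chaining over labelled facts until no new triple is produced. The sketch below is plain Python, not JESS or Jena code, and makes several assumptions: labels are totally ordered strings ranked by the hypothetical `RANK` table, only rules rdfs9 and rdfs11 are implemented, and a triple that is already present keeps its existing label (a warning is recorded only when that existing label is strictly higher than the newly derived one).

```python
from itertools import product

# Hypothetical label ranking (higher rank = more restrictive).
RANK = {"public": 0, "confidential": 1, "secret": 2}


def lub(a, b):
    """Least upper bound of two labels in the assumed total order."""
    return a if RANK[a] >= RANK[b] else b


def fire_rules(t1, t2):
    """Yield the triples entailed by rdfs9 and rdfs11 from an ordered pair of triples."""
    (x, p1, y), (y2, p2, z) = t1, t2
    if y != y2 or p2 != "rdfs:subClassOf":
        return
    if p1 == "rdf:type":             # rdfs9
        yield (x, "rdf:type", z)
    elif p1 == "rdfs:subClassOf":    # rdfs11
        yield (x, "rdfs:subClassOf", z)


def closure(facts):
    """Return (labelled triples, warnings) after chaining to a fixpoint."""
    labelled = dict(facts)
    warnings = set()
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so new facts can be added during the pass.
        for (t1, sl1), (t2, sl2) in product(list(labelled.items()), repeat=2):
            for t in fire_rules(t1, t2):
                sl = lub(sl1, sl2)
                if t not in labelled:
                    labelled[t] = sl
                    changed = True
                elif RANK[labelled[t]] > RANK[sl]:
                    # A higher-labelled triple is derivable from lower-labelled ones.
                    warnings.add((t, labelled[t], sl))
    return labelled, warnings


# Hypothetical instance and schema triples with assumed labels.
facts = {
    ("John", "rdf:type", "GradStudent"): "public",
    ("GradStudent", "rdfs:subClassOf", "Student"): "confidential",
    ("Student", "rdfs:subClassOf", "Person"): "public",
}
labelled, warnings = closure(facts)
```

The loop terminates because the set of derivable triples over a finite vocabulary is finite and a pass that adds nothing ends the iteration. A production rule engine would index facts rather than scan all pairs.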
4. CONCLUSION AND FUTURE WORK
This paper presents our initial attempt to secure RDF data. Our motivation was that existing access control models for XML do not provide adequate security. In particular, they do not address data semantics and RDF entailment. We propose an approach to secure RDF using RDF-patterns. Each pattern is associated with an RDF instance (DT) and schema (ST), and a security classification. RDF-patterns are mapped to the statements in DT and ST to determine security classifications for the statements. Entailed statements are classified based on the security classifications of the statements used in the entailment as well as by mapping RDF-patterns to the newly generated statements. We also provide a default classification for RDF statements not covered by the security objects. We propose a two-level conflict resolution strategy. Simple conflict resolution addresses inconsistencies that may occur when more than one pattern is mapped to the same RDF statement. Inference conflict resolution detects unauthorized inferences, where a higher security statement can be inferred from lower security statements. We are currently implementing our system using open source software tools. Future work includes formalizing our conflict resolution strategies and the properties of our access control model. In particular, we will address the completeness and consistency properties and compare our model to the flexible authorization framework proposed by Jajodia et al. [10]. We will also complete and evaluate our implementation from the perspectives of security and performance.

[Figure 3: UML diagram showing system architecture for RDF access control. Stages shown: Policy to RDF DB Mapping; RDF Inferencing; Consistency Checking & Conflict Resolution.]
[Figure 4: Screenshot of RACL Implementation]

5. REFERENCES
[1] D. Beckett. W3C recommendation, RDF/XML syntax specification. http://www.w3.org/TR/rdf-syntax-grammar/, February 2004.
[2] E. Bertino, S. Castano, E. Ferrari, and M. Mesiti. Specifying and enforcing access control policies for XML document sources. World Wide Web, 3(3):139–151, May 2000.
[3] A. Brodsky, C. Farkas, and S. Jajodia. Secure databases: Constraints, inference channels, and monitoring disclosure. IEEE Trans. Knowledge and Data Eng., November 2000.
[4] J. Carroll. Jena: a semantic web framework for Java. http://jena.sourceforge.net/index.html.
[5] E. Damiani, S. D. C. di Vimercati, S. Paraboschi, and P. Samarati. A fine-grained access control system for XML documents. ACM Transactions on Information and System Security (TISSEC), 5(2):139–151, 2002.
[6] S. Dawson, S. C. di Vimercati, and P. Samarati. Specification and enforcement of classification and inference constraints. In Proc. of the 20th IEEE Symposium on Security and Privacy, Oakland, CA, May 9–12, 1999.
[7] E. Friedman-Hill. Jess, the rule engine for the Java platform. http://herzberg.ca.sandia.gov/jess/.
[8] R. Guha, R. McCool, A. Sundarajan, and K. Joly. TAP: Building the semantic web. http://tap.stanford.edu.
[9] P. Hayes and B. McBride. W3C recommendation, RDF semantics. http://www.w3.org/TR/rdf-mt/, February 2004.
[10] S. Jajodia, P. Samarati, M. L. Sapino, and V. Subrahmanian. Flexible support for multiple access control policies. ACM Transactions on Database Systems, 26(4):216–260, 2001.
[11] S. Jajodia, P. Samarati, V. S. Subrahmanian, and E. Bertino. A unified framework for enforcing multiple access control policies. In Proceedings of the 1997 ACM SIGMOD International Conference on Management of Data, pages 474–485. ACM Press, 1997.
[12] S. Kaushik, D. Wijesekera, and P. Ammann. Policy-based dissemination of partial web-ontologies. In SWS ’05: Proceedings of the 2005 Workshop on Secure Web Services, pages 43–52, New York, NY, USA, 2005. ACM Press.
[13] G. Klyne and J. Carroll. W3C recommendation, RDF concepts and abstract syntax. http://www.w3.org/TR/rdf-concepts/, February 2004.
[14] M. Kudo and S. Hada. XML document security based on provisional authorization. In Proceedings of the 7th ACM Conference on Computer and Communications Security, Athens, Greece, pages 87–96, November 2000.
[15] T. Berners-Lee. Primer: Getting into RDF & Semantic Web using N3. http://www.w3.org/2000/10/swap/Primer, October 2000.
[16] D. Marks, A. Motro, and S. Jajodia. Enhancing the controlled disclosure of sensitive information. In Proc. European Symp. on Research in Computer Security, Springer-Verlag Lecture Notes in Computer Science, Vol. 1146, pages 290–303, 1996.
[17] P. Reddivari, T. Finin, and A. Joshi. Policy based access control for a RDF store. In Proceedings of the Policy Management for the Web Workshop, a WWW 2005 Workshop, pages 78–83. W3C, May 2005.
[18] P. Samarati, E. Bertino, A. Ciampichetti, and S. Jajodia. Information flow control in object-oriented systems. IEEE Trans. Knowl. Data Eng., 9(4):524–538, 1997.
[19] R. Sandhu, E. J. Coyne, H. Feinstein, and C. Youman. Role-based access control models. IEEE Computer, 29(2):38–47, February 1996.
[20] A. Sheth. Semagix Inc. http://www.semagix.com.
[21] P. Stachour and B. Thuraisingham. Design of LDV: A multilevel secure relational database management system. IEEE Trans. Knowledge and Data Eng., 2(2):190–209, June 1990.
[22] M. B. Thuraisingham. Mandatory security in object-oriented database systems. In OOPSLA ’89: Conference Proceedings on Object-Oriented Programming Systems, Languages and Applications, pages 203–210. ACM Press, 1989.
[23] D. Wood. Kowari-metastore. http://www.kowari.org.