The Entity-Relationship Model for Multilevel Security
Günther PERNUL, Werner WINIWARTER, A Min TJOA
Institute of Applied Computer Science and Information Systems, University of Vienna, Austria
Abstract. A design environment for security critical database applications that should be implemented by using multilevel technology is proposed. For this purpose, the Entity-Relationship model is extended to capture security semantics. Important security semantics are defined and a language to express them in an ER model by means of security constraints is developed. The main contribution consists of the development and implementation of a rule-based system with which security semantics specified may be checked for conflicting constraints. The check involves application independent as well as application dependent integrity constraints and leads to a non conflicting conceptual representation of the security semantics of a multilevel secure database application.
1 Introduction

Designing a database is a complex and time-consuming task, even more so when attention must be given to the security of the information to be represented in the database. To simplify the design activity, it is necessary to look at the database application at an abstract level by using a conceptual data model, for example the Entity-Relationship (ER) model [1]. A conceptual data model must be powerful enough to capture all application dependent knowledge. To apply such a model successfully to the design of a security critical application, the model must represent not only the data semantics but also the security properties of the data (i.e. the security semantics of the application). The ER model provides a graphical language to describe the information items of the application by means of semantic modeling constructs. Unfortunately, it offers only restricted possibilities to represent constraints that the application imposes on those information items, such as security constraints.

The goal of this paper is to extend the standard ER model to also capture security semantics, to provide a language in which application dependent security requirements may be expressed on the concepts of the ER model, and to provide a technique to check the constraints resulting from the security requirements for overall consistency. The proposed model includes two levels of representation for security semantics. At the user level we introduce a graphical representation and a constraint language, and at an internal level a knowledge base to check for conflicting constraints. The constraint language and the knowledge base for checking conflicting constraints have already been implemented using the deductive DBMS LDL; the implementation of the user interface (graphical browser), including the proposed graphical extensions to ER, is under development.

The outline of the paper is as follows: Section 2 contains relevant related work. In Section 3 the security semantics are defined, expressed in the constraint language, and explained by means of an example. Section 4 deals with conflict management in order to achieve a non conflicting representation of an MLS database application. Section 5 concludes the paper.
2 Background and Related Work

For applications in which the secrecy and confidentiality of the information are of major concern, DBMSs supporting Mandatory Access Controls (MAC) may be chosen. Mandatory security requires that data and users are assigned to certain security classifications (security levels such as top-secret, secret, classified, unclassified, or company-confidential, confidential, private, public, ...). A level represents the sensitivity of the labeled information (classification) or the trustworthiness of a user (clearance) not to disclose information to other users not so trusted. Data are classified by means of rules, and one of the major questions involved is how the data of a database should be classified without specifying conflicting rules.

Mandatory security is often defined in terms of the Bell-LaPadula (BLP) [2] security paradigm, which distinguishes between read access and write access by using the following two rules: (1) user u is allowed to read data d if clear(u) ≥ class(d), and (2) u is allowed to write data d if clear(u) ≤ class(d). These two rules implement information flow controls that prevent users with high clearances from writing data into a lower classified storage area and thereby disclosing sensitive information. Mandatory security leads to multilevel secure (MLS) databases because the content of the database may appear different to users with different clearances. For more information and for a formal treatment of the MLS relational data model consult [3] or [4].

Today, several MLS database management systems are commercially available; for example, Informix, Ingres, Oracle, Rdb, Sybase, and others offer a multilevel version in addition to their general purpose system. However, the first and most prestigious effort towards the design and implementation of an MLS DBMS has come from the SeaView project (e.g. [5], [6]).
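The two Bell-LaPadula rules stated in this section can be illustrated with a minimal Python sketch. The level names and their ordering (public < private < confidential < secret) are an assumption taken from the hospital example used later in the paper; the names Level, may_read, and may_write are hypothetical and not part of any MLS product.

```python
from enum import IntEnum


class Level(IntEnum):
    """Assumed hierarchical list of security levels (lowest first)."""
    PUBLIC = 0
    PRIVATE = 1
    CONFIDENTIAL = 2
    SECRET = 3


def may_read(clearance: Level, classification: Level) -> bool:
    """Rule (1): u may read d if clear(u) >= class(d)."""
    return clearance >= classification


def may_write(clearance: Level, classification: Level) -> bool:
    """Rule (2): u may write d if clear(u) <= class(d)."""
    return clearance <= classification


# A user cleared to 'confidential' can read private data but cannot
# write into it, which would leak information downwards.
user = Level.CONFIDENTIAL
can_read_private = may_read(user, Level.PRIVATE)
can_write_private = may_write(user, Level.PRIVATE)
can_write_secret = may_write(user, Level.SECRET)
```

In this sketch the only way for a highly cleared user to write is at or above the user's own level, which is the information flow restriction described above.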
In this paper we focus on database applications that should be implemented in systems supporting an MLS data model. In this context, the major problems involved are to identify the security semantics (requirements) of the application, to specify the security semantics by means of constraints and rules in a conceptual data model, to check the constraints for completeness, and to resolve conflicts between security semantics. The conceptual model developed in the proposed approach may finally be transferred into the MLS data model that is supported by the target DBMS in use.

Compared to the huge amount of work published on semantic modeling and conceptual design of databases, not much work has been done investigating the security semantics of MLS database applications. Only recently have several research efforts started to provide tools and assistance to aid a designer in creating an MLS application. The first attempt to use a conceptual model to represent security semantics is given in [7] and [8]. The author develops the Semantic Data Model for Security (SDMS) based on a conceptual database model and extends the constraint language ALICE (proposed in [9]) to include constructs to describe security constraints. In contrast to the proposed approach, this model has not been completely formalized and offers neither sophisticated techniques for the detection of conflicting constraints nor techniques for their resolution.

A more recent approach has been made in [10] and [11]. The authors have developed the SPEAR model, which is a high-level data model similar to the ER approach. This model consists of an informal description of the application domain and of a formal mathematical specification using the Z-notation [12]. The model seems to be powerful; however, it does not offer a graphical notation and provides only limited support for detecting conflicting constraints.

Two further related projects are known. In addition to modeling the static part of the application, both projects include the behavior of the system in the database design process. In [13] the ER model has been extended to capture limited dynamics by including the operations 'create', 'find' and 'link' in the conceptual database representation, while in [14] ER has been used to model the static part of an MLS application and data flow diagramming to model the behavior of the system. Neither approach offers any consistency checks of the security constraints specified.
The proposal in this paper extends previous work on security semantics by
• carefully defining the major security semantics that need to be represented during the design of a database,
• developing a constraint language for expressing corresponding rules in a conceptual model (ER model) of the application, and
• developing and implementing a rule-based system with which the security semantics specified may be tested for conflicts. Conflicting constraints are reported to the designer and may be resolved. The rule-based system is implemented using the deductive DBMS LDL.
3 Concepts of Security Semantics

In the following we give a taxonomy of security semantics which consists of the most common requirements on multilevel security. Each concept is formally defined, expressed in the security constraints language (SCL), graphically included in the notion of the ER model, and explained by means of an example.

We start by defining the basic concepts. A security object O is a semantic concept of reality that is described by certain properties. Using extended ER terminology, O might be an entity type, a specialization type, a generic object type, or a relationship type. In security terminology, O is the target of protection and is denoted by O(A1,...,An). Each Ai is a characteristic property defined over a domain Di. Each security object O must have an identifying property K (K⊆{A1,...,An}), which is either a single characteristic property or a set of properties that together form an identifier and make instances (occurrences) o∈O (o={a1, ..., an}, ai∈Di) distinguishable from others.

Moving to a multilevel world, each property of a security object must be assigned to at least one security level. A security level is an entry in a hierarchical list of levels SL. A multilevel security object Om(A1,C1, A2,C2, ..., An,Cn) is a security object in which each characteristic property Ai is assigned a security classification Ci. The valid domain of Ci is specified by an interval of possible security classifications between the lowest classification (Li) and the highest classification (Hi) possible for a value of characteristic property Ai (Li, Hi∈SL). The set of instances of a multilevel security object is a set of distinct tuples of the form (a1,c1, a2,c2, ..., an,cn) where each ai∈Di, ci∈[Li,Hi]. The process of assigning data items to security classifications is called classifying and results in the transformation of a security object O into a corresponding multilevel security object Om (O⇒Om). The transformation is performed by means of security constraints specified in a constraint language.

In the following we give a taxonomy of the most relevant security semantics that should be expressed in a conceptual data model. Some of the constraints have been discussed before (e.g. in [7], [15]); however, to the best of our knowledge the following is the first careful formalisation of the constraints. We distinguish between two kinds of constraints: constraints that classify characteristic properties of security objects (simple, content-based, complex, and level-based constraints) and constraints that classify retrieval results (association-based, inference, and aggregation constraints).
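The transformation O⇒Om and the condition ci∈[Li,Hi] on instances can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the level list SL and its ordering, the function names classify and is_valid_instance, and all property values are hypothetical and chosen to match the hospital example.

```python
# Assumed hierarchical list of levels SL, lowest first.
SL = ["public", "private", "confidential", "secret"]
RANK = {level: i for i, level in enumerate(SL)}


def classify(properties, default="public"):
    """O => Om: attach an interval [L_i, H_i] to every property A_i.

    Initially every interval collapses to the default level.
    """
    return {a: (default, default) for a in properties}


def is_valid_instance(om, instance):
    """Check a tuple (a_1,c_1, ..., a_n,c_n) against Om.

    `instance` maps each property name to a (value, classification) pair;
    every classification c_i must lie inside [L_i, H_i].
    """
    if set(instance) != set(om):
        return False
    for a, (low, high) in om.items():
        _, c = instance[a]
        if not RANK[low] <= RANK[c] <= RANK[high]:
            return False
    return True


treatment = classify(["ssn", "icd", "duration", "status", "course"])
treatment["status"] = ("private", "private")  # interval after a classification

row = {
    "ssn": (123, "public"),
    "icd": ("C50", "public"),
    "duration": (3, "public"),
    "status": ("ongoing", "private"),
    "course": ("stable", "public"),
}
row_ok = is_valid_instance(treatment, row)
```

Storing an interval per property rather than a single level leaves room for constraints whose outcome depends on instance data.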
The example design scenario that will be used represents part of a hospital information system containing historical information. Of interest are the security objects:
• patient (ssn, pname, karnofski, incompatibility)
• treatment (ssn, icd, duration, status, course)
• disease (icd, dname, medication)

3.1 Simple Constraints

These constraints classify certain characteristic properties of security objects. By assigning a security classification to the identifying property, even the information about the existence of the security object can be classified.
• Definition: Let X be a set of characteristic properties of security object O, i.e. X⊆{A1,...,An}. A simple security constraint SiC is a classification of the form SiC(O(X))=C, C∈SL, and results in a multilevel object Om(A1,C1, A2,C2, ..., An,Cn) whereby Ci=C for all Ai∈X, and Ci is left unchanged for the remaining Ai∉X.
• SCL predicate: sic(O, X, C)
  O ... security object, X ... set of classified characteristic properties, C ... security level
• Example: The status of a medical treatment should be regarded as private information. ⇒ sic(treatment, {status}, private)

3.2 Content-Based Constraints

They classify characteristic properties of instances of security objects based on the evaluation of a predicate on a specific characteristic property of the same instance.
• Definition: Let Ai be a characteristic property of security object O with domain Di, let P be a predicate defined on Ai and let X⊆{A1,...,An}. A content-based constraint CbC is a classification of the form CbC(O(X),P:Aiθa)=C or CbC(O(X),P:AiθAj)=C ( θ∈{=,>,n))=C (n∈ℕ) and results in the classification of C for the retrieval result in the case count(O)>n, i.e. the number of obtained instances of O exceeds the threshold value n.
• SCL predicate: agc(O, A, N, C)
  A ... sensitive characteristic property Ai, N ... threshold value n
• Example: The report of an individual medical treatment for a specific patient is regarded as less sensitive than the complete medical history obtained from a retrieval of all medical treatments of the patient. Therefore, an aggregation-based constraint is formulated that declares a threshold value of 5 treatments. If this limit is exceeded, the retrieval result is classified as confidential. ⇒ agc(treatment, ssn, 5, confidential)

Figure 2 contains the summarised graphical representation of the constraints classifying retrieval results of the examples given above.

Fig. 2. Example of constraints on retrieval results (ER diagram of patient, treatment, and disease with their characteristic properties and constraint annotations)
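The aggregation example agc(treatment, ssn, 5, confidential) can be sketched in Python as follows. The sketch assumes that the threshold is applied to the number of retrieved instances sharing the same value of the sensitive property (here, treatments per patient); the function name agc_level, the default level public, and the sample rows are hypothetical.

```python
from collections import Counter


def agc_level(rows, prop, threshold, level, default="public"):
    """Classify a retrieval result under an aggregation constraint.

    If more than `threshold` retrieved instances share one value of the
    sensitive property `prop`, the whole result receives `level`;
    otherwise it keeps the default classification.
    """
    counts = Counter(row[prop] for row in rows)
    if any(n > threshold for n in counts.values()):
        return level
    return default


# Six treatments of the same patient exceed the threshold of 5.
history = [{"ssn": 123, "icd": f"X{i}"} for i in range(6)]
full_history_level = agc_level(history, "ssn", 5, "confidential")

# Exactly five treatments do not exceed it.
partial_level = agc_level(history[:5], "ssn", 5, "confidential")
```

The comparison is strict, matching the condition count(O)>n in the definition.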
4 Conflict Management

For complex and security critical database applications it might be necessary to express a large set of security constraints at the conceptual database level. In the following we discuss the methods used for the detection of conflicting constraints as well as the techniques used for conflict resolution. Conflict management represents the internal layer of the proposed approach and is responsible for enforcing two different kinds of integrity constraints in the security semantics specified: application independent and application dependent integrity constraints. Conflict management is performed by our prototype implementation in LDL. For a more detailed study of conflict management we refer to [16].

4.1 Integrity of Security Semantics

Application independent integrity constraints are rules that must be valid in each MLS database. When the security constraints introduced above are expressed on the conceptual representation of the database application, integrity constraints might be violated. In the proposed system, those conflicts are detected automatically by the implemented rule system, resolved, and finally reported to the designer. However, if a conflict involves an application independent integrity constraint, the designer is not given the possibility to override the changes performed by the tool. Some of the following integrity properties were first proposed within the SeaView project [5] and have been carefully defined in the Jajodia-Sandhu model [3]. For conflict management we consider:

[I1] Multilevel integrity property: Each property must have a security label. This is satisfied because during initial classifying each characteristic property is classified with the default security level.

[I2] Entity integrity property: A multilevel security object Om with identifying property K satisfies the entity integrity property if for all occurrences o∈Om
1. Ai∈K ⇒ val(Ai)≠null (with val(Ai) the value of property Ai in o)
2. Ai,Aj∈K ⇒ val(Ci) = val(Cj) (with val(C) the classification of property A in o)
3. Ai∉K ⇒ val(CK)≤val(Ci) (with val(CK) the classification of the key value in o)
Entity integrity states that an identifying property may not be null, must be uniformly classified, and its classification must be dominated by the classifications of all other attributes.
This is necessary because lower cleared users who have access to only part of the key would not be able to identify objects uniquely. Please note that this may conflict with the requirements of certain applications, for example applications in which access to key properties should be denied while access to some other non-identifying properties should be possible (e.g. for statistical queries).

[I3] Foreign key property: Let K be the identifying property in multilevel security object Om and let it be a foreign key K' in multilevel security object Om'. The foreign key property is valid if val(CK) ≤ val(CK'). The foreign key property guarantees that no dangling references between depending objects will occur for users cleared to access lower classified data only.

[I4] Near-key property: The near-key property is important when an association-based constraint is specified. In this case the level C assigned by the constraint abc(O, X, C) is automatically propagated to each corresponding association that includes a near-key (or candidate key) instead of the identifying property of O.

[I5] Level-based property: In order to avoid transitive dependencies (which may result in propagation cycles) between the level-based constraints specified, for any two level-based constraints on the same security object, lbc(O, X, A) and lbc(O, X', A'), A∉X' ∧ A'∉X must hold. Because of the entity integrity property, a level-based constraint may not be defined on the key.

[I6] Multiple-classification: Each property value may only have a single classification. If different security constraints assign more than one level to a particular property value, we refer to it as multiple-classification. Such a conflict is reported to the designer, who may decide whether to apply the default resolution strategy or not.

4.2 Conflict Resolution Strategies and the Example Design

Classifying is done by stepwise insertion of security constraints into the rule-base. Declaring a new constraint is an interactive process between the designer and the tool. The six integrity constraints given above must be validated for any new security constraint specified. If conflicts are detected, the resolution strategy applied depends on the kind of conflict. For conflicts due to application independent constraints, integrity is preserved by propagating the required classifications to the characteristic properties involved. Application dependent constraints leading to multiple-classification of characteristic properties are reported to the designer, who may decide on a proper classification. As the default strategy, the design tool suggests the maximum of the conflicting security levels to guarantee the highest degree of security possible. However, accepting the default strategy may lead to overclassification of the database. Let us now explain the conflict resolution strategies by applying them to the example and the security constraints developed in the preceding sections, summarized once more:

1. sic(treatment, {status}, private)
2. cbc(treatment, {ssn, icd}, duration, '>', 8, confidential)
3. coc(treatment, {ssn}, disease, dname, '=', carcinoma, secret)
4. lbc(treatment, {course}, status)
5. abc(patient, {karnofski}, private)
6. ifc(patient, {pname, incompatibility}, disease, {dname, medication}, secret)
7. agc(treatment, ssn, 5, confidential)
The process starts with the initial assignment of the default security level public to each data item. The insertion of 1) results in the assignment of security level private to property status. No conflicts arise. Constraint 2) is a cbc and results in the assignment of the range [∅..C] to properties ssn and icd. That is, if the predicate evaluates to true, classification C is assigned; otherwise the classifications remain public (i.e. the default value, denoted as ∅). In order to satisfy again the application independent property that the classification of the key must be dominated by all other classifications (entity integrity property as stated above), the assigned classifications are propagated to the other properties of relationship type treatment. The propagation results in the first conflict between the application dependent constraints specified, because property status (which is already classified as private by 1)) will be multiple-classified by propagating [∅..C]. This conflict is reported to the designer. Let us assume that the designer confirms the suggested default resolution strategy, resulting in the assignment of range [P..C] to property status.

The next rule 3) is a coc that classifies the ssn of patients suffering from 'carcinoma' in treatment as secret. This leads to a classification range of [∅..S] for the property ssn. Again this disagrees with the already existing range of [∅..C] for ssn. Therefore, the designer has to decide whether or not to accept the suggested classification range [∅..S]. In the case of acceptance, the new classification of ssn must be propagated to icd because of entity integrity (icd constitutes the second part of the key). Now the complete key takes the new classification [∅..S], which makes the propagation of [∅..S] to all other properties of treatment necessary. This propagation again causes multiple-classification of all non-identifying properties of treatment, resulting in further notifications to the designer (once more confirmed). Lbc 4) assigns the classification of status to course, yielding a new range of [P..S] for course. The designer receives the notification about the conflicting ranges [P..S] and [∅..S] and chooses the default resolution strategy.

The remaining security constraints 5)-7) do not deal with characteristic properties but instead classify retrieval results. Because of the near-key integrity property, constraint 5) is also propagated to the near-key pname in patient. The remaining constraints 6) and 7) do not cause any conflicts.
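The walk-through of constraints 1) to 4) can be replayed with a short Python sketch. This is not the LDL prototype; it is a simplified illustration that assumes the designer always confirms the default strategy, that conflicting ranges are merged bound by bound using the maximum level, and that levels are ordered public (∅) < private (P) < confidential (C) < secret (S). The names merge and apply_constraint are hypothetical.

```python
# Assumed ranks: 0 = public (∅), 1 = private, 2 = confidential, 3 = secret.
PUBLIC, PRIVATE, CONFIDENTIAL, SECRET = 0, 1, 2, 3


def merge(current, new):
    """Default resolution: maximum of the conflicting levels, per bound."""
    return (max(current[0], new[0]), max(current[1], new[1]))


def apply_constraint(ranges, key, props, new):
    """Assign range `new` to `props`, then restore entity integrity [I2]."""
    for p in props:
        ranges[p] = merge(ranges[p], new)
    if key & set(props):
        # The key must be uniformly classified ...
        key_range = (max(ranges[k][0] for k in key),
                     max(ranges[k][1] for k in key))
        for p in ranges:
            # ... and dominated by every non-key property.
            ranges[p] = key_range if p in key else merge(ranges[p], key_range)


key = {"ssn", "icd"}
ranges = {p: (PUBLIC, PUBLIC)
          for p in ["ssn", "icd", "duration", "status", "course"]}

apply_constraint(ranges, key, ["status"], (PRIVATE, PRIVATE))       # 1) sic
apply_constraint(ranges, key, ["ssn", "icd"], (PUBLIC, CONFIDENTIAL))  # 2) cbc
after_cbc_status = ranges["status"]
apply_constraint(ranges, key, ["ssn"], (PUBLIC, SECRET))            # 3) coc
apply_constraint(ranges, key, ["course"], ranges["status"])         # 4) lbc
```

Under these assumptions status ends with the range [P..S], course with [P..S], and ssn, icd, and duration with [∅..S], which agrees with the ranges named in the text.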
5 Conclusion

For security critical database applications, MLS technology may be chosen as the implementation platform. In such an environment data items need to be assigned to security classifications that properly represent the security semantics of the database application. In this paper we have carefully defined the important security semantics that need to be represented during the design of a database, have developed a constraint language for expressing them, and have suggested extending the Entity-Relationship model to capture security semantics. We see as the main contribution of our research the development of a rule-based system holding classification rules and certain integrity constraints that must be valid among the rules specified. Whenever a database designer inserts a new classification rule, the rule-base is checked for resulting conflicts. The checks are performed against the integrity constraints and all other rules already in the rule-base. For the case in which a classification rule causes conflicts, a conflict resolution strategy has been developed and implemented.

The research presented in this paper provides the basis to assist database designers and security engineers in getting a better understanding of the security requirements of the static part of the database application. Future research in this area may be required because the security of a database may also be violated by abusing the functional part of the system. To achieve a high degree of data protection, it will be necessary to look at the dynamic aspects of MLS database applications too.
References

1. P. Chen. The Entity-Relationship Model: Towards a Unified View of Data. ACM Trans. on Database Systems (ToDS). Vol. 1, No. 1, 1976.
2. D. E. Bell, L. J. LaPadula. Secure Computer System: Unified Exposition and Multics Interpretation. Technical Report MTR-2997. MITRE Corp. Bedford, Mass, 1976.
3. S. Jajodia, R. S. Sandhu. Toward a Multilevel Secure Relational Data Model. Proc. 1991 ACM Int'l. Conf. on Management of Data (SIGMOD'91), 50-59.
4. K. Smith, M. Winslett. Entity Modeling in the MLS Relational Model. Proc. 18th Conf. on Very Large Databases (VLDB'92), Vancouver, BC, 1992.
5. D. E. Denning, T. F. Lunt, R. R. Schell, W. R. Shockley, M. Heckman. The SeaView Security Model. Proc. 1988 IEEE Symposium on Research in Security and Privacy, 218-233.
6. T. F. Lunt, D. Denning, R. R. Schell, M. Heckman, W. R. Shockley. The SeaView Security Model. IEEE Trans. on Software Engineering (TOSE), Vol. 16, No. 6 (1990), 593-607.
7. G. W. Smith. The Semantic Data Model for Security: Representing the Security Semantics of an Application. Proc. of the 6th Int. Conf. on Data Engineering (ICDE'90), 322-329, IEEE Computer Society Press 1990.
8. G. W. Smith. Modeling Security Relevant Data Semantics. Proc. 1990 IEEE Symposium on Research in Security and Privacy, 384-391.
9. S. D. Urban. 'ALICE': an assertion language for integrity constraint expression. Proc. Computer Software and Appl. Conf., Sept. 1989.
10. S. Wiseman. Abstract and Concrete Models for Secure Database Applications. Proc. 5th IFIP WG 11.3. Working Conf. on Database Security. Shepherdstown, WV, Nov. 1991.
11. P. J. Sell. The SPEAR Data Design Method. Proc. 6th IFIP WG 11.3. Working Conf. on Database Security. Burnaby, BC, Aug. 1992.
12. J. M. Spivey. The Z-Notation: A Reference Manual. Prentice Hall International, 1989.
13. R. K. Burns. A Conceptual Model for Multilevel Database Design. Proc. 5th Rome Laboratory Database Security Workshop, Oct. 1992.
14. G. Pernul. Security Constraint Processing During MLS Database Design. Proc. 8th Ann. Computer Security Applications Conf. (ACSAC'92). IEEE Computer Society Press.
15. M. Collins, W. Ford, B. Thuraisingham. Security Constraint Processing During the Update Operation in a MLS DBMS. Proc. 7th Annual Computer Security Applications Conf. (ACSAC'91). IEEE Computer Society Press.
16. G. Pernul, W. Winiwarter, A M. Tjoa. The Deductive Filter Approach to MLS Database Prototyping. Proc. 9th Annual Computer Security Applications Conference (ACSAC'93), Orlando, FL, Dec. 1993. IEEE Computer Society Press.