Enforcing Access Control in Web-based Social Networks

BARBARA CARMINATI, ELENA FERRARI, and ANDREA PEREGO
DICOM, Università degli Studi dell’Insubria, Varese, Italy

In this paper, we propose an access control mechanism for Web-based Social Networks that adopts a rule-based approach for specifying access policies on the resources owned by network participants, and in which authorized users are denoted in terms of the type, depth, and trust level of the relationships existing between nodes in the network. Unlike traditional access control systems, our mechanism uses a semi-decentralized architecture, where access control enforcement is carried out client-side. Access to a resource is granted when the requestor is able to demonstrate, by providing a proof, that he/she is authorized to access it. Besides illustrating the main notions on which our access control model relies, we present all the protocols underlying our system and a performance study of the implemented prototype.

Categories and Subject Descriptors: H.3.5 [Information Storage and Retrieval]: Online Information Services - Data Sharing; Web-based Services; K.6.5 [Management of Computing and Information Systems]: Security and Protection
General Terms: Design, Theory
Additional Key Words and Phrases: Access Control, Semantic Web, Social Networks

1. INTRODUCTION

Web-based Social Networks (WBSNs) are online communities that allow users to publish resources and to record and/or establish relationships with other users, possibly of different types (“friend”, “colleague”, etc.), for purposes that may concern business, entertainment, religion, dating, etc. Recently, the usage and diffusion of WBSNs have been increasing, with about 300 Web sites collecting the information of more than 400 million registered users (see http://en.wikipedia.org/wiki/List_of_social_networking_websites). The “net model” is today increasingly used by companies and organizations as well, to communicate, share information, make decisions, and do business.

Regardless of the purpose of a WBSN, one of the main reasons for participating is to share and exchange information with other users. Recently, the adoption of Semantic Web technologies, such as FOAF (Friend of a Friend) [Brickley and Miller 2007; Ding et al. 2005], has simplified information access and dissemination over multiple WBSNs. Although this has been a relevant improvement towards easier sharing of information, it is now necessary that information owners have more control over its diffusion. So far, this issue has been addressed by most of the available Social Network Management Systems (SNMSs) by allowing a user to specify whether a given piece of information (e.g., personal data and resources) must be public, private, or accessible only by the users with whom he/she has a direct relationship, or by providing simple variants of this basic setting. Such a simple access control paradigm has the advantage of being straightforward and easy to implement, but it suffers from several drawbacks. On the one hand, it may either grant access to non-authorized users or limit information sharing too much; on the other hand, it is not flexible enough to express the heterogeneous access control requirements that different WBSN users may have. For instance, such an access control paradigm does not take into account the ‘type’ of the relationships existing among users. Consequently, it is not possible to state access control policies such as “only my friends or my colleagues can access a given piece of information”.

We think that more sophisticated access control mechanisms should be designed for current WBSNs. Besides relationships, some other information can be used for access control purposes. In fact, the depth of a relationship (i.e., the length of the shortest path(s) between two nodes in the graph representation of a WBSN) may also be a useful parameter to customize access control policies, in that it allows one to control the propagation of access rights in the network. Moreover, in some WBSNs, users can specify how much they trust other users, by assigning them a trust level. Such information is currently exploited for purposes that encompass the primary objectives of a WBSN, e.g., as a basis for recommender systems [Adomavicius and Tuzhilin 2005], but we believe it can be used as well to denote the users authorized to access a resource in terms of their trustworthiness.

In this paper, we propose a discretionary access control model and a related enforcement mechanism for controlled sharing of information in WBSNs. The model allows the specification of access rules for online resources, where authorized users are denoted in terms of the relationship type, depth, and trust level existing between nodes in the network. In devising the related enforcement mechanism, we adopt a semi-decentralized strategy, where, unlike in traditional information management systems, each participant is in charge of specifying and enforcing access control policies. As clarified in Section 4, this solution is semi-decentralized in that we assume the presence of a repository managing certificates concerning the existing relationships. Indeed, due to the increasing concerns about privacy in WBSNs, we believe this solution is a good trade-off between efficiency and scalability on one side, and the emergent wish of users to have more control over their data on the other. In the paper, besides describing the access control model and the related access control mechanism, we illustrate the prototype implementation we have developed and the performance evaluation we have carried out. Moreover, we present the security analysis of the proposed protocols.

This paper extends the work reported in [Carminati et al. 2006], which proposes the access control model but provides no details on access control enforcement. Here, we extend that work with the definition of two protocols for certifying relationships and enforcing access control, and with a performance evaluation of the implemented system. Moreover, we perform a security analysis of our protocols that shows that they are robust against the main security threats.

The remainder of this paper is organized as follows. Section 2 discusses related work and provides an overview of existing WBSNs. Section 3 introduces the WBSN model we use throughout the paper and discusses WBSNs’ access control requirements. Section 4 presents the proposed access control model. The enforcement mechanism is illustrated in Section 5. Sections 6 and 7 describe, respectively, the protocols for relationship certificate management and access control enforcement. Security issues concerning our protocols are discussed in Section 8. Section 9 illustrates the prototype implementation, whereas Section 10 deals with performance evaluation. Finally, Section 11 concludes the paper and outlines future research directions. The notation used throughout the paper is reported in Appendix A, whereas Appendices B and C illustrate, respectively, the usage of Notation 3 Logic (N3) [Berners-Lee et al. 2008] for representing relationships and access rules, and the performance of the reasoner used in our system to generate a proof.

Authors’ addresses: B. Carminati, E. Ferrari, and A. Perego, Dipartimento di Informatica e Comunicazione, Università degli Studi dell’Insubria, Via Mazzini 5, 21100 Varese, Italy; email: {barbara.carminati, elena.ferrari, andrea.perego}@uninsubria.it. Permission to make digital/hard copy of all or part of this material without fee for personal or classroom use provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee. © 20YY ACM 0000-0000/20YY/0000-0001 $5.00. ACM Journal Name, Vol. V, No. N, Month 20YY, Pages 1–40.

2. RELATED WORK

In this section, we first overview the characteristics of current WBSNs, and then we discuss related work in the area of WBSN security and trust.

2.1 An Overview of Existing WBSNs

Usually, a social network is defined as a small world network [Watts 2003], consisting of a set of individuals (persons, groups, organizations) connected by personal, work, or trust relationships. Social networking is thus quite a broad and generic notion which, in the Web context, might be applied to any kind of virtual community. When a user registers with a WBSN, the system gives him/her an account (also called profile), where he/she can insert personal information, specify relationships with other users, and, in some WBSNs, manage personal resources (such as blogs, photos, video and audio files).

Usually, a WBSN member can also decide which personal information, relationships, and/or resources are accessible by other members. The basic protection options are to mark a given item as public, private, or accessible by direct contacts. In order to give more flexibility, some WBSNs enforce variants of this setting. For instance, besides the basic setting, Bebo (http://bebo.com), Facebook (http://facebook.com), and Multiply (http://multiply.com) support the option “selected friends” (selected contacts); Last.fm (http://last.fm) supports the option “profile neighbors” (i.e., the set of WBSN members, computed by the SNMS, having musical preferences and tastes similar to mine); Facebook, Friendster (http://friendster.com), and Orkut (http://www.orkut.com) support the option “friends of friends” (2nd degree contacts); Xing (http://xing.com) supports the options “contacts of my contacts” (2nd degree contacts), and “3rd” and “4th degree contacts”; LinkedIn (http://www.linkedin.com) and Multiply support the option “my network” (𝑛th degree contacts, i.e., all the WBSN members to whom I am either directly or indirectly connected, regardless of how distant they are). It is important to note that all these approaches have the advantage of being easy to implement, but they lack flexibility. In fact, the available protection settings do not allow users to easily specify their access control requirements, in that they are either too restrictive or too loose (e.g., the options “1st degree contacts” and “my network” in LinkedIn).

Which types of personal/work relationships are supported depends on the purposes of a WBSN, and on how relationships are used. For instance, WBSNs aimed at connecting and finding friends, such as Friendster, Facebook, and Bebo, and those with entertainment purposes, such as Last.fm, support just the “friend” relationship type. Some WBSNs with business purposes, like Xing, support a single, generic relationship type, denoting the fact that I “know” a given person. In contrast, some other WBSNs provide a wider range of choices among personal and/or work relationships. Typical examples are LinkedIn, which gives its members the possibility of choosing among “colleague”, “classmate”, “business partner”, “friend”, “groups and associations”, “other”, and even “I don’t know 𝑋”, and Multiply, which supports more than 30 relationship types, grouped into the following categories: “friend”, “online buddy”, “family”, and “professional contact”.

Some WBSNs also give their members the ability to specify how much they trust other members, thus establishing trust relationships. This can be done either by expressing a recommendation, or by rating other users according to a numeric scale. An example of a WBSN supporting recommendations is LinkedIn, where a free text label can be associated with a user, explaining why he/she is recommended by another user. In contrast, in Orkut and RepCheck (http://repcheck.com) users’ trust can be expressed according to a numeric scale. The semantics of trust varies depending on the specific purposes of a WBSN: for instance, Orkut allows its members to rate personal trust, whereas RepCheck supports both personal and business trust.
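The degree-based protection options discussed in this section (“1st degree contacts”, “friends of friends”, “my network”) can be read as bounds on the distance between two members in the contact graph. The following minimal Python sketch illustrates that reading only; the graph `CONTACTS`, the function names, and the option encoding (`"public"`, `"private"`, `"network"`, or an integer maximum degree) are assumptions made for illustration and do not describe how any of the reviewed WBSNs is implemented.

```python
from collections import deque

# Hypothetical undirected contact graph: member -> set of direct contacts.
CONTACTS = {
    "alice": {"bob"},
    "bob": {"alice", "carol"},
    "carol": {"bob", "dave"},
    "dave": {"carol"},
    "eve": set(),
}


def contact_degree(graph, owner, requestor):
    """Return the length of the shortest path between two members, or None."""
    if owner == requestor:
        return 0
    seen = {owner}
    queue = deque([(owner, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == requestor:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None  # the two members are not connected


def may_access(graph, owner, requestor, option):
    """Evaluate one protection option for a resource owned by `owner`."""
    if option == "public":
        return True
    if option == "private":
        return requestor == owner
    degree = contact_degree(graph, owner, requestor)
    if degree is None:
        return False
    if option == "network":      # nth degree contacts: any connected member
        return True
    return degree <= option      # integer option: maximum allowed degree
```

For example, `may_access(CONTACTS, "alice", "carol", 2)` models a “friends of friends” setting. Note that neither relationship type nor trust plays any role here, which is the lack of flexibility pointed out above.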
As far as relationship specification is concerned, it is currently common practice to require the consent of both members before recording a new relationship. Usually, if a member 𝐴 asks to create a relationship with another member 𝐵, the system sends 𝐵 an email asking for confirmation. This procedure is adopted for both personal and work relationships in all the WBSNs we have reviewed, and it is also the approach we adopt in our system. In contrast, a trust relationship does not need the consent of the trustee to be established; the trustee is nevertheless notified of being rated, and he/she is able to verify who has posted the rating and the rating value itself. Table I summarizes the characteristics of the WBSNs we have reviewed.

2.2 WBSN Security

So far, research on WBSN security has mainly focused on privacy-preserving techniques to allow statistical analysis on social network data without compromising WBSN members’ privacy (see Carminati and Ferrari [2008] for a survey on this topic). In contrast, access control in WBSNs is a new research area. As far as we are aware, the only other proposals are the ones by Hart et al. [2007], Ali et al. [2007], and the D-FOAF system [Kruk et al. 2006].

In their position paper, Hart et al. [2007] discuss the access control requirements of WBSNs, and they argue that existing WBSN relationships can be used to denote authorized members. However, only direct relationships are considered, and the notion of trust level is not taken into account as one of the possible parameters to be used in access authorizations.

Table I. WBSNs’ characteristics. “𝑖th degree contacts” denotes WBSN members whose distance in the network graph is equal to 𝑖; “𝑛th degree contacts” denotes WBSN members connected by paths of undefined length; “online contacts” denotes people known online, but who are not “real world” contacts.

| WBSN | Purpose | Relationships | Trust | Protection Options |
|---|---|---|---|---|
| Bebo | general | friend | none | public, private, 1st degree contacts, selected contacts |
| Facebook | general | friend | none | public, private, 1st-2nd degree contacts, selected contacts |
| Friendster | general | friend | none | members from selected continents, private, 1st-2nd degree contacts |
| MySpace | general | friend | none | public, members > 18 years old, private, 1st degree contacts |
| Multiply | general | various | none | public, private, 1st and 𝑛th degree contacts, 1st degree but not online contacts, selected contacts |
| Orkut | general | friend | personal | public, private, 1st-2nd degree contacts |
| Flickr | photos | friend/family | none | public, private, 1st degree contacts (friends or family) |
| Last.fm | music | friend | none | public, private, 1st degree contacts (and profile neighbors) |
| Xing | business | generic | none | public, private, 1st-4th degree contacts |
| LinkedIn | business | various | business | public, private, 1st and 𝑛th degree contacts |
| RepCheck | reputation | generic | personal, business | none |

Unlike our proposal, Ali et al. [2007] adopt a multi-level security approach, where trust is the only parameter used to determine the security level of both users and resources. More precisely, each user 𝑢 is assigned a reputation value 𝑟(𝑢), computed as the average of the trust ratings specified for him/her by other users in the system. After having logged in, user 𝑢 chooses an operating trust level 𝜏, such that 0 ≤ 𝜏 ≤ 𝑟(𝑢). A resource 𝑜 created by user 𝑢 will then be assigned a confidence level equal to 𝜏, whereas user 𝑢 can read only resources with confidence levels equal to or less than 𝜏. Access control is enforced according to a challenge-response based protocol. For each resource 𝑜, the resource owner generates a secret key 𝐾, which is then processed by the (𝑘, 𝑛) threshold algorithm proposed by Shamir [1979]. The basic principle of this algorithm is that a key 𝐾 can be split into 𝑛 portions and then reconstructed from any 𝑘 of them, where 𝑘 ≤ 𝑛. In [Ali et al. 2007], the 𝑛 portions of 𝐾 are distributed to 𝑛 trustworthy nodes. If a requestor wishes to access resource 𝑜, the resource owner sends him/her a challenge encrypted with 𝐾. Then, the requestor retrieves 𝑘 portions of 𝐾 from the set of 𝑛 nodes holding them. Such portions are released only if the requestor satisfies the trust requirements specified by the resource owner. Once the requestor has reconstructed 𝐾, he/she responds to the challenge and gains access to the resource. The main difference between the approach described above and our proposal is that Ali et al. [2007] consider only direct trust relationships, whereas we consider (a) both direct and indirect relationships, and (b) both personal and trust relationships.
This has the advantage of giving resource owners the ability to specify more flexible policies, making them able to better denote the constraints to be satisfied by users in order to access a resource. Another relevant difference is that we adopt a discretionary access control paradigm, whereas Ali et al. [2007] adopt a mandatory one.
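To make the (𝑘, 𝑛) threshold principle used in the approach of Ali et al. [2007] more concrete, the following is a minimal Python sketch of Shamir-style secret sharing over a prime field. It is an illustration only: the prime `PRIME`, the function names, and the encoding of the key as an integer are assumptions of this sketch, and the trust check performed by the nodes before releasing their portions is deliberately omitted.

```python
import secrets

# Assumption of this sketch: keys are integers smaller than this Mersenne prime.
PRIME = 2**127 - 1


def split_secret(secret, k, n, prime=PRIME):
    """Split `secret` into n shares so that any k of them reconstruct it."""
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if not 0 <= secret < prime:
        raise ValueError("secret must lie in [0, prime)")
    # Random polynomial of degree k - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, evaluated modulo prime
            y = (y * x + c) % prime
        shares.append((x, y))
    return shares


def reconstruct_secret(shares, prime=PRIME):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        # pow(den, -1, prime) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret
```

With `shares = split_secret(key, 3, 5)`, any three of the five shares passed to `reconstruct_secret` yield `key` again, whereas fewer than three shares do not determine it.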



Finally, the D-FOAF system, described by Kruk et al. [2006], is primarily a FOAF-based distributed identity management system for social networks, where access rights and trust delegation management are provided as additional services. In D-FOAF, relationships are associated with a trust level, which denotes the level of friendship existing between the users participating in a given relationship. Although Kruk et al. [2006] discuss only generic relationships, corresponding to the ones modeled by the foaf:knows RDF property defined in the FOAF vocabulary [Brickley and Miller 2007], another D-FOAF-related paper [Choi et al. 2006] also considers the case of multiple relationship types. As far as access rights are concerned, they denote authorized users in terms of the minimum trust level and maximum length of the paths connecting the requestor to the resource owner. Such an approach shares some similarities with ours, in that we also associate trust levels with relationships, and we express access control policies in terms of the minimum trust level and maximum distance of the paths existing between two WBSN members. However, there are also several relevant differences. As we argue later in this paper (see Section 4), in a relationship-based access control system, it is necessary to enforce a mechanism able to prevent the forging of fake relationships. We address this issue by requiring that relationships are established only with the mutual consent of the involved WBSN members. Moreover, relationships are encoded into relationship certificates, hosted by a trusted third party, referred to as the certificate server. In contrast, D-FOAF does not consider these issues at all. Another difference concerns access control policies. In D-FOAF, the relationships required to access a given resource always concern the requestor and the resource owner, whereas in our model we do not have such a constraint.
Moreover, in our model, policies are expressed not only in terms of the trust level and length of paths connecting two members, but also in terms of the type of relationship they denote. As mentioned previously, Choi et al. [2006] also consider the case of multiple relationship types, but they do not illustrate how this affects the access control model described by Kruk et al. [2006]. Finally, Kruk et al. [2006] do not discuss the case of multiple policies associated with the same resource, whereas our model supports the possibility of combining policies by using the AND and OR Boolean operators (see Section 5). The last main difference concerns access control enforcement. In D-FOAF, both path discovery and access control are enforced by the D-FOAF SNMS hosting the resource owner account. In contrast, in our system, we separate these tasks. Path discovery is performed by the certificate server, whereas access control is enforced based on a rule-based approach, according to which it is the requestor who must provide the resource owner with a proof of being authorized to access the requested resource (see Section 4). As we argue later in this paper (see Section 3.2.2), we think that such an approach is more suitable to WBSNs than the centralized one adopted by D-FOAF.

2.3 Trust

An analysis of the related literature shows that there is no unique definition of trust, since it may vary depending on the context and on the purposes for which it is used. This also affects how trust is computed and expressed, whether it is a local or global measure of the trustworthiness of a given entity, and whether trust relationships are explicitly or implicitly established. In this section, we first overview the different solutions adopted so far for trust representation and computation, then we discuss the specific requirements of trust modeling in our access control system.

2.3.1 Trust Representation and Computation. Although the notion of trust is often associated with that of reputation, there is a relevant difference between the two concepts. As pointed out by Jøsang et al. [2007], trust denotes whether, and, possibly, how much, a given entity 𝐴 considers another entity 𝐵 trustworthy. As such, it expresses a personal opinion of 𝐴 about 𝐵, and thus trust can be considered as a subjective (or local) measure of trustworthiness. In contrast, reputation denotes the trustworthiness of a given entity for all the entities in a network. As such, it expresses the collective opinion of a community on one of its members, and thus it is an objective (or global) measure of trustworthiness. Note that here ‘subjective’ does not mean ‘arbitrary’. Rather, it denotes an opinion based on the observations made by a given entity 𝐴 on another entity 𝐵. Similarly, ‘objective’ is used to denote an opinion based on the observations made by all the entities in a network about 𝐵.

The definitions above have two main implications. First, whether trust or reputation is used in a system depends on whether personal opinions or tastes are relevant or not. For instance, in a WBSN supporting collaborative rating of books or movies, reputation can be a measure of the expertise of a given user on a given topic. However, independently of a user’s reputation, I can consider him/her untrustworthy, because his/her tastes are different from mine, and thus he/she will give a bad rating where I would give a good one. In contrast, reputation is useful when there are precise requirements to be satisfied in order to be considered trustworthy.
A typical example is that of P2P (Peer-to-Peer) systems, where the trustworthiness of a given peer depends on its reliability in providing a given service. The second implication is that the notions of trust and reputation are not disjoint, since reputation is derived from the trust relationships existing between entities in a network. This means that, in general, reputation can be computed after having evaluated trust relationships. As an example, both the EigenTrust [Kamvar et al. 2003] and PeerTrust [Xiong and Liu 2004] algorithms, designed for P2P environments, consist of two main steps: first they compute the trust existing between peers, and then they use this to compute the reputation of each peer in the network. A similar approach is also adopted by the RepCheck social network (see Section 2.1).

The access control framework we present falls under the class of trust systems. In our approach, each WBSN member uses trust as one of the parameters for denoting the members authorized to access his/her resources. Therefore, in such a scenario, we believe that WBSN members’ trustworthiness cannot be assessed by a collective measure. For these reasons, in this section we focus on how trust is represented and computed, whereas we do not discuss the corresponding issues related to reputation.

A trust relationship is usually modeled as a directed edge, connecting two entities 𝐴 and 𝐵, labeled with information stating whether, and, possibly, how much, 𝐴 considers 𝐵 trustworthy. The directed edge models a specific property of trust, i.e., its asymmetric nature. In fact, if 𝐴 trusts 𝐵, it does not necessarily follow that 𝐵 trusts 𝐴. Different approaches have been proposed so far to represent trust. The most accurate is probably the one making use of belief calculus, and, in particular, subjective logic (see, e.g., [Jøsang 1999] and [Jøsang et al. 2006]), where trust relationships are modeled based on three parameters, namely, belief, disbelief, and uncertainty. Despite the accuracy of such an approach, the most widespread trust measures are based on a single trust value, which may be either scalar or binary. Scalar trust relationships make use of a range of either continuous or discrete values, denoting how much 𝐴 considers 𝐵 trustworthy. This also includes ordered sets of trust levels, as in the PGP (Pretty Good Privacy) web of trust [Garfinkel 1996]. In contrast, binary trust relationships make use of a binary value t ∈ {0, 1}, which denotes whether 𝐵 is considered trustworthy (t = 1) or not (t = 0) by 𝐴. As such, binary trust can be considered a particular case of scalar trust, and it is usually adopted in environments with restrictive trust requirements, such as PKIs (Public Key Infrastructures). Some P2P systems and WBSNs adopt binary trust relationships, but for different reasons. In WBSNs, this is just a way to make the task of trust specification as simple as possible (see, e.g., [Golbeck and Hendler 2006]), whereas, in some P2P file-sharing services, a peer providing even a single fake or corrupted file is unreliable, and thus it is considered totally untrustworthy (see, e.g., [Xiong and Liu 2004]). In contrast, scalar trust relationships can be used to rank entities based on their trustworthiness, allowing other entities to decide the threshold above which an entity is considered trustworthy. Examples of this approach are provided by both P2P systems (see, e.g., [Kamvar et al. 2003]) and WBSNs (see, e.g., [Avesani et al. 2005; Golbeck 2005; Kruk et al. 2006; Choi et al. 2006]). So far, we have discussed only direct trust relationships.
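As an illustration of the direct trust relationships discussed above, the following minimal Python sketch models trust as a labeled directed edge. The class and method names are hypothetical; the sketch only shows that trust is asymmetric, that binary trust is the special case of scalar trust restricted to the values 0 and 1, and that a scalar value lets each entity choose its own trustworthiness threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrustEdge:
    truster: str
    trustee: str
    value: float  # scalar trust in [0, 1]; binary trust uses only 0.0 and 1.0


class TrustGraph:
    """Directed graph whose edges carry a single trust value."""

    def __init__(self):
        self._edges = {}

    def set_trust(self, truster, trustee, value):
        if not 0.0 <= value <= 1.0:
            raise ValueError("trust values must lie in [0, 1]")
        # Only the edge truster -> trustee is recorded: trust is asymmetric.
        self._edges[(truster, trustee)] = TrustEdge(truster, trustee, value)

    def trust(self, truster, trustee):
        edge = self._edges.get((truster, trustee))
        return None if edge is None else edge.value

    def is_trusted(self, truster, trustee, threshold):
        """Each truster may pick its own threshold on the scalar value."""
        value = self.trust(truster, trustee)
        return value is not None and value >= threshold
```

For instance, after `g.set_trust("A", "B", 0.8)`, `g.trust("B", "A")` is still undefined, and whether `B` counts as trustworthy for `A` depends on the threshold that `A` chooses.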
However, the above considerations can be extended to relationships corresponding to paths of length greater than 1. This implies considering trust relationships as transitive. Even though trust is not necessarily transitive (i.e., if 𝐴 trusts 𝐵 and 𝐵 trusts 𝐶, it does not necessarily follow that 𝐴 trusts 𝐶), trust paths may be useful to predict the trustworthiness existing between entities not directly connected. More precisely, the notion of transitive trust relies on the assumption that, if trust relationships exist between entities 𝐴 and 𝐵, and 𝐵 and 𝐶, but not between 𝐴 and 𝐶, then it is possible to use the trust path 𝐴𝐵𝐶 to determine whether and/or how much 𝐴 considers 𝐶 trustworthy.

Computation of transitive trust has been investigated in different fields, comprising federated PKIs, P2P systems, and social networks. The main issue concerns which trust paths must be considered in order to obtain an accurate trust value, since multiple paths may exist connecting two entities. Several solutions have been proposed so far, and, usually, they enforce some constraints in order to select just some of the existing paths. For instance, Beth et al. [1994] discard trust paths with either maximum or minimum trust values, whereas Reiter and Stubblebine [1997] do not consider trust paths having a length greater than a given bound 𝑏. Constraints on the maximum path length are also used by the MoleTrust [Avesani et al. 2005] and TidalTrust [Golbeck 2005] algorithms, where, in addition, the exploration of the network graph terminates as soon as all the shortest trust paths have been discovered. MoleTrust and TidalTrust also introduce the notion of trust threshold, which is used to select the trust relationships to be considered. The constraint on the maximum length of trust paths is motivated by claiming that the reliability of propagated trust decreases as the length of the considered trust path increases. Golbeck [2005] also provides experimental results supporting this claim. Finally, the D-FOAF system [Kruk et al. 2006; Choi et al. 2006] adopts a simpler approach according to which the trust level of a relationship existing between two WBSN members is determined by the path having the highest trust level among those of a given maximum length. The trust level of a path corresponds to the product of the trust levels associated with its edges.

2.3.2 Trust in WBSNs. Golbeck [2005] discusses how trust is and could be used in WBSNs. In particular, she motivates the adoption of either binary or scalar trust relationships, instead of more sophisticated approaches (such as the one based on subjective logic), by claiming that the semantics of trust must be clear to average Web users, as WBSN members are, and that the task of trust specification must be as simple as possible. Although Golbeck [2005] focuses on WBSNs for collaborative rating, we believe that such considerations also apply to our access control scenario. As will be illustrated in the following sections, in our approach trust is one of the factors that determine whether a given WBSN node is authorized to access a given resource. Clearly, in our scenario trust has a different meaning from the one used in collaborative rating WBSNs. Trust should mainly convey information about how much confidence a user has that another user does not disclose sensitive information to unauthorized users, and thus its purpose has some similarities with the notion of security level used in mandatory access control models (see [Ferrari and Thuraisingham 2000] for a survey on this topic). For these reasons, in our model we use a new type of trust level, called security trust level, which is represented by the range of rational numbers between 0 and 1.
Note that such definition of trust differs from the one adopted in D-FOAF [Kruk et al. 2006], where trust levels model friendship degrees and are used both for access control purposes and collaborative filtering, thus merging two notions of trust However, similarly to D-FOAF, we do not consider trust relationships as independent from personal relationships, which is the approach usually adopted. In fact, when sharing resources, a user usually has in mind a specific audience, which, in a WBSN, is determined on the basis of personal direct or indirect relationships. For instance, some resources are meant for friends and should not be accessed by colleagues (or vice versa). From this follows also that there may exist different trust relationships with the same user. As an example, suppose that I have both a relationship of type friendOf and colleagueOf with a given user: the confidence I have with him/her as a friend rather than as a colleague may be different, since it depends on the context (free time vs. work activities) and on the set of resources to be disclosed (recreative vs. business documents). Consequently, we model trust as a property of personal relationships. Therefore, trust transitivity is related to the transitivity of personal relationships. In other words, there exists a trust path between two WBSN nodes 𝐴 and 𝐵, only if such path consists of edges denoting the same type of relationship. Finally, considering the purpose for which trust relationships are used in our context, it is preferable to let resource owners explicitly specify how much they trust possible requestors, instead of implicitly deriving ACM Journal Name, Vol. V, No. N, Month 20YY.



their trustworthiness from other information (which, however, can be used by the resource owner to help him/her in the computation of the trust value). As far as the computation of transitive trust is concerned, our purpose is to avoid binding our system to a specific algorithm, with the only constraint that, following Avesani et al. [2005] and Golbeck [2005], only the shortest trust paths are considered. Therefore, to be as general as possible, our system currently separates path discovery from trust computation. As will be illustrated in Section 7.2, our algorithm consists of two steps. First, it discovers all the shortest paths, independently of their trust values, and then it computes transitive trust. This approach differs from the one adopted by some trust computation algorithms, such as MoleTrust and the one adopted in D-FOAF, where trust is used as a search parameter. This choice also enables us to test the performance of our system (cf. Section 10) by considering the worst case (i.e., the absence of a threshold). However, as will be clearer in the remainder of the paper, other approaches for trust computation can be adopted as well without much impact on our system. In the current version of our system, we adopt the variant of the TidalTrust algorithm granting the best accuracy in trust computation [Golbeck 2005]. First, all the shortest paths are discovered. Then, they are processed in order to set a trust threshold max t, which is used to discard trust paths consisting of edges with a trust value less than max t. More precisely, given two nodes 𝑣 and 𝑠, connected by one or more trust paths of length 2, the predicted trust existing between 𝑣 and 𝑠, denoted t𝑣,𝑠 , is computed as follows:

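\[
t_{v,s} \;=\; \frac{\displaystyle\sum_{u \in N \,\mid\, t_{v,u} \,\ge\, \text{max t}} t_{v,u}\, t_{u,s}}{\displaystyle\sum_{u \in N \,\mid\, t_{v,u} \,\ge\, \text{max t}} t_{v,u}} \tag{1}
\]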

where t𝑣,𝑢 (t𝑢,𝑠 ) denotes the trust value of the relationships existing between nodes 𝑣 and 𝑢 (𝑢 and 𝑠), whereas 𝑁 denotes the set of nodes with an incoming edge exiting from 𝑣. If the distance between 𝑣 and 𝑠 is greater than 2, the formula above is applied recursively, until the predicted trust existing between 𝑣 and 𝑠 is computed. The trust threshold max t is computed as follows. For each of the discovered paths, all its edges, except the one entering in 𝑠, are evaluated to compute the path’s strength, that is, the minimum trust value associated with such edges. Then, the trust threshold max t is set to the strength of the strongest path. For instance, consider the following trust paths 𝐴𝐵𝐶𝑆 and 𝐴𝐷𝐸𝑆, such that t𝐴,𝐵 = 0.2, t𝐵,𝐶 = 0.8, t𝐴,𝐷 = 0.4, and t𝐷,𝐸 = 0.6. In such a case, the strength of 𝐴𝐵𝐶𝑆 is equal to 0.2, whereas the strength of 𝐴𝐷𝐸𝑆 is 0.4. Consequently, the trust threshold is set to 0.4. Trust is then computed by considering only the paths with a strength equal to or greater than the trust threshold (i.e., only 𝐴𝐷𝐸𝑆 will be considered).
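As a minimal sketch (not the prototype's implementation), the threshold selection and formula (1) can be written in Python as follows. The function names, the dictionary-based edge map, the neutral strength of 1.0 for a direct path, and the zero-denominator guard are assumptions introduced only for illustration; the input is assumed to be the complete, non-empty set of shortest paths between the two nodes.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


def path_strength(path: List[str], trust: Dict[Edge, float]) -> float:
    """Minimum trust over the edges of `path`, excluding the edge entering the sink."""
    inner_edges = zip(path[:-2], path[1:-1])
    # Assumption: a direct path has no inner edge, so its strength defaults to 1.0.
    return min((trust[e] for e in inner_edges), default=1.0)


def trust_threshold(paths: List[List[str]], trust: Dict[Edge, float]) -> float:
    """The threshold max t: the strength of the strongest shortest path."""
    return max(path_strength(p, trust) for p in paths)


def predicted_trust(paths: List[List[str]], trust: Dict[Edge, float]) -> float:
    """Apply formula (1) recursively over the paths whose strength reaches the threshold."""
    threshold = trust_threshold(paths, trust)
    kept = [p for p in paths if path_strength(p, trust) >= threshold]
    source, sink = kept[0][0], kept[0][-1]

    # Successors of each node along the retained paths.
    successors: Dict[str, Set[str]] = {}
    for p in kept:
        for a, b in zip(p, p[1:]):
            successors.setdefault(a, set()).add(b)

    def t(v: str) -> float:
        if sink in successors[v]:
            return trust[(v, sink)]  # direct edge: its trust level is used as is
        neighbours = sorted(successors[v])
        den = sum(trust[(v, u)] for u in neighbours)
        if den == 0:  # assumption: no usable neighbour yields zero predicted trust
            return 0.0
        return sum(trust[(v, u)] * t(u) for u in neighbours) / den

    return t(source)


# Strengths and threshold for the two paths discussed above (ABCS and ADES).
paths = [["A", "B", "C", "S"], ["A", "D", "E", "S"]]
trust = {("A", "B"): 0.2, ("B", "C"): 0.8, ("A", "D"): 0.4, ("D", "E"): 0.6}
strengths = [path_strength(p, trust) for p in paths]  # [0.2, 0.4]
threshold = trust_threshold(paths, trust)             # 0.4
```

Because strength ignores the edge entering the sink, the threshold can be computed before the last-hop trust values are known; `predicted_trust` additionally needs those values.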

3. BACKGROUND & REQUIREMENTS

In this section, we first introduce the formal model of WBSN used throughout the paper, then we discuss access control requirements for WBSNs.

[Figure 1: directed graph over nodes A, B, C, D, E, F, and G; each edge is labeled with a pair (relationship type, trust level), where the type is friendOf or colleagueOf.]

Fig. 1. A subgraph of a WBSN. Labels associated with edges denote the type and trust level of the corresponding relationship.

3.1 Preliminary Notions

Similarly to other networks, a WBSN can be represented as a graph, where each node denotes a user in the network, whereas edges represent the existing relationships between users, and their trust levels. Edge direction denotes which node specified the relationship and the node for which the relationship has been specified, whereas the label associated with each edge denotes the type of the relationship. Moreover, we associate a trust level t with each edge. The number and type of supported relationships depends on the specific WBSN and its purposes; our only assumption is that there exists at least one relationship type. We also assume that, if RT denotes the set of supported relationship types, given two nodes 𝐴, 𝐵 ∈ SN , there may exist at most ∣RT ∣ edges from 𝐴 to 𝐵 (from 𝐵 to 𝐴, respectively), all labeled with distinct relationship types. We can now formally define a WBSN as follows. Definition 3.1 WBSN. A WBSN SN is a tuple (𝑉SN , 𝐸SN , RT SN , 𝜙SN ), where RT SN is the set of supported relationship types, 𝑉SN and 𝐸SN ⊆ 𝑉SN ×𝑉SN ×RT SN are, respectively, the nodes and edges of a directed labeled graph (𝑉SN , 𝐸SN , RT SN ), whereas 𝜙SN : 𝐸SN → [0, 1] is a function assigning to each edge 𝑒 ∈ 𝐸SN a trust level t, which is a rational number in the range [0, 1]. An edge 𝑒 = 𝑣𝑣 ′ ∈ 𝐸SN expresses that node 𝑣 has established a relationship of a given type rt 𝑒 ∈ RT SN with node 𝑣 ′ . We say that such relationship, denoted rt(𝑣, 𝑣 ′ ), is direct, since 𝑣 and 𝑣 ′ are directly connected by edge 𝑒. As an example, consider the WBSN depicted in Figure 1, where Alice (𝐴) has a direct relationship of type friendOf and trust level 0.6 with Carl (𝐶). Note that, in a given WBSN SN , multiple paths may exist between two nodes, denoting the same type of relationship. For instance, in the WBSN depicted in Figure 1, four paths exist from Alice to David (𝐷) denoting a relationship of type friendOf —namely, 𝐴𝐵𝐷, 𝐴𝐶𝐷, 𝐴𝐶𝐹 𝐷, and 𝐴𝐶𝐹 𝐺𝐷. 
As discussed in Section 2.3, trust computation is more accurate when only the shortest paths are taken into account. As such, we adopt this approach throughout the paper. Therefore, we extend the notion of relationship by saying that a relationship rt(𝑣, 𝑣′) is the set of all the shortest paths from 𝑣 to 𝑣′ consisting of edges labeled with relationship type rt.
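The extended notion of relationship can be illustrated with a breadth-first search that collects every shortest path made of edges of a single type. This is only a sketch: the function name and the edge representation are assumptions, and the edge set lists just the friendOf edges implied by the four A-to-D paths named in the text, so it is a partial reconstruction of Figure 1 rather than the full graph.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

# (source, target, relationship type); trust levels are omitted because
# path discovery does not depend on them.
TypedEdge = Tuple[str, str, str]

FRIEND_EDGES: Set[TypedEdge] = {
    ("A", "B", "friendOf"), ("B", "D", "friendOf"),
    ("A", "C", "friendOf"), ("C", "D", "friendOf"),
    ("C", "F", "friendOf"), ("F", "D", "friendOf"),
    ("F", "G", "friendOf"), ("G", "D", "friendOf"),
}


def shortest_paths(edges: Set[TypedEdge], src: str, dst: str, rt: str) -> List[List[str]]:
    """Return every shortest src-to-dst path that uses only edges labeled `rt`."""
    succ: Dict[str, List[str]] = {}
    for a, b, label in edges:
        if label == rt:
            succ.setdefault(a, []).append(b)

    # Breadth-first search that records all predecessors lying on shortest paths.
    dist: Dict[str, int] = {src: 0}
    preds: Dict[str, List[str]] = {src: []}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        for u in succ.get(v, []):
            if u not in dist:
                dist[u] = dist[v] + 1
                preds[u] = [v]
                queue.append(u)
            elif dist[u] == dist[v] + 1:
                preds[u].append(v)

    if dst not in dist:
        return []

    def unwind(node: str) -> List[List[str]]:
        if node == src:
            return [[src]]
        return [p + [node] for q in preds[node] for p in unwind(q)]

    return sorted(unwind(dst))


relationship = shortest_paths(FRIEND_EDGES, "A", "D", "friendOf")
# relationship == [['A', 'B', 'D'], ['A', 'C', 'D']]
path_length = len(relationship[0]) - 1  # 2 edges on each shortest path
```

The longer paths ACFD and ACFGD are discarded automatically, since the search only keeps predecessors that lie at the minimum distance.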



Besides type and trust level, relationships have a further property, namely their depth. The depth d_{rt(𝑣,𝑣′)} ∈ ℕ of a relationship rt(𝑣, 𝑣′) corresponds to the length of any of the paths in rt(𝑣, 𝑣′). In contrast, the trust level of rt(𝑣, 𝑣′), denoted t_{rt(𝑣,𝑣′)}, is computed by using formula (1) in Section 2.3.2.

3.2 Requirements for Access Control in WBSNs

As discussed in Section 2.1, currently, most WBSNs enforce very simple mechanisms for controlling access to resources. Such simple access control mechanisms have the advantage of being straightforward to be implemented, but they are not flexible enough in denoting authorized users. Therefore, in this section we first discuss requirements that an access control model for WBSNs should satisfy. We then point out some relevant issues related to access control enforcement. 3.2.1 Access Control Model Requirements. In what follows, we consider the WBSN depicted in Figure 1 by analyzing different scenarios with varying access control requirements. Suppose, for instance, that Alice is the owner of a set of resources 𝑅𝐴 , and that she wishes to share them with some of her friends. In this simple scenario, standard access control policies provided by Database Management Systems (DBMSs) fit very well. Indeed, since an access control policy basically states who can access what and under which modes, and since Alice knows a priori her friends, she is able to set up a set of authorizations to properly grant the access only to (a subset of) her friends. However, if we consider a more general scenario, the traditional way of specifying policies is not enough. For instance, let us suppose that Alice decides to make available her resources not only to her friends, but also to their friends, the friends of their friends, and so on. The problem is that Alice may not know a priori all her possible indirect friends, and thus she may not be able to specify a set of access control policies applying to them. Additionally, even if she knew all of them, she should specify a huge number of policies. Moreover, if we consider that relationships among users of a WBSN could change dynamically over time, this solution implies a complex policy management. 
An access control model for WBSNs should therefore take into account that a node in the network usually wishes to share its data with other nodes on the basis of both the direct and the indirect relationships existing among them. Therefore, a first requirement that we need to address is supporting access control based on users’ relationships and their types. Let us consider again the WBSN depicted in Figure 1, and assume once again that Alice wishes to share her data with some of her direct and indirect friends. In particular, she wants to grant access to Bob (𝐵) and Carl, since they are direct friends of hers. She also wants to allow David and Fred (𝐹) to access her data, even if she does not know them directly, because they are direct friends of Bob and Carl. In contrast, Alice may not want to give Greg (𝐺) access to her resources, since she does not know how Fred chooses his friends. In conclusion, when considering a WBSN, the length of the path connecting two nodes (i.e., the depth of a relationship) is relevant information for access control purposes. Thus, an access control model for WBSNs should enable a user to state in a policy not only the type but also the maximum depth of a relationship. Although the notions of depth and trust may be related, they are not equivalent.



For instance, let us suppose that Alice does not trust Bob very much, and that, in contrast, she considers Carl highly trustworthy. In this case, the depth of the relationship is the same for both Bob and Carl, but the trust level is different. Therefore, access control policies should support also constraints on the minimum trust level of a relationship. 3.2.2 Access Control Enforcement Requirements. Usually, access control is enforced by a software module, called reference monitor, that intercepts each access request submitted to the system and, on the basis of the specified access control policies, determines whether the access can be partially or totally authorized, or it must be denied. Therefore, the robustness of access control relies on the trustworthiness of the entity implementing the reference monitor, which should correctly enforce all and only the specified access control policies. As a consequence, when designing an access control enforcement mechanism for WBSNs, one has to decide where the reference monitor has to be placed, that is, which is the trusted entity of the WBSN architecture in charge of evaluating access control policies. A first possibility is to delegate to the SNMS the role of reference monitor. According to this choice, users have to completely delegate the control of their data to the SNMS, by simply stating how data must be released to other network nodes. In this scenario, to which we refer to as centralized access control enforcement, the SNMS stores the access control policies of each user in the network, it processes each access request and evaluates over it WBSN members’ access control policies. Even if this kind of solution is largely accepted in other Web-based applications, it is important to understand whether centralized access control is appropriate in WBSN scenarios. 
The main reason for this concern is that adopting centralized access control enforcement implies totally delegating the administration of user data to the SNMS. Since access control is enforced by the SNMS, users do not actually know whether it is correctly enforced. They have no assurance about the behavior of SNMSs with respect to their data (for instance, an SNMS could maliciously release them to unauthorized users). They have to totally trust SNMSs. Therefore, it is important to carefully evaluate whether this could be easily accepted by WBSN members. It is true that, in current WBSNs, users are already providing SNMSs with a huge amount of personal data. However, it is also true that some recent events have made users aware that the SNMS’s behavior is not always honest and transparent. Let us consider, for instance, some privacy concerns related to Facebook [EPIC 2008a]. In 2006, Facebook received complaints from some privacy activists against the use of the News Feed feature [Chen 2006], introduced to inform users of the latest personal information related to their online friends. These complaints resulted in an online petition signed by over 700,000 users demanding that the company stop this service. Facebook replied by allowing users to set some privacy preferences. More recently, in November 2007, Facebook received other complaints related to the use of Beacon [Berteau 2007]. Beacon is part of the Facebook advertising system, introduced to track users’ activities on more than 40 Web sites of Facebook partners. Such information is collected even when a user is off the social-networking site, and is reported to the user’s friends without the consent of the user him/herself. Even in this case, the network community promptly reacted with another online petition that gained more than



50,000 signatures in less than 10 days. These are only few examples of privacy concerns related to SNMSs. All these events have animated several online discussions about WBSN privacy, and government organizations started to seriously consider this issue [EPIC 2008b; Hogben 2007; Canadian Privacy Commission 2007; Federal Trade Commission 2007]. These increasing privacy concerns about how SNMSs manage personal information lead us to believe that a centralized access control solution is not the most appropriate in the WBSN scenario. As mentioned before, one could argue that this paradigm is already well-accepted and adopted in several Web-based applications (for instance, home banking or email services just to mention two of them). However, the main difference with respect to WBSNs is that in all these scenarios users have no choice: if they want to exploit the service, they have to accept that their data are managed according to the policies specified and enforced by the entity providing the service. In contrast, in WBSNs the real services are provided by users. Indeed, relationships are created by them, data are published by them. SNMSs only provide the framework, but, without users and their contents, the framework is completely useless. For all these reasons, we believe that in the near future WBSN participants would like to have more and more control over their data. In view of this, we believe it is necessary to investigate alternative ways of enforcing access control, which make users not totally dependent on SNMSs. A possible solution is to make the network participants themselves able to evaluate their access control policies. In this scenario, to which we refer to as decentralized access control enforcement, each participant is in charge of specifying and enforcing his/her access control policies. 
Each time a user receives an access request, the reference monitor, which is locally hosted by each network node, evaluates it against the specified policies, and decides whether access to the resource can be granted or not. The main drawback of this solution is that implementing a decentralized access control mechanism requires software and hardware resources more powerful than those typically available to WBSN participants. For instance, since access to a resource in a WBSN is usually granted on the basis of the direct/indirect relationships the requestor node has with other nodes in the network, answering an access request may require verifying the existence of specific paths within a WBSN. This task may be very difficult and time consuming in a fully decentralized solution. Therefore, a further essential requirement of access control enforcement in WBSNs is to devise efficient and scalable implementation strategies. In the following section we propose a semi-decentralized solution, as a trade-off among all the discussed requirements.

4. OVERVIEW OF THE PROPOSED MECHANISM

In order to cope with the requirements outlined in the previous section, we propose a rule-based access control model for WBSNs, which allows the specification of access rules for online resources, where authorized users are denoted in terms of the type, maximum depth, and minimum trust level of the relationships existing between nodes in the network. The proposed access control model, described in



Section 5, totally satisfies the requirements discussed in Section 3.2.1. As far as access control enforcement is concerned, we have decided to take the view outlined by Weitzner et al. [2006], who propose to enforce access control in the Semantic Web according to a strategy which is analogous to the one adopted by trust management systems such as PolicyMaker [Blaze et al. 1998], SPKI/SDSI [Ellison et al. 1999], and KeyNote [Blaze et al. 1999]. Differently from traditional access control mechanisms, in such an approach the task of verifying whether an access is authorized falls to the requesting user, who must prove to the resource’s owner that he/she satisfies the requirements expressed by the owner’s access control policies. Adopting this solution in the context of WBSNs implies the following steps. When a user, hereafter called the resource owner, receives from another user, hereafter the requestor, an access request for one of his/her resources, he/she replies by sending the set of access rules regulating the release of the resource. In order to gain access to the resource, the requestor has to provide the resource owner with a proof showing the existence of the required relationships, and that these relationships have the required depth and trust level. Therefore, each node is equipped with a reasoner for rule evaluation and proof generation. By means of the reasoner, the resource owner is then able to locally verify the proof, if he/she does not trust the requesting user. Implementing this client-side access control mechanism requires addressing relevant issues related to the trustworthiness of the proofs sent by the requestor. Indeed, the resource owner should be able to verify that the proof received from the requestor has not been forged.
To cope with this issue, we propose a solution based on the notion of relationship certificates (see Section 6 for more details), according to which, whenever a user, say Alice, establishes a new relationship with another user, say Bob, they both create and sign a certificate stating that between them there exists a direct relationship of a certain type and with a certain trust level. A proof regarding the existence of a direct or indirect relationship of a given type between users 𝐴 and 𝐵 can therefore be generated and verified through a set of certificates confirming the existence of a path of that type between them (hereafter we refer to this set of certificates as a certificate path). Thus, providing the resource owner with certificate paths enables him/her to verify the correctness of a proof certifying the existence of a given direct or indirect relationship as well as its depth, in that the number of certificates in the path represents the length of the path itself. In contrast, verifying the relationship’s trust level needs a more complex strategy. Indeed, since the trust level between two nodes is computed taking into account all the shortest paths connecting them, in order to verify the validity of the trust level contained in a proof the requestor must provide the resource owner with all the corresponding certificate paths. However, how can the owner be sure that the requestor has actually provided all the shortest certificate paths? If more than one path exists, the requestor may maliciously omit one or more of them, providing only the paths with the highest level of trust.

Example 4.1. Consider the WBSN depicted in Figure 1, and suppose that David requests access to a resource rsc, owned by Alice, for which it is required to be a friend of hers, with maximum depth equal to 4, and with a minimum trust level equal to 0.8. The relationship of type friendOf existing between Alice and David



consists of four paths, 𝐴𝐵𝐷, 𝐴𝐶𝐷, 𝐴𝐶𝐹𝐷, and 𝐴𝐶𝐹𝐺𝐷. The shortest paths, namely, 𝐴𝐵𝐷 and 𝐴𝐶𝐷, must all be taken into account when computing the trust level, because they are required to compute the trust threshold used to select the paths to be considered when computing the trust level of the relationship. Now, the strengths of 𝐴𝐵𝐷 and 𝐴𝐶𝐷 are equal to 0.2 and 0.6, respectively. Consequently, only path 𝐴𝐶𝐷 will be used for trust computation. According to formula (1) in Section 2.3.2, the trust level between Alice and David is equal to 0.7, and thus David cannot access rsc. Yet, if we consider the two shortest paths separately, we have t_𝐴𝐵𝐷 = (0.2 × 0.8)/0.2 = 0.8 and t_𝐴𝐶𝐷 = (0.6 × 0.7)/0.6 = 0.7. Thus, if David provides to Alice only the certificate path corresponding to 𝐴𝐵𝐷, he will gain access to rsc since t_𝐴𝐵𝐷 = 0.8, even if he is not actually authorized.

To avoid this problem, we assume the presence of a trusted Certificate Server (CS). This server acts as a certificate repository in charge of storing in a central certificate directory 𝒞𝒞𝒟 all the relationship certificates specified by WBSN nodes, and is enhanced with the functionality of discovering certificate paths (see Section 7.3 for more details). Thus, whenever the requestor needs to prove to the resource owner the existence of a given relationship, as well as its depth and trust, he/she requests the CS to discover the set of certificate paths corresponding to all the shortest paths referring to the required relationship. Such certificate paths, signed by CS, are then used by the requestor to generate the proof. The proof and the certificate paths are sent to the resource owner, who can locally verify the validity of the proof, if needed. This solution has several benefits in terms of efficiency and scalability with respect to the fully decentralized one.
Indeed, introducing the certificate server makes the overall framework more efficient, in that the burden of certificate management is on the CS, which obviously performs this task more efficiently than any other single node in the network. Moreover, the framework gains in scalability, in that a WBSN could exploit several (external) certificate servers, on the basis of the number of its participants. Furthermore, this solution might be extensible to interactions among different WBSNs. Indeed, users of a given WBSN could interact with participants of another WBSN, under the assumption that there exists a mutual agreement between the corresponding certificate servers. However, besides the benefits, introducing the certificate server makes the solution no longer fully decentralized. Indeed, in the proposed solution users locally take access control decisions, but on the basis of certificate paths discovered by certificate servers. In some way, they still rely on external and potentially untrusted entities for their access control decisions. For this reason, we define the proposed solution as semi-decentralized, in that the access control decision is taken by the resource owner itself, but by exploiting information discovered by certificate servers. Even if this solution is not fully decentralized, there is a relevant difference between the proposed strategy and the centralized one. Indeed, in a centralized solution users have to trust the entity enforcing access control (i.e., the SNMS). They have no means to verify the correctness of access request evaluation. In contrast, according to our semi-decentralized solution, users rely on external entities only for certificate management and certificate path discovery. Moreover, they are still able to verify whether the certificate paths discovered by a CS are correct or not. For instance,



users could query other certificate servers to cross-check the received certificate paths, or they could directly contact other participants to verify certificates’ correctness. This, in addition to the benefits in terms of efficiency, leads us to consider the proposed semi-decentralized solution a good trade-off between efficiency and security requirements.

5. AN ACCESS CONTROL MODEL FOR WBSNS

In our model, access control requirements applying to a resource are expressed by specifying one or more access conditions, by which the resource owner 𝑂 determines the type of relationships that a requesting node 𝑅 must have with a given node (which may correspond to 𝑂 or to any other node in the network), possibly along with their maximum depth and minimum trust level. Access conditions are formally defined as follows.

Definition 5.1 Access Condition. An access condition ac is a tuple (inode, rt, dmax, tmin), where inode ∈ 𝑉SN ∪ {∗} is the node with which the requesting node must have a relationship, rt ∈ RT SN ∪ {∗} is a relationship type, whereas dmax ∈ ℕ ∪ {∗} and tmin ∈ [0, 1] ∪ {∗} are, respectively, the maximum depth and minimum trust level that the relationship must have. If inode = ∗ and/or rt = ∗, inode corresponds to any node in 𝑉SN and/or rt corresponds to any relationship in RT SN, whereas, if dmax = ∗ and/or tmin = ∗, there is no constraint concerning the depth and/or trust level, respectively.

Here and in what follows, given a tuple 𝑡 we denote with 𝑐𝑜𝑚𝑝(𝑡) the value of the component 𝑐𝑜𝑚𝑝 of tuple 𝑡. Therefore, we denote with inode(ac), rt(ac), dmax(ac), and tmin(ac) the different components of a given access condition ac. Moreover, if one or more components of an access condition ac are set to the wildcard (∗), we say that it is a ∗-condition (denoted ∗ac). Given a resource rsc, the access control requirements of rsc are expressed through a set of access rules specified by its owner 𝑂. The notion of access rule is formally defined as follows.

Definition 5.2 Access Rule. An access rule ar is a pair (rid, AC), where rid is the identifier of resource rsc, whereas AC ≠ ∅ is a set of access conditions (also referred to as condition set), expressing the requirements a node must satisfy in order to be allowed to access resource rsc. The conditions in AC do not denote a set of alternative requirements, but all the requirements to be satisfied.
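The two definitions can be mirrored by a small data model. The sketch below is illustrative only: the class and method names are assumptions, `None` stands in for the wildcard ∗, and the sample values are hypothetical rather than taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional


@dataclass(frozen=True)
class AccessCondition:
    """Tuple (inode, rt, dmax, tmin); None plays the role of the wildcard."""
    inode: Optional[str] = None
    rt: Optional[str] = None
    dmax: Optional[int] = None
    tmin: Optional[float] = None

    def is_star_condition(self) -> bool:
        return any(c is None for c in (self.inode, self.rt, self.dmax, self.tmin))

    def admits(self, inode: str, rt: str, depth: int, trust: float) -> bool:
        """True if a relationship with the given properties meets this condition."""
        return ((self.inode is None or self.inode == inode)
                and (self.rt is None or self.rt == rt)
                and (self.dmax is None or depth <= self.dmax)
                and (self.tmin is None or trust >= self.tmin))


@dataclass(frozen=True)
class AccessRule:
    """Pair (rid, AC) with a non-empty condition set AC."""
    rid: str
    conditions: FrozenSet[AccessCondition]

    def __post_init__(self) -> None:
        if not self.conditions:
            raise ValueError("the condition set AC must not be empty")

    def satisfied(self, holds: Callable[[AccessCondition], bool]) -> bool:
        # All conditions must hold: the set is a conjunction, not a list of alternatives.
        return all(holds(ac) for ac in self.conditions)


# Hypothetical values, chosen only to exercise the wildcard handling.
strict = AccessCondition("A", "friendOf", 2, 0.5)
any_trust = AccessCondition("C", "colleagueOf", 1, None)
rule = AccessRule("rid-1", frozenset({strict, any_trust}))
```

Representing the wildcard as `None` keeps each component's type simple; a dedicated sentinel object would work equally well.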
The semantics of a condition set {ac1, . . . , ac𝑛} can thus be expressed as the conjunction ac1 ∧ ⋅ ⋅ ⋅ ∧ ac𝑛. More than one access rule may also be specified for a given resource, when alternative access control requirements must be met.

Example 5.3. Consider the WBSN depicted in Figure 1, and suppose that Alice owns a resource rsc, which she wishes to make available only to her direct and indirect friends, with the constraint that their relationships have a maximum depth equal to 3, and a minimum trust level equal to 0.8. Such policy can be expressed through the following access rule: ar1 = (rid, {(𝐴, friendOf, 3, 0.8)}), where rid is the ID of resource rsc. In contrast, suppose that resource rsc should be accessed



by users that are either friends of Alice (with the constraints on depth and trust level stated by ar1) or direct colleagues of Carl, independently of their trust level. This can be achieved by specifying two distinct access rules, namely, ar1 above, and ar2 = (rid, {(𝐶, colleagueOf, 1, ∗)}). Finally, suppose that resource rsc should be accessed by users that are both friends of Alice and colleagues of Carl, with the same constraints on depth and trust level stated by ar1 and ar2. In such a case, Alice can specify the following rule: ar3 = (rid, {(𝐴, friendOf, 3, 0.8), (𝐶, colleagueOf, 1, ∗)}).

6. CERTIFYING RELATIONSHIPS

As stated in Section 4, to support client-side access control, we need a way to ensure relationships’ authenticity. For this purpose, we assume that a relationship is expressed by means of a relationship certificate. More precisely, whenever a node inode ∈ 𝑉SN wishes to establish a relationship of type rt with another node tnode ∈ 𝑉SN , it generates a certificate where it declares the existence of a relationship of type rt and given trust level with tnode. The certificate is signed by both inode and tnode. In the following, we denote with PK 𝑣 and SK 𝑣 the public and private keys of a node 𝑣 ∈ 𝑉SN , respectively. The notion of relationship certificate is formally defined as follows. Definition 6.1 Relationship Certificate. Let inode ∈ 𝑉SN be a node wishing to establish a relationship of type rt with another node tnode ∈ 𝑉SN . Let t be the trust level inode wishes to assign to the relationship, and ts a timestamp denoting the time instant when the relationship has been established. The certificate rc of such relationship is given by the concatenation of the tuple relSpec = (inode, tnode, rt, t, ts) with its double signature DSig, i.e., a pair (Sig SK inode (relSpec), Sig SK tnode (relSpec)), where Sig is a signing function.2 Example 6.2. The following are examples of certificates corresponding to the relationships existing between Alice and Bob in the WBSN depicted in Figure 1: —(𝐴, 𝐵, friendOf , 0.2, ts)∣∣(Sig SK 𝐴 (𝐴, 𝐵, friendOf , 0.2, ts), Sig SK 𝐵 (𝐴, 𝐵, friendOf , 0.2, ts)) —(𝐴, 𝐵, colleagueOf , 0.8, ts′ )∣∣(Sig SK 𝐴 (𝐴, 𝐵, colleagueOf , 0.8, ts′ ), Sig SK 𝐵 (𝐴, 𝐵, colleagueOf , 0.8, ts′ )). After being generated and signed, certificates are uploaded to the certificate directory 𝒞𝒞𝒟 of the certificate server, which acquires them after having checked the validity of their signatures. Copies of such certificates are also held by the nodes that generated them into their local certificate directories. 
Moreover, the certificate server is equipped with a Certificate Revocation List to manage certificate revocation.

2 Sig_𝑘(𝑑) denotes the signature of 𝑑 with key 𝑘. For simplicity, we use here the same pairs of private and public keys for both encrypting and signing messages. It is however possible to have two different pairs of private/public keys, one to be used for encrypting and the other for signing.
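The structure of a relationship certificate, that is, relSpec concatenated with a double signature, can be sketched as follows. This is a stand-in under loud assumptions: HMAC-SHA256 from the Python standard library replaces the signing function Sig so that the example runs without third-party packages, but HMAC is symmetric and is not a public-key signature; a real deployment would sign with SK and verify with PK. All names, keys, and the timestamp value are hypothetical.

```python
import hashlib
import hmac
from typing import Dict, NamedTuple, Tuple


class RelSpec(NamedTuple):
    inode: str   # node establishing the relationship
    tnode: str   # node with which the relationship is established
    rt: str      # relationship type
    t: float     # trust level assigned by inode
    ts: int      # timestamp (hypothetical integer encoding)


Certificate = Tuple[RelSpec, Tuple[bytes, bytes]]


def encode(spec: RelSpec) -> bytes:
    """Deterministic byte encoding of relSpec (an assumed format)."""
    return repr(tuple(spec)).encode("utf-8")


def sign(key: bytes, spec: RelSpec) -> bytes:
    # Stand-in for Sig_SK(relSpec): HMAC is symmetric, NOT a public-key signature.
    return hmac.new(key, encode(spec), hashlib.sha256).digest()


def make_certificate(spec: RelSpec, keys: Dict[str, bytes]) -> Certificate:
    """relSpec together with the double signature (by inode, by tnode)."""
    return spec, (sign(keys[spec.inode], spec), sign(keys[spec.tnode], spec))


def verify_certificate(cert: Certificate, keys: Dict[str, bytes]) -> bool:
    """Check both signatures, as the certificate server does before storing."""
    spec, (sig_inode, sig_tnode) = cert
    return (hmac.compare_digest(sig_inode, sign(keys[spec.inode], spec))
            and hmac.compare_digest(sig_tnode, sign(keys[spec.tnode], spec)))


keys = {"A": b"hypothetical-key-of-A", "B": b"hypothetical-key-of-B"}
cert = make_certificate(RelSpec("A", "B", "friendOf", 0.2, 1), keys)
is_valid = verify_certificate(cert, keys)
```

Changing any field of relSpec after signing, for instance raising the trust level, makes verification fail, which is the property the certificate server relies on when it checks signatures before storing a certificate.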

7. ACCESS CONTROL ENFORCEMENT

As pointed out in Section 4, in order to access a given resource rsc, the requestor must provide the resource owner with a proof demonstrating that he/she is authorized to do so. Therefore, before illustrating the steps involved in access control enforcement, we clarify the notion of proof.

7.1 Proofs

A proof certifies that a requestor 𝑅 satisfies at least one of the access rules associated with the requested resource. Suppose that a resource rsc is protected by the set of access rules AR rsc. To generate a proof, the requestor must attest that there exists at least one access rule ar ∈ AR rsc such that he/she satisfies all the conditions stated by the condition set AC(ar). This is obtained by first computing for each condition ac ∈ AC(ar) an assertion stating that between node 𝑅 and node inode(ac) there exists a relationship of type rt(ac), with a certain depth and trust level. The assertion is computed by 𝑅 on the basis of the corresponding certificate paths discovered by CS (see Section 7.4 for more details). The result is a set of assertions 𝑅𝐴 = {𝑟𝑎1, . . . , 𝑟𝑎𝑛}, one for each ac ∈ AC(ar), of the form 𝑟𝑎 = (inode, 𝑅, rt, d, t), which are then matched with the set of access conditions AC(ar) in order to obtain a proof. More precisely, a proof is obtained if, for each access condition ac ∈ AC(ar), there exists an assertion 𝑟𝑎 ∈ 𝑅𝐴 such that: (1) inode(𝑟𝑎) = inode(ac), (2) rt(𝑟𝑎) = rt(ac), (3) d(𝑟𝑎) ≤ dmax(ac), and (4) t(𝑟𝑎) ≥ tmin(ac).

Example 7.1. Consider the WBSN depicted in Figure 1 and the access rules in Example 5.3, and suppose that David requests to access a resource owned by Alice, protected by access rule ar1 = (rid, {(𝐴, friendOf, 3, 0.8)}). As already seen in Example 4.1, the shortest paths of type friendOf between Alice and David are 𝐴𝐵𝐷 and 𝐴𝐶𝐷. The resulting assertion generated by David, namely, (𝐴, 𝐷, friendOf, 2, 0.7), states that between Alice and David there exists a relationship of type friendOf, depth 2 and trust level 0.7. Since this assertion does not satisfy the condition on the minimum trust level required by ar1, a proof is not obtained, and consequently David will not be authorized to access the resource.

We use the Cwm reasoner [Cwm 2006] in order to compute the proof.
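Independently of the reasoner, the matching of assertions against access conditions (items 1 to 4 above) can be sketched in plain Python. This is an illustration only, not the Cwm/N3 encoding used by the prototype; the type and function names are assumptions, and wildcard components are deliberately left out of this sketch.

```python
from typing import NamedTuple, Sequence


class Condition(NamedTuple):
    inode: str
    rt: str
    dmax: int
    tmin: float


class Assertion(NamedTuple):
    inode: str
    requestor: str
    rt: str
    d: int
    t: float


def matches(ra: Assertion, ac: Condition) -> bool:
    """Items (1) to (4): same node, same type, depth within dmax, trust at least tmin."""
    return (ra.inode == ac.inode
            and ra.rt == ac.rt
            and ra.d <= ac.dmax
            and ra.t >= ac.tmin)


def rule_proved(conditions: Sequence[Condition], assertions: Sequence[Assertion]) -> bool:
    """A proof is obtained only if every condition is matched by some assertion."""
    return all(any(matches(ra, ac) for ra in assertions) for ac in conditions)


# Example 7.1: rule ar1 and the assertion generated by David.
ar1 = [Condition("A", "friendOf", 3, 0.8)]
david = [Assertion("A", "D", "friendOf", 2, 0.7)]
proved = rule_proved(ar1, david)  # False: trust 0.7 is below the required 0.8
```

The check fails on item (4) alone: the depth constraint is met (2 ≤ 3), but the trust level is not.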
Before running Cwm, we transform both assertions and access rules into equivalent logical formulas expressed in N3 (see Appendix B for more details on N3). As a result, we obtain an N3-encoded proof, denoted 𝜋, which contains, besides the assertions and the rule, the steps followed by the reasoner to carry out the demonstration.

7.2 Access Control Protocol

In order to implement securely the access control procedure sketched in Section 4, we need a protocol assuring both the requestor and the resource owner that access rules and proofs are valid and authentic. For this purpose, we have devised an access control protocol (depicted in Figure 2) consisting of the steps described in Figure 3.³ Note that all the exchanged messages are encrypted with the private key of the sender and with the public key of the receiver, in order to ensure their authenticity, integrity, and confidentiality.

³ In the figure and in the remainder of the paper, 𝐸_𝑘(𝑑) denotes the encryption of 𝑑 with key 𝑘.

[Figure 2: diagram of the messages exchanged among 𝑅, 𝑂, and CS]
1. 𝑅 → 𝑂: 𝐸_{PK_O}(𝐸_{SK_R}(rid))
2. 𝑂 → 𝑅: 𝐸_{PK_R}(𝐸_{SK_O}({(ar1, n1), ..., (ar𝑛, n𝑛)}))
3. 𝑅 → CS: 𝐸_{PK_CS}(𝐸_{SK_R}(A̅C̅(ar), n))
4. CS → 𝑅: 𝐸_{PK_R}(𝐸_{SK_CS}(𝒞𝒫, n))
5. 𝑅 → 𝑂: 𝐸_{PK_O}(𝐸_{SK_R}(rid, 𝜋, 𝐸_{SK_CS}(𝒞𝒫, n)))
6. 𝑂 → 𝑅: 𝐸_{PK_R}(𝐸_{SK_O}(rsc))

Fig. 2. Access control protocol. 𝑅 is the node requesting a resource with identifier rid, 𝑂 is the node owning such resource, whereas CS is the certificate server.

(1) 𝑅 submits to 𝑂 an access request for resource rsc, with identifier rid.
(2) If the resource is public, access is granted. Otherwise, 𝑂 returns to 𝑅 the set of access rules AR = {ar1, ..., ar𝑛} regulating the access to rsc. With each access rule ar𝑖 ∈ AR, 𝑖 ∈ [1, 𝑛], a distinct nonce value n𝑖 is associated as a session identifier.
(3) 𝑅 chooses from AR an access rule ar and sends CS the nonce value n associated with ar and the corresponding condition set. More precisely, since the certificate server CS only has to discover the shortest certificate paths referring to the relationships denoted by AC(ar), whereas the requestor is in charge of trust computation, 𝑅 sends CS a modified version of the set AC(ar) of access conditions (denoted A̅C̅(ar)), where the t_min component of each access condition is set to null.
(4) CS returns to 𝑅 the set 𝒞𝒫 of shortest certificate paths, if any, related to the relationship constraints expressed by the access conditions in A̅C̅(ar), along with the nonce n associated with ar; otherwise, CS returns a failure message. In the latter case, 𝑅 goes back to step 3 and chooses another access rule, until either CS returns a set 𝒞𝒫 or all the access rules have been processed (the algorithm for certificate path discovery is described in Section 7.3).
(5) Based on the certificate paths in 𝒞𝒫, 𝑅 computes the corresponding set of assertions 𝑅𝐴, and then invokes the reasoner in order to match them against access rule ar (see Section 7.4 for more details). If a proof is not obtained, 𝑅 goes back to step 3 and chooses another access rule; otherwise, he/she sends 𝑂 a message containing the resource identifier, the proof 𝜋, and the certificate paths obtained from CS. 𝒞𝒫 and n are kept encrypted with the private key of CS, in order to guarantee their authenticity.
(6) 𝑂 sends 𝑅 the requested resource, provided that the proof 𝜋 is valid and the nonce value n corresponds to the correct session identifier. Before granting access to the resource, 𝑂 can locally check whether the assertions used in the proof are actually derived from the received certificate paths in 𝒞𝒫, by performing the same steps carried out by the requestor for proof generation (see Section 7.5).

Fig. 3. Description of the access control protocol depicted in Figure 2

Example 7.2. Consider the WBSN depicted in Figure 1, and suppose that David requests to access a resource rsc owned by Alice, protected by two access rules,
namely, ar1 = (rid, {(𝐴, friendOf, 3, 0.8)}) and ar2 = (rid, {(𝐶, colleagueOf, 1, ∗)}). According to the access control protocol described in Figure 3, evaluating this access request requires the following steps:
(1) David sends Alice an access request 𝐸_{PK_A}(𝐸_{SK_D}(rid)), where rid is the identifier of resource rsc.
(2) Alice sends back to David a message 𝐸_{PK_D}(𝐸_{SK_A}({(ar1, n1), (ar2, n2)})), containing the access rules ar1 and ar2 applying to rsc, associated with two distinct nonces n1 and n2.
(3) David selects the first rule ar1, extracts the corresponding set of conditions AC(ar1) = {(𝐴, friendOf, 3, 0.8)}, and modifies the condition by setting the trust component to null, thus obtaining A̅C̅(ar1) = {(𝐴, friendOf, 3, null)}. Then, he sends CS the message 𝐸_{PK_CS}(𝐸_{SK_D}(A̅C̅(ar1), n1)).
(4) CS verifies whether one or more certificate paths exist satisfying A̅C̅(ar1). As already seen in Example 4.1, the shortest paths of type friendOf existing between Alice and David are 𝐴𝐵𝐷 and 𝐴𝐶𝐷. CS then builds the corresponding certificate paths, namely (rc1, rc2) and (rc3, rc4), where: rc1 = (𝐴, 𝐵, friendOf, 0.2, ts1)||DSig1, rc2 = (𝐵, 𝐷, friendOf, 0.8, ts2)||DSig2, rc3 = (𝐴, 𝐶, friendOf, 0.6, ts3)||DSig3, rc4 = (𝐶, 𝐷, friendOf, 0.7, ts4)||DSig4. It then sends David the message 𝐸_{PK_D}(𝐸_{SK_CS}({(rc1, rc2), (rc3, rc4)}, n1)).
(5) David must verify whether the relationship denoted by {(rc1, rc2), (rc3, rc4)} satisfies ar1. For this purpose, he computes the corresponding assertion (𝐴, 𝐷, friendOf, 2, 0.7). Since this assertion does not satisfy the constraint on the minimum trust level, David cannot obtain a proof. Consequently, he sends CS the modified set of conditions A̅C̅(ar2) = {(𝐶, colleagueOf, 1, null)} corresponding to rule ar2.
(6) CS verifies that a path of type colleagueOf and length 1 exists between Carl and David, and therefore it sends David the message 𝐸_{PK_D}(𝐸_{SK_CS}({(rc5)}, n2)), where rc5 = (𝐶, 𝐷, colleagueOf, 0.4, ts3)||DSig5, and n2 is the nonce value associated with ar2.
(7) From the received certificate path (rc5), David obtains the assertion (𝐶, 𝐷, colleagueOf, 1, 0.4), which satisfies ar2. Thus, David computes the proof 𝜋 and sends it to Alice in the message 𝐸_{PK_A}(𝐸_{SK_D}(rid, 𝜋, 𝐸_{SK_CS}({(rc5)}, n2))).
(8) Alice verifies that the nonce value n2 is valid and that the proof 𝜋 is correct. Then, she can decide whether to grant access to the resource without any further check or to verify the correctness of the assertion derived from rc5.

7.3 Certificate Path Discovery

In step 4 of the protocol (see Figure 3), CS discovers the shortest certificate paths referring to the set of modified access conditions A̅C̅(ar) (AC, for short) received from the requestor node. This can be achieved by exploring the network graph, a task which may have a high computational cost, depending on the degree and the order of the graph itself. More precisely, exploring the network graph requires either 𝑂(𝑉_{SN} + 𝐸_{SN}) or Θ(𝑉_{SN} + 𝐸_{SN}) time complexity, depending on whether we use a breadth-first search (BFS) or a depth-first search (DFS), respectively. However, given the constraints on the relationship type and depth specified in an access condition, we can reduce the size of the graph to be explored, and therefore the computational cost. In fact, the search can be terminated as soon as either the specified maximum depth is reached or the shortest path(s) between two nodes is (are) found. Moreover, we are not actually interested in discovering all the paths existing between two nodes, but only in those consisting of edges all labeled with one of the relationship types RT = {rt_1, ..., rt_𝑛} specified in the input condition set AC = {ac1, ..., ac𝑛}. Therefore, we explore only the set of subgraphs SN_{rt_1}, ..., SN_{rt_𝑛} ⊆ SN, where SN_{rt_𝑖} denotes the subgraph of SN consisting of all and only the edges labeled with relationship type rt_𝑖 and the nodes connected by them. Unless |RT_{SN}| = 1, the graphs to be explored usually have a size far lower than that of SN. Finally, since we have to find only the shortest path(s), the search is performed by using a BFS algorithm. This means that exploring each subgraph SN_{rt_𝑖} ⊆ SN requires 𝑂(𝑉_{SN_{rt_𝑖}} + 𝐸_{SN_{rt_𝑖}}) time complexity.

Another factor that affects the system performance is the use of the wildcard ∗ for one or more of the components of an access condition. Since the certificate server CS performs the search based on a modified version of AC, where the constraints on the minimum trust level are omitted, we do not consider the case in which the access condition contains a wildcard ∗ in its trust component. We can then say that, in the general case, the time complexity required to evaluate an access condition ac is

$$O\Big(\sum_{rt \in RT(ac)} \big(V_{SN_{rt}} + E_{SN_{rt}}\big)\Big),$$

where RT(ac) is the set of relationship types specified in the access condition ac. More precisely, |RT(ac)| = |RT_{SN}| if the rt component of ac is set to ∗; |RT(ac)| = 1 otherwise. Thus, since an access rule consists of one or more access conditions, evaluating an access rule ar requires

$$O\Big(\sum_{ac \in AC(ar)} \sum_{rt \in RT(ac)} \big(V_{SN_{rt}} + E_{SN_{rt}}\big)\Big)$$

time complexity.
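The pruning discussed above (a single relationship type, a bound on the depth, and termination at the first level where the target is reached) can be illustrated with a short Python sketch. This is not the BFS() function of the prototype: the function name, the representation of certificates as plain `(source, target, type, trust)` tuples without timestamps and signatures, and the on-the-fly construction of the adjacency index are all assumptions made for illustration.

```python
from collections import defaultdict


def shortest_paths(certs, requestor, inode, rt, d_max=None):
    """Return all shortest certificate paths inode -> ... -> requestor of type rt.

    The search runs backwards from the requestor and visits only edges
    labeled rt, i.e., it explores the subgraph SN_rt rather than SN.
    """
    # Index of the certificates of type rt by terminal node (hypothetical
    # stand-in for a precomputed adjacency list).
    incoming = defaultdict(list)
    for cert in certs:
        if cert[2] == rt:
            incoming[cert[1]].append(cert)

    frontier = {requestor: [()]}  # node -> partial paths from node to requestor
    visited = {requestor}
    depth = 0
    while frontier and (d_max is None or depth < d_max):
        depth += 1
        reached = defaultdict(list)
        for node, suffixes in frontier.items():
            for cert in incoming.get(node, []):
                src = cert[0]
                if src in visited:  # reached earlier, hence by a shorter path
                    continue
                reached[src].extend((cert,) + suffix for suffix in suffixes)
        if inode in reached:        # every shortest path ends at this level
            return reached[inode]
        visited.update(reached)
        frontier = reached
    return []


# Certificates of Example 7.2 (timestamps and signatures omitted).
rc1 = ("A", "B", "friendOf", 0.2)
rc2 = ("B", "D", "friendOf", 0.8)
rc3 = ("A", "C", "friendOf", 0.6)
rc4 = ("C", "D", "friendOf", 0.7)
rc5 = ("C", "D", "colleagueOf", 0.4)
certs = [rc1, rc2, rc3, rc4, rc5]

print(shortest_paths(certs, "D", "A", "friendOf", d_max=3) == [(rc1, rc2), (rc3, rc4)])  # True
print(shortest_paths(certs, "D", "A", "friendOf", d_max=1))                              # []
```

Nodes reached at the same level are added to `visited` only after the level has been processed, so that all shortest paths through different intermediate nodes are retained.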

Let us now introduce how certificate path discovery is carried out in our system. This task is performed by Algorithm 1, which receives as input the identifier of the requesting node 𝑅 and a set of modified conditions A̅C̅(ar) (AC, for short), referring to an access rule ar, and returns a data structure 𝒞𝒫. 𝒞𝒫 is a bi-dimensional array where each element is in turn an array containing the set of shortest certificate paths denoting relationships of 𝑅 that satisfy a condition in AC. More precisely, 𝒞𝒫[𝑖], 1 ≤ 𝑖 ≤ |𝒞𝒫|, is an array where each element 𝒞𝒫[𝑖][𝑗], 1 ≤ 𝑗 ≤ |𝒞𝒫[𝑖]|, contains the set of shortest certificate paths having the same initial and terminal node and consisting of edges all labeled with the same relationship type, which correspond to a given relationship satisfying the 𝑖th condition in AC. Finally, each certificate path in 𝒞𝒫[𝑖][𝑗] is a tuple of length 𝑛, where 𝑛 is the depth of the relationship denoted by 𝒞𝒫[𝑖][𝑗]. The reason why we model 𝒞𝒫 as a bi-dimensional array is to ease the computation of relationship assertions for ∗-conditions, as will be explained in Section 7.4.

Algorithm 1 Certificate Path Discovery
 1: function DiscoverShortestPaths(𝑅, AC)
 2:   Array 𝒞𝒫 is initialized to be empty
 3:   Let RT ⊆ RT_{SN} be the set of relationship types specified in AC
 4:   Let termRT be the relationship types associated with edges entering in 𝑅
 5:   if RT ⊈ termRT then
 6:     return failure
 7:   else
 8:     𝑖 ← 0
 9:     for all ac ∈ AC do
10:       𝑖 ← 𝑖 + 1
11:       𝑗 ← 0
12:       Array 𝒞𝒫[𝑖] and variable Paths are initialized to be empty
13:       switch
14:         case inode(ac) = ∗ ∧ rt(ac) = ∗
15:           for rt ∈ termRT do
16:             Paths ← BFS(𝑅, ∗, Adj_{rt}, d_max(ac))
17:             while |Paths| > 0 do
18:               𝑗 ← 𝑗 + 1
19:               Let path be a path in Paths
20:               𝒞𝒫[𝑖][𝑗] ← ExtractSimilarPaths(path, Paths)
21:               Remove from Paths the paths in the elements of 𝒞𝒫[𝑖][𝑗]
22:             end while
23:           end for
24:         end case
25:         case inode(ac) = ∗ ∧ rt(ac) ≠ ∗
26:           if rt(ac) ∈ termRT then
27:             Paths ← BFS(𝑅, ∗, Adj_{rt(ac)}, d_max(ac))
28:             while |Paths| > 0 do
29:               𝑗 ← 𝑗 + 1
30:               Let path be a path in Paths
31:               𝒞𝒫[𝑖][𝑗] ← ExtractSimilarPaths(path, Paths)
32:               Remove from Paths the paths in the elements of 𝒞𝒫[𝑖][𝑗]
33:             end while
34:           end if
35:         end case
36:         case inode(ac) ≠ ∗ ∧ rt(ac) = ∗
37:           Let initRT be the relationship types associated with edges exiting from inode(ac)
38:           for rt ∈ termRT ∩ initRT do
39:             𝑗 ← 𝑗 + 1
40:             𝒞𝒫[𝑖][𝑗] ← BFS(𝑅, inode(ac), Adj_{rt}, d_max(ac))
41:           end for
42:         end case
43:         default case
44:           Let initRT be the relationship types associated with edges exiting from inode(ac)
45:           if rt(ac) ∈ termRT ∩ initRT then
46:             𝑗 ← 𝑗 + 1
47:             𝒞𝒫[𝑖][𝑗] ← BFS(𝑅, inode(ac), Adj_{rt(ac)}, d_max(ac))
48:           end if
49:         end default case
50:       end switch
51:       if |𝒞𝒫[𝑖]| = 0 then
52:         return failure
53:       end if
54:     end for
55:     return 𝒞𝒫
56:   end if
57: end function

The algorithm starts by setting 𝒞𝒫 to be empty and initializing the variables used in the subsequent steps, namely, the set RT of relationship types specified in AC and the set termRT of relationship types associated with edges entering 𝑅 (lines 2-4). Then, it applies a preliminary check to determine whether the input access conditions cannot be satisfied by the requesting node (lines 5-6). If this is not the case, each access condition ac ∈ AC is iteratively considered (lines 9-54). This implies first initializing array 𝒞𝒫[𝑖] and a temporary variable, Paths (line 12). The latter will contain the discovered certificate paths, if any, satisfying the current access condition, having the same initial and terminal nodes, and consisting of edges all labeled with the same relationship type. Such certificate paths will then be permanently stored in the elements of array 𝒞𝒫[𝑖]. The algorithm then selects the most efficient search procedure for each access condition ac ∈ AC (lines 13-50). In particular, if inode(ac) = ∗ and rt(ac) = ∗, the search is enforced by lines 14-24, whereas, if inode(ac) = ∗ and rt(ac) ≠ ∗,
the search is implemented by lines 25-35. In contrast, lines 36-42 address the case when inode(ac) ≠ ∗ but rt(ac) = ∗. Finally, when inode(ac) ≠ ∗ and rt(ac) ≠ ∗, the procedure is enforced by lines 43-49. In all these cases, certificate path discovery is performed by the BFS() function, which receives as input the pair of nodes 𝑅 and inode(ac), the maximum depth specified in the access condition, and the adjacency list Adj_{rt} (denoted Adj_{rt(ac)}, in case rt(ac) ≠ ∗), which associates with each node 𝑣 ∈ 𝑉_{SN} the set of certificates denoting relationships of type rt (rt(ac)) where 𝑣 participates as terminal node. This function implements a standard BFS algorithm, modified in order to end the search as soon as the maximum depth is reached or all the shortest paths have been found. In case the inode(ac) parameter is set to ∗, the BFS() function searches for all the shortest paths connecting 𝑅 with any other node in the network. Finally, in case d_max(ac) = ∗, there is no limit on the depth of the search.

Let us now see each case in detail. When inode(ac) = ∗ and rt(ac) = ∗ (lines 14-24), the BFS() function is iterated for all the relationship types in termRT. In contrast, in case inode(ac) = ∗ and rt(ac) ≠ ∗, the BFS() function is executed only on the relationship type rt(ac) (lines 25-35). It is important to note that, in both these cases, the BFS() function returns certificate paths denoting relationships of 𝑅 with different nodes in the network (since inode(ac) = ∗). For example, in case inode(ac) = ∗ and rt(ac) ≠ ∗, the BFS() function returns all the shortest certificate paths connecting 𝑅 with any other node in the network, and having edges all labeled with relationship type rt(ac). In these cases, in order to compute the trust level of such relationships, it is necessary to consider separately the corresponding sets of discovered certificate paths. For this reason, the paths returned by the BFS() function are processed by function ExtractSimilarPaths() (lines 20, 31), which selects those connecting the same pair of nodes, that is, those denoting the same relationship. Such paths are then stored into a distinct element of array 𝒞𝒫[𝑖].

When inode(ac) ≠ ∗, we have two different cases. If rt(ac) = ∗, the BFS() function is executed only on the relationship types for which a relationship between 𝑅 and inode(ac) could exist. These types are given by the intersection of initRT and termRT (lines 36-42), where initRT denotes the set of relationship types associated with edges exiting from inode(ac). In case rt(ac) ≠ ∗, the BFS() function is executed only for relationship type rt(ac) (lines 43-49). Note that, if no certificate paths are found, the current access condition is not satisfied. In this case, since all the access conditions in AC must be satisfied, the algorithm ends and returns a failure (line 52). Otherwise, the process is iterated on the next access condition.

Example 7.3. Consider the access control protocol illustrated in Example 7.2, according to which, in step 3, the certificate server CS receives from David a message 𝐸_{PK_CS}(𝐸_{SK_D}(A̅C̅(ar1), n1)), where A̅C̅(ar1) = {(𝐴, friendOf, 3, null)}. To reply to this request, CS has to explore the WBSN in Figure 1 in order to find the shortest paths, if any, between Alice and David, of type friendOf and maximum length equal to 3. Once it has verified that there exist an edge entering node 𝐷 and an edge exiting from node 𝐴, both labeled with relationship type friendOf, CS calls the BFS() function, and starts the search by considering David's neighbors with
respect to the relationship type friendOf, i.e., Bob, Carl, Fred, and Greg. Since Alice is not one of David's neighbors, the BFS() function now considers, in turn, the neighbors of Bob, Carl, Fred, and Greg. The BFS() function discovers that Alice is a friend of Bob, and thus that there exists a path of length 2 between Alice and David. Since a path of length less than 3 has been found, the BFS() function verifies only whether other paths of length 2 exist, that is, whether other shortest paths exist. Then, the BFS() function considers David's next neighbor, i.e., Carl, and discovers that Alice is a friend of Carl too. In contrast, Fred has just one neighbor, Carl, whereas Greg's only neighbor is Fred. Consequently, the only paths satisfying A̅C̅(ar1) are 𝐴𝐵𝐷 and 𝐴𝐶𝐷, corresponding to the certificate paths (rc1, rc2) and (rc3, rc4) (see Example 7.2). Then, 𝒞𝒫 = [[{(rc1, rc2), (rc3, rc4)}]].

7.4 Proof Computation

Once it has received from CS all the requested certificate paths, 𝑅 has to generate the corresponding assertions and then compute the proof 𝜋 (step 5 of the protocol in Figure 3). The former task is performed by Algorithm 2, which receives as input the bi-dimensional array 𝒞𝒫 returned by Algorithm 1, and returns a set of assertions 𝑅𝐴, corresponding to the relationships denoted by the paths stored in the elements of 𝒞𝒫. For each element 𝒞𝒫[𝑖] of 𝒞𝒫, with 1 ≤ 𝑖 ≤ |𝒞𝒫|, the algorithm iteratively considers each set 𝒞𝒫[𝑖][𝑗] of certificate paths, with 1 ≤ 𝑗 ≤ |𝒞𝒫[𝑖]|. In particular, if the current set 𝒞𝒫[𝑖][𝑗] contains a single path, which in turn consists of a single certificate rc, the algorithm sets the trust level of the relationship to the trust level in rc (lines 7-13). Otherwise, the algorithm computes the relationship trust level by using formula 1 in Section 2.3.2 (lines 15-42). This implies first computing the trust threshold max_t, and removing from the current set of paths 𝒞𝒫[𝑖][𝑗] those having a strength less than max_t (lines 16-22). Then (line 23), the certificates are grouped based on their position in the paths in 𝒞𝒫[𝑖][𝑗] (i.e., RC_𝑘 denotes the set of certificates at the 𝑘th position in the paths in 𝒞𝒫[𝑖][𝑗]). Moreover, the initial node inode and the terminal node tnode of the path currently considered by the algorithm are determined (lines 24-26). Finally, the algorithm (lines 27-38) computes the trust value of the relationship between nodes inode and tnode by recursively applying formula 1 in Section 2.3.2 to the sets RC_1, ..., RC_𝑛 of certificates computed before. The last part of the algorithm (lines 39-42) is in charge of verifying whether the current set 𝒞𝒫[𝑖][𝑗] of certificate paths is to be used to generate a relationship assertion. This check is carried out in order to obtain a single relationship assertion for each element 𝒞𝒫[𝑖] of 𝒞𝒫.

Algorithm 2 Assertion Generation
 1: function GenerateAssertions(𝒞𝒫)
 2:   𝑅𝐴 ← ∅  ⊳ The set of assertions is initialized to be empty
 3:   for 𝑖 = 1, |𝒞𝒫| do
 4:     trust ← 0
 5:     for 𝑗 = 1, |𝒞𝒫[𝑖]| do
 6:       flag ← 0
 7:       if |𝒞𝒫[𝑖][𝑗]| = 1 then
 8:         Let path be the only path in 𝒞𝒫[𝑖][𝑗]
 9:         if path consists of a single certificate rc then
10:           𝑟𝑎 ← (inode(rc), tnode(rc), rt(rc), 1, t(rc))
11:           flag ← 1
12:         end if
13:       end if
14:       if flag = 0 then
15:         Let path be a certificate path in 𝒞𝒫[𝑖][𝑗]
16:         depth ← |path|  ⊳ The relationship depth
17:         PathsStrength ← ∅
18:         for all cp ∈ 𝒞𝒫[𝑖][𝑗] do
19:           Compute the strength of certificate path cp and add it to PathsStrength
20:         end for
21:         Let max_t ← max(PathsStrength) be the trust threshold
22:         Remove from 𝒞𝒫[𝑖][𝑗] the paths with a strength less than max_t
23:         Let RC_𝑘 be the set of certificates at the 𝑘th position in the paths in 𝒞𝒫[𝑖][𝑗]
24:         Let firstcert and lastcert be, respectively, the first and last certificates in path
25:         inode ← inode(firstcert)  ⊳ The initial node of the relationship
26:         tnode ← tnode(lastcert)  ⊳ The terminal node of the relationship
27:         𝑘 ← 1
28:         while 𝑘 < depth do
29:           𝑉 ← {𝑣 ∈ 𝑉_{SN} | ∃ rc ∈ RC_𝑘 such that 𝑣 = inode(rc)}
30:           for all 𝑣 ∈ 𝑉 do
31:             Let 𝑁 be the set of neighbors of 𝑣, based on the certificates in RC_𝑘
32:             Let 𝑆 be the set of neighbors of the nodes in 𝑁, based on RC_{𝑘+1}
33:             for all 𝑠 ∈ 𝑆 do
34:               t_{inode,𝑠} ← (Σ_{𝑢∈𝑁} t_{inode,𝑢} · t_{𝑢,𝑠}) / (Σ_{𝑢∈𝑁} t_{inode,𝑢})
35:             end for
36:           end for
37:           𝑘 ← 𝑘 + 1
38:         end while
39:         trust ← max(trust, t_{inode,tnode})
40:         if trust = t_{inode,tnode} then
41:           𝑟𝑎 ← (inode, tnode, rt(firstcert), depth, trust)
42:         end if
43:       end if
44:     end for
45:     Add 𝑟𝑎 to 𝑅𝐴
46:   end for
47:   return 𝑅𝐴
48: end function

Recall that each element 𝒞𝒫[𝑖] stores the certificate paths satisfying a given modified access condition ac, grouped based on the relationship they denote. When none of the components of a modified access condition ac is set to ∗, all the certificate paths returned by Algorithm 1 have the same initial and terminal nodes, and they consist of edges all labeled with the same relationship type, satisfying ac. Consequently, 𝒞𝒫[𝑖] will have just a single element, and a single assertion will be generated based on it. However, in case inode(ac) = ∗, the returned paths denote relationships of the same type existing between the requestor and any other node in the network, all satisfying ac. Moreover, if rt(ac) = ∗, the returned paths denote relationships between the same pair of nodes, but of different type, all satisfying ac. Finally, if inode(ac) = ∗ and rt(ac) = ∗, the returned paths denote relationships of any type existing between the requestor and any other node in the network, all satisfying ac. This means that, in case of ∗-conditions, the number of elements of 𝒞𝒫[𝑖], and hence of the generated relationship assertions, will be equal to |𝑉_{SN}| × |RT_{SN}| in the worst case. Since all such relationship assertions satisfy ac, it would be quite inefficient to process them all. Indeed, we need only one of them. In order to address this issue, we discard all the returned sets of certificate paths, except one of those having the highest trust level, and then we compute the relationship assertion based only on it (lines 40-42).
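The trust propagation of lines 27-38 of Algorithm 2 and the selection of lines 39-42 can be sketched in Python as follows. This is an illustration under stated assumptions, not the prototype code: formula 1 of Section 2.3.2 is not reproduced in this section, so the definition of path strength used below (the weakest trust value before the final edge, with a single-certificate path having the strength of its only certificate) is an assumption, and all names are hypothetical. Certificates are simplified to `(src, dst, rt, trust)` records.

```python
from collections import defaultdict
from typing import NamedTuple, Optional, Sequence, Tuple


class Cert(NamedTuple):
    src: str
    dst: str
    rt: str
    trust: float


Path = Tuple[Cert, ...]


def strength(path: Path) -> float:
    # ASSUMPTION: weakest trust value along the path, final edge excluded.
    head = path[:-1] or path
    return min(c.trust for c in head)


def relationship_trust(paths: Sequence[Path]) -> float:
    """Trust of the relationship denoted by a set of equally long paths."""
    threshold = max(strength(p) for p in paths)           # lines 16-21
    kept = [p for p in paths if strength(p) >= threshold]  # line 22
    depth, tnode = len(kept[0]), kept[0][-1].dst
    # trust[v] = trust of inode in v; starts with inode's direct neighbors.
    trust = {p[0].dst: p[0].trust for p in kept}
    for k in range(1, depth):                              # lines 27-38
        num, den = defaultdict(float), defaultdict(float)
        for cert in {p[k] for p in kept}:                  # certificates at position k+1
            num[cert.dst] += trust[cert.src] * cert.trust
            den[cert.dst] += trust[cert.src]
        trust = {s: (num[s] / den[s] if den[s] else 0.0) for s in num}
    return trust[tnode]


def generate_assertion(groups: Sequence[Sequence[Path]]) -> Optional[tuple]:
    """One assertion per condition: keep a group with the highest trust."""
    best = None
    for paths in groups:
        first = paths[0]
        ra = (first[0].src, first[-1].dst, first[0].rt, len(first),
              relationship_trust(paths))
        if best is None or ra[4] > best[4]:
            best = ra
    return best


# Certificate paths of Example 7.2: ABD and ACD, both of type friendOf.
abd = (Cert("A", "B", "friendOf", 0.2), Cert("B", "D", "friendOf", 0.8))
acd = (Cert("A", "C", "friendOf", 0.6), Cert("C", "D", "friendOf", 0.7))
inode, tnode, rt, depth, t = generate_assertion([[abd, acd]])
print(inode, tnode, rt, depth, round(t, 2))  # A D friendOf 2 0.7
```

Under the assumed strength definition, path 𝐴𝐵𝐷 falls below the threshold and is discarded, so the sketch reproduces the assertion (𝐴, 𝐷, friendOf, 2, 0.7) of Example 7.2. When several groups share the highest trust, the sketch keeps the first one.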


7.5 Proof Verification

(1) The resource owner 𝑂 verifies whether the received nonce is valid. 𝑂 maintains an access request list ARL where, for each access request, a set of tuples is stored, one for each access rule protecting the requested resource. Each tuple has the form (𝑅, ar, n), where 𝑅 is the requesting node, ar is one of the access rules applying to the requested resource, whereas n is the nonce value associated with access rule ar by 𝑂 (see step 2 of the protocol in Figure 3). When 𝑂 receives from 𝑅 the message 𝐸_{PK_O}(𝐸_{SK_R}(rid, 𝜋, 𝐸_{SK_CS}(𝒞𝒫, n))) at step 5 of the protocol, he/she first decrypts the pair (𝒞𝒫, n) with the public key PK_CS of CS, and then he/she compares the nonce value n with the one currently stored into ARL. If (𝒞𝒫, n) cannot be decrypted and/or a tuple (𝑅, ar, n) is not present in ARL, 𝑂 denies the access to the resource; otherwise,
(2) 𝑂 checks whether the proof sent by 𝑅 is correct; if the proof is not correct, 𝑂 removes the tuple (𝑅, ar, n) from ARL and denies the access to the resource; otherwise,
(3) 𝑂 can choose between two options: (a) going directly to step 4, or (b) deriving from the certificate paths in 𝒞𝒫 the corresponding set of assertions, by using the same procedure described in Section 7.4. In the latter case, if the assertions computed by 𝑂 are different from those used in the proof delivered by 𝑅, 𝑂 removes the tuple (𝑅, ar, n) from ARL and denies the access to the resource*; otherwise,
(4) 𝑂 removes the tuple (𝑅, ar, n) from ARL and grants the access to the resource.
* Note that 𝑂 might also verify the validity of the certificates discovered by CS by contacting other certificate servers or other nodes in the network.

Fig. 4. Proof verification steps
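The four verification steps of Figure 4 amount to a small amount of bookkeeping on the access request list ARL, which the following Python sketch illustrates. It is not the prototype code: decryption with PK_CS and the proof check performed by the reasoner are abstracted away, a nonce of `None` stands for a pair (𝒞𝒫, n) that could not be decrypted, and the function and parameter names are hypothetical.

```python
from typing import Callable, Optional, Set, Tuple

Entry = Tuple[str, str, str]  # (requestor R, access rule ar, nonce n)


def verify_access(
    arl: Set[Entry],
    requestor: str,
    nonce: Optional[str],
    proof_is_correct: Callable[[str], bool],
    assertions_match: Optional[Callable[[], bool]] = None,
) -> bool:
    """Return True if access is granted; ARL is updated as in Figure 4."""
    # Step 1: the nonce must be readable and refer to a pending tuple (R, ar, n).
    entry = next((e for e in arl if e[0] == requestor and e[2] == nonce), None)
    if nonce is None or entry is None:
        return False
    # Step 2: the proof must be correct for the rule bound to the nonce.
    if not proof_is_correct(entry[1]):
        arl.discard(entry)
        return False
    # Step 3(b), optional: recompute the assertions from the certificate paths.
    if assertions_match is not None and not assertions_match():
        arl.discard(entry)
        return False
    # Step 4: the tuple is consumed and access is granted.
    arl.discard(entry)
    return True


# David holds two pending tuples, one for each rule protecting the resource.
arl = {("D", "ar1", "n1"), ("D", "ar2", "n2")}
print(verify_access(arl, "D", "n2", lambda rule: rule == "ar2"))  # True
print(verify_access(arl, "D", "n2", lambda rule: True))           # False
```

The second call fails because the tuple bound to n2 has already been removed, which is how a replayed proof is rejected.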

In the last step of the access control protocol (i.e., step 6 in Figure 3), the resource owner 𝑂 verifies the validity of the proof returned by the requestor and then decides whether to grant or deny access to the resource. The four steps carrying out proof verification are illustrated in Figure 4.

Example 7.4. Consider the access control request illustrated in Example 7.2. In step 7 of that example, Alice receives from David the message 𝐸_{PK_A}(𝐸_{SK_D}(rid, 𝜋, 𝐸_{SK_CS}({(rc5)}, n2))). According to the steps described in Figure 4, Alice first tries to decrypt 𝐸_{SK_CS}({(rc5)}, n2). Then, she verifies whether there exists in her access request list ARL a tuple (𝐷, ar2, n2). Since Alice is able to decrypt 𝐸_{SK_CS}({(rc5)}, n2) and the required tuple is found in ARL, she then verifies the correctness of proof 𝜋 by running the Cwm reasoner. Once she has verified that proof 𝜋 is correct, Alice may either remove the tuple (𝐷, ar2, n2) from ARL and directly send David the requested resource or, alternatively, perform a further check on the certificate paths delivered by David. In the latter case, from {(rc5)}, Alice computes the corresponding assertion and compares it to the one used in the proof. Since the two assertions match, Alice removes (𝐷, ar2, n2) from ARL and grants access to the resource.

8. SECURITY ANALYSIS

The potential attacks to which our system may be subject concern both the certificate server CS and resource owners, and can be grouped into two classes, depending on their purpose, namely, attacks aiming at gaining unauthorized access
to resources, and denial of service attacks.

Access to a resource can be gained only if a requestor is able to provide the resource owner with a set of information (namely, a proof, a set of certificate paths, and a nonce value) demonstrating that he/she satisfies a given access rule. The simplest attack consists in forging such information. Consider, for instance, a node 𝑅 requesting a resource rsc from a node 𝑂. 𝑅 receives the set of access rules ar1, ..., ar𝑛 applying to rsc along with the corresponding nonce values n1, ..., n𝑛. 𝑅 can choose one of these access rules (say ar1), forge an appropriate proof for it and a corresponding data structure 𝒞𝒫, and then send such information along with the nonce value n1 to 𝑂. This attack is prevented by our access control protocol (see steps 4 and 5 in Figure 3) by keeping the pair (𝒞𝒫, n1) encrypted with the private key SK_CS of the certificate server.

An alternative attack consists in uploading to CS fake certificates, generated in such a way as to obtain a (set of) certificate path(s) satisfying a given access rule. However, this attack is prevented by our system, since a certificate is signed by both the nodes establishing the relationship, and thus it is not possible to certify fake relationships (see Section 6).

A similar attack can be performed by forging only the proof. Suppose that 𝑅 must prove to have a relationship with node 𝑣, and that 𝑅 obtains from CS a set of certificate paths demonstrating that he/she actually participates with 𝑣 in a relationship of the required type and depth. Yet, when computing the proof, 𝑅 realizes that he/she does not satisfy the constraint on the minimum trust level. In such a case, 𝑅 may decide to forge the proof and send it to the resource owner 𝑂, along with the encrypted pair (𝒞𝒫, n) sent by CS. The nonce is then valid and the proof seems to be correct. However, 𝑂 can realize that 𝑅 is not authorized to access the requested resource by checking the proof against 𝒞𝒫 (such operation is enforced as an option by the steps in Figure 4). The idea underlying this strategy is to enable the owner to customize the required security guarantees by taking into account the profile of the requestor. For instance, the choice of whether or not to check the proof against the set of received certificate paths can be based on the results of previous requests submitted by the same node.

A different type of attack can be performed by impersonating an authorized node in the network. For example, suppose that 𝑅 knows that a given node 𝐴 is authorized to access a resource rsc owned by 𝑂. In such a case, 𝑅 may contact 𝑂 and CS, claiming to be 𝐴, and thus retrieve the appropriate information to gain access to rsc. The same attack can be performed by eavesdropping on the messages exchanged between 𝐴 and 𝑂 related to an access request to resource rsc. 𝑅 can intercept the message with the proof and send it to 𝑂, claiming to be 𝐴. However, our access control protocol prevents both identity theft and man-in-the-middle attacks by requiring that all the messages are encrypted with the private key of the sender and the public key of the receiver.

Finally, timing and replay attacks may also be performed in order to gain access to a resource. As an example, suppose that, at a given instant ts, 𝑅 requests to access resource rsc, and that he/she is authorized according to the corresponding rules. Suppose now that, at a given instant ts′ > ts, one or more certificates concerning 𝑅 have been revoked, and that, as a consequence, 𝑅 is no longer authorized to access resource rsc. In such a case, 𝑅 could still send 𝑂 the previous proof and the
corresponding certificate paths received from CS at instant ts, thus gaining access to rsc. In order to prevent this attack, in our access control protocol nonce values are associated with the access rules sent by 𝑂, and they are also returned by CS along with the corresponding set of certificate paths (see steps 3-5 of the access control protocol in Figure 3). Each nonce value identifies a given access rule with respect to a given session, and it cannot be modified by 𝑅, since it is encrypted with the private key of CS (step 5). Thanks to this, it is possible to discard proofs when the received nonce value differs from that of the corresponding access rule.

Denial of service attacks concern mainly the certificate server, but resource owners may also be subject to them. The vulnerable service of the certificate server is the one in charge of discovering certificate paths. A node may maliciously submit a high number of requests that are computationally expensive to manage. A typical example is a request for certificate paths concerning two nodes not connected by any relationship, which forces CS to explore the whole network graph. To address this issue, we can adopt standard strategies used by online systems in order to reduce the risk of denial of service attacks. As an example, we can set an upper bound on the number of requests to be accepted and fix a maximum timeout for their evaluation, which may vary depending on the system workload. A similar approach can be adopted in order to avoid denial of service attacks on the side of resource owners, which can be overloaded by access requests or invalid proofs. Such attacks may also be prevented by allowing resource owners to track these kinds of behavior and to maintain a list of malicious nodes, to be used in order to refuse their requests a priori.

9. SYSTEM IMPLEMENTATION

Figure 5 depicts the four main components of the system implementing our access control model: the certificate server CS, the SNMS, a peripheral node, corresponding to a node in the network, and the system interface. The certificate server, the SNMS, and the peripheral nodes are implemented as Web services, whereas the system interface is provided as an extension to the Mozilla Firefox browser, which users can download and install after registering with the network. All these applications communicate by using the HTTPS protocol. Finally, the system makes use of OpenID (http://www.openid.net) as authentication framework, which has the advantage of simplifying the registration and authentication procedures by allowing users to log in to different WBSNs with a single user ID and password.

Fig. 5. System architecture

To store relationship certificates and the users’ data needed for authentication, the certificate server and the SNMS make use of the PostgreSQL relational DBMS. In contrast, peripheral nodes store the data concerning relationships, access rules, and resources in RDF files. Our prototype supports 34 relationship types in total: the 33 defined in the RELATIONSHIP vocabulary [Davis and Vitiello Jr 2005] plus the FOAF knows property [Brickley and Miller 2007]. The prototype system has been implemented in a WBSN called ACSoNet.

System services can be accessed through the system interface, a client application that is typically run on end users’ machines. Through the system interface it is possible to generate, update, and revoke certificates and access rules, as well as to submit access requests to the nodes in the network, and to receive and deliver rules and proofs. When activated, the system interface displays a window consisting of a toolbar and a set of tabs, allowing users to manage their personal profile, their contacts, and their resources, and to browse the resources shared in the network and the list of registered users. Figure 6 depicts a screenshot of the system interface, namely the “My resources” tab, which displays information about the resources owned by the user, along with the associated access rules and corresponding conditions. The user can also decide whether the existence of a given resource should be visible to network participants. This is achieved by setting the “Visible” property of the resource. In addition, access to a resource can be temporarily blocked by using the “Locked” option.

10. SYSTEM PERFORMANCE

In this section, we discuss the performance of the implemented prototype in terms of the time required to evaluate an access request. The main tasks affecting the performance of access control enforcement are: (1) certificate path discovery, performed by CS; (2) assertion generation, performed by 𝑅 and 𝑂; (3) proof generation, performed by 𝑅 and 𝑂. In what follows, we evaluate the performance of each task.

10.1 Certificate Path Discovery

The cost of this task is the time required by Algorithm 1. The algorithm exploits the BFS() function (see Algorithm 1) to explore the social network graph and discover the certificate paths satisfying a given access condition. The BFS() function is iterated, for each access condition in the rule, on the set RT of relationship types to be taken into account. We consider the algorithm’s performance in the worst case, i.e., when each access condition in AC(ar) has the inode, rt, and dmax components set to ∗. Moreover, we assume that RT = RT SN (which implies that, for each different relationship type in RT SN, the algorithm searches for a different set of certificate paths) and that it is necessary to explore, in the worst case, the whole subgraph to verify whether or not a relationship exists between two nodes. Since our prototype supports the 33 relationship types defined in the RELATIONSHIP vocabulary, plus the FOAF knows property, in the worst case the search is iterated at most 34 times for each access condition.

Fig. 6. The “My resources” section of the system interface

To estimate the time complexity in real-world scenarios, we performed several experiments on the BFS() function, varying the order of the subgraph SN rt as well as the indegree of the nodes in SN rt. The experiments were conducted on a 3.60GHz Dual-Core Intel Xeon GNU/Linux machine with 4GB RAM. As reported in Figure 7, we considered subgraphs consisting of a number of nodes ranging from 100 to 6,000. According to the obtained results, exploring a graph of order 100 requires at most ∼0.001 sec, one of order 600 at most ∼0.01 sec, one of order 2,000 ∼0.1 sec, and one of order 6,000 ∼1 sec. Recall that this size does not refer to the order of the whole social network graph; rather, given a relationship type, it represents the number of nodes having at least one relationship of that type.

The number of nodes participating in a relationship of a given type depends both on the size and the purposes of the network graph and on the relationship type itself. Thus, it is important to outline which kinds of WBSNs are the target of our access control mechanism. We do not believe that the scenario that can benefit from our system is that of a general-purpose WBSN, that is, a WBSN set up with the aim of creating a place where a possibly huge number of people can ‘meet’. Indeed, in this kind of WBSN, the main users’ requirements are not related to security.
In contrast, a reference scenario for our system is a WBSN set up for creating a place where users with common goals and interests should be able to share some information, mainly for business or research purposes. An example of such a WBSN is one set up within an organization, where users are employees and relationship types correspond to the work relationships existing between them (e.g., “manager of”, “secretary of”, “colleague of”). Other examples are the social networks built up for managing dynamic businesses. This could be the case, for instance, of a social network defined for a Virtual Organization (VO), where users are the entities participating in the VO and relationship types are the dynamic collaborations established between them.

[Figure 7: plot of time in seconds (logarithmic 𝑦-axis, from 10^-4 to 10^0) against the indegree of the nodes in SN rt (𝑥-axis, from 2 to 100), with one curve for each of |SN rt| = 100, 600, 2,000, and 6,000.]

Fig. 7. Experimental results on the performance of the certificate path discovery procedure, with subgraphs of order ranging from 100 to 6,000 and considering the worst case. The 𝑦-axis uses a logarithmic scale.

For such reference scenarios, the performance presented in Figure 7 makes it feasible to adopt our system. We believe that in such kinds of social networks there exist several relationship types determining subgraphs of order ∼100. In contrast, we expect that a limited number of relationship types will involve 2,000 nodes, and even fewer will involve 6,000 nodes. Note however that, even in the last case, the time required to discover certificate paths is around 1 sec.

So far we have discussed the system performance when evaluating a single access condition. To obtain the performance of the overall procedure we must estimate the maximum number of access conditions in a rule, i.e., how many times the BFS() function is iterated. Theoretically, the maximum number of access conditions is ∣AC(ar)∣ = 34 × ∣𝑉SN∣. However, this limit is very unlikely to be reached in real situations, because it corresponds to an access rule that requires a relationship with each node in the graph for each supported relationship type. We believe that, in real situations, resource access requirements can always be specified in a more compact way, and that the number of access conditions is not greater than 10. Moreover, since our system supports 34 relationship types, we may have at most 34 assertions for each condition in a rule, when the rt component of such condition is set to ∗. In this case, the maximum number of iterations of the BFS() function is 10 × 34 = 340.

10.2 Assertion Generation

This task is performed by Algorithm 2, which computes a different assertion for each different access condition ac ∈ AC(ar). In particular, given a set CP of certificate paths satisfying ac𝑖 (i.e., the certificate paths contained in 𝒞𝒫[𝑖]), generating the corresponding assertion requires reading the set RC of certificates in CP. The complexity of this step is therefore linear in the number ∣RC∣ of certificates in CP.

The worst case occurs when the access condition ac𝑖 has the inode and rt components set to ∗, and RT = RT SN. According to Algorithm 1, in this case the BFS() function is called for each relationship type in RT SN, passing as input inode(ac) = ∗ (lines 14-24). More precisely, for each rt ∈ RT SN, the BFS() function searches for all the shortest paths connecting 𝑅 with any other node in the subgraph SN rt ⊆ SN. In the worst case, the length of the shortest certificate paths is equal to the diameter $\mathit{diam}_{SN_{rt}}$ of subgraph SN rt, with

\[ 1 \le \mathit{diam}_{SN_{rt}} \le |V_{SN_{rt}}| - 1, \]

whereas the total number of discovered shortest certificate paths corresponds to the binomial coefficient

\[ \binom{|V_{SN_{rt}}| - 1}{\mathit{diam}_{SN_{rt}}}, \]

which denotes all the possible combinations without repetition of class $\mathit{diam}_{SN_{rt}}$ of a given node (i.e., the requestor 𝑅) with any other node in SN rt. Consequently, since BFS() is iterated on each relationship type in RT SN, the total number ∣RC∣ of certificates in CP is

\[ |RC| = \sum_{rt \in RT_{SN}} \mathit{diam}_{SN_{rt}} \binom{|V_{SN_{rt}}| - 1}{\mathit{diam}_{SN_{rt}}} \approx \mathit{diam}_{SN} \binom{|V_{SN}| - 1}{\mathit{diam}_{SN}}. \]

This estimate can however be refined by taking into account the topological characteristics of typical social networks. Indeed, since social networks are small world networks [Watts 2003], they are characterized by a small diameter, which grows logarithmically in the size of the graph (see, e.g., [Watts 2003; Kleinberg 2000; Martel and Nguyen 2004] for a discussion on this topic). Based on this assumption, we can then estimate

\[ |RC| \approx \lceil \log |V_{SN}| \rceil \binom{|V_{SN}| - 1}{\lceil \log |V_{SN}| \rceil}. \]
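To get a feel for how quickly the small-world estimate above grows, it can be evaluated numerically. The following is a minimal Python sketch, not part of the prototype: the function name `rc_estimate` is hypothetical, and the logarithm is assumed to be in base 2, since the text does not state the base.

```python
import math


def rc_estimate(num_nodes: int) -> int:
    """Evaluate ceil(log |V|) * C(|V| - 1, ceil(log |V|)).

    Assumption: the logarithm is taken in base 2 (the base is not
    stated in the text, so this is only one possible reading).
    """
    if num_nodes < 2:
        raise ValueError("the graph needs at least two nodes")
    # Estimated diameter of a small-world graph with num_nodes nodes.
    diameter = math.ceil(math.log2(num_nodes))
    # Worst-case number of shortest paths, each contributing `diameter` certificates.
    return diameter * math.comb(num_nodes - 1, diameter)


if __name__ == "__main__":
    for n in (16, 100, 600):
        print(n, rc_estimate(n))
```

Even for small graphs the value grows very quickly, which is why this bound is best read as a theoretical worst case rather than as a prediction of the number of certificates met in practice.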

Finally, since an assertion must be generated for each access condition in AC(ar), the overall time complexity of Algorithm 2 is given by ∣AC(ar)∣ × ∣RC∣. However, as discussed in Section 10.1, we believe that in real world scenarios the number of access conditions is not greater than 10, and, as such, we can estimate the time complexity in the worst case as

\[ 10 \, \lceil \log |V_{SN}| \rceil \binom{|V_{SN}| - 1}{\lceil \log |V_{SN}| \rceil}, \quad \text{i.e.,} \quad O\!\left( \lceil \log |V_{SN}| \rceil \binom{|V_{SN}| - 1}{\lceil \log |V_{SN}| \rceil} \right). \]

In order to have a more realistic estimate of the time required by this task, we performed some experiments. In particular, since assertion generation is carried out client-side, we conducted these experiments on a desktop PC (1.5GHz Intel Pentium M Windows, with 768MB RAM). The obtained results show that, even with a very large number of certificates, this task requires only a few seconds, since it essentially consists of reading all of them. For instance, computing an assertion based on a set of 1,000 certificate paths, each of length 100 (i.e., 100,000 certificates), requires ∼0.2 sec.

10.3 Proof Generation

Proof generation is performed by the Cwm reasoner, which receives as input a set of assertions, along with the original access rule. Since Algorithm 2 returns exactly one relationship assertion per access condition, the performance of this task is affected only by the number of access conditions defined in the original access rule. Although, as discussed in Section 10.1, we expect to have at most 10 access conditions, we have carried out experiments by varying the number of access conditions from 1 to 30. According to our results (see Appendix C), the Cwm reasoner requires ∼1 sec to compute a proof based on 10 conditions, and ∼5 sec to evaluate 30 access conditions.
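As an illustration of how a client could hand the assertions and the access rule to Cwm, the following Python sketch wraps the command line recorded in the proof of Figure 9 (`cwm ex.n3 --think --why`). It is only a sketch under stated assumptions: the function names and the timeout value are hypothetical, a `cwm` executable is assumed to be on the PATH, and the input file is assumed to contain both the assertions and the rule.

```python
import subprocess
from typing import List


def build_cwm_command(rule_file: str) -> List[str]:
    """Return the argument list found in Figure 9, with the input file substituted."""
    return ["cwm", rule_file, "--think", "--why"]


def generate_proof(rule_file: str, timeout_sec: float = 30.0) -> str:
    """Run Cwm on rule_file and return what it prints on standard output.

    Assumptions: `cwm` is installed and on PATH; timeout_sec is an
    illustrative safety limit, not a value taken from the prototype.
    """
    result = subprocess.run(
        build_cwm_command(rule_file),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the output as text
        check=True,           # raise CalledProcessError on a non-zero exit code
        timeout=timeout_sec,
    )
    return result.stdout
```

Bounding the run time in this way is one simple option for keeping the client responsive if a rule with many conditions takes longer than expected.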

11. CONCLUSIONS AND FUTURE WORK

In this paper, we have proposed an access control model and related enforcement mechanism for WBSNs, which adopts a rule-based approach for specifying access control policies on the resources owned by network participants, and where authorized users are denoted in terms of the type, depth, and trust level of relationships. Differently from traditional access control systems, our mechanism makes use of a semi-decentralized architecture, where the information concerning users’ relationships is encoded into certificates, stored by a certificate server, whereas access control enforcement is carried out client-side.

We plan to extend our mechanism along several directions. A first extension concerns the support for a more expressive policy language, where, for instance, it is possible to express positive and negative access rules, content-based access control, as well as rules of the form “only my colleagues’ friends are authorized”. Moreover, we plan to perform a more extensive performance evaluation of our prototype, and to investigate further implementation strategies.

REFERENCES

Adomavicius, G. and Tuzhilin, A. 2005. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng. 17, 6 (June), 734–749.
Ali, B., Villegas, W., and Maheswaran, M. 2007. A trust based approach for protecting user data in social networks. In 2007 Conference of the Center for Advanced Studies on Collaborative research, CASCON 2007. ACM Press, 288–293.
Avesani, P., Massa, P., and Tiella, R. 2005. A trust-enhanced recommender system application: Moleskiing. In 2005 ACM Symposium on Applied Computing, SAC 2005. ACM Press, 1589–1593.
Berners-Lee, T., Connolly, D., Kagal, L., Scharf, Y., and Hendler, J. 2008. N3Logic: A logical framework for the World Wide Web. Theory Pract. Log. Program. 8, 3 (May), 249–269.
Berteau, S. 2007. Facebook’s misrepresentation of Beacon’s threat to privacy: Tracking users who opt out or are not logged in. CA Security Advisor Research Blog. Online: http://community.ca.com/blogs/securityadvisor/archive/2007/11/29/facebook-s-misrepresentation-of-beacon-s-threat-to-privacy-tracking-users-who-opt-out-or-are-not-logged-in.aspx.
Beth, T., Borcherding, M., and Klein, B. 1994. Valuation of trust in open networks. In Computer Security – ESORICS 94. LNCS, vol. 875. Springer, 3–18.
Blaze, M., Feigenbaum, J., Ioannidis, J., and Keromytis, A. D. 1999. The KeyNote trust-management system version 2. IETF RFC 2704, Internet Engineering Task Force. Sept. Online: http://www.ietf.org/rfc/rfc2704.txt.
Blaze, M., Feigenbaum, J., and Strauss, M. 1998. Compliance checking in the PolicyMaker trust management system. In 2nd International Conference on Financial Cryptography, FC 1998. LNCS, vol. 1465. Springer, 1439–1456.
Brickley, D. and Miller, L. 2007. FOAF vocabulary specification 0.91. Namespace Document. Nov. Online: http://xmlns.com/foaf/0.1.
Canadian Privacy Commission. 2007. Social networking and privacy. Online: http://www.privcom.gc.ca/information/social/index_e.asp.
Carminati, B. and Ferrari, E. 2008. Access control and privacy in Web-based social networks. Int. J. Web Inf. Syst. 4, 4, 395–415.
Carminati, B., Ferrari, E., and Perego, A. 2006. Rule-based access control for social networks. In On the Move to Meaningful Internet Systems 2006: OTM 2006 Workshops. LNCS, vol. 4278. Springer, 1734–1744.
Chen, L. 2006. Facebook’s feeds cause privacy concerns. The Amherst Student. Online: http://halogen.note.amherst.edu/~astudent/2006-2007/issue02/news/01.html.

Choi, H.-C., Kruk, S. R., Grzonkowski, S., Stankiewicz, K., Davis, B., and Breslin, J. G. 2006. Trust models for community-aware identity management. In Identity, Reference, and the Web Workshop, IRW 2006. Online: http://www.ibiblio.org/hhalpin/irw2006/skruk.pdf.
Cwm 2006. Cwm – A general purpose data processor for the Semantic Web. Project Web site, World Wide Web Consortium. Oct. Online: http://www.w3.org/2000/10/swap/doc/cwm.html.
Davis, I. and Vitiello Jr, E. 2005. RELATIONSHIP: A vocabulary for describing relationships between people. Namespace Document. Aug. Online: http://purl.org/vocab/relationship.
Ding, L., Zhou, L., Finin, T. W., and Joshi, A. 2005. How the Semantic Web is being used: An analysis of FOAF documents. In 38th Annual Hawaii International Conference on System Sciences, HICSS 2005. Vol. 4. IEEE CS Press, 113c.
Ellison, C. M., Frantz, B., Lampson, B., Rivest, R. L., Thomas, B. M., and Ylönen, T. 1999. SPKI certificate theory. IETF RFC 2693, Internet Engineering Task Force. Sept. Online: http://www.ietf.org/rfc/rfc2693.txt.
EPIC. 2008a. Facebook privacy page. Online: http://epic.org/privacy/facebook/.
EPIC. 2008b. Social networking privacy. Online: http://epic.org/privacy/socialnet/default.html.
Federal Trade Commission. 2007. Social networking sites: A parents guide. Online: http://www.ftc.gov/bcp/edu/pubs/consumer/tech/tec13.shtm.
Ferrari, E. and Thuraisingham, B. 2000. Secure database systems. In Advanced Database Technology and Design, M. Piattini and O. Diaz, Eds. Artech House, Chapter 11, 353–403.
Garfinkel, S. 1996. PGP: Pretty Good Privacy. O’Reilly & Associates.
Golbeck, J. A. 2005. Computing and applying trust in Web-based social networks. Ph.D. thesis, Graduate School of the University of Maryland, College Park.
Golbeck, J. A. and Hendler, J. 2006. Inferring binary trust relationships in Web-based social networks. ACM Trans. Inter. Tech. 6, 4 (Nov.), 497–529.
Hart, M., Johnson, R., and Stent, A. 2007. More content – less control: Access control in the Web 2.0. In Web 2.0 Security & Privacy 2007 Workshop, W2SP 2007. Online: http://seclab.cs.rice.edu/w2sp/2007/papers/paper-193-z_6706.pdf.
Hogben, G. 2007. Security issues and recommendations for online social networks. ENISA Position Paper 1, European Network and Information Security Agency. Oct. Online: http://www.enisa.europa.eu/doc/pdf/deliverables/enisa_pp_social_networks.pdf.
Horrocks, I., Patel-Schneider, P. F., Boley, H., Tabet, S., Grosof, B., and Dean, M. 2004. SWRL: A Semantic Web rule language combining OWL and RuleML. W3C Member Submission, World Wide Web Consortium. May. Online: http://www.w3.org/Submission/SWRL.
Jøsang, A. 1999. An algebra for assessing trust in certification chains. In 1999 Network and Distributed System Security Symposium, NDSS 1999. Online: http://www.isoc.org/isoc/conferences/ndss/99/proceedings/papers/josang.pdf.
Jøsang, A., Gray, E., and Kinateder, M. 2006. Simplification and analysis of transitive trust networks. Web Intell. Agent Syst. 4, 2, 139–161.
Jøsang, A., Ismail, R., and Boyd, C. 2007. A survey of trust and reputation systems for online service provision. Decis. Support Syst. 43, 2 (Mar.), 618–644.
Kamvar, S. D., Schlosser, M. T., and Garcia-Molina, H. 2003. The Eigentrust algorithm for reputation management in P2P networks. In 12th International Conference on World Wide Web, WWW 2003. ACM Press, 640–651.
Kleinberg, J. 2000. The small-world phenomenon: An algorithmic perspective. In 32nd Annual ACM Symposium on Theory of Computing, STOC 2000. ACM Press, 163–170.
Kruk, S. R., Grzonkowski, S., Choi, H.-C., Woroniecki, T., and Gzella, A. 2006. D-FOAF: Distributed identity management with access rights delegation. In The Semantic Web – ASWC 2006. LNCS, vol. 4185. Springer, 140–154.
Martel, C. and Nguyen, V. 2004. Analyzing Kleinberg’s (and other) small-world models. In 23rd Annual ACM Symposium on Principles of Distributed Computing, PODC 2004. ACM Press, 179–188.
Reiter, M. K. and Stubblebine, S. G. 1997. Toward acceptable metrics of authentication. In 1997 IEEE Symposium on Security and Privacy, SP 1997. IEEE CS Press, 10–20.
Shamir, A. 1979. How to share a secret. Commun. ACM 22, 11 (Nov.), 612–613.
Watts, D. J. 2003. Small Worlds: The Dynamics of Networks between Order and Randomness. Princeton University Press.
Weitzner, D. J., Hendler, J., Berners-Lee, T., and Connolly, D. 2006. Creating a policy-aware Web: Discretionary, rule-based access for the World Wide Web. In Web & Information Security, E. Ferrari and B. Thuraisingham, Eds. IDEA Group Publishing, 1–31.
Xiong, L. and Liu, L. 2004. PeerTrust: Supporting reputation-based trust for peer-to-peer electronic communities. IEEE Trans. Knowl. Data Eng. 16, 7 (July), 843–857.

ACKNOWLEDGMENTS

We would like to thank the anonymous reviewers for their insightful comments, which have helped us in improving the presentation of the paper. Special thanks to our colleagues Marco Benini and Claudio Gentile, for their useful suggestions about the complexity analysis of the proposed algorithms. The work reported in this paper is partially funded by the ANONIMO project (PRIN-2007F9437X 004), funded by the Italian Ministry of University, Education and Research (http://www.dicom.uninsubria.it/dawsec/anonimo/).




APPENDIX

A. NOTATION USED IN THE PAPER

Table II reports the notation for the main notions used throughout the paper.

Table II. Notation used in the paper

Notation      Meaning
SN            A WBSN graph
𝑉SN           The set of nodes of a WBSN graph SN
𝐸SN           The set of edges of a WBSN graph SN
RT SN         The set of relationship types supported by a WBSN SN
rt            A relationship type
t             A trust level
d             The depth of a relationship
rt(𝑣, 𝑣′)     The set of shortest paths between nodes 𝑣 and 𝑣′ denoting a relationship of type rt
t𝑣,𝑣′         The trust level of relationship existing between nodes 𝑣, 𝑣′
SN rt         A subgraph of SN consisting of edges labeled with relationship type rt and the nodes connected by them
AC            A set of access conditions
ac            An access condition
AR            A set of access rules
ar            An access rule
AC(ar)        The set of access conditions in rule ar
PK 𝑣          The public key of node 𝑣
SK 𝑣          The private key of node 𝑣
rc            A relationship certificate
𝒞𝒫            A bi-dimensional array where each element 𝒞𝒫[𝑖], 1 ≤ 𝑖 ≤ ∣𝒞𝒫∣, is an array where each element 𝒞𝒫[𝑖][𝑗], 1 ≤ 𝑗 ≤ ∣𝒞𝒫[𝑖]∣, contains the set of shortest certificate paths having the same initial and terminal node and consisting of edges all labeled with the same relationship type
𝑅𝐴            A set of relationship assertions
𝑟𝑎            A relationship assertion
𝜋             A proof

B. USING N3 TO REPRESENT RELATIONSHIPS AND ACCESS RULES

N3 [Berners-Lee et al. 2008] is an RDF-compliant language that allows the specification of rules and proofs (and, in general, any RDF statement) in a more compact form than RDF-based rule languages such as SWRL [Horrocks et al. 2004]. Such rules and proofs can then be placed into the headers of HTTP responses/requests, without the need of posting them as XML documents. Moreover, N3 allows us to integrate in our system the RDF-based reasoner Cwm [Cwm 2006], which carries out proof generation.

    @prefix : .
    @prefix base: .
    @prefix owl: .
    @prefix xsd: .

    a owl:Class ; :isDefinedBy base: ;

    a owl:Class ; :isDefinedBy base: ;
        :subClassOf
            [ a owl:Restriction ; owl:cardinality "1" ; owl:onProperty ],
            [ a owl:Restriction ; owl:cardinality "1" ; owl:onProperty ],
            [ a owl:Restriction ; owl:maxCardinality "1" ; owl:onProperty ],
            [ a owl:Restriction ; owl:cardinality "1" ; owl:onProperty ],
            [ a owl:Restriction ; owl:maxCardinality "1" ; owl:onProperty ] .

    a owl:ObjectProperty ; :domain ; :isDefinedBy base: ; :range .

    a owl:DatatypeProperty ; :domain ; :isDefinedBy base: ; :range xsd:nonNegativeInteger .

    a owl:DatatypeProperty ; :domain ; :isDefinedBy base: ; :range xsd:decimal .

    base: a owl:Ontology .

Fig. 8. N3-encoded definition of the main classes and properties of the REL-X vocabulary. The instances of class relx:RelType, defining the supported relationship types, are omitted.

The general form of an N3 rule consists of two formulas (i.e., two sets of RDF triples, called antecedent and consequent) and an implication operator. For instance, a rule:

    {?x :friendOf } => {?x :canAccess }

states that, if a user (represented by variable ?x) is a friend of the user with identifier userURI, such user can access the resource with identifier resourceURI.

The access rules specified in our model can be translated into standard N3 rules, where the antecedent models the access conditions in the rule, whereas the consequent denotes the resource which can be accessed. However, in order to achieve this, we need to model relationships by using RDF/OWL. For this purpose, we have designed a vocabulary, REL-X,4 where relationships are modeled by the OWL class relx:Relationship and by the following properties: relx:initNode, relx:termNode, relx:type, relx:trustLevel, and relx:depth. Additionally, the REL-X vocabulary defines a relx:canAccess property, which is used to express predicates stating that a given user is authorized to access a given resource. Finally, the supported relationship types are defined as instances of class relx:RelType. Figure 8 reports the definition of these classes and properties.

4 Available at: http://www.dicom.uninsubria.it/dawsec/vocs/relx.

Based on this, we can translate a rule ar = (rsc 1, {(Alice, friendOf, 2, 0.5)}) as follows:

    {?rel :initNode ; :termNode ?node; :type :FriendOf;
          :depth [math:notGreaterThan 2]; :trustLevel [math:notLessThan 0.5]. }
    => {?node :canAccess }.

where variable ?rel represents a relationship, whereas variable ?node represents a requesting node. Since in our rules more than one condition may be present, their general form in N3 can be obtained by extending the approach sketched above according to the following procedure:

- for each condition ac = (inode, rt, dmax, tmin) in an access rule ar, the following N3 statement is generated:

    ?rel :initNode ; :termNode ?node; :type rt;
         :depth [math:notGreaterThan dmax]; :trustLevel [math:notLessThan tmin].

- the obtained statements are concatenated (by using the implied conjunction operator), to form the antecedent component of the rule;
- the resource identifier rid is then used in the statement ?node :canAccess < rid >, which corresponds to the consequent component of the rule.

In case of ∗-conditions, we transform the access condition component(s) set to ∗ as follows:

- inode = ∗: ?rel :initNode [a foaf:Agent]
- rt = ∗: ?rel :type [a relx:RelType]
- dmax = ∗: ?rel :depth [math:notLessThan 1]
- tmin = ∗: ?rel :trustLevel [math:notLessThan 0]

For instance, a rule ar′ = (rsc 1, {(Alice, ∗, 2, 0.5)}) is expressed in N3 as follows:

    {?rel :initNode ; :termNode ?node; :type [a relx:RelType];
          :depth [math:notGreaterThan 2]; :trustLevel [math:notLessThan 0.5]. }
    => {?node :canAccess }.
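The translation procedure described above can be sketched in a few lines of Python. This is an illustrative sketch, not the prototype's code: the class and function names are hypothetical, `None` stands for a ∗ component, the example URIs are placeholders, node and resource identifiers are assumed to be written as URIs in angle brackets, and the use of a distinct variable (?rel1, ?rel2, ...) for each condition is an assumption made here so that several conditions can be conjoined in one antecedent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AccessCondition:
    """One condition (inode, rt, dmax, tmin); None stands for the wildcard *."""
    inode: Optional[str]   # URI of the initial node (placeholder values in the demo)
    rt: Optional[str]      # relationship type as an N3 term, e.g. ":FriendOf"
    dmax: Optional[int]    # maximum depth
    tmin: Optional[float]  # minimum trust level


def condition_to_n3(ac: AccessCondition, index: int) -> str:
    """Build the N3 statement for one condition, applying the *-substitutions."""
    rel = f"?rel{index}"  # assumption: one relationship variable per condition
    inode = f"<{ac.inode}>" if ac.inode is not None else "[a foaf:Agent]"
    rt = ac.rt if ac.rt is not None else "[a relx:RelType]"
    depth = (f"[math:notGreaterThan {ac.dmax}]" if ac.dmax is not None
             else "[math:notLessThan 1]")
    trust = f"[math:notLessThan {ac.tmin if ac.tmin is not None else 0}]"
    return (f"{rel} :initNode {inode}; :termNode ?node; :type {rt}; "
            f":depth {depth}; :trustLevel {trust}.")


def rule_to_n3(rid: str, conditions: List[AccessCondition]) -> str:
    """Concatenate the condition statements and append the consequent."""
    antecedent = " ".join(
        condition_to_n3(ac, i) for i, ac in enumerate(conditions, start=1)
    )
    return f"{{{antecedent}}} => {{?node :canAccess <{rid}>}}."


if __name__ == "__main__":
    # Rule ar' = (rsc1, {(Alice, *, 2, 0.5)}) with placeholder URIs.
    alice = "http://example.org/Alice"
    print(rule_to_n3("http://example.org/rsc1",
                     [AccessCondition(alice, None, 2, 0.5)]))
```

All statements share the variable ?node, so every condition refers to the same requesting node, which mirrors the implied conjunction used to build the antecedent.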

We can also use N3 for representing relationships, again according to the REL-X vocabulary. For instance, the relationships between Alice and Bob depicted in Figure 1 can be represented by the assertions 𝑟𝑎 = (Alice, Bob, friendOf, 1, 0.2) and 𝑟𝑎′ = (Alice, Bob, colleagueOf, 1, 0.8), which correspond to the following N3 statements:

    <rel1> :initNode ; :termNode ; :type :FriendOf; :depth 1; :trustLevel 0.2 .
    <rel2> :initNode ; :termNode ; :type :ColleagueOf; :depth 1; :trustLevel 0.8 .

Such assertions are derived from the certificates managed by 𝐶𝑆, and can be used by the Cwm reasoner in order to generate proofs. Let us now see how this happens.
Suppose that Bob requests to access resource rsc 1, protected by rule ar′. As soon as Bob receives from the certificate server the sets of certificate paths corresponding to the relationships existing between Alice and him, he computes the corresponding assertions (i.e., 𝑟𝑎 and 𝑟𝑎′ above) and gives them as input to Cwm along with the rule. Since 𝑟𝑎′ satisfies rule ar′, Cwm infers the following statement:

    :canAccess .
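The outcome of this example can be reproduced without a reasoner by checking the assertions against the rule directly. The Python sketch below is only an illustration of the matching logic, not a substitute for Cwm: it produces a yes/no answer rather than a proof, and all class and function names are hypothetical. `None` again stands for a ∗ component.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Assertion:
    """A relationship assertion (initial node, terminal node, type, depth, trust)."""
    init_node: str
    term_node: str
    rel_type: str
    depth: int
    trust: float


@dataclass(frozen=True)
class AccessCondition:
    """A condition (inode, rt, dmax, tmin); None stands for the wildcard *."""
    inode: Optional[str]
    rt: Optional[str]
    dmax: Optional[int]
    tmin: Optional[float]


def satisfies(ra: Assertion, ac: AccessCondition, requestor: str) -> bool:
    """True if the assertion ends at the requestor and meets every constraint of ac."""
    return (
        ra.term_node == requestor
        and (ac.inode is None or ra.init_node == ac.inode)
        and (ac.rt is None or ra.rel_type == ac.rt)
        and (ac.dmax is None or ra.depth <= ac.dmax)
        and (ac.tmin is None or ra.trust >= ac.tmin)
    )


def can_access(assertions: Iterable[Assertion],
               conditions: Iterable[AccessCondition],
               requestor: str) -> bool:
    """Conditions are conjoined: each one needs at least one matching assertion."""
    known = list(assertions)
    return all(any(satisfies(ra, ac, requestor) for ra in known)
               for ac in conditions)


if __name__ == "__main__":
    ra = Assertion("Alice", "Bob", "friendOf", 1, 0.2)
    ra_prime = Assertion("Alice", "Bob", "colleagueOf", 1, 0.8)
    rule = [AccessCondition("Alice", None, 2, 0.5)]  # conditions of ar'
    print(can_access([ra, ra_prime], rule, "Bob"))
```

With the values of the example, 𝑟𝑎 fails the trust constraint (0.2 is below 0.5) while 𝑟𝑎′ meets it, so the check succeeds only when 𝑟𝑎′ is among the assertions.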

As mentioned in Section 7.1, a proof 𝜋 contains not only the set of assertions and the access rule, but also statements about the steps followed by the reasoner to carry out the demonstration. The actual N3 encoding of the proof concerning our example is reported in Figure 9. Finally, the resource owner can verify whether or not the requestor is authorized to access a given resource by asking Cwm to check the correctness of the proof. The result of the check performed by Cwm on the proof in Figure 9 is reported in Figure 10.

C. CWM PERFORMANCE EVALUATION

In order to verify Cwm’s performance when generating proofs, we carried out some experiments by varying the cardinality of the set AC(ar) of access conditions in a rule ar. As expected, the time required by proof generation increases with the number ∣AC(ar)∣ of access conditions in the rule. Figure 11 reports the results of our experiments, obtained by varying the number of conditions from 1 to 30. Recall that we expect each access rule to contain at most 10 access conditions (see Section 10.1). Since proof computation is carried out client-side, the experiments were conducted on a desktop PC (1.5GHz Intel Pentium M Windows, with 768MB RAM). As shown in Figure 11, Cwm requires ∼1 sec to compute a proof when ∣AC(ar)∣ = 10, ∼2 sec when ∣AC(ar)∣ = 20, and ∼5 sec when ∣AC(ar)∣ = 30.



1 2 3 4 5 6 7 9 10

11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27

28 29 30 31 32 33 34 35 36 37 38

39 40 42

⋅

41

@prefix : .
@prefix ex: .
@prefix log: .
@prefix math: .
@prefix n3: .
@prefix relx: .
@prefix run: .

@forSome run:g0 .

[ a :Conjunction, :Proof ;
  :component run:g0,
    [ a :Inference ;
      :binding
        [ :boundTo [ n3:uri "http://someUrl/rel1" ] ;
          :variable [ n3:nodeId "http://someUrl/ex.n3#rel" ] ],
        [ :boundTo 0.2 ;
          :variable [ n3:nodeId "http://someUrl/ex.n3#gL10C133" ] ],
        [ :boundTo [ n3:uri "http://www.dicom.uninsubria.it/dawsec/vocs/relx#FriendOf" ] ;
          :variable [ n3:nodeId "http://someUrl/ex.n3#gL10C61" ] ],
        [ :boundTo [ n3:uri "http://someUrl/Bob" ] ;
          :variable [ n3:nodeId "http://someUrl/ex.n3#node" ] ],
        [ :boundTo 1 ;
          :variable [ n3:nodeId "http://someUrl/ex.n3#gL10C90" ] ] ;
      :evidence (
        [ a :Extraction ; :because run:g0 ; :gives { relx:trustLevel 0.2 . } ]
        [ a :Fact ; :gives { 0.2 math:notLessThan 0.1 . } ]
        [ a :Extraction ; :because run:g0 ; :gives { relx:depth 1 . } ]
        [ a :Fact ; :gives { 1 math:notGreaterThan 2 . } ]
        [ a :Extraction ; :because run:g0 ; :gives { relx:type relx:FriendOf . } ]
        [ a :Extraction ; :because run:g0 ; :gives { relx:FriendOf a relx:RelType . } ]
        [ a :Extraction ; :because run:g0 ; :gives { relx:termNode . } ]
        [ a :Extraction ; :because run:g0 ; :gives { relx:initNode . } ]
      ) ;
      :rule [ a :Extraction ; :because run:g0 ;
        :gives {
          @forAll ex:node, ex:rel .
          { @forSome run:g1, run:g2, run:g3 .
            run:g1 a relx:RelType .
            run:g2 math:notGreaterThan 2 .
            run:g3 math:notLessThan 0.1 .
            ex:rel relx:depth run:g2 ;
              relx:initNode ;
              relx:termNode ex:node ;
              relx:trustLevel run:g3 ;
              relx:type run:g1 .
          } log:implies { ex:node relx:canAccess . } .
        } ] ] ;
  :gives {
    @forAll ex:node, ex:rel .
    a .
    a ; relx:canAccess .
    a relx:Relationship ; relx:depth 1 ; relx:initNode ; relx:termNode ; relx:trustLevel 0.2 ; relx:type relx:FriendOf .
    a relx:Relationship ; relx:depth 1 ; relx:initNode ; relx:termNode ; relx:trustLevel 0.8 ; relx:type relx:ColleagueOf .
    a .
    relx:ColleagueOf a relx:RelType .
    relx:FriendOf a relx:RelType .
    { @forSome run:g1, run:g2, run:g3 .
      run:g1 a relx:RelType .
      run:g2 math:notGreaterThan 2 .
      run:g3 math:notLessThan 0.1 .
      ex:rel relx:depth run:g2 ;
        relx:initNode ;
        relx:termNode ex:node ;
        relx:trustLevel run:g3 ;
        relx:type run:g1 .
    } log:implies { ex:node relx:canAccess . } .
  } ] .

run:g0 a :Parsing ;
  :because [ a :CommandLine ; :args "['cwm', 'ex.n3', '--think', '--why']" ] ;
  :source .

Fig. 9. Example of N3 encoding of a proof 𝜋
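The check that the proof in Figure 9 records can be paraphrased procedurally. The following is a minimal Python sketch, not the authors' implementation: it only mirrors the rule visible in the figure (a relationship of some declared type, with depth not greater than 2 and trust level not less than 0.1, lets its terminal node access the resource). The names `Relationship`, `can_access`, `REL_TYPES`, the `OWNER` URI, and the second relationship `weak` are assumptions introduced for illustration; only the values for `rel1` (type FriendOf, depth 1, trust level 0.2, terminal node Bob) are taken from the figure.

```python
from dataclasses import dataclass

# Relationship types declared as relx:RelType in Figure 9.
REL_TYPES = {"FriendOf", "ColleagueOf"}


@dataclass(frozen=True)
class Relationship:
    """Hypothetical in-memory counterpart of a relx:Relationship node."""
    init_node: str
    term_node: str
    rel_type: str
    depth: int
    trust_level: float


def can_access(rel: Relationship, owner: str,
               max_depth: int = 2, min_trust: float = 0.1) -> bool:
    """Return True if rel satisfies the access rule shown in Figure 9.

    The thresholds default to the constants used in the figure
    (math:notGreaterThan 2 and math:notLessThan 0.1).
    """
    return (
        rel.init_node == owner
        and rel.rel_type in REL_TYPES
        and rel.depth <= max_depth
        and rel.trust_level >= min_trust
    )


# Assumption: the owner's URI is not legible in the figure, so a placeholder is used.
OWNER = "http://someUrl/owner"

# Values taken from the bindings in Figure 9.
rel1 = Relationship(OWNER, "http://someUrl/Bob", "FriendOf", 1, 0.2)

# Hypothetical relationship that violates both thresholds.
weak = Relationship(OWNER, "http://someUrl/Eve", "FriendOf", 3, 0.05)

print(can_access(rel1, OWNER))
print(can_access(weak, OWNER))
```

Unlike the N3 proof, this sketch produces only a yes/no answer; the point of the proof in Figure 9 is that it also records the bindings and evidence, so that a third party can re-check the derivation.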




@prefix : .
@prefix ex: .
@prefix log: .
@prefix math: .

@forAll ex:node, ex:rel .
a .
a ; :canAccess .
a :Relationship ; :depth 1 ; :initNode ; :termNode ; :trustLevel 0.2 ; :type :FriendOf .
a :Relationship ; :depth 1 ; :initNode ; :termNode ; :trustLevel 0.8 ; :type :ColleagueOf .
a .
:ColleagueOf a :RelType .
:FriendOf a :RelType .
{ @forSome ex:gL10C133, ex:gL10C61, ex:gL10C90 .
  ex:gL10C133 math:notLessThan 0.1 .
  ex:gL10C61 a :RelType .
  ex:gL10C90 math:notGreaterThan 2 .
  ex:rel :depth ex:gL10C90 ;
    :initNode ;
    :termNode ex:node ;
    :trustLevel ex:gL10C133 ;
    :type ex:gL10C61 .
} log:implies { ex:node :canAccess . } .

Fig. 10. Results of the check performed by Cwm on the proof reported in Figure 9

[Plot: y-axis, seconds (ticks from 0.5 to 5); x-axis, number of access conditions in a rule (ticks from 5 to 30).]

Fig. 11. Cwm performance as the number of access conditions in a rule varies from 1 to 30.