2011 Ninth Annual International Conference on Privacy, Security and Trust
Privacy Data Envelope: Concept and Implementation Mahmoud Ghorbel, Armen Aghasaryan, Stéphane Betgé-Brezetz, Marie-Pascale Dupont, Guy-Bertrand Kamga, and Sophie Piekarec Alcatel-Lucent Bell Labs France, Service Infrastructure Research Domain 91620 Nozay, France
Abstract: In this paper, we present a privacy control mechanism called PDE (Privacy Data Envelope) that allows users to protect their privacy-sensitive content travelling over social and communication networks. Our solution is based on privacy policies expressed by the user and associated with his content. This approach relies on a decentralized architecture realized through a PDE feature that has to be added to existing application access tools such as email clients and web browsers. A prototype has been developed to embody the PDE paradigm and to illustrate a scenario where such envelopes cross the boundaries of enterprise social networks and other communication tools. Preliminary performance evaluations were carried out to help understand the behavior and computation overhead of the PDE plug-in.
Keywords: privacy data envelope, privacy policy, policy enforcement, plug-in, social network, PDP.
I. INTRODUCTION
Privacy is one of the most important issues in our evolving information age, where technological developments lead to intensive processing and storage of personal information. Privacy is becoming critical with the recent development of virtual communities both in private life (e.g., groups of friends or members of a family) and in the professional area (e.g., members of a project or colleagues in a company) [1]. A key characteristic of all these environments is that large quantities of privacy-sensitive data are not just stored at central servers or on end devices, but also continuously travel across networks of interconnected applications (via various social networks and communication tools) or move within cloud computing service infrastructures.
Traditional central server-based storage and access control mechanisms are not well suited to mitigating the privacy violation risks in such open multi-application environments, where content items pass through different application domains. Decentralized solutions are required in order to extend the coverage of privacy data control beyond the limits of a single application domain by applying local decision making and enforcement of privacy rules. At the other extreme of the spectrum, one should avoid applying heavyweight Digital Rights Management (DRM) solutions relying on specific hardware or dedicated software clients. In addition, DRM solutions lack flexibility in customizing access control at the granularity of each user.
In this context, privacy enforcement should not necessarily rely on full-fledged security mechanisms. As argued in [2], privacy preferences should not trigger mandatory
enforcement by technological means, as this would defeat the ultimate objective of promoting content sharing and may also contradict free speech principles. In addition to some lightweight technological enforcement, privacy preferences should also rely on different modes of social or legal enforcement. For example, raising the bar in the domains of user awareness and data traceability would already have an important positive impact.
Another important aspect of a privacy protection mechanism is its usability, and in particular the overhead that it adds to the end-user's usual interactions with the applications. When dealing with multiple application environments, we advocate an approach that does not introduce yet another centralized identification service provided by a trusted third party, which would add to the usability overhead. Instead of identity federation, we argue that a “best effort” user identification relying solely on the identifiers within the existing application environments is sufficient. Our approach is to use plug-in technologies to reach these application-specific identifiers in a way that is non-disruptive for the end user.
The paper is organized as follows. In Section II we discuss related work, and in Section III we introduce the concept of the Privacy Data Envelope (PDE) as well as a detailed use case for its application. In this paper we focus on a use case in the professional area, although the approach would apply similarly to other domains of interconnected applications. Section IV presents a software architecture that realizes the privacy data envelope approach; its proof-of-concept implementation is described in Section V. Finally, we conclude and discuss some perspectives for future work in Section VI.
II. RELATED WORK
In order to protect user data, several security and privacy frameworks have been proposed, such as P3P, EPAL, and XACML [13][14][15]. They all rely on languages for specifying the conditions under which privacy-related data are accessible. Some of these languages (e.g., EPAL and XACML) also introduce the important notion of obligations, which are the actions to be performed or not performed by the requestor after he gets access to the data. These access control technologies are usually applied to protect data stored at a centralized point such as a central database. To deal with the privacy control of moving data, a family of approaches based on sticky policies has been introduced. The basic idea consists in accompanying the moving data with privacy protection policies which should
978-1-4577-0584-7/11/$26.00©2011 IEEE
apply all along the movement path of these data [3][4][6]. Some approaches also allow the use of multiple policy languages as well as the protection of data shared by multiple applications [5]. They nevertheless rely on a dedicated user identification service and require a trusted authority, which we want to avoid as a source of usability overhead.
In the particular domain of social networks, where the privacy of user information is critical (e.g., profile data, published content, images or videos), two sticky policy-based solutions are worth citing. Beate et al. [9] propose a practical, SNS platform-independent solution for social network users to control their data. Their model has been implemented as a Firefox extension that knows about the users’ access control preferences and provides the enforcement mechanism using encryption. However, this work is limited to data published on a Social Network Site, while our approach aims to support data travelling across any other communication or Web 2.0 applications. In addition, more advanced policies can be described with our approach, notably taking into account various actions, conditions and obligations. Kodeswaran et al. [10] propose to use sticky policies to control data access when performing statistical analysis (e.g., medical analysis) on social network data, while preserving individual privacy and taking into account the purpose of this data processing. This approach also addresses access control to data stored in a given social network environment, but it does not deal with data that can travel over different heterogeneous environments (e.g., social networks, mailing systems). In this latter work, the data are also processed by a data mining service (held by a company or an administration) and not by another end-user.
On the other hand, others claim that it is difficult to handle the enforcement of access control in a fully decentralized architecture. In this direction, Carminati et al.
in [17] propose a rule-based approach for enforcing an access control mechanism dedicated to web-based social networks (WBSNs). Access to a resource is granted when the requestor is able to demonstrate being authorized to do so by providing a proof. The proposed mechanism is based on a semi-decentralized architecture as a tradeoff between the efficiency of a fully centralized access control server and the emerging wish of users to have more control over their data. The authors argue that “The main drawback of this solution [the decentralized one] is that implementing a decentralized access control mechanism [where users specify their privacy policies themselves] implies software and hardware resources more powerful than those typically available to WBSN participants” [17]. In contrast, we believe that a lightweight decentralized solution is achievable, so that privacy-sensitive user data can move over networks while the user keeps control over them.
Besides research work, there are software solutions embedded in web browsers that offer privacy features to protect information shared between end-users in social networks. For instance, Scramble [18] allows encrypting and decrypting a posted text. However, it only deals with one possible action (read) and introduces a significant overhead regarding the centralized management of encryption keys and user identifiers. Another example is uProtect.it [19], which defines who can read a text posted on Facebook. This
solution is also based on a central server and tackles only the Facebook environment.
In [7], we outlined an approach within the family of sticky policies that we further develop in this paper. Basically, we propose a policy-based privacy technology to protect data crossing different communication and Web 2.0 applications. This technology is based on an immersive plug-in that is fully distributed and embedded in a “best effort” way within the different application environments (i.e., directly using the application features and their user identification systems). Moreover, this technology intends to support privacy policies controlling the information exchanged within different user groups, as well as privacy policies applied to hierarchical data structures (e.g., “composite documents” for which a different policy can be applied to each part). These policies are especially relevant in the enterprise or corporate domain that we particularly address in this paper.
III. PRIVACY DATA ENVELOPE PARADIGM
A. PDE Concept
We advocate an approach within the family of sticky policies that can deal with hierarchical data structures and cover group privacy protection scenarios. In this approach, named Privacy Data Envelopes (PDE), each piece of information identified as privacy-sensitive is embodied (or “enveloped”) into a data structure that, in addition to the initial raw data, carries privacy-related properties and policies. The PDE structure contains three fields (see Figure 1):
Figure 1. PDE data structure: elementary and nested PDEs.
• Privacy-sensitive entities (or sensitive entities): define the parts of raw data that are different from the point of view of privacy; these parts are characterized by their individual properties. Note that in case of a flat PDE structure there is only one sensitive entity which represents the entire part of raw data contained by the PDE (Figure 1, left side). On the other hand, a nested PDE structure (Figure 1, right side) is needed to specify different expected behaviors with regard to different parts of the initial data. For example, under some circumstances certain parts of a given document can be required to be hidden or anonymized; personal identifiers like names or phone numbers need to be removed when diffusing a document to a larger audience.
• Properties or metadata: specify some semantics of the respective data entities, e.g., the data type and category, owner, or issue date.
• Privacy policies: indicate which actions are authorized when the PDE travels across the network (e.g., access control, time-to-live, data handling and disclosure policies) and which obligations must be fulfilled (e.g., data deletion after a certain time period, notification or consent request to owners). These policies are enforced locally at each PDE recipient’s client location.
Each PDE is encrypted using a simple encryption scheme based on a symmetric key. The PDEs travel across the network in encrypted form; therefore, each node of the network (let us call it the PDE network) that needs to access their content has to possess the symmetric key. As the focus of our approach is not on the security mechanism itself, we suppose that this key is unique and available to all the certified nodes of the network. In order to become a node of the PDE network, and therefore to have potential access to PDE contents, an end-user application needs to integrate a specific client, called the “PDE plug-in”, which is in charge of decrypting the PDE and enforcing its policies in the specific application environment. The general principle is illustrated in Figure 2, where two applications A and B can exchange (encrypted) PDEs thanks to their respective PDE plug-ins, which ensure both encryption/decryption and policy enforcement. Note that this approach requires the policies to be specified using the application-specific user identifiers. This point is further elaborated in the next section on architectural design.
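To make the three-field structure concrete, the following Python sketch models a flat and a nested PDE before encryption. It is only an illustration under our own assumptions: the class name `PDE`, its field names, and the sample values (including the phone number and the policy strings) are hypothetical and do not come from the prototype.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Union


@dataclass
class PDE:
    """Hypothetical in-memory model of a Privacy Data Envelope (unencrypted)."""

    # Sensitive entities: raw data (flat PDE) and/or child envelopes (nested PDE).
    entities: list[Union[str, PDE]]
    # Properties or metadata, e.g. data type, category, owner, issue date.
    properties: dict[str, str] = field(default_factory=dict)
    # Privacy policies, kept here as opaque strings for simplicity.
    policies: list[str] = field(default_factory=list)

    def is_flat(self) -> bool:
        """A flat PDE holds exactly one sensitive entity made of raw data."""
        return len(self.entities) == 1 and isinstance(self.entities[0], str)

    def depth(self) -> int:
        """Nesting depth: 1 for a flat PDE, more when envelopes are nested."""
        nested = [e.depth() for e in self.entities if isinstance(e, PDE)]
        return 1 + max(nested, default=0)


# Illustrative values only: a phone number with its own, stricter policy,
# nested inside the envelope that carries the meeting notes.
phone = PDE(["+33 1 00 00 00 00"], {"category": "phone number"},
            ["remove when diffused to a larger audience"])
notes = PDE(["Meeting notes text", phone], {"owner": "John"},
            ["department members may read"])
```

Nesting lets each part of a composite document carry its own policy, which is the behavior the right-hand side of Figure 1 describes.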
Figure 2. Multi-application approach based on PDE plug-ins.
B. Use case
The approach described above can be illustrated by providing privacy-aware communications in a corporate environment. The privacy-sensitive data considered are the posts within a group created in a professional social network. The problem is to control the way these data are propagated and used outside the group, regardless of the application environment. When a member of the group publishes sensitive data such as notes or meeting minutes on the social network, he associates a policy with them, which can be either chosen among predefined policies or defined through a user-friendly interface. Then, when these data are accessed or forwarded, the policies are evaluated and the system prevents inappropriate forwards or other non-authorized actions on the post from being executed.
Suppose that John, a member of department A in enterprise E1, publishes meeting notes on the social network (arrow (1) in Figure 3) with the following policy:
1) Members of the department can see the meeting notes.
2) If the meeting notes are sent to an employee of the company, the author must be notified.
3) If the meeting notes are sent to an employee of another company, the acknowledgement of the author is required.
Bob, another member of the department, having been notified of this post (2), forwards these notes to Alice (3), an employee of the same enterprise E1 belonging to another department B. The PDE plug-in of the mailing system then evaluates the policies associated with the published data and extracted from the envelope. According to these policies, Bob is allowed to send the notes to Alice (4), while John is informed of this forwarding. However, as also defined in the policies, when Alice tries to send the notes by email to Mike (5), who is an employee of enterprise E2, the system informs Alice that consent from John is needed, and the sending of the email to Mike is blocked (6) pending John’s acknowledgement. Once the consent is received, the post is sent to Mike (possibly under conditions, such as omitting author names and confidential parts).
Figure 3. PDE application use case
As illustrated by this use case, the PDE guides the behavior of regular applications with respect to privacy-sensitive data. Nevertheless, it must be stressed that the PDE does not provide a full-fledged security mechanism to protect against malicious applications.
IV. ARCHITECTURAL DESIGN
As previously mentioned, the PDE paradigm is materialized by a feature (the PDE plug-in) implementing all required functionalities. This feature has to be integrated in the existing communication tools or accessing interfaces (mailer, web browser, targeted applications, etc) as an add-on, plug-in or extension.
We first define a few terms that are used throughout this paper. We call “Application Environments” (AEs) the end-user applications such as Facebook, LinkedIn, Jive Engage or mailing services including webmail. We call “Application Access Tools” (AATs) the tools allowing users to interact with the Application Environments, such as web browsers (e.g., Firefox or Internet Explorer), mail clients (e.g., Outlook or Thunderbird) or any other software allowing interaction between the user and the AEs.
This section describes the overall architecture of the PDE plug-in, which represents an embodiment of the PDE paradigm applied to social network exchange and communication environments (for both professional and personal use cases). In order to understand the overall operation of the plug-in, a requirements analysis was carried out. This analysis showed that the PDE plug-in operations can be grouped into three main phases, described hereafter.
A. PDE plug-in operations unfolding
Generally, when someone uses social networks, mailing services or any other means of communication, two main operations are performed:
• Publishing: sending an email, sharing a document or a photo, posting a message, etc.
• Consuming: reading an email or a post, opening an attached document, copying content, etc.
Both operations are therefore considered as two PDE plug-in phases. In addition, another phase is needed for initialization.
1) Initialization
In order to benefit from PDE functionalities, a user has to acquire the PDE plug-in and integrate it into his Application Access Tools (mailer, web browser, etc.). The PDE plug-in has to be provided in several versions to support the majority of AATs used to access the Application Environments, and to cover different user access devices. The user can thereby equip his AATs with the suitable plug-ins. Without the plug-in, unreadable content is shown instead, and the user cannot access the PDE-embedded content sent to him by his contacts. In a professional use case, the PDE plug-in could already be embedded in all enterprise communication tools, which would be packaged together.
Once integrated into the existing tool, the plug-in invites the user to create a local account. This account covers a personal space where the user builds his customized PDE plug-in profile. It gives him access to his contacts and other personal parameters for configuration or update operations.
The PDE paradigm is based on privacy policies defined by the user depending on his personal information, such as his contact list. Indeed, the user can specify the names (or identifiers, pseudonyms, etc.) of the persons allowed to consume the content and which actions they can perform. Thereby, a local database gathering all user contacts (with their identifiers/pseudonyms in the different AEs) is required by the PDE plug-in whenever the user configures a new policy. We call this database the Contact Matrix. Each row and column of this matrix corresponds
respectively to a user contact and to an AE. Each element of the matrix represents the identifier, pseudonym or email address of the user contact in the corresponding AE. The Contact Matrix has to be built by the PDE plug-in by gathering all contact lists from all AEs where the user is registered. This operation should be done during the initialization phase, the first time the user uses the plug-in. However, it could be repeated periodically for update purposes.
In this phase, the user can also perform some configuration operations, for example to define privacy policy templates or to set security parameters. The plug-in is then ready to use.
2) Publishing
When a user wants to publish privacy-sensitive content that he would like to protect, the plug-in functionalities are invoked to encapsulate this content within a PDE. The user can directly express his publishing intention and run the PDE plug-in; alternatively, the plug-in can automatically detect the user's intention and propose to encapsulate in a PDE the content he is about to publish. At this stage, the user has to specify the policy to be associated with the content. Several alternatives can be offered to the user for this task: choosing a predefined policy regularly used for a specific type of content or AE, modifying an existing policy, or creating an entirely new policy (based for instance on existing templates). Default policies can also be set (e.g., according to each content type or AE used) so that they are selected automatically without disturbing the user. Ideally, the plug-in should provide a user-friendly interface to make this policy definition operation easier and more intuitive, because it is the core element of the PDE paradigm and could influence its daily usage. Finally, the plug-in associates the policy with the content, and the resulting envelope is ready to be published in the Application Environment the user is using.
The association operation depends on the type of the user content (text, image, document, etc.) and the encryption mechanism used to protect the data integrity.
3) Consuming
Once the PDE embedding the user’s content is published in an AE, any recipient willing to consume this PDE must have the plug-in integrated within his AAT. In addition, the recipient has to be identified by combining the plug-in identification with the detection of the user's AE identification session. This best-effort identification is essential to confirm the recipient's identity, which is required later to apply the privacy policies. In the consuming phase, a plug-in feature runs as a background task to parse the AE content (web pages, emails, etc.) looking for PDEs. When a PDE is detected, it is analyzed to extract the policy. In the meantime, the plug-in should know the action the recipient wants to perform on the content (e.g., reading, forwarding) and its context (e.g., the identity of the recipient if the action is forwarding). Such information is mandatory to decide, based on the privacy policy, whether the recipient is authorized to perform the requested action or not (some obligations could also be specified).
Thus, the PDE plug-in helps a user, on the one hand, to set a policy that protects his privacy and, on the other hand, to respect the privacy of other users while consuming their content. The next section describes the functional blocks of the plug-in architecture.
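As an illustration of the Contact Matrix built during the initialization phase, the following Python sketch stores one row per contact and one column per AE. The class and method names, the AE labels, and the identifiers are hypothetical, chosen only for this example; the paper does not specify how the prototype stores the matrix.

```python
class ContactMatrix:
    """Hypothetical local store: rows are contacts, columns are AEs."""

    def __init__(self):
        # contact name -> {AE name -> identifier/pseudonym/email in that AE}
        self._rows = {}

    def add(self, contact, ae, identifier):
        """Record the identifier a contact uses in a given AE."""
        self._rows.setdefault(contact, {})[ae] = identifier

    def identifier(self, contact, ae):
        """Return the contact's identifier in the AE, or None if unknown."""
        return self._rows.get(contact, {}).get(ae)

    def contact_for(self, ae, identifier):
        """Reverse lookup: which contact owns this AE-specific identifier?"""
        for contact, ids in self._rows.items():
            if ids.get(ae) == identifier:
                return contact
        return None


# Illustrative content, as if gathered from two AEs during initialization.
matrix = ContactMatrix()
matrix.add("Alice", "webmail", "alice@e1.example")
matrix.add("Alice", "social", "alice.b")
matrix.add("Mike", "webmail", "mike@e2.example")
```

The reverse lookup is what lets a policy written for one AE be applied when the same contact appears under a different identifier in another AE.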
B. Functional Architecture
In this section, we present an embodiment of the PDE plug-in architecture and describe the features and interactions of the different modules. Three actors can interact with the plug-in through different modules: the user through the Graphical User Interface (GUI), the AEs through the Application Environment Access Manager, and the AATs through the Client Specific Interface. The latter interacts directly with the internal components of the Application Access Tool for which the plug-in is dedicated, in order to customize its behavior and integrate the PDE functionalities.
Figure 4. PDE plug-in Architecture
The first time the user opens a session, the PDE plug-in gathers all the contact list information from the different application environments in order to build the Contact Matrix. This is done by the Contact Matrix Manager, which connects to each AE through the Application Environment Access Manager. Later, the role of this module is to listen to the different sessions a user may open with different AEs. By this means, the plug-in can verify the user's identity. Hence, when the user wants to publish content and protect it within a PDE, he has to connect to the AE using his corresponding login.
The request to encapsulate specific content within a PDE can be made, for instance, by a right-click on the entered content, which triggers the opening of a GUI to configure the PDE. The plug-in can also automatically detect the Application Environment input fields (e.g., text field, picture selection box) and open the PDE GUI, which is then used directly to publish the content. Both behaviors are performed by the User Content Handler module. The Privacy Policy Wizard helps the user to define the privacy policies associated with the content. Different options can be considered for the user to define privacy policies:
• Select a policy from a predefined list of policies previously configured by the user himself using the policy language or a policy editor tool.
• Select a policy template and instantiate some values (e.g., TTL, diffusion list).
• Elaborate a new dedicated policy using the policy language or a policy editor tool.
Once the policy is set, the PDE Builder creates the Privacy Data Envelope by gathering both the content and the related policies. Depending on the content type, different mechanisms are used to make this association. This module also requests the Encryption module to encrypt the set {content; policy}. The output of the PDE Builder is the PDE ready to be published in the Application Environment, in the form of an encrypted document, an encrypted mail, or encrypted Web 2.0 content.
On the other hand, to be able to access PDE-protected content published in an Application Environment, a consumer must have the PDE plug-in and connect to an AE to be identified. The PDE Checker analyzes the content a user is consuming in the AE (e.g., web page, mail) to detect whether it contains a PDE. If so, the PDE is extracted and sent to the Policy Enforcement Point (PEP). This task runs permanently in background mode. The PEP extracts the policies from the PDE. Two configurations could be considered:
• The PEP asks the Policy Decision Point (PDP) for all authorizations, i.e., all authorized actions (visualization, forwarding, copying, etc.), in order to execute the related obligations and thereby respect the policy.
• Or the PEP asks the PDP only for a default request (e.g., reading) and shows the result, while listening to user events to intercept the action the user wants to execute on a PDE, in order to formulate the new request to send to the PDP.
One can note that the PEP and the PDP are common modules of policy-based approaches.
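As a rough illustration of the PEP/PDP split used in this architecture, the following Python sketch evaluates the three-rule policy of the use case in Section III.B. It is not the prototype's code: the class names, the request keys (`action`, `department`, `recipient_company`), the obligation labels, and the deny-by-default, first-applicable evaluation are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Rule:
    effect: str                          # "Permit" or "Deny"
    actions: set                         # actions the rule applies to
    condition: Callable[[dict], bool]    # predicate over the request
    obligations: list = field(default_factory=list)


class PDP:
    """Toy decision point: the first applicable rule wins (an assumption)."""

    def __init__(self, rules):
        self.rules = rules

    def evaluate(self, request):
        for rule in self.rules:
            if request["action"] in rule.actions and rule.condition(request):
                return rule.effect, list(rule.obligations)
        return "Deny", []                # assumption: deny when no rule applies


class PEP:
    """Toy enforcement point: fulfils obligations before allowing the action."""

    def __init__(self, pdp):
        self.pdp = pdp
        self.notifications = []          # stands in for notifying the author

    def enforce(self, request, author_consent=False):
        decision, obligations = self.pdp.evaluate(request)
        if decision != "Permit":
            return False
        for obligation in obligations:
            if obligation == "notify_author":
                self.notifications.append(request["action"])
            elif obligation == "get_author_consent" and not author_consent:
                return False             # blocked until the author consents
        return True


# John's policy from the use case: department A in enterprise E1.
USE_CASE_RULES = [
    Rule("Permit", {"read"}, lambda r: r.get("department") == "A"),
    Rule("Permit", {"forward"}, lambda r: r.get("recipient_company") == "E1",
         ["notify_author"]),
    Rule("Permit", {"forward"}, lambda r: r.get("recipient_company") != "E1",
         ["get_author_consent"]),
]
```

This corresponds to the second configuration: the PEP calls `enforce` each time it intercepts a user action, rather than asking for all authorizations up front.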
C. Privacy Policies
Regarding the PDE structure described above, the privacy policies associated with a piece of content or data are a means for a user to specify who is or isn’t authorized to perform which actions on the related content, under which conditions and/or obligations. The privacy policy considered here consists of a set of rules, each rule comprising several elements such as:
• An effect (Permit or Deny)
• A target (individual or group of individuals to whom the rule applies)
• A list of actions (Read, Write, Copy, Print, Forward, etc.)
• A list of conditions that need to be satisfied before applying this rule (predicates based on the attributes of the target, actions and contextual information)
• A list of obligations (e.g., keep log, notify an entity, get consent) that are actions to be fulfilled jointly with the rule’s evaluation decision. An obligation is a kind of postcondition for the related rule.
Considering the use case described in Section III.B, the rule for controlling the sending of the meeting minutes outside the company can be expressed in a pseudo-language as follows:
Rule: SendingOutsideCompany, Effect = Permit
Target-Resource: resource-id = Meeting_Notes_01
Target-Subject: Subject != John (document’s author or responsible)
Target-Action: Action = Send
Condition: Recipient doesn’t belong to John’s Company
Obligation: Get approval of John
Figure 5. Policy example
This rule states that sending a document (i.e., Meeting_Notes_01) outside John’s company is subject to the approval of the latter. According to the standardized policy framework architecture [11], the rules are evaluated by the decision point (PDP), and the decision and possible obligations are sent to the enforcement point (PEP), which must fulfill the obligations before enforcing the rule’s evaluation decision.
V. IMPLEMENTATION
The PDE plug-in represents an embodiment of the PDE paradigm applied to social network and emailing services. Supporting all existing Application Access Tools and Application Environments would require a considerable implementation effort. Aiming at a prototype that serves as both a proof of concept and a test-bed for research purposes, we decided to focus on only two representative tools: Mozilla Firefox as a web browser and Microsoft Outlook as an email client.
A. Plug-in Environments
1) Mozilla Firefox
Regarding Mozilla Firefox, the PDE plug-in is realized as a Mozilla extension¹, a mechanism that allows developers to add new functionalities and to customize Mozilla applications. Mozilla extensions are based on XPCOM², a cross-platform component model from Mozilla similar to CORBA and Microsoft COM (Component Object Model). It has multiple language bindings (JavaScript, Java, C++, etc.) that make the Gecko (core of Mozilla applications) components and libraries accessible. Concretely, a Mozilla extension is a set of code files implementing the needed functionalities, for instance the PDE plug-in, packaged in one file called an XPI file (whose size, including the XACML library described hereafter, is 306 kb). It contains three main types of files:
• XUL³ files that describe the graphical components to be added to the browser (new buttons, dialog boxes, etc.).
• JavaScript files that implement the new behaviours the browser has to adopt (parsing contents, etc.).
• Java classes implementing the main PDE plug-in functionalities described in the previous section. Additional libraries can be added (like the Sun XACML library).
2) Microsoft Outlook
The Outlook PDE plug-in is a Microsoft Office COM add-in implemented using Microsoft Visual C# .NET. It is deployed as a single DLL (Dynamic Link Library) file that needs to register itself with the Microsoft Outlook application in which it runs. Like any COM add-in, it implements the methods of the IDTExtensibility2 interface [12] to communicate with the host application (Microsoft Outlook).
Through those methods (e.g., OnConnection, OnDisconnection), the add-in is aware of what is going on within the host application (Outlook) and can then execute dedicated actions such as intercepting the end-user’s actions (e.g., Send, Forward, Print), checking whether an end-user’s action is subject to a policy, sending the policy to the PDP for evaluation and enforcing the resulting decision (e.g., blocking the user action). Moreover, in addition to the Microsoft Outlook application, the Outlook PDE plug-in requires the .NET Framework 3.5 and the Visual Studio 2005 Tools for Office Runtime to be installed on the end-user’s device.
B. PDP integration
The PDP implementation within the PDE plug-in is based on Sun's open source XACML Java library [16]. An instance of a PDP is created within the PDE plug-in. When a Privacy Data Envelope is detected, the plug-in extracts the sticky policy and queries this PDP object with the envisaged action to get the result. This PDP result contains the decision (e.g., action accepted, refused or indeterminate) and the related obligations (if any). The PEP then carries out this decision and the obligations in the application environment (e.g., allowing the content to be read). The standard and mature XACML policy language covers the different requirements previously mentioned. While XACML version 2.0 has been used for the prototype, the latest version 3.0, not yet available as open source, will offer interesting new features [15], notably to ensure that the obligations are actually executed before the final decision is given by the PDP.
Finally, XACML is a powerful language allowing the definition of various privacy policies. However, this policy language is certainly quite difficult for an end-user to understand and use. Therefore, we envisage using XACML as the
1 Mozilla extensions are different from Mozilla plugins, which help the browser display specific content like playing multimedia files.
2 Cross Platform Component Object Model.
3 XML User Interface Language, Mozilla language for graphical interface.
internal PDE policy language. For the end-user, we will present the privacy policy in a simplified and user-friendly format through a policy editor able to translate the policy from this end-user format to the XACML format.
C. Policy-Content association
The policy association depends on the type of content a user may publish. To associate a policy with a specific document, we use an encrypted XML format. First, the file to protect is encrypted using a first encryption key; it is then added as part of an XML file that also contains metadata and the associated policy. The whole is then encrypted using a second key. The resulting document is the PDE. Double encryption is used in order to preserve document confidentiality if, for instance, the policy denies the content reading request. For text-based content, the policy is concatenated to the initial text with a predefined separator. The whole is then encrypted, delimited by special tags and finally posted in the corresponding AE. Having different association mechanisms is logical given the different nature of the two content types, but this second binding is clearly weaker than the first one. Indeed, we think that text content is less constraining than document content, so we can tolerate this.
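As an illustration of the document case described above, the following Python sketch builds and opens a PDE. It is not the prototype's code: `toy_cipher` is an insecure placeholder for a real cipher such as DES or Blowfish, `toy_pdp` and its comma-separated policy format stand in for the XACML PDP, and the XML element names (`pde`, `metadata`, `policy`, `content`) are assumptions made for the example.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """Toy XOR keystream cipher (NOT secure), a stand-in for a real cipher.

    Applying it twice with the same key restores the input.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ s for b, s in zip(data, stream))


def build_pde(content: bytes, policy: str, filename: str,
              content_key: bytes, envelope_key: bytes) -> bytes:
    """Encrypt the file, wrap it with metadata and policy, encrypt the whole."""
    root = ET.Element("pde")  # hypothetical element names
    ET.SubElement(root, "metadata", {"filename": filename})
    ET.SubElement(root, "policy").text = policy
    inner = toy_cipher(content, content_key)  # first encryption
    ET.SubElement(root, "content").text = base64.b64encode(inner).decode("ascii")
    return toy_cipher(ET.tostring(root), envelope_key)  # second encryption


def toy_pdp(policy: str, action: str) -> str:
    """Hypothetical PDP: the policy is a comma-separated list of allowed actions."""
    return "Permit" if action in policy.split(",") else "Deny"


def open_pde(pde: bytes, action: str, decide,
             content_key: bytes, envelope_key: bytes):
    """Return (decision, content); the content is decrypted only on Permit."""
    root = ET.fromstring(toy_cipher(pde, envelope_key))  # outer layer only
    decision = decide(root.findtext("policy"), action)
    if decision != "Permit":
        return decision, None  # inner layer is left encrypted
    inner = base64.b64decode(root.findtext("content"))
    return decision, toy_cipher(inner, content_key)


pde = build_pde(b"quarterly report", "read,print", "report.txt", b"k1", b"k2")
read_decision, read_content = open_pde(pde, "read", toy_pdp, b"k1", b"k2")
fwd_decision, fwd_content = open_pde(pde, "forward", toy_pdp, b"k1", b"k2")
```

Because the policy sits in the outer layer only, a denied request never triggers the inner decryption, which is the confidentiality property the double encryption is meant to preserve.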
D. Performance evaluation
Although the current prototype is not yet complete, some experiments have been carried out and some parameters measured in order to understand the plug-in behavior and to estimate the computation overhead involved. The prototype evaluation also helps in adjusting the plug-in design and development to provide the best user experience.
The first set of experiments concerns text content published in the Jive Engage professional social network [20]. The goal is to measure the PDE processing time when consuming PDEs. The measured operations cover i) web content parsing, ii) decryption and policy extraction, iii) PDP decision and iv) plain text display. Note that a default request was set for a read action by an authorized user, with all PDEs embedding the same privacy policy. As a first step, we varied, on the one hand, the number of PDEs per web page (PDE content fixed to 1000 characters) and, on the other hand, the PDE content size (number of characters). Both curves (figures 6 and 7) show, as expected, that the plug-in processing time grows when the number of PDEs per page or the PDE content size increases. For example, the plug-in processes 10 PDEs per page in 1s. Beyond 25 PDEs per page, it requires more than roughly 2s (figure 6). This looks quite heavy, particularly if we add the web page loading time. However, such performance could be tolerated knowing that 25 is a large number of PDEs for a web page to contain (10 is a reasonable number). Besides, the processes covered by the performance evaluation are mainly background tasks that could run in parallel with others. This performance limitation could therefore be mitigated.
Figure 6. Plug-in performance regarding PDE number per web page (time in s versus number of PDEs per web page).
Figure 7 logically shows that the plug-in performance decreases when the PDE content gets bigger. However, we can observe that a 10-thousand-character PDE content (more than 3 pages) is processed in less than 1s (856 ms exactly), which is still very reasonable. The main remaining question is the computation overhead imposed by the PDE plug-in on the Application Access Tool, for instance Firefox. To assess this parameter, additional evaluations were done to measure the time that the PDE plug-in takes to analyze a web page not containing any PDE. The curve in figure 8 clearly shows that a 5-Mbyte web page (which is a very big one) takes no more than 320 ms to analyze. Consequently, the computation overhead is not really significant.
Figure 7. Plug-in performance regarding PDE content size (time in s versus PDE size in thousands of characters).
Figure 8. Plug-in computation overhead on a web page not including any PDE (time in s versus web page size in Mbytes).
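The four operations timed in the text-content experiment (page parsing, decryption and policy extraction, PDP decision, plain text display) can be pictured with the minimal Python sketch below. It is only an illustration under stated assumptions, not the plug-in's implementation: the `[PDE]`/`[/PDE]` tags and the `||` separator are hypothetical, base64 is a placeholder for the real encryption step, and the policy is reduced to a comma-separated list of allowed actions instead of XACML.

```python
import base64
import re
import time

SEPARATOR = "||"  # hypothetical policy/text separator
PDE_PATTERN = re.compile(r"\[PDE\](.*?)\[/PDE\]", re.DOTALL)  # hypothetical tags


def make_text_pde(text: str, policy: str) -> str:
    """Concatenate policy and text, 'encrypt' the result and wrap it in tags."""
    payload = (policy + SEPARATOR + text).encode("utf-8")
    # base64 is only a placeholder for the real encryption step
    return "[PDE]" + base64.b64encode(payload).decode("ascii") + "[/PDE]"


def render_page(page: str, action: str = "read") -> str:
    """Replace every PDE found in the page by its text or a refusal notice."""
    def handle(match):
        payload = base64.b64decode(match.group(1)).decode("utf-8")  # ii) decrypt
        policy, text = payload.split(SEPARATOR, 1)       # ii) extract the policy
        permitted = action in policy.split(",")          # iii) toy PDP decision
        return text if permitted else "[content withheld]"  # iv) display
    return PDE_PATTERN.sub(handle, page)                 # i) parse the page


page = ("Hello " + make_text_pde("secret note", "read")
        + " and " + make_text_pde("private", "print"))

start = time.perf_counter()
rendered = render_page(page)
elapsed_ms = (time.perf_counter() - start) * 1000  # processing time for the page
```

Timing `render_page` on pages with a growing number of PDEs, or with growing PDE sizes, reproduces the shape of the two measurements described above, although the absolute values depend entirely on the real cipher and PDP.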
Regarding the second set of experiments, we measured the time taken to build a document PDE with double encryption, using two different algorithms. The first encryption algorithm is DES with a 56-bit key, and the second is Blowfish. We can see that the larger the file, the longer the computation time, especially for the DES algorithm (figure 9). However, for a file size below 5 Mbytes, the time taken is less than 2s for both algorithms, which remains acceptable. We can also deduce that the choice of the encryption algorithm is one of the key points of the overall process, as computation times can differ considerably depending on the chosen algorithm.
Figure 9. PDE document building (time in s versus original file size in Mbytes, for DES and Blowfish).
As shown by the curves in figure 10, we observe similar behavior for the decryption time, i.e. the time taken to double-decrypt the PDE and get the original file back.
Figure 10. PDE extraction (time in s versus PDE size in Mbytes, for DES and Blowfish).
VI. CONCLUSION AND PERSPECTIVES
Neither of the two extremes, central server-based access control mechanisms and digital rights management (DRM) approaches, is well suited to mitigating the privacy violation risks in such multi-application environments. The PDE paradigm represents a decentralized approach that allows users to publish their privacy sensitive content while keeping control over it. Applying the policies defined by the user and carefully associated with the content as it moves across application environments ensures that the user's privacy preferences are respected and thereby protects the content. An embodiment of our hypothesis was carried out through the PDE plug-ins developed for different Application Access Tools, a web browser and an email client. It has confirmed the absence of major obstacles and the technical feasibility of our solution. In a second step, the preliminary plug-in performance evaluations have shown that the processing time grows logically with both the number of PDEs per web page and the content size, whereas no significant computation overhead has been observed. Future evaluations will focus on usage and usability studies of an advanced PDE prototype. On the other hand, our solution doesn’t tackle security and full enforcement issues. In fact, advanced security studies could be drawn up to set up sophisticated mechanisms (e.g. policy based key generation) to better secure the PDEs and to guarantee full policy enforcement. However, this could be considered as less important when the claim is simply about helping and reminding users to respect each other's privacy or to comply with local privacy regulations. This last point is a real issue, especially in cloud environments, where the PDE paradigm can satisfy significant requirements. In this direction, our ongoing research focuses on such issues in order to apply the PDE to cloud computing environments.
REFERENCES
[1] N. Surendra and A.G. Peace, A conceptual analysis of group privacy in the virtual environment, International Journal of Networking and Virtual Organisations, Vol. 6, n. 6, pp. 543-557, 2009.
[2] L. Gelman, Privacy, Free Speech, and 'Blurry-Edged' Social Networks, Boston College Law Review, Vol. 50, No. 5, 2009.
[3] M. Casassa Mont, S. Pearson and P. Bramhall, Towards Accountable Management of Identity and Privacy: Sticky Policies and Enforceable Tracing Services, Technical Report HPL-2003-49, HP Laboratories, March 2003.
[4] D.W. Chadwick and S.F. Lievens, Enforcing “Sticky” Security Policies throughout a Distributed Application, ACM Workshop (MidSec 2008), Leuven, Belgium, December 1-5, 2008.
[5] D.W. Chadwick and K. Fatema, An advanced policy based authorisation infrastructure, DIM '09, Proc. 5th ACM workshop on Digital identity management, November 13, 2009, Chicago, Illinois, USA.
[6] P. Kodeswaran and E. Viegas, A Policy Based Infrastructure for Social Data Access with Privacy Guarantees, In Proceedings of the IEEE International Symposium on Policies for Distributed Systems and Networks, July 21, 2010.
[7] A. Aghasaryan, M.P. Dupont, S. Betgé-Brezetz and G.B. Kamga, Privacy Data Envelopes for Moving Privacy-sensitive Data, W3C Workshop on Privacy and data usage control, MIT, Cambridge (MA), USA, October 2010.
[8] M. Ghorbel, A. Aghasaryan, M.P. Dupont, S. Betgé-Brezetz, G.B. Kamga, and S. Piekarec, A multi-environment application of Privacy Data Envelops, demonstration paper, IEEE Int. Symposium on Policies for Distributed Systems and Networks, Pisa, Italy, June 6-8, 2011.
[9] F. Beato, M. Kohlweiss and K. Wouters, Enforcing Access Control in Social Network Sites, http://www.cosic.esat.kuleuven.be/publications/article-1240.pdf.
[10] P. Kodeswaran and E. Viegas, Towards A Privacy Preserving Policy Based Infrastructure for Social Data Access To Enable Scientific Research, Eighth Annual International Conference on Privacy Security and Trust, Ottawa, August 17-19, 2010.
[11] G. Waters, J. Wheeler, A. Westerinen, L. Rafalow, and R. Moore, Policy framework architecture, Internet-draft, IETF, Network Working Group, Feb 1999.
[12] How to build an Office COM add-in by using Visual C# .Net, http://support.microsoft.com/kb/302901/EN-US/
[13] P. Kumaraguru, L. Cranor, J. Lobo, and S. Calo, A survey of privacy policy languages, Workshop on Usable IT Security Management (USM 07), In SOUPS '07: Proceedings of the 3rd symposium on Usable privacy and security (New York, NY, USA, March 2007).
[14] W3C: The Platform for Privacy Preferences 1.0 (P3P 1.0), Technical Report, 2002.
[15] B. Parducci and H. Lockhart, eXtensible Access Control Markup Language (XACML) Version 3.0, http://docs.oasis-open.org/xacml/3.0/xacml-3.0core-spec-cs-01-en.pdf.
[16] SUN XACML Open source implementation, http://sunxacml.sourceforge.net/.
[17] B. Carminati, E. Ferrari, and A. Perego, Enforcing access control in Web-based social networks, ACM Trans. Info. Syst. Sec. 13, 1, Article 6 (October 2009), 38 pages. DOI = 10.1145/1609956.1609962 http://doi.acm.org/10.1145/1609956.1609962
[18] Scramble software, http://www.primelife.eu/results/opensource/65scramble
[19] uProtect.it software, http://uprotect.it/index.
[20] Jive Software, http://www.jivesoftware.com.