Addressing Privacy: Matching User Requirements with Implementation Techniques

Evangelia Kavakli1, Christos Kalloniatis1 and Stefanos Gritzalis2

Abstract--The paper considers the basic privacy requirements, namely anonymity, pseudonymity, unlinkability and unobservability, and how these requirements can be linked with related system implementation techniques, thus guiding design decisions during system development.

Index terms--privacy, privacy requirements, privacy enhancing technologies

I. INTRODUCTION

Privacy as a social and legal issue has traditionally been the concern of social scientists, philosophers and lawyers. However, the extended use of modern IT systems and communication networks in several daily activities (e.g. banking, shopping, communicating with friends, collaborating with government authorities) sets additional requirements for protecting the electronic privacy of individuals. Indeed, on the way to a Global Information Society, with different national and international programmes aiming at the further development of data highways, there are multiple privacy risks.

In the last decade most Internet users have become seriously concerned about their privacy, since they can see how easily their personal data can be exposed. According to a survey presented in Business Week (1998), privacy and anonymity are the fundamental issues of concern for most Internet users, ranked above issues like ease of use, spam mail, security and cost. According to the same survey, 78 percent of online users would use the Internet more and 61 percent of non-users would start using the Internet if privacy policies and practices were disclosed. This fact is nowadays being exploited to attract more Internet users: according to a Federal Trade Commission study, only 16 percent of the surveyed sites had any privacy statement or policy in 1998, while by 2000 this percentage had increased to 65.7 percent.

Besides being a contemporary data highway on which the global information society may be built, the Internet is known for many security risks. Thus, the vast development of new information infrastructures might lead to a vulnerable information society based on insecure technologies. In the information society, privacy cannot be sufficiently protected solely by legislation. Privacy should also be enforced by appropriate technologies, and this is the subject matter of this paper.

The work presented is structured as follows. Firstly, the basic privacy requirements that should be considered during system design and development are analysed. Secondly, the privacy implementation techniques that realize these requirements are presented. The paper concludes with a critical analysis of the above techniques. The aim of this analysis is twofold: (a) to understand the coverage of the area and (b) to understand the best fit for purpose of different privacy implementation techniques.
II. DEFINING PRIVACY

A review of current research in the area of user privacy highlights the path for user privacy protection in terms of four privacy requirements, namely anonymity, pseudonymity, unlinkability and unobservability [1,5]. By addressing these requirements one aims to minimize or eliminate the collection of user identifiable data. Subsections A to D below give the definition of each of these privacy requirements as found in the relevant literature.

A. Anonymity

J. C. Cannon in [5] describes anonymity as the state of being anonymous or virtually invisible; having the ability to operate online without being tracked. S. Fischer-Hübner in [1] presents anonymity as the ability of a user to use a resource or service without disclosing his/her identity. A formal definition of anonymity is given by A. Pfitzmann in [7]. Let $R_U$ denote the event that an entity U (e.g. a user) performs a role R during an event E. Let A denote an attacker and $NC_A$ the set of entities that are not cooperating with A. An entity U is called anonymous in role R for an event E against an attacker A if, for each observation B that A can make, the following relation holds:

\[ \forall U' \in NC_A:\quad 0 < P(R_{U'} \mid B) < 1 \tag{1} \]
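As a small worked illustration of the relation above, the following Python sketch checks the condition for a finite set of non-cooperating entities. It assumes, purely for illustration, that exactly one entity performed the role, that the attacker's prior beliefs and the likelihoods P(B | R_U') are known, and that the observation B has non-zero probability. The names `posterior` and `is_anonymous` are hypothetical and do not come from [7].

```python
from typing import Dict


def posterior(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Bayes' rule: P(R_U' | B) for each entity U', given P(R_U') and P(B | R_U').

    Assumes exactly one entity performed the role and that P(B) > 0.
    """
    joint = {u: prior[u] * likelihood[u] for u in prior}
    total = sum(joint.values())  # P(B)
    return {u: j / total for u, j in joint.items()}


def is_anonymous(post: Dict[str, float]) -> bool:
    """True if 0 < P(R_U' | B) < 1 holds for every non-cooperating entity U'."""
    return all(0.0 < p < 1.0 for p in post.values())


# Three non-cooperating entities that the attacker considers equally likely a priori.
prior = {"u1": 1 / 3, "u2": 1 / 3, "u3": 1 / 3}

# An observation that fits every entity to some degree keeps all of them anonymous.
ambiguous = posterior(prior, {"u1": 0.5, "u2": 0.3, "u3": 0.2})

# An observation that only u1 could have caused identifies u1 and excludes the others.
revealing = posterior(prior, {"u1": 1.0, "u2": 0.0, "u3": 0.0})

print(is_anonymous(ambiguous))
print(is_anonymous(revealing))
```

In this sketch the relation fails as soon as any posterior reaches 0 or 1, that is, as soon as the attacker can either exclude or pinpoint an entity with certainty.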
The outcome of the above definitions is that anonymity serves the purpose of hiding personally identifiable information when there is no need to reveal it. Browsing the Internet only to collect information is one of many cases in which anonymity plays a significant role and must be attained.

B. Pseudonymity

Pseudonymity is the user’s ability to use a resource or service by acting under one or many pseudonyms, thus hiding his/her real identity. However, under certain circumstances the possibility of translating pseudonyms to real identities exists. Pseudonyms are aliases for a user’s real identity. Users are allowed to operate under different aliases. Nevertheless, the user’s real identity is revealed when he/she acts unlawfully. Pseudonymity has characteristics similar to anonymity in that the user is not identifiable, but he/she can be tracked through the aliases he/she uses [5]. Pseudonymity is used for protecting the user’s identity in cases where anonymity cannot be provided (e.g. if the user has to be held accountable for his/her activities) [1]. In [8] a classification of pseudonyms according to their degree of protection is presented.

C. Unlinkability

As J. C. Cannon states in [5], unlinkability expresses the inability to link related information. In particular, unlinkability is achieved when an attacker is unable to link specific information with the user that processes that information. Unlinkability can also be achieved between a sender and a recipient. In that case unlinkability means that, though the sender and recipient can both be identified as participating in some communication, they cannot be identified as communicating with each other. A. Pfitzmann in [7] addresses unlinkability in the following formal way. Let $X_{E,F}$ denote the event that events E and F have a corresponding characteristic X. Two events E and F are unlinkable in regard of a characteristic X for an attacker A if, for each observation B that A can make, the probability that E and F are corresponding in regard of X given B is greater than zero and less than one:

\[ 0 < P(X_{E,F} \mid B) < 1 \tag{2} \]

The ability to link transactions could give a stalker an idea of a user’s daily habits, or an insurance company an idea of how much alcohol a family consumes over a month. Ensuring unlinkability is therefore vital for protecting the user’s privacy.

D. Unobservability

Unobservability protects users from being observed or tracked while browsing the Internet or accessing a service. Unobservability is similar to unlinkability in that the attacker aims to reveal the user’s identifiable information, but does so by observing rather than by linking the information he/she retrieves. In [7] a formal representation of unobservability is stated as follows. An event E is unobservable for an attacker A if, for each observation B that A can make, the probability of E given B is greater than zero and less than one:

\[ 0 < P(E \mid B) < 1 \tag{3} \]

1 Department of Cultural Technology and Communication, University of the Aegean, Harilaou Trikoupi & Faonos Str., 81100 Mytilene, Greece { kavakli, ch.kalloniatis}@
2 Information and Communication Systems Security Laboratory, Department of Information and Communications Systems Engineering, University of the Aegean, 83200 Samos, Greece [email protected]

III. PRIVACY ENHANCING TECHNOLOGIES

This section provides an overview of the software mechanisms, tools, protocols and services that have been designed for protecting the user’s privacy. An extended comparison of these architectures can be found in [6].

A. Anonymizer

One of the best-known Web anonymity tools is the Anonymizer, a service that submits HTTP requests on behalf of its users. The main concept introduced by the Anonymizer tools is the existence of a third-party Web site [9], which acts as a middle layer between the user and the site to be visited. When the user wishes to visit a specific Web page, instead of establishing a direct link to the required Web server, he/she does so through the Anonymizer Web site. Having established the connection, Anonymizer forwards the information received from the desired Web site back to the user.

Anonymizer is extremely easy to use but it has some inherent vulnerabilities. The first is that users must trust the Anonymizer service, because the Anonymizer server monitors all the Web sites visited by the user, thus collecting information about his/her behaviour. The second vulnerability is that Anonymizer does not provide any assurance for the communication between the user’s machine and the Anonymizer server. Finally, no anonymity is provided when the user invokes “helper applications” like Real Audio, since these applications bypass the server and establish their own Net connections.

B. Crowds

Crowds is an anonymity agent based on the idea that people can be anonymous when they blend into the crowd. Crowds enables the retrieval of information over the Web without revealing so much potentially private information to several parties. The goal of Crowds is to make browsing anonymous, so that information about either the user or what information he/she retrieves is hidden from Web servers and other parties [11]. Specifically, Crowds collects Web users into a geographically diverse group, called a “crowd”, that performs Web transactions on behalf of its members. Each user is represented in the crowd through a process running on his/her local machine called a “jondo”. According to [10], jondos act as intermediaries between the Crowds users and a Web server. When a user starts his/her jondo, an automatic procedure is triggered through which the local jondo is informed about the current members of the crowd and vice versa. If this “negotiation” procedure is successful and the jondo is accepted as a member of the crowd, it can issue requests to Web servers, ensuring that its identity is revealed neither to the Web servers nor to any other crowd member [6].

C. Onion Routing

Onion Routing is a general-purpose infrastructure for private communications over a public network. It provides anonymous connections that are strongly resistant to both eavesdropping and traffic analysis [12, 13]. Onion Routing consists of two parts: the network infrastructure that accommodates the anonymous connections and a proxy interface that links these connections to unmodified applications. The network contains a set of onion routers. Each router is a store-and-forward device that accepts a number of fixed-length messages from numerous sources, performs cryptographic transformations on the messages and forwards them to the next destination in a random order. Proxies act as interfaces between the applications and the network infrastructure. The functions of a proxy can be split into two: the part that links the initiator to the anonymous connection and the part that links the anonymous connection to the responder. After the initiator contacts his/her proxy, Onion Routing proceeds with the following steps: define the route, construct the anonymous connection, move data through the anonymous connection, destroy the anonymous connection.
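The layered cryptographic transformations that give Onion Routing its name can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the cipher is a hash-based XOR keystream that is not secure, the initiator is assumed to already share one symmetric key with each router, and messages are not padded to a fixed length. The function names (`keystream_xor`, `build_onion`, `peel`) and the router names are hypothetical and are not taken from [12, 13].

```python
import hashlib
from typing import Dict, List, Tuple


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256-derived keystream (NOT secure)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(4, "big")).digest())
        counter += 1
    return bytes(b ^ k for b, k in zip(data, stream))


def build_onion(route: List[str], keys: Dict[str, bytes],
                destination: str, payload: bytes) -> bytes:
    """Wrap the payload in one layer per router; the first router's layer is outermost."""
    next_hops = route[1:] + [destination]  # what each router learns after peeling
    onion = payload
    for router, next_hop in zip(reversed(route), reversed(next_hops)):
        header = next_hop.encode()
        layer = bytes([len(header)]) + header + onion
        onion = keystream_xor(keys[router], layer)
    return onion


def peel(router: str, keys: Dict[str, bytes], onion: bytes) -> Tuple[str, bytes]:
    """Remove one layer: the router learns only the next hop and an opaque inner onion."""
    layer = keystream_xor(keys[router], onion)
    header_len = layer[0]
    next_hop = layer[1:1 + header_len].decode()
    return next_hop, layer[1 + header_len:]


# Hypothetical three-router route; each router shares one key with the initiator.
keys = {"R1": b"key-one", "R2": b"key-two", "R3": b"key-three"}
route = ["R1", "R2", "R3"]
onion = build_onion(route, keys, "responder", b"GET /index.html")

hops = []
blob = onion
for router in route:
    next_hop, blob = peel(router, keys, blob)
    hops.append(next_hop)
```

Each router in this sketch sees only the identity of the next hop, which is the property that makes it hard for a single router to link the initiator with the responder.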
D. DC-Nets

A DC-Net (Dining Cryptographers Network), proposed in [14,15], allows participants to send and receive messages anonymously in an arbitrary network. It can be used to provide perfect sender anonymity. A DC-Net is based on binary superposed sending. Each user station generates at least one key bit for each message bit and sends each key bit to exactly one other user station over a secure channel. For each single sending step, every user station adds modulo 2 (superposes) all the key bits it shares and its message bit, if there is one. Stations that do not wish to transmit a message send zeros by outputting the sum of their key bits (without any inversions). The sums are sent over the net and added up modulo 2, and the result is distributed to all user stations. Because every key bit is added twice, the key bits cancel out and the result is the sum of all sent message bits. Thus, if exactly one participant transmits a message, the message is delivered to each participant as the result of the global sum.

E. Mix-Nets

The Mix network technique, originally introduced in [16] and further discussed in [17], realises unlinkability of sender and recipient as well as sender anonymity against the recipient and, optionally, recipient anonymity [1]. A mix is a network station which collects a number of messages of equal length from the senders, discards repeats, changes their encodings, and forwards the messages to the recipients in a different order. When only one mix is used, malicious users cannot identify the relation between sender and recipient, but the mix and the sender can. A chain of independent mix stations is therefore used for improving security. With a chain of mixes, a global attacker who has access to all communication lines can only trace a message through the mix network if he/she has the cooperation of all mix nodes on the path or if he/she can break the cryptographic operations that the mix stations use. As long as at least one mix in the chain is trustworthy, unlinkability of sender and recipient is ensured.

F. Hordes

Hordes is a protocol that engages multiple proxies for routing a packet to a responder in an anonymous way, while it uses multicast services for routing the reply to the initiator, again in an anonymous way. The Hordes architecture is based on several proxies called jondos, which are logically positioned between a Hordes user and a Web server. Each user’s request is forwarded, through a number of jondos, to the destination Web server. While Hordes has the same characteristics as Crowds (see section III.B), the functionality of the two is significantly different [6]. Hordes has been designed for gathering many participants in the same reception group. This is achieved because Hordes takes advantage of the anonymity characteristics and performance benefits inherent in IP multicast routing. Multicast addresses are not associated with any particular device attached to the network; they are just labels that refer to the receivers of the group as a whole. The number of hosts acting as receivers of a multicast routing tree, as well as their status, is dynamic and unknown to routers and hosts. Hordes utilises multicast communication for the reverse path of anonymous connections, achieving not only anonymity but also sender unlinkability and unobservability. A detailed description of Hordes is given in [18].

G. GAP

GNUnet’s Anonymity Protocol GAP [19] is a recently presented protocol that claims to achieve anonymous data transfers. GNUnet is a peer-to-peer network and its framework provides peer discovery, link encryption and message batching. GAP has been designed for securing GNUnet. It aims to hide the identity of an initiator and a responder from all other entities, including GNUnet routers, active and passive adversaries, and the responder or initiator respectively [6]. The communication between all network nodes is confidential; no host outside the network can observe the actual data travelling across the network. Data types are not identifiable, since all packets are padded in order to have identical size. The most significant difference between GAP and prior mix-based protocols is that traditional mix-based protocols always perform source rewriting at each hop, whereas GAP mixes can specify a return-to address other than their own, thereby allowing the network to route replies more efficiently. However, GAP’s greatest weakness is that it has been customised to the functionality of a peer-to-peer network.

H. Tor

Tor, presented in [20], is an architecture based on the Onion Routing architecture described above. However, Tor has many improvements over the original Onion Routing. Firstly, Tor satisfies perfect forward secrecy by using an incremental path-building design, where the initiator negotiates session keys with each successive hop in the circuit. Once these keys are deleted, subsequently compromised nodes cannot decrypt old traffic. Secondly, Tor uses a standard SOCKS proxy interface, allowing users to run most TCP-based programs without modification. Tor also improves efficiency and anonymity by multiplexing multiple TCP streams along each circuit, whereas Onion Routing builds a separate circuit for every application-level request. In addition, Tor addresses traffic bottlenecks, which none of the other well-known approaches do, by allowing the nodes at the edges of the network to detect congestion or flooding and send less data until the congestion subsides.

Another improvement is Tor’s end-to-end integrity checking. Any node in the network has the ability to alter data before sending them to the next one, and Onion Routing did no integrity checking on data. Tor verifies the integrity of data before they leave the network by performing an end-to-end integrity check. Finally, Tor provides an integrated mechanism for responder anonymity via location-protected servers. Clients negotiate rendezvous points to connect with hidden servers. While Onion Routing included “reply onions” that could be used to build circuits to a hidden server, these onions did not provide forward security and became useless if any node in the path went down or rotated its keys. In Tor’s architecture reply onions are no longer required.
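To make the binary superposed sending used by DC-Nets (subsection D of this section) more concrete, the following Python sketch simulates single sending steps. It is a minimal illustration under stated assumptions: every pair of stations shares exactly one fresh key bit per step, the secure key exchange is modelled simply by a shared dictionary, and at most one station sends at a time. The function names `dc_net_round` and `send_message` are hypothetical and do not come from [14,15].

```python
import secrets
from typing import Dict, List, Optional, Tuple


def dc_net_round(n_stations: int, sender: Optional[int] = None,
                 message_bit: int = 0) -> Tuple[List[int], int]:
    """Run one sending step and return (local sums, global sum modulo 2)."""
    # Assumption: each pair of stations shares one secret key bit for this step.
    key_bits: Dict[Tuple[int, int], int] = {
        (i, j): secrets.randbits(1)
        for i in range(n_stations)
        for j in range(i + 1, n_stations)
    }

    local_sums = []
    for station in range(n_stations):
        total = 0
        for pair, bit in key_bits.items():
            if station in pair:
                total ^= bit  # XOR is addition modulo 2
        if station == sender:
            total ^= message_bit  # only the sender superposes a message bit
        local_sums.append(total)

    # Every key bit appears in exactly two local sums, so it cancels out here.
    global_sum = 0
    for value in local_sums:
        global_sum ^= value
    return local_sums, global_sum


def send_message(bits: List[int], n_stations: int, sender: int) -> List[int]:
    """Transmit a message bit by bit; every station can compute the returned result."""
    return [dc_net_round(n_stations, sender, bit)[1] for bit in bits]


received = send_message([1, 0, 1, 1, 0], n_stations=5, sender=2)
```

The key bits are random in every run, yet the global sum always equals the sent bit, while each individual local sum on its own looks random to an observer who does not know the shared key bits.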
Table 1 presents a summary of the basic privacy requirements and the related implementation techniques that satisfy these requirements. It is obvious that all of the techniques mentioned in section III have as a primary concern the satisfaction of the sender’s anonymity. Most of these techniques aim to satisfy users that often communicate over insecure public networks and the Internet. The majority of Internet users wish to browse web sites anonymously and to sacrifice their anonymity only when it is necessary for using the services offered (e.g. e-banking services). Of course, there are always enterprises that decide to take the next step and offer, to a considerable degree, services that provide not only anonymity but also unlinkability and unobservability to users. Nowadays, many intruders learn perfectly how to use new technologies against unsuspecting users, especially when the latter try to carry out a bank transaction over the Internet. Users’ trust in e-services rises when enterprises provide not only perfect anonymity but also untraceability, unlinkability and unobservability services. Technologies and protocols like Onion Routing, Tor, Hordes, GAP and Mix-Nets promise such services.

Table 1. Matching Privacy Requirements with Implementation Techniques (columns: Anonymizer, Crowds, Onion Routing, DC-Nets, Mix-Nets, Hordes, GAP, Tor; Y: Yes / N: No)

Despite the fact that most of the above techniques satisfy the majority of user privacy requirements, another criterion must also be considered by designers and developers when selecting the most suitable implementation technique: the cost criterion. Providing anonymity through Crowds is a lot cheaper than providing it through Tor. Of course, the Tor architecture provides more services and in a more reliable and robust way. Designers and developers should first decide what kind of services their enterprise will offer and how they will be offered, and then choose the appropriate implementation technique.

It should be noted that pseudonymity is not included in Table 1 since, as mentioned previously, pseudonymity is provided mainly when anonymity is not allowed, and most of the aforementioned techniques provide services for anonymity. Of course, techniques that provide anonymity can easily provide pseudonymity as well. Additional techniques that are widely used for providing user pseudonymity include:

• Hash Functions
• Biometrics
• Public Encryption Keys
• Certificates
• Trusted Third Parties

V. CONCLUSIONS

This paper presents the basic privacy requirements necessary for protecting users’ privacy, as well as the existing implementation techniques that satisfy these requirements. Analysis of the above techniques shows that most of the implementation techniques aim at satisfying the anonymity and pseudonymity requirements, whereas only three of the discussed techniques consider all basic privacy requirements. Nowadays, in a fully networked society, privacy is seriously endangered and cannot be sufficiently protected by privacy legislation or privacy codes of conduct alone. Data protection commissioners are therefore demanding that privacy requirements should also be technically enforced and that privacy should be a design criterion for information systems.
VI. REFERENCES

[1] Fischer-Hübner, S.: IT-Security and Privacy, Design and Use of Privacy Enhancing Security Mechanisms. Lecture Notes in Computer Science, Vol. 1958. Springer-Verlag, Berlin Heidelberg New York (2001)
[2] Kalloniatis, C., Kavakli, E., Gritzalis, S.: Security Requirements Engineering for e-Government Applications: Analysis of Current Frameworks. DEXA EGOV’04 Conference, Lecture Notes in Computer Science, Vol. 3183. Springer-Verlag, Berlin Heidelberg New York (2004) 66-71
[3] Loucopoulos, P., Kavakli, V.: Enterprise Knowledge Management and Conceptual Modelling. Lecture Notes in Computer Science, Vol. 1565. Springer-Verlag, Berlin Heidelberg New York (1999) 123-143
[4] Loucopoulos, P.: From Information Modelling to Enterprise Modelling. In: Information Systems Engineering: State of the Art and Research Themes. Springer-Verlag, Berlin Heidelberg New York (2000) 67-78
[5] Cannon, J.C.: Privacy, What Developers and IT Professionals Should Know. Addison-Wesley (2004)
[6] Gritzalis, S.: Enhancing Web privacy and anonymity in the digital era. Information Management and Computer Security, Vol. 12, No. 3. Emerald Group Publishing Limited (2004) 255-288
[7] Pfitzmann, A.: Diensteintegrierende Kommunikationsnetze mit teilnehmerüberprüfbarem Datenschutz. Informatik-Fachberichte 234. Springer-Verlag, Berlin Heidelberg New York (1990)
[8] Pfitzmann, B., Waidner, M., Pfitzmann, A.: Rechtssicherheit trotz Anonymität in offenen digitalen Systemen. Datenschutz und Datensicherheit (DuD) No. 6 (1990) 243-253 (Part 1), No. 7 (1990) 305-315 (Part 2)
[9] Anonymizer, available at
[10] Reiter, M.K., Rubin, A.D.: Crowds: Anonymity for Web Transactions. ACM Transactions on Information and System Security, Vol. 1, No. 1 (1998) 66-92
[11] Reiter, M.K., Rubin, A.D.: Anonymous Web Transactions with Crowds. Communications of the ACM, Vol. 42, No. 2 (1999) 32-38
[12] Reed, M., Syverson, P., Goldschlag, D.: Anonymous connections and Onion Routing. IEEE Journal on Selected Areas in Communications, Vol. 16, No. 4 (1998) 482-494
[13] Goldschlag, D., Syverson, P., Reed, M.: Onion Routing for anonymous and private Internet connections. Communications of the ACM, Vol. 42, No. 2 (1999) 39-41
[14] Chaum, D.: Security without identification: Transactions Systems to make Big Brother Obsolete. Communications of the ACM, Vol. 28, No. 10 (1985) 1030-1044
[15] Chaum, D.: The Dining Cryptographers Problem: Unconditional Sender and Recipient Untraceability. Journal of Cryptology, Vol. 1, No. 1 (1988) 65-75
[16] Chaum, D.: Untraceable Electronic Mail, Return Addresses, and Digital Pseudonyms. Communications of the ACM, Vol. 24, No. 2 (1981) 84-88
[17] Pfitzmann, A., Waidner, M.: Networks without user Observability. Computers & Security, Vol. 6, Issue 2 (1987) 158-166
[18] Shields, C., Levine, B.N.: A protocol for anonymous communication over the Internet. In: Samarati, P. and Jajodia, S. (eds.): Proceedings of the 7th ACM Conference on Computer and Communications Security. ACM Press, New York NY (2000) 33-42
[19] Bennett, K., Grothoff, C.: GAP-Practical Anonymous networking. Proceedings of the Workshop on PET2003 Privacy Enhancing Technologies (2003), also available at
[20] Dingledine, R., Mathewson, N., Syverson, P.: Tor: The Second-Generation Onion Router. Proceedings of the 13th USENIX Security Symposium, San Diego, CA, USA (2004)