2011 IEEE International Conference on Privacy, Security, Risk, and Trust, and IEEE International Conference on Social Computing
Towards a Game Theoretical Model for Identity Validation in Social Network Sites

Anna Cinzia Squicciarini
Information Sciences and Technology, Pennsylvania State University
E-mail: [email protected]

Christopher Griffin
Applied Research Laboratory, Pennsylvania State University
E-mail: [email protected]

Smitha Sundareswaran
Information Sciences and Technology, Pennsylvania State University
E-mail: [email protected]
Abstract—Social sites frequently ask for rich sets of user identity properties before granting access. Users are free to leave some of these requests unanswered, or may choose to submit fake identity properties, so as to reduce the risk of identification, surveillance or observation of any kind. However, this freedom has led to serious security and privacy incidents [23], due to the role users' identities play in establishing social and privacy settings. In this paper, we take a step toward addressing this open problem by analyzing the dynamics of social identity verification protocols. We use a game theoretical framework to describe a simple two-player general sum game that captures the behavior of a server system (like Facebook) that provides utility to users. Users can choose to register a new identity using true information, false information or no information (and remain anonymous). Likewise, the server may choose to believe and add the prospective social user, believe and yet fail the registration, or do nothing. We show criteria on the relative payoff of providing no information (anonymity) that produce various Nash equilibria. We then show that in the presence of a binding agreement to cooperate, most players will agree to share information. This result is consistent with reality, and suggests that sites that require users to authenticate with identity information should be prepared to provide strong guarantees on privacy in order to ensure that a social contract is maintained and the sites are not damaged. To the best of our knowledge, this is the first time an analytical model is developed to study the dynamics underpinning users' registration in social media.
I. INTRODUCTION

The increasing number of users involved in online activities, from gaming to social interactions and e-commerce, has resulted in a proliferation of digital identities. A digital identity corresponds to the profile information or attributes associated with a given user (also known as identifiers), which uniquely identify him/her within a certain domain. In order to create digital identities, users typically undergo a simple registration, during which they disclose a more or less detailed set of social and personal traits, such as name, location, age, preferences, contact information, etc. Research has shown that users' information disclosure is the result of the competing influences of exchange benefits and two types of privacy beliefs: privacy protection belief and privacy risk belief. Individuals are more likely to disclose personal information if risks can be offset by benefits. Perceived usefulness of a service has a positive impact on online users' behavioral intention to disclose their personal information. In contrast, for service providers, rich user profiles have a significant economic value that constitutes an incentive for increasing the amount of user information requested [20], [14].

Despite the advanced authentication mechanisms available today, little, if any, verification of profile attributes' authenticity and/or validity is actually performed when users register on social sites such as Facebook, Picasa, Twitter, etc. Users can easily claim to be somebody else, enter completely wrong data, or generate fake accounts. On the one hand, the ability to generate unverified accounts is crucial to preserving users' privacy over the Internet. On the other hand, the correctness of some fields may be essential for correct service provisioning, for instance a home address for parcel mailing. Further, the ability to remain anonymous has led to several security and privacy incidents, such as misused identities, compromised accounts, and sybil attacks [23]. Some of these incidents have had severe consequences, facilitating cyber crimes such as identity theft, stalking, blackmailing, spamming, etc. These issues have been confirmed by Mozelle Thompson (Facebook Chief Privacy Advisor) before the Australian Parliament's cyber-safety committee in 2011 [32]. Thompson has made public that about 20,000 users are kicked off Facebook every day for various infractions, including lying about their age. He declared that, while the social network has mechanisms to detect liars, they are not perfect, even though Facebook takes safety very seriously and removes numerous accounts every day for activities including spamming, posting inappropriate content, and violating age restrictions. When dealing with identity verification, the main challenge is to enable users to maintain their privacy, while ensuring that they do not misuse their profiles by posing as someone else, as a member of another demographic group, or as belonging to another age group. Accounts that contain spoiled data not only lead to ineffective access control mechanisms, but also render users' privacy settings that are based on others' identities intrinsically meaningless [10], [5]. Our work is motivated by the gap between these two perspectives: one that suggests flexible, privacy-preserving authentication models to facilitate users' social interactions, and the other that suggests that misuse should not remain commonplace. To this end, we study whether or not social site providers can reach a dynamic decision based on explicit factors, and what type of trade-offs are to be achieved.
To formally reason about user and server behavior, we use techniques from game theory [24], which provide a rich set of mathematical abstractions and frameworks suitable for a quantitative treatment of identity validation and related trust problems. In particular, we study multi-party decision making with conflicting interests to formalize the user/server interaction at the time of joining. Our objective is to gain additional insights on the possible strategies that servers (i.e., social site providers) and users can take in order to validate users' identities, and to develop algorithms that address existing and expected digital identity problems. To achieve this goal, we describe a simple two-player general sum game whose payoffs define the behavior of two players: a service provider like Facebook that offers utility to users, and the users. Users can choose to authenticate with the server using true information, false information or no information (stay anonymous). Likewise, the server may choose among the strategies: believe and add, believe and deny, and do nothing. We show criteria on the relative payoff of providing no information for both the user (providing anonymity) and the server that produce various sets of Nash equilibria. We then demonstrate that in the presence of a binding agreement to cooperate, most players will agree to share information. This is an important result, which proves the intuitive notion that sites that require users to authenticate with identity information should provide strong guarantees on privacy in order to ensure that a social contract is maintained and the sites are not damaged. Further, we show that despite the incentives provided by the binding agreement, a small population of users may prefer to remain anonymous, as is actually the case in most social network sites [5]. To the best of our knowledge, this is the first time a formal analysis of the user-server interaction in the authentication process is provided. All non-trivial results are accompanied by formal proofs.

The remainder of this paper is organized as follows. In Section II we discuss related work, while in Section III we define critical terms and provide initial lemmas necessary for the proof of our primary results. In Section IV we provide our game model for the server and user. Section V includes Subsection V-A, where we provide results on the competitive two player game, and Subsection V-B, where we discuss the cooperative game that arises when the server and users agree to be bound by an arbitrated contract (ensuring certain payoffs corresponding to privacy and security). We provide conclusions in Section VI.
II. RELATED WORK

Digital identity constitutes one of the building blocks of the World Wide Web for activities ranging from social networking to e-commerce [31]. A variety of digital identity and trust management mechanisms have been developed to satisfy the emerging needs [33], [3], [11]. However, there has been little work on the topic of digital identity validation and trust in the context of social computing. While various studies have explored identity sharing behavior in social network sites and the risk of over-exposure ([11], [29], [2] are some notable examples), very few scholars [27], [18], [33], [21] have investigated users' authentication on social computing sites. In [27], [18], the authors present personal-knowledge-based authentication for social sites and explore its applicability in Facebook-like communities, while a cryptographic approach for user authentication over the Web was proposed in [33]. Yet, none of these works studied how to maximize the gain for both users and the social site provider, nor do they address the question of whether and how it is convenient to validate the information provided by the users. The only work analyzing social identities using analytical tools is from Alpcan and colleagues [4]. Alpcan's work focuses on reputation and trust, where strategies are defined in terms of opinion, quantified through a simple cost function. As we discuss in the next sections, our focus is on validation and individual preferences, rather than community effects. Complementary to the body of work on authentication and identification is the body of work on anonymity in social network sites [30], [7], [22]. The emphasis in these works is, however, on algorithmic approaches for non-disclosure and anonymity preservation, rather than on actual revelation. Analytical models for various security topics based on game, information and decision theories are rapidly growing in interest [16]. In particular, game theoretic approaches to reputation and trust first emerged in the economics literature ([12] is a typical example) and were then applied to online settings [25], [1], [20]. Closely related to our work is the work from Salim and colleagues [13]. Salim and colleagues also focus on authorization, but the emphasis is not on social sites. Rather, the authors focus on classic authorization systems in closed environments. As a result, the intent of the users (and therefore their payoff structure) is substantially different from that of social site users: the payoff to users providing false or no information can be expected to be substantially higher when they are falsely authenticated, while the loss to the server in such situations is substantially greater relative to the damage from denying a user access. Another economically inspired work dealing with users' privacy is discussed by Papadimitriou and colleagues [20], who propose a precise estimate of the value of the private information disclosed by a set of individuals, and a compensation for such information release that may induce users to release richer information. Yet, the model applies to a different set of applications, such as online surveys and e-commerce applications, rather than social sites. Finally, parallel to this body of work is the work on reputation [35], [17], [15], [19]. Reputation of digital identities and trust in online environments have been investigated by multiple research communities, ranging from computer science [26] to economics [8], [28].
III. MATHEMATICAL PRELIMINARIES

In this section, we present some basic notions that are relevant for our analysis and design. Specifically, we define a bimatrix game, its mixed strategy space, and the associated payoff functions.
Definition 1 (Bimatrix Game). A bimatrix game is a tuple G = (P, Σ, A, B) where:
1) P = {P1, P2} is the set of players;
2) Σ = Σ1 × Σ2 is the strategy space, where Σ1 = {σ_1^1, . . . , σ_m^1} and Σ2 = {σ_1^2, . . . , σ_n^2} are the strategy sets for Players 1 and 2; and
3) A, B ∈ R^{m×n} are payoff matrices such that element (i, j) of A (resp. B) is the payoff to Player 1 (resp. 2) when the strategy pair (σ_i^1, σ_j^2) ∈ Σ1 × Σ2 is played.
The following definitions and lemmas hold over any bimatrix game G = (P, Σ, A, B).

Definition 2 (Mixed Strategy Space). The set
\[
\Delta_m = \left\{ [x_1, \ldots, x_m]^T \in \mathbb{R}^{m \times 1} \;:\; \sum_{i=1}^{m} x_i = 1,\; x_i \geq 0,\; i = 1, \ldots, m \right\} \tag{1}
\]
is the mixed strategy space in m dimensions for Player P1. The space Δ_n for Player P2 is defined similarly, and Δ = Δ_m × Δ_n is the mixed strategy space for the game.

Definition 3 (Mixed Strategy Payoff Function). Let x ∈ Δ_m and y ∈ Δ_n be mixed strategies for Players P1 and P2, respectively. Then the mixed strategy payoff functions for the players are:
\[
u_1(\mathbf{x}, \mathbf{y}) = \mathbf{x}^T A \mathbf{y} \tag{2}
\]
\[
u_2(\mathbf{x}, \mathbf{y}) = \mathbf{x}^T B \mathbf{y} \tag{3}
\]
This notation is adapted from [34].

Definition 4 (Strict (Weak) Dominance). A mixed strategy x ∈ Δ_m for Player P1 strictly dominates another strategy y ∈ Δ_m if for all mixed strategies z ∈ Δ_n we have:
\[
u_1(\mathbf{x}, \mathbf{z}) > u_1(\mathbf{y}, \mathbf{z}) \tag{4}
\]
Dominance is said to be weak if u_1(x, z) ≥ u_1(y, z) for all z ∈ Δ_n and, for at least one z ∈ Δ_n, u_1(x, z) > u_1(y, z). A similar definition holds for the strategies of Player P2.

Definition 5 (Dominated Strategy). A strategy x ∈ Δ_m for Player P1 is said to be weakly (strictly) dominated if there is a strategy y ∈ Δ_m that weakly (strictly) dominates x. A similar definition holds for Player P2.

Definition 6 (Nash Equilibrium). A strategy pair (x∗, y∗) ∈ Δ_m × Δ_n is a Nash equilibrium if and only if:
\[
\mathbf{x}^{*T} A \mathbf{y}^{*} \geq \mathbf{x}^{T} A \mathbf{y}^{*} \quad \forall \mathbf{x} \in \Delta_m \tag{5}
\]
\[
\mathbf{x}^{*T} B \mathbf{y}^{*} \geq \mathbf{x}^{*T} B \mathbf{y} \quad \forall \mathbf{y} \in \Delta_n \tag{6}
\]
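As an illustration, the objects defined above can be checked numerically. The following minimal Python/numpy sketch uses an arbitrary 2×2 example game (purely illustrative; it is not the registration game of Section IV) to evaluate the mixed strategy payoffs of Equations (2)-(3) and to test pure-strategy dominance and the pure Nash conditions of Definition 6:

```python
import numpy as np

# Illustrative 2x2 bimatrix game (not the registration game of Section IV).
A = np.array([[3.0, 0.0],
              [5.0, 1.0]])   # Player 1 payoffs
B = np.array([[3.0, 5.0],
              [0.0, 1.0]])   # Player 2 payoffs

def u1(x, y):
    """Mixed strategy payoff to Player 1, Eq. (2): u1(x, y) = x^T A y."""
    return x @ A @ y

def u2(x, y):
    """Mixed strategy payoff to Player 2, Eq. (3): u2(x, y) = x^T B y."""
    return x @ B @ y

def strictly_dominated_rows(A):
    """Pure strategies of Player 1 that are strictly dominated by another
    pure strategy (row-wise comparison; cf. Lemma 2 below)."""
    m = A.shape[0]
    return [i for i in range(m)
            if any(np.all(A[j] > A[i]) for j in range(m) if j != i)]

def is_pure_nash(A, B, i, j):
    """Check the Nash conditions (5)-(6) for the pure strategy pair (i, j)."""
    return A[i, j] >= A[:, j].max() and B[i, j] >= B[i, :].max()

x = np.array([0.5, 0.5])
y = np.array([0.25, 0.75])
print(u1(x, y), u2(x, y))
print(strictly_dominated_rows(A))   # row 0 is strictly dominated by row 1
print(is_pure_nash(A, B, 1, 1))     # the pair (1, 1) is a pure Nash equilibrium
```

The same row and column comparisons are used, for pure strategies, in the lemmas that follow.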
Lemma 1. For every player Pi ∈ P, if strategy j is strictly dominated, then in any Nash equilibrium the probability that Player Pi plays strategy j is 0.

Lemma 2. If A_{i·} < A_{j·} for some j ∈ {1, . . . , m}, then for Player 1, strategy i is strictly dominated. Similarly, if for some j (j ∈ {1, . . . , n}), B_{·i} < B_{·j}, then for Player 2, strategy i is strictly dominated.

IV. MODEL

A. Modeling registration as a game

Our study of identity validation focuses on symmetric and centralized digital identity systems. We design a simple bimatrix game, played by a prospective social user and the social computing platform (i.e., the server), represented by a system administrator. For simplicity, each user is associated with a single identity and is symmetric in her properties. In other settings, e.g., e-commerce, users could instead be divided into two main groups (buyers and sellers). The game aims to exemplify the typical user account creation process in an online community: the user enters a number of identity attributes to create a "social profile" through a Web form, and the server evaluates the user's entered Web profile over a finite time interval. While filling in the form, the user can reveal true information, reveal false information, or withhold as much information as possible. The server, in turn, can choose to validate (or believe) the information disclosed and accept the user within the community, validate the information disclosed and yet decide to reject the user (e.g., the user does not meet the Web site access criteria), or do nothing (e.g., deny with no information). Denying certain content with limited or no explanation can prevent individuals from attempting to game the server. As an explanatory example, suppose that the server provides only adult content and requires answering a series of personal questions to allow access, including the user's date of birth. Suppose that the user naïvely enters the true date of birth. The server will deny access if the user is underage, but it is safer not to specify the reason; otherwise the user could simply change his date of birth and gain access to the site. We focus on the interesting scenario of a user who prefers to withhold her identifying information and a server that would rather not invest validation resources on users, as it best represents the tension between users' privacy and social disclosure. This scenario provides the most challenging situation in a two-player game, i.e., the players have complementary goals (joining the online community and accepting a new user) yet competing ones (limiting information disclosure vs. obtaining truthful data). Other scenarios, in which the user is willing to disclose as much information as needed to be validated, are less interesting from a design perspective, but could be the object of future work.

B. Formal Model

Formally, we consider the following two-player bimatrix game G = (P, Σ, A, B), where: P = {Server, User}, Σ = Σ1 × Σ2 with Σ1 = {BA, BD, N} and Σ2 = {T, F, N}. A denotes the payoff matrix for the server and B the payoff matrix for the user. In general, the numerical values of the payoff matrices will be functions of a set of parameters that capture different features of this interaction. Some examples include security
standards enforced by the system administrator, disclosure requirements, quality of service gained, reputation, or consequences of non-disclosure. In what follows, we describe a few representative factors that we used to construct the payoff functions in this work. We consider the construction of the server's payoff first. We assume that the choice made by the server when accepting a new user involves three key factors, summarizing the costs and the related benefits associated with validating and therefore accepting a new user (and vice versa). The first factor, denoted as φ, is the convenience of data validation. This factor quantifies whether it is convenient for the server to spend resources to validate the user's data. By validating the data, we mean that the server performs certain checks to verify the veracity of the data provided by the user. For example, the server could check whether the provided location is reflective of the IP address of the user's machine, or whether the email is from the domain indicated by the user (e.g., a .edu domain). The convenience of this operation depends on the user's expected behavior. If the server chooses to validate and the validation is not successful, the server will waste resources. If the user does not release information, the server will not have anything to validate. If the user tells the truth, the server will invest resources in a profitable way. We use the following values to represent these three options:
• -1: the operation will not be convenient;
• 0: the server will not consume resources;
• 1: the validation operation will be profitable.

The second factor, denoted as δ, quantifies whether the choice maintains a conservative policy in accepting new users, and hence may apply stringent requirements to the users' entered data and their validation. δ takes the following values:
• -1 to indicate that the server will be conservative;
• 0 to indicate that the server will be neutral;
• 1 to indicate that the server will not be conservative.

The third factor, denoted as ξ, is the risk of losing an honest user. This factor quantifies whether the strategic choice made by the server may cause the loss of an honest user. ξ takes the following values:
• -1 to indicate that the server will detect an unfair user;
• 0 to indicate that the server will gain an honest user;
• 1 to indicate that the server will lose an honest user.

These three factors are used in our study to calculate the Server Payoff Function:
\[
\text{SPayoff}(\phi, \delta, \xi, S, U) = \phi(S, U) - \delta(S, U) - \xi(S, U) \tag{7}
\]
The values of the three factors vary depending on the strategy played by the Server (S) according to the possible behavior of the User (U), where S belongs to ΣS = {BA, BD, N} and U belongs to ΣU = {T, F, N}. We compute the corresponding Server Payoff function values as follows. Here, the server's strategies are the rows of the matrix, while the user's strategies are the columns.
\[
A = \begin{bmatrix} 2 & -2 & -1 \\ 1 & -1 & 0 \\ -a & -a & -a \end{bmatrix} \tag{8}
\]
Each value of the function is computed according to Equation (7). For example, the first value of the matrix is obtained as SPayoff(φ, δ, ξ, BA, T) = 1 − (−1) − 0. Similarly, the value −2 is obtained by computing SPayoff(φ, δ, ξ, BA, F), where φ(BA, F) = −1, δ(BA, F) = 1 and ξ(BA, F) = 0. The remaining values are easily computed in the same manner. We also introduce the parameter a in the Server Payoff Matrix to better investigate the non-disclosure option. Precisely, the parameter a represents the indifference of the server to disclosing validation information. We distinguish among the following cases: a > 0 means that the server prefers to disclose validation information to the user, while a ≤ 0 means that the server prefers not to do anything at the end of information validation; in particular, the server would prefer not to disclose why a user was rejected from the site. This is particularly relevant, e.g., in the case where an age limit is in effect.

We now turn our attention to the user's payoff. To model the behavior of a user when interacting with a server which requires the disclosure of personal information, we rely on the following three key factors: α, β and γ. The first factor, denoted as α, aims to capture the fact that information disclosure in the online environment is subject to perceptions about the fairness of information disclosure, pertaining to whether the collection of certain information and its subsequent usage by the server are fair relative to the context of the interaction. Low fairness may cause consumers to withhold personal information even if benefits override the contemporary privacy costs [14]. For simplicity, α is modeled as a two-valued variable, where −1 indicates that the user does not believe in the server's fairness, and 1 the opposite (the user trusts the server). The second factor, denoted as β, represents the user's interest in the service offered by the server. This dimension denotes the social pressure on the user to enter the network. β is a boolean variable that assumes either value 0 or 1, denoting respectively the case of a user who is indifferent and the case of a user who is interested in the service offered. Finally, we use the variable γ to model the user's perceived privacy level. The perceived privacy level is an important factor that drives the interaction and depends on two aspects: control over intrusion and control over disclosure. Here, control over intrusion measures the perceived control against unauthorized access, while control over disclosure refers to the ability of users to control their information and resource disclosure. We model γ as a two-valued variable, similar to α and β, for consistency. Hence, γ takes either the value −1, to indicate that the user is not concerned about his privacy or the lack thereof, or the value 1, to indicate that the user is concerned. These three factors are used to calculate the User's Payoff Function:
\[
\text{UPayoff}(\alpha, \beta, \gamma, S, U) = \alpha(S, U) + \beta(S, U) - \gamma(S, U) \tag{9}
\]
We obtain the following matrix:
\[
B = \begin{bmatrix} 1 & 2 & 3+b \\ -1 & 0 & b \\ -1 & 0 & b \end{bmatrix} \tag{10}
\]
The values of the “do nothing” action are left parametric, and provide a way for us to study the impact of the payoff for providing no information on the Nash equilibrium solutions and the Nash bargaining solution for the game G. In particular, we distinguish the following cases: b > 0 means that the server is tolerant of the non-disclosure of information by the user, while b ≤ 0 means that it prefers to force the user to release personal information, even false information, rather than obtain no information at all.
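To summarize the formal model, the two payoff matrices can be generated programmatically. The short Python sketch below is only an illustration of our reading of Equations (7)-(10): it hard-codes the final matrices A(a) and B(b) and checks the two server entries whose factor values (φ, δ, ξ) are given explicitly in the text:

```python
import numpy as np

# Strategy orderings used throughout: server rows = (BA, BD, N), user columns = (T, F, N).
SERVER = ("BA", "BD", "N")
USER = ("T", "F", "N")

def server_payoff_matrix(a):
    """Server payoff matrix A of Eq. (8); a models the server's attitude
    towards disclosing validation information."""
    return np.array([[ 2, -2, -1],
                     [ 1, -1,  0],
                     [-a, -a, -a]], dtype=float)

def user_payoff_matrix(b):
    """User payoff matrix B of Eq. (10); b models the server's tolerance
    of non-disclosure by the user."""
    return np.array([[ 1,  2, 3 + b],
                     [-1,  0, b    ],
                     [-1,  0, b    ]], dtype=float)

def spayoff(phi, delta, xi):
    """Server payoff function of Eq. (7)."""
    return phi - delta - xi

# Sanity checks for the two entries whose factor values are stated in the text:
# SPayoff(BA, T) = 1 - (-1) - 0 = 2 and SPayoff(BA, F) = -1 - 1 - 0 = -2.
A = server_payoff_matrix(a=1.0)
assert A[0, 0] == spayoff(phi=1, delta=-1, xi=0)
assert A[0, 1] == spayoff(phi=-1, delta=1, xi=0)
print(A)
print(user_payoff_matrix(b=1.0))
```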
V. RESULTS

In this section, we discuss the results of our game. We first consider the non-cooperative general sum game, followed by an analysis of the case in which bargaining is introduced.

A. Results on the Competitive Game

The analysis of the non-cooperative general sum game presented in the previous section provides us with some important insights on the possible outcome of the interaction between user and service provider. First, let Player 1 be the server and Player 2 be the user in the two player general sum game G with payoff matrices given in Expressions (8) and (10). We note that B_{·1} < B_{·2}, and thus Strategy T is strictly dominated by Strategy F for Player 2 by Lemma 2. Therefore, according to Lemma 1, Player 2 will never play Strategy T. Applying this dominance argument produces two new game matrices:
\[
A' = \begin{bmatrix} -2 & -1 \\ -1 & 0 \\ -a & -a \end{bmatrix} \qquad B' = \begin{bmatrix} 2 & 3+b \\ 0 & b \\ 0 & b \end{bmatrix}
\]
A similar argument applies to A' and B'. We note that A'_{1·} < A'_{2·} and thus Player 1 will never play strategy BA. The result is a reduced matrix game with only two strategies for each player:
\[
A'' = \begin{bmatrix} -1 & 0 \\ -a & -a \end{bmatrix} \qquad B'' = \begin{bmatrix} 0 & b \\ 0 & b \end{bmatrix}
\]
This reduction leads to the following proposition.

Proposition 3 (Game's strategies). Consider the two player general sum game G with payoff matrices given in Expressions (8) and (10). Let Player 2 be the user and Player 1 be the server. Then the following hold:
1) In any Nash equilibrium, Player 2 will never play Strategy T.
2) In any Nash equilibrium, Player 1 will never play Strategy BA.

The above proposition confirms the intuition that, given the contrasting priorities and goals of both players, the truthful strategies will never be played. In other words, if the server and user do not trust each other's practices, the most selfish behavior prevails, even if cooperating would lead to a higher gain for both players. This is a classic Prisoner's Dilemma problem [9]. As we discuss in more detail in the following section, the probability of these strategies being adopted can be corrected by introducing the notion of bargaining.

To further investigate the interplay of the strategies of our game, we study the effect of the value of a, the parameter representing the server's "do nothing" option.

Proposition 4 (Effect of parameter a). Consider the two player general sum game G with payoff matrices given in Expressions (8) and (10). The following two results hold:
1) If a < 0, then Player 1 will always play N in every Nash equilibrium.
2) If a > 1, then Player 1 will always play BD in every Nash equilibrium.

Reasoning from the reduced game matrices, if a < 0, then strategy N strictly dominates strategy BD for Player 1, and consequently Player 1 will only play strategy N in any Nash equilibrium. Similarly, if a > 1, then strategy BD strictly dominates strategy N for Player 1, and consequently Player 1 will only play strategy BD in any Nash equilibrium. Intuitively, Proposition 4 shows that if the server is indifferent to missing profile data (notice the negative sign of a in Expression (8)), it will prefer to do nothing rather than take on the burden of validating the user (or the risk of accepting her without validating her data) for the partial data submitted. This often happens for servers provisioning services (e.g., identity management services) which incur a high cost for non-truthful users but lose little with the loss of a prospective user. Therefore, in these contexts, it is best for the server not to validate any information given by the user, because the validation does not influence its response. On the other hand, if the lack of input data is an obstacle to service provisioning, validating the partial data entered by the user will be a waste of the server's resources. Hence, in both scenarios, the only reasonable action in case of missing data is to deny access.
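The reduction behind Proposition 3, and the a-dependent dominance used in the argument above, can be reproduced mechanically. The following sketch is illustrative only (the parameter values a = 0.5 and b = 1 are arbitrary choices with 0 < a ≤ 1 and b > 0); it performs iterated elimination of strictly dominated pure strategies, i.e., the row/column test of Lemma 2:

```python
import numpy as np

def eliminate_strictly_dominated(A, B, rows, cols):
    """Iteratively remove pure strategies that are strictly dominated by
    another pure strategy (the row/column test of Lemma 2)."""
    changed = True
    while changed:
        changed = False
        # Row i of A is dominated if some other row j has A[j, k] > A[i, k] for all k.
        for i in range(len(rows)):
            if any(np.all(A[j] > A[i]) for j in range(len(rows)) if j != i):
                A, B = np.delete(A, i, axis=0), np.delete(B, i, axis=0)
                rows = rows[:i] + rows[i+1:]
                changed = True
                break
        # Column i of B is dominated if some other column j has B[k, j] > B[k, i] for all k.
        for i in range(len(cols)):
            if any(np.all(B[:, j] > B[:, i]) for j in range(len(cols)) if j != i):
                A, B = np.delete(A, i, axis=1), np.delete(B, i, axis=1)
                cols = cols[:i] + cols[i+1:]
                changed = True
                break
    return A, B, rows, cols

a, b = 0.5, 1.0   # illustrative parameter values with 0 < a <= 1 and b > 0
A = np.array([[2, -2, -1], [1, -1, 0], [-a, -a, -a]], dtype=float)
B = np.array([[1, 2, 3 + b], [-1, 0, b], [-1, 0, b]], dtype=float)
Ar, Br, rows, cols = eliminate_strictly_dominated(A, B, ["BA", "BD", "N"], ["T", "F", "N"])
print(rows, cols)   # T and BA are always eliminated; what else remains depends on a and b
print(Ar)
print(Br)
```

For these parameter values the procedure terminates with the single cell (BD, N), matching case 4) of Proposition 5 below.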
Next, we study the effect of parameter b (representing the user's "do nothing" option) on the Nash equilibria of our model.

Proposition 5 (Effect of parameter b). Consider the two player general sum game G with payoff matrices given in Expressions (8) and (10). The following results hold:
1) If b > 0, then Strategy N strictly dominates all other strategies for Player 2.
2) If b = 0, then there are an infinite number of alternative Nash equilibria.
3) If b < 0, then Strategy F strictly dominates all other strategies for Player 2.
4) Assume b > 0. If 0 < a ≤ 1, then there is a unique Nash equilibrium in which Player 1 plays BD and Player 2 plays N. If a = 0, then there are infinitely many Nash equilibria and in all of them Player 2 plays N.
5) Assume b < 0. If 0 ≤ a < 1, then there is a unique Nash equilibrium in which Player 1 plays N and Player 2 plays F. If a = 1, then there are infinitely many Nash equilibria and in all of them Player 2 plays F.
6) Assume b = 0. If 0 ≤ a ≤ 1, then every Nash equilibrium (x∗, y∗) has the form:
\[
\mathbf{y}^* = \begin{bmatrix} 0 \\ y \\ 1-y \end{bmatrix} \tag{11}
\]
\[
\mathbf{x}^* = \begin{bmatrix} 0 \\ x \\ 1-x \end{bmatrix} \tag{12}
\]
where y ∈ [0, 1] is arbitrarily chosen and x is chosen to solve the linear programming problem:
\[
\max_{0 \leq x \leq 1} \; x(a - y) - a \tag{13}
\]

Proof: To prove (1)-(3), consider the game reduced by Proposition 3, in which strategy BA is never played. If b > 0, then N strictly dominates strategy F for Player 2, and thus Player 2 will only play N in any Nash equilibrium. If b < 0, then F strictly dominates strategy N, and Player 2 will only play F in any Nash equilibrium. Finally, if b = 0, then for any mixed strategy (0, y, 1 − y) ∈ Δ3 with y ∈ [0, 1], the payoff to Player 2 is 0. Thus, Player 2 can play any mixed strategy satisfying these criteria and the result will provide Player 2 her maximum value.

To prove (4), we note that from (1) we know that Player 2 must play N. This causes our payoff matrices to reduce to:
\[
A''' = \begin{bmatrix} 0 \\ -a \end{bmatrix} \qquad B''' = \begin{bmatrix} b \\ b \end{bmatrix}
\]
Since 0 < a ≤ 1, it follows that strategy BD strictly dominates strategy N for Player 1, and thus the unique Nash equilibrium must be that Player 1 plays strategy BD, while Player 2 plays strategy N. In the case when a = 0, it does not matter whether Player 1 plays Strategy BD or N: each will return the same payoff, and thus any mixed strategy of the form (0, x, 1 − x) with x ∈ [0, 1] will maximize the payoff to Player 1. In each case, Player 2 plays N.

The proof of (5), that if b < 0 and 0 ≤ a < 1 then the unique Nash equilibrium has Player 1 playing N and Player 2 playing F, follows by a similar argument, as does the fact that when a = 1, all strategies of the form (0, x, 1 − x) will maximize the payoff to Player 1, while Player 2 plays F in every Nash equilibrium.

Lastly, if b = 0, then Player 2 can play any strategy of the form (0, y, 1 − y) ∈ Δ3 with y ∈ [0, 1] and her payoff will always be 0. In this case, if Player 1 plays strategy (0, x, 1 − x) ∈ Δ3 (with x ∈ [0, 1]), the payoff to Player 1 can be computed as:
\[
\begin{bmatrix} 0 & x & 1-x \end{bmatrix}
\begin{bmatrix} 2 & -2 & -1 \\ 1 & -1 & 0 \\ -a & -a & -a \end{bmatrix}
\begin{bmatrix} 0 \\ y \\ 1-y \end{bmatrix}
= x(a - y) - a \tag{14}
\]
Thus, any Nash equilibrium solution will necessarily have the form given by Equations (11) and (12), and Player 1 will solve the linear programming problem (13). This ends the proof.

The above result demonstrates that when the user considers not disclosing any information a valuable option, the server's best action is not to do anything, that is, to neither accept nor confirm the user in the network (e.g., send a "more information is needed" message). On the other hand, if b is negative, the user will resort to sending false data, since she has no incentive to reveal truthful information. The next remark highlights an interesting phenomenon observed in Proposition 5.

Remark 6 (Interesting case). In the case when b = 0 and 0 ≤ a ≤ 1 in the previous proposition, an interesting fact occurs. If (a − y) < 0, then setting x = 0 will maximize the objective function of Problem (13) and Player 1 will necessarily play strategy N. If (a − y) > 0, then the opposite will hold: x = 1 and Player 1 will necessarily play BD. If a = y (which is possible, as y is chosen arbitrarily), then Player 1 may choose any x ∈ [0, 1], resulting in an infinite set of alternative Nash equilibrium points.

In other terms, if the service is not very costly, the service provider does not lose much by taking no action. In the contrary case, the server tends to be better off denying the user the service (BD), because a denial in such a case may make the user divulge more information.

B. Cooperative Game Results

The game described in the previous section has many of the characteristics of a Prisoner's Dilemma, in that each player would maximize his/her gain if he (or she) played BA and T, i.e., if there were mutual trust. Unfortunately, in the non-cooperative construct, both players are tempted away from this Pareto optimal solution by the attractiveness of antagonistic strategies. This is illustrated in Figure 1 for b = 1 and a = 1. A different result occurs when we consider the role of privacy policies and service level agreements in social sites. Users are often more comfortable sharing data with sites they trust and that maintain properly enforced privacy policies. The role of this pre-agreement can be well represented by a Nash bargaining problem in which the players choose an arbitration scheme x11, . . . , x33, where xij is the binding probability that Player 1 will choose strategy i while Player 2 chooses strategy j. Here the objective is not for each player to make unilateral decisions that maximize his payoff, but for the players to negotiate (or ask an arbiter to negotiate) so that the multi-criterion optimization problem maximizing both players' utility functions is solved. In other words, the best strategy to maximize the gain of both the user and the server is to negotiate information and service conditions, respectively. Without the server, the user may stand to lose an important service, and the server needs users to keep its business successful. These observations are confirmed by the following results. We prove that, while there is a certain portion of the population
who does not wish to provide data to social networks [5], these individuals can be manipulated into agreeing to provide critical information through judicious alteration of their relative payoff values. This argument is consistent with reality, and also shows the importance of enforceable social contracts, since Nash's bargaining theorem is predicated on the condition that the resulting agreement is binding. Thus, sites like Facebook that continuously update their privacy policies, violating an implied social contract, may be encouraging their users to engage in information distortion or hiding, or even to maintain multiple identities online, providing a scenario ripe for sybil attacks.
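Figure 1 can be reproduced numerically. The sketch below is an illustration under the parameter values a = b = 1 referenced above: since u1 and u2 are linear in the arbitration scheme, the cooperative payoff region is the convex hull of the nine pure-strategy payoff pairs, which scipy's ConvexHull recovers; the competitive benchmark is the Nash equilibrium payoff from Proposition 5.

```python
import numpy as np
from scipy.spatial import ConvexHull

# Payoff matrices of Eqs. (8) and (10) for the parameter values of Figure 1.
a, b = 1.0, 1.0
A = np.array([[2, -2, -1], [1, -1, 0], [-a, -a, -a]])
B = np.array([[1, 2, 3 + b], [-1, 0, b], [-1, 0, b]])

# Under a binding arbitration scheme (x_11, ..., x_33), the achievable payoff
# pairs (u1, u2) form the convex hull of the nine pure-strategy payoff pairs.
points = np.column_stack([A.ravel(), B.ravel()])
hull = ConvexHull(points)
print("Vertices of the cooperative payoff region:")
print(points[hull.vertices])

# Competitive benchmark: for b > 0 and 0 < a <= 1, Proposition 5 gives the
# unique Nash equilibrium (BD, N), with payoffs:
print("Nash equilibrium payoffs:", A[1, 2], B[1, 2])   # (0, b)
```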
[Figure 1 appears here; its annotations mark the Pareto frontier, the Nash bargaining point for a = 1, b = 1, and the added payoff in the bargaining problem.]
Fig. 1. Payoff Region: The blue region illustrates the possible payoff values for Players 1 and 2 under competitive strategies. The red region adds the additional payoff combinations possible as a result of cooperation. The Nash bargaining solution in the case when a = b = 1 is shown on the Pareto frontier of the payoff region.

Lemma 7. Let u_1^0 and u_2^0 be the Nash equilibrium payoffs to the players in the general sum game G defined above, with a and b fixed. Then any arbitration scheme is the solution to the non-linear programming problem:
\[
\begin{aligned}
\max \quad & (u_1 - u_1^0)(u_2 - u_2^0) \\
\text{s.t.} \quad & \sum_{i=1}^{m} \sum_{j=1}^{n} A_{ij} x_{ij} - u_1 = 0 \\
& \sum_{i=1}^{m} \sum_{j=1}^{n} B_{ij} x_{ij} - u_2 = 0 \\
& \sum_{i=1}^{m} \sum_{j=1}^{n} x_{ij} = 1 \\
& x_{ij} \geq 0 \quad \forall i, j \\
& u_1 \geq u_1^0, \quad u_2 \geq u_2^0
\end{aligned} \tag{15}
\]
Furthermore, at optimality, (u_1, u_2) is the pair of Pareto optimal payoffs to Players 1 and 2, respectively, that results from cooperation, i.e., from agreeing to a common arbitration scheme x = (x_{11}, . . . , x_{mn}).

Proposition 8. Suppose that
1) a ∈ [0, 1) and
\[
b \leq -\frac{1 + 2a}{2 + a} \tag{17}
\]
or
2) a ≥ 1 and b ≤ −1.
Then x_{11} = 1. That is, both players agree to share all information.

A sketch of the proof is reported below.

Proof: If a ≥ 1, then the Nash equilibrium payoff values will be (−1, 0) by Proposition 5, while if a ∈ [0, 1), then the Nash equilibrium payoff values will be (−a, 0). The full proof makes use of the Karush-Kuhn-Tucker conditions for non-linear programming [6] under the two possible values of (u_1^0, u_2^0) and requires substantially more space than is available. It suffices to argue that for b ≤ 0 and a ≥ 0, the players will agree to play some combination of the strategy pair (BA, T) (the server believes and adds a truthful user) and (BA, N) (the server believes and adds with missing information). Let x be the probability that the players agree to play the strategy pair (BA, T), while 1 − x is the probability that the players agree to play the strategy pair (BA, N). This assumption is reasonable since Player 1 will receive a value between −1 and 2, while Player 2 will receive a value between 1 and 3 + b. Thus, if x ≥ (1/3)(1 − a), Player 1 (the server) is ensured that he will outperform his Nash equilibrium payoff value (clearly, if a ≥ 1, then any x ∈ [0, 1] will ensure that Player 1 does at least as well as he does in competitive play). If a ∈ [0, 1), then the resulting cooperative payoff function is:
\[
\big(2x - (1 - x) + a\big)\big(x + (3 + b)(1 - x)\big) = -6x^2 + 10x + \tfrac{7}{2}bx - 3bx^2 - \tfrac{3}{2} - \tfrac{1}{2}b \tag{16}
\]
Differentiating and solving for x in terms of b yields:
\[
x^* = -\frac{1}{6} \cdot \frac{-11 + ab - 4b + 2a}{2 + b} \tag{18}
\]
We can find the value of b for which x^* = 1, which yields the fact that when
\[
b \leq -\frac{1 + 2a}{2 + a} \tag{19}
\]
then x^* ≥ 1, which is not possible, and so x = 1 is our solution. When a ≥ 1, the cooperative payoff function is:
\[
-6x^2 + 9x + 3bx - 3bx^2 \tag{20}
\]
Proceeding in the same way, we find that
\[
x^* = \frac{1}{2} \cdot \frac{3 + b}{2 + b} \tag{21}
\]
and if b ≤ −1, then x^* ≥ 1, and so x = 1. This ends the proof.

Remark 9. When a = 0 and b = 0 (i.e., there is no added gain from withholding information on the part of either the server or the user), we note that x_{11} = 5/6 and x_{13} = 1/6, suggesting that there are relative payoffs for which the bargaining strategy contains a non-zero probability that the player will wish to remain anonymous (i.e., play strategy N).
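For completeness, the arbitration scheme of Lemma 7 can also be approximated numerically. The sketch below is an illustration only: it assumes the disagreement point (u_1^0, u_2^0) is supplied by the analyst (here the Nash equilibrium payoffs (−a, 0) used in the proof of Proposition 8) and relies on scipy's SLSQP solver, so it returns a local solution of the non-linear program (15):

```python
import numpy as np
from scipy.optimize import minimize

def nash_bargaining(A, B, u1_0, u2_0):
    """Numerically approximate the arbitration scheme of Lemma 7: maximize
    (u1 - u1_0)(u2 - u2_0) over distributions x = (x_11, ..., x_33), where
    u1 = sum_ij A_ij x_ij and u2 = sum_ij B_ij x_ij (program (15))."""
    a_flat, b_flat = A.ravel(), B.ravel()

    def neg_nash_product(x):
        return -(a_flat @ x - u1_0) * (b_flat @ x - u2_0)

    n = a_flat.size
    constraints = [
        {"type": "eq",   "fun": lambda x: x.sum() - 1.0},       # x is a distribution
        {"type": "ineq", "fun": lambda x: a_flat @ x - u1_0},   # u1 >= u1^0
        {"type": "ineq", "fun": lambda x: b_flat @ x - u2_0},   # u2 >= u2^0
    ]
    x0 = np.zeros(n)
    x0[0] = x0[2] = 0.5   # feasible start: equal mass on (BA, T) and (BA, N)
    res = minimize(neg_nash_product, x0, method="SLSQP",
                   bounds=[(0.0, 1.0)] * n, constraints=constraints)
    return res.x.reshape(A.shape)

# Illustrative parameters satisfying condition 1) of Proposition 8
# (a in [0, 1) and b <= -(1 + 2a)/(2 + a)); the disagreement point (-a, 0)
# is the Nash equilibrium payoff pair used in the proof.
a, b = 0.5, -1.0
A = np.array([[2, -2, -1], [1, -1, 0], [-a, -a, -a]], dtype=float)
B = np.array([[1, 2, 3 + b], [-1, 0, b], [-1, 0, b]], dtype=float)
scheme = nash_bargaining(A, B, u1_0=-a, u2_0=0.0)
print(np.round(scheme, 3))   # expect nearly all mass on x_11, i.e. on (BA, T)
```

With a = 0.5 and b = −1, condition 1) of Proposition 8 holds (−(1 + 2a)/(2 + a) = −0.8 ≥ b), so the returned scheme should place essentially all of its mass on x_11.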
VI. CONCLUSION
In this paper, we provide a quantitative model of the server/user interaction in social network sites at the time of registration. We demonstrate that while a server and user may have contrasting goals, we can improve the chances of mutual cooperation by introducing the notions of bargaining and anonymity. Depending on whether or not the server itself values lack of information, anonymity of the user will result in a single dominant strategy for the server. Also, we show that depending on the cost of the service being provided, the server can always choose a dominant strategy. Thus, this model shows that the server can still benefit in terms of cost, even while allowing the user to bargain. We anticipate two major extensions of the current model in our future work. First, we plan to extend our study to include multiple user archetypes, for example by means of Bayesian games. Second, we will analyze how to capture additional factors, such as agent inertia and social pressure, as well as the implications of partial information disclosure. Finally, upon completion of the model, we will conduct experimental studies using simulations and actual users.

Acknowledgement

The work from Squicciarini is partially funded by NSF CNS 0831247. We would like to thank Giuseppe Petracca for his useful suggestions on the paper's final version.

REFERENCES

[1] K. Aberer and Z. Despotovic. On reputation in game theory application on online settings, 2004.
[2] S. Ahern, D. Eckles, N. S. Good, S. King, M. Naaman, and R. Nair. Over-exposed?: privacy patterns and considerations in online and mobile photo sharing. In CHI '07: Proc. of the SIGCHI conference on Human factors in computing systems, pages 357–366, New York, NY, USA, 2007. ACM.
[3] G.-J. Ahn, M. Ko, and M. Shehab. Privacy-enhanced user-centric identity management. In IEEE International Conference on Communications, pages 1–5, 2009.
[4] T. Alpcan, C. Örencik, A. Levi, and E. Savaş. A game theoretic model for digital identity and trust in online communities. In ASIACCS '10: Proc. of the 5th ACM Symposium on Information, Computer and Communications Security, pages 341–344, New York, NY, USA, 2010.
[5] S. Barnes. A privacy paradox: Social networking in the United States. First Monday [Online], 11(9), 2006.
[6] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty. Nonlinear Programming: Theory and Algorithms. John Wiley and Sons, 2006.
[7] S. Bhagat, G. Cormode, B. Krishnamurthy, and D. Srivastava. Privacy in dynamic social networks. In WWW '10: Proc. of the 19th international conference on World Wide Web, pages 1059–1060, New York, NY, USA, 2010. ACM.
[8] G. Bolton, E. Katok, and A. Ockenfels. How effective are online reputation mechanisms? Papers on Strategic Interaction 2002-25, Max Planck Institute of Economics, Strategic Interaction Group, May 2002.
[9] S. J. Brams. Game Theory and Politics. Dover Press, 2004.
[10] H. Bray. Privacy still a nagging concern on Facebook, 2010. http://www.boston.com/business/technology/articles/2010/02/04/privacy_still_a_nagging_concern_on_facebook/.
[11] J. DiMicco and D. Millen. Identity management: multiple presentations of self in Facebook. In GROUP '07: Proc. of the 2007 international ACM conference on Supporting group work, pages 383–386, New York, NY, USA, 2007. ACM.
[12] J. Ely, D. Fudenberg, and D. Levine. When is reputation bad? Games and Economic Behavior, pages 498–526, 2008.
[13] F. Salim, J. Reid, U. Dulleck, and E. Dawson. Towards a game theoretic authorisation model. In Proc. of the 1st IEEE Conference on Decision and Game Theory for Security, 2010.
[14] C. M. Hoadley, H. Xu, J. J. Lee, and M. B. Rosson. Privacy as information access and illusory control: The case of the Facebook news feed privacy outcry. Electronic Commerce Research and Applications, 9(1):50–60, 2010.
[15] T. Hogg and L. Adamic. Enhancing reputation mechanisms via online social networks. In EC '04: Proc. of the 5th ACM conference on Electronic commerce, pages 236–237, New York, NY, USA, 2004.
[16] IEEE. First IEEE Conference on Decision and Game Theory for Security, 2010. www.gamesec-conf.org.
[17] B. Jennings and A. Finkelstein. Digital identity and reputation in the context of a bounded social ecosystem. In Proc. of Business Process Management Workshops, pages 687–697, 2008.
[18] M. Just. Designing and evaluating challenge-question systems. IEEE Security and Privacy, 2:32–39, 2004.
[19] G. Kesidis, A. Tangpong, and C. Griffin. A sybil-proof referral system based on multiplicative reputation chains. IEEE Communications Letters, 13(11):862–864, November 2009.
[20] J. Kleinberg, C. H. Papadimitriou, and P. Raghavan. On the value of private information. In Proc. of the 8th conference on Theoretical aspects of rationality and knowledge, TARK '01, pages 249–257, San Francisco, CA, USA, 2001. Morgan Kaufmann Publishers Inc.
[21] S. Kruk, S. Grzonkowski, A. Gzella, T. Woroniecki, and H.-C. Choi. D-FOAF: Distributed identity management with access rights delegation. In The Semantic Web – ASWC 2006, volume 4185 of Lecture Notes in Computer Science, chapter 15, pages 140–154. Springer Berlin Heidelberg, 2006.
[22] K. Liu and E. Terzi. Towards identity anonymization on graphs. In Proc. of the 2008 ACM SIGMOD international conference on Management of data, pages 93–106, New York, NY, USA, 2008. ACM.
[23] E. Martinez. Alexis Pilkington brutally cyber bullied, even after her suicide, 2010. http://www.cbsnews.com/8301-504083_162-20001181504083.html.
[24] R. B. Myerson. Game Theory: Analysis of Conflict. Harvard University Press, 2001.
[25] K. C. Nguyen, T. Alpcan, and T. Basar. Stochastic games for security in networks with interdependent nodes. CoRR, abs/1003.2440, 2010.
[26] P. Nurmi. A Bayesian framework for online reputation systems. In Advanced International Conference on Telecommunications / International Conference on Internet and Web Applications and Services (AICT-ICIW '06), page 121, February 2006.
[27] A. Rabkin. Personal knowledge questions for fallback authentication: security questions in the era of Facebook. In SOUPS '08: Proc. of the 4th Symposium on Usable Privacy and Security, pages 13–23, New York, NY, USA, 2008.
[28] C. Shapiro. Consumer information, product quality, and seller reputation, 1982.
[29] F. Stutzman. An evaluation of identity-sharing behavior in social network communities. iDMAa Journal, 3(1), 2006.
[30] B. Thompson and D. Yao. The union-split algorithm and cluster-based anonymization of social networks. In ASIACCS '09: Proc. of the 4th International Symposium on Information, Computer, and Communications Security, pages 218–227, New York, NY, USA, 2009.
[31] C. W. Thompson and D. R. Thompson. Identity management. IEEE Internet Computing, 11(3):82–85, 2007.
[32] The Economic Times. Facebook removes 20,000 underage or spam profiles daily, 2011. http://articles.economictimes.indiatimes.com/2011-03-24/news/29181675_1_facebook-users-social-networking-site-cyber-safety.
[33] T. W. van der Horst and K. E. Seamons. Simple authentication for the web. In WWW '07: Proc. of the 16th international conference on World Wide Web, pages 1217–1218, New York, NY, USA, 2007. ACM.
[34] J. W. Weibull. Evolutionary Game Theory. MIT Press, 1997.
[35] P. J. Windley, D. Daley, B. Cutler, and K. Tew. Using reputation to augment explicit authorization. In DIM '07: Proc. of the 2007 ACM workshop on Digital identity management, pages 72–81, New York, NY, USA, 2007. ACM.