2012 IEEE 11th International Conference on Trust, Security and Privacy in Computing and Communications
Reputation Diffusion Simulation for Avoiding Privacy Violation
David Pergament / Armen Aghasaryan
Jean-Gabriel Ganascia
Alcatel Lucent Bell Labs 91620 Nozay, France
[email protected] [email protected]
Université Pierre et Marie Curie 75252 Paris, France
[email protected]
supposed friends have divulged these data without asking their consent. For all these reasons, it appears necessary to help individuals define their privacy policy on social networks by warning them about the potential dangers posed by individuals requesting a “friendship” relation or access to a particular content item. This is what motivated the design of FORPS (“Friends Oriented Reputation Privacy Score”). Namely, we have introduced this system to anticipate the propagation of information through social networks by scoring the propensity of individuals to propagate private information [10][13]. The contribution of this paper is the evaluation of the effect of such a scoring mechanism on the dynamics of actual data propagation. We address this problem from two perspectives:
• On one hand, we evaluate the legitimacy of the use of FORPS, and its efficiency in terms of convergence to a state where the users have a correctly estimated opinion of a given requestor.
• On the other hand, we add dynamicity to our system: what happens if the requestor changes? What happens if malicious individuals try to propagate rumors?
By creating a high number of interactions with a simulator, our goal is to validate, anticipate and calibrate the properties of FORPS in such a way that they best ensure its privacy goals, without acting against the requestor. The paper is organized as follows. Section II reviews the related art on reputation scores and diffusion models. Section III presents the model used in the simulation. The first results are then presented in Section IV. We then discuss the limitations of the approach and the research perspectives in Section V, before finally concluding with Section VI.
Abstract—When people expose their private life in online social networks, this does not mean that they do not care about their privacy; rather, they lack tools to evaluate the risks and to protect their data. To address this issue, we have previously designed the FORPS system (Friends Oriented Reputation Privacy Score), which evaluates the dangerousness of people who ask to become our friends by computing their propensity to propagate sensitive information. In this paper, we introduce a multi-agent simulation model that allows evaluating the long-term and large-scale effects of our system based on a high number of interactions between simulated users. We show that, in comparison with a simple decision process, different variants of the FORPS system produce better results in terms of estimation of the requestor’s dangerousness, convergence speed, and resistance to rumor phenomena.
I. INTRODUCTION
Many societal and ethical issues are related to the development of online social networks (OSN). Among them, the risks to privacy protection have often been mentioned. On one hand, users fear that their individual data, such as photos or comments, become public, or that the owners of the social network infrastructure exploit them for their own purposes without taking care of individuals’ rights over those data. On the other hand, social networks continuously reshape their privacy policies, taking into account the criticisms addressed to them and enabling people to define by themselves the degree of visibility of their data. Privacy protection is based on a general principle according to which everyone has the right to fully control his personal data, i.e. to decide what information he/she accepts to reveal, when, and to whom [15]. However, this general principle is difficult to apply on social networks, because it is difficult for a user to know who the persons asking to be his ‘friends’ are and how they usually behave with their already existing friends. In addition, individuals change with time and age. It may then become necessary for them to hide photographs, movies or textual content that corresponded to a part of their previous life. This corresponds to the notion of the “right to forget”, which means that individuals should be able to delete all the personal data they want. However, if we do not pay enough attention, social networks may contain huge quantities of individual data that cannot be erased, especially if their
II. STATE OF THE ART
The technology presented in this paper is related to people scoring, for which several research studies have been carried out and several services are offered. In the domain of e-reputation, one can mention websites such as www.123people.com, which find and aggregate data from different sources on the web and provide information about an individual. Some systems, like Klout¹, measure the popularity of people, for example how much their actions influence others. Another example is eBay’s

978-0-7695-4745-9/12 $26.00 © 2012 IEEE DOI 10.1109/TrustCom.2012.237
¹ http://www.klout.com
mechanism, where users can rate the degree of trust they have in somebody they have dealt with before. There also exist scores used to estimate the likelihood that a person will default on a loan, e.g. the FICO score². However, these systems do not target privacy issues. More related to privacy, one can mention systems that proposed the concept of a privacy score, which can be used to alert users about the visibility and protection of their sensitive data. These approaches are implemented as websites (e.g., Profile Watch³) or as Facebook applications (e.g., ‘Privacy Check’⁴). Liu and Terzi [7] proposed a privacy score on social networking sites. The score is computed by considering two factors, the visibility and the sensitivity of the user’s data. Our privacy reputation score differs from the aforementioned approaches in that it takes different input data and uses a different algorithmic approach for the score computation [10]. Instead of analyzing only the data owner’s private or public data, our approach also considers the particular usage context defined by another user (the data requestor) who requests access to the data owner’s information. This request can be formulated either as a friendship request in a social network or as any other request to access a specific content item of the data owner [13]. The score represents the estimated privacy risk to the data owner if the request is granted. We notice that [8] and [6] also point out that sensitive information exposure can be caused by one’s friends. However, they deal with global profile information (such as age) and, unlike FORPS, they do not take into account textual contents. Finally, multi-agent simulators have been broadly used to simulate diffusion processes over real or online social networks. The way we use multi-agent systems is quite close to how classical diffusion phenomena are dealt with, e.g.
we can consider the diffusion of the requestor's score as an equivalent of innovation diffusion [12]. The authors of [1] have worked on privacy diffusion, but their goal was totally different: they wanted to simulate the migration of people from Myspace to Facebook for privacy reasons. III.
the previous comment contains a “tag” to Delphine’s profile. All of Bob’s friends now have access to the information added by Calvin. Calvin moreover posts a photo of the party on his own profile and tags Bob. All of Bob’s and Calvin’s friends now have access to this photo. In the two previously described examples, Calvin has publicly revealed private information about Bob. Was it a violation of privacy for Bob? Calvin now wants to add Alice (a friend of Bob) as a friend within the online social network. Based on the propensity of Calvin to propagate sensitive data, can we help Alice decide whether to approve such a request? The solution provided by FORPS [10] aims to address this question.
B. FORPS and FORPS+
The basic idea of the FORPS mechanism consists in taking advantage of the overall knowledge present in a social network accessible to a given user (e.g., Alice). The system then tries to estimate the danger that another user (e.g., Calvin) may represent with respect to a non-desirable propagation of Alice’s sensitive data. This can be done by aggregating different sources of information characterizing Calvin’s profile and behavior:
• public profiles of other users available in the social network, or any public data on the web,
• the private profile of Calvin, insofar as it is visible from Alice’s point of view, and, more importantly,
• the information that the friends of Alice (e.g. Bob) possess or have access to concerning Calvin. Examples of such information are likes or comments that Calvin leaves on photos or posts belonging to one of the friends of Alice.
The FORPS system allows Alice to define her privacy sensitivity profile, which is characterized by the themes/categories and object types that are relevant for Alice. For instance, Alice may want only some of her content items concerning a specific topic (e.g. family) to stay within a restricted circle of users, while other topics can be propagated. The same applies to different object types such as posts, photos, videos, etc.
These preferences (called the sensitivity profile) are taken into account by the system to calculate different privacy reputation scores for Calvin per theme and object type, and then to obtain an aggregated score. Semantic analysis techniques are used to identify the categories associated with each content item [11]; we notice that these techniques provide personalized results. The score computation is also based on different behavioral factors characterizing information propagation in social networks, e.g. propagator propensity, information sensitivity, and user popularity. Some factors are quantitative; others are qualitative and rely on sentiment mining techniques [3]. By extension of the FORPS approach, in FORPS+ the scores can also be computed collaboratively: two users who have a high-confidence relation (e.g., good friends) can let the system exchange their privacy scores in order to combine their information about Calvin, so that their computations become more accurate. In other words, FORPS+ has back-
FRIENDS ORIENTED REPUTATION PRIVACY SCORE (FORPS)
A. Motivating scenario
As a typical scenario, we consider real or virtual friends who can communicate and exchange information using an online social network such as Facebook. After having been to a party, Bob posts a status update on his profile: “I feel very good now”. Calvin, a friend of Bob on this social network, adds information to this post:
• a click on the like button,
• a comment which implicitly reveals where the party took place: “The Qweens club with you, your sister and Delphine was amazing!”
² http://www.myfico.com
³ http://www.profilewatch.org
⁴ http://www.rabidgremlin.com/fbprivacy
power-law distribution. In our case, however, as we are interested in the projection of the initial graph onto the requestor’s circle, we do not consider the edges of all the nodes, but only those present in the requestor’s circle. So, we will just ensure the presence of a simpler property, which consists in having a few influential users (with a larger number of edges) and a majority of other users with fewer links. Iteratively, for each member ‘c’ of the requestor’s circle, we randomly choose n other members and connect them to ‘c’. Note that ‘c’ can choose itself, and also that the number of links a node obtains can accumulate across different iterations. Here, ‘n’ could follow a probability distribution, but for practical reasons we set ‘n’ to 3 and still observe the expected properties of the projected network. We observe (Figure 1) that with this simple algorithm, only a few members have a high number of connections, whereas the majority remains with a homogeneous number of connections.
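The circle-instantiation procedure above can be sketched in a few lines (an illustrative Python rendering of ours; the actual model is implemented in NetLogo, and the function and parameter names are hypothetical):

```python
import random
from collections import defaultdict

def instantiate_circle(circle_size=144, n=3, seed=None):
    """Build the projected friendship graph of the requestor's circle.

    For each member c, pick n random members and connect them to c.
    As in the paper, c may pick itself (such a pick simply produces no
    edge), and a node's degree accumulates across iterations.
    """
    rng = random.Random(seed)
    edges = defaultdict(set)
    members = range(circle_size)
    for c in members:
        for other in rng.choices(members, k=n):
            if other != c:
                edges[c].add(other)
                edges[other].add(c)   # undirected graph
    return edges

graph = instantiate_circle(seed=42)
degrees = [len(graph[c]) for c in graph]
# A few influential members collect many links; most keep a small degree.
```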
propagated mechanisms, which allow people to combine their requestor reputation analyses (only the scores, not the content items from which the scores were derived). This extension assumes that the scores have the same semantics for the two users. Namely, as the scores are theme-dependent, FORPS+ takes into account the similarity of the sensitivity profiles; see [10] for more details.

IV. SIMULATION MODEL
The goal of the work described in the following sections is to create a model able to simulate a high number of interactions, in order to validate, anticipate and calibrate the properties of FORPS, in such a way that they best ensure its privacy goals without acting against the requestor. An online social network is modelled as an undirected graph G = (V, E) in which the vertices (V), or nodes, represent the individuals, and the edges (E) represent a finite set of links between the individuals, usually a friendship relation, such that E ⊆ V × V [9]. It can be represented by its symmetric n × n characteristic matrix FS := [fs_{i,j}], where n = |V| and

    fs_{i,j} = 1 if (v_i, v_j) ∈ E, and 0 otherwise.        (1)
The number of friends an individual has corresponds to the degree of the corresponding node.
Figure 1. Degree Distribution in the projected network.
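For illustration, the characteristic matrix of eq. (1) and the node degrees plotted in Figure 1 can be computed as follows (our own sketch, not part of the original implementation):

```python
# Build the symmetric characteristic matrix FS of eq. (1) from an edge
# list, then read each node's degree (number of friends) as a row sum.
def characteristic_matrix(n, edge_list):
    fs = [[0] * n for _ in range(n)]
    for i, j in edge_list:
        fs[i][j] = 1
        fs[j][i] = 1   # undirected graph: FS is symmetric
    return fs

fs = characteristic_matrix(4, [(0, 1), (0, 2), (2, 3)])
degrees = [sum(row) for row in fs]   # node 0 has degree 2
```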
A. The Agents
Our simulation model comprises three types of agents:
1) The requestor. This type is composed of only one agent; let us call it the agent ‘r’.
2) The members ‘c’ of the circle of ‘r’ in a social network. They comprise the friends of ‘r’ as well as people ‘r’ wants as friends (potential future friends) or whose activities ‘r’ wants to be aware of (“subscribers” in Facebook, “circles” in Google+⁵).
3) The rumor launchers ‘m’, users who trigger rumors regarding the requestor ‘r’. These agents have the faculty of not being influenced by other agents. They propagate a message that is opposite to the true nature of ‘r’ (see the following sections). We note that this faculty can, for example, be possessed by ex-friends who have reached a point of no return regarding their negative confidence in ‘r’.
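As a summary, the three agent types could be represented as follows (hypothetical Python types of ours; the simulator itself is a NetLogo model):

```python
from dataclasses import dataclass

@dataclass
class Requestor:
    """The single agent 'r'."""
    real_score: float               # S_r^t(r), its real privacy score

@dataclass
class CircleMember:
    """An agent 'c' in the requestor's circle."""
    score_estimate: float           # S_c^t(r), current opinion of 'r'
    acceptability_threshold: float  # personal friendship threshold

@dataclass
class RumorLauncher:
    """An agent 'm'; immune to influence from other agents."""
    rumor_score: float              # S_m^t(r), opposite of r's true nature
```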
C. The Privacy Score representation
S_c^t(r) represents the score of the requestor ‘r’ at time instance t according to a member ‘c’ of its circle. This score indicates the assumed degree of safeness of the requestor: the higher S_c^t(r) is, the higher the confidence of ‘c’ towards ‘r’; the lower it is, the more ‘c’ considers ‘r’ as dangerous. S_r^t(r) represents the real privacy score of the requestor. As the requestor is the only entity that possesses all the information about it, we use the index ‘r’ (requestor) for this score. Note that by considering that this value exists, we make a strong assumption: we consider that the requestor has a coherent behavior at a given time instance ‘t’, which is moreover regularly reflected in its interactions with other users. All the scores lie between 0 and 100.
D. Initialization phase and Friendship Decision process
At the initialization phase, we define the “real” privacy score of the requestor, S_r^t(r), and randomly initialize a default
B. The Instantiation of the Network
Let us now see how we can represent the behavior of the nodes of an online social network in our simulation model. A good summary of existing approaches to social network modeling is given by [14]. Usually, the degree distribution (the distribution of the number of friendship links) follows a
score estimation S_c^t(r). Then, at each iteration following an interaction with the requestor, the idea each member of its circle has of ‘r’ changes according to the update mechanism described in the following. Each member has its personal acceptability threshold below which its opinion
⁵ As we are dealing with privacy-oriented OSNs, we do not consider public OSNs such as Twitter, with its followers.
where 0 ≤ α ≤ 1. In FORPS+, all the users in a friendship relation with the member ‘c’ who has interacted with ‘r’ will also benefit from the added information (provided that the difference is substantial: S_c^{t+1}(r) − S_c^t(r) ≥ Δ):
over the requestor becomes negative⁶, which results in breaking its ‘friendship’ relation with the requestor. Similarly, when its opinion becomes sufficiently positive (relative to its personal threshold), the agent re-establishes its friendship relation.
E. The Meetings
At each step of the simulation (each iteration), agents move within the simulated 2D plane, starting from their original position and moving in a randomly selected direction with a small step. When an agent ‘c’ is located at the same position as the agent ‘r’, there is a possibility that a direct or indirect information transfer occurs between ‘r’ and ‘c’ (see Section D). This communication event, Com(r,c), is triggered in the simulation model according to the following rule:
    s(r, c) − θ_com > s_threshold        (2)
    ∀c' such that FS_{c,c'} = 1:   S_{c'}^{t+1}(r) := β · S_{c'}^t(r) + (1 − β) · S_c^{t+1}(r)        (4)
where (1 − α) > (1 − β); indeed, the scores people have computed directly have a higher impact, because in this case the requestor’s data are analyzed with more personalized criteria according to the user’s sensitivity profile [10]. The rumor launcher agents have the same power as a requestor: they can influence others (except the requestor himself); mathematically, they behave as a requestor. When a member ‘c’ meets a rumor launcher (i.e., the communication event Com(m,c) is triggered), it increases the amount of information it has related to the requestor:
where s(r,c) represents the strength of the friendship. This value depends on the presence of a friendship relation, fs_{r,c}, as well as on the number of friends ‘r’ and ‘c’ have in common. To trigger Com(r,c), s(r,c) is combined with a random perturbation θ_com and checked against a system-
    S_c^{t+1}(r) := α_m · S_c^t(r) + (1 − α_m) · S_m^t(r)        (5)
Note that, for the sake of simplicity, we have considered here that the rumor is propagated within the FORPS score, although, unlike all the other inputs, it is not a source of information created by the requestor itself.
wise defined threshold, s_threshold. We introduce a negative random perturbation to account for situations where the information transfer is not meaningful with respect to the safeness degree of the requestor. For example, this can happen when the communication relates to themes that are not sensitive for the agent ‘c’ (see Section III). As we want to give a discussion the chance to continue, we need to give our system a short-term memory. The slight and random move policy fits this goal well, as it reinforces the probability of new meetings between users that have met recently.
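Putting together the triggering rule (2) and the update rules (3)–(5), one simulation step can be sketched as follows (our reconstruction; the weight values are illustrative, and the uniform model for θ_com is an assumption):

```python
import random

def com_triggered(friendship_strength, s_threshold=0.5, rng=random):
    """Rule (2): trigger Com(r,c) when s(r,c), diminished by a random
    negative perturbation theta_com, still exceeds the threshold."""
    theta_com = rng.random()          # illustrative perturbation model
    return friendship_strength - theta_com > s_threshold

def forps_update(s_c, s_r, alpha=0.7):
    """Rule (3): move c's estimate toward the requestor's real score."""
    return alpha * s_c + (1 - alpha) * s_r

def forps_plus_update(s_c_prime, s_c_new, beta=0.9):
    """Rule (4): friends of c also absorb the new information, with a
    weaker weight since (1 - alpha) > (1 - beta)."""
    return beta * s_c_prime + (1 - beta) * s_c_new

def rumor_update(s_c, s_m, alpha_m=0.7):
    """Rule (5): a rumor launcher m influences c like a requestor."""
    return alpha_m * s_c + (1 - alpha_m) * s_m
```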
G. Monitors and Indicators
We define the following monitors to depict the evolution of the scores and the number of current friendship relations of the requestor during the simulation:
• The Global Opinion monitor shows the average opinion of the requestor computed over all the members of its circle, see Figure 2 (top).
• The Friendship relation monitor shows the number of the requestor’s friends (in green) and the number of agents in its circle who are not its friends (in red), see Figure 2 (bottom).
Figure 2 illustrates the convergence (i.e. stable global opinion and stable number of friends) that is obtained after 6390 iterations. Note also that the global opinion is quite similar to the real privacy score of the requestor, S_r^t(r) = 67. To evaluate the results of the simulations, we defined several more specific indicators, as follows.
a) Requestor’s dangerousness evaluation error: this is the most important indicator; it measures how far the evaluation score is from the real score. It is the absolute value of the difference between the two scores; the lower, the better.
F. The Information Exchange
When a communication happens, agents exchange information about the requestor. By “interacting”, we have in mind the comment on a status, a ‘like’ on a photo, the tag of an article, etc. Note that in online social networks, information exchange can happen directly (information accessible through the own data of ‘c’) or via a friend in common. Let us suppose now that all the interactions in our simulation are interactions between the requestor and the members of its circle, and that they can represent either a direct exchange or an exchange via common friends (indirect exchange). So, in FORPS, when an interaction occurs between ‘c’ and ‘r’ (i.e. the communication event Com(r,c) is triggered), the new score of ‘c’ regarding ‘r’ gets closer to the real score of ‘r’ by being updated as follows:
    S_c^{t+1}(r) := α · S_c^t(r) + (1 − α) · S_r^t(r)        (3)
b) Convergence speed: during a simulation, agents move, and from time to time a communication occurs with the requestor or with a malicious agent. Each of these simulation steps is considered as an iteration. When the number of friends stops evolving, the simulation is over. The convergence speed indicator is the total number of iterations that took place during the simulation.
⁶ In fact, the peers of alter egos within the two networks possess the same acceptability threshold.
c) Half-life: this is also an indicator of convergence. It gives the iteration at which 50% of the agents in the circle of the requestor are its friends. If the requestor has a low score, the half-life indicator may not exist. Note that this is also the intersection point of the red and the green curves, where the proportions of friends (in green) and non-friends (in red) are equal, see Figure 2. The three indicators represent the average values over the total number of 300 simulations used in our experiment.
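The three indicators could be computed from a simulation trace roughly as follows (illustrative helper functions of ours, not the paper’s code):

```python
def dangerousness_error(estimates, real_score):
    """Absolute gap between the average final estimate and S_r(r)."""
    mean = sum(estimates) / len(estimates)
    return abs(mean - real_score)

def convergence_speed(friend_counts):
    """Iterations until the number of friends stops evolving: index of
    the last change in the per-iteration friend-count series."""
    last_change = 0
    for t in range(1, len(friend_counts)):
        if friend_counts[t] != friend_counts[t - 1]:
            last_change = t
    return last_change

def half_life(friend_counts, circle_size):
    """First iteration at which >= 50% of the circle are friends, or
    None when that point is never reached (low-score requestors)."""
    for t, n in enumerate(friend_counts):
        if n >= circle_size / 2:
            return t
    return None
```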
Figure 3. Diffusion mechanism selection
Figure 3 shows how one can easily select the diffusion mechanism among the three options. In the example of Figure 3, Social Network 1 uses FORPS as its diffusion mechanism, whereas Social Network 2 uses FORPS+.
B. FORPS’ Legitimacy
For each of the 300 tests, the requestor has a fixed real dangerousness value and its circle is composed of 144 individuals; this number is referred to in the literature as the average number of friends of Facebook users [5]. A test has a duration of around 22 seconds (110 minutes in total). We have executed the tests for two different cases of the requestor’s score, a good one (82) and a bad one (55). The mean values of the obtained results are represented in the following tables.
Figure 2. Monitors of our simulation
V. SIMULATION RESULTS
In order to implement our simulation model, we used the multi-agent programmable modeling environment NetLogo [1].
A. NetLogo implementation
We have designed a user-friendly interface which helps to compare the three models: FORPS, FORPS+, and “NO”. “NO” is a simple model where friend acceptance depends only on the number of friends people have in common [2]; the authors of [2] observe that this is the dominant behavior of users in reality. In order to easily simulate the behavior of the system under two sets of model parameters, we have instantiated two equivalent networks (one for each set of model parameters). The simulation tool then illustrates in parallel the evolution of the system under these two sets of conditions, e.g. the FORPS configuration vs FORPS+. Technically speaking, the requestor has two different circles and there are two different families of agents “members of circles” (c1 and c2); however, these two social networks are twins where each member of one social network has its alter ego in the other, and the only difference between the two networks is the configuration of the score propagation mechanisms.
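The baseline “NO” decision process reduces to a single rule; a minimal sketch (the threshold value is our assumption, not taken from the paper):

```python
def no_model_accepts(common_friends, threshold=5):
    """Baseline 'NO' model: accept the friendship request as soon as
    the number of friends in common with the requestor is large
    enough, ignoring any privacy reputation score."""
    return common_friends >= threshold
```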
Requestor’s real score = 82

Comparison Indicators             NO        FORPS       FORPS+
Convergence                       107.13    21422.38    16160.45
Half-life                         82.72     12416.18    8508.88
Dangerousness evaluation error    20.20     1.11        0.96

Table 1. Results of 300 tests for a safe Requestor

Requestor’s real score = 55

Comparison Indicators             NO        FORPS       FORPS+
Convergence                       102.55    16161.58    12814.87
Half-life                         18.26     -           -
Dangerousness evaluation error    45.12     1.20        1.03

Table 2. Results of 300 tests for a dangerous Requestor
1) FORPS versus NO (without FORPS). We observe that the convergence speed of the simple system NO is the best. For example, in case 1, it takes only 107 iterations for NO, whereas FORPS takes 21422 iterations. But
However, we notice that if we exclude from the evaluation the simulations which took much longer than the others, FORPS+ becomes better than FORPS in 93% of the cases.
it often gives absurd results (see Table 2). In fact, the Dangerousness Evaluation Error indicator shows that the scores computed by the NO system differ by 45 points from the real score of the requestor, whereas the scores computed by FORPS differ by only 1 point. This is because the NO system does not take into account the propensity of the requestor to propagate information. Indeed, the requestor may be dangerous, but because its circle members have more and more friends in common with the requestor, they gradually accept him as a friend. We notice that, contrary to the NO system, the half-life indicator does not produce a result in Table 2 for FORPS (nor for FORPS+). This is explained by the fact that, across all the iterations, fewer than 50% of the agents of the requestor’s circle become its friends. 2) FORPS versus FORPS+. Let us now analyze what happens when we compare FORPS and FORPS+. In the example of Figure 4, the purple curve represents FORPS and the green one represents FORPS+.
C. Reactivity to requestor’s change.
In order to confer to the system a “right to forget” feature, we need to amplify the impact of recent activities with respect to older ones. The idea is to capture the latest evolution of the character of the requestor. Let us see the reactions of the system in the case of such evolutions. We can observe in Figure 6 that, unlike the simple strategy, FORPS reacts quite well to these dynamics. Indeed, we see that the Global Opinion gradually becomes coherent with the requestor’s real score. By conferring such a property on our system, one gives the requestor the opportunity to convey another opinion of himself. Note that this “right to forget” property of our system is different from a simple data aging over time. In fact, if the requestor’s real score remains unchanged, nothing changes in the score estimations either.
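The “right to forget” behavior amounts to weighting recent activities more heavily than older ones. One way to sketch such a weighting is an exponential decay (our assumption; the paper does not specify the weighting function):

```python
def recency_weighted_score(activity_scores, decay=0.8):
    """activity_scores: chronological list of per-activity scores for
    the requestor (oldest first). The most recent activity gets weight
    1; each older activity is discounted by `decay` per step of age,
    so old behavior fades from the estimate."""
    ages = range(len(activity_scores) - 1, -1, -1)  # oldest has max age
    weights = [decay ** age for age in ages]
    total = sum(w * s for w, s in zip(weights, activity_scores))
    return total / sum(weights)
```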
Figure 4. FORPS versus FORPS+
These plots show the series of values of the three indicators taken from 100 tests. With respect to the convergence speed indicator, FORPS+ provides better results than FORPS in 86% of the cases (see Figure 5). Note that it is not obvious to determine when a simulation has reached its stationary point (termination of the simulation). In some cases, most of the agents quickly reach their final states, whereas a minority of them converge much later because they have a selective acceptability threshold, which delays the global termination state.
Figure 6. FORPS’ reactions to requestor’s change: t0: S_r^{t0}(r) = 81, t1: S_r^{t1}(r) = 66, t2: S_r^{t2}(r) = 83, t3: S_r^{t3}(r) = 93. (Social Network 1: FORPS, Social Network 2: NO)
D. Reactivity to malicious rumors
An important property we want to obtain is the capability of the system to distinguish between authentic and false information about the requestor. This is especially the case when a rumor appears. Let us take a simple scenario: a leader and six of its active militants AM want to explicitly propagate negative ideas about the requestor: S_r^t(r) = 70 and S_m^t(r) = 30 ∀m ∈ AM. Let us see the reaction of our system in Figure 7.
Figure 5. Convergence speed comparison
Figure 7. Reactions to malicious rumors
Before instant t1, the system has reached a stable state with the FORPS option. At t1, the malicious rumor is triggered and its impact is radical. After only a few iterations, the requestor has inverted the proportion of friends (in green) and non-friends (in red) in its circle. The Global Opinion is also diminished, but remains higher than 50% thanks to the effect of FORPS. At instant t2, we modify the propagation of the privacy score by applying FORPS+ instead of FORPS, and we observe that FORPS+ manages to compensate for the rumor. Indeed, the majority of the members of its circle become its friends again. This experiment is repeated several times (t3, t4, t5, …) and equivalent results are obtained. Contrary to the other phenomena considered in this paper, we do not observe a full convergence, but an oscillatory state, which is nevertheless confined within narrow bounds.
Figure 8. FORPS+ and FORPS’ reactions to rumors (Social Network 1: FORPS, Social Network 2: FORPS+)
VI. DISCUSSIONS
1) Bootstrap Problem: One of the assumed weaknesses of FORPS was linked to the classical bootstrap problem [4][10]. When we do not have any information about the requestor, how do we initiate the process? What score should the system give to the requestor? In this paper, we have tested many initial states, varying from very good to very bad opinions of the requestor. We observe that the initial state has only little influence on the convergence. Indeed, they all lead to the same final state, which allows us to conclude that the bootstrap problem is not an essential issue for the convergence of this system.
E. Combination of rumors with requestor’s changes
Here we trigger the experiments with the two networks in parallel to obtain a synchronous comparison. At t1, the malicious rumor is triggered. We see in Figure 8 that while FORPS loses 20 points between t1 and t1’ (real score 80, Global Opinion 60), FORPS+ loses only 11 points (same real score, but Global Opinion 69). Then, at t2, we modify the requestor’s real score from 80 to 92, which can be seen as a counter-attack on the rumor from the point of view of the requestor. We observe that FORPS+ now manages to pass the half-life point, whereas FORPS does not.
So what is the impact on privacy of an arbitrary initial state attribution? The answer is that, for a while, the opinion people have about the requestor can be erroneous. The danger would be to accept as a friend a potential propagator. One approach could be to adopt a brinkmanship strategy: without a substantial amount of data available on the requestor, always ignore his requests; according to the simulation results, this should not have an impact on the final state of “friendship” with the requestor, it will just take a longer (and safer) path. Another important aspect of the bootstrap problem is that it is not really a marginal problem that would only be relevant at the beginning of the process. In fact, we are always in a context of incomplete knowledge of the requestor. For example, although Bob has long been a virtual friend of Calvin, if the dangerous Calvin nevertheless acts safely in the particular context of a group Bob
belongs to, Bob will not be able to discern the real nature of Calvin. We therefore remain in a bootstrap context, because we always have only a partial view of the requestor’s activity. The convergence and stability we have demonstrated are therefore of particular interest.
and the results derived from the simulations presented in the current paper.
VII. CONCLUSION
The FORPS (Friends Oriented Reputation Privacy Score) system evaluates the dangerousness of people who want to become our friends by computing their propensity to propagate sensitive information. In order to anticipate the long-term and large-scale effects of this system, we have built a multi-agent simulation that models a high number of interactions between users. We have shown that privacy protection based on different variants of the FORPS system produces better results than a simple decision process, in terms of evaluation of the requestor’s dangerousness, convergence speed and resistance to rumor phenomena.
For this reason, in our future research, we envisage to make our simulation model dependent on the percentage of friends in common people have with the requestor, and on the total number of friends the requestor have. 2) The “NO” model: Based on our intuitions and the previous work [2] we have supposed that in a simple process, people accept friend’s requests when they have enough common friends with the requestors. We think however that this model should be completed by introducing the possibility to disconnect a “friendship” relation. We envisage to retrieve data related to the loss of friends over the time, e.g. using tools such as “unfriends”7. 3) Simple Simulation Model for FORSP: One of the advantages of our simulation model is its simplicity. It misses however some aspects of the FORPS process. First, all the interactions we generate are considered faithful to the real privacy score of the requestor. But in real life, even if this score is quite bad, the requestor doesn’t act negatively every time. He has also neutral or positive behaviors. For the moment, we have solved this problem by adding the random perturbation in the event triggering logic (see formula 1). When it gives low number, it means that the discussion was not meaningful from the perspective of the score computation, and so it may be not considered as an interaction. The drawback is that this will never be considered as a positive interaction. In future works we intend to validate the assumption that such a simplified model does not perturb the final state of the estimated score.
REFERENCES [1]
[2]
[3]
[4]
[5]
[6]
[7]
Second, in this simulation we have supposed that the focus is on a single topic. It has simplified our way to take into account the exchanges of scores between users (FORPS+). Everything was considered as meaningful because related to a sensitive topic for everybody. For further works we should introduce themes and give different sensitive profiles to the agents.
[8]
[9] [10]
Third, we should also consider other specificities of the FORPS+ model, e.g. we should not favour the scores from friends who have exactly the same friends in common. Indeed, as their scores were computed by analyzing the same data, they won’t add any new information. 4) Testing with real users: Finally, we envisage testing different variants of FORPS system within a corpus of real users in order to benefit from their feedbacks with respect to both usability aspects as well as the efficiency of different algorithmic parameters we have exploited in the simulated model. This will allow notably to validate the assumptions
7
[11]
[12] [13]
[14] [15] [16]
http:// www.unfriendfinder.com
408
N. Baracaldo, C. Lopez, M. Anwar, M. Lewis. Simulating the effect of privacy concerns in online social networks. Information Reuse and Integration (IRI). IEEE International Conference on Digital Object Identifier (2011). Y. Boshmaf, I. Muslukhov, K.Beznosov, M. Ripeanu. The socialbot network: when bots socialize for fame and money. In Proceedings of the 27th Annual Computer Security Applications Conference (2011) A. Esuli, F. Sebastiani. SentiWordNet: A publicly available lexical resource for opinion mining. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (2006) T. Gediminas, T. Alexander. Toward the next generation of recommender systems: A Survey of the State-of-the-Art and Possible Extensions. IEEE Transaction on Knowledge and Data Engineering 17(6), pp.734—749 (2005) S.A. Golder, D.M. Wilkinson, and B.A. Huberman. Rhythms of social interaction: Messaging within a massive online network. 3rd International Conference on Communities and Technologies (2007). P. Gundecha, G. Barbier and H. Liu. Exploiting Vulnerability to Secure User Privacy on a Social Networking Site. In the 17th ACM SIGKDD (2011). K. Liu, and E. Terzi. A Framework for Computing the Privacy Scores of Users in Online Social Networks. In ACM Transactions on Knowledge Discovery from Data (2010) D. Massad. Herd Privacy: Modeling the Spillover Effects of Privacy Settings on Social Networking Sites. The Computational Social Science Society of the Americas (2011). P. Mika. Social Networks and the Semantic Web. volume 5 of Semantic Web and Beyond Computing for Human Experience (2007) D. Pergament, A. Aghasaryan, J. Ganascia, and S. Betge-Brezetz. FORPS: Friends-Oriented reputation privacy score, in Proceedings of ACM/IEEE International Workshop on Security and Privacy Preserving in e-Societies (2011) D. Ramage, D. Hall, R. Nallapati, C.D Manning. Labeled LDA: A supervised topic model for credit attribution in multi-label corpora. EMNLP (2009). E. Rogers. 
Diffusion of innovations. Glencloe (1962) Y. Wang, A. Aghasaryan, A. Shrihari, D. Pergament, G. B. Kamga, S. Betg e-Brezetz. Intelligent Reactive Access Control for Moving User Data. The Third IEEE International Conference on Information Privacy, Security, Risk and Trust (2011) S. Wasserman, K. Faust. Social Network Analysis: Methods and Applications. Cambridge: Cambridge University Press (1994). A. Westin. Privacy and Freedom. Atheneum, New York. (1967). U. Wilensky. NetLogo. Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL (1999)