Research Article
ClusterRep: A cluster-based reputation framework for balancing privacy and trust in vehicular participatory sensing
International Journal of Distributed Sensor Networks 2018, Vol. 14(9) Ó The Author(s) 2018 DOI: 10.1177/1550147718803299 journals.sagepub.com/home/dsn
Jin Wang, Youyuan Wang, Xiang Gu, Liang Chen and Jie Wan
Abstract In vehicular participatory sensing, vehicles may provide false data or low-quality data. Building trust in vehicular ad hoc networks is an efficient way to deal with this issue. On one hand, vehicles need to disclose necessary information to demonstrate their trustworthiness. On the other hand, vehicles tend to hide their sensitive information to preserve user privacy. Therefore, privacy and trust are conflict in vehicular ad hoc networks. A cluster-based reputation framework named ClusterRep is proposed to balance privacy and trust in vehicular ad hoc networks. In this framework, the cluster head collaborates with cluster members to change pseudonyms and reputation values. The experiments show the scalability and the effectiveness of the ClusterRep compared with Beta strategy and IncogniSense-floor strategy. Keywords Vehicular ad hoc network, participatory sensing, reputation system, clustering, privacy preservation
Date received: 7 December 2017; accepted: 4 September 2018 Handling Editor: Carlos Calafate
Introduction Motivations Participatory sensing is a new paradigm for large-scale collection of user-supplied data from mobile devices. It offers possibility of low-cost sensing of the environment and has gained much attention in recent years. Participatory sensing has a wide range of applications in environmental monitoring,1 intelligent transportation,2–4 and localization.5 In a vehicular ad hoc network (VANET), a vehicular participatory sensing system collects sensing data from distributed participant vehicles. For example, the application Waze4 uses vehicles’ GPS information to obtain real-time road traffic conditions and optimally navigate real-time road congestion for drivers. In addition, Waze recruits community members to add details of the navigation map, for example, the price of a particular gas station and new roads. Vehicles equipped with wireless communication devices and sensors, for example, GPS receivers
and three-dimensional (3D) accelerators, are rich information resources. Hence, the vehicular participatory sensing is promising for various applications, such as traffic control, safety assist, and environmental monitoring. There are two important issues that need to be addressed in vehicular participatory sensing. The first is the user privacy preservation. As vehicles collect data from across wide geographical areas, vehicle ID and spatio-temporal information are always associated with the data uploaded. This leads potential threats to user privacy because the collected sensing data may disclose their locations and trajectories. Some encouraging School of Computer Science and Technology, Nantong University, Nantong, P.R. China Corresponding author: Jin Wang, School of Computer Science and Technology, Nantong University, Nantong 226019, Jiangsu, P.R. China. Email:
[email protected]
Creative Commons CC BY: This article is distributed under the terms of the Creative Commons Attribution 4.0 License (http://www.creativecommons.org/licenses/by/4.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/ open-access-at-sage).
2 studies have been done for privacy preservation. Lu et al.6 proposed a pseudonym strategy instead of real vehicle ID for location privacy in VANETs. Gruteser and Grunwald7 presented an approach to reduce the spatio-temporal granularity of the sensing data uploaded to the service provider. The second issue is the reliability of the sensing data collected from various participants. This raises the issue of data trustworthiness or participant trustworthiness. Trust and reputation systems8 in distributed networks have been proposed in the last few years to address such issue. Inspired by this, a few novel reputation systems9–11 are proposed for vehicular participatory sensing. Unfortunately, the data trustworthiness issue inherently conflicts with the user privacy issue. The first reason is that the reputation system needs the participants’ identities be disclosed, while the privacy-preserving scheme requires the participants to be anonymous. The other reason is that the reputation value added to the message may lead to the reputation link attack12 that compromises user privacy. Hence, a good reputation system for vehicular participatory sensing must take into account both user privacy and trust management. However, currently there are only a few works considering user privacy in the reputation system for the vehicular participatory sensing. In this article, we present a novel proposal ClusterRep (a cluster-based reputation framework) to prevent the inadvertent leakage of privacy due to the reputation link attack from either the server or the neighboring members.
Related work Trust management is a potential solution to improve the quality of vehicular participatory sensing.13 Li et al.10 argued that messages generated by vehicles may not be reliable, and they presented a reputation system that allows evaluation of message reliability. Chen and Wei14 proposed a beacon-based trust management system called a road-side unit and beacon-based trust management system that aims to deal with internal malicious attackers in VANETs. Alswailim et al.15 presented a participant contribution trust (PCT) scheme to provide the crisis response system with the trusted contributions. Bhattacharjee et al.16 proposed a reputation model, named QnQ, to identify different users such as honest, selfish, or malicious using their reputation. Guo et al.17 proposed a TaaS (Trust as a Service) cloud utility leveraging a cloud hierarchy for assessing service trustworthiness of Internet of things (IoT) devices so as to filter out untrustworthy sensing data. These early studies focused on what should be considered as evidence to build trust and reputation in VANETs. However, both trust management and privacy preservation play critical roles in VANET and they need a trade-off. Li and Chigan18 stated that privacy
International Journal of Distributed Sensor Networks preservation and trust management have conflicting requirements in VANETs because privacy preservation makes it challenging to maintain the reputation history of each vehicle. Katsomallos et al.19 proposed a framework that decouples the privacy mechanism from the application logic so that it can be developed by another trusted party. Liu et al.20 proposed a recommendation mechanism called PriRe, which makes recommendations to users about what data can be provided based on their privacy preferences. Wang et al.21 presented a location anonymity server which transforms each message received from participant into a perturbed message. But they19–21 did not consider how to incentive participants to provide high-quality data. Unfortunately, most existing works only focus on these two issues separately.3,9–11,22–24 Some closely related studies are reviewed here. Christin et al.25 proposed the IncogniSense, which is an anonymity-preserving reputation framework for participatory sensing. They assumed that the server was not always trustworthy. A server adversary may associate reputation scores to contributions and thus deanonymize users. Nevertheless, the VANET’s characteristics are not considered in IncogniSense. For example, most vehicular participatory sensing tasks require that participants submit a report with a geographic tag. The server adversary may establish links between multiple sensing reports by these geographic tags. In VANET, vehicles often gather together to form a cluster, for instance, at the intersection of traffic lights, on the highway, or in the parking lot. In these places, the positions of the cars are very similar. So, using the cluster feature of VANET, participants can effectively avoid linking attacks from server adversary. Huang et al.26 proposed a privacy-preserving trustbased verifiable vehicular cloud computing scheme named PTVC. The PTVC can select the trustworthy vehicles to form the temporary vehicular cloud with the help by neighboring vehicles. Hu et al.27 proposed a trust-based platoon service recommendation scheme, called REPLACE, to help the user vehicles avoid choosing malicious platoon head vehicles. The PTVC and REPLACE assume the server is trustworthy. The privacy threat from the server is not considered.
Contributions We propose a cluster-based reputation framework to balance privacy and trust. In the ClusterRep, the cluster heads are in charge of pseudonym registration, pseudonym revocation, and lifetime management of cluster members. Before the end of each lifetime, the cluster member delivers a new pseudonym and the reputation value to the cluster head. The cluster head attaches the reputation to the new pseudonym. In this way, vehicles in the VANET maintain reputation
Wang et al. values and preserve privacy of users. Previous works, such as the IncogniSense framework, were mostly for the scenario of mobile phone users. Our proposed ClusterRep is for the VANET. Unlike the scenario of mobile phones, the VANET is a highly dynamic network, and vehicles forms clusters from time to time. The participants in VANET submit sensing reports, including location, reputation, voice, images, and so on. A reputation framework needs to consider quasiidentifiers in VANET messages, such as location and reputation. However, the IncogniSense does not consider how to make both location and reputation satisfy k-anonymity. Our ClusterRep is featured by employing the clustering technique, which is a unique characteristic of the VANET. Location cloak can be easily achieved with cluster technology. And, our ClusterRep solves how to use cluster to cloak reputation in the meanwhile. The ClusterRep is specific to balance privacy and trust in VANETs for participatory sensing.
Problem statement System model A VANET (Figure 1) includes three kinds of entities: vehicles, roadside units (RSUs), and servers. Vehicles are main members of the VANET. Each vehicle is equipped with an onboard unit (OBU), which is able to broadcast and receive messages via wireless communication. The communication mainly includes the vehicleto-vehicle (V2V) communication and the vehicle-toinfrastructure (V2I) communication. The RSU is a wireless communication infrastructure deployed along the roadside. It is kind of a communication interface between vehicles and servers. The third category of entities is servers, which manage the vehicles and the applications in VANET. The vehicular participatory sensing applications run on the VANET. Vehicles are participants who sense
Figure 1. The system model of a VANET.
3 environmental information and submit to the application server. Other vehicles and RSUs help to transfer sensing reports. An example is sharing data from GPSenabled vehicles to map traffic patterns. In this application, the application owner is called service provider. The service provider issues a task to the participants, that is, vehicles. The vehicles collect GPS longitude, GPS latitude, speed, and time using their onboard sensors. Then, the vehicles submit the sensing reports to the application server. The service provider aggregates the data in application server and then publishes a service, for example, driving directions. In our framework, the pseudonym server and the reputation server are trusted authorities in most cases. The pseudonym server issues and verifies pseudonym certificates. The reputation server calculates the reputation of vehicles and distributes reputation certificates.
Threat model Privacy threat from members. It is assumed that the attacker is a legal internal member of the network, who has attended a cluster and is capable of monitoring the broadcast messages in the cluster but cannot invade the encryption communication mechanism. The attacker can acquire the tuple \pseudonym, reputation. from messages. Suppose the current period is Tt , the next period is Tt + 1 . The \pseudonym, reputation. tuple in current period is denoted as \pset , rept .. Wherein, the pseudonym set pset = fV1, t , V2, t , . . . , Vi, t g, i 2 (1, m), m refers to the total number of vehicles and Vi, t refers to the pseudonyms possessed by Vi in period Tt . Correspondingly, in the reputation set rept = fr1, t , r2, t , . . . , ri, t g, i 2 m, where ri, t refers to the reputation value corresponding to the pseudonym Vi, t . Therefore, the \pseudonym, reputation. tuple in the period Tt + 1 is \pset + 1 , rept + 1 .. The attacker intends to establish the relationship Rt, t + 1 = Mapfpset , pset + 1 g through the analysis on reputation values, which causes disclosure of user privacy. It is supposed that the spatio-temporal messages have been cloaked when the users share their sensing reports. The reputation link attack conducted by the attacker is based on the assumption that the change of reputation is continued, which means that the change of the reputation of a certain user is tiny within adjacent periods. Therefore, in the adjacent periods, the pseudonyms corresponding to two members whose reputations are close might be the same user. In this way, the suspect pseudonyms of the target vehicle Vi can be collected, namely, pseit + 1 (wherein, pseit + 1 pset + 1 , i refers to the ID of the target vehicle). The attacker can acquire the candidate mappings of vehicle Vi from period Tt to period Tt + 1 , namely, MapfVi, t , pseit + 1 g, wherein the mapping relation MapfVi, t , Vi, t + 1 g is the only correct one. Obviously, if the potential target
4 pseudonym set pseit + 1 is smaller, the probability that the correct mapping is found will be higher, the success rate of attack will be higher, and the possibility of privacy disclosure will be larger. Privacy threat from servers. Generally, the servers are operated by a third-party company which people consider having less possibility of malicious attacks against the users. However, as the problem of user privacy disclosure by big companies such as Apple and Google in recent years has been exposed, people start to worry about the credibility of the third-party servers. It is assumed in this article that the server might be attacked by hackers and further leads to data leakage. The malicious adversaries may conduct the attack by comparing a cluster head’s two requests of reputation queries. Since the interval between the periods is short and the change of the quantity of cluster members is slow, the attacker can easily determine the quantity of the members in the cluster. The link relations among the pseudonyms of the cluster members can be deduced through reputation link attacks, which will cause privacy disclosure. This form of attack is an advanced form of the former. Successful attack from the former can only result in privacy disclosure of a certain cluster, while the latter may cause privacy disclosure of all the users. Therefore, it has a greater influence and more serious hazards.
International Journal of Distributed Sensor Networks which will complete the data collection task by recruiting participants. Initiatives and qualities of these participants are distinctive. Trust management is an effective method to stimulate the participants to contribute highquality data, and therefore, the effectiveness of the trust management mechanism must be ensured, so that the quality of the participants can be distinguished according to the reputation values. To cope with privacy attack from internal members. In VANETs, internal members all use the same communication protocol which can acquire the messages from others. These internal members could be selfish or malicious. They may collect the messages of their neighboring members, analyze their private information, and further do some benefit-seeking behaviors. To cope with privacy attack from servers. Even though third-party servers have provided safety measures, they still cannot completely ensure the safety. Besides, the public also worry that these companies will privately use the data contributed by participants out of their own benefit and thus disclose user privacy. Furthermore, it is necessary to design a mechanism to cope with the privacy attack from the servers hosted by service providers.
The architecture of ClusterRep Trust threat. In the participatory sensing system, the service provider relies on a large amount of data provided by the participants to create a service and then publishes the service to the service consumers. The participatory sensing system hopes that all the participants are honest and trustworthy. However, in reality, participants may be malicious. For example, a malicious participant may use counterfeit data to reduce the cost of sensing and defraud the service provider of rewards. Such malicious behaviors will pollute the data and reduce the quality of service. If such behaviors are not promptly stopped, other participants will follow suit. If the number of malicious participants is huge, it will even destroy the service. Therefore, trust threat is an important challenge for participatory systems. In this system, participants are divided into two types: the honest and the malicious. The honest participants provide the correct sensing report. The malicious participants provide false sensing reports. We assume that the service provider can identify false sensing reports through data analysis.
Design goal To ensure the effectiveness of trust management. The participatory sensing system is normally an open system,
Figure 2 describes the overall framework of the system, which comprises three parts: (1) clusters, (2) certificate authority (CA), and (3) reputation and pseudonym manager (RPM) and Application Server (APP). This framework assumes that the existing cluster agreement has implemented generation and maintenance of clusters. The clusters play an important role in the reputation cloaking and privacy preservation.
Cluster member Cluster members are the workers of the VANET, who undertake the sensing tasks, constantly collect sensing data, and upload the data to the third-party server with pseudonyms. To ensure the privacy, the pseudonyms will be updated periodically. The cluster head will coordinate with the cluster members in updating the pseudonyms. The CA will use blind signature technology to conduct overall management on the pseudonyms generated by cluster members. It can ensure that not only the messages are transparent to these members in the process of signature but also each member will possess a unique pseudonym within the period. The blind signature mechanism can resist the sybil attacks and meanwhile make it impossible to deduce the user ID through the pseudonyms.
Wang et al.
5
Figure 2. The architecture of the system.
Cluster head
Server
The cluster head is the bridge between cluster members and the RPM, and it is the key component of the entire framework. The work of the cluster head mainly comprises three parts. First, it sends RPM the requests for pseudonym registration and new member’s reputation initialization in batches. Second, it sends the cluster member reputation query requests in batches. Finally, before the end of the period, the cluster head coordinates with the cluster members in generating the pseudonym Vi, next for the next period and transfer current reputation from current pseudonym Vi, current to the pseudonym Vi, next . The cluster head undertakes the reputation anonymization task of all cluster members. This requires the cluster head to follow two principles in the process of cluster head generation and election. First, the number of cluster members shall be no less than the threshold K; second, the elected cluster head needs to provide true and reliable identity information to the CA. On one hand, this can awe the cluster head and reduce malicious tamper behaviors. On the other hand, after malicious behaviors occur, the cluster of the attackers can be rapidly located. It is worth noting that although the cluster head sacrifices its own privacy, this can make the whole cluster reach k-anonymity privacy preservation, and therefore, the overall scheme is acceptable.
The RPM is responsible for global reputation calculation and pseudonym management. The APP is responsible for distributing sensing tasks. For vehicles newly joined the cluster or newly generated pseudonyms, the cluster head will send the pseudonym registration request of the cluster members to the RPM. The RPM will complete the pseudonym registration after CA verifies the pseudonym ID. The reputation model is operated on RPM, which will associate a reputation value to the vehicle with current pseudonym considering sensing data uploaded. This will be saved in the RPM server as a record and delivered to the APP for trust decision. The APP can eliminate the low-reputation sensing data or reduce its weight according to the reputation values so as to achieve quality control. However, these follow-up applications are beyond the discussion range of this article.
Operations of the ClusterRep framework The ClusterRep divides each period into three parts: (1) pseudonym generation and reputation initialization, (2) sensing data aggregation and reputation update, and (3) reputation query and transfer. We divide a period T equally into n(nø3) parts. The first part is the start of the period, occupying one time slice T =n. The
6
International Journal of Distributed Sensor Networks
second part is the pseudonym work time slice, occupying (n 2)T =n. The third part is the last slice which prepares for the reputation transfer among pseudonyms in the next period, occupying one time slice T/n. In all the processes involving in reputation query and broadcasting, the cluster head cloaks the reputation value for privacy preservation.
Pseudonym generation and reputation initialization Taking cluster A (Figure 2) as an example, it is assumed that Va is a newcomer which has been provided with the capacity of pseudonym generation, and each vehicle is provided with a permanently unique ID. Under the enlightenment of the literature,24 this article adopts the blind signature technology to sign the pseudonyms generated by the cluster members. For each period T, the CA will distribute new private/public keys to each cluster member, which are dsig and esig generated by nsig . The cluster members will use signatures to sign for the pseudonym np and then use np to generate private/public keys \dp , np . and \ep , np .. The cluster member use the pseudonym to generate communication data (reputation request) mp , wherein mp = np resig mod nsig and r refers to the blinding factor. Then, the cluster members sign for the messages by their ID and get smp = f (mp jjIDjjT jjrini )dp mod np
ð1Þ
where rini refers to the reputation initialization request sent to RPM. The cluster head will collect \mp , smp . pairs of the cluster members at the start slice of the period and send them to CA for verification and registration in batches. The CA receives the messages and uses the public key to verify the legality of mp and validity of the pseudonyms within current period T. After successful verification, the CA will generate the blind signature message d SCA = mpsig mod nsig for mp and send the message to the RPM. The RPM will return the request result to the cluster head after acquiring the request. The whole algorithm process takes cluster A in Figure 2 as an example. The vehicle Va first locally generates the pseudonym V5, t , then the cluster head waits for a new period and completes pseudonym registration to RPM in batches. The RPM then responds to the reputation initialization request of Va whose pseudonym is V5, t . The cluster head cloaks the reputations after acquiring the reputation values of the cluster members and then broadcasts them in the cluster. So far, Va has completed the pseudonym registration and reputation initialization. The algorithm is given follows.
Sensing reports aggregation and reputation update Taking the cluster B (Figure 2) as an example, within a period, the cluster member Vb can use the pseudonym
Algorithm 1: the pseudonym generation and reputation initialization algorithm Input: cluster A, time t Output: reputation and pseudonym for a new member (e.g. Va ) 1. The newcomer vehicle Va creates a pseudonym V5, t , and signs with its ID; 2. The CHA registers pseudonyms to CA in batches (V1, t , 3), (V2, t , 8) (V5, t , *) at the start of Tt ; 3. The CA verifies pseudonyms; 4. The RPM issues reputation values, for example, (V5, t , 5); 5. The CHA cloaks CM’s reputation, for example, (V5, t , 5) ! (V5, t , 3); 6. The Va gets the new reputation and pseudonym.
Algorithm 2: the sensor data aggregation and reputation update algorithm Input: cluster B, time t Output: updated reputation of members (e.g. Vb ) 1. The vehicles accept APP’s sensing task and collect data; 2. The vehicle send sensing data using pseudonym V6, t ; 3. The CA verifies pseudonym V6, t ; 4. The RPM calculates score R6, t according to sensing data; 5. The RPM updates the reputation of Vb .
V6, t to upload the sensing data to the APP in multiple times. The CA will first verify the legality of current pseudonym V6, t . Then, the RPM will compute the rating R6, t of the sensing data according to the quality. The reputation algorithm will integrate this rating into the reputation of V6, t . The RPM thus gets the latest reputation value of vehicle V6, t and will update and save such reputation value. In the example of Figure 2, the APP issues a sensing task. Corresponding members in cluster B participate in such task and upload their sensing data. After the sensing data of vehicle V6, t is processed by the reputation algorithm, its rating is reduced by 0.2. That is, the reputation value is updated to 2.8 from 3. This algorithm is described taking Vb as an example.
Reputation query and transfer At the last time slice of a period, the cluster head will coordinate with the cluster members to update and transfer their reputation values. Taking vehicle Vc (Figure 2) as an example, the cluster head will notify the cluster members to initiatively update their pseudonyms when a time slice comes and complete the blind signature. After the CA verifies the validity of the pseudonym, the RPM will query corresponding pseudonym. If the vehicle with the pseudonym V11, t gets its latest reputation value of 3.8, the cluster head will cloak it to 3.5 to achieve 2-anonymity with vehicle V12
Wang et al.
7
Algorithm 3: the reputation query and transfer algorithm
Algorithm 4: the reputation cloaking algorithm
Input: cluster C, time t Output: new pseudonym and reputation for members (e.g. Vc ) 1. The CHC sends reputation query in batches as (V11, t , ?), (V12, t , ?). at the end of Tt ; 2. The CA verifies pseudonym; 3. The RPM responses queries, for example, (V11, t , ?) ! (V11, t , 3.8); 4. The CHC cloaks reputations, for example, (V11, t , 3.8) ! (V11, t , 3.5); 5. The Vc gets the current reputation r11, t = 3:5; 6. The CHC waits for next period Tnext = Tt + 1 ; 7. The all vehicles change their pseudonyms at the start of Tt + 1 , for example, V11, t ! V11, t + 1 ; 8. In period Tt + 1 , Vc gets new pseudonym and reputation (V11, t + 1 , 3.5).
Input: the dataset D includes all \pseudonym, reputation. tuples of cluster members, privacy factor K; Output: micro-aggregated dataset D0 ; 1. Compute distances of reputations in matrix (D); 2. Compute the centroid C = ComputeCentroidOfDataSet(D); 3. while (there are more than K 1 records to assign) do; 4. e = select the most distant record to centroid C from dataset D; 5. gi = BuildGroupFromRecord(e, D, K); 6. gi = ExtendGroup(gi , D, K); 7. end while; 8. mini = FindSmallest(gi ); 9. M = fFloor(mini )g.
after receiving the feedback of the RPM. When the next time slice comes, all vehicles will update their pseudonyms. At the moment, the pseudonym of Vc will be updated from V11, t to V11, t + 1 . In the next time slice, Vc will register with the new \pseudonym, reputation. tuple. The RPM will save the new pseudonym V11, t + 1 and its reputation value after receiving the registration request from V11, t + 1 . And so, the transfer of reputation among the pseudonyms is completed. This algorithm is described as follows.
Reputation cloaking The cluster head is in charge for reputation cloaking for cluster members, which is based on the variable-size maximum distance to average vector (V-MADV) algorithm.28 The V-MADV algorithm is the improvement of the maximum distance to average vector (MADV) algorithm. The MADV is a classical algorithm in the field of micro-clustering, which is used in the clustering process to make all the groups satisfy the requirements. The number of group members is no less than the given threshold K. The differences among the group members are small, while the differences between groups are large. Since the MADV algorithm needs to determine the number of groups in advance, which is not flexible enough, it is not applicable to the scenario of VANETs. The V-MADV algorithm overcomes the defect of MADV of predefining the parameters, which can adjust the size of the groups more flexibly without increasing the computing expenses. The cluster head will adopt the V-MADV algorithm for clustering after receiving the reputation set responded by the RPM. After acquiring the clustering set, the cluster head will find out the minimum reputation value in each set and replace all reputations in this set by the round-down value. Suppose there is a vehicle V15 , whose reputation is 8.1. (1) If V15 sends sensing reports directly with its real
reputation, the APP may easily invade the privacy of vehicle V15 by analyzing its historical data. (2) If V15 submits sensing reports with pseudonyms and the reputation, which is \pseudonym, reputation, data., it is not so easy for the APP to know which data are submitted by V15 . For example, V15 uploads \pseudonymt, 8.1, datat. at time t and then \pseudonymt + 1, 8.1, datat + 1. at time t + 1. This will reduce the possibility of privacy violations. However, if the reputation of V15 is unique, it is vulnerable to the reputation link attacks from the APP. (3) In the ClusterRep, the vehicle V15 can join in a cluster, such as cluster A in Figure 2. As V15 , V2 , and V3 have similar reputations, they will be clustered as a group. Then, the Cluster Head cloaks the reputation of this group, and set reputation 8.1 of V15 to 8. In this way, the V15 reaches 3-anonymity. The APP is difficult to link the submitted sensing reports, and the privacy of V15 is well protected.
Evaluations Evaluations are divided into three parts: (1) we study on the relation between privacy factor K and the size of clusters. (2) We study on the scalability of the framework and discuss the influence of cluster size on the overall performance. (3) We discuss the resistance ability of the framework against reputation link attack.
Simulator The simulations are implemented with the vehicular simulation framework veins, the network simulation tool OMNeT++, and the road traffic simulation tool Sumo. In these experiments, there are vehicles freely organized into clusters on the straight roads as shown in Figure 3. The speeds of the vehicles are consistent. Assuming that each period only contains three time slices, the pseudonyms will be mixed in the first time slice and the last time slice.
8
International Journal of Distributed Sensor Networks
Figure 3. Simulation scenario.
Metrics The Privacy level (PL) is defined as the possibility of attacker’s correct recognition on the identities of cluster members. If the distribution of cluster members in the set pseit + 1 is more even, the information entropy possessed by the cluster members will be larger, and their identities will be more difficultly deduced. It is assumed that the pseudonym set of period Tt is pset = fvi, t ji = 1, 2, . . . , ng, and the pseudonym set of period Tt + 1 is pset + 1 = fvi, t + 1 ji = 1, 2, . . . , ng. The v transition probability Pvi,j, tt + 1 is defined to represent the probability of a pseudonym from Vi, t to Vj, t + 1 . The PL is measured by the entropy of the cluster members, whose calculation formula is PL = Pvvi,j, tt + 1 * log (Pvvi,j, tt + 1 )=m
ð2Þ
where m is the number of the group members. The Trust level (TL) is defined as the effectiveness of trust, namely, the effectiveness of the reputation value after reputation cloaking. The reputation value may lose a part of precision after the generalization, and its effectiveness will be reduced n 0 ri 1X TL = n i = 1 jri j
ð3Þ
where ri0 . is the reputation cloaking result of ri . Privacy and trust are a trade-off. Therefore, we propose the
preference factor a(0\a\1) for setting the preference for privacy or trust of the cluster. An overall metric Z is defined as Z = a*TL + (1 a)*PL
ð4Þ
Parameter evaluation In a cluster, the cluster head takes charge of the reputation cloaking of members. For a cluster of size M, it will be divided into several groups according to Algorithm 4 with the privacy factor K, and K decides the PL. For the cluster members, the larger the K(K ł M) is, the better the anonymization will be. However, if K is too much larger, it will result in more losses in generalization of the reputation, namely, the effectiveness of trust will be reduced. Therefore, how to choose an optimal K value for the M-sized cluster and consider the privacy and trust at the same time is a problem to be solved in this experiment. This experiment takes a certain cluster as an example, the cluster size M is 20. To simplify the simulation, this article assumes that the APP only assigns one task in each period, and each period only comprises three time slices. For the cluster that the size M is 20, when Kø11, there could be only one group in the cluster, and the PL is maximum, namely, PL = log (M). After reputation cloaking, the reputations of all cluster members will be the same value, and the TL will lose its
Wang et al.
9
Figure 5. The effect of cluster size. Figure 4. The effect of parameter K.
effectiveness. Similarly, when K = 1, reputations of each cluster member will be reserved to the greatest extent, and TL is maximized. Thus, our experiment will mainly discuss the change of K within the interval ½2, 10 and its influence on the overall performance. As shown in Figure 4, we find that when K = 7, the overall performance of the cluster is the best, the privacy is well preserved, and meanwhile the loss of reputation can be controlled at a low level. When K exceeds 7, privacy is not increased correspondingly, but the effectiveness of reputation rapidly declines. Therefore, in later simulations, we set K to 7, which is K = dm=3e.
Scalability evaluation Since vehicles move fast, the cluster structure is dynamically changing. In this part, the influence of cluster size change on the overall performance is studied. The core algorithm of ClusterRep is reputation cloaking based on clustering, therefore the scalability of the cloaking algorithm is an important index to measure the scalability of the framework. In this part, the calculation expense of the cloaking algorithm is first discussed and described by algorithm complexity. The scalability of the algorithm is discussed theoretically. Assuming that the size of the cluster is M, the major steps of the algorithm is analyzed as follows: 1. 2. 3.
4.
Calculate the distance matrix of any two records, and the complexity is O(n2 ). Calculating the centroid needs to traverse all the records, and the complexity is O(n). Calculating the distance between all the records and the centroid needs to be conducted in n times, and the complexity is O(n). In the main loop, the number of records for each group gi satisfies K ł jgi j ł 2K 1, therefore (3K 1)=2 needs to be adjusted in average, and
5. 6.
7.
8.
9.
2n=(3K 1) times of iterations need to be done, and the complexity is O(n). Choose the record which is the farthest away from the centroid, and the complexity is O(n). Group gi is generated with a size of K, which needs to absorb K 1 records. The required comparison operation is (K 1)n=2 times, and therefore, the complexity is O(n). Expand the rest records into the group, the rest n=2 records need a distance comparison with the distributed records, thus comparison operations (k 1)n=4 are required, and the complexity is in direct proportion to O(n). Find out the minimum value of group gi after the reputation set is acquired, and the complexity is O(n). Round down the reputation values, and the complexity is O(1).
To sum up, the calculation expense of the main loop is O((2n=3k 1)(n + n + n)) = O(6n2 =3k 1) = O(n2 ); and the overall expense of the algorithm is O(n2 ) + O(n) + O(n) + O(n2 ) + O(n) + O(1) = O(n2 ). It indicates that the expense is acceptable. Except algorithm analysis, we also designed an experiment for further analysis. To simplify the experiment, this article only considers the scenario that the vehicles join gradually. Under the experimental environment, the initial size of cluster A is 10, and 10 vehicles are added to the cluster at each period. The cluster head CH_A dynamically adjusts the parameter K according to the size of the cluster. The experiment adopts the metric Z to measure the system performance. Here, we set a = 0:5 for the calculation of Z. The experimental result is as shown in Figure 5. It can be seen that with continuous participation of vehicles, the overall performance of cluster A tends to be stable. Therefore, it indicates that the ClusterRep framework is expandable both theoretically and experimentally. In addition, we designed experiments to see how long it takes for the algorithm to run in a mix-zone. As can be
10
International Journal of Distributed Sensor Networks
Figure 7. ClusterRep resists reputation link attack from cluster members. Figure 6. The running time when the size of cluster changes.
seen from Figure 6, as M (the size of cluster) becomes larger, the time increases with M. And, the changes between the two are in line with the complexity of the algorithm O(n2 ).
Security comparison This experiment aims to study the resistance of ClusterRep against reputation link attack. And, it is compared with other methods. Attack from member. It is assumed that in cluster A, the attacker Va is a cluster member who will recognize the pseudonym identity of cluster members in each period through the supervision on the broadcasting messages. We assume the pseudonym set of current period is pset , and the pseudonym set of next period is pset + 1 . For each pseudonym vi, t , the attacker attempts to confirm the follow-up vi, t + 1 according to the reputation. For the reputation ri, t of the pseudonym vi, t , rk, t + 1 which is closest to ri, t in the next-period reputation set will be found out, and then, the suspect pseudonym set pseit + 1 is collected. If vi, t + 1 is contained in pseit + 1 , the probability that the user is linked will be Plink = 1=jpseit + 1 j, which is defined as success probability of attack. Otherwise, the attack fails. Therefore, the lower the success probability of attack is, the stronger the resistance ability of the system will be. Correspondingly, the higher the success probability of attack is, the weaker the resistance ability of the system will be. In Figure 7, the ClusterRep is, respectively, compared with the Beta29 and the IncogniSense-floor25 methods. Here, the Beta represents that the framework uses the Beta reputation model, but the cluster head will not cloak the reputation. And the IncogniSensefloor method refers to that the framework adopts the Beta reputation model and the cluster head round down the reputation values for reputation anonymity
as IncogniSense does. As shown in Figure 7, the Beta does not cloaks members’ reputation and its resistance ability from attack is the weakest. The ClusterRep cloaks reputation values after micro-clustering, so that the cluster members can at least reach k-anonymity, and thus, the cloaking effect is superior to that of IncogniSense-floor, who directly round down the reputation values. Thus, ClusterRep can resist the reputation link attack from internal members effectively. Attack from server. An adversary server can acquire messages sent by the cluster head. It can also speculate the true identity of the user through reputation link attack. The ClusterRep adopts a method based on cluster to preserve privacy. Moreover, to further improve the privacy preservation effect, cluster head can mix the messages. In our example, this requires the cluster head CHa to randomly unicast the query requests to a cluster head neighbor among fCHb , CHc , CHd g instead of directly submitting to the server. And, one of the neighbors will send the query request to the server on behalf of CHa after receiving its request. This process is referred to as cluster head query mixture. It is assumed that there are clusters A, B, C, D, and E, and their sizes are all 20. The cluster heads will mix the query data. This experiment adopts success probability of attack as the evaluation metric. The above two methods are compared with IncogniSense-floor. In this simulation, the IncogniSensefloor line and the ClusterRep are non-mixed strategies, so the results do not change with the number N. The ClusterRep_Mixed is a mixed strategy, which means it will exchange reports among N cluster heads and then submit to the server. Therefore, the larger N, the better its privacy preservation. In Figure 8, it can be seen that the ClusterRep and the ClusterRep_Mixed can effectively reduce the success probability of attack and resist reputation link attack. Wherein, the ClusterRep_Mixed has a better resistance effect with the increase in the number of cluster heads joining the mixture. Correspondingly, the IncogniSense-floor anonymous
Wang et al.
11 Funding The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partially supported by University Science Research Project of Jiangsu Province of China (grant nos 15KJB520029 and 16KJB520038).
References
Figure 8. ClusterRep resists reputation link attack from servers.
mechanism does not utilize the advantage of increasing cluster numbers, thus it cannot well resist the attack from third-party servers.
Conclusion This article proposes a privacy preservation and trustaware framework based on clusters in VANET. This framework realizes user’s disguised uploading of sensing data with the assistance of the cluster head, which collaborates with cluster members to generate pseudonyms periodically through blind signature technology. Meanwhile, the cluster head also uses reputation generalization to resist the attackers and prevent them from deducing user identities through reputation. Since the generalization will lead to internal conflict between anonymity preservation and reputation loss, this article demonstrates that the framework can well balance such conflict through experiments. This framework also has good scalability and can effectively resist the reputation link attack. In the future, we will try to integrate the framework with multiple classical reputation models and research on the ability of resisting against reputation manipulation. In this article, the personal privacy demands of cluster members are not distinguished, but in the future, the preferences of each cluster member for privacy will be taken into account. Besides, what cooperation strategy shall be adopted among the cluster members to achieve maximum benefit and further construct a highquality cluster is also a problem to be explored in the future. Finally, since the VANET is highly dynamic, to improve adaptability and robustness of algorithms is also a problem to be solved in the future. Declaration of conflicting interests The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
1. Rana RK, Chou CT, Kanhere SS, et al. Ear-phone: an end-to-end participatory urban noise mapping system. In: Proceedings of the 9th ACM/IEEE international conference on information processing in sensor networks, Stockholm, 12–16 April 2010, pp.105–116. New York: ACM. 2. Ali K, Al-Yaseen D, Ejaz A, et al. CrowdITS: crowdsourcing in intelligent transportation systems. In: Proceedings of the IEEE wireless communications and networking conference (WCNC), Shanghai, China, 1–4 April 2012, pp.3307–3311. New York: IEEE. 3. Pham N, Ganti RK, Uddin YS, et al. Privacy-preserving reconstruction of multidimensional data maps in vehicular participatory sensing. In: Silva JS, Krishnamachari B and Boavida F (eds) Wireless sensor networks. Berlin: Springer, 2010, pp.114–130. 4. Muhtarog˘lu FCP, Demir S, Obalı M, et al. Business model canvas perspective on big data applications. In: Proceedings of the IEEE international conference on big data, Silicon Valley, CA, 6–9 October 2013, pp.32–37. New York: IEEE. 5. Rai A, Chintalapudi KK, Padmanabhan VN, et al. Zee: zero-effort crowdsourcing for indoor localization. In: Proceedings of the 18th annual international conference on mobile computing and networking, Istanbul, Turkey, 22– 26 August 2012, pp.293–304. New York: ACM. 6. Lu R, Li X, Luan TH, et al. Pseudonym changing at social spots: an effective strategy for location privacy in VANETs. IEEE T Veh Technol 2012; 61(1): 86–96. 7. Gruteser M and Grunwald D. Anonymous usage of location-based services through spatial and temporal cloaking. In: Proceedings of the 1st international conference on mobile systems, applications and services, San Francisco, CA, 5–8 May 2003, pp.31–42. New York: ACM. 8. Jøsang A, Ismail R and Boyd C. A survey of trust and reputation systems for online service provision. Decis Support Syst 2007; 43(2): 618–644. 9. Minhas UF, Zhang J, Tran T, et al. A multifaceted approach to modeling agent trust for effective communication in the application of mobile ad hoc vehicular networks. IEEE T Syst Man Cy C 2011; 41(3): 407–420. 10. Li Q, Malip A, Martin KM, et al. A reputation-based announcement scheme for VANETs. IEEE T Veh Technol 2012; 61(9): 4095–4108. 11. Yang N. A similarity based trust and reputation management framework for VANETs. Int J Future Gener Commun Netw 2013; 6(2): 25–34. 12. Wang J, Zhang Y, Wang Y, et al. RPRep: a robust and privacy-preserving reputation management scheme for
12
13.
14.
15.
16.
17.
18.
19.
International Journal of Distributed Sensor Networks pseudonym-enabled VANETs. Int J Distrib Sens N 2016; 2016: e6138251. Tefera MK and Xiaolong Y. Trust and privacy in mobile participatory sensing: current trends and future challenges. In: Proceedings of the 3rd IEEE international conference on computer and communications (ICCC), Chengdu, China, 13–16 December 2017, pp.712–716. New York: IEEE. Chen Y-M and Wei Y-C. A beacon-based trust management system for enhancing user centric location privacy in VANETs. J Commun Netw 2013; 15(2): 153–163. Alswailim MA, Hassanein HS and Zulkernine M. A participant contribution trust scheme for crisis response systems. In: Proceedings of the 2017 IEEE global communications conference (GLOBECOM), Singapore, 4–8 December 2018, pp.1–6. New York: IEEE. Bhattacharjee S, Ghosh N, Shah VK, et al. QnQ: a reputation model to secure mobile crowdsourcing applications from incentive losses. In: Proceedings of the IEEE conference on communications and network security (CNS), Las Vegas, NV, 9–11 2017, pp.1–9. New York: IEEE. Guo J, Chen IR, Tsai JJP, et al. Trust-based IoT participatory sensing for hazard detection and response. In: Proceedings of the international conference on serviceoriented computing, Banff, AB, Canada, 10–13 October 2016, pp.79–84. Berlin: Springer. Li Z and Chigan C. Joint privacy and reputation assurance for VANETs. In: Proceedings of the IEEE international conference on communications (ICC), Ottawa, ON, Canada, 10–15 June 2012, pp.560–565. New York: IEEE. Katsomallos M, Lalis S, Papaioannou T, et al. An open framework for flexible plug-in privacy mechanisms in crowdsensing applications. In: Proceedings of the IEEE international conference on pervasive computing and communications workshops, Kona, HI, 13–17 2017. New York: IEEE.
20. Liu R, Liang J, Gao W, et al. Privacy-based recommendation mechanism in mobile participatory sensing systems using crowdsourced users’ preferences. Future Gener Comp Sy 2018; 80: 76–88. 21. Wang Y, Cai Z, Chi Z, et al. A differentially k-anonymity-based location privacy-preserving for mobile crowdsourcing systems. Procedia Comput Sci 2018; 129: 28–34. 22. Gao S, Ma J, Sun C, et al. Balancing trajectory privacy and data utility using a personalized anonymization model. J Netw Comput Appl 2014; 38: 125–134. 23. Gao S, Ma J, Shi W, et al. LTPPM: a location and trajectory privacy protection mechanism in participatory sensing. Wirel Commun Mob Com 2015; 15(1): 155–169. 24. Petit J, Schaub F, Feiri M, et al. Pseudonym schemes in vehicular networks: a survey. IEEE Commun Surv Tut 2015; 17: 228–255. 25. Christin D, Roßkopf C, Hollick M, et al. IncogniSense: an anonymity-preserving reputation framework for participatory sensing applications. Pervasive Mob Comput 2013; 9(3): 353–371. 26. Huang C, Lu R, Zhu H, et al. PTVC: achieving privacypreserving trust-based verifiable vehicular cloud computing. In: Proceedings of the IEEE global communications conference (GLOBECOM), Washington, DC, 4–8 December 2016, pp.1–6. New York: IEEE. 27. Hu H, Lu R, Zhang Z, et al. REPLACE: a reliable trustbased platoon service recommendation scheme in VANET. IEEE T Veh Technol 2017; 66(2): 1786–1797. 28. Solanas A, Martinez-Balleste A and Domingo-Ferrer J. V-MDAV: a multivariate microaggregation with variable group size. In: Proceedings of the 17th COMPSTAT symposium of the IASC, Rome, August 2006, pp.917–925. Physica-Verlag Heidelberg. 29. Josang A and Ismail R. The beta reputation system. In: Proceedings of the 15th bled electronic commerce conference, Bled, Slovenia, 17–19 June 2002, vol. 5, pp.2502–2511. AIS Electronic Library.