University of Stuttgart Institute of Parallel and Distributed Systems (IPVS) Universitätsstraße 38 D-70569 Stuttgart
Understanding Vulnerabilities of Location Privacy Mechanisms against Mobility Prediction Attacks Zohaib Riaz, Frank Dürr, Kurt Rothermel
International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services (MobiQuitous 2017)
9th Nov 2017
Background and Motivation • Mobile apps promote location information sharing ◦ Let your friends know where your are! ◦ Tag tweets/photos with your location! ◦ Get location-based services, e.g., nearby POIs
• A single location update may convey: ◦ Geo-location ◦ Location semantics, e.g., restaurants, shops etc. {(lat,lon), venue name, type/semantics}
• Sensitive semantics (e.g., hospitals) User privacy concerns! Research Group “Distributed Systems”
University of Stuttgart 2
IPVS
Location Privacy mechanisms: State-of-the-art • Suppression: avoids release of sensitive semantic info (Götz et al. 2012)
?? Home
Leaks no context Secure against location-history based attacks cuts utility harshly: no data no service/sharing
Meeting
No context!!
• Obfuscation: “cloaks” semantic info (Yigitoglu et al. 2012)
allows approximate POI-searches/location-sharing leaks contextual information Research Group “Distributed Systems”
Home
Meeting
(time, geo-location)
University of Stuttgart 3
IPVS
Are existing obfuscation algorithms secure? • Threat Model: ◦ Attackers can aggregate location history information
▪ At least in the obfuscated form ▪ May possess accurate historic data
• Hypothesis: User privacy is at risk!
Research Group “Distributed Systems”
University of Stuttgart 4
IPVS
Contributions • Design of a semantic mobility model to represent attacker knowledge ◦ Both accurate and obfuscated historic location information
• Demonstration of its effectiveness as an attack against state-of-the-art semantic obfuscation mechanisms
◦ Dataset: year-long location check-in histories of 278 Foursquare users
• Identification of fundamental design improvements for future semantic obfuscation mechanisms.
Research Group “Distributed Systems”
University of Stuttgart 5
IPVS
State-of-the-art Semantic Obfuscation Mechanisms (Damiani et al. 2010, Yigitoglu et al. 2012)
• Each user can specify his sensitive semantic locations ◦ Hospitals, bars, churches etc.
Cloaking Region
𝑙=4
• Can also specify the degree of protection (see paper for details) ◦ l-diversity: Number of distinct locations inside a cloaking region
non-sensitive loc. sensitive loc.
Step 2: Real-time location updates
Step 1: Preprocessing
Actual:
Published:
Generate CRs for City Map (independent of user mobility) Research Group “Distributed Systems”
University of Stuttgart 6
IPVS
Attack overview
Prior: 𝛀𝐀𝐭𝐭 User’s Semantic Mobility Model (location history-based)
Posterior
+
Observations recent updates
=
?? 8 am
11 am
4 pm
5 pm
7 pm
location updates in a single day … Research Group “Distributed Systems”
University of Stuttgart 7
IPVS
The semantic mobility modeling problem (1) • Goal: Learn 𝛀𝐀𝐭𝐭 from location history • A popular fundamental assumption: Human Mobility can be modelled as a Discrete-time Markov chain ▪ Semantic locations modeled as states (𝒙𝒕 )
▫ 𝑥𝑖 ∈ 𝑆 = {𝑠1 , … , 𝑠𝑀 }, e.g., home, work, shopping ▪ Inter-state transitions governed by probabilities:
▫ 𝑎𝑖𝑗 = 𝑃 Start
𝑥𝑡 = 𝑠ℎ𝑜𝑝𝑝𝑖𝑛𝑔 𝑥𝑡−1 = ℎ𝑜𝑚𝑒)
𝑥1
𝑥𝑡−1
𝑎𝑖𝑗
𝑥𝑡
𝑥𝑇
𝑥𝑡+1
End
𝑡=𝑇
𝑡=1
Markov chain over a day of user’s movement Research Group “Distributed Systems”
University of Stuttgart 8
IPVS
The semantic mobility modeling problem (2) • State information: ◦ not always clear ◦ is additionally accompanied by other observations
• How to model this information?? Start
𝑥1
𝑥𝑡−1
𝑥𝑡
states
??
Office
Shopping
𝑥𝑇
𝑥𝑡+1
End
Home
observed features {geo-location, hour-of-day, weekday, …} Research Group “Distributed Systems”
University of Stuttgart 9
IPVS
Hidden Markov Models (HMMs) • Can accommodate: ◦ Cloaking Regions and missing state information
𝑥𝑡 = Home
Hidden States 𝑎𝑖𝑗 = 𝑃 𝑥𝑡 = 𝑗 𝑥𝑡−1 = 𝑖)
◦ Observed features as state-dependent emissions 𝑏𝑗 𝑜𝑘 = 𝑃 𝑜𝑡 = 𝑜𝑘 𝑥𝑡 = 𝑗
𝑜𝑡 = 6 𝑎𝑚
• Given Ω = {𝐴, 𝐵} and 𝑂 = {𝑜1 , … , 𝑜𝑇 } ◦ Can efficiently compute 𝑃(𝑥𝑡 = 𝑗) Start
𝑥1
𝑥𝑡−1
𝑎𝑖𝑗
𝑥𝑡
𝑥𝑇
End
𝑏𝑗 𝑜𝑘 𝑜1
𝑜𝑡−1
𝑜𝑡
Research Group 10 “Distributed Systems”Observation
𝑜𝑇 University of Stuttgart
sequence
IPVS
Training HMMs: Learning A and B • Option 1: if dataset is fully labeled Maximum-likelihood Start
𝑥1
𝑥𝑡−1
𝑥𝑡
𝑥𝑡+1
𝑥𝑇
𝑜1
𝑜𝑡−1
𝑜𝑡
𝑜𝑡+1
𝑜𝑇
End
Transition probabilities
𝑥𝑡 = Home
𝑃(𝑥𝑡 = 𝑂𝑓𝑓𝑖𝑐𝑒 | 𝑥𝑡−1 = 𝐻𝑜𝑚𝑒)
20 Home
8 5
Emission probabilities
Office
𝑃(ℎ𝑜𝑢𝑟|𝐻𝑜𝑚𝑒)
Cafe
7 Shopping
Hour of Day
Research Group “Distributed Systems”
ℎ𝑜𝑢𝑟 𝑜𝑓 𝑑𝑎𝑦 10pm 3pm 12am 6am 7pm 8am 1pm 11pm University of Stuttgart
11
IPVS
Training HMMs: Learning A and B • Option 1: if dataset is fully labeled Maximum-likelihood Start
𝑥1
𝑥𝑡−1
𝑥𝑡
𝑥𝑡+1
𝑥𝑇
𝑜1
𝑜𝑡−1
𝑜𝑡
𝑜𝑡+1
𝑜𝑇
End
Ω𝑖𝑛𝑖𝑡 (smoothed parameters, see paper for details)
Research Group “Distributed Systems”
University of Stuttgart 12
IPVS
Training HMMs: Learning A and B • Option 2: If labels are not present Baum-Welch algorithm Ω𝑖𝑛𝑖𝑡 Start
𝑥1
𝑥𝑡−1
𝑥𝑡
𝑥𝑡+1
𝑥𝑇
𝑜1
𝑜𝑡−1
𝑜𝑡
𝑜𝑡+1
𝑜𝑇
End
• Begin with a prior model, e.g., Ω = Ω𝑖𝑛𝑖𝑡 ◦ Step 1: generate probabilistic state-sequences using a model Ω
◦ Step 2: Estimate Ωnew using Maximum-likelihood estimation ◦ Set Ω = Ωnew and repeat steps 1&2 until convergence Research Group “Distributed Systems”
University of Stuttgart 13
IPVS
Training HMMs: Learning A and B
?
𝑃 𝑥𝑡 = 𝑗 ∝ 𝑃(𝑜𝑡 |𝑥𝑡 = 𝑗) {home, shopping, bar}
Office Start
𝑥1
𝑥𝑡−1
𝑥𝑡
𝑥𝑡+1
𝑥𝑇
𝑜1
𝑜𝑡−1
𝑜𝑡
𝑜𝑡+1
𝑜𝑇
𝑥𝑡 = Office End
𝑜𝑡 = 1 𝑎𝑚 Example
𝑏𝑗 𝑜𝑡 = 0 ∀ 𝑠𝑗 ∉ 𝐶𝑅 Research Group “Distributed Systems”
University of Stuttgart 14
IPVS
Experimental Workflow • Dataset: ◦ ◦ ◦ ◦
Crawled check-ins from Twitter’s public feed from Nov 2015- Nov 2016
Got venue information (including surrounding venues) from Foursquare Necessary filtering leaves 278 users with 284,472 check-ins Mean length of check-in history: 246 days
Clean Map to time-slots
Home
Obfuscate
Compute 𝛀𝐀𝐭𝐭 train
𝛀𝐀𝐭𝐭
Evaluate Success test
Location History Research Group “Distributed Systems”
University of Stuttgart 15
IPVS
Evaluation Metrics Precision of Attack
user 𝑖’s test-set
low precision Attack
high precision sensitive trips non-sensitive trips
ground truth
Research Group “Distributed Systems”
University of Stuttgart 16
IPVS
Results • Sensitive location selected s.t. it is visited at a certain frequency
Minimum Precision
Minimum Precision
Research Group “Distributed Systems”
University of Stuttgart 17
IPVS
Results • Sensitive location selected s.t. it is visited at a certain frequency
Minimum Precision
Minimum Precision
Research Group “Distributed Systems”
University of Stuttgart 18
IPVS
Results • Sensitive location selected s.t. it is visited at a certain frequency
Minimum Precision
Minimum Precision
Research Group “Distributed Systems”
University of Stuttgart 19
IPVS
Conclusion • We show that the privacy guarantees offered by state-of-the-art location obfuscation mechanisms are weak!
• Obfuscated location-history ◦ can be exploited for mobility modeling ◦ can be used to de-obfuscate user trips
• State-of-the-art location obfuscation mechanisms are more vulnerable to de-obfuscation when used frequently
• The need of mobility-aware obfuscation algorithms is evident! Research Group “Distributed Systems”
University of Stuttgart 20
IPVS
Contact and Discussion
www.priloc.de
Zohaib Riaz Institute for Parallel and Distributed Systems, University of Stuttgart, Germany
[email protected] Research Group “Distributed Systems”
University of Stuttgart IPVS