Securing Tag-based recommender systems against profile injection attacks: A comparative study
Georgios Pitsilis, Heri Ramampiaro, Helge Langseth
Department of Computer Science, Norwegian University of Science and Technology
RecSys ’18, Vancouver, Canada, October 2-7, 2018.

Introduction - Problem Statement
Conclusions
Contributions - Results
- Collaborative tagging systems are vulnerable to attacks that aim to promote or demote a product for the benefit of the attacker [2].
- Overload attack. Goal of the attack: to overload the context of a tag with a bogus resource, so as to achieve a high correlation between that tag and the resource.
- Piggyback attack. Goal of the attack: to associate the target resource with a highly reputable one, so that the two appear similar.

Synthetic data used
[Figure: Distribution of the tag population in folksonomies of the del.icio.us dataset. Axes: # tags, # original folksonomies, # bogus folksonomies. Legend: legitimate, bogus.]

Deep Learning architecture
[Figure: Model diagram. Labels recovered from the figure: padded folksonomy to size 50 (tag 1 ... tag 50, 50 values); word embedding; Embedding layer, vector dimension (25); LSTM, output dimensionality (250); Dense Layer; Output layer, softmax; sigmoid; ReLU; single value; predicted probabilities for the legitimate class.]
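The input step of the architecture above (a folksonomy padded to size 50 before it reaches the embedding layer) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the authors' preprocessing: the vocabulary construction, the reserved index 0 for padding, index 1 for unseen tags, right-side padding and truncation of longer folksonomies are all illustrative choices, and `build_vocab` and `encode_folksonomy` are hypothetical helper names.

```python
from typing import Dict, Iterable, List

PAD_INDEX = 0   # assumption: index reserved for padding positions
UNK_INDEX = 1   # assumption: index for tags missing from the vocabulary
MAX_TAGS = 50   # padded folksonomy size given on the poster


def build_vocab(folksonomies: Iterable[List[str]]) -> Dict[str, int]:
    """Map each distinct tag to an integer index, in order of first appearance."""
    vocab: Dict[str, int] = {}
    for tags in folksonomies:
        for tag in tags:
            if tag not in vocab:
                vocab[tag] = len(vocab) + 2  # indices 0 and 1 are reserved
    return vocab


def encode_folksonomy(tags: List[str], vocab: Dict[str, int],
                      size: int = MAX_TAGS) -> List[int]:
    """Turn a list of tags into a fixed-length list of integer indices."""
    # Truncate folksonomies longer than `size` (an assumption of this sketch).
    indices = [vocab.get(tag, UNK_INDEX) for tag in tags[:size]]
    # Right-pad shorter folksonomies with the padding index.
    return indices + [PAD_INDEX] * (size - len(indices))


corpus = [["python", "tutorial", "web"], ["web", "design"]]
vocab = build_vocab(corpus)
encoded = encode_folksonomy(["web", "python", "free-stuff"], vocab)
print(len(encoded), encoded[:4])
```

Each encoded folksonomy is then a sequence of 50 integers, which is the kind of input an embedding layer with vector dimension 25 would map to a 50 x 25 matrix.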
- Effective countermeasures are necessary to safeguard recommendations.

Methods and Material
- Tested countermeasures:
  - Naive Bayes filtering: classification of folksonomies based on the presence of tags in them [3].
  - Support Vector Machine (SVM): suitable for binary classification tasks. A linear kernel function and a squared hinge loss function were used.
  - Deep Learning model (DL): employs a Long Short-Term Memory (LSTM)-based Recurrent Neural Network (RNN). Consists of four layers.
- Datasets:
  - Social bookmarking data were used [4].
  - Synthetic malicious data were generated to simulate attacks.
- Evaluation:
  - The Vector Space model [1] was used for computing the personalized top-k lists of recommendations.
  - 10-fold cross-validation was applied to all three algorithms tested.

References
[1] G. Salton, A. Wong, and C. S. Yang. 1975. "A vector space model for automatic indexing". Communications of the ACM 18, 11 (1975), 613-620. https://doi.org/10.1145/361219.361220
[2] M. Ramezani, J. Sandvig, T. Schimoler, J. Gemmell, B. Mobasher, and R. Burke. 2009. "Evaluating the Impact of Attacks in Collaborative Tagging Environments". In Proc. of International Conference on Computer Science and Engineering (CSE 2009). IEEE, 136-143.
[3] Mehran Sahami, Susan Dumais, David Heckerman, and Eric Horvitz. 1998. "A Bayesian Approach to Filtering Junk E-Mail". In Learning for Text Categorization: Papers from the 1998 Workshop, Madison, Wisconsin. AAAI Technical Report WS-98-05.
[4] Valerio Basile, Silvio Peroni, Fabio Tamburini, and Fabio Vitali. 2015. "Topical tags vs non-topical tags: Towards a bipartite classification". Information Science 41, 4 (2015), 486-505.
[5] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. "Distributed Representations of Words and Phrases and Their Compositionality". In Proc. of NIPS 2013, Volume 2. Curran Associates Inc.

Results
- Comparing the accuracy of the classification schemes.

Table: Classification accuracy for both attacks

             Overload attack                  Piggyback attack
Classifier   F-score    F-score   F-score    F-score    F-score   F-score
             (overall)  (legit)   (bogus)    (overall)  (legit)   (bogus)
SVM          0.9501     0.9665    0.8958     0.9680     0.97888   0.9319
BAYES        0.8339     0.9082    0.5863     0.9009     0.93878   0.7748
DL           0.9570     0.9709    0.9104     0.9728     0.9818    0.9426

- Comparing the effects of the attacks on the recommendation process. Metrics: 1) average rank, 2) affected population.

[Figure: Results for the Overload attack. Top panel: population affected by the attack (%) versus size of attack (0.1%, 0.2%, 0.4%, 1%, 2%, 10%); small values indicate strong resistance. Bottom panel: average rank of the bogus resource versus size of attack; large values indicate strong resistance. Series: No Classifier, SVM, BAYES, D-L.]
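The two evaluation metrics named above (affected population and average rank of the bogus resource) can be illustrated with a short Python sketch. The poster does not give formal definitions, so the readings used here are assumptions: a user counts as affected when the bogus resource appears in that user's top-k list, the average rank is taken over affected users only, and for the Piggyback case a user counts when the bogus resource is ranked above the popular one or the popular one is absent from the list. All function names and the toy data are hypothetical.

```python
from typing import Dict, List, Optional


def affected_population(top_k: Dict[str, List[str]], bogus: str) -> float:
    """Percentage of users whose top-k list contains the bogus resource."""
    if not top_k:
        return 0.0
    hits = sum(1 for items in top_k.values() if bogus in items)
    return 100.0 * hits / len(top_k)


def average_rank(top_k: Dict[str, List[str]], bogus: str) -> Optional[float]:
    """Mean 1-based rank of the bogus resource over the lists that contain it."""
    ranks = [items.index(bogus) + 1 for items in top_k.values() if bogus in items]
    return sum(ranks) / len(ranks) if ranks else None


def outranks_popular(top_k: Dict[str, List[str]], bogus: str, popular: str) -> float:
    """Percentage of users for whom the bogus resource ranks above the popular one."""
    if not top_k:
        return 0.0
    count = 0
    for items in top_k.values():
        if bogus not in items:
            continue
        # Assumption: a popular resource missing from the list counts as outranked.
        if popular not in items or items.index(bogus) < items.index(popular):
            count += 1
    return 100.0 * count / len(top_k)


# Toy top-3 lists for four users (illustrative data only).
top_k_lists = {
    "u1": ["a", "bogus", "pop"],
    "u2": ["pop", "b", "c"],
    "u3": ["bogus", "a", "b"],
    "u4": ["pop", "a", "bogus"],
}
print(affected_population(top_k_lists, "bogus"))
print(average_rank(top_k_lists, "bogus"))
print(outranks_popular(top_k_lists, "bogus", "pop"))
```

Under these readings, a lower affected population and a numerically larger (worse) rank for the bogus resource both indicate stronger resistance, which matches the direction stated in the figure captions.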
[Figure: Results for the Piggyback attack. Top panel: population of users for whom the bogus resource ranked higher than the popular one, versus size of attack (0.1%, 0.2%, 0.4%, 1%, 2%, 10%). Bottom panel: actual rank of the bogus resource versus size of attack. Series: No Classifier, SVM, BAYES, D-L.]

Conclusions
- Even small-scale attacks are sufficient to render a significant population of users vulnerable.
- The DL approach provides good resistance against the intrusion of bogus resources into the users' top-k lists.
- In terms of the average rank of the bogus resource, DL scales better to large attack sizes than the Bayes classifier, which seems to perform best for small attacks only.
- For the Piggyback attack, DL is second to Bayes on the same metric. However, DL improves as the attack size increases, while the Bayes classifier shows diminishing returns.

Acknowledgements
This work has been supported by Telenor Research, Norway, through the collaboration project between NTNU and Telenor. It has been carried out at the Telenor – NTNU Norwegian open AI-Lab.