
Securing Tag-based recommender systems against profile injection attacks: A comparative study

Georgios Pitsilis, Heri Ramampiaro, Helge Langseth
Department of Computer Science, Norwegian University of Science and Technology

RecSys ’18, Vancouver, Canada, October 2-7, 2018.

Introduction - Problem Statement

Conclusions

Contributions - Results

• Collaborative tagging systems are vulnerable to attacks that aim to promote or demote a product for the benefit of the attacker [2].

• Piggyback attack. Goal of the attack: to associate the target resource with a highly reputable one, so that the two appear similar.
• Overload attack. Goal of the attack: to overload the context of a tag with a bogus resource, so as to achieve a high correlation between that tag and the resource.

Synthetic data used

[Figure: Distribution of tags population in folksonomies of the del.icio.us dataset. x-axis: # tags; y-axes: # original folksonomies and # bogus folksonomies; series: legitimate, bogus.]

Deep Learning architecture

[Figure: Deep Learning architecture. Layers: Embedding layer (word embedding), LSTM, Dense Layer, Output layer (softmax). Other labels in the diagram: folksonomy, padded folksonomy to size 50 (tag 1 ... tag 50), vector dimension (25), output dimensionality (250), sigmoid, ReLU, 50 values, single value, predicted probabilities, for the legitimate class.]
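The model input is labelled on the poster as a folksonomy padded to size 50. A minimal sketch of such a preprocessing step follows, assuming that tags are first mapped to integer ids, that id 0 is reserved for padding, that padding is added at the end, and that longer folksonomies are truncated. None of these choices is stated on the poster, and the function names are hypothetical.

```python
from typing import Dict, Iterable, List

PAD_ID = 0      # assumption: id 0 is reserved for padding
MAX_TAGS = 50   # input size stated on the poster


def build_vocabulary(folksonomies: Iterable[List[str]]) -> Dict[str, int]:
    """Assign each distinct tag an integer id, starting at 1 (0 is padding)."""
    vocab: Dict[str, int] = {}
    for tags in folksonomies:
        for tag in tags:
            if tag not in vocab:
                vocab[tag] = len(vocab) + 1
    return vocab


def encode_and_pad(tags: List[str], vocab: Dict[str, int],
                   size: int = MAX_TAGS) -> List[int]:
    """Map tags to ids, truncate to `size`, then right-pad with PAD_ID."""
    # Assumption: tags missing from the vocabulary are skipped.
    ids = [vocab[t] for t in tags if t in vocab][:size]
    return ids + [PAD_ID] * (size - len(ids))


# Toy folksonomies (illustrative data, not taken from the dataset).
folksonomies = [["python", "tutorial", "code"], ["news", "python"]]
vocab = build_vocabulary(folksonomies)
padded = [encode_and_pad(f, vocab) for f in folksonomies]
```

Each entry of `padded` is a fixed-length list of 50 integers, which is the kind of input an embedding layer typically expects.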

• Comparing the accuracy of the classification schemes.

• Effective countermeasures are necessary to safeguard recommendations.

• Comparing the effects of the attacks on the recommendation process. Metrics: 1) average rank, 2) affected population.

Methods and Material

• Tested countermeasures:
  - Naive Bayes filtering: classifies folksonomies based on the presence of tags in them [3].
  - Support Vector Machine (SVM): suited to binary classification tasks. A linear kernel function and the squared hinge loss function were used.
  - Deep Learning model (DL): employs a Long Short-Term Memory (LSTM)-based Recurrent Neural Network (RNN) and consists of four layers.
• Datasets:
  - Social bookmarking data [4].
  - Synthetic malicious data, generated to simulate the attacks.
• Evaluation:
  - The Vector Space model [1] was used to compute the personalized top-k lists of recommendations.
  - 10-fold cross-validation was applied to all three algorithms tested.

References

[1] G. Salton, A. Wong, and C. S. Yang. 1975. "A vector space model for automatic indexing". Communications of the ACM 18, 11, 613-620. https://doi.org/10.1145/361219.361220

[2] M. Ramezani, J. Sandvig, T. Schimoler, J. Gemmell, B. Mobasher, and R. Burke. 2009. "Evaluating the Impact of Attacks in Collaborative Tagging Environments". In Proc. of International Conference on Computational Science and Engineering (CSE 2009). IEEE, 136-143.

[3] Mehran Sahami, Susan Dumais, David Heckerman, and Eric Horvitz. 1998. "A Bayesian Approach to Filtering Junk E-Mail". In Learning for Text Categorization: Papers from the 1998 Workshop, Madison, Wisconsin. AAAI Technical Report WS-98-05.

[4] Valerio Basile, Silvio Peroni, Fabio Tamburini, and Fabio Vitali. 2015. "Topical tags vs non-topical tags: Towards a bipartite classification". Journal of Information Science 41, 4, 486-505.

[5] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. "Distributed Representations of Words and Phrases and Their Compositionality". In Proc. of NIPS 2013, Volume 2. Curran Associates Inc.

Results

[Figure: Results for Overload attack, showing the affected population at the top (small values indicate strong resistance) and the rank of the bogus resource at the bottom (large values indicate strong resistance). y-axes: Population affected by the attack (%) and Average Rank of Bogus resource; x-axis: Size of attack (0.1% to 10%); series: No Classifier, SVM, BAYES, D-L.]

Table: Classification accuracy for both attacks

Type of attack   F-score    BAYES     SVM       DL
Overload         overall    0.8339    0.9501    0.9570
Overload         legit      0.9082    0.9665    0.9709
Overload         bogus      0.5863    0.8958    0.9104
Piggyback        overall    0.9009    0.9680    0.9728
Piggyback        legit      0.93878   0.97888   0.9818
Piggyback        bogus      0.7748    0.9319    0.9426
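The table reports F-scores per class (legit, bogus) and overall. As a reminder of how a per-class F-score is obtained, the sketch below computes precision, recall and their harmonic mean from true and predicted labels. The label strings and the toy data are illustrative only. The poster does not say how the overall score aggregates the two classes, so that step is deliberately left out.

```python
from typing import Sequence


def f_score(y_true: Sequence[str], y_pred: Sequence[str], positive: str) -> float:
    """F-score (F1) for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no correct positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for six folksonomies (illustrative, not results from the poster).
y_true = ["legit", "legit", "legit", "bogus", "bogus", "legit"]
y_pred = ["legit", "legit", "bogus", "bogus", "legit", "legit"]

f_legit = f_score(y_true, y_pred, "legit")
f_bogus = f_score(y_true, y_pred, "bogus")
```

Reporting the score separately per class matters here because bogus folksonomies are the minority, so a classifier can score well on the legit class while still missing many bogus ones.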

• Even small-scale attacks are sufficient to leave a significant share of users vulnerable.
• The DL approach provides good resistance against the intrusion of bogus resources into users' top-k lists.
• In terms of the average rank of the bogus resource, DL scales better to large attacks than the Bayes classifier, which seems to perform best for small attacks only.
• For the Piggyback attack, DL is second to Bayes on the same metric. However, DL improves as the attack size increases, while the Bayes classifier shows diminishing returns.
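The conclusions above rely on two quantities: the average rank of the bogus resource and the share of users affected by an attack. A minimal sketch of how they could be computed is given below. It assumes that each user's recommendations are available as a ranked list of resource identifiers, that a user counts as affected when the bogus resource appears in their top-k list, and that users whose list lacks the bogus resource are ignored when averaging ranks. These definitions and all names are assumptions made for illustration, not details taken from the poster.

```python
from typing import Dict, List, Optional


def rank_of(resource: str, ranked: List[str]) -> Optional[int]:
    """1-based position of `resource` in a ranked list, or None if absent."""
    return ranked.index(resource) + 1 if resource in ranked else None


def affected_population(top_k: Dict[str, List[str]], bogus: str) -> float:
    """Percentage of users whose top-k list contains the bogus resource."""
    hits = sum(bogus in ranked for ranked in top_k.values())
    return 100.0 * hits / len(top_k)


def average_rank(top_k: Dict[str, List[str]], bogus: str) -> Optional[float]:
    """Mean rank of the bogus resource over the users who were shown it."""
    ranks = [rank_of(bogus, ranked) for ranked in top_k.values()]
    present = [r for r in ranks if r is not None]
    return sum(present) / len(present) if present else None


# Toy top-3 lists for four hypothetical users.
top_k = {
    "alice": ["r1", "bogus", "r2"],
    "bob": ["r3", "r1", "r4"],
    "carol": ["r2", "r5", "bogus"],
    "dave": ["r1", "r2", "r3"],
}

affected = affected_population(top_k, "bogus")
avg_rank = average_rank(top_k, "bogus")
```

Under these assumptions a smaller affected share and a larger average rank both indicate that a countermeasure is keeping the bogus resource away from the top of users' lists.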

[Figure: Results for Piggyback attack, showing the population of users for whom the bogus resource ranked higher than the popular one (top) and its actual rank (bottom). Both panels compare No Classifier, SVM, BAYES and D-L across attack sizes from 0.1% to 10%.]

Acknowledgements

This work has been supported by Telenor Research, Norway, through the collaboration project between NTNU and Telenor. It has been carried out at the Telenor – NTNU Norwegian open AI-Lab.