Prestige-based Peer Sampling Service: Interdisciplinary Approach to Secure Gossip ∗ Gian Paolo Jesi , Edoardo Mollona
Srijith K. Nair, Maarten van Steen
Dept. of Computer Science University of Bologna - Italy E-mail: {jesi | mollona}@cs.unibo.it
Dept. of Computer Science Vrije Universiteit Amsterdam (The Netherlands) E-mail: {steen | srijith}@cs.vu.nl
ABSTRACT The Peer Sampling Service (PSS) has been proposed as a method to initiate and maintain the set of connections between nodes in unstructured peer-to-peer (P2P) networks. The PSS usually relies on gossip-style communication where participants exchange their links in a randomized way. However, the PSS network organization can be easily modified by malicious nodes running a “hub attack”, in which they achieve a leading structural position. From this prestigious status, the malicious nodes can severely affect the overlay and achieve several application-dependent advantages. We present a novel method to overcome this attack and provide results from simulation experiments that validate our claim. This method is inspired by a simple technique, used to detect social leaders in firms’ organizations, that is based on the social (structural) “prestige” of actors.
Keywords Gossip, peer sampling, security, SNA
1. INTRODUCTION
In large-scale distributed systems, such as P2P, there is a need to provide some method for sampling the network. This feature is needed, for example, to discover network properties such as the topology, or to build and maintain robust overlays [3, 6, 15]. Usually, in such systems, each node maintains a so-called local cache, or view, of logical links (e.g., IP address and port number) to other nodes (neighbors). In dynamic conditions, where nodes constantly join and leave the system, these caches must evolve to reflect the changes triggered by the dynamism. A key requirement in this kind of system is that the sampling should be uniformly random to ensure connectivity and resilience to crashes. In the absence of malicious behavior, the local caches can be successfully maintained in a pseudo-random fashion using gossip-based approaches such as the Peer Sampling Service (PSS) [6], with distinct implementations (e.g., [6, 15]).
Unfortunately, the PSS can be exploited by malicious nodes that wish, in the worst-case scenario, to defeat the overlay. As the status information spreads very quickly, non-malicious nodes may be too late in detecting that the overlay has been subverted, and any attempt to recover could be useless. Figure 1(b) and (c) show the effects of the presence of just 20 attackers interested in becoming the overlay’s structural leaders in a 1000-node network. In subfigure (b), the malicious nodes (i.e., the dark circles in the plot) have become hubs, as all the other nodes have logical links only to them. In subfigure (c), these attackers leave the network after having achieved their structural positions, causing the overlay to become completely disconnected. What is worse, these effects can be achieved in a few gossip interactions, independently of the network size.
To solve this problem we have designed and tested (by simulation) a prestige-based Secure Peer Sampling Service (SPSS) based on heuristics inspired by a Social Network Analysis (SNA) technique [4, 16] used to measure the structural prestige of nodes in a network. The conceptual idea of becoming a structural leader goes beyond the boundaries of distributed systems; in fact, it is also an interesting research topic in firms’ reorganization processes [12]. We show how techniques from SNA can be adopted in gossip-based distributed systems in order to solve the identified security issues. This approach is in line with other works that exploit social networks to build robust systems [11]. We compare our approach with another SPSS approach [8] and show its improved effectiveness in dealing with a larger set of attackers.
∗ The authors acknowledge financial support from the research project FIRB 2003 n° RBNE03HJZZ named “The evolution of clusters of firms: emerging technological and organizational architectures”.
The remainder of this paper is organized as follows: in Section 2 we first describe our scenarios and the “hub attack” model [8], in which several colluding malicious nodes are sufficient to completely partition a network using a gossip-based protocol. In Section 3, we focus on our specific problem and present the basic ideas behind our prestige-based solution. In Section 4, we describe the solution algorithm, and in Section 5 we provide the results of our experimental evaluation. Finally, we briefly survey related work and conclude with a summary, open issues and possible future work.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SAC’09 March 8-12, 2009, Honolulu, Hawaii, U.S.A. Copyright 2009 ACM 978-1-60558-166-8/09/03 ...$5.00.
2. GOSSIP AND ATTACK MODEL
We consider a network consisting of a large collection of nodes that can join or leave at any time. Leaving the network can be voluntary or due to a crash. We assume the presence of an underlying routed network (e.g., the Internet) in which any node can, in principle, contact any other node. Every node must be addressable by a unique node identifier (ID), such as an ⟨IP address, port⟩ pair. We do not address the presence of firewalls and NAT routing among peers since it has been demonstrated that they need not pose serious problems [5]. Due to scalability constraints, a node knows about only a small subset of the other participants. This subset, which may change, is stored in a local cache of size c, and the nodes whose IDs it holds are called neighbors. This set provides the connectivity for a node in the overlay; the relation “who knows whom” induced by the neighborhood set defines the overlay topology. Real-world P2P applications provide a set of well-known, highly available nodes as a bootstrap facility, which constitutes the initial neighborhood set. In general, a timestamp is associated with each distinct ID stored in the cache so that “old” ID references are eventually purged according to an aging policy. This model does not rely on time synchronization: time is measured in generic time units, or cycles, during which each node has the possibility to initiate a gossip exchange with another (randomly selected) neighbor. When the content of the cache is corrupted by the presence of bogus IDs (i.e., a polluted cache), the ability to communicate with the rest of the network can be severely compromised. In this case, a new initialization or bootstrap is required.
[Figure 1 panels: (a) healthy condition: no attackers; (b) k = 20 attackers; (c) k = 20 attackers after leaving the overlay]
Figure 1: Overlay topology in, respectively: (a) normal condition, when nodes are connected in a random fashion; (b) when the attackers (i.e., dark circles) are present in the system and become the “hubs” of the network; (c) the network becomes fully disconnected when the attackers, after polluting the other nodes’ caches, have left the network. Network size is 1000 nodes.
In our attack model, we consider a practical scenario in which a small set of colluding malicious nodes, or attackers, operate in the system. The size k of this set is equal to the cache size (k = c). The goal of an attacker is to subvert the network in order to achieve a leading structural position, i.e., to become a hub.
The attack method involves spreading fabricated data through the messages gossiped among the participants in such a way as to affect the logical links of the overlay. We suppose the attackers are “clever”, in the sense that they operate to avoid being easily discovered [10]. The attack algorithm, called the hub attack [8], can run any PSS implementation like any other well-behaving node, but its malicious messages are filled with the IDs of other malicious nodes in the system. In addition, the timestamp (ts) of each ID is manipulated to ensure that it will not be dropped by a neighbor. This behavior is perfectly valid from a PSS point of view, as the only mandatory constraint is that cache entries be distinct. Surprisingly, this intrinsically weak integrity constraint matches reality: in real-world P2P file-sharing systems (see [9, 10, 13]), when a peer receives an advertisement for an item, it does not usually verify its validity. Another kind of malicious cache attack has been proposed [7]; in this work we restrict ourselves to the hub attack, although the presented SPSS approach is also suited to that other attack.
In this paper, we consider a specific implementation of the PSS called NEWSCAST. However, our approach is generic and implementation independent. NEWSCAST is a gossip-based topology manager that builds and maintains a dynamic random graph. In this protocol, a node randomly selects one of its neighbors in order to subsequently exchange links from their respective caches. Note that such an exchange implies that (usually) the neighbor set of each node changes. Details can be found in [6].
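To make the effect of a forged message concrete, the following minimal Python sketch models a cache exchange in which a node keeps the c freshest distinct IDs, and then feeds it a hub-attack view. It is a simplification under stated assumptions, not the NEWSCAST implementation of [6]: the names `Entry`, `merge`, `forged_view` and `pollution` are hypothetical, node IDs are plain integers standing in for ⟨IP address, port⟩ pairs, and the freshest-first selection is assumed as the aging policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    node_id: int  # stands in for an (IP address, port) pair
    ts: int       # cycle at which the descriptor was created


def merge(own_cache, received, self_id, c):
    """Keep the c freshest distinct IDs from both views (assumed aging policy)."""
    freshest = {}
    for entry in list(own_cache) + list(received):
        if entry.node_id == self_id:
            continue  # a node does not keep a link to itself
        known = freshest.get(entry.node_id)
        if known is None or entry.ts > known.ts:
            freshest[entry.node_id] = entry
    ranked = sorted(freshest.values(), key=lambda e: e.ts, reverse=True)
    return ranked[:c]


def forged_view(attacker_ids, now):
    """Hub-attack message: only colluders' IDs, each with the freshest timestamp."""
    return [Entry(a, now) for a in attacker_ids]


def pollution(cache, attacker_ids):
    """Fraction of cache entries that point to attackers."""
    return sum(e.node_id in attacker_ids for e in cache) / len(cache)


c = 4
now = 10
honest_cache = [Entry(1, 9), Entry(2, 8), Entry(3, 7), Entry(5, 6)]
attackers = {100, 101, 102, 103}  # k = c colluding nodes
polluted = merge(honest_cache, forged_view(attackers, now), self_id=0, c=c)
print(pollution(polluted, attackers))
```

Because every forged entry is fresher than any honest one, the k = c malicious IDs displace the whole cache in this toy setting after one exchange.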
3. PROBLEM AND SOLUTION GUIDELINES
The effect of gossiping malicious information is to mutate the randomly organized PSS overlay into a network having a completely different kind of organization that suits the attackers’ aim. Our goal is to preserve the original organization of the PSS to avoid severe performance issues or the disruption of the overlay. The main challenge is the exponential growth of the infection, which leads to a situation where each well-behaving node ends up with no neighbors other than the attackers. As a consequence, when a node is infected, it has no way to recover, especially in a fully distributed environment where there are no trusted parties to ask for help [8]. What is worse, just a single gossip exchange (initiated by a malicious node) may be sufficient to fully pollute a node’s cache.
The basic idea and the main contribution of this work is to fit the PSS into a higher-level framework which allows a node to: (a) run a PSS implementation as usual and (b) monitor the overlay and react to structural changes when required. The key point is that the presence of the new framework is transparent to the PSS and there is no way for a node to know how the information it provides will be used by a receiving neighbor. In fact, a stochastic proportion of the gossip interactions could just be “explorative”, while the others lead to standard PSS interactions. The task of the exploration is two-fold: on the one hand it builds a particular sample of the current neighbor’s surroundings, while on the other hand it collects node IDs that may become useful if and when the PSS cache becomes polluted by the spreading infection. As there is no way to detect whether a neighbor has actually played the PSS or not, this behavior generates a dilemma that could tremendously limit the attackers’ power. As the attackers aim to mutate the overlay organization, they have to artificially promote themselves by aggressively diffusing their IDs and they cannot hide this behavior.
(a) Active Thread
    forever do
        PSS_done ← FALSE
        size ← [0 ... degree()]
        rndNSet ← RNDNEIGHBORSUBSET(size)
        wait(∆t)
        ∀ neighbor ∈ rndNSet do
            SENDSTATE(neighbor)
            n_state ← RECEIVESTATE()
            with prob. 1/size and !PSS_done do
                if CHECKIDS(n_state) do
                    PSS_done ← TRUE
                    apply PSS update.
            otherwise
                GETSTATISTICS(n_state)
        RECOVERSTATE()
(b) Passive Thread
    forever do
        n_state ← RECEIVESTATE()
        SENDSTATE(n_state.sender)
        if CHECKIDS(n_state) do
            apply PSS update.
        RECOVERSTATE()
Figure 2: SPSS pseudo-code algorithm.
Monitoring the overlay requires collecting statistics about the overlay structure, therefore a suitable property must be identified; this property has to be selected according to the particular nature of the overlay we address, which is random in our case. The nature of the property we choose and the way we check it are the key points of our solution (see [10]). For our solution we draw inspiration from social network analysis (SNA) and in particular we consider the notion of prestige of nodes in a directed network, where the nodes that receive more positive choices are considered prestigious [4]. In SNA, prestige is expressed as a particular pattern of social links. We adopt a (simple) technique to compute the structural prestige of a node in terms of popularity (or in-degree). The information about a node’s in-degree is collected during a gossip exchange among neighbors, by looking at each entry in the received caches. Since the network should be random, we expect the average value of popularity to be almost the same for each node. A node showing a popularity value far greater than the average could therefore represent a network hub. Each node builds its own knowledge about its surroundings and does not share this information with its neighbors [10].
While in a firm’s social context prestige is a desirable status (e.g., usually associated with higher respect, responsibilities and income), in our particular environment prestige is an undesirable property. In a random overlay, the “egalitarian” nature of a P2P system is even more emphasized, as it includes homogeneity in terms of structural network position.
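The popularity measure can be illustrated with a minimal Python sketch that counts how often each ID appears in the caches a node has received and flags IDs whose count is well above the average. The function names and the `factor` multiplier are assumptions made for illustration only; the text says no more than that a hub shows a popularity “far greater than the average”.

```python
from collections import Counter


def popularity(received_caches):
    """Sampled in-degree: number of received caches in which each ID appears."""
    hits = Counter()
    for cache in received_caches:
        hits.update(set(cache))  # cache entries are distinct by PSS rules
    return hits


def suspects(hits, factor=2.0):
    """IDs whose count exceeds factor * average (factor is a hypothetical knob)."""
    if not hits:
        return set()
    avg = sum(hits.values()) / len(hits)
    return {node for node, count in hits.items() if count > factor * avg}


# Four caches received from neighbors; ID 99 is advertised by all of them.
caches = [[1, 2, 99], [3, 4, 99], [5, 6, 99], [7, 8, 99]]
hits = popularity(caches)
print(hits[99], suspects(hits))
```

In a random overlay the counts stay close to one another, so an ID that shows up in almost every received cache stands out quickly.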
4. PRESTIGE-BASED SPSS ALGORITHM
Since the main problem in preventing the hub attack is the exponential speed of the infection, the main challenge is to let each node build a suitable knowledge base (KB) from which it can recover its local cache. Another SPSS approach [8] was based on building a KB according to a quality rate computed at each exchange. In such an approach, during each gossip, a node collects information regarding a single neighbor; it is a slow process, mitigated by the fact that the probability of the actual state update is proportional to the quality value. The approach we propose in this work can build a suitable KB much faster. In the following, we distinguish two aspects.
Dilemma mechanism: Figure 2 shows the prestige-based SPSS algorithm in pseudo-code. The first important difference compared to the prototypical gossip scheme (see [3]) is the absence of the SELECTPEER() function, which selects a random peer from the local cache. Here instead, a node can gossip with multiple neighbors in the same time unit (cycle); a subset of random neighbors, rndNSet, is selected from the cache by the RNDNEIGHBORSUBSET() function. This feature is similar to
the behavior of cellular automata, which in turn can be considered as gossip systems as well [3]. In normal conditions, i.e., when the number of attackers is equal to the cache size (k = c), just two gossips per cycle are sufficient to prevent the hub attack as, on average, one gossip exchange is dedicated to the PSS interaction and the other collects statistics about the overlay organization. The process of detecting the network organization is faster than the diffusion of malicious IDs. Regardless of the number of neighbors selected by RNDNEIGHBORSUBSET(), the PSS state update policy can be executed only once (by the active thread) and with a probability of 1/rndNSet.size(). The interactions with the other neighbors, on the other hand, are used to collect prestige-related information about the node’s neighborhood. The 1/rndNSet.size() probability is the reason why a single gossip is not sufficient to prevent the attack: with a single neighbor the probability is 1, so every exchange is a PSS update, which disables both the dilemma and the prestige mechanisms. The proposed mechanism produces the dilemma in which a potentially malicious node A never knows how a neighbor B is going to use the information that A provided. In fact, the more the attackers pollute by serving malicious IDs during the exchanges, the higher their chance of appearing “suspect” in terms of popularity (prestige).
Prestige mechanism: The information collected during the gossip exchanges is used to build the KB required to detect the malicious nodes with good accuracy and to eventually repair the node’s cache when it becomes polluted by the presence of malicious IDs. A table structure, the Prestige Table (PTABLE), holds tuples of the form [ node ID, #hits, TTL ], where each detected node ID is associated with a frequency value (#hits) and with a time-to-live (TTL) value expressing the time validity of the table entry.
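The dilemma mechanism of the active thread (several gossips per cycle, at most one of which applies the PSS update) can be sketched in Python as follows. This is a hedged reading of Figure 2(a), not the authors' code: the callbacks `exchange`, `check_ids`, `apply_pss_update`, `get_statistics` and `recover_state` are hypothetical stand-ins for SENDSTATE/RECEIVESTATE, CHECKIDS, the PSS update, GETSTATISTICS and RECOVERSTATE, and the subset size is drawn from 1..degree (rather than 0..degree) to avoid a division by zero.

```python
import random


def active_cycle(neighbors, exchange, check_ids, apply_pss_update,
                 get_statistics, recover_state, rng=random):
    """One iteration of the active thread's loop (sketch of Figure 2(a))."""
    if not neighbors:
        return
    size = rng.randint(1, len(neighbors))        # assumed range: 1..degree
    rnd_set = rng.sample(list(neighbors), size)  # RNDNEIGHBORSUBSET(size)
    pss_done = False
    for neighbor in rnd_set:
        n_state = exchange(neighbor)             # SENDSTATE + RECEIVESTATE
        if not pss_done and rng.random() < 1.0 / size:
            if check_ids(n_state):
                pss_done = True
                apply_pss_update(n_state)        # at most once per cycle
        else:
            get_statistics(n_state)              # explorative gossip
    recover_state()


# Demo with stub callbacks that only count what happened.
counts = {"exchanges": 0, "pss": 0, "stats": 0}


def _exchange(neighbor):
    counts["exchanges"] += 1
    return [neighbor]                            # dummy received state


def _pss(state):
    counts["pss"] += 1


def _stats(state):
    counts["stats"] += 1


rng = random.Random(42)
cycles = 100
for _ in range(cycles):
    active_cycle(list(range(20)), _exchange, lambda s: True,
                 _pss, _stats, lambda: None, rng)
print(counts)
```

The neighbor cannot tell from the exchange itself which branch was taken, which is the source of the dilemma for an attacker.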
Essentially, each node explores its neighborhood and collects data about the popularity (or frequency, expressed in #hits) with which the same node ID has been reported by the received caches. In just a single gossip, each node can collect a number of items equal to the current cache size (c items). We do not impose any size restriction on the PTABLE, which could hence eventually grow up to the size of the network; however, this is very unlikely, as the entries are purged according to an aging policy which decrements each entry’s TTL by 1 at each cycle. An entry’s TTL value is incremented (by 1) each time the same entry is detected. As the attackers tend to acquire a network-centric position, their prestige is likely to dominate over the other nodes’ entries, and its value will be far higher than the average node’s prestige. According to this simple idea, the PTABLE object maintains the average hits value, #hits_avg; this value is used as a threshold to distinguish between potential attackers (i.e., #hits_A ≥ #hits_avg for a node A) and well-behaving nodes. When an entry expires (i.e., TTL = 0), its ID is collected in a WHITELIST, as it can be (with high probability) considered a well-behaving node. If the same ID is detected in a neighbor’s cache in a subsequent gossip, it is not removed from the WHITELIST until its #hits value has reached the PTABLE #hits_avg value. These calculations and the management of the KB are handled by the GETSTATISTICS() function.
Any well-behaving node could play the PSS with an attacker that has not been detected yet, because its KB is not established enough or is even empty. The risk of having the cache polluted is always present and therefore a method to recover the node’s cache is required. In Figure 2, the function RECOVERSTATE() represents this requirement. If the KB is not empty, it does not matter if not all the attackers are known.
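The bookkeeping attributed to GETSTATISTICS() can be sketched as a small Python class. This is an illustration under assumptions, not the authors' implementation: the class and method names, the `initial_ttl` parameter (the text does not give a starting TTL) and the choice to delete an expired entry from the table when its ID moves to the whitelist are all hypothetical.

```python
class PTable:
    """Prestige table holding [node ID, #hits, TTL] tuples plus a whitelist."""

    def __init__(self, initial_ttl=5):
        self.initial_ttl = initial_ttl  # assumed starting TTL for a new entry
        self.entries = {}               # node_id -> [hits, ttl]
        self.whitelist = set()

    def hits_avg(self):
        if not self.entries:
            return 0.0
        return sum(h for h, _ in self.entries.values()) / len(self.entries)

    def observe(self, received_cache):
        """Record one received cache: bump #hits and TTL of every ID seen."""
        for node_id in received_cache:
            if node_id in self.entries:
                self.entries[node_id][0] += 1
                self.entries[node_id][1] += 1
            else:
                self.entries[node_id] = [1, self.initial_ttl]
        avg = self.hits_avg()
        # A whitelisted ID loses its status once its #hits reaches the average.
        self.whitelist -= {n for n in self.whitelist
                           if n in self.entries and self.entries[n][0] >= avg}

    def age(self):
        """End-of-cycle aging: decrement TTLs, whitelist IDs that expire."""
        for node_id in list(self.entries):
            self.entries[node_id][1] -= 1
            if self.entries[node_id][1] <= 0:
                del self.entries[node_id]  # assumption: expired entries are purged
                self.whitelist.add(node_id)

    def suspects(self):
        """IDs with #hits >= #hits_avg, the threshold stated in the text."""
        avg = self.hits_avg()
        return {n for n, (h, _) in self.entries.items() if h >= avg}


table = PTable(initial_ttl=2)
for cache in ([1, 2, 99], [3, 4, 99], [5, 6, 99]):  # ID 99 is in every cache
    table.observe(cache)
    table.age()
print(table.suspects(), sorted(table.whitelist))
```

IDs that stop being advertised expire and become candidates for cache repair, while an ID that keeps reappearing accumulates hits and TTL and stays above the average.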
In the average case, it is sufficient to have a set of node IDs in the W HITE L IST that are present in the P TABLE, but have a lower #hits value than the #hitsavg value, or that are not present at all in the P TABLE. When the KB is empty instead, the node’s cache
EXPERIMENTAL EVALUATION 1
0.1
0.01
0.001 0
0.1
20
40
60
80
100
Cycles 1% churn 0.01
5% churn
10% churn
Figure 4: Dynamic scenario: the avg. pollution level is shown over time according to distinct churn ratios. 2 gossips per cycle and k = 20 malicious nodes are adopted. Network size is 10000.
0.001 0
50
100
1 gossip 2 gossips
150 Cycles
200
250
300
4 gossips 8 gossips
Figure 3: The avg. pollution level in the caches is shown over time. Each plot represents a distinct number of gossips (i.e., 1, 2, 4, 8 respectively); k = 20 = c malicious nodes are present. Network size is 10000. The prestige-based SPSS’s evaluation reported in this section has the following goals: (a) how much time is required to achieve a stable and tolerable proportion of pollution in the node’s cache and how the number of gossips per cycles influence this performance, (b) performance in a dynamic scenario and finally (c) comparison of the prestige-based SPSS with another fully distributed SPSS, based on a different approach [8], in terms of performance in extreme conditions. In the following evaluation, we consider that the pollution level is “tolerable" if the overlay graph does not split into clusters when the malicious nodes leave the network (see [8]). If not stated otherwise, the hub attack is performed by a set of k colluding malicious nodes k = c = 20, where c is the cache size the actual PSS implementation adopted (N EWSCAST). Due to the space constraints, only the results for the 10000 nodes overlay scenario will be reported, but the results are quite similar for smaller overlay sizes (e.g., 1000 and 5000 nodes). Figure 3 shows the average pollution proportion in the node’s caches. Each plot represents a protocol setup adopting a distinct number of gossips: 1, 2, 4 and 8. In this scenario, the overlay is static; in other words, we guarantee that during the life-span of the experiments no nodes will leave or join the network for any reason. When the nodes perform just a single gossip per cycle, the prestige SPSS is indistinguishable from the ordinary PSS, where no defense is available. Starting from the two gossips setup, the situation changes dramatically. The average pollution drops to a very low (and safe) 1-2%. Increasing the number of gossips further to 4 or 8 has a very marginal benefit, if any. 
We observed that the communication cost, in terms of messages exchanged, grows much faster than the achieved pollution suppression benefit, and 2 gossips are good enough to manage the attack.

In Figure 4, we consider the performance of the prestige-based SPSS in a dynamic setup where, at each cycle, a set of nodes leaves the network and is replaced by an equal number of new nodes. This process involves only the well-behaving participants; the malicious nodes are not affected and pursue their malicious intent for the whole duration of the experiment. Three distinct churn rates are shown: 1%, 5% and 10%. Two gossips per cycle are adopted.

It is interesting to see that the dynamism of the network actually helps the prestige mechanism to keep the cache pollution low. While this may seem counter-intuitive, the behavior is easily explained. As in the static scenario, the attackers cannot subvert the system; in addition, well-behaving nodes are replaced at a constant rate and their caches are randomly initialized, which minimizes the chance of joining the network with a polluted cache and neutralizes any previous success in polluting the caches of the replaced nodes. In fact, as the churn rate increases, the average pollution in the system decreases proportionally.
5.
[Figure y-axis label: Avg. proportion of cache pollution]
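The churn process of the dynamic setup, in which only well-behaving nodes are replaced and newcomers start with randomly initialized caches, can be sketched as follows. This is an illustration under stated assumptions, not the simulation code behind Figure 4: the function name, the integer node IDs and the bootstrap policy (a newcomer's cache is sampled uniformly from the nodes that remain in the overlay) are hypothetical.

```python
import random


def churn_step(caches, malicious, rate, next_id, cache_size, rng=random):
    """Replace a fraction `rate` of the well-behaving nodes with new nodes.

    Malicious nodes never leave. Returns the next unused node ID.
    """
    honest = [n for n in caches if n not in malicious]
    leaving = rng.sample(honest, int(rate * len(honest)))
    for node in leaving:
        del caches[node]
    # Assumed bootstrap policy: sample uniformly from the remaining nodes.
    # Stale links to departed nodes in other caches are left untouched here.
    population = list(caches)
    for _ in leaving:
        size = min(cache_size, len(population))
        caches[next_id] = rng.sample(population, size)
        next_id += 1
    return next_id
```

With k = 20 attackers in a 10000-node overlay, a cache sampled this way contains few malicious IDs on average, which is consistent with the observation that churn keeps pollution low.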
could be completely polluted by an attacker. This would happen in the early stages of the protocol, when a node has just joined the overlay. The only chance to recover from such a scenario is to be contacted by another well-behaving, non-polluted node. As the overlay is pseudo-random, on average this can happen quite often.

The function CHECKIDS() is designed to perform basic cryptographic checks on the received IDs. The SPSS system relies on a central Certification Authority (CA), which is not part of the protocol but is required to provide the nodes with the credentials to join the network. We cryptographically secure each ID structure as $[ID_A, ts_{creation}, ts_{expiration}, PK_A, \sigma]$, where $ID_A$ is A's ID, $ts_{creation}$ and $ts_{expiration}$ are time-stamps, $PK_A$ is A's public key and $\sigma$ is the digital signature on the message.
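The kind of check CHECKIDS() performs on an ID structure can be sketched as below. This is a minimal illustration, not the paper's implementation: the field encoding in `message()`, the helper names and the assumption that the signature is issued by the CA are hypothetical, and HMAC-SHA256 with a demo key is used only as a stand-in so the example runs with the standard library. A real deployment would verify a public-key signature instead.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeID:
    node_id: str
    ts_creation: int
    ts_expiration: int
    public_key: bytes
    signature: bytes

    def message(self) -> bytes:
        """Bytes covered by the signature (hypothetical encoding)."""
        header = f"{self.node_id}|{self.ts_creation}|{self.ts_expiration}|"
        return header.encode() + self.public_key


def check_ids(ids, now, verify):
    """Keep only IDs that are within their validity window and correctly signed."""
    return [
        d for d in ids
        if d.ts_creation <= now < d.ts_expiration
        and verify(d.message(), d.signature)
    ]


# Stand-in for the signature scheme: HMAC-SHA256 with a demo key.
# This is NOT a public-key signature; it only keeps the sketch self-contained.
DEMO_CA_KEY = b"demo-ca-key"


def demo_sign(message: bytes) -> bytes:
    return hmac.new(DEMO_CA_KEY, message, hashlib.sha256).digest()


def demo_verify(message: bytes, signature: bytes) -> bool:
    return hmac.compare_digest(demo_sign(message), signature)


def demo_issue(node_id, ts_creation, ts_expiration, public_key):
    """Build a signed ID structure with the stand-in scheme."""
    unsigned = NodeID(node_id, ts_creation, ts_expiration, public_key, b"")
    return NodeID(node_id, ts_creation, ts_expiration, public_key,
                  demo_sign(unsigned.message()))
```

Passing `verify` as a parameter keeps the time-stamp check independent of the concrete signature scheme, so the stand-in can be swapped for a real verifier without touching `check_ids`.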
[Figure 5 bar chart: y-axis from 0 to 0.9; bar clusters: SPSS(2), PSPSS(2), SPSS(8), PSPSS(8); bars per cluster: 1%, 2.5%, 5% and 10% of attackers.]
Figure 5: Comparison between the prestige-based SPSS and a distinct SPSS approach. The size of the network is 10000 nodes. The set of attackers is expressed as a percentage of the network size: 1%, 2.5%, 5% and 10%. The thick horizontal line represents the maximum tolerable pollution level.

Figure 5 depicts an extreme attack scenario in which the number of malicious nodes k is expressed as a percentage of the overlay size and is much higher than the cache size c (k >> c); in particular, the percentages we selected are 1%, 2.5%, 5% and 10% of the overlay size. We compared the prestige-based SPSS with another SPSS algorithm based on a different mechanism [8]. Within each cluster, each bar represents a distinct percentage of attackers in the network. The thick horizontal line represents the maximum tolerable pollution level, above which the overlay runs the risk of being partitioned. To allow a fair comparison between the contenders, we adopted setups with 2 and 8 gossips for the former, versus setups with 2 and 8 overlays for the latter.

Essentially, the prestige-based SPSS can deal with a larger set of malicious nodes, and its effectiveness is not limited to the standard k = c case. However, the extreme scenario shows the benefit of multiple gossips: the first setup (2 gossips) can only deal with 1% of attackers, while the second one (8 gossips) can tolerate 5% of attackers. The other SPSS, instead, always fails when k >> c.

The infection can spread to the whole network only if the cache size c (and the number of explorative gossips) is too small compared to the attackers' population. Essentially, the attackers can successfully behave in a malicious manner if their population is sufficiently large; otherwise, the more they pollute, the more likely they are to be discovered. When the set of attackers is too large, from a well-behaving node's point of view its overlay neighborhood looks random: the malicious caches it receives are fabricated from a large set of malicious IDs and are effectively indistinguishable from non-malicious ones without a deeper knowledge of the overlay structure.

The prestige-based approach is also effective when more sophisticated malicious cache content (e.g., malicious node IDs mixed with crashed or non-existent node IDs) is adopted. However, we have not shown these results in this work due to space restrictions.
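One way to see why a large attacker population looks random to a well-behaving node is to estimate how often a single malicious ID recurs in the fabricated caches. The sketch below assumes, purely for illustration, that each fabricated cache holds c IDs drawn uniformly without replacement from the k colluding IDs; the attack model in the experiments may differ, and the function name is hypothetical.

```python
def malicious_id_frequency(cache_size, attackers):
    """Chance that one given malicious ID appears in a fabricated cache.

    Assumes the cache holds `cache_size` IDs drawn uniformly, without
    replacement, from the `attackers` colluding IDs.
    """
    return min(1.0, cache_size / attackers)


NETWORK_SIZE = 10000
CACHE_SIZE = 20  # c, as in the evaluation setup

for share in (0.01, 0.025, 0.05, 0.10):
    k = round(share * NETWORK_SIZE)
    freq = malicious_id_frequency(CACHE_SIZE, k)
    print(f"k = {k:4d}: a given malicious ID appears in {freq:.0%} "
          f"of the fabricated caches")
```

Under this assumption, with k = c every fabricated cache repeats the same 20 IDs, whereas with k equal to 10% of the overlay a given malicious ID shows up in only a small fraction of them, so no single ID stands out.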
6.
RELATED WORK
The previous SPSS approach [8], with which we made the comparison depicted in Figure 5, builds and manages distinct PSS overlays. By participating in multiple overlays, each node monitors the cache patterns received from distinct sources. The emergence of high-frequency patterns may indicate the presence of attackers, and may trigger a topology reconfiguration and the blacklisting of the potential attackers. Our novel approach, instead, does not need the extra overhead of maintaining multiple overlays: each node just monitors its neighborhood. Moreover, monitoring is indistinguishable from ordinary participation in the PSS. This poses a dilemma for the attackers: if they pollute with brute force, they will be discovered quickly and with high probability; conversely, if they pollute more cleverly (e.g., at a lower rate), they will not be discovered, but they will hardly accomplish their malicious task.

In [1], the authors present a membership sampling algorithm with which every node's sample converges to a uniform sample and which can resist the failure of a linear portion of the nodes. This approach defines and uses its own sampler algorithm, while we focus on securing an already existing sampling service (i.e., the PSS).

In [14], the authors introduce a fully decentralized approach for securing synthetic coordinate systems. They adopt a social-like, vote-based approach in which each coordinate tuple must be checked by a (small) set of other nodes. For each node producing a coordinate tuple, the set of nodes that have to check and possibly approve that tuple is given by a hash function based on each node's unique identifier. The system is very resilient to attacks that aim to introduce instabilities and inaccuracies in the underlying coordinate system. However, this approach requires the presence of a DHT facility, which adds complexity and may become an extra source of issues (e.g., DHT attacks).
Social network principles (e.g., reciprocity and structural holes [2]) are also adopted in JetStream [11] to optimize and build robust gossip systems. The basic idea is to make a predictable neighbor selection when gossiping to avoid unpredictable, excessive message overhead. In addition, the traditional scalability and reliability
of gossip are maintained.
7.
CONCLUSION
We have presented a novel SPSS mechanism that maintains the underlying random overlay in the presence of malicious nodes performing the hub attack. Our mechanism is based on a prestige-based SNA technique. The comparison with another SPSS method reveals the improved effectiveness of this approach when a larger set of malicious nodes (k >> c) is involved in the attack.
8.
REFERENCES
[1] E. Bortnikov, M. Gurevich, I. Keidar, G. Kliot, and A. Shraer. Brahms: Byzantine resilient random membership sampling. In PODC, June 2008.
[2] R. Burt. Structural Holes: The Social Structure of Competition. Harvard University Press, 1992.
[3] P. Costa, V. Gramoli, M. Jelasity, G. P. Jesi, E. Le Merrer, A. Montresor, and L. Querzoni. Exploring the Interdisciplinary Connections of Gossip-based Systems. Operating Systems Review, 41(5):51–60, October 2007.
[4] W. de Nooy, A. Mrvar, and V. Batagelj. Exploratory Social Network Analysis with Pajek. Cambridge, 2005.
[5] N. Drost, E. Ogston, R. V. van Nieuwpoort, and H. E. Bal. Arrg: real-world gossiping. In HPDC, pages 147–158, New York, NY, USA, 2007. ACM.
[6] M. Jelasity, S. Voulgaris, R. Guerraoui, A.-M. Kermarrec, and M. van Steen. Gossip-based peer sampling. ACM Trans. Comput. Syst., 25(3):8, 2007.
[7] G. P. Jesi, D. Gavidia, C. Gamage, and M. van Steen. A Secure Peer Sampling Service. UBLCS 2006-17, University of Bologna, Dept. of Computer Science, May 2006.
[8] G. P. Jesi, D. Hales, and M. van Steen. Identifying Malicious Peers Before it's Too Late: A Decentralized Secure Peer Sampling Service. In IEEE SASO, Boston, MA (USA), 2007.
[9] J. Liang, N. Naoumov, and K. Ross. The Index Poisoning Attack in P2P File Sharing Systems. In INFOCOM 2006.
[10] S. J. Nielson, S. Crosby, and D. S. Wallach. A Taxonomy of Rational Attacks. In IPTPS, LNCS. Springer, 2005.
[11] J. A. Patel, I. Gupta, and N. Contractor. JetStream: Achieving predictable gossip dissemination by leveraging social network principles. In NCA, Cambridge, MA, USA, 2006.
[12] T. Rowley, D. Behrens, and D. Krackhardt. Redundant Governance Structures: an Analysis of Structural and Relational Embeddedness in the Steel and Semiconductor Industries. Strategic Management Journal, 21, 2002.
[13] A. Rowstron and P. Druschel. Pastry: Scalable, Distributed Object Location and Routing for Large-Scale Peer-to-Peer Systems. In Middleware, Nov. 2001.
[14] M. Sherr, B. T. Loo, and M. Blaze. Veracity: A fully decentralized service for securing network coordinate systems. In 7th International Workshop on Peer-to-Peer Systems (IPTPS 2008), February 2008.
[15] S. Voulgaris, D. Gavidia, and M. van Steen. CYCLON: Inexpensive Membership Management for Unstructured P2P Overlays. J. Network Syst. Manage., 13(2), 2005.
[16] S. Wasserman and K. Faust. Social Network Analysis. Cambridge University Press, 1994.