Social Simulation of Knowledge Sharing and Reputation

Matt Jackson, Abdulmajed Dakkak, M. Afzal Upal
Electrical Engineering & Computer Science Department, University of Toledo

Abstract
Recently, there has been considerable work on mechanisms for maintaining and updating agent reputations in order to discourage deception among agents engaged in some form of commerce. Information, however, is a special commodity unlike other economic commodities. This paper presents a market of information exchange and discusses how agents' reputations are affected by their willingness to share information and by the accuracy of the information they share with other agents.

Contact: Prof. M. Afzal Upal, EECS Department, MS 308, University of Toledo, Toledo, OH 43606. Tel: 1-419-530-8145. Fax: 1-419-530-8146. Email: [email protected]

Key Words: trust, reputation, agent-based social simulation
Social Simulation of Knowledge Sharing and Reputation
Matt Jackson, Abdulmajed Dakkak, M. Afzal Upal

Reputation has received considerable attention in recent years, with different disciplines using it to describe different phenomena in their fields. Economists, for example, have used reputation to explain "irrational" moves in economic games [10]. Computer scientists, on the other hand, have used it to detect anomalous user ratings on online e-commerce sites [1][2][3] and to find good-quality files and reliable sources on P2P file-sharing services [4][11]. Evolutionary biologists, sociologists, and psychologists [7] also use reputation to explain many of the phenomena they study.

The purpose of this project is to examine how reputation is built within a population through the sharing of information. Our experiment attempts to discover what effect sharing information has on reputation. By simulating a small world whose inhabitants share the knowledge they have learned, we hope to reveal new information about the factors that influence reputation.

The Wumpus World environment [15] is a well-known simplified environment regularly used in the field of Artificial Intelligence. The world is represented on a two-dimensional grid, with each cell possibly containing a number of objects: a wumpus, a pit, or gold. The standard wumpus world has one agent on the grid, whose goal is to navigate the board and find the gold while avoiding the wumpus and the pits. An agent has three senses to aid it in finding the gold and avoiding danger:
• Stench – indicates that the wumpus is in one of the four neighboring cells.
• Breeze – indicates that a pit is in one of the four neighboring cells.
• Glitter – indicates that gold is in the current cell.

Although we use the wumpus world as our environment, we have modified the goals of the game to suit our purposes. Our world is populated by multiple agents acting simultaneously, and we have removed the goal of finding the gold. Thus, there is no end state in our simulation; it simply runs for the allotted number of turns. In our wumpus world, every active agent moves one step each turn, and then, if multiple agents occupy the same cell, they trade information in order to gain knowledge of the world. Based on the quantity and quality of this information, an agent builds up a reputation. The purpose of this simulation is to investigate the dynamics of the spread of information through the agents' reputations.
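As a concrete illustration of the environment just described, the following minimal Python sketch shows how the three percepts can be derived from the grid contents. The class and function names (WumpusWorld, neighbors, percepts) are our own illustrative choices rather than part of the simulation's actual code.

    # Minimal sketch of percept generation in a wumpus-world-style grid.
    # All class and function names here are illustrative, not the authors' code.
    WUMPUS, PIT, GOLD = "wumpus", "pit", "gold"

    class WumpusWorld:
        def __init__(self, width, height, objects):
            # objects maps (x, y) -> a set of contents, e.g. {"pit"} or {"wumpus", "gold"}
            self.width, self.height = width, height
            self.objects = objects

        def neighbors(self, x, y):
            """Return the four orthogonally adjacent cells that lie on the grid."""
            candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
            return [(cx, cy) for cx, cy in candidates
                    if 0 <= cx < self.width and 0 <= cy < self.height]

        def percepts(self, x, y):
            """Stench/breeze are sensed next to a wumpus/pit; glitter where the gold lies."""
            sensed = set()
            for cell in self.neighbors(x, y):
                contents = self.objects.get(cell, set())
                if WUMPUS in contents:
                    sensed.add("stench")
                if PIT in contents:
                    sensed.add("breeze")
            if GOLD in self.objects.get((x, y), set()):
                sensed.add("glitter")
            return sensed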
Proposed Solution
Our solution is to place many agents in the wumpus world and observe how the reputation [14] of agents changes with respect to a number of different variables. Three different types of agents were used in this experiment:
• Normal agent – the regular agent described by the wumpus world problem. When it encounters a breeze or a stench, it marks the adjacent cells as possibly containing a wumpus or a pit.
• Paranoid agent – an agent that assumes the worst. If a breeze is encountered, for example, the agent marks every surrounding cell as a definite pit unless it already knows that the cell is safe. The same happens when a stench is encountered.
• Dumb agent – an agent that does not have the ability to perceive breezes or stenches. In our simulation the dumb agents were the only agents able to report cells as safe when they are not.
Our reason for creating three different types of agents was to better simulate the real world, which contains both paranoid and risky people. The original wumpus world game is designed to build an accurate representation of the world, with no chance of incorrectly labeling a cell. Because we wanted to examine how reputation is affected by the giving of inaccurate information, we designed the paranoid and dumb agents to ignore or mislabel cells, thereby introducing inaccurate knowledge into the simulation.

Agents have the following attributes:
• ID – a unique identification number used to distinguish between agents.
• Age – the number of steps an agent has taken before it dies.
• Share ratio – the ratio of units of information shared per unit of information received. In our simulation the value was 0.5, 1, or 2.
• Memory – contains the agent's self-derived beliefs. Each agent believes that what it knows is the truth; this knowledge may change during the game.
• Learned memory – contains information provided by other agents. The agent records the information about the cell as well as the agent that provided it.
• Solution time – the age at which the agent knows the whole state of the board, from its own memory and from learned memory. This total picture of the board may not, however, be the truth.
• Reputations – each agent keeps a list of what it believes to be the reputation of every other agent in the world. These values are the agent's own beliefs and are not related to the beliefs of any other agents.
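As a rough illustration, the agent state listed above could be represented by a Python record such as the following; the field names are ours and only approximate the attributes described in the list.

    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class Agent:
        agent_id: int                    # unique identifier (ID)
        kind: str                        # "normal", "paranoid", or "dumb"
        share_ratio: float               # 0.5, 1, or 2 in the reported runs
        age: int = 0                     # steps taken so far
        solution_time: Optional[int] = None  # age at which the whole board is known, if ever
        memory: dict = field(default_factory=dict)          # (x, y) -> own belief: "visited", "safe", "wumpus", or "pit"
        learned_memory: dict = field(default_factory=dict)  # (x, y) -> (reported belief, reporting agent's id)
        reputations: dict = field(default_factory=dict)     # other agent's id -> reputation score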
Sharing Information
When two agents, say agent A and agent B, meet (are present on the same cell at the end of a turn), a few steps are taken in order to share information. Agent A asks agent B for its share ratio. On receiving it, agent A gives agent B a number of cells equal to the reciprocal of agent B's share ratio (a cell is defined as all the information an agent knows about a specific coordinate on the board). Agent B in turn shares with agent A a number of cells equal to the reciprocal of agent A's share ratio. So, for example, if agent A's share ratio is 0.5 (meaning it wants information about two cells for every cell it shares) and agent B's share ratio is 1, then when the two agents meet agent B shares two cells while agent A shares only one. The algorithm is shown below:

Algorithm 1: Sharing Information
  When two agents meet:
    Agent A shares (1 / agent B's share ratio) cells
    Agent B shares (1 / agent A's share ratio) cells
Algorithm 2: Choose Cell to Share
  While (true):
    Cell = pick a random cell from the world
    If Cell is visited, or Cell is a wumpus, or Cell is a pit:
      return Cell
The cells to share are chosen randomly, with the constraint that the agent must have visited the cell it is going to share, or must have determined that the cell contains either a pit or a wumpus.
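Read together, Algorithms 1 and 2 might be sketched in Python as below. The helper names, the belief labels ("visited", "wumpus", "pit"), and the rounding of fractional cell counts are our assumptions; the rejection loop of Algorithm 2 is replaced here by direct sampling from the shareable cells, which has the same effect.

    import math
    import random

    SHAREABLE = {"visited", "wumpus", "pit"}  # beliefs an agent is allowed to pass on

    def choose_cell_to_share(agent):
        """Algorithm 2: pick a random cell the agent has visited or believes holds a hazard."""
        candidates = [cell for cell, belief in agent.memory.items() if belief in SHAREABLE]
        return random.choice(candidates) if candidates else None

    def share_information(agent_a, agent_b):
        """Algorithm 1: each agent shares the reciprocal of the other agent's share ratio.
        Fractional counts (e.g. 1/2 = 0.5 cells) are rounded up here; the paper does not
        specify a rounding policy, so this is an assumption."""
        cells_from_a = math.ceil(1 / agent_b.share_ratio)  # cells A gives to B
        cells_from_b = math.ceil(1 / agent_a.share_ratio)  # cells B gives to A
        for _ in range(cells_from_a):
            cell = choose_cell_to_share(agent_a)
            if cell is not None:
                agent_b.learned_memory[cell] = (agent_a.memory[cell], agent_a.agent_id)
        for _ in range(cells_from_b):
            cell = choose_cell_to_share(agent_b)
            if cell is not None:
                agent_a.learned_memory[cell] = (agent_b.memory[cell], agent_b.agent_id)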
The Movement of Agents
An agent moves in a predictable manner, choosing first to explore unvisited cells that it knows to be safe. Only when it has exhausted all of the moves it can take safely does it consult what it was told by other agents, using the shared information in its learned memory to determine where to move safely next. Normal agents will move to any cell designated as safe in learned memory, while paranoid agents additionally require that the informing agent has actually visited the cell and marked it as safe.
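A minimal sketch of this movement policy, under the assumptions that ties are broken at random and that an agent with no safe option stays put, might look as follows; the helper names are illustrative.

    import random

    def choose_next_move(agent, current_cell, world):
        """Sketch of the movement policy: prefer cells the agent itself believes are safe
        but has not yet visited; otherwise fall back on cells reported safe by others."""
        options = world.neighbors(*current_cell)

        # 1. Neighbours the agent's own memory marks as safe but not yet visited.
        own_safe = [c for c in options if agent.memory.get(c) == "safe"]
        if own_safe:
            return random.choice(own_safe)

        # 2. Neighbours reported safe in learned memory. Paranoid agents only trust
        #    reports from agents that actually visited the cell; normal agents also
        #    accept plain "safe" reports.
        learned_safe = []
        for c in options:
            report = agent.learned_memory.get(c)
            if report is None:
                continue
            belief, _source = report
            if belief == "visited" or (agent.kind == "normal" and belief == "safe"):
                learned_safe.append(c)
        if learned_safe:
            return random.choice(learned_safe)

        # 3. Nothing safe is known; staying put is our fallback assumption.
        return current_cell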
The Reputation Algorithm
As the agents move across the grid they encounter two states that are of interest to us. The first is meeting another agent. When that occurs, the agents share some information (as described above) and adjust each other's reputations according to the following algorithm.

Algorithm 3: Reputation algorithm part 1
  When two agents meet:
    Call the share-information function
    If the shared information is for a cell that has not been visited:
      Store the information but do not adjust reputation
    Else if the shared information confirms what is known:
      Increase the reputation of the other agent
    Else if the shared information contradicts what is known:
      Decrease the reputation of the other agent
The second state occurs when an agent has accepted information from another agent about an unvisited cell and later visits that cell. In that case a second algorithm is used to adjust the reputation of the informing agent.

Algorithm 4: Reputation algorithm part 2
  For each step taken by an agent:
    If the current cell's state was reported by another agent and that information is now verified to be true:
      Increase the other agent's reputation by one
    Else if what was reported is not true:
      Decrease the other agent's reputation by one
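Taken together, Algorithms 3 and 4 could be sketched along the following lines. The unit-sized reputation increments for Algorithm 3 and the helper names are our illustrative assumptions.

    FIRST_HAND = {"visited", "wumpus", "pit"}   # beliefs the receiver can check against

    def update_reputation_on_meeting(receiver, source, cell, shared_belief):
        """Algorithm 3: adjust the source's reputation when its information arrives."""
        known = receiver.memory.get(cell)
        if known not in FIRST_HAND:
            # The receiver has no first-hand knowledge of this cell: store the claim
            # and defer judgement until the cell is visited (Algorithm 4).
            receiver.learned_memory[cell] = (shared_belief, source.agent_id)
            return
        delta = 1 if shared_belief == known else -1   # confirmation vs. contradiction
        receiver.reputations[source.agent_id] = (
            receiver.reputations.get(source.agent_id, 0) + delta)

    def update_reputation_on_visit(agent, cell, observed_belief):
        """Algorithm 4: once a previously reported cell is visited, verify the claim."""
        report = agent.learned_memory.get(cell)
        if report is None:
            return
        claimed_belief, source_id = report
        delta = 1 if claimed_belief == observed_belief else -1
        agent.reputations[source_id] = agent.reputations.get(source_id, 0) + delta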
Evaluation
We ran a number of simulations in order to gather accurate results. In each simulation we varied a number of system parameters to observe their effects. These parameters included: the number of each type of agent present in the world, the size of the world, the length of time the simulation runs, the number of pits, and the number of wumpuses. We recorded the results of each simulation, calculated the average reputation value for each type of agent, and compared it against one of the variables to generate the graphs below. Another comparison was made with respect to the share ratio of the agents.
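As a rough illustration only, the swept parameters could be collected into a configuration record such as the one below; the field names and default values are ours and do not reproduce the exact settings used in these experiments.

    from dataclasses import dataclass

    @dataclass
    class SimulationConfig:
        world_width: int = 20        # size of the grid
        world_height: int = 20
        num_normal: int = 50         # number of agents of each type
        num_paranoid: int = 50
        num_dumb: int = 50
        num_pits: int = 5            # pits and wumpuses together form the "hazards"
        num_wumpuses: int = 5
        num_turns: int = 1000        # length of the simulation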
[Figure 1: Reputation vs. Number of Agents (by Agent Type). x-axis: number of agents (100, 150, 180); y-axis: reputation; series: Dumb, Normal, Paranoid.]
First, in figure 1, we related agents' reputations to the number of agents present in the simulation. The reputation of dumb agents was almost negligible compared to that of normal and paranoid agents, while normal and paranoid agents showed some fluctuation, but none significant enough to declare any correlation.
[Figure 2: Reputation vs. Number of Agents (by Share Ratio). x-axis: number of agents (100, 150, 180); y-axis: reputation; series: share ratios 0.5, 1, 2.]
Figure 2 also related reputation to the number of agents present, but this time grouped by the agents' share ratio rather than their type. The graph did show some increasing trend, but again not enough to reach any conclusion.

[Figure 3: Reputation vs. Number of Hazards (by Agent Type). x-axis: number of hazards (6, 7, 8, 10, 12, 18); y-axis: reputation; series: Dumb, Normal, Paranoid.]
Next, in figure 3, we related reputation to the number of hazards (the combined number of pits and wumpuses) present in a world. Again, dumb agents showed little change, accumulating almost no reputation compared with the other agents. Normal and paranoid agents both showed a decreasing trend as the number of hazards increased.

[Figure 4: Reputation vs. Number of Hazards (by Share Ratio). x-axis: number of hazards (6, 7, 8, 10, 12, 18); y-axis: reputation; series: share ratios 0.5, 1, 2.]
Figure 4 again showed the same downward trend seen in figure 3 for each of the share ratios, and showed that agents who share more (those with a share ratio of 2) maintained a relatively higher reputation overall as the number of hazards increased. The downward trend seen in figures 3 and 4 could be caused by a number of factors, including that an increased number of hazards prevented agents from meeting each other as often, resulting in lower reputation values overall.
[Figure 5: Reputation vs. Time (by Agent Type). x-axis: time in turns (1,000 to roughly 59,000); y-axis: reputation (0 to 50,000); series: Dumb, Normal, Paranoid, together with estimated curves Est Dumb, Est Norm, Est Par.]
Figure 5 related reputation to the duration of a simulation. We ran a number of simulations for an increasing number of turns and plotted the relationship. Normal and paranoid agents both show a fairly linear increase in reputation as time increases, while the dumb agents' reputation values are so small that they do not register on the graph. The estimated values were also plotted on the same graph and were calculated using the following formula:

Algorithm 5: Estimating reputation
  For α > 1000: ρ(α) ≈ ρ(1000) · α / 1000, where α is the age of the agent and ρ(α) is its reputation at age α.
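Expressed as code (the function name is ours), the extrapolation is simply:

    def estimate_reputation(rep_at_1000, age):
        """Linearly extrapolate an agent's reputation from its value at age 1000 turns,
        as in Algorithm 5; intended for age > 1000."""
        return rep_at_1000 * age / 1000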
[Figure 6: Reputation vs. Time (by Share Ratio). x-axis: time in turns (1,000 to roughly 25,000); y-axis: reputation (0 to 25,000); series: share ratios 0.5, 1, 2, together with estimated curves Est 0.5, Est 1, Est 2.]
Figure 6 also showed a linear relationship between time and reputation for each share ratio. As expected, and as was seen in figure 4, the results show that agents who share more generally have a higher reputation. The estimation was also plotted and was so accurate that the actual plot was hidden by the estimated plot.

[Figure 7: Solution Time vs. Number of Agents (by Agent Type). x-axis: number of agents (220, 300, 320, 360, 380, 400); y-axis: solution time; series: Normal, Paranoid.]
Figure 7 compared the solution time with the number of agents on the board. The results show somewhat random behavior, and thus no correlation; however, it can be seen that normal agents on average had a lower solution time than paranoid agents. Here a lower number indicates that the agent was able to solve the world faster. The dumb agents were excluded from this test, as we believed they would not impact the results.

[Figure 8: Solution Time vs. Number of Agents (by Share Ratio). x-axis: number of agents (220, 300, 320, 360, 380, 400); y-axis: solution time; series: share ratios 0.5, 1, 2.]
The final plot, figure 8, compared the solution time with respect to the share ratio and also showed somewhat random behavior. It does suggest that the share ratio does not affect the solution time, since the plots are packed closely together. This was unexpected, since we believed that agents who take in more information (those with a lower share ratio in our simulation) would learn faster, while agents who take in less information would learn more slowly.

Overall, these simulations resulted in a number of interesting findings that could be applicable in the real world. We see from the results that the reputation of agents increases linearly with time (figures 5, 6). We can also see that the reputation of agents decreases as the number of hazards increases (figures 3, 4). We also found that normal and paranoid agents were roughly equally reputable in our simulations, with no clear superiority for either type, while dumb agents performed even worse than anticipated (figure 3). Examining the ages of the dumb agents revealed that most died within the first dozen turns, while the rest of the agents survived for the remainder of the test.
Conclusion
The goal of this paper was to examine how the sharing of information affects a person's reputation. Our results provide several pieces of evidence toward this goal. First, sharing more information results in a higher reputation, as indicated by our plots of reputation over time. Second, being generous results in a higher reputation, as seen in the plots showing that the share ratio of 2 yields the highest reputation. A few other conclusions that can be drawn from the simulations are that as the wumpus world becomes more dangerous, reputations tend not to be as high, and that paranoia tends to slow down how efficiently a person learns about their world. One result that surprised us was the performance of the paranoid agents, whom we had expected to perform slightly worse than the normal agents. Instead, they performed equivalently, and sometimes even surpassed the normal agents. This seems to contradict their design, in which they are prone to giving false information due to their paranoia; future work could investigate this phenomenon. Another interesting finding was that we could accurately estimate reputation over time based on an agent's early reputation. Future work could explore how this applies in other reputation settings.
References
[1] B. Yu and M. P. Singh. Distributed Reputation Management for Electronic Commerce. Computational Intelligence, 18(4):535-549, 2002.
[2] C. Dellarocas. Analyzing the Economic Efficiency of eBay-like Online Reputation Reporting Mechanisms. In Proceedings of the 3rd ACM Conference on Electronic Commerce, October 2001.
[3] D. Houser and J. Wooders. Reputation in Auctions: Theory and Evidence from eBay. Working paper, University of Arizona, 2001.
[4] E. Damiani, S. De Capitani di Vimercati, S. Paraboschi, P. Samarati, and F. Violante. A Reputation-Based Approach for Choosing Reliable Resources in Peer-to-Peer Networks. In Proceedings of the 9th ACM Conference on Computer and Communications Security, November 2002.
[5] J. M. Pujol, R. Sanguesa, and J. Delgado. Extracting Reputation in Multi-Agent Systems by Means of Social Network Topology. In Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems, pages 467-474, 2002.
[6] J. Sabater and C. Sierra. REGRET: A Reputation Model for Gregarious Societies. In C. Castelfranchi and L. Johnson, editors, Proceedings of the First International Joint Conference on Autonomous Agents and Multi-Agent Systems, pages 475-482, 2002.
[7] J. Yen, J. Yin, M. Miller, D. Xu, R. A. Volz, and T. R. Ioerger. CAST: Collaborative Agents for Simulating Teamwork. In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, 2001.
[8] L. Mui, M. Mohtashemi, and A. Halberstadt. A Computational Model of Trust and Reputation. In Proceedings of the 35th Hawaii International Conference on System Sciences (HICSS), 2002.
[9] L. Mui, M. Mohtashemi, and A. Halberstadt. Notions of Reputation in Multi-Agent Systems: A Review. In Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems, pages 280-287, 2002.
[10] M. Celentani, D. Fudenberg, D. K. Levine, and W. Pesendorfer. Maintaining a Reputation Against a Long-Lived Opponent. Econometrica, 64(3):691-704, 1996.
[11] M. Gupta, P. Judge, and M. H. Ammar. A Reputation System for Peer-to-Peer Networks. In Proceedings of NOSSDAV, June 2003. http://citeseer.ist.psu.edu/gupta03reputation.html
[12] B. Yu and M. P. Singh. An Evidential Model of Distributed Reputation Management. North Carolina State University. http://citeseer.ist.psu.edu/625493.html
[13] S. J. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, 2nd edition, 2003.