A Computational Trust Model for Semantic Web Based on Bayesian Decision Theory

Xiaoqing Zheng, Huajun Chen, Zhaohui Wu, and Yu Zhang

College of Computer Science, Zhejiang University, Hangzhou 310027, China
{zxqingcn, wzh, huajunsir, zhangyu1982}@zju.edu.cn
Abstract. Enabling trust to ensure more effective and efficient agent interaction is at the heart of the Semantic Web vision. In this paper we propose a computational trust model based on Bayesian decision theory. Our trust model combines a variety of sources of information to help users make correct decisions in choosing appropriate providers according to their preferences, expressed through prior information and a utility function, and takes three types of costs (operational, opportunity and service charges) into account during trust evaluation. Our approach gives trust a strict probabilistic interpretation and lays a solid foundation for trust evaluation on the Semantic Web.
1 Introduction

In this paper, we present a trust model based on Bayesian decision theory that combines prior and reputation information to produce a composite assessment of an agent's likely quality, and that quantifies three types of cost incurred during trust evaluation: operational costs, opportunity costs and service charges. Consider a scenario in which a user (initiator agent) wants to find an appropriate service provider (provider agent) on the Semantic Web, and his problem is which provider may be the most suitable for him. Assume that he maintains a list of acquaintances (consultant agents) and that each acquaintance has a reliability factor denoting to what degree this acquaintance's statements can be believed. While estimating the quality of different providers and selecting the best one among them, he can 'gossip' with his acquaintances by exchanging information about their opinions. We call the integration of a number of such opinions, from acquaintances and acquaintances of acquaintances, the reputation of a given provider agent. We also consider the initiator agent's prior information, that is, direct experience from past interactions with the provider agents. Trust is then generated by combining prior and reputation information.
2 Trust Model for Semantic Web

There are many different kinds of costs and benefits an agent might incur when communicating or dealing with other agents, and the trust model should balance these costs and benefits. Following [K. O'Hara et al., 2004], the costs and utility associated with trust evaluation include the following.
Operational Cost. Operational costs are the expenses of computing the trust value; in other words, the cost of setting up and operating the whole trust plan. The more complex the algorithm, the higher this cost is expected to be.

Opportunity Cost. Opportunity cost is the loss incurred by missing the possibility of making a better decision through further investigation.

Service Charge. Service charges come in two types: the consultant fee, incurred when an agent asks consultant agents for their opinions, and the final service charge, paid to the selected provider agent.

Utility Function. To work mathematically with ideas of "preferences", it is necessary to assign numbers indicating how much something is valued. Such numbers are called utilities. A utility function can be constructed to state preferences and is used to estimate the possible consequences of decisions.

2.1 Closed Trust Model and Bayesian Decision Theory

We assume that the quality of a provider agent can be considered an unknown numerical quantity, θ, which can be treated as a random quantity. Consider the situation where an agent A tries to estimate a trust value for provider agent B. A holds subjective prior information about B, denoted by the distribution π(θ), and requests A's acquaintances to give opinions on B's quality. After A receives the assessments of B's quality from its acquaintances, A takes these opinions as a sample about θ. The outcome of this sample is a random variable, X. A particular realization of X will be denoted x, and X has density f(x|θ). We can then compute the "posterior distribution" of θ given x, denoted π(θ|x). Just as the prior distribution reflects beliefs about θ before investigating B's reputation, π(θ|x) reflects the updated beliefs about θ after observing the sample x. If we need to investigate B's quality again for more accuracy, π(θ|x) is used as the prior for the next stage instead of the original π(θ). We also construct a utility function for agent A's owner, represented by UA(r), to express his preferences, where r represents the rewards of the consequences of a decision. Supposing that π(θ|x) is the posterior distribution of provider agent B, the expected utility of UA(r) over π(θ|x), denoted Eπ(θ|x)[UA(r)], is the possible gain from selecting B. If several provider agents can be considered, we simply select the one that yields the greatest expected utility.

By treating an agent as a node and the "knows" relationship as an edge, a directed graph emerges. To facilitate the model description, agents and their environment must be defined. Consider the scenario in which agent A is evaluating the trust of B and C as potential business partners. The set of all consultant agents that A asks for this evaluation, together with A, B and C, can be considered a unique society of agents N. In our example (see Figure 1), N is {A, B, C, D, E, F, G, H, I, J, K} and is called a "closed society of agents" with respect to A.

Fig. 1. A "closed society of agents" for agent A
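The selection rule just described fits in a few lines. The following is a minimal sketch of our own (not the paper's implementation), assuming discrete quality levels and reusing the priors, service charges and utility function of the worked example that follows.

```python
# Sketch of the decision rule: compute the expected utility of each candidate
# provider under its quality distribution, subtract its service charge, and
# pick the maximum. Numbers are taken from the worked example in this section.

QUALITIES = ("good", "average", "bad")

def expected_utility(dist, utility):
    """E_pi[U_A(r)] for a discrete distribution over service quality."""
    return sum(dist[q] * utility[q] for q in QUALITIES)

def choose_provider(candidates, utility):
    """candidates: provider name -> (quality distribution, service charge)."""
    return max(candidates,
               key=lambda p: expected_utility(candidates[p][0], utility) - candidates[p][1])

if __name__ == "__main__":
    utility = {"good": 900, "average": 600, "bad": 100}            # U_A(r)
    candidates = {
        "B": ({"good": 0.30, "average": 0.50, "bad": 0.20}, 400),  # prior of A on B, SC_B
        "C": ({"good": 0.33, "average": 0.34, "bad": 0.33}, 450),  # prior of A on C, SC_C
    }
    print(choose_provider(candidates, utility))                    # "B" (190 vs. 84)
```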
Decisions are more commonly called actions, and the set of all possible actions under consideration will be denoted A. In our example, agent A is trying to decide whether to select agent B (action b) or agent C (action c) as a business partner (A = {b, c}). The service charges of B and C are 400 and 450 units respectively (SCB = 400, SCC = 450). We treat the quality of service, θ, as a discrete variable; θ, which affects the decision process, is commonly called the state of nature. In making a decision it is clearly important to consider what the possible states of nature are. The symbol Θ denotes the set of all possible states of nature, and Θ = {good, average, bad} represents three levels of service quality.

Table 1. Conditional density of X given θ

            | θ = Good | θ = Average | θ = Bad
x = Good    |   0.60   |    0.10     |  0.05
x = Average |   0.30   |    0.70     |  0.15
x = Bad     |   0.10   |    0.20     |  0.80

In Figure 1, a notation on an edge between the initiator and a consultant agent, or between two consultant agents, represents a reliability factor; a notation between a consultant and a provider agent is an assessment of service quality; and a notation between the initiator and a provider agent is prior information. In our example, the prior of A on B is {πB(θg) = 0.3, πB(θa) = 0.5, πB(θb) = 0.2}, that is, PB(good) = 0.3, PB(average) = 0.5 and PB(bad) = 0.2. We also suppose that the prior of A on C is {πC(θg) = 0.33, πC(θa) = 0.34, πC(θb) = 0.33}. The probability distribution of X, which represents the assessments of service quality from the consultant agents, is another discrete random variable with density f(x|θ), shown in Table 1. We also construct a utility function {UA(r = good) = 900, UA(r = average) = 600, UA(r = bad) = 100} for agent A's owner, which states how much money he is willing to pay for each outcome. We take the assessments of B's quality from the consultant agents as a sample about θB, and to decide which sample information should be used with higher priority we propose the following ordering rules: (1) the sample from the agent with the shorter referral distance is used first; (2) if samples come from agents with the same referral distance, the one with the larger reliability factor takes priority over the one with the smaller. Note that θ and X have the joint (subjective) density

h(x, θ) = π(θ)f(x|θ)    (1)

and that X has the marginal (unconditional) density

m(x) = ∫Θ f(x|θ)π(θ) dθ    (2)

so it is clear that (provided m(x) ≠ 0)

π(θ|x) = h(x, θ) / m(x)    (3)
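As an illustration (ours, not part of the paper's system), the following sketch applies equations (1)-(3) in their discrete form, using Table 1 and the prior of A on B; it reproduces the unrectified posterior quoted in the next paragraph.

```python
# Discrete Bayes update, eqs. (1)-(3): joint h(x,theta) = pi(theta) f(x|theta),
# marginal m(x) = sum over theta of h(x,theta), posterior = h(x,theta) / m(x).

QUALITIES = ("good", "average", "bad")

# Table 1: f(x | theta), indexed as F[x][theta]
F = {
    "good":    {"good": 0.60, "average": 0.10, "bad": 0.05},
    "average": {"good": 0.30, "average": 0.70, "bad": 0.15},
    "bad":     {"good": 0.10, "average": 0.20, "bad": 0.80},
}

def posterior(prior, x):
    """pi(theta | x) via Bayes' rule."""
    joint = {q: prior[q] * F[x][q] for q in QUALITIES}   # eq. (1)
    m_x = sum(joint.values())                            # eq. (2), discrete form
    return {q: joint[q] / m_x for q in QUALITIES}        # eq. (3)

if __name__ == "__main__":
    prior_B = {"good": 0.3, "average": 0.5, "bad": 0.2}
    print(posterior(prior_B, "average"))
    # approximately {'good': 0.1915, 'average': 0.7447, 'bad': 0.0638}
```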
Now we begin to evaluate the trust value of B for A using the above data and information. First, we use the sample information from D through the path A→D→B; the posterior distribution πB(θ|x) of θB, given xa, is {πB(θg|xa) = 0.1915, πB(θa|xa) = 0.7447, πB(θb|xa) = 0.0638}. However, the reliability factor has not yet been considered when calculating this posterior distribution. We use the following formula to
rectify the above πB(θ|xa), where π(θ), πold(θ|x) and πnew(θ|x) represent the prior distribution, the posterior distribution before rectification and the posterior distribution after rectification, respectively, and R is the reliability factor:

πnew(θi|x) = π(θi) + [πold(θi|x) − π(θi)] × R,  i ∈ {g(good), a(average), b(bad)}    (4)
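The rectification of equation (4) is equally short. The sketch below is again illustrative only: the raw posterior is pulled back toward the prior in proportion to the reliability factor R of the referral path (R = 1 keeps the posterior as is, R = 0 discards the opinion).

```python
# Reliability rectification, eq. (4):
# pi_new(theta_i|x) = pi(theta_i) + [pi_old(theta_i|x) - pi(theta_i)] * R

QUALITIES = ("good", "average", "bad")

def rectify(prior, post_old, R):
    """Blend the raw posterior with the prior according to reliability R."""
    return {q: prior[q] + (post_old[q] - prior[q]) * R for q in QUALITIES}

if __name__ == "__main__":
    prior_B = {"good": 0.3, "average": 0.5, "bad": 0.2}
    post_raw = {"good": 0.1915, "average": 0.7447, "bad": 0.0638}  # from the previous step
    print(rectify(prior_B, post_raw, R=0.85))
    # approximately {'good': 0.2078, 'average': 0.7080, 'bad': 0.0842}
```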
So, after rectification, πB(θg|xa) = 0.3 + (0.1915 − 0.3) × 0.85 = 0.2078, πB(θa|xa) = 0.5 + (0.7447 − 0.5) × 0.85 = 0.7080, and πB(θb|xa) = 0.2 + (0.0638 − 0.2) × 0.85 = 0.0842. The remaining steps for B and the whole process for C are shown in Tables 2 and 3. We merge two or more reliability factors by multiplication. For example, if the reliability factor of the edge A→E is 0.9 and that of E→G is 0.8, then the reliability factor of the path A→G is 0.72 (0.9 × 0.8). The rationale for multiplication is that a statement is true only if all the agents that propagate it tell the truth, and whether any two agents lie is assumed to be independent. After obtaining the posterior distributions of B and C, we can compare the expectation of UA(r) over πB(θ|x) minus SCB with the expectation of UA(r) over πC(θ|x) minus SCC, and simply select the agent that will produce more utility. The expected utilities of B and C are

Utility of B = EπB(θ|x)[UA(r)] − SCB = 0.2837 × 900 + 0.6308 × 600 + 0.0855 × 100 − 400 = 242.36
Utility of C = EπC(θ|x)[UA(r)] − SCC = 0.5049 × 900 + 0.4519 × 600 + 0.0432 × 100 − 450 = 279.87

Hence (279.87 > 242.36), action c should be performed, which means that C is more appropriate than B in the eyes of A.
Table 2. The process of evaluating B's trust

Step | Path       | Statement | Reliability factor | Prior distribution (πB(θg), πB(θa), πB(θb)) | Posterior distribution (πB(θg|x), πB(θa|x), πB(θb|x))
1    | A→D→B      | Average   | 0.8500 | (0.3000, 0.5000, 0.2000) | (0.2078, 0.7080, 0.0842)
2    | A→E→G→B    | Bad       | 0.7200 | (0.2078, 0.7080, 0.0842) | (0.1233, 0.6420, 0.2347)
3    | A→E→F→B    | Good      | 0.6300 | (0.1233, 0.6420, 0.2347) | (0.3565, 0.5073, 0.1362)
4    | A→E→G→H→B  | Average   | 0.5400 | (0.3565, 0.5073, 0.1362) | (0.2837, 0.6308, 0.0855)
Table 3. The process of evaluating C's trust

Step | Path       | Statement | Reliability factor | Prior distribution (πC(θg), πC(θa), πC(θb)) | Posterior distribution (πC(θg|x), πC(θa|x), πC(θb|x))
1    | A→E→C      | Average   | 0.9000 | (0.3300, 0.3400, 0.3300) | (0.2635, 0.5882, 0.1483)
2    | A→I→J→C    | Good      | 0.7410 | (0.2635, 0.5882, 0.1483) | (0.5905, 0.3466, 0.0629)
3    | A→I→J→K→C  | Average   | 0.4817 | (0.5905, 0.3466, 0.0629) | (0.5049, 0.4519, 0.0432)
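To tie the pieces together, the following illustrative sketch (our own, under the assumptions above) replays Tables 2 and 3: each consultant opinion is folded into the current distribution by a Bayesian update (equations (1)-(3)) followed by the rectification of equation (4), using the path reliability factors listed in the tables, and the resulting posteriors feed the expected-utility comparison between B and C.

```python
# Replay of Tables 2 and 3: ordered opinions are (statement, path reliability R),
# each step is a Bayes update followed by the eq. (4) rectification; the final
# distributions give E[U_A(r)] minus the service charge for each provider.

QUALITIES = ("good", "average", "bad")
F = {  # Table 1: f(x | theta), indexed F[x][theta]
    "good":    {"good": 0.60, "average": 0.10, "bad": 0.05},
    "average": {"good": 0.30, "average": 0.70, "bad": 0.15},
    "bad":     {"good": 0.10, "average": 0.20, "bad": 0.80},
}
UTILITY = {"good": 900, "average": 600, "bad": 100}

def update(prior, statement, R):
    """One step: Bayes posterior, then rectification toward the prior with factor R."""
    joint = {q: prior[q] * F[statement][q] for q in QUALITIES}
    m_x = sum(joint.values())
    post = {q: joint[q] / m_x for q in QUALITIES}
    return {q: prior[q] + (post[q] - prior[q]) * R for q in QUALITIES}

def evaluate(prior, opinions, charge):
    """Fold in the ordered opinions and return expected utility minus service charge."""
    dist = dict(prior)
    for statement, R in opinions:
        dist = update(dist, statement, R)
    return sum(dist[q] * UTILITY[q] for q in QUALITIES) - charge

if __name__ == "__main__":
    u_B = evaluate({"good": 0.30, "average": 0.50, "bad": 0.20},
                   [("average", 0.85), ("bad", 0.72), ("good", 0.63), ("average", 0.54)], 400)
    u_C = evaluate({"good": 0.33, "average": 0.34, "bad": 0.33},
                   [("average", 0.90), ("good", 0.741), ("average", 0.4817)], 450)
    print(round(u_B, 2), round(u_C, 2))  # roughly 242.3 and 279.9, so C is chosen
```

The small differences from the figures in the text come only from the intermediate rounding used in Tables 2 and 3.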
2.2 Open Trust Model and Bayesian Sequential Analysis

The above discussion assumes that the "closed society of agents" is defined in advance, but this is nearly impossible on the inherently open and dynamic Web. Our
idea is that at every stage of the procedure (after every observation) one should compare the (posterior) utility of making an immediate decision with the "expected" (preposterior) utility that would be obtained if more observations were taken. If it is cheaper to stop and make a decision, that is what should be done. To clarify this idea, we transform Figure 1 into a tree structure, shown in Figure 2.

Fig. 2. The process of trust evaluating

The goal of preposterior analysis is to choose the way of investigation that minimizes the overall cost. This overall cost consists of the decision loss (including opportunity cost) and the cost of conducting observations (consultant fees). Note that these two quantities work against each other, and in this section we are concerned with balancing them. We continue the example used in the illustration of the closed trust model. As shown in Figure 2, we begin at stage 1, when A holds only the prior information of B and has no information about C (not even of C's existence, although it is quite possible that an agent like C is nearby in the network). Agent A can either make an immediate decision (select B) or send requests to its acquaintances for their opinions, extending the tree of Figure 2 down to the next layer. Suppose that the cost of consultant services is determined by the function u·d^n, where n is the stage number, d the mean out-degree in the given network, and u the mean consultant fee per consultation (here, u = 1 and d = 4). Note that in a real situation the initiator agent can estimate the cost of consultant services at the next stage more precisely. The utility of an immediate decision is the larger of

EπB(θ)[UA(r)] − SCB = 0.30 × 900 + 0.50 × 600 + 0.20 × 100 − 400 = 190

and

EπC(θ)[UA(r)] − SCC = 0.33 × 900 + 0.34 × 600 + 0.33 × 100 − 450 = 84.

Hence the utility of an immediate decision is 190. If the request messages are sent and x is observed, X has a marginal (unconditional) density of {m(xg) = 0.24, m(xa) = 0.47, m(xb) = 0.29}, and EπB(θ|xg)[UA(r)] − SCB = 404.15, EπB(θ|xa)[UA(r)] − SCB = 225.55, EπB(θ|xb)[UA(r)] − SCB = −44.88. Note that if xb is observed, we prefer to select C instead of B (because EπC(θ)[UA(r)] − SCC = 84 and 84 > −44.88). The expected cost of consultant services for further investigation at this stage is 4 (1 × 4^1). So the utility of not making an immediate decision (the opportunity cost) is

max{404.15, 84} × 0.24 + max{225.55, 84} × 0.47 + max{−44.88, 84} × 0.29 − 4 = 223.36.

Because 223.36 > 190, further investigation is well worth the money. The rest of the Bayesian sequential analysis is shown in Table 4. As Table 4 shows, at stage 3 the expected utility of C begins to exceed that of B, and because 295.70 > 255.02, making an immediate decision is more profitable. So A should stop investigating and select C.
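The stage-1 choice between deciding immediately and consulting first can be sketched as follows (illustrative code of our own, assuming the fee model u·d^n and the numbers of the running example); it reproduces the 190 versus 223.36 comparison above.

```python
# Stage-1 preposterior comparison: utility of deciding now versus the expected
# utility of the best follow-up decision, averaged over the marginal m(x) and
# net of the consultant fee u * d**n.

QUALITIES = ("good", "average", "bad")
F = {  # Table 1: f(x | theta), indexed F[x][theta]
    "good":    {"good": 0.60, "average": 0.10, "bad": 0.05},
    "average": {"good": 0.30, "average": 0.70, "bad": 0.15},
    "bad":     {"good": 0.10, "average": 0.20, "bad": 0.80},
}
UTILITY = {"good": 900, "average": 600, "bad": 100}

def eu(dist, charge):
    return sum(dist[q] * UTILITY[q] for q in QUALITIES) - charge

def posterior(prior, x):
    joint = {q: prior[q] * F[x][q] for q in QUALITIES}
    m_x = sum(joint.values())
    return {q: joint[q] / m_x for q in QUALITIES}, m_x

if __name__ == "__main__":
    prior_B = {"good": 0.30, "average": 0.50, "bad": 0.20}
    prior_C = {"good": 0.33, "average": 0.34, "bad": 0.33}
    SC_B, SC_C = 400, 450
    u, d, n = 1, 4, 1                      # consultant-fee model u * d**n

    immediate = max(eu(prior_B, SC_B), eu(prior_C, SC_C))   # 190 vs. 84 -> 190

    investigate = -u * d ** n              # pay the stage-1 consultant fee (4)
    for x in QUALITIES:                    # average the best follow-up decision over m(x)
        post_B, m_x = posterior(prior_B, x)
        investigate += m_x * max(eu(post_B, SC_B), eu(prior_C, SC_C))

    print(immediate, round(investigate, 2))  # 190.0 and 223.36 -> keep investigating
```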
Table 4. The process of Bayesian sequential analysis

Stage | Agent B prior (θg, θa, θb) | Agent B marginal (xg, xa, xb) | Agent C prior (θg, θa, θb) | Agent C marginal (xg, xa, xb) | Consultant fee | Utility: immediate decision | Utility: further investigation | Decision
1     | (0.3000, 0.5000, 0.2000)   | (0.2400, 0.4700, 0.2900)      | (0.3300, 0.3400, 0.3300)   | –                             | 4              | 190.00                      | 223.36                         | Continue
2     | (0.2078, 0.7080, 0.0842)   | (0.1997, 0.5706, 0.2297)      | (0.2635, 0.5882, 0.1483)   | –                             | 16             | 220.25                      | 221.33                         | Continue
3     | (0.3565, 0.5073, 0.1362)   | –                             | (0.5905, 0.3466, 0.0629)   | (0.3921, 0.4292, 0.1787)      | 64             | 295.70                      | 255.02                         | Stop
3 Conclusions

The Semantic Web is inherently heterogeneous, dynamic and uncertain. If it is to succeed, trust will inevitably be an issue. Our work's contributions are: closed and open trust models based on Bayesian decision theory that give trust a strict probabilistic interpretation; explicit treatment of the utility and of the three types of costs incurred during trust evaluation (operational, opportunity and service charges), together with an approach that balances these costs against utility; and the ability for a user to combine a variety of sources of information to cope with the inherent uncertainties of the Semantic Web, so that each user receives a personalized set of trust values, which may vary widely from person to person.
Acknowledgment

This work was partially supported by a grant from the Major State Basic Research Development Program of China (973 Program) under grant number 2003CB316906 and a grant from the National High Technology Research and Development Program of China (863 Program).
References

1. B. Yu and M. P. Singh. Trust and Reputation Management in a Small-World Network. In Proceedings of the 4th International Conference on MultiAgent Systems, 2000, 449-450.
2. K. O'Hara, H. Alani, Y. Kalfoglou, and N. Shadbolt. Trust Strategies for the Semantic Web. In Proceedings of the Workshop on Trust, Security, and Reputation on the Semantic Web, 3rd International Semantic Web Conference (ISWC'04), Hiroshima, Japan, 2004.
3. M. Richardson, R. Agrawal, and P. Domingos. Trust Management for the Semantic Web. In Proceedings of the Second International Semantic Web Conference, Sanibel Island, Florida, 2003.
4. S. Milgram. The Small World Problem. Psychology Today, 61, 1967.
5. S. P. Marsh. Formalising Trust as a Computational Concept. Ph.D. dissertation, University of Stirling, 1994.
6. Y. Gil and V. Ratnakar. Trusting Information Sources One Citizen at a Time. In Proceedings of the First International Semantic Web Conference, Sardinia, Italy, June 9-12, 2002, 162-176.