Measuring Attribute Utilities when Attributes Interact
Peter P. Wakker, Anne M. Stiggelbout, & Sylvia J.T. Jansen
Medical Decision Making Dept., Leiden University Medical Center, Leiden, The Netherlands
January, 2000
Abstract This paper introduces a new method for measuring attribute utilities in multiattribute utility theory. The novelty of our method lies in the use of anchor levels, i.e. levels of attributes the value of which is not affected by other attributes. It is shown that, no matter how complex the interactions between attributes are, we can meaningfully define and measure attribute utilities if we can construct anchor levels for those attributes. When applied to time preferences, the method measures temporal utility in the presence of intertemporal dependencies. An application to medical decision making is described, where the (dis)utility of side-effects of radiotherapy, an important factor in the treatment decision, can now be measured.
KEY WORDS: Utility Measurement, Multiattribute Utility, Time Preference, QALY
1. INTRODUCTION This paper is the final part of a trilogy on utility measurement. The three papers introduce a new method for defining and measuring utilities of decision criteria (“attributes”) when there are interactions with other attributes. The novelty of the method lies in the use of “anchor levels.” These are specially constructed levels of attributes that remain unaffected by the interactions. The usefulness of such levels was already suggested by Borcherding, Schmeer, & Weber (1995, p. 24). By imposing independence conditions only on the anchor levels and not on complete attributes, we generalize existing approaches. We show that attribute utilities can be meaningfully defined and measured even in the presence of complex interactions. The first two papers (Jansen et al. 1998, 2000) present empirical results and this final paper develops the underlying theory. The empirical papers consider time preferences for health states. Time preference can be considered a special case of multiattribute utility with each time point or period an attribute and the consumption or health state at that time point/period the level of the attribute. In the empirical papers, anchor levels are special health states the value of which is not systematically affected by the health states that follow or precede. These anchor health states are used to measure, through Torrance’s (1986) chained technique, the utility of temporary health states that are affected by interactions with health states that precede or follow. This, third, paper introduces anchor levels for general multiattribute utility theory. Preference axioms and a quantitative representation are provided. Apart from the assumed existence of anchor levels, no restrictions are imposed on the other attribute levels and every kind of general interaction is permitted. By applying the
general multiattribute theory developed in this paper to time preferences we obtain a theoretical foundation for the methods used and tested in the empirical papers. These three papers, with empirical and theoretical results combined, hopefully demonstrate the usefulness of anchor levels for multiattribute utility theory.
In Jansen et al. (1998), the empirical feasibility of our method was verified. It was demonstrated that some biases, known from other methods, are avoided by our method. Jansen et al. (2000) subsequently used this method to investigate an often debated issue in the health domain: The discrepancy between the evaluation of health states when anticipated and when actually experienced (Russell et al. 1996, Kahneman & Tversky 1999). A method was introduced for testing whether this discrepancy is caused by adaptation or by some other factor, such as a tendency in the medical area to overemphasize the downsides of impaired health states in scenario descriptions. The method was tested in an experiment where the latter factor rather than adaptation seemed to cause the discrepancy.
Multiplicative interactions between attributes are incorporated in the multilinear models presented by Keeney & Raiffa (1976) and used in a number of empirical studies (Fryback & Keeney 1983, Fischer et al. 1986 pp. 1068-1069, Torrance et al. 1996). More general interactions are permitted in the multivalent and hypercube models of Farquhar & Fishburn (1981). These general interactions, however, complicate the measurement of utilities. The purpose of our method is to combine theoretical generality with empirical tractability of measurement.
Section 2 describes separability and attribute interaction in multiattribute utility theory. Section 3 considers additive decomposability of utility, i.e. absence of interactions, and demonstrates that these requirements hold if and only if all attribute levels are anchor levels. Section 4 assumes that only some levels are anchor levels. It
presents an axiomatic study to establish the meaningfulness and measurability of attribute utilities in this generalized setup. As an application, utilities are measured in Section 5 for nonseparable time preferences in the health domain. Section 6 concludes and proofs are given in an appendix.
2. INTERACTIONS IN MULTI-ATTRIBUTE UTILITY Multi-attribute utility theory provides tools for aggregating different objectives that may be mutually competitive into an overall decision (Keeney & Raiffa 1976, von Winterfeldt & Edwards 1986). Usually tradeoffs have to be made between the several objectives, or attributes as they will be called formally. The larger the house the farther remote it is from work, the higher the price the higher the quality, the better the career opportunity the lower the job security, etc. Formally, we assume n attributes and consider a set X = X1 × … × Xn of alternatives. Alternatives are n-tuples (x1,…,xn) where xj designates the level of the alternative on attribute j. For instance, assume that alternatives are cars and that there are three relevant attributes, maximum speed in miles per hour, price in K (thousand dollars), and color. Then the triple (150, 20K, green) could designate a car of interest. In alternative setups xj could be the amount of commodity j received, the score of a student on test j, etc. In time preference for money, there are n time points and xj is the money received at time point j. Section 5 describes an application to the health domain regarding the measurement of Quality Adjusted Life Years (QALYs, introduced by Fanshel & Bush 1970). This is a special case of time preference. There are n time periods and xj is the health state during period j.
In the great majority of applications, alternatives are evaluated additively (Keeney & Raiffa 1976, Eq. 6.29; von Winterfeldt & Edwards 1986, Eq. 8.1; Salo & Hämäläinen 1992; von Nitzsch & Weber 1993; Hutton Barron & Barrett 1996; Dyer et al. 1998; Keller & Kirkwood 1999; Keeney & McDaniels 1999). That is,
\[ U(x_1,\ldots,x_n) = \sum_{j=1}^{n} w_j u_j(x_j) \tag{2.1} \]
evaluates the alternative, where U(x1,…,xn) is the overall utility of the alternative (x1,…,xn), uj(xj) is the utility of xj, and wj is a weight factor to settle the exchange rates between the various attributes. In QALY measurement, uj(xj) is the quality of life when being in health state xj and wj corresponds to the duration of period j, possibly corrected for discounting. Sometimes additivity is obtained by redefining attributes (McDaniels 1995 p. 421). If the additive model holds then the assessment of utility is relatively simple. We can assess the utilities uj of the various attributes independently of what the levels of the other attributes are, next assess the exchange rates wj between the attributes, after which the whole evaluation system has been assessed (Weber & Borcherding 1993, p. 2). Assessment is more complex if there are interactions between the attributes. For example, assume that the value of being blind in period 2 depends on whether or not the client was blind in period 1. Then it is no longer possible to assess the utility of blindness in isolation without consideration of the health state in a preceding period. A general formula to express such dependency is
\[ U(x_1,\ldots,x_n) = \sum_{j=1}^{n} w_j u_j(x_j,x). \tag{2.2} \]
Here uj(xj,x) is the utility of xj which depends on the levels of the other attributes x1,…,xj−1, xj+1,…,xn. Without further specifications, Eq. 2.2 is too general to yield empirical predictions. There is no way to elicit the utility uj(xj,x) from decisions. Only the total sum $\sum_{j=1}^{n} w_j u_j(x_j,x)$ is meaningful under common decision theoretic assumptions. The various parameters wj and uj(xj,x) in themselves cannot be identified. If deafness is replaced by blindness during some period and the overall value increases then we cannot tell to what degree the actual improvement was directly caused by blindness being preferable to deafness. The improvement may instead be due to indirect effects, i.e. the health states following blindness may have improved because they are better after blindness than after deafness. To assess the whole preference system we have to perform an independent assessment of all relevant combinations of attribute levels. Such a task is cumbersome and does not conveniently yield further insights.
The additive model (2.1) holds if appropriate preference conditions are satisfied such as utility independence and separability. The latter condition entails for instance that a preference between two alternatives (c1,x2,…,xn) and (c1,y2,…,yn) is independent of the common attribute level c1. It therefore remains unaffected if we replace the common level c1 by another common level c1’. That is,
\[ (c_1,x_2,\ldots,x_n) \succcurlyeq (c_1,y_2,\ldots,y_n) \ \text{ if and only if } \ (c_1',x_2,\ldots,x_n) \succcurlyeq (c_1',y_2,\ldots,y_n). \tag{2.3} \]
If this condition holds then for a preference over future health states it should not matter if the common health state in period 1 is blindness or deafness. Utility independence is a similar condition but refers to risk which is defined in the next
section; hence the formal definition of utility independence is only given in the next section.
In many applications the additive representation (2.1) is used even though it is not perfectly valid and interactions between attributes exist. As the additive model is highly convenient and it is hard to implement other models, deviations from additivity are ignored if they do not generate large biases. Unfortunately, the deviations from additivity are too large to be ignored in many applications (Farquhar 1977). Such deviations are for instance common in time preferences (Loewenstein & Elster 1992). Order effects, habit formation, addiction, central phenomena in consumer theory, are all based on nonseparability of disjoint time periods (Becker 1996). Order effects can be so strong as to even lead to violations of monotonicity. For example, an increasing income stream may be preferred to a decreasing income stream even though the latter at each time point yields more total income (Loewenstein & Sicherman 1991, Hsee 1998). In the health domain, the utility measurement of temporary health states is complex due to its dependence on what happens before and after.
In some situations, separability and utility independence must be partly abandoned but weakened versions can be maintained. In such cases it is typically assumed that some attributes are separable from (so have utilities independent of) some other attributes, but interactions between remaining attributes are permitted. For example, the evaluations of educational and judicial policies may be mutually dependent and nonseparable but they may be independent from environmental policies. Then the evaluations of educational and judicial policies can be made independently from the environment (Strotz 1957). Weak versions of attribute
independence are extensively studied in Keeney & Raiffa (1976). Here interactions are multiplicative and are governed by a restricted number of extra parameters. The mentioned weakenings have in common that those parts of separability or attribute independence that are maintained are imposed on all levels of the attributes in question. For example, if attributes 2, …, n are separable from attribute 1, i.e. Eq. 2.3 holds, then it should hold for all levels c1, c1’, and for all x2,…,xn, y2,…,yn. It is in this respect that our paper introduces a new approach. We will assume a version of separability that needs to be imposed only on specially chosen levels of attributes (“anchor levels”). Then a meaningful way for measuring attribute utility uj(xj,x) remains possible even if this utility depends on the other levels x1,…,xj−1, xj+1,…,xn. The anchor levels should be chosen with care so as to be suited for their role.
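The contrast between the additive model (2.1) and the general model (2.2) can be illustrated with a minimal Python sketch. All numbers, the state names, and the helper functions below are hypothetical and chosen only for illustration; in particular, the assumed rule that blindness is valued more highly when it follows blindness is not taken from any data.

```python
# Hypothetical quality-of-life scores, chosen only for illustration.
QUALITY = {"good": 1.0, "deaf": 0.7, "blind": 0.6}


def additive_utility(weights, utilities, levels):
    """Eq. (2.1): U(x) = sum_j w_j * u_j(x_j)."""
    return sum(w * u(x) for w, u, x in zip(weights, utilities, levels))


def history_dependent_quality(levels, j):
    """A u_j(x_j, x) in the sense of Eq. (2.2).

    Assumption for illustration only: blindness is valued 0.7 instead of
    0.6 when the preceding period was also spent blind.
    """
    if levels[j] == "blind" and j > 0 and levels[j - 1] == "blind":
        return 0.7
    return QUALITY[levels[j]]


def interacting_utility(weights, levels):
    """Eq. (2.2): U(x) = sum_j w_j * u_j(x_j, x)."""
    return sum(w * history_dependent_quality(levels, j)
               for j, w in enumerate(weights))


weights = [1.0, 1.0]  # two periods of equal duration, no discounting (assumed)
per_period = [QUALITY.get, QUALITY.get]
blind_twice = ("blind", "blind")

print(additive_utility(weights, per_period, blind_twice))
print(interacting_utility(weights, blind_twice))
```

Under (2.1) the value of blindness in period 2 can be looked up in isolation. Under (2.2) the same level has to be evaluated together with the whole alternative, which is why elicitation becomes harder.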
3. ADDITIVE DECOMPOSABILITY THROUGH ANCHOR LEVELS We assume that risk is present in the decision process. Therefore we consider not only the set $\times_{j=1}^{n} X_j$ of combinations of conceivable attribute levels but, more generally, the set $L(\times_{j=1}^{n} X_j)$ of all lotteries, i.e. simple probability distributions over $\times_{j=1}^{n} X_j$. “Simple” means that the number of possible alternatives is finite. A typical lottery is $(p_1,x^1; \ldots; p_m,x^m)$, yielding alternative $x^j = (x^j_1,\ldots,x^j_n)$ with probability $p_j$, for j = 1, …, m. Probabilities p1,…,pm are nonnegative and sum to one. By ≽ we denote the preference relation of a decision maker over the lotteries. Expected utility means that there exists a utility function $U : \times_{j=1}^{n} X_j \to \mathbb{R}$ on the alternatives such that preference maximizes
\[ (p_1,x^1; \ldots; p_m,x^m) \mapsto \sum_{j=1}^{m} p_j U(x^j). \tag{3.1} \]
We assume (3.1) throughout this paper. The decomposition of the overall utility function U(x1,…,xn) into attribute utility functions uj(xj) is the general topic of multiattribute utility and is also the topic of this paper. Often-used preference conditions to justify decompositions of U are various forms of “utility independence,” where preference between two lotteries that have a fixed nonrandom attribute level in common is independent of that common level. Weak forms of such conditions permit multiplicative interactions between attributes. The most well-known preference condition to guarantee additive decomposability is Fishburn’s (1965) “marginal independence.” It requires that the utility of a lottery $(p_1,x^1; \ldots; p_m,x^m)$ depends only on the marginal distributions $(p_1,x^1_i; \ldots; p_m,x^m_i)$ generated over the attributes i = 1,…,n, and not on the way in which the various attributes are correlated. This condition is necessary and sufficient for the additive decomposability in (2.1) (Keeney & Raiffa 1976 Theorem 6.4). For the purpose of this paper a weakened condition, based upon Fishburn’s (1965) Theorems 1 and 3, is most suited. We will discuss the condition in some more detail and introduce a preparatory notation. For x = (x1,…,xn) and yi from Xi,
yix is x with xi replaced by yi, i.e. it is (x1,…,xi−1,yi,xi+1,…,xn).
Consider the following fifty-fifty lottery (1/2,bix; 1/2,biy). The two alternatives have a common ith attribute bi. Imagine that the decision maker can choose whether the left or the right attribute level bi is replaced by another level gi. It seems intuitively plausible that, if there is no interaction with the other attributes, then the utility improvement is the same in the alternative with x’s as with y’s. That is, the following indifference (denoted by ~) seems to be indicative of absence of interaction.
(1/2, gix; 1/2, biy) ~ (1/2, bix; 1/2, giy)
(3.2)
We call bi, gi (a pair of) anchor levels if (3.2) is satisfied for all x,y. Substituting expected utility shows that then
U(gix) − U(bix) is independent of x.
(3.3)
Violations of (3.2) have been used to define “multivariate risk aversion.” For instance, Richard (1975, Conditions (i) and (ii) on p. 13) considers n=2. Multivariate risk aversion is defined as a preference for the side in (3.2) where the preferred level of x and y is coupled with the nonpreferred level of g and b. Such a condition had already been considered by de Finetti (1932) and it has been tested in several studies (Payne, Laughhunn, & Crum 1984). A set of attribute levels are anchor levels if each pair from the set is a pair of anchor levels.
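For a finite set of alternatives, the anchor-level condition (3.3) can be checked numerically once a utility function is given. The following Python sketch assumes a hypothetical two-attribute toy utility (the tables `BASE`, the function `toy_utility`, and the level names "b", "g", "m" are illustrative and not from the paper) in which only the combination ("m", "m") carries an interaction term.

```python
from itertools import product

# Hypothetical toy example: two attributes, each with levels "b", "g", "m".
DOMAINS = [("b", "g", "m"), ("b", "g", "m")]
BASE = [{"b": 0.0, "g": 0.5, "m": 0.2}, {"b": 0.0, "g": 0.5, "m": 0.3}]


def toy_utility(x):
    """Additive part plus an interaction that is active only at ("m", "m")."""
    interaction = 0.1 if x == ("m", "m") else 0.0
    return sum(base[level] for base, level in zip(BASE, x)) + interaction


def replace_level(x, i, level):
    """Return the alternative written y_i x in the text: x with x_i replaced."""
    return x[:i] + (level,) + x[i + 1:]


def is_anchor_pair(U, domains, i, b, g, tol=1e-9):
    """Check Eq. (3.3): U(g_i x) - U(b_i x) takes the same value for every x."""
    diffs = [U(replace_level(x, i, g)) - U(replace_level(x, i, b))
             for x in product(*domains)]
    return max(diffs) - min(diffs) <= tol


print(is_anchor_pair(toy_utility, DOMAINS, 0, "b", "g"))  # interaction never touched
print(is_anchor_pair(toy_utility, DOMAINS, 0, "b", "m"))  # interaction at ("m", "m")
```

In this toy example the pair {b, g} satisfies (3.3) on attribute 1 while {b, m} does not, because replacing b by m is worth more when the second attribute is also at m.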
THEOREM 3.1. Let 1 ≤ i ≤ n. All attribute levels xi are anchor levels if and only if U(x) can be written as ui(xi) + V(x1,…,xi−1,xi+1,…,xn) for some functions ui and V. □
We obtain the following variant of Fishburn’s (1965, Theorem 3) characterization of additive decomposability (for a more general representation, see Theorem 3 of Farquhar & Fishburn 1981).
THEOREM 3.2. The additive decomposition (2.1) holds if and only if all levels of all attributes are anchor levels. □
The condition in the theorem is obviously weaker than marginal independence (the latter immediately implies Eq. 3.2), but apparently still suffices. In a number of papers, Farquhar and Fishburn have derived representations from generalizations, i.e. less restrictive versions, of the above conditions (see Farquhar & Fishburn, 1981, and the references therein). In multivalent representations, attribute sets Xi are partitioned into subsets $X^i_j$ such that (3.3) holds for all elements bi, gi from each subset $X^i_j$. Then, generalized (“multivalent”) additive representations are derived on subdomains $X^1_{i_1} \times \cdots \times X^n_{i_n}$. The fractional hypercube methods take generalized forms of Eq. 3.2, with multi-outcome gambles. Various utility functions, constructed through several additions and multiplications, are derived.
4. ANCHOR LEVELS AND INTERACTIONS In the preceding section we have seen that all attribute levels must be anchor levels under additive decomposability. The multivalent and hypercube methods extend this result to partitions of attribute levels and more complex representations. We will follow an alternative route from Eq. 3.3. We consider cases where only some
specially chosen attribute levels are anchor levels but others need not be. We do not impose any restrictions on the functional forms outside the anchor levels and establish general representations with general interactions. Our main purpose is to demonstrate the empirical measurability and the meaningfulness of the functional form and its attribute utilities. Let us assume for now that {bi, gi} are anchor levels and, to avoid triviality, that gix ≻ bix for some (hence for all) x. Here g abbreviates good and b abbreviates bad. We pursue an interpretation of the utility of attribute xi within x even though there are interactions. Assume first that the scale and location of U are such that U(gix) = 1 and U(bix) = 0, and keep all attributes j≠i fixed at their level xj. Then we define the “attribute utility” of xi given x, denoted ui(xi,x), as U(x) which, by expected utility, is equal to (with always p between 0 and 1):
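\[ u_i(x_i,x) = \frac{1}{p} \ge 1 \quad \text{if } g_i x \sim (p,x;\ 1-p,b_i x) \tag{4.1} \]
\[ 0 \le u_i(x_i,x) = p \le 1 \quad \text{if } x \sim (p,g_i x;\ 1-p,b_i x) \tag{4.2} \]
\[ u_i(x_i,x) = -\frac{p}{1-p} \le 0 \quad \text{if } b_i x \sim (p,g_i x;\ 1-p,x). \tag{4.3} \]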
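The three cases (4.1), (4.2), and (4.3) translate an elicited indifference probability p into an attribute utility on the scale U(bix) = 0, U(gix) = 1. A minimal Python sketch follows; the function name and the case labels "above", "between", and "below" are hypothetical conveniences, not notation from the paper.

```python
def attribute_utility(case, p):
    """Attribute utility u_i(x_i, x) from an elicited indifference probability p.

    case = "above":   g_i x ~ (p, x; 1-p, b_i x)      -> Eq. (4.1), x better than g_i x
    case = "between": x ~ (p, g_i x; 1-p, b_i x)      -> Eq. (4.2)
    case = "below":   b_i x ~ (p, g_i x; 1-p, x)      -> Eq. (4.3), x worse than b_i x
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if case == "above":
        if p == 0.0:
            raise ValueError("Eq. (4.1) needs p > 0")
        return 1.0 / p
    if case == "between":
        return p
    if case == "below":
        if p == 1.0:
            raise ValueError("Eq. (4.3) needs p < 1")
        return -p / (1.0 - p)
    raise ValueError("unknown case: " + repr(case))


# Hypothetical elicited probabilities, for illustration only.
print(attribute_utility("above", 0.5))
print(attribute_utility("between", 0.8))
print(attribute_utility("below", 0.5))
```

Each branch is the expected-utility equation of the stated indifference solved for U(x); for example, in the "below" case 0 = p · 1 + (1 − p) · U(x).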
If we drop the scaling assumption U(gix) = 1 and U(bix) = 0, so only have U(gix) > U(bix), then we can nevertheless define ui(xi,x), the attribute utility of xi at x, as in (4.1), (4.2), and (4.3), to get
U(x) = wiui(xi,x) + V(x) .
(4.4)
Here V(x) = U(bix) is independent of xi and wi = U(gix) − U(bix) is independent of x (see Eq. 3.3). By separating out the term V(x) we guarantee that ui(.,x) is zero at bi. We interpret wi as the weight of attribute i. The degree to which wi is an empirically meaningful quantity or just a convenient scaling factor is of course determined by the degree to which U(gix) and U(bix) are one or the other. In Section 5, U(gix) and U(bix) have a special empirical meaning hence so have wi and ui(xi,x), with wi an index of duration and ui(xi,x) a rate of utility per time unit. As a notational convention in the following theorem, uj(yj,x) designates the utility of yj when the levels of the other attributes are x1,…,xj−1, xj+1,…,xn. The following theorem shows the empirical meaningfulness of the preceding constructions. In summary, the weight of attribute i is independent of x (Statement i), ui captures the marginal utility contribution of attribute i in the presence of interaction with the other levels xj (Statement ii), and ui can be measured empirically (Statement iii).
THEOREM 4.1. Assume that gir ≻ bir for some r. Then {gi,bi} are anchor levels if and only if
U(x) = wiui(xi,x) + V(x)
where:
(i) wi > 0 is independent of x;
(ii) U(yix) − U(zix) = wi(ui(yi,x) − ui(zi,x)) for all yi, zi. Equivalently, V(x) is independent of xi.
(iii) ui(xi,x) is given by (4.1), (4.2), (4.3).
Further, ui is uniquely determined. □
The following theorem shows that the above results can be obtained in an overall manner when anchor levels are available on all attributes. It thereby provides a special case of (2.2) that is empirically meaningful and preserves the generality of (2.2) outside the anchor levels.
THEOREM 4.2. Let g,b be two alternatives with gib ≻ b for each i. Then all {bi,gi} are anchor levels if and only if
\[ U(x_1,\ldots,x_n) = \sum_{j=1}^{n} w_j u_j(x_j,x) + V(x) \]
where:
(i) The wjs are nonnegative and are independent of x;
(ii) For each i, U(yix) − U(zix) = wi(ui(yi,x) − ui(zi,x)) for all yi, zi. Equivalently, $\sum_{j \neq i} w_j u_j(x_j,x) + V(x)$ is independent of xi.
(iii) ui(xi,x) is given by (4.1), (4.2), (4.3).
Further, the uis are uniquely determined. □
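The decomposition in Theorem 4.2 can be verified numerically on a small example. The Python sketch below assumes a hypothetical two-attribute utility `U` with anchor levels b and g on both attributes and an interaction at ("m", "m") only. It uses the constructions from the proofs in the Appendix: wj = U(gjb) − U(b), wjuj(xj,x) = U(x) − U(bjx), and V(x) = ∑j U(bjx) − (n−1)U(x). All names and numbers are illustrative.

```python
# Hypothetical toy utility: additive part plus an interaction at ("m", "m") only.
BASE = [{"b": 0.0, "g": 0.5, "m": 0.2}, {"b": 0.0, "g": 0.5, "m": 0.3}]


def U(x):
    interaction = 0.1 if x == ("m", "m") else 0.0
    return sum(base[level] for base, level in zip(BASE, x)) + interaction


def with_level(x, j, level):
    """Return x with its j-th coordinate replaced by level."""
    return x[:j] + (level,) + x[j + 1:]


def decompose(U, x, b, g):
    """Return (weights, attribute utilities, V(x)) as in Theorem 4.2."""
    n = len(x)
    # w_j = U(g_j b) - U(b); by (3.3) the same value results at any other x.
    weights = [U(with_level(b, j, g[j])) - U(b) for j in range(n)]
    # w_j * u_j(x_j, x) = U(x) - U(b_j x)
    utils = [(U(x) - U(with_level(x, j, b[j]))) / weights[j] for j in range(n)]
    # V(x) = sum_j U(b_j x) - (n - 1) * U(x)
    V = sum(U(with_level(x, j, b[j])) for j in range(n)) - (n - 1) * U(x)
    return weights, utils, V


b, g, x = ("b", "b"), ("g", "g"), ("m", "m")
weights, utils, V = decompose(U, x, b, g)
rebuilt = sum(w * u for w, u in zip(weights, utils)) + V
print(weights, utils, V, rebuilt, U(x))
```

The attribute utilities in this example absorb the interaction (each uj is larger than the corresponding additive base value would give), and V(x) corrects for the interaction being counted once per attribute.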
The message of the theorem does not lie in the representation per se. The representation, a variation on Eq. (2.2), in itself is completely general. With its dependence of V(x) and each uj(xj,x) on the whole of x it can describe any utility function in a multitude of ways. The representation only has meaning in combination with the second part of the theorem. This part shows that the parameters of the model are empirically meaningful and can be observed. We can identify the separate contributions uj(xj,x) of each xj to the whole even while there are interactions between the attributes. The uj(xj,x) can be measured by essentially the standard gamble techniques that are used under one-dimensional expected utility, described in Eqs. (4.1), (4.2), (4.3). The terms in the summation can be interpreted as the contributions of each individual attribute, given a fixed level of the other attributes (compare (ii)), with all interactions permitted. The wis are the weights of the attributes that are nonnegative and sum to one if and only if U(g) − U(b) = 1 (proved in Formula (A.1) in the Appendix).
The result obviously cannot simplify the preference system beyond its intrinsic complexity. Many elicitations are required to elicit the whole preference system without any further restriction on the interactions permitted, and for every separate x the measurement of uj(yj,x) has to be redone. But at least the elicitation of the uj(yj,x) is possible in a meaningful and an experimentally tractable manner.
The utility difference U(xjz) − U(yjz) can also be measured by classical methods without resort to the anchor levels. For instance, if U(h) = 1, U(ℓ) = 0, and xjz and yjz are between h and ℓ in preference, then we can find p and q such that xjz ~ (p,h; 1−p,ℓ) and yjz ~ (q,h; 1−q,ℓ) and we get
\[ U(x_j z) - U(y_j z) = p - q. \tag{4.5} \]
Without anchor levels available, however, it is not easy to interpret such differences, all the more so because no accompanying representing functional form is obtained. In addition, such general measurements are often not experimentally tractable. For this reason they have not been used in applications (Borcherding, Schmeer, & Weber 1995 p. 9/10). The main motivation for the study of anchor levels lies in their features for the elicitation of utility. By keeping the attributes xj for j ≠ i constant the stimuli can stay close to the actual situation of the clients and thus reduce the cognitive burden. The classical method just described, using alternatives h and ℓ that are unrelated to the stimuli of relevance, will generate more biases and distortions.
5. APPLICATION TO THE MEASUREMENT OF TEMPORARY HEALTH STATES Patients’ perception of quality of life, and hence subjective utility measurements, are important in medical decisions. In times of budget cuts and cost-effectiveness policy decisions, a well-developed technology for measuring subjective utilities of health states is essential (Krischer 1980, Russell et al. 1996). Quality-adjusted life years (QALYs) have been introduced to integrate mortality, morbidity, and duration of health states into a single measure (Fanshel & Bush 1970, Pliskin, Shepard, & Weinstein 1980, Kaplan 1993, Russell et al. 1996, Fryback 1997). The traditional techniques for measuring utility were developed for chronic health states. When applied to temporary health states, complications can arise.
Jansen et al. (1998, 2000) study post-operative radiotherapy treatment for early-stage breast cancer patients. The patients had completed primary treatment (lumpectomy or mastectomy) and now faced the possibility of radiotherapy. Radiotherapy reduces recurrences of breast cancer but may induce undesirable side effects like fatigue and skin reactions. To determine an optimal decision, the impact of the side effects on the patients’ well-being must be measured. Two reasons prevent the employment of the traditional measurement techniques that require (hypothetical) chronic health states. First, radiotherapy as a chronic health state is too unrealistic to consider even hypothetically; it must be presented as a temporary health state. In mathematical terms this means that not all combinations of attribute levels are available, e.g. (R,R) is not available. Hence we have to deal with a subset of a product set. Such a restriction of domain can complicate the analysis (Fishburn 1976). To meet this complication, Torrance (1986) introduced a chained technique for measuring the utility of temporary health states. His technique, like the traditional techniques, still assumes intertemporal independence of the values of health states. For a patient’s perception of radiotherapy, however, the quality of life during treatment depends crucially on the prospects after treatment and therefore cannot be measured independently thereof. This leads to the second reason why Jansen et al. deviate from traditional measurement techniques. Because of this second reason we have developed the generalization of Torrance (1986) now formalized in Sections 3 and 4.
Formally, we consider two “attributes,” i.e. time periods (n=2). The first consists of six months, the second of “the remaining life expectancy” based on the average life expectancy of women of the same age. The considered health states are:
HEALTH STATE              ABBREVIATION
radiotherapy treatment    R
good health               G
death                     D
hospitalization           H
E.g., (R,G) means radiotherapy in period 1 and good health in period 2. More precisely, R in period one designates a six-week radiotherapy treatment followed by four and a half months of possible side effects. The scaling U(G,G) = 1 and U(D,D) = 0 is commonly accepted in the health domain. Such a scaling convention is essential when comparing effects across different people and treatments. A chronic health state (R,R) of radiotherapy is unrealistic and hence will not be considered. The quality of life of treatment during the first period depends so crucially on the prospect of the subsequent health state improvement that R cannot serve as an anchor state. This observation also underscores that separability over disjoint time periods is unacceptable in this context, motivating our new technique. G and D may seem suited to serve as anchor states. Their quality of life can be set equal to 1 and 0, respectively, independent of the qualities of life during other periods. We will indeed use G as an anchor state. Unfortunately, D in period 1 cannot be realistically combined with the health state of interest during period 2, i.e. G. Jansen et al. (1998, 2000) therefore decided to use another anchor health state, “hospitalization, caused by a serious accident.” Most people will be able to relate to this hypothetical health state. It also is sufficiently distinct from the other health states considered in this experiment that it will not interact systematically with those. Hence we felt that it could serve as an anchor health state. We wanted to investigate u1(R) (and, in fact, its stability) when followed by good health, hence health state x2 for period 2 is G. For virtually all patients (H,G) ≼ (R,G) ≼ (G,G), so that formula (4.2) could be used and the probability p was measured such that
(R,G) ~ (p,(G,G); 1−p,(H,G)).
(5.1)
In the notation of Section 4, with bad level H and good level G, u1(R,(R,G)) = p. These utilities were rescaled so as to agree with the convention that quality of life is 0 at death and 1 at good health, i.e. u1(D,(D,G)) = 0 and u1(G,(G,G)) = 1. To this effect, we elicited the utility of H through indifferences
(H,H) ~ (q,(G,G); 1−q,(D,D)).
(5.2)
The indifference implies that U(H,H) = q. Note that all scenarios used are realistic, none involving death followed by another health state. A further assumption about the anchor state in this experiment is that it is equally valuable at all time points so that we can set
u1(H,(H,G)) = q.
(5.3)
We finally inserted (5.3) in (5.1) to obtain
u1(R,(R,G)) = p + (1−p)q.
(5.4)
This is the utility of radiotherapy during six months, recognizing the dependency on the good health following it. This procedure, where a measured utility such as in (5.3) is used as input to calculate another utility such as in (5.4), is called chained (Torrance 1986). The utility in (5.4) agrees with the scaling convention that the utility of death is zero and the utility of good health is one hence it can be used in comparisons across different people and studies. The measurement procedure followed in this experiment has several experimental advantages, e.g. it only involves hypothetical scenarios that are easy to imagine for the patients. The scenarios stay close to what patients are actually experiencing and what they can relate to by keeping all attributes other than the one considered fixed at the relevant level. It thus reduces, for instance, the effects of biases and insensitivities regarding attribute weights (von Nitzsch & Weber 1993). Indeed, anchor levels can serve as gauges for multiattribute weight elicitation. Using intuitively meaningful anchor levels rather than maximal and minimal outcomes was already suggested by Borcherding, Schmeer, & Weber (1995, p. 24): “It might be more desirable to elicit meaningful anchors from the decision maker … and then to elicit weights for these ranges.” The implementability of the method was found to be good and several biases that have been known to occur in other measurements, such as loss aversion, could be
avoided (Jansen et al. 1998, p. 397). The measurement therefore better agreed with other measurements such as “time tradeoffs” (Russell et al. 1996). It would have been possible, theoretically, to measure utility by classical methods without resort to the anchor health states, following Eq. 4.5. This procedure would, however, have been impossible to implement in the practical setting of this investigation. Patients would not have been able to relate to hypothetical stimuli unrelated to their actual situation. Hence we resorted to our method based on anchor health states. An investigation of the degree to which the elicitations can affect decisions (Kimbrough & Weber 1994) is left for future studies. For further discussion of the empirical and psychological advantages see Jansen et al. (1998, 2000).
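The chained calculation of Eqs. (5.1) to (5.4) can be summarized in a short Python sketch. The function name and the example probabilities are hypothetical and are not data from Jansen et al. (1998, 2000); p is the indifference probability from (5.1) and q the one from (5.2).

```python
def chained_utility(p, q):
    """Eq. (5.4): utility of R on the scale death = 0, good health = 1.

    p: probability in (R,G) ~ (p,(G,G); 1-p,(H,G)), Eq. (5.1)
    q: probability in (H,H) ~ (q,(G,G); 1-q,(D,D)), Eq. (5.2),
       used as u1(H,(H,G)) under the assumption stated in Eq. (5.3)
    """
    for name, value in (("p", p), ("q", q)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(name + " must be a probability")
    # Expected utility of the gamble in (5.1) with u1(G) = 1 and u1(H) = q.
    return p * 1.0 + (1.0 - p) * q


# Hypothetical elicited values, for illustration only.
print(chained_utility(p=0.9, q=0.5))  # approximately 0.95
```

Because the anchor state H is itself valued at q rather than 0, the chained utility is never below p and never below q.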
6. CONCLUSION This paper has proposed a new theoretical generalization of attribute independence based on anchor levels. These are relatively stable levels of outcomes, i.e. their values should be relatively unaffected by the context and the interactions. Theoretical characterizations have been provided, demonstrating that meaningful representations can be axiomatized permitting general interactions between attributes. In a medical application a hypothetical hospitalization scenario was used that seemed to satisfy this requirement to a reasonable degree. The practical recommendation resulting from this research is as follows. No matter how complex the interaction between attributes is, we can meaningfully define and measure attribute utilities if we can construct anchor levels for those attributes.
APPENDIX: PROOFS

PROOF OF THEOREM 3.1. If $U$ is of the form in the theorem, then $U(g_i y) - U(b_i y) = u_i(g_i) - u_i(b_i)$ is indeed independent of $y$. Hence all $x_i$ are anchor levels. Conversely, assume, for $i = 1$, that $U(g_1 x) - U(b_1 x)$ is independent of $x$ for all $g_1$ and $b_1$. Fix any $r = (r_1, \ldots, r_n)$ and write $U(x) = U(x) - U(r_1 x) + U(r_1 x) = U(x_1 r) - U(r) + U(r_1 x)$, where the second equality uses the independence assumption with $g_1 = x_1$ and $b_1 = r_1$ (note that $r_1 r = r$). Define $u_1(x_1) = U(x_1 r) - U(r)$ and $V(x_2, \ldots, x_n) = U(r_1 x)$. □
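The construction in the proof of Theorem 3.1 can be checked numerically. The Python sketch below takes a utility that is additive in the first attribute, rebuilds $u_1$ and $V$ from a reference point $r$ as in the proof, and verifies $U(x) = u_1(x_1) + V(x_2, \ldots, x_n)$ on a small grid. The function `U`, the level set, and the reference point are hypothetical choices made only for this illustration.

```python
import itertools

LEVELS = (0, 1, 2)  # hypothetical levels, the same for each of three attributes


def U(x):
    """Hypothetical utility: additive in x[0], with an interaction between x[1] and x[2]."""
    return 2.0 * x[0] ** 2 + x[1] * x[2] + 0.5 * x[1]


r = (1, 2, 0)  # arbitrary reference point r = (r1, r2, r3)


def u1(x1):
    """u1(x1) = U(x1 r) - U(r): r with its first coordinate replaced by x1."""
    return U((x1,) + r[1:]) - U(r)


def V(rest):
    """V(x2, ..., xn) = U(r1 x): x with its first coordinate replaced by r1."""
    return U((r[0],) + tuple(rest))


# Largest deviation between U and its reconstruction over the whole grid.
max_error = max(
    abs(U(x) - (u1(x[0]) + V(x[1:])))
    for x in itertools.product(LEVELS, repeat=3)
)
print(max_error)
```

The reconstruction does not depend on which reference point is chosen; another choice of `r` shifts `u1` and `V` by opposite constants.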
PROOF OF THEOREM 3.2. The proof of Theorem 3.1 is first applied to $i = 1$, and then proceeds inductively. Fix $x_1$ at any level, say $r_1$, then decompose $V(x_2, \ldots, x_n)$ as $u_2(x_2) + W(x_3, \ldots, x_n)$, etc. □

PROOF OF THEOREM 4.1. First assume that a decomposition of $U$ as described exists. Then $U(g_i x) - U(b_i x) = w_i(u_i(g_i) - u_i(b_i)) = w_i$, so it is indeed independent of $x$. From this (3.2) follows, so $\{g_i, b_i\}$ are anchor levels. For the second implication, assume that $\{g_i, b_i\}$ are anchor levels. Define $w_i = U(g_i x) - U(b_i x)$ which, by (3.3), is independent of $x$ and is positive because $U(g_i r) - U(b_i r)$ is positive; (i) is proved. Define $V(x) = U(b_i x)$, which is independent of $x_i$. Substitution of expected utility shows that $U(x) - U(b_i x)$ is $w_i u_i(x_i, x)$ for each of the three cases (4.1), (4.2), and (4.3), proving (iii). $U(x) = w_i u_i(x_i, x) + V(x)$ follows. $V(x)$ is independent of $x_i$ if and only if it cancels out in the left-hand difference in (ii). Uniqueness of $u_i$ follows immediately from (4.1), (4.2), and (4.3). Let us, for completeness, mention the uniqueness results pertaining to the other variables. $U$ and $V$ are unique up to a common location ($U(b_i x) = V(b_i x)$ for all $x$), and $U$, $w_i$, and $V$ are unique up to a common scale. □

PROOF OF THEOREM 4.2. First assume the decomposition described in the theorem. Then $U(g_i x) - U(b_i x) = w_i(u_i(g_i, x) - u_i(b_i, x)) = w_i$, and (3.3) and (3.2) hold, i.e. $\{g_i, b_i\}$ are anchor levels. Assume next that all $\{g_i, b_i\}$ are anchor levels. Define $w_i = U(g_i x) - U(b_i x)$ which, by (3.3), is independent of $x$ and positive because $U(g_i b) - U(b)$ is positive. Hence (i) holds. Define $u_i(x_i, x)$ as in (4.1), (4.2), (4.3) so that (iii) is satisfied. Let

\[
V(x) = \sum_{j=1}^{n} U(b_j x) - (n-1)U(x).
\]

Then

\[
\begin{aligned}
\sum_{j \neq i} w_j u_j(x_j, x) + V(x)
&= \sum_{j \neq i} u_j(x_j, x)\bigl(U(g_j x) - U(b_j x)\bigr) + V(x) \\
&= \sum_{j \neq i} \bigl(U(x) - U(b_j x)\bigr) + V(x) \\
&= \sum_{j \neq i} \bigl(U(x) - U(b_j x)\bigr) + \sum_{j=1}^{n} U(b_j x) - (n-1)U(x) \\
&= U(b_i x)
\end{aligned}
\]

is independent of $x_i$. $\sum_{j \neq i} w_j u_j(x_j, x) + V(x)$ cancels from the left-hand difference in (ii) (i.e., $\sum_{j \neq i} w_j u_j(x_j, y_i x) + V(y_i x) - \sum_{j \neq i} w_j u_j(x_j, z_i x) - V(z_i x) = 0$) if and only if it is independent of $x_i$; hence (ii) has been demonstrated. We further have:

\[
\begin{aligned}
U(x)
&= \sum_{j=1}^{n} \bigl(U(x) - U(b_j x)\bigr) - (n-1)U(x) + \sum_{j=1}^{n} U(b_j x) \\
&= \sum_{j=1}^{n} \bigl(U(x) - U(b_j x)\bigr) + V(x) \\
&= \sum_{j=1}^{n} u_j(x_j, x)\bigl(U(g_j x) - U(b_j x)\bigr) + V(x) \\
&= \sum_{j=1}^{n} u_j(x_j, x)\, w_j + V(x).
\end{aligned}
\]

This establishes the representation of $U$. Uniqueness of $u_i$ follows immediately from (4.1), (4.2), and (4.3). Let us, for completeness, mention the uniqueness results pertaining to the other variables. $U$ and $V$ are unique up to a common location, $U(b) = V(b)$. $U$, the $w_j$, and $V$ are unique up to a common scale. By the definition of the $w_j$ it follows that

\[
\sum_{j=1}^{n} w_j = U(g) - U(b). \tag{A.1}
\]

□
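The identities used in the proof of Theorem 4.2 can be verified on a small example. The Python sketch below builds a hypothetical utility on three attributes, each with anchor levels `b` and `g` and one intermediate level `m`. The interaction term depends only on how many attributes are at `m`, so $U(g_i x) - U(b_i x)$ is the same for every $x$. The attribute utility is taken as $u_i(x_i, x) = (U(x) - U(b_i x))/w_i$, which is the relation the proof relies on. All names and numbers are assumptions made for this illustration.

```python
import itertools

B, M, G = "b", "m", "g"   # anchor level b, intermediate level m, anchor level g
LEVELS = (B, M, G)
N = 3                     # number of attributes
BASE = {B: 0.0, M: 0.4, G: 1.0}   # hypothetical single-attribute scores
SCALE = (3.0, 2.0, 1.0)           # hypothetical scale factors per attribute


def U(x):
    """Hypothetical utility with an interaction among the intermediate levels."""
    additive = sum(s * BASE[xi] for s, xi in zip(SCALE, x))
    interaction = 0.3 * sum(xi == M for xi in x) ** 2  # identical for b and g
    return additive + interaction


def put(x, i, level):
    """Return x with coordinate i replaced by level (the paper's notation: level_i x)."""
    return x[:i] + (level,) + x[i + 1:]


def w(i, x):
    return U(put(x, i, G)) - U(put(x, i, B))


def u(i, x):
    return (U(x) - U(put(x, i, B))) / w(i, x)


def V(x):
    return sum(U(put(x, j, B)) for j in range(N)) - (N - 1) * U(x)


profiles = list(itertools.product(LEVELS, repeat=N))
TOL = 1e-9
weights = [w(i, profiles[0]) for i in range(N)]

# (i) w_i does not depend on the context x.
anchors_ok = all(abs(w(i, x) - weights[i]) < TOL for x in profiles for i in range(N))
# Representation: U(x) = sum_j w_j u_j(x_j, x) + V(x).
representation_ok = all(
    abs(U(x) - (sum(weights[j] * u(j, x) for j in range(N)) + V(x))) < TOL
    for x in profiles
)
# (ii) the remainder sum_{j != i} w_j u_j + V equals U(b_i x), so it ignores x_i.
remainder_ok = all(
    abs(sum(weights[j] * u(j, x) for j in range(N) if j != i) + V(x) - U(put(x, i, B))) < TOL
    for x in profiles for i in range(N)
)
# (A.1): the weights sum to U(g) - U(b).
a1_ok = abs(sum(weights) - (U((G,) * N) - U((B,) * N))) < TOL

print(anchors_ok, representation_ok, remainder_ok, a1_ok)
```

In this example `u(0, (M, M, M))` and `u(0, (M, B, B))` differ, so the utility of the intermediate level depends on the other attributes while the anchor levels do not.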
REFERENCES

Becker, Gary S. (1976), "The Economic Approach to Human Behavior." Prentice-Hall, Englewood Cliffs, N.J.
Borcherding, Katrin, Stefanie Schmeer, & Martin Weber (1995), "Biases in Multiattribute Weight Elicitation." In Jean-Paul Caverni, Maya Bar-Hillel, F. Hutton Barron, & Helmut Jungermann (Eds.), Contributions to Decision Making − I, 3−28, Elsevier, Amsterdam.
de Finetti, Bruno (1932), "Sulla Preferibilità," Giornale degli Economisti e Annali di Economia 11, 685−709.
Dyer, James S., Thomas Edmunds, John C. Butler, & Jianmin Jia (1998), "A Multiattribute Utility Analysis of Alternatives for the Disposition of Surplus Weapons-Grade Plutonium," Operations Research 46, 749−762.
Fanshel, S. & J.W. Bush (1970), "A Health-Status Index and Its Application to Health Services Outcomes," Operations Research 18, 1021−1066.
Farquhar, Peter H. & Peter C. Fishburn (1981), "Equivalence and Continuity in Multivalent Preference Structures," Operations Research 29, 282−293.
Farquhar, Peter H. (1977), "A Survey of Multiattribute Utility Theory and Applications." In M.K. Starr & M. Zeleny (Eds.), Multiple Criteria Decision Making, North-Holland, Amsterdam, 59−89.
Fischer, Gregory W., Mark S. Kamlet, Stephen E. Fienberg, & David A. Schkade (1986), "Risk Preferences for Gains and Losses in Multiple Objective Decision Making," Management Science 32, 1065−1086.
Fishburn, Peter C. (1965), "Independence in Utility Theory with Whole Product Sets," Operations Research 13, 28−45.
Fishburn, Peter C. (1976), "Utility Independence on Subsets of Product Sets," Operations Research 24, 245−255.
Fryback, Dennis G. (1999), "Utility Assessment for Cost-Utility Analysis in Health Care: The $/QALY Model." In James C. Shanteau, Barbara A. Mellers, & David A. Schum (Eds.), Decision Science and Technology: Reflections on the Contributions of Ward Edwards, Kluwer, Dordrecht.
Hsee, Christopher K. (1998), "Less is Better: When Low-Value Options are Valued more Highly than High-Value Options," Journal of Behavioral Decision Making 11, 107−122.
Hutton Barron, F. & Bruce E. Barrett (1996), "Decision Quality Using Ranked Attribute Weights," Management Science 42, 1515−1523.
Jansen, Sylvia J.T., Anne M. Stiggelbout, Peter P. Wakker, Marianne A. Nooij, Evert M. Noordijk, & Job Kievit (2000), "Stability of Patients' Preferences: Do Utilities for Breast Cancer Treatment Change as a Result of Experience with the Treatment?," Medical Decision Making, forthcoming.
Jansen, Sylvia J.T., Anne M. Stiggelbout, Peter P. Wakker, Thea P.M. Vliet Vlieland, Jan-Willem H. Leer, Marianne A. Nooy, & Job Kievit (1998), "Patient Utilities for Cancer Treatments: A Study on the Feasibility of a Chained Procedure for the Standard Gamble and Time Trade-Off," Medical Decision Making 18, 391−399.
Kahneman, Daniel & Amos Tversky (1979), "Prospect Theory: An Analysis of Decision under Risk," Econometrica 47, 263−291.
Kahneman, Daniel & Amos Tversky (1999, Eds.), "Choices, Values, and Frames." Cambridge University Press, forthcoming.
Kaplan, R.M. (1993), "Quality of Life Assessment for Cost/Utility Studies in Cancer," Cancer Treat. Rev. 19 suppl A, 85−93.
Keeney, Ralph L. & Timothy L. McDaniels (1999), "Identifying and Structuring Values to Guide Integrated Resource Planning at BC Gas," Operations Research 47, 651−662.
Keeney, Ralph L. & Howard Raiffa (1976), "Decisions with Multiple Objectives." Wiley, New York (Second edition 1993, Cambridge University Press, Cambridge, UK).
Keller, L. Robin & Craig W. Kirkwood (1999), "The Founding of INFORMS: A Decision Analysis Perspective," Operations Research 47, 16−28.
Kimbrough, Steven O. & Martin Weber (1994), "An Empirical Comparison of Utility Assessment Programs," European Journal of Operational Research 75, 617−633.
Krischer, Jeffrey P. (1980), "An Annotated Bibliography of Decision Analytic Applications to Health Care," Operations Research 28, 97−113.
Loewenstein, George F. & John Elster (1992), "Choice over Time." Russell Sage Foundation, New York.
Loewenstein, George F. & N. Sicherman (1991), "Do Workers Prefer Increasing Wage Profiles?," Journal of Labor Economics 9, 67−84.
McDaniels, Timothy L. (1995), "Using Judgment in Resource Management: A Multiple Objective Analysis of a Fisheries Management Decision," Operations Research 43, 415−426.
Pliskin, Joseph S., Donald S. Shepard, & Milton C. Weinstein (1980), "Utility Functions for Life Years and Health Status," Operations Research 28, 206−224.
Russell, L.B., M.R. Gold, J.E. Siegel, N. Daniels, & Milton C. Weinstein (1996, for the Panel on Cost-Effectiveness in Health and Medicine), "The Role of Cost-Effectiveness Analysis in Health and Medicine," JAMA 276, 1172−1177.
Salo, Ahti A. & Raimo P. Hämäläinen (1992), "Preference Assessment by Imprecise Ratio Statements," Operations Research 40, 1053−1061.
Strotz, Robert H. (1957), "The Empirical Implications of a Utility Tree," Econometrica 25, 269−280.
Torrance, George W. (1986), "Measurement of Health State Utilities for Economic Appraisal: A Review," Journal of Health Economics 5, 1−30.
Torrance, George W., David H. Feeny, William J. Furlong, Ronald D. Barr, Yuemin Zhang, & Qinan Wang (1996), "Multiattribute Utility Function for a Comprehensive Health Status Classification System: Health Utilities Index Mark 2," Medical Care 34, 702−722.
Von Nitzsch, Rüdiger & Martin Weber (1993), "The Effect of Attribute Ranges on Weights in Multiattribute Utility Measurements," Management Science 39, 937−943.
Weber, Martin & Katrin Borcherding (1993), "Behavioral Influences on Weight Judgments in Multiattribute Decision Making," European Journal of Operational Research 67, 1−12.
Von Winterfeldt, Detlof & Ward Edwards (1986), "Decision Analysis and Behavioral Research." Cambridge University Press, Cambridge.