IJPP 2005, 13: 9–20. © 2005 The Authors. Received March 22, 2004; accepted September 16, 2004. DOI 10.1211/0022357055245. ISSN 0961-7671.

Using discrete choice experiments to value preferences for pharmacy services Katherine Payne and Rachel Elliott

Abstract

North West Genetics Knowledge Park, The Nowgen Centre, Manchester, UK Katherine Payne, research fellow School of Pharmacy and Pharmaceutical Sciences, The University of Manchester, Manchester, UK Rachel Elliott, clinical senior lecturer

Correspondence: Dr Katherine Payne, North West Genetics Knowledge Park, Grafton Street, Manchester M13 9WL, UK. E-mail: katherine.payne@manchester.ac.uk

Acknowledgements: This paper has benefited from attendance at the 'Stated preference discrete choice modelling in healthcare' course hosted by the Health Economics Research Centre, University of Oxford, in collaboration with Jordan Louviere and Vic Adamowicz, and from informal discussions with health economists with expertise in discrete choice experiments, in particular Emma McIntosh, University of Oxford. Any errors or omissions remain the responsibility of the authors.

Objective: This paper describes the application of discrete choice experiments (DCEs) in the measurement of consumers' preferences for pharmacy services.

Summary: Patients' preferences for healthcare strongly influence their use of services. Quantifying revealed preferences for services (what services people use) is not always possible, because either the service does not yet exist or the consumer has no experience of it. There is a need for tools that measure stated preference (what people say they will do) for healthcare, to allow the development of new services. DCEs have been used in the valuation of preferences for healthcare services and interventions, and can be applied usefully to the valuation of preferences for pharmacy services. DCEs assume that preferences are based on preferences for the different attributes of a service, and that consumers are prepared to trade off one attribute against another, such as effectiveness versus side-effects. In a DCE study, respondents make hypothetical choices between scenarios of services with fixed attributes but varying levels, revealing their strength of preference for the attributes of that service. These data are analysed using regression, which generates coefficients that quantify the direction and magnitude of preferences. Marginal rates of substitution and willingness to pay for each attribute can be estimated, which provide powerful information for future service provision. For this approach to be applied in practice, key methodological issues must be handled explicitly, principally scenario design, attribute and level selection, orthogonality, level balance, minimal overlap and utility balance. A hypothetical example of a DCE designed for valuing consumers' preferences for a medication review service for the elderly is described.

Introduction

Consumers, choices and the NHS

Members of the public are becoming more knowledgeable about healthcare and consequently are expecting more from the services provided.1 To improve the use of effective pharmacy services it is important to know what aspects of the service affect people's preferences. Choices about the allocation of resources have to be made, and are being made, both at a national level in the UK (for example, by bodies such as the National Institute for Clinical Excellence (NICE)) and at a local level. Decision makers have to determine what services to provide, when, and at what level of provision. We suggest that consumers' and patients' preferences and values for pharmacy services need to be used to inform the answers to these questions. For the remainder of this paper we will use the term 'consumers' to refer to current patients, future patients or other users of pharmacy services.

How can we identify preferences?

Revealed preferences demonstrate what consumers actually choose to use in real-life situations. For example, the revealed preference for a cholesterol screening service could be measured by counting the number of consumers who came in for screening over a defined time period, as a proportion of the number of consumers eligible for the service. The revealed preference approach is not generally a practical option for quantifying preferences for healthcare. It is often not feasible to obtain information on the number of consumers who use a service and hence their 'revealed' preference for the service. Often, decision makers want information about the potential value of a new service before it is established. A new service may have certain characteristics that are not present in the existing service, and we need to know whether consumers value these characteristics enough to switch from the existing service. Opinion polls have been used widely to identify consumers' preferences. They are simple to use but provide limited information about the direction or strength of preferences. Satisfaction surveys provide information about what is important to consumers and their level of satisfaction with the current service as provided, but not about the potential value of future services.2,3 Neither method can be used to identify the factors, and the relative importance of the factors, that drive observed levels of satisfaction.

Valuing preferences

Research methods have been developed to value consumers’ stated preference for a healthcare intervention. In contrast to revealed preference, ‘stated preference’ provides a measure of what consumers say they will do based on a hypothetical (imaginary) scenario describing the healthcare service or intervention in question. Discrete choice experiments (DCEs) are a stated preference method, based on economic theory that assumes people have clear preferences for goods or services and are able to choose one good or service in preference to another. This paper describes the application of DCEs in the measurement of consumers’ preferences for healthcare, with particular reference to pharmacy services.

Discrete choice experiments

DCEs identify the relative importance of individual attributes (factors or characteristics) of a service or product and their effect on consumer preferences for it. DCEs are part of a group of stated preference methods called 'conjoint analysis' that elicit preferences for scenarios using rating, ranking or discrete choices.4 Conjoint analysis was developed by market researchers to estimate the impact of selected product characteristics on consumer preferences. Conjoint analysis involves a survey that presents hypothetical scenarios. Consumers are asked to rank, rate or choose between scenarios, which describe product characteristics, in order of preference. DCEs ask the consumer to make a 'discrete choice' between two or more services that differ with respect to certain predefined attributes. In theory, the consumer has a stronger preference for, and will choose, the service that reflects the higher value to them, given the defined attributes and levels. DCEs are therefore a 'choice-based' stated preference method. The good or service, such as a car, is described using attributes, such as price, colour, engine size, fuel type and average fuel consumption. The attributes are described using levels, such as black, silver, red or blue for colour, and 1300 cc or 2 litres for engine size. Different people have

different preferences for each of these attributes, and this informs their preference for one type of car over another. DCEs assume that people will trade off one attribute against another. An example scenario is one in which someone makes a hypothetical choice between two types of car (car A and car B). Car A and car B differ in terms of the levels attached to the attributes. Car A may have the preferred price and engine size for that person, but use a less preferable type of fuel and have poorer fuel consumption than car B. In this instance, the person may still decide to pick car A, even though it does not completely match their ideal preferences for all the attributes. The person has had to make trade-offs between the attributes and levels describing the car. This process closely resembles how we make decisions on a day-to-day basis. The theory underpinning DCEs is rooted in the discipline of economics and the random utility framework. This assumes that the total value (utility) a consumer attaches to a good or service is the sum of the values attached to its individual attributes. With respect to healthcare, these attributes can include factors that value the clinical outcome of the service or the process of providing it. The attributes are then described in more detail by defining a number of levels for each one. DCEs have been used quite widely in the evaluation of preferences for healthcare programmes. The reader is directed to Ryan et al (2001)5 and Ryan and Gerard (2003)6 for two systematic reviews of approaches used to value the public's preferences for healthcare interventions. These reviews provide useful overall assessments of the use of DCEs in the evaluation of healthcare services in papers published between 1990 and 2000.
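The additive random-utility idea behind this car example can be sketched in a few lines of Python. The part-worth numbers below are invented purely for illustration (and the random error term is set to zero); they are not estimates from any study.

```python
# Hypothetical part-worth utilities for each attribute level of a car.
# All numbers are illustrative assumptions, not estimates from any study.
PART_WORTHS = {
    "price":       {"low": 1.0, "high": -1.0},
    "engine_size": {"1300cc": -0.3, "2000cc": 0.5},
    "fuel_type":   {"petrol": 0.2, "diesel": -0.2},
    "consumption": {"good": 0.8, "poor": -0.8},
}

def utility(car):
    """Additive utility: the sum of the part-worths of the chosen levels
    (the random error term of the random utility model is omitted)."""
    return sum(PART_WORTHS[attr][level] for attr, level in car.items())

# Car A wins on price and engine size; car B on fuel type and consumption.
car_a = {"price": "low",  "engine_size": "2000cc",
         "fuel_type": "diesel", "consumption": "poor"}
car_b = {"price": "high", "engine_size": "1300cc",
         "fuel_type": "petrol", "consumption": "good"}

# The chosen car is the one whose summed part-worths are larger, so this
# respondent trades off poorer fuel attributes for price and engine size.
chosen = "car A" if utility(car_a) > utility(car_b) else "car B"
```

With these invented weights the respondent picks car A (utility 0.5 versus −0.3), mirroring the trade-off described in the text.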
To identify more recent examples of studies applying DCEs to healthcare, we re-ran the Ryan and Gerard (2003)6 search strategy for the following databases from January 2003 to May 2004: MEDLINE; EMBASE; ISI Web of Science; PsycINFO; EconLit and the Health Management Information Consortium database. Table 1 summarises the methods used in the identified papers that used the DCE approach and presents the primary empirical results of the studies. Studies that presented only methodological issues, or that used conjoint analysis with ranking or rating, were not included.

How do you do a DCE?

While a full description of the technicalities involved is beyond the scope of this paper, we describe how to do a DCE using current thinking in study design and data analysis. We have used a hypothetical example of a DCE designed to value consumers' preferences for a medication review service for the elderly to illustrate the key principles. We selected this service as an example because, in the UK, it is relevant to emerging NHS policy described in the National Service Framework for the elderly and has formed the basis of the recently established Local Pharmaceutical Service pilots, which are exploring new ways of contracting for pharmacy services. The same issues are relevant in other countries where new services are aspired to or planned.

Table 1 Some examples of the application of discrete choice experiments to healthcare evaluation

Bech (2003)7. Service under evaluation: reimbursement schemes.
Sample: 65 (52%) Danish county council politicians and 97 (63%) hospital managers; number of reminders = 2 (1 by letter and 1 by phone).
Attributes (levels): (1) the county's budget safety for the healthcare sector (does not provide budget safety; provides budget safety); (2) hospitals' incentives to increase the number of patients (no incentive; small incentives; high incentives); (3) quality of treatment (constant quality; increasing quality); (4) hospitals know their budget for the present year with certainty (no uncertainty; hospitals' revenue depends on activity); (5) cost per treatment (constant; decreasing).
Design: discrete choice experiment. SAS was used to generate an optimal set of scenarios taking orthogonality and level balance into account; 16 scenarios were combined into eight choices that were randomly allocated into two blocks of four pairwise sets. Administration: postal questionnaire.
Data analysis: random effects probit model. Three models were specified: full joint model; full partitioned model; restricted model.

Gerard et al (2003)8. Service under evaluation: breast cancer screening participation (the process of attending for breast screening in Australia).
Sample: 87 (48%) women; number of reminders not reported.
Attributes (levels): (1) method of inviting women for screening (personal reminder from the local service; personal reminder and recommendation by GP; a media campaign; a recommendation from family/friends), with information included with the invitation (sheet about benefits and risks of screening; no information sheet); (2) time to wait for an appointment (1 week; 4 weeks); (3) choice of appointment times (usual office hours; one evening per week; Saturday morning); (4) time spent travelling (not more than 20 min; 20–40 min; 40–60 min; 1–2 h); (5) how staff at the screening relate to the patient (welcoming manner; reserved manner); (6) attention paid to privacy (private changing area; open changing area); (7) time spent attending for mammogram (20 min; 30 min; 40 min; 50 min); (8) time to notification of results (8; 10; 12; 14 working days); (9) level of accuracy of the test (70%; 80%; 90%; 100%).
Design: discrete choice experiment. A design expert created an orthogonal main effects design with selected two-factor interactions; 32 scenarios blocked into two sets of 16 options. Administration: postal questionnaire.
Data analysis: random effects probit model and binary probit model were compared. Statistical tests included: log likelihood function; likelihood ratio index; chi-squared; rho (to explore evidence of random effects); P values; and the percentage of correctly predicted responses.

Mark et al (2003)9. Service under evaluation: rate of prescription of alcoholism medications.
Sample: 1388 (65%) physicians specialising in addiction medication; number of reminders = 2 (1 letter and 1 phone).
Attributes (levels): (1) percentage of the treated population who remained abstinent during a three-month treatment period (20%; 40%; 60%; 80%); (2) percentage of patients with no incidence of heavy drinking during a three-month treatment period (20%; 40%; 60%; 80%); (3) percentage of patients who experience non-serious side-effects such as nausea, headaches, dry mouth, dizziness or nervousness during a three-month treatment period (10%; 15%; 25%; 35%); (4) percentage of patients who complied at a high rate (80 or more doses) during a three-month treatment period (20%; 40%; 60%; 80%); (5) mode of action (directly reduces drinking; causes an adverse reaction with alcohol); (6) route of administration (oral; long-acting injection); (7) price per day ($0.25; $1.00; $3.00; $5.00).
Design: discrete choice experiment including a 'neither' option. 128 scenarios were blocked into groups of four scenarios (a total of 32 blocks) and each respondent was randomly assigned to a block; four interaction terms were included. Administration: postal questionnaire (with $50 incentive).
Data analysis: multinomial logit model using maximum likelihood techniques.

Ratcliffe et al (2003)10. Service under evaluation: treatments for osteoarthritis.
Sample: 412 (response rate not reported) from a general population of older patients (over 55 years) with osteoarthritis; number of reminders = 0.
Attributes (levels): (1) joint aches (very slight; moderate; severe); (2) joint pains (occasionally; 2–3 times per week; most days); (3) mobility (normal; some difficulty; confined to chair); (4) risk of mild/moderate side-effects (1 in 4; 2 in 4; 3 in 4); (5) risk of serious side-effects (1 in 500; 1 in 200; 1 in 50).
Design: discrete choice experiment. SPEED was used to identify 16 scenarios that were paired into three sets of eight choices using a random number generator plus a logical consistency test. Administration: telephone interviews.
Data analysis: random effects probit model, segmented according to: severity of osteoarthritis; age; annual income; most troublesome side-effect experienced.

Ryan and Ubach (2003)11. Service under evaluation: repeat prescribing system.
Sample: 66 (response rate not reported) patients recruited to a randomised controlled trial (33 intervention and 33 control) evaluating a new repeat prescribing system; number of reminders not reported.
Attributes (levels): (1) convenience of ordering and collecting your repeat prescription (attend surgery and visit pharmacy; visit only pharmacy); (2) cost of ordering and collecting your repeat prescription (£0.25; £1.00; £2.00 per month); (3) quality of information received with the repeat prescription (spoken directions; spoken directions and discussion of any problems).
Design: discrete choice experiment. Full factorial design with one scenario taken out at random and the remaining 23 scenarios retained; these 23 choices were randomly allocated between six questionnaires. Administration: not reported.
Data analysis: random effects probit model using a general to specific approach at the 90% significance level. Each attribute was segmented into two groups: intervention and control.

Salkeld et al (2003)12. Service under evaluation: colorectal cancer (CRC) screening by faecal occult blood test.
Sample: 301 (response rate not reported) people aged 50 to 70 years living in central Sydney at 'average risk' of colorectal cancer; number of reminders = 0.
Attributes (levels): (1) benefit, measured as number of CRC deaths prevented (3; 8; 14; 20 per 10 000 screened); (2) potential harm, measured as false-positive-induced colonoscopies (100; 300; 600; 800 per CRC death prevented); (3) notification policy for the test result (notified of either a positive or a negative test result; notified only of a positive test result; not notified of a negative test result).
Design: discrete choice experiment. Full factorial design with 32 scenarios paired into 16 choices, plus two added choices to test for rationality. Administration: face-to-face interviews.
Data analysis: random effects probit model; a base case model including all responses plus an interaction model. A second, reduced regression model with 'non-traders' excluded was also estimated.

Schwappach (2003)13. Service under evaluation: allocation of scarce resources for life-saving treatments.
Sample: 150 (response rate not reported) undergraduate students from the medical and economic faculties of the University of Witten; number of reminders = 0.
Attributes (levels): (1) life expectancy (1; 10; 30 years); (2) quality of life (very good; limited; bad, as defined by EuroQoL states 11111, 11221 and 11331, where 11111 = perfect health, 11221 = some problems with performing usual activities and in moderate pain and discomfort, and 11331 = unable to perform usual activities and in extreme pain and discomfort); (3) age (20; 40; 60 years); (4) socio-economic status (high; low); (5) whether the patient had received costly treatment previously (yes; no); (6) former lifestyle of the patient (healthy; unhealthy).
Design: discrete choice experiment plus allocation of a percentage of a budget. 18 scenarios were identified by SPSS Orthoplan and paired using random number tables, with four scenarios prepared manually (example and maximal differences between levels); paired into 11 choices. Administration: interactive survey on the worldwide web.
Data analysis: a two-limit random effects Tobit model was compared with a random effects generalised least squares (GLS) regression model, which was considered superior. The model was re-estimated after excluding non-traders.

Scott et al (2003)14. Service under evaluation: community preferences for out-of-hours care provided by general practitioners.
Sample: 3891 (68%) parents of children in Aberdeen and Glasgow; number of reminders = 0.
Attributes (levels): (1) where your child is seen (emergency centre run by GPs; your home; a hospital emergency department); (2) who your child sees (a GP from your practice; a GP who doesn't work at your practice); (3) time taken between the telephone call and treatment being received (20 min; 40 min; 60 min; 80 min); (4) whether the doctor seems to listen to what you have to say (doctor seems to listen; doctor does not seem to listen).
Design: discrete choice experiment. 48 scenarios were reduced to 16. One scenario, chosen to represent the new model of emergency care, was held constant and the others compared with it; the remaining 15 scenarios were randomly divided across two questionnaires (one with eight pairs and one with seven pairs). Administration: postal questionnaire.
Data analysis: random effects probit model: (a) with main effects; (b) with main effects and interaction terms, which was reduced to a more parsimonious model by excluding variables with a P value greater than 0.10.

Sculpher et al (2004)15. Service under evaluation: management of non-metastatic prostate cancer.
Sample: 129 (72% response rate) men with non-metastatic prostate cancer, 69 of whom had T stage 1 or 2 cancer at diagnosis; number of reminders = 0.
Attributes (levels): (1) diarrhoea (absent; mild; moderate); (2) hot flushes (absent; mild; moderate); (3) ability to maintain an erection (no problems; occasional problems; unable); (4) physical energy (no problems; lacking); (5) breast swelling or tenderness (absent; present); (6) sex drive (no change; diminished); (7) life expectancy (both options equal; one option better by 2 months; one option better by 4 months); (8) out-of-pocket expenses (range £0 to £400, 16 levels used).
Design: discrete choice experiment. The exercise was divided into two parts: two sets of three unique attributes (3, 4, 6 and 1, 2, 5) plus two common attributes (7, 8); each part contained eight choices. Eight different versions of the questionnaire, each representing a different orthogonal main effects design, were prepared. Administration: face-to-face interviews.
Data analysis: random effects probit, specified as two separate models.

Skjoldborg and Gyrd-Hansen (2003)16. Service under evaluation: Danish hospital and healthcare system.
Sample: 1991 (69%) of the Danish population; number of reminders = 0.
Hospital attributes (levels): (1) travel time to hospital by car (hospital A: 15 min; hospital B: 15; 35; 60 min); (2) admission to emergency department (hospital A: open for everyone; hospital B: open for everyone; referral from emergency doctor required); (3) average waiting time for non-acute surgery (hospital A: 6 months; hospital B: 3; 6; 9 months); (4) frequency of treatment without complications (hospital A: lower than average; hospital B: lower than average; higher than average); (5) introduction of up-to-date treatment regimes has priority (hospital A: no; hospital B: no; yes); (6) the patient is primarily attended by the same physician (hospital A: yes; hospital B: yes; no); (7) number of beds per ward (hospital A: 4; hospital B: 2; 4; 8); (8) out-of-pocket payment (in DKK) per hospitalisation (hospital A: 0; hospital B: 0; 1200; 2500; 5000; 10 000) or (9) extra tax payment per year (in DKK) (hospital A: 0; hospital B: 0; 1200; 2500; 5000; 10 000) or (10) out-of-pocket payment versus no out-of-pocket payment (yes; no) or (11) tax increase versus no tax increase (yes; no).
Healthcare system attributes (levels): (1) the healthcare system tries to offer all new treatments irrespective of cost (system A: yes; system B: no; yes); (2) more screening programmes introduced (system A: no; system B: no; yes); (3) there is free choice of public hospital (system A: no; system B: no; yes); (4) treatment in a private hospital is subsidised (system A: no; system B: no; yes); (5) focus on preventive measures to reduce lifestyle-related diseases (system A: yes; system B: no; yes); (6) extra payment per year (system A: 0; system B: 0; 1200; 2500; 5000; 10 000; 25 000) or (7) maximum out-of-pocket payment per year (system A: 0; system B: 0; 1200; 2500; 5000; 10 000) or (8) out-of-pocket payment versus no out-of-pocket payment (yes; no) or (9) tax increase versus no tax increase (yes; no).
Design: discrete choice experiment. SPEED and a block design; 25 subgroups, each with 191 to 265 respondents; three questions per respondent for the hospital and the healthcare system. Administration: face-to-face interviews.
Data analysis: random effects probit model compared with a binomial probit model. Dummy variables were included to represent the disutility associated with the notion of payment and the degree of payment.

Taylor and Armour (2003)17. Service under evaluation: two methods of induction of labour — dinoprostone vaginal gel versus artificial rupture of membranes (ARM) plus oxytocin.
Sample: 352 (88%) pregnant women attending public antenatal clinics in Australia; number of reminders = 0.
Attributes (levels): (1) method of induction (ARM plus oxytocin; dinoprostone gel); (2) place of care (ARM: labour ward; dinoprostone: antenatal ward + labour ward; labour ward; single room + labour ward); (3) length of time from induction to delivery (ARM: 6 h; dinoprostone: 6; 14; 24 h); (4) level of pain, as chance of needing an epidural (ARM: 50%; dinoprostone: 30%; 50%; 80%); (5) mode of delivery, as chance of needing a Caesarean (ARM: 10%; dinoprostone: 5%; 10%; 15%); (6) cost (ARM: $A0; dinoprostone: $A0; $A150; $A1500).
Design: discrete choice experiment. A factorial design from mathematical tables was used to identify 18 scenarios for dinoprostone gel that were compared with the current situation (ARM levels kept constant). Administration: self-completion questionnaire in clinic.
Data analysis: random effects probit model; there was also an analysis of income subgroups.

Ubach et al (2003)18. Service under evaluation: hospital consultants' job characteristics.
Sample: 1793 (61%) hospital consultants in Scotland; number of reminders = 1.
Attributes (levels): (1) working relationships with staff (good; fair); (2) amount of staff at work (enough; shortage); (3) change in actual hours of work a week (−10; −5; 0; +5); (4) on-call (home and not busy; home and very busy; residential and not busy; residential and busy); (5) change in total NHS income (no change; 10%; 20%); (6) opportunities to do non-NHS work (none; some; unlimited).
Design: discrete choice experiment. SPEED was used to identify 16 scenarios; a constant scenario was paired with the other 15 scenarios, with an extra choice added for internal consistency; three versions of the questionnaire were used. Administration: postal questionnaire.
Data analysis: random effects probit model. Three separate models were estimated (six job characteristics only; demographic and family characteristics plus job characteristics; specialities and location plus job characteristics) using a general to specific approach. The final model included attributes significant at a P value of 0.01.
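Most of the studies in Table 1 analyse the choice data with a probit or logit regression, and the estimated coefficients can then be converted into marginal rates of substitution and willingness to pay, as noted in the abstract. A minimal sketch of those ratio calculations, using invented coefficients that are not taken from any of the cited papers:

```python
# Illustrative coefficients from a hypothetical choice model (e.g. a random
# effects probit as used by most Table 1 studies). The attribute names and
# numbers are invented for illustration only.
coef = {
    "effectiveness": 0.04,   # utility per percentage point of benefit
    "waiting_time": -0.02,   # utility per week of waiting
    "cost":         -0.01,   # utility per pound of out-of-pocket cost
}

# Marginal rate of substitution between two attributes: the weeks of extra
# waiting a respondent would accept for one more unit of effectiveness.
mrs_wait_per_effect = -coef["effectiveness"] / coef["waiting_time"]

# Willingness to pay for a unit of an attribute: the ratio of its
# coefficient to the (negated) cost coefficient.
wtp_effectiveness = -coef["effectiveness"] / coef["cost"]  # pounds per % benefit
wtp_waiting = -coef["waiting_time"] / coef["cost"]
# wtp_waiting is negative: each extra week of waiting is worth -£2,
# i.e. respondents would pay £2 to avoid it.
```

With these numbers, a respondent would accept 2 extra weeks of waiting per percentage point of benefit, and would pay £4 per percentage point of benefit.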


Table 2 Possible attributes and levels for the medication review for the elderly service

Process attributes:
- Who conducts the medication review (pharmacist; general practitioner)
- Frequency of medication review (monthly; 6-monthly; 12-monthly; 24-monthly)
- Cost to you in terms of travel, £ (0; 5; 10; 20)

Health outcome:
- Chance of improving your health by reducing the risk of an adverse event, % (5; 10; 25; 35)

Non-health outcomes:
- Follow-up support (no; yes)

Table 2 shows the attributes and levels that could be used to describe the service in our hypothetical DCE. The medication review service for the elderly in this example (Table 2) is thus described by five attributes. The first step in a DCE is to decide which attributes and levels adequately describe the service to be evaluated. The attributes and levels can be qualitative (such as who conducts a medication review) or quantitative (such as the frequency or cost of a medication review). The DCE should be designed with a minimum number of qualitative attributes because they complicate the design and limit the amount of information that can be obtained from the DCE. The number of attributes included should be the minimum needed to describe the service adequately without losing important information about the attributes that may drive people's preferences, because the number of attributes determines the length of the final DCE. It is important to consider interactions between attributes and to try to maintain independence between the overall effects of each attribute on preferences. To illustrate, consider the attribute cost. In this example, cost is described in terms of travel cost. A second attribute could be included describing the distance a person has to travel to access the service, using distance travelled (in miles) as the levels. In this instance, there would be an interaction between the cost and distance travelled attributes, because the further you travel, the more expensive the journey becomes. This can be dealt with either by removing one of the interacting attributes, which appears to duplicate the same information, or by allowing for the interaction in the design. Careful consideration must also be paid to choosing the levels that describe each attribute. It is not necessary to have equal intervals between the levels. Levels selected should

seem reasonable and reflect the service (or good) being evaluated (Table 2). For example, it would not be realistic to assign levels such as every day, or once every 10 years, to an attribute describing the frequency of the medication review. The quality of the information elicited about people's preferences will only be as good as the attributes and levels used. Systematically organised searches of the published literature are often supplemented by qualitative techniques, such as face-to-face interviews (semi-structured or depth) or focus groups, to define the important individual attributes and levels of the service in question. For example, a DCE exploring preferences for electronic prescribing systems used interviews with GPs, computing experts and opinion leaders, and focus groups with pharmacists and patients, to identify the attributes.19

Identify hypothetical scenarios and pairwise choices

The next step is to develop the 'choice' question for each respondent to address. This is the choice each respondent is being asked to make regarding the service, i.e. 'Given the following characteristics of each service, which service would you choose, service A or service B?' It must clearly express the research question being addressed, which is to elicit information about people's preferences for a service using a hypothetical exercise. The attributes and levels are formed into scenarios. The set of scenarios needed to incorporate every possible combination of attributes and levels is called a 'full factorial design'. If an experiment has four attributes, each with two levels, this gives rise to 2⁴ = 16 possible scenarios. The hypothetical example of the DCE of a medication review service has two attributes with two levels (2²) and three attributes with four levels (4³). This results in 2² × 4³ = 256 possible scenarios. This number of scenarios would produce an unmanageable number of choices for each respondent to consider. A 'rule of thumb' is that respondents can generally manage between 9 and 16 pairwise choices before they get tired or bored, so too many choices may affect the estimated model parameters.20 In contrast, some DCE designers suggest that respondents may cope with designs that include more choices.21 Consensus is developing around criteria for efficient design. Huber and Zwerina suggest four criteria to maximise design efficiency:22 orthogonality; level balance; minimal overlap; and utility balance. These four criteria have been used in practice to inform the design of a conjoint analysis tool to evaluate HIV testing.23

Orthogonality in the design

The principle of orthogonality means that occurrences of any two levels of different attributes in the design are uncorrelated. In practice, an orthogonal design means that each attribute, such as who conducts the medication review or the frequency of the review, can be assumed to have an independent effect on the overall utility (value) of the service. This is important because it means that the individual effect of each attribute on respondents’ preferences can be completely isolated. This is also the stage at which any potential interaction between attributes should be considered. Software packages (SPEED or SPSS Orthoplan) have been used by DCE designers to reduce the total number of scenarios to a number that respondents can feasibly cope with answering, while maintaining an orthogonal main effects design. There is no consensus on the ideal approach to use. One alternative is to employ the skills of a design expert. In future, design catalogues, which suggest the number of scenarios given the number of attributes and levels, may be used to improve the efficiency of DCE designs.24
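These two ideas, the full factorial and an orthogonal reduction of it, can be sketched in pure Python. The block below enumerates a small full factorial of four two-level attributes and then checks that a classic orthogonal fraction of it has uncorrelated attribute columns. The 8-scenario design shown is a textbook half fraction, not output from SPEED or SPSS Orthoplan, and is used here only for illustration:

```python
from itertools import product

# Full factorial for four two-level attributes (levels coded -1/+1):
# 2^4 = 16 scenarios.
full_factorial = list(product([-1, 1], repeat=4))
print(len(full_factorial))  # 16

# An orthogonal main-effects fraction: keep only the 8 scenarios in which
# the fourth attribute's level equals the product of the other three.
design = [row for row in full_factorial if row[3] == row[0] * row[1] * row[2]]

def pearson(x, y):
    # Pearson correlation coefficient, computed from first principles.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Orthogonality: every pair of attribute columns is uncorrelated, so each
# attribute's effect on utility can be estimated independently.
columns = list(zip(*design))
for i in range(4):
    for j in range(i + 1, 4):
        assert abs(pearson(columns[i], columns[j])) < 1e-9
print(len(design))  # 8 scenarios instead of 16
```

The half fraction keeps the main effects estimable while halving the number of scenarios respondents face; the cost is that the fourth attribute's main effect is confounded with the three-way interaction of the others.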

Level balance

This term describes balance in the frequency with which the levels of an attribute occur in the design. Consider our DCE and imagine we have identified 16 scenarios. Each scenario is described by five attributes, and the numbers of levels (2 and 4) are multiples of each other. For the two attributes described using two levels, each level will appear exactly eight times in the 16 scenarios. This is level balance. If one of these attributes had three levels, the design would be imbalanced, because each level should appear 5.33 (16/3) times and SPEED (or SPSS Orthoplan) would include more of one level than of the other two.

Minimal overlap

This criterion is used when starting to form pairwise choices. Minimal overlap describes how often attribute levels do not vary within each choice set in the overall design. The ideal DCE will have a minimum number of instances when the levels in a choice are the same. This is important because no information is obtained about the relative importance of an attribute for driving preferences when its levels in the pairwise choice are the same. The pairwise choice illustrated in Table 3 shows overlap for the attribute ‘frequency of medication review’, which in this instance has the same level (6-monthly).

Table 3 An example of a pairwise choice

Attributes                                           Service A              Service B
Who conducts the medication review?                  General practitioner   Pharmacist
Frequency of medication review                       6-monthly              6-monthly
Cost to you in terms of travel (£)                   5                      10
Chance of improving your health by reducing
  risk of an adverse event (%)                       5                      25
Follow-up support                                    Yes                    No
Which service would you prefer (tick one only)?

Utility balance

This relates to choices where the probability of choosing either alternative within a choice set is as similar as possible; that is, the respondent is forced into answering questions where the choice between service A and service B is difficult to make. It is often extremely difficult to meet all four criteria. Observance of utility balance may sometimes reduce the efficiency of the DCE design and, in practice, this criterion is not often upheld.

Creating pairwise choices

Once SPEED, Orthoplan or a design catalogue has indicated the number of scenarios to include in the experiment, the pairwise choices themselves must be designed. A number of different approaches to ‘pair up’ the scenarios may be used, and this aspect is still under debate as it is not clear what impact each method has on the statistical properties of the design.25 One approach is the constant comparison technique, in which one scenario, selected to best reflect the current way of providing the service, is kept constant and each scenario from the orthogonal design is paired against it. Alternatively, all the scenarios can be paired ‘randomly’. If this pseudo-random approach is used, a number of checks are necessary to ensure that minimal overlap has been achieved and that the pairing has not generated correlation between the attribute levels of each pair. The original orthogonal design ensures there is no correlation, but the process of forming pairs could introduce it. This should be checked using standard statistical approaches (Pearson’s or Spearman’s correlation coefficients). A full description of this process is beyond the scope of this paper. Table 3 shows an illustration of a pairwise choice. Most DCEs include an example of a pairwise choice (a practice choice question) for respondents at the start of the questionnaire. It is generally a good idea to include checks for internal validity as a measure of whether the respondent understands the nature of the experiment being conducted. A dominance check indicates whether respondents clearly understand that one alternative should logically be preferred to another: if the levels described for service A are better than (or equal to) the levels described for service B, the respondent should choose service A. A consistency check may be used to explore whether a respondent’s preferences remain the same throughout the DCE. This is tested by repeating a choice later in the DCE. Respondents may formulate (or develop) their preferences as they actually complete the DCE; in these instances, they may not pass a consistency check, which does not necessarily mean they have answered the DCE incorrectly. Further work is required to determine how consistent people’s preferences are during a DCE.
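The two post-pairing checks mentioned above, minimal overlap and pairing-induced correlation, can be sketched in pure Python. This is an illustrative sketch only: the 16 ‘scenarios’ are generated at random here rather than taken from a real orthogonal design, and the numerically coded levels follow the hypothetical medication review example:

```python
import random

random.seed(1)  # reproducible illustration

# Hypothetical attribute levels from the medication review example,
# numerically coded.
levels = [
    [0, 1],           # who: GP (0) or pharmacist (1)
    [0, 1],           # follow-up support: no (0) or yes (1)
    [1, 2, 4, 24],    # frequency: reviews per 24 months
    [0, 5, 10, 20],   # cost (pounds)
    [5, 10, 25, 35],  # chance of improved health (%)
]

# Stand-in for the 16 scenarios a reduced design would supply.
scenarios = [tuple(random.choice(l) for l in levels) for _ in range(16)]

# Pseudo-random pairing: shuffle a copy and pair off scenario by scenario.
alternatives = scenarios[:]
random.shuffle(alternatives)
pairs = list(zip(scenarios, alternatives))

# Check 1, minimal overlap: count cells where both services show the same
# level (such cells reveal nothing about that attribute).
overlap = sum(a[k] == b[k] for a, b in pairs for k in range(len(levels)))

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = sum((p - mx) ** 2 for p in x) ** 0.5
    sy = sum((q - my) ** 2 for q in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

# Check 2, pairing-induced correlation: for each attribute, correlate
# service A's levels with service B's levels across the 16 choices.
correlations = [pearson([a[k] for a, _ in pairs], [b[k] for _, b in pairs])
                for k in range(len(levels))]
print(overlap, [round(c, 2) for c in correlations])
```

In a real study the overlap count and the correlations would be inspected after each candidate pairing, and the pairing repeated until both are acceptably low.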


Administer the DCE

The approaches used to administer DCEs are similar to those used for standard surveys. There is currently no guidance on the ideal sample size. Published DCEs have used sample sizes ranging from around 30 respondents to several hundred. Louviere et al. offer a useful discussion of the considerations to be made when designing a study sample and selecting a sample size for a DCE.25 A key consideration is whether the DCE designer intends to conduct subgroup analysis of different groups within the population (defined, for example, by age, gender or socio-economic status) to explore differences in patterns of preferences. Generally, to achieve the target sample size, DCEs in healthcare have been administered using a postal survey.26,27 However, the usual challenge of achieving a reasonable response rate may be amplified with DCEs because they are cognitively demanding for respondents. To overcome this, some DCEs have been administered using face-to-face interviews, which also provide an opportunity to explore qualitatively some of the reasons behind respondents’ stated preferences.28

Analyse the results

For the study sample that completed the DCE, we first assess whether each attribute contributes to people’s preferences (the significance of the attribute). We then check the direction of the effect of each attribute on preferences and the relative importance of each attribute. Finally, we quantify how much people are willing to trade between attributes. The analysis of a DCE assumes that the overall strength of preference for a service (utility) is defined by a linear additive model. The dependent variable is whether the respondent chose service A or service B in each ‘choice’ question. The independent variables are the differences between the levels of each attribute in each ‘choice’ question. A linear additive model assumes that the overall utility (satisfaction) for a good or service is described by the sum of the individual utilities attached to each of the included attributes, and that each attribute has an independent effect on preferences. A regression model is defined as the change in utility in moving from service A to service B, each described by five attributes:

Change in utility = β1(who) + β2(frequency) + β3(cost) + β4(chance) + β5(support) + e + u    (1)

where β1 to β5 are the coefficients that describe the direction and strength of preference for each attribute; ‘who’, ‘frequency’, ‘cost’, ‘chance’ and ‘support’ represent the differences in the respective attributes between service A and service B; e is the error term due to
differences amongst observations and u is the error term due to differences amongst respondents. The analysis method of choice for this model is the random effects probit, a type of regression model. A similar model, the logit, may also be used. Logit and probit models differ in the assumptions made about the underlying distribution, but this does not generally affect the size of the coefficients in the resulting model.29 Probit (or logit) is used because the dependent variable in this model is binary. The random effects estimation corrects for correlation between observations from the same respondent; this correlation may occur because each respondent provides more than one observation when answering the pairwise choices in the experiment. The interested reader is directed to Greene and Louviere et al. for further information.25,29

Interpretation of the results
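To see where such coefficients come from, the sketch below simulates pairwise-choice data consistent with equation (1) and recovers the coefficients with a plain logit fitted by batch gradient ascent. This is pure Python and purely illustrative: the ‘true’ coefficients are invented, and a real analysis would use a random effects probit (or a statistics package) to allow for repeated observations per respondent:

```python
import math
import random

random.seed(7)  # reproducible illustration

# Invented 'true' preference weights for who, frequency, cost, chance and
# support; note the negative weight on cost.
TRUE_BETA = [0.8, 0.3, -0.4, 0.6, 0.05]

def simulate_choices(n=600):
    """Each observation: attribute-level differences (service B minus
    service A) and 1 if service B was chosen, 0 otherwise."""
    data = []
    for _ in range(n):
        x = [random.uniform(-2, 2) for _ in TRUE_BETA]
        utility_diff = sum(b * xi for b, xi in zip(TRUE_BETA, x))
        p_choose_b = 1 / (1 + math.exp(-utility_diff))  # logit link
        data.append((x, 1 if random.random() < p_choose_b else 0))
    return data

def fit_logit(data, lr=1.0, epochs=300):
    """Maximum likelihood for a logit with no intercept, via batch
    gradient ascent on the log-likelihood."""
    beta = [0.0] * len(TRUE_BETA)
    n = len(data)
    for _ in range(epochs):
        grad = [0.0] * len(beta)
        for x, y in data:
            p = 1 / (1 + math.exp(-sum(b * xi for b, xi in zip(beta, x))))
            for k, xk in enumerate(x):
                grad[k] += (y - p) * xk
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

beta_hat = fit_logit(simulate_choices())
print([round(b, 2) for b in beta_hat])  # signs should mirror TRUE_BETA
```

The fitted coefficients play the role of the betas in Table 4: their signs give the direction of preference and, attribute units permitting, their relative magnitudes give relative importance.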

The outputs of the analysis of a DCE describe the significance of each attribute, the direction of its effect and the relative importance of each attribute. The significance of each attribute is measured using the P value attached to the estimated coefficient and indicates whether each attribute contributes to the overall model. Table 4 provides some hypothetical results from a DCE. Four of the attributes have beta coefficients with P values less than 5% (0.05). The ‘support’ variable, a measure of whether follow-up support is provided, is not significant: it does not have a significant effect on preferences for this service. The sign of the beta coefficient indicates the direction of the influence of each attribute. This is easy to interpret for quantitative variables such as cost: a negative sign means that as cost increases, people are less likely to choose the service, so cost has a negative impact on preferences. A positive sign on the chance of improving health outcome indicates that people value an improvement in health outcome

Table 4 Hypothetical set of data from a DCE

Variable     Levels (dummy or quantitative code)                            Beta      P value
Who          General practitioner (0); Pharmacist (1)                       0.039     0.001
Frequency    Monthly (24); 6-monthly (4); 12-monthly (2); 24-monthly (1)    0.010     0.000
Cost (£)     0 (0); 5 (5); 10 (10); 20 (20)                                 -0.003    0.002
Chance (%)   5 (5); 10 (10); 25 (25); 35 (35)                               0.049     0.000
Support      No (0); Yes (1)                                                0.580     0.300


from a medication review. These results are intuitively logical. More care is needed when interpreting the sign of qualitative variables, which are coded using dummy coding (or, ideally, effects coding). It is important to note which level is coded as zero. The positive sign on the ‘who’ variable indicates that this population would prefer a pharmacist to conduct the review. The interested reader is directed to Phillips et al. and Louviere et al. for a detailed discussion of the use and relative merits of effects coding compared with dummy coding.23,25 The relative importance of each significant attribute is shown by the absolute size of its coefficient in the estimated model; only significant variables should be considered. It is important to take into account the unit of measurement for each attribute when interpreting the size of the coefficient. The order of importance of these attributes is: ‘chance’; ‘frequency’; ‘who’ and ‘cost’. Importantly, DCEs allow estimation of the marginal rate of substitution (MRS) for each attribute. For example, analysis of the MRS (the amount of one attribute an individual is prepared to trade off against another) between travel cost and chance of improving health gives an estimate of the willingness to trade a lower travel cost against a higher chance of improving health. It is also necessary to take into account the underlying measurement scales used for the attributes when comparing the utility estimates (beta coefficients) between attributes. One approach is to transform the coefficients by calculating the MRS of attributes with respect to a common unit.23,28 The common unit must have a quantitative scale, such as cost or time. The need to take account of scale effects is a recent methodological development and is still under debate by choice experiment designers.

Including a variable that represents the potential cost to the consumer (travel cost in this instance) provides a method of estimating the amount of money consumers are willing to trade off against changes in the other attributes. Using the estimate of the beta coefficient for the cost attribute, we can estimate (indirectly) the willingness-to-pay (WTP) for a one-unit increase in each of the other attributes. For example, the WTP for a pharmacist providing the medication review = 0.039/0.003 = £13. An estimate of willingness to pay provides an indirect method of estimating the monetary value that consumers attach to a pharmacy service. Ryan presents a useful comparison of two methods of estimating willingness to pay:30 a contingent valuation experiment (direct method) and a choice experiment (indirect method). Lancsar and Savage published a critique of the use of DCEs to estimate willingness to pay, which stimulated a debate in the health economics literature.31–34

The selection of attributes and levels, the design of the DCE and the approach used to estimate the model should be such that they can predict how preferences may be revealed in actual practice. A follow-up study would be required to check whether stated preferences reflect what happens in practice; this would be a test of external validity. To date, no studies have checked the


external validity of a DCE. To do this it would be necessary to design a revealed preference study, after conducting a DCE, and compare the results.
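The willingness-to-pay calculation generalises to any attribute. A few lines of Python make this concrete, using the hypothetical Table 4 coefficients (a toy illustration; the beta on cost is negative because higher cost reduces the probability of choosing a service):

```python
# Hypothetical beta coefficients from Table 4.
beta = {"who": 0.039, "frequency": 0.010, "cost": -0.003, "chance": 0.049}

def wtp(attribute):
    # Marginal rate of substitution between an attribute and money:
    # pounds a respondent would pay for a one-unit improvement.
    return beta[attribute] / -beta["cost"]

print(round(wtp("who"), 2))     # 13.0: value of a pharmacist (rather than GP) review
print(round(wtp("chance"), 2))  # pounds per one percentage point gain in chance
```

Dividing by the negative of the cost coefficient converts every attribute's utility weight into the same monetary unit, which is what allows the attributes to be compared directly.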

Implications for pharmacy

The majority of pharmacy services comprise a complex mix of characteristics that may be described not only by their effect on clinical outcome but also by process attributes, such as who provides the service. Ideally, pharmacists want to provide services that are perceived to be of value to consumers. However, pharmacists, along with other healthcare providers, are not always aware of which service characteristics are important to consumers or influence their decisions to use a service. Healthcare policy makers and planners require valid information on consumers’ preferences. DCEs provide a powerful method with which to identify the direction and strength of preference for characteristics of pharmacy-led services, so that a pharmacist could tailor the attributes of a pharmacy service to what potential service users say they want.

References

1 Litva A, Coast J, Donovan J et al. ‘The public is too subjective’: public involvement at different levels of health-care decision making. Soc Sci Med 2002;54:1825–37.
2 Willett VJ, Cooper CL. Stress and job satisfaction in community pharmacy: a pilot study. Pharm J 1996;256:94–8.
3 Wolfgang AP. Career satisfaction of physicians, nurses and pharmacists. Psychol Rep 1988;62:938.
4 Ryan M, Farrar S. Using conjoint analysis to elicit preferences for health care. BMJ 2000;320:1530–3.
5 Ryan M, Scott DA, Reeves C et al. Eliciting public preferences for healthcare: a systematic review of techniques. Health Technol Assess 2001;5:1–186.
6 Ryan M, Gerard K. Using discrete choice experiments to value health care programmes: current practice and future research reflections. Appl Health Econ Health Policy 2003;2:55–64.
7 Bech M. Politicians’ and hospital managers’ trade-offs in the choice of reimbursement scheme: a discrete choice experiment. Health Policy 2003;66:261–75.
8 Gerard K, Shanahan M, Louviere J. Using stated preference discrete choice modelling to inform health care decision-making: a pilot study of breast screening population. Appl Econ 2003;35:1073–86.
9 Mark TL, Swait J. Using stated preference modeling to forecast the effect of medication attributes on prescriptions of alcoholism medications. Value Health 2003;6:474–82.
10 Ratcliffe J, Buxton M, McGarry T, Sheldon R, Chancellor J. Patients’ preferences for characteristics associated with treatment for osteoarthritis. Rheumatology 2004;43(3):337–45.
11 Ryan M, Ubach C. Testing for an experience endowment effect in health care. Appl Econ Lett 2003;10:407–10.
12 Salkeld G, Solomon M, Short L, Ryan M, Ward JE. Evidence-based consumer choice: a case study in colorectal cancer screening. Aust NZ J Public Health 2003;27(4):449.
13 Schwappach DLB. Does it matter who you are or what you gain? An experimental study of preferences for resource allocation. Health Econ 2003;12:255–67.


14 Scott A, Watson MS, Ross S. Eliciting preferences of the community for out of hours care provided by general practitioners: a stated preference discrete choice experiment. Soc Sci Med 2003;56:803–14.
15 Sculpher M, Bryan S, Fry P, De Winter P, Payne H, Emberton M. Patients’ preferences for the management of non-metastatic prostate cancer: discrete choice experiment. BMJ 2004;328:382–4.
16 Skjoldborg US, Gyrd-Hansen D. Conjoint analysis. The cost variable: an Achilles’ heel? Health Econ 2003;12:479–91.
17 Taylor SJ, Armour CL. Consumer preference for dinoprostone vaginal gel using stated preference discrete choice modelling. Pharmacoeconomics 2003;21:721–35.
18 Ubach C, Scott A, French F, Awramenko M, Needham G. What do hospital consultants value about their jobs? A discrete choice experiment. BMJ 2003;326:1432–5.
19 Ubach C, Bate A, Ryan M, Porteous T, Bond C, Robertson R. Using discrete choice experiments to evaluate alternative electronic prescribing systems. Int J Pharm Pract 2002;10:191–200.
20 Hanley N, Wright R, Koop G. Modelling recreation demand using choice experiments: rock climbing in Scotland. Environ Res Econ 2002;22:449–66.
21 Hensher D, Stopher P, Louviere J. An exploratory analysis of the effect of the number of choice sets in designed choice experiments: an airline choice application. J Air Transport Manag 2001;2:373–9.
22 Huber J, Zwerina K. The importance of utility balance in efficient choice set designs. J Mark Res 1996;33:307–17.
23 Phillips KA, Maddala T, Johnson FR. Measuring preferences for health care interventions using conjoint analysis: an application to HIV testing. Health Serv Res 2002;37:1681–705.
24 Sloane NJA. A library of orthogonal arrays. www.research.att.com/njas/oadir/ (accessed October 8, 2004).
25 Louviere JJ, Hensher DA, Swait JD. Stated choice methods. Cambridge: Cambridge University Press; 2000.
26 Ryan M. Using conjoint analysis to take account of patient preferences and go beyond health outcomes: an application to in vitro fertilization. Soc Sci Med 1999;48:535–46.
27 Ratcliffe J. Public preferences for the allocation of donor liver grafts for transplantation. Health Econ 2000;9:137–48.
28 Bryan S, Roberts T, Heginbottam C, McCallum A. QALY-maximisation and public preferences: results from a general population survey. Health Econ 2002;11:679–93.
29 Greene WH. Econometric analysis, 5th ed. New Jersey: Prentice Hall; 2003.
30 Ryan M. A comparison of stated preference methods for estimating monetary values. Health Econ 2004;13:291–6.
31 Lancsar E, Savage E. Deriving welfare measures from discrete choice experiments: inconsistency between current methods and random utility and welfare theory. Health Econ 2004;13:901–7.
32 Ryan M. Deriving welfare measures from discrete choice experiments: a comment to Lancsar and Savage (1). Health Econ 2004;13:909–12.
33 Santos Silva JMC. Deriving welfare measures from discrete choice experiments: a comment to Lancsar and Savage (2). Health Econ 2004;13:913–18.
34 Lancsar E, Savage E. Deriving welfare measures from discrete choice experiments: a response to Ryan and Santos Silva. Health Econ 2004;13:919–24.
