DAVID A. AAKER, RICHARD P. BAGOZZI, JAMES M. CARMAN, and JAMES M. MacLACHLAN *
The role of response latency in the measurement of preferences is investigated by means of a causal modeling method. The convergent validity and predictive validity of response latency measures are examined and compared with those of constant sum and paired comparison procedures.
On Using Response Latency to Measure Preference
Marketing researchers now use several measures of "preference," including constant sum scales, lottery choices, purchase intention questions, and coupon redemption (Axelrod 1968; Haley 1970). These measures differ in terms of their reliability, validity, and practicability in a given research situation. Recently it has been suggested that response latency may be useful in measuring preference. Response latency is the length of time taken by a respondent to make a paired comparison choice.
Several researchers have shown that response latency measures strength of preference: the faster a choice is made, the stronger the preference for the selected alternative. The link between response latency and preference has been demonstrated in a variety of contexts. For example, Dashiell (1937) asked respondents to select between colors and Barker (1946) asked children to choose between beverage pairs. In the marketing context, Curry (1975) asked subjects to select between wines, MacLachlan and LaBarbera (1978) asked telephone interview respondents to select preferred television programs, MacLachlan (1977) had subjects choose between pairs of branded grocery products, and Tyebjee (1979) investigated the preferences and choices of respondents for beer.
The efficacy of response latency suggests that it be used in marketing research for several reasons. First, paired comparison preference measurement is frequently employed in part because constant sum scaling tasks are unwieldy in telephone interviewing or when the respondent is interfacing directly with a computer terminal. If paired comparison is used, response latency can be measured at little marginal cost, particularly by the growing number of research firms that are integrating the computer into their surveys (MacLachlan, Czepiel, and LaBarbera 1979). Second, response latency has the desirable property of being unobtrusive, as respondents are usually not aware that their response is being monitored. Third, the combination of response latency and paired comparison is very likely to provide a measurement of preference that is superior to paired comparison by itself. Fourth, response latency can be helpful in construct validation. Because a true test of construct validity requires "maximally different methods" to determine convergent and discriminant validity (Campbell and Fiske 1959), response latency measures provide a needed method. Heretofore, construct validation studies in marketing have relied on similar self-report methods. Finally, by serving as a multiple measure of preferences, response latency can enhance the reliability of measurements.
Two practical questions still need to be answered, however, before response latency will or should be accepted in survey research. First, what is the relative marginal contribution of response latency to the measurement of preference? In particular, is enough substance added to the paired comparison measure to make its inclusion worthwhile? Second, how should the response latency measure be combined with paired comparison and other measures of preference to obtain the best measure of brand preference?
The purpose of this article is to address these two
* David A. Aaker and James M. Carman are Professors, Graduate School of Business Administration, University of California, Berkeley. Richard P. Bagozzi is Associate Professor, Massachusetts Institute of Technology. James M. MacLachlan is Assistant Professor, Graduate School of Business Administration, New York University.
Journal of Marketing Research Vol. XVII (May 1980), 237-44
questions. A multiple indicator/multiple cause model, hereafter referred to as a MIMIC model, is employed. The MIMIC model allows the response latency contribution to be evaluated in the context of a cause-effect structural model. Thus, the predictive (actually retrodictive) validity of response latency in a structural model context is addressed. Previous research has really focused entirely on its convergent validity (i.e., its tendency to be correlated with other brand preference measures). Further, the MIMIC model provides a mechanism to combine the response latency with paired comparison measures and with any other preference measure that has been collected and has theoretical support.
DATA AND METHOD
A laboratory experiment by MacLachlan (1977) provides the data used to address the two questions. Sixty housewives, recruited from a single community, participated in the study. Data gathering began with a pantry audit of brands currently on hand, continued with a weekly panel in which consumers recorded their actual brand purchases over a two-month period, and culminated in an experimental session in the computer-controlled laboratory facility on the Berkeley campus.
On entering the laboratory, participants were assigned randomly to one of two viewing groups. Group 1 was exposed to commercials for Heinz catsup and Ivory soap; Group 2 was exposed to Coca Cola and Zest commercials. In the present investigation, the primary analysis involves the Heinz and Ivory commercials and their respective product classes, catsup and bath soap. Group 2 serves as a control group. The groups viewed a 20-minute television program on how to borrow money. The test commercials were embedded, with four others, in this program sequence in a manner simulating standard TV programming. Participants were paid $10 in cash and were given an additional $3 worth of groceries.
After viewing the program, participants were asked various questions about its content and their opinions on borrowing money—whether such programs were useful, and so on. After this session, each individual was taken to a private cubicle containing a computer terminal, slide projector, and viewing screen. She was given brief instructions on how to operate the terminal, told that pairs of brands would be projected on the screen in front of her, and instructed to push one or the other of two buttons according to which brand she preferred. In each product class, five brands were presented. The paired comparison measure was derived from the choices made by participants during this phase of the experiment by summing the number of times each brand was preferred. Unknown to the participants, the time taken to make a choice in each case was being recorded by the computer clock. These times were used to generate
two measures of response latency, RLC and RLS. Both measures are based on an equation which converts the raw latencies into a preference measure, while controlling for individual variations in response style caused by such extraneous factors as age.
$$RLC_{tc} = \left[ \frac{\overline{RL}}{RL_{tc}} - \frac{\overline{RL}}{5} \right] D$$
where:
$RLC_{tc}$ = the strength of preference, or affective value distance (Dashiell 1937), in a paired comparison of brand t (the test brand) and brand c (the control brand).
$RL_{tc}$ = deliberation time in seconds between the test and control brand. Experimental evidence suggests that deliberations of five seconds or longer are indicative of true indifference (MacLachlan 1977, p. 44). Thus, $RL_{tc}$ is constrained to be less than or equal to five.
$\overline{RL}$ = the participant's mean response latency for a number of paired comparison decisions.
$D$ = plus 1 if t is preferred to c; minus 1 if c is preferred to t.
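The conversion from a raw latency to a signed preference score can be sketched in a few lines of Python. This is only an illustration of the equation above, not the code used in the original experiment; the function name `rlc` and its argument names are hypothetical, and the five-second cap is passed in as a parameter so that the constant in the equation is visible.

```python
def rlc(latency_s, mean_latency_s, test_preferred, cap_s=5.0):
    """Signed preference score: (mean_RL / RL_tc - mean_RL / cap) * D.

    latency_s       -- deliberation time for this paired comparison, in seconds
    mean_latency_s  -- the participant's mean latency over paired comparisons
    test_preferred  -- True if the test brand t was chosen over brand c
    cap_s           -- latency treated as true indifference (five seconds in the text)
    """
    if latency_s <= 0 or mean_latency_s <= 0:
        raise ValueError("latencies must be positive")
    rl_tc = min(latency_s, cap_s)      # RL_tc is constrained to be at most the cap
    d = 1 if test_preferred else -1    # D carries the direction of the choice
    return (mean_latency_s / rl_tc - mean_latency_s / cap_s) * d


# Hypothetical participant whose mean latency is 2 seconds.
fast_choice_for_test = rlc(1.0, 2.0, test_preferred=True)      # strong preference for t
slow_choice_for_test = rlc(5.0, 2.0, test_preferred=True)      # at the cap: indifference
fast_choice_for_control = rlc(1.0, 2.0, test_preferred=False)  # strong preference for c
```

Dividing by the participant's own mean latency is what controls for individual response style: a habitually slow respondent and a habitually fast one receive the same score when each answers at the same fraction of his or her usual speed relative to the cap.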
The RLC measure is $RLC_{tc}$ for the case in which the comparison brand is a particular major national brand used as an anchor. The test brands were Heinz catsup and Ivory soap; the anchors were Del Monte catsup and Zest soap, respectively. The RLS measure is the sum of the $RLC_{tc}$ values over all four comparison brands. RLS thus measures the distance between the test brand and all the other brands in the product class. In contrast, RLC measures the distance of the test brand from the particular anchor brand. In general, the anchor brand could be any one brand in the product space from which the advertiser is attempting to move his or her brand. It was not obvious in these situations whether the advertisement would change brand preference over all brands to the same extent. Thus, both measures were used.
After this exercise, participants completed a conventional constant sum procedure. They were given five envelopes, each identified as a particular brand, and 11 cards. They were told to "please divide the 11 cards among the five envelopes, based on how likely it is you would buy the brand. You may give a brand all of the cards, some of the cards, or none of the cards." The procedure provided the CS measure discussed here. MacLachlan (1977) gives additional details of the experiment.
RESULTS
The model used to analyze the relative usefulness of response latency to measure brand preference is shown in Figure 1. Brand preference in this case is not directly observed or measured. It is a latent unobservable variable that has operational implications for relationships among observed variables. There are two sets of observed variables. One set contains antecedent variables of brand preference: past purchases (PP) and advertising (S). Their relationship to brand preference is hypothesized to be linear and additive. Past purchases are measured as a percentage of purchases of the product category going to the test brand over the past six weeks, as recorded in the purchase diary by the respondent. The advertising stimulus is a dummy variable, depending on whether the respondent was in the test or control group. The coefficients for the predictors are $\beta_1$ and $\beta_2$. The link between S and PP shows the collinearity between the two predictors. Because they are assumed to be measured without error, to hypothesize them to be orthogonal would unduly constrain the model and would bias $\beta_1$ and $\beta_2$.
Figure 1 THE HYPOTHESIZED MODEL
The second set of observed variables contains the indicators of brand preference: the paired comparison score (PC), the constant sum score (CS), and the response latencies (RLC or RLS). Separate runs were made using each of the latency measures. Measurement error is hypothesized in all indicator variables.¹ The coefficients for the indicator variables are denoted by $\lambda_1$, $\lambda_2$, and $\lambda_3$. They indicate the relative validity of the three measures of preference and can provide the basis for combining them into a single preference measure. Algebraically, the model involves the following equations.²
$$Y^* = \beta_1 PP + \beta_2 S + \varepsilon$$
$$PC = \lambda_1 Y^* + u_1$$
$$RLC = \lambda_2 Y^* + u_2$$
$$CS = \lambda_3 Y^* + u_3$$
It is now apparent that this is a MIMIC model with the error terms of two indicators allowed to be correlated. For a general background on this and other causal models involving unobservable variables, see Aaker and Bagozzi (1979), Bagozzi (1980), Blalock (1969), Costner (1971), Duncan (1975), and Goldberger and Duncan (1973). Maximum likelihood estimates of the parameters of MIMIC models can be obtained by using the LISREL algorithm (Jöreskog and van Thillo 1972).
The results of the test of this model are shown in Table 1. The test hypothesis employs the likelihood ratio statistic that is distributed as chi square. The hypothesis is that the structure and restrictions on the model are valid. High chi square values (low p-values) are evidence that the model does not fit the data at hand. Thus, one seeks a large probability in the chi square distribution to the right of the calculated value. For only one model, RLS for soap, is the probability not greater than 0.10. The other three models receive empirical support from this test.
The explanation for the poor fit of RLS for soap probably lies in the inappropriateness of that measure for the soap test. The catsup control group saw no other catsup commercial. Therefore, the stimulus is only the Heinz commercial. The soap control group, however, saw a Zest commercial. This commercial had little impact on the relative position of Zest, Ivory, or Dial, but did improve the strength of preference
¹ Actually, a more parsimonious model that did not hypothesize a correlation between $u_1$ and $u_2$ was tested first. In causal modeling, it is desirable to test simple structures and reject them before testing more complex models. Because the response latency was measured during the selection of the paired comparison preferences, both are subject to the same distractions and extraneous variation present at the moment of measurement. Thus, it is reasonable to expect the covariance between $u_1$ and $u_2$ to be nonzero.
² The treatment of error terms in this class of models is somewhat restrictive. In this case, there will be an indeterminacy in the model unless the variance of $\varepsilon$ is arbitrarily fixed. The logical choices are 0 and 1. Because the cause variables are assumed to be error-free, it is preferable to assume that the unobservable construct ($Y^*$) is subject to stochastic disturbance. The assumptions are that $E(\varepsilon) = 0$ and $E(\varepsilon^2) = 1$. This value of the variance acts as a scaling factor on the magnitude of the causal and indicator coefficients.
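The likelihood ratio test described in the text asks for the probability to the right of the calculated chi square value. A minimal Python sketch of that calculation follows. Two things in it are assumptions rather than statements from the article: the degrees-of-freedom count (observed variances and covariances minus free parameters, with the disturbance variance fixed) is a standard counting rule applied here to the model as described, and the statistic passed in at the end is a hypothetical value, not a figure from Table 1. The function names are illustrative.

```python
import math


def mimic_degrees_of_freedom(n_causes=2, n_indicators=3, n_error_covariances=1):
    """Distinct observed moments minus free parameters (disturbance variance fixed)."""
    n_observed = n_causes + n_indicators
    moments = n_observed * (n_observed + 1) // 2
    free_parameters = (
        n_causes                           # cause coefficients (beta)
        + n_indicators                     # indicator coefficients (lambda)
        + n_indicators                     # indicator error variances
        + n_error_covariances              # correlated indicator errors
        + n_causes * (n_causes + 1) // 2   # variances and covariance of the causes
    )
    return moments - free_parameters


def chi2_upper_tail(x, df):
    """P(X > x) for a chi-square variate with integer df >= 1."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    half_x = x / 2.0
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(half_x))   # closed form for df = 1
    else:
        k, q = 2, math.exp(-half_x)              # closed form for df = 2
    # Recurrence: Q(x | k + 2) = Q(x | k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += half_x ** (k / 2.0) * math.exp(-half_x) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


df = mimic_degrees_of_freedom()
hypothetical_statistic = 2.5   # illustrative value only, not taken from Table 1
p_value = chi2_upper_tail(hypothetical_statistic, df)
model_supported = p_value > 0.10   # the cutoff used in the text
```

Note that the logic is the reverse of the usual significance test: here a large p-value is the desired outcome, because the hypothesis being tested is that the model's structure and restrictions are valid.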
Table 1 RESULTS FOR THE HYPOTHESIZED MODEL
Bath soap
Catsup Statistic
$\lambda_1$(PC)
p-value
RLS
RLC
-.355(.139)' .746(.162) .432(.147) .580(.141)
-.380(.146) .759(.176) .431(.149)
.689(.167) .416 1.11 .78 .80 .36 .29
.459(.148) .673(.190) .170 3.21 .37 .77 .37 .28
RLS
RLC
.876