
Word count: 12,470

Integrating Models of Innovation Adoption: Social Network Thresholds, Utility Maximization, and Hazard Models

Christophe Van den Bulte University of Pennsylvania

Gary L. Lilien The Pennsylvania State University

October 1999

Acknowledgements We benefited from comments by Wayne Baker, Hans Baumgartner, Albert Bemmaor, the late Clifford Clogg, Jehoshua Eliashberg, David Krackhardt, Keith Ord, Arvind Rangaswamy, David Schmittlein, David Strang, Thomas Valente, and audience members at the 1999 INSNA Sunbelt Conference, the 1994 and 1998 INFORMS Marketing Science Conferences, the Australian Graduate School of Management, Carnegie Mellon, Columbia, Cornell, Duke, Harvard, KU Leuven, Michigan, Northwestern, Penn State, Stanford, UNC Chapel Hill, UT Austin, and Wharton. We thank Thomas Valente for providing us with the Medical Innovation data set. Financial support from Penn State's Institute for the Study of Business Markets and the Richard D. Irwin Foundation is gratefully acknowledged. Correspondence address: Christophe Van den Bulte, The Wharton School, University of Pennsylvania, 3620 Locust Walk, Philadelphia, PA 19104-6371. Tel: 215-898-653; fax: 215-898-2534; e-mail: [email protected].


Integrating Models of Innovation Adoption: Social Contagion, Utility Maximization, and Hazard Models

Abstract

This paper shows how two different theoretical approaches to modeling innovation adoption, network threshold modeling and random utility modeling, both result in discrete-time hazard models, a popular statistical approach to studying innovation adoption and diffusion. The paper emphasizes the need for careful situational analysis prior to model building and illustrates the point with the diffusion of tetracycline, the drug analyzed by Coleman, Katz and Menzel in Medical Innovation. The situational analysis identifies marketing effort rather than social contagion as a likely driver of adoption. A new statistical analysis of the Medical Innovation data set, expanded with newly collected advertising data, supports this alternative explanation for the well-known Medical Innovation results.


INTRODUCTION Empirical tests of social theories often proceed along the following lines. First, present the theory, either verbally or mathematically. Then, deduce testable hypotheses about associations between certain variables that operationalize crucial concepts in the theory. Finally, test the hypotheses empirically within a statistical framework. A statistical model (e.g., linear regression, log-linear modeling, …) allows the processes generating the data to show some random error, or lack of fit, to the proposed theory, but still enables one to compare the observed data with the predictions deduced from the theory to determine whether the theory should be rejected (Snijders 1996). Models used in quantitative empirical research generally consist of three components:

• The general structure or dynamic of the process. For innovation adoption, for instance, the process is the transition from being a non-adopter to being an adopter.

• The mean structure, i.e. the set of differentiating elements used to explain differences in outcomes. For innovation adoption, this structure is the set of explanatory variables posited by the substantive theory to cause whether and when someone adopts.

• The error structure reflects that social action cannot be perfectly predicted either because of the stochastic nature of decision processes or because of measurement and specification errors. The error structure also defines the statistical model.

In most cases, the error component (defining the statistical model) is simply added to the mean structure (derived from the theoretical model). Applications of linear regression and path analysis are an example. In some circumstances, however, it is possible to integrate the statistical model with the posited theory. Such integration requires that one develop theoretical models that contain a stochastic, or random, element (Skvoretz 1998; Snijders 1996). In such probabilistic models of social phenomena, not only the general functional form and set of covariates but also


the statistical model itself is a direct expression of the social theory. More tightly integrating substantive theory and statistical models forces one to develop one’s theory more explicitly and in a way that permits more direct testing (Collins 1988; Snijders 1996). This paper integrates three streams of thought in studies of why and when actors adopt innovations: (1) network threshold modeling (e.g., Granovetter 1978; Valente 1996), (2) random utility modeling, a variant of rational choice modeling (e.g., Ben-Akiva and Lerman 1985), and (3) hazard modeling, a set of techniques used in many disciplines to empirically study adoption behavior (e.g., Strang and Tuma 1993). The paper proceeds as follows. We first describe the general structure of the process of adopting an innovation as a discrete-time Markov process, and note that the hazard of adoption can be modeled statistically by means of binary dependent variable (BDV) models (Allison 1982). Next, we show how such statistical BDV models conform to social network threshold theory, when thresholds are allowed to be stochastic rather than deterministic. We then show that the same discrete-time hazard models can be derived from random utility theory. This equivalence shows that innovation adoption models based on stochastic social network thresholds have the same structure as random utility models of adoption, and raises the issue of rational action within social network threshold theory (Granovetter 1978; Schelling 1978), which we briefly discuss. Next, echoing Coleman (1990) and Popper (1963 [1994]), we emphasize that what constitutes a “rational choice” can only be framed in the context of careful situational analyses. We present such an analysis for the adoption of tetracycline, the drug studied in Medical Innovation (Coleman et al. 1966), and suggest that it was marketing effort rather than social contagion that increased physicians’ hazard of adoption over time. 
We empirically support this conjecture with a statistical analysis of the Medical Innovation data, expanded with newly collected advertising data. We conclude with a


discussion of our modeling approach and empirical findings.

DISCRETE-TIME HAZARD MODELS We represent an actor’s adoption as a transition in a two-state, discrete-time semi-Markov process. We use state 0 to denote "not having adopted" and state 1 for "having adopted." We assume that people do not disadopt, i.e., state 1 is an absorbing state. We represent the transition probability of actor i at time t as:

[1]  $P_{01it} \equiv P(y_{i,t} = 1 \mid y_{i,t-1} = 0) = G(\alpha' x_{it} + \beta' z_{it})$, where

$P_{01it}$ = probability of i transiting to the absorbing “having adopted” state at time t;

$y_{i,t}$ = 1 if i has adopted by time t, and $y_{i,t}$ = 0 otherwise;

$x_{it}$ = vector of variables affecting i’s decision to adopt, irrespective of any influence from the social environment. Apart from an intercept, the vector can include (1) personal characteristics, (2) characteristics of the innovation, such as performance relative to the currently used alternative, compatibility with current values and beliefs, and trialability, (3) contextual covariates such as the economic business cycle or disease incidence, and (4) marketing variables such as price and advertising;

$z_{it}$ = vector of variables representing the influence from the social environment on i;

$\alpha, \beta$ = vectors of parameters to be estimated;

$G$ = a cumulative distribution function.

Analyzing adoption behavior as this Markov process has the appealing property that the hazard rate of adoption, i.e. the limiting probability that one adopts at time t given that one has not adopted yet, equals the transition probability $P_{01it}$. As a result, discrete-time hazard models can be estimated using binary dependent variable (BDV) models such as binary logit and probit (Allison 1982).
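To make the link between the hazard in [1] and a BDV model concrete, the following Python sketch expands an actor's adoption history into the person-period records on which a binary logit or probit would be estimated, and evaluates the transition probability for a logistic G. The function names, the choice of a logistic G, and all numerical values are illustrative assumptions and are not taken from the analysis reported here.

```python
import math

def logistic_cdf(v):
    """One possible choice of G: the standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-v))

def person_period_rows(adoption_time, last_period):
    """Expand one actor's history into (t, y) rows for a BDV model.

    adoption_time: period in which the actor adopted, or None if the
    actor had not adopted by last_period (right-censored).
    The actor contributes one row per period at risk; y = 1 only in the
    adoption period, after which the actor leaves the risk set.
    """
    end = last_period if adoption_time is None else adoption_time
    return [(t, 1 if t == adoption_time else 0) for t in range(1, end + 1)]

def hazard(alpha, x, beta, z):
    """Transition probability G(alpha'x + beta'z) with a logistic G."""
    index = sum(a * v for a, v in zip(alpha, x))
    index += sum(b * v for b, v in zip(beta, z))
    return logistic_cdf(index)

# Hypothetical actors: one adopts in period 3, one never adopts by period 5.
rows_adopter = person_period_rows(adoption_time=3, last_period=5)
rows_censored = person_period_rows(adoption_time=None, last_period=5)
print(rows_adopter)
print(rows_censored)

# Hypothetical parameters: intercept plus one covariate in x, one social term in z.
print(hazard(alpha=[-2.0, 0.5], x=[1.0, 1.0], beta=[1.5], z=[1.0]))
```

Because state 1 is absorbing, an adopter contributes no rows after the adoption period, which is what allows the stacked rows to be treated as ordinary binary observations.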


In applied research there is often a somewhat dubious link between the statistical model and the posited substantive theory that it is intended to represent. An attractive feature of the BDV model is that it conforms to social network threshold theory, as we show next.

RANDOM NETWORK THRESHOLD MODELS Threshold models of collective behavior (Granovetter 1978; Granovetter and Soong 1983; Schelling 1978; Valente 1996) hinge on three postulates. First, they assume that actors engage in a type of behavior if and only if the benefits of doing so outweigh the costs. Second, they assume that the costs and benefits of engaging in a type of behavior depend on how many other actors are already doing so. Third, they assume that the point where net benefits begin to exceed net costs varies across actors. That is, actors vary in the number or proportion of relevant others who must have engaged in the behavior before they do so themselves. So, actors have different resistance levels or thresholds. Previous analyses of network threshold theory have typically assumed that thresholds are constant over time. Granovetter (1978), however, noted that thresholds may well vary over time. Actors’ thresholds result from the configuration of costs and benefits to them in one particular situation, and these costs and benefits may well change. Lower prices or the publication of a Consumer Reports documenting high product quality, for instance, should reduce people’s reluctance to buy a new product, lowering their threshold. Granovetter also mentioned random measurement error as a source of variation: someone with a threshold of 17% may be unable to distinguish among values between 15% and 20%. Similarly, Erikson (1998, p. 90) suggested that “individual thresholds could perhaps best be regarded as random variables,” implying that “the model should be stochastic with situational and individual factors as predictors and with a random error term.” The remainder of this section shows how such stochastic network threshold models can be cast in a binary dependent variable framework.


Let actor i’s time-varying threshold value ξit be the sum of two components: a deterministic part θit capturing possibly time-varying situational and individual factors, and a stochastic term εit. Further, let the extent of social influence that i is subject to at time t be a function of whether other social actors j have adopted previously (indicated by yj,t-1) and how important each of these alters j is to i (indicated by the social weight wij). The extent of social network exposure that i is experiencing can then be expressed parsimoniously as a lagged network autocorrelation term Σj wij yj,t-1 (e.g., Marsden and Podolny 1990; Strang 1991). Social network threshold theory posits that actor i will adopt if and only if βΣj wij yj,t-1 ≥ θit + εit, where we use the coefficient β to quantify the size of the social network exposure effect.1 Even though the researcher does not know the exposure and threshold values with certainty, the probability of adoption is equal to the probability that the social network exposure is greater than the threshold (Agresti 1990, p. 103). Hence we have:

[2a]  $P(y_{i,t} = 1 \mid y_{i,t-1} = 0) = P(\beta \sum_j w_{ij} y_{j,t-1} \geq \theta_{it} + \varepsilon_{it})$

      $= P(\beta \sum_j w_{ij} y_{j,t-1} - \theta_{it} \geq \varepsilon_{it})$

[2b]  $= G(\mu \beta \sum_j w_{ij} y_{j,t-1} - \mu \theta_{it})$

where G is the distribution function of εit and µ is a positive scale parameter related to the variance of εit. For instance, if εit is normally distributed, then G is the normal distribution function and we have a binary probit model. If εit is logistically distributed, we have a binary logit model. Note that the parameter µ cannot be distinguished from the overall scale of β or θ. For convenience, one can make the arbitrary assumption that µ = 1, which corresponds to assuming the variance of the random term to be 1 in the probit model and π²/3 in the logistic model (Agresti 1990; Ben-Akiva and Lerman 1985).

1. The size of β indicates how much a unit increase in network exposure brings an actor closer to his or her threshold.

Researchers often have specific hypotheses about what factors generate differences in actors’ threshold levels. These factors may include demographics, education, income, risk aversion, and exposure to mass media and marketing influences (Rogers 1995). If one expresses the nonstochastic component of actors’ threshold as a linear function of these variables collected in a vector xit, so that θit = -α´xit, one obtains:

[3]  $P(y_{i,t} = 1 \mid y_{i,t-1} = 0) = G(\beta \sum_j w_{ij} y_{j,t-1} + \alpha' x_{it})$,

which is equivalent to the basic expression for many hazard rate models for analyzing social contagion dynamics in innovation adoption (e.g., Strang 1991).2 Figure 1 depicts the relationships between random social network threshold theory, the ensuing theoretical BDV models of adoption decisions, adoption data, and discrete-time hazard models often used to describe such data. Note that our modeling strategy does not require individual thresholds to be observed directly (which obviously they cannot, as Erikson (1998) notes). Instead, this approach exploits threshold theory’s postulate that adoption behavior results from the configuration of costs and benefits in a particular situation and provides a mechanism for one to recover the extent to which social network exposure and other covariates are associated with the time of adoption from the data. Note that the modeling strategy does not impose a particular distribution on the deterministic threshold components θit, only on the random shocks εit. [ Figure 1 about here ]

2. Note that the scaling of µ will affect the values of the parameter estimates, but not the relative size of the model parameters, αm/αn (m≠n) and β/αm. Note also that since the scaling is arbitrary, imposing µβ = 1 does not constitute a more stringent test of network threshold theory than having β vary freely.
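The stochastic threshold rule of this section can be checked numerically. The Python sketch below computes the lagged network exposure Σj wij yj,t-1 for one actor, draws logistic threshold shocks, and compares the simulated share of draws in which exposure clears the threshold with the logit probability implied by the threshold model when µ = 1. The weights, the values of β and θ, and the function names are hypothetical and serve only as an illustration.

```python
import math
import random

def exposure(weights_i, adopted_prev):
    """Lagged network exposure: sum over j of w_ij * y_{j,t-1}."""
    return sum(w * y for w, y in zip(weights_i, adopted_prev))

def logistic_cdf(v):
    """Standard logistic CDF, the G implied by logistic shocks with mu = 1."""
    return 1.0 / (1.0 + math.exp(-v))

def draw_logistic(rng):
    """Draw a standard logistic shock by inverting its CDF."""
    u = rng.random()
    while u == 0.0:          # avoid log(0); random() never returns 1.0
        u = rng.random()
    return math.log(u / (1.0 - u))

def simulate_adoption_rate(beta, expo, theta, n_draws, seed=0):
    """Share of draws in which beta * exposure >= theta + epsilon."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        if beta * expo >= theta + draw_logistic(rng):
            hits += 1
    return hits / n_draws

w_i = [0.5, 0.3, 0.2]    # hypothetical social weights of three alters
y_prev = [1, 0, 1]       # which alters had adopted by t-1
expo = exposure(w_i, y_prev)
beta, theta = 2.0, 1.0   # hypothetical exposure effect and deterministic threshold

analytic = logistic_cdf(beta * expo - theta)
simulated = simulate_adoption_rate(beta, expo, theta, n_draws=200_000)
print(round(expo, 3), round(analytic, 4), round(simulated, 4))
```

The two probabilities should agree up to simulation noise, which is the sense in which the threshold rule and the binary logit describe the same adoption process.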

RANDOM UTILITY MODELS Binary dependent variable models conform to the theory of random utility, in which observed discrete choices are the result of optimizing behavior (Ben-Akiva and Lerman 1985). The derivations are very similar to those for threshold models. Let the decision maker have two alternatives to choose from, both of which are available, affordable, and known during the decision process. Let Uist denote the utility expected by actor i from the s-th alternative at time t (s = 0, 1). Let Uist = Vist + εist, where Vist is a deterministic component, and εist is a random component independently distributed over i and t. We assume that the deterministic component associated with each alternative (i.e., adopting the innovation vs. sticking with the current alternative) is a function of how actor i perceives the innovation characteristics (e.g., price, previous adoption by alters) collected in the vector pist, and of actor i’s characteristics (e.g., income, risk aversion, social status) collected in the vector cit. Then, assuming a linear function, we have:

[4a]  $U_{i0t} = \alpha_0 + \gamma' p_{i0t} + \delta_0' c_{it} + \varepsilon_{i0t}$  (Status quo utility)

[4b]  $U_{i1t} = \alpha_1 + \gamma' p_{i1t} + \delta_1' c_{it} + \varepsilon_{i1t}$  (Adoption utility),

where γ is a vector of the utility weight of each innovation characteristic, and δs is a vector indicating how the utilities vary across actors as a function of their characteristics. Highly educated actors may value a new social practice or product more than less educated actors do, for example. The random utility model is then defined by the following choice criterion: actor i adopts the innovation s = 1 when Ui1t ≥ Ui0t. Even though these utilities are not known to the researcher with certainty, the probability of choosing one alternative s is equal to the probability that the utility of that alternative is greater than or equal to the utility of the other alternative (Ben-Akiva and Lerman 1985). Hence we have:

[5]  $P(y_{i,t} = 1 \mid y_{i,t-1} = 0) = P(U_{i1t} \geq U_{i0t})$

      $= P(V_{i1t} - V_{i0t} \geq \varepsilon_{i0t} - \varepsilon_{i1t})$

      $= G(\mu V_{i1t} - \mu V_{i0t})$

      $= G(\alpha_1 - \alpha_0 + \gamma'(p_{i1t} - p_{i0t}) + (\delta_1 - \delta_0)' c_{it})$,

where G is the distribution function of εi0t - εi1t and µ = 1 is assumed without loss of generality.3 This establishes that the discrete-time hazard of adoption conforms to random utility theory. One can organize the explanatory variables in the same way we did for social network threshold models, i.e., represent the difference in utilities between the innovation and the alternative as Vi1t - Vi0t = α´xit + β´zit, and write:

[6]  $P(y_{i,t} = 1 \mid y_{i,t-1} = 0) = G(\alpha' x_{it} + \beta' z_{it})$.

Theories of rational choice, including random utility theory, rarely consider social network effects. Still, endogenous feedback and network effects can easily be incorporated as arguments in the utility functions Vist. For instance, we can operationalize pi1t as (1/2) Σj wij yj,t-1 and pi0t as (-1/2) Σj wij yj,t-1, resulting in zit = Σj wij yj,t-1, to reflect a situation where the perceived utility of a choice alternative increases linearly with the number of socially relevant alters who have chosen it previously (e.g., Burt 1982). Becker (1996) defends such extensions based on social capital arguments. For many research purposes, including innovation adoption, social capital can be conceived of and operationalized as resources, including information, accessible through one’s networks (e.g., Borgatti et al. 1998; Lin 1999). If actors use the adoption by others, i.e. yj,t-1, as an informative signal about the benefits of the innovation, then it is theoretically justifiable to include lagged network autocorrelation terms like Σj wij yj,t-1 in the utility functions. This rationale is not limited to information transfer among socially cohesive actors, but extends to signals of competitive threat among equivalent actors as well (cf. Burt 1987).

3. If εi0t and εi1t are normally distributed, independently across i and t (but not necessarily s), so is their difference, resulting in a probit model. If εi0t and εi1t are distributed independently across i, t and s according to the Type I extreme value distribution, their difference is logistically distributed, resulting in the logit model. Finally, if εi0t and εi1t are distributed independently across i, t and s over the unit interval [0, 1], the linear probability model results.
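The random utility derivation, with network exposure entering as an argument of the utility functions, can be illustrated with a small simulation. The Python sketch below draws independent Type I extreme value errors for the two alternatives, applies the choice criterion Ui1t ≥ Ui0t, and compares the simulated adoption share with the binary logit probability G(Vi1t - Vi0t). It follows the operationalization of pi1t and pi0t as plus and minus one half of the exposure term; the parameter values and function names are assumptions made only for this example.

```python
import math
import random

def draw_gumbel(rng):
    """Draw from the standard Type I extreme value (Gumbel) distribution."""
    u = rng.random()
    while u == 0.0:          # avoid log(0); random() never returns 1.0
        u = rng.random()
    return -math.log(-math.log(u))

def utilities(alpha0, alpha1, gamma, exposure):
    """Deterministic utilities with p_i1t = exposure/2 and p_i0t = -exposure/2."""
    v0 = alpha0 + gamma * (-0.5 * exposure)
    v1 = alpha1 + gamma * (0.5 * exposure)
    return v0, v1

def simulated_adoption_share(v0, v1, n_draws, seed=1):
    """Share of draws in which U_1 = V_1 + e_1 is at least U_0 = V_0 + e_0."""
    rng = random.Random(seed)
    adopt = 0
    for _ in range(n_draws):
        if v1 + draw_gumbel(rng) >= v0 + draw_gumbel(rng):
            adopt += 1
    return adopt / n_draws

def logit_probability(v0, v1):
    """Binary logit probability G(V_1 - V_0) with the scale mu fixed at 1."""
    return 1.0 / (1.0 + math.exp(-(v1 - v0)))

# Hypothetical parameter values, chosen only for illustration.
v0, v1 = utilities(alpha0=0.0, alpha1=-1.0, gamma=2.0, exposure=0.7)
simulated = simulated_adoption_share(v0, v1, n_draws=200_000)
analytic = logit_probability(v0, v1)
print(round(analytic, 4), round(simulated, 4))
```

With these values the utility difference is (α1 - α0) + γ times the exposure, so the exposure term plays the same role here as βΣj wij yj,t-1 does in the threshold formulation.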

PURPOSIVE ACTION AND RATIONAL CHOICE IN SOCIAL CONTAGION THEORIES OF INNOVATION ADOPTION We have shown how network threshold theory, when applied to innovation adoption and extended to a stochastic form, can be operationalized in the form of binary dependent variable models such as logit or probit of the form $P(y_{i,t} = 1 \mid y_{i,t-1} = 0) = G(\beta \sum_j w_{ij} y_{j,t-1} + \alpha' x_{it})$. We have also shown how random utility models can result in the same specification when lagged network autocorrelation terms are included as arguments in the utility function. Hence, both theories have the same structure, with the common principles being (1) utility thresholds to action and (2) purposive action. Stochastic social network threshold models can be interpreted as a form of random utility models. This result is hardly surprising, since both underlying theories are couched in terms of costs, benefits, and (net) utility. As Granovetter (1978, p. 1422) stated, actors in threshold models “are assumed rational—that is, given their goals and preferences, and their perception of their situations, they act so as to maximize their utility.” As the concepts of utility and rationality may have “connotations that many sociologists find unpalatable as a basic characterization of behavior” (Logan 1996, p. 156), some discussion of these issues is warranted. We highlight three issues: (1) the utility concept used here is minimal and only implies consistent preferences, (2) social influence is accounted for by changes in the arguments of the utility function rather than by changes of its parameters, and


(3) random utility and random thresholds are methodological principles aiding model construction, not substantive theory concepts.

Preference consistency Utility, as we use it here, is a methodological concept, an index function representing actors’ preferences and rules of consistency used in making choices (Coleman 1990; Luce and Raiffa 1957). Using BDV models requires only modest behavioral assumptions about the decision makers. The one key assumption is that preferences are consistent (Ben-Akiva and Lerman 1985). That is, when confronted with the same set of alternatives with identical characteristics on separate but otherwise identical occasions, actors will make the same choice. This notion of consistency does not, however, imply that actors cannot make apparently inconsistent choices, since the model assumes that some characteristics of the innovation, the actor, or the situation affecting the actor’s evaluation are unknown to the analyst and are represented in the error terms (Logan 1996; Popper 1963 [1994]). The utility-based models we presented connect choice to preferences by assuming consistency of preferences, but say nothing more about the nature of these preferences. The models do not assume that complete information is available on all innovation characteristics, that utility is the same as material wealth, or that actors are egoistically self-interested and forward-looking. While the models are specified at the level of the individual and represent individual purposive action, the individualism is methodological and structural rather than atomistic (Burt 1982; Hechter 1983). Even though the modeling approach assumes that understanding innovation adoption starts with the intentions and consequences of individuals’ action, the relative importance of individual attributes and social context is left as an empirical issue (cf. Granovetter 1978). Whether and how the importance of social contagion is


contingent on the broader social structure and the status specific actors occupy in that structure (Burt 1982, p. 8; Burt and Janicik 1996), can also be assessed empirically.

Changes in preferences We represent social influence on the preference among choice alternatives as changes in the arguments of a stable utility function (Becker 1996; Burt 1982; Stigler and Becker 1977). We do not represent social influence as shifts in tastes (DiMaggio 1994; Pollak and Watkins 1993), i.e., as changes of parameter values in the utility function. Note, though, that changes in social network exposure Σj wij yj,t-1 can be interpreted as shifts in the intercept parameter α0 capturing actors’ taste elements that are not explicitly represented through covariates. Hence, the change-in-utility-arguments approach does not entirely conflict with the change-in-tastes approach. While some researchers may prefer a fuller treatment of taste shifts rather than argument shifts, assuming invariant utility parameters enables one to statistically identify and estimate one’s models (Logan 1996) and provides “a way to use the important sociological insight on the social origin of preferences without thereby throwing out choice out of the analysis” (Lindenberg 1992, p. 10).

The need for substantive theory As Luce and Raiffa (1957) and Popper (1963 [1994]) emphasize, utility maximization is a substantively vacuous concept in the sense that answering the question “Why did this actor do that?” by “Because it maximized her utility” is a tautology. The concept of utility we use here is so minimalist that the resulting logit and probit models are theoretically underdetermined. To offer a real explanation, one must further specify the preferences, beliefs and constraints


affecting the actors’ valuation. That is, a real explanation requires specifying what the arguments of the utility function are. Only then can rational action be “its own explanation,” i.e., a conception of action “that we need ask no more questions about” (Boudon 1998, p. 177, quoting Martin Hollis and James S. Coleman, respectively). Utility theory can be used to explain choices on the basis of actors’ preferences, beliefs and constraints, but it is those reasons, and not utility theory, that do the explanatory work. Useful theory building requires (re-)constructing these reasons (Boudon 1998; Hausman 1992; Marini 1992).4 Different theories of innovation adoption suggest different elements driving actors’ assessments. Neo-classical economics adds consumerism to the assumption of rationality, i.e., it further imposes “that individuals are self-interested, that the sole objects of their preferences are commodity bundles, and that individual utility functions are independent” (Hausman 1992, p. 92).5 Theories of social learning under risk aversion suggest that others’ behavior affects actors’ beliefs and perceived constraints. Some of these theories, such as Bayesian decision theory, posit very specific ways to construct the covariates of interest. Theories of competition for scarce resources emphasize the threat by competitors and posit that social contagion stems from equivalently positioned actors, not from peers one is exchanging information with. Attitude theories, including the theory of reasoned action, posit that peer evaluation of the act of adopting is important. Neo-institutional accounts of normative legitimation processes are similar in that respect. We emphasize that random utility and random thresholds are powerful concepts linking the methodological principle of rationality and purposive action to statistical models, but that they do not in themselves suggest any substantive social mechanism. What social (network) theory adds is a set of substantive ideas about what drives the utility of actors facing the decision to adopt. Social theory suggests what variables enter as arguments in the utility function, i.e. as covariates in the resulting probability model. Social networks and rationality are not antithetical alternatives in the realm of innovation adoption, but rather, are mutually enriching concepts one can use to gain better understanding. Probabilistic models of purposive action represent how actors, given their values and beliefs, make choices. Such models can be useful in understanding and predicting innovation adoption only when applied in conjunction with knowledge or well-reasoned conjectures about (1) what actors value and (2) what choice alternatives they are aware of and perceive to be available. It is an important task of sociology, cultural studies and (social) psychology to provide that knowledge (Douglas and Isherwood 1979; Marini 1992).

4. “If people are assumed to choose what they value, and if what they value is revealed only by what they choose, a theory of purposive action is inherently tautological. It is a partial theory that cannot predict. The theory becomes useful only when motivational assumptions are made about what people value” (Marini 1992, p. 29).

5. As Hausman (1992) notes, it is this additional assumption of consumerism that distinguishes the undersocialized “economic man” from “rational man.”

THE NEED FOR SITUATIONAL ANALYSIS Applying models of purposive action requires a careful analysis of the situation from the actor’s point of view, because assuming utility maximization is a methodological principle, and hence not subjected to empirical testing. Empirical refutation of a model of purposive action means that the researcher did not appropriately represent the actors’ choice situation (i.e., preferences and constraints), not that the rationality principle—held as a point of method, not of fact—was refuted (Popper 1963 [1994]). Rosenberg (1980, p. 82) puts it quite clearly: If we assume that the behavior of a system tends to maximize the value of some variable, but our observations are at odds with the predictions of the theory, then “we never infer that the system is failing to maximize the variable in question, but assume that our specification of the constraints under which it is operating is incomplete.” Coleman (1990, p. 18) similarly posits


that “much of what is ordinarily described as nonrational or irrational is merely so because the observers have not discovered the point of view of the actor, from which the action is rational.” As this last quote indicates and Popper emphasizes (1963 [1994]), researchers assuming utility maximization must take great care that their model appropriately represents the choice situation, i.e. the actors’ preferences and the constraints under which they are operating. Diffusion researchers increasingly recognize the need for such careful situational analyses. Despite the widespread acceptance of the role of social contagion in innovation diffusion, some research has challenged its apparent empirical support, showing that S-shaped diffusion curves—often interpreted as evidence of social contagion—can be the result of population heterogeneity rather than contagion. For instance, when a product’s price decreases linearly over time and reservation prices are normally distributed over the population, the diffusion curve will be the normal cumulative density function (Thirtle and Ruttan 1987). The logistic model has also been shown to be formally indistinguishable from patterns that arise in the absence of any social contagion (Bonus 1973). Heterogeneity in thresholds combined with some systematic change in environmental variables can be enough to generate S-shaped diffusion curves. Improving economic conditions or changes in supply-side elements such as declining prices, increasing distribution coverage, and increasing advertising and sales efforts, are obvious possibilities (cf. Brown 1981). These analytical results support concerns voiced by England (1998) and Haunschild and Miner (1997) that the well-documented positive relationship between the prevalence of prior adoption among one’s network alters and the likelihood of one’s own adoption—often interpreted as evidence of social contagion—is often produced by factors


growing over time but excluded from the model.6 Along similar lines, Stinchcombe (1997, p. 6) has criticized neo-institutional research—an area in which contagion is a central concept and topic of investigation—for “ignoring the work of people who put the detail into institutions and who constrain people and organizations to conform to institution’s exteriority. … [I]f the guts of the causal process of institutional influence are left out of the model, then we successfully mathematize abstract empiricism, an empiricism without the complexity of real life.” Careful situational analyses reduce the risk of confounding social contagion with contextual effects. We illustrate this point by re-analyzing data from Medical Innovation, the classic study on the role of social contagion in the diffusion of the tetracycline broad-spectrum antibiotic by Coleman, Katz and Menzel (1966). Medical Innovation is often credited for establishing that diffusion is a social process in which people’s adoption of an innovation is driven by social contagion, or more specifically, by influence from others who have already adopted (Rogers 1995). The study has more than just historical interest, though. Its data on the diffusion of tetracycline, a broad-spectrum antibiotic, have become “a strategic research site for testing new propositions of how social structure drives contagion” (Burt 1987, p. 1301) and for assessing the performance of new modeling techniques (Marsden and Podolny 1990; Strang and Tuma 1993; Valente 1996). However, recent re-analyses of the data have found social contagion effects to be rather small (Burt 1987; Burt and Janicik 1996), sensitive to model specification (Strang and Tuma 1993), or even insignificant (Marsden and Podolny 1990). Still, none of these re-analyses

6

In his first paper on network threshold models, Granovetter (1978) warned of cases where individuals appear to

react to one another, but are actually all responding to a common, external influence, providing this memorable quote from Weber ([1921] 1968 , p. 23): “Thus, if at the beginning of a shower a number of people on the street put up their umbrellas at the same time, this would not ordinarily be a case of action mutually oriented to that of each other, but rather of all reacting in the same way to the like need of protection from the rain.”


started with an analysis of the physicians’ situation or put the data to the rigorous tests called for by skeptics. The re-analyses were primarily designed to investigate which social contagion mechanism was operating (information sharing among cohesive actors versus competitive mimesis among positionally equivalent actors) or to assess the performance of new models in capturing patterns in the data. With such objectives, previous re-analyses assumed that contagion was truly at work and therefore neither started with a detailed account of how the market for the new drug operated (as called for by Stinchcombe) nor included variables capturing non-network mechanisms likely to have been at work (as called for by England and by Haunschild and Miner). In the next two sections, we apply the research strategy described in the preceding sections to the diffusion of tetracycline. We first present a situational analysis, concluding that social contagion is unlikely to have been at work, but that the pharmaceutical companies’ marketing effort may have played a considerable role. Next, we test this conjecture empirically by applying discrete-time hazard models to the Medical Innovation data set expanded with new data on advertising.

THE PHYSICIANS’ CHOICE SITUATION IN MEDICAL INNOVATION Medical Innovation is a study of the adoption by 125 physicians in four cities in Illinois of a new antibiotic drug called tetracycline. At the time that Lederle launched the first tetracycline-based product in November 1953, three other broad-spectrum antibiotics were already on the market. Lederle had introduced chlortetracycline in 1948, Parke-Davis had introduced chloramphenicol in 1949, and Pfizer had introduced oxytetracycline in 1950. To better understand the situation physicians found themselves in when deciding to adopt the new drug or not, we analyze the product’s characteristics, the way it was commercialized, and the kind of sources of information and influence physicians in the 1950s typically used.7

Product characteristics The diffusion literature suggests that an innovation’s rate of adoption is affected by potential adopters’ perceptions of five critical characteristics: complexity, compatibility with existing values, trialability, observability of results, and relative advantage over alternatives (Rogers 1995). Tetracycline had product characteristics typically associated with rapid diffusion: Low complexity. Tetracycline was chemically similar to two existing and successful antibiotics, as evidenced by their generic names. Though a new compound, tetracycline “was merely the newest in an already established family of drugs,” and an “undramatic pharmaceutical innovation” (Coleman et al. 1966, pp. 17 and 36). Compatibility. Physicians were favorably disposed toward the pharmaceutical industry, its new products, and its efforts to market them (Ben Gaffin 1959; Caplow and Raymond 1954). Enthusiasm was particularly strong for broad-spectrum antibiotics (Peterson et al. 1956)8. Trialability and observability of results. Broad-spectrum antibiotics were generally used in the treatment of acute rather than chronic conditions. Because of the short time between

7 A less detailed analysis of the product and its marketing is presented in a report on a concurrent project by Van den Bulte and Lilien (1999).

8 Peterson et al. (1956) intensively studied 88 general practitioners in North Carolina, each over a period of three to three and a half days. They often observed the immediate preparation of an injection of penicillin upon learning that the patient had a fever. This decision was frequently reached before the patient had been examined. Two thirds of the physicians gave antibiotics to all patients suffering from respiratory infections, without attempting to determine whether the infection was viral or bacterial (antibiotics are ineffective against viral infections). Also, “it was apparent from observation and statements from physicians that their practices in regard to medications and therapy are influenced significantly by the information and products supplied by the drug salesman.” (p. 103).


treatment and outcome, a physician could easily and quickly determine their efficacy in any particular case, and adjust the therapy if necessary (Coleman et al. 1966, p. 17). Relative advantage. Tetracycline produced fewer side effects than the other three broad-spectrum antibiotics (Pearson 1969). Tolerance and side effects had become a very important issue by the time tetracycline was launched. In the summer of 1952, side effects of Chloromycetin had received wide press coverage. The June 28 editorial of the Journal of the American Medical Association called doctors’ attention to reports on Chloromycetin’s side effects, suspecting it of causing aplastic anemia in a few rare cases. On July 3, the AMA issued a press release “AMA Warns Doctors on Chloromycetin Therapy” that got wide coverage in the popular press. Finally, the FDA even withdrew its certificate, organized its own field survey, and turned its reports over to the National Research Council for review. On August 14, the drug was allowed back on the market, though Parke-Davis was ordered to print prominently on its labels the dangers inherent in its use (Fortune 1953; Pearson 1969). As a result of these problems, Chloromycetin’s share of the broad-spectrum antibiotics market declined to 5% in October 1952, down from 38% four months earlier. In September 1953, two months before Lederle’s launch of tetracycline, Chloromycetin’s share was still at a low of 10% (Fortune 1953). Thus, tetracycline had a competitive advantage on a product dimension that was especially salient at the time of its launch. In sum, our analysis of tetracycline’s characteristics presents little reason to expect social contagion to have been a dominant driver. Since there was little ambiguity and perceived risk in prescribing tetracycline, information from previous adopters should not have affected physicians’ evaluation of the drug. Since tetracycline was merely the newest member of an already established family of drugs and an undramatic innovation, it is also questionable that adopting it would have markedly enhanced physicians’ status among their peers.

Suppliers’ Commercial Efforts An analysis of the potential adopter’s situation should also include a view of the supply side. In this section, we document characteristics that have been found to be associated with rapid diffusion (e.g., Bauer 1961; Hahn et al. 1994): the intensity of competition among suppliers, their reputation among potential adopters (cf. supra), and their marketing efforts. In contrast to other broad-spectrum antibiotics, tetracycline did not enjoy exclusive patent protection. After a tumultuous episode of litigation, the parties involved worked out a complex patent sharing and licensing agreement, giving Lederle, Pfizer, and Bristol the right to manufacture and sell the drug and allowing Squibb and Upjohn to sell the drug under a supply contract with Bristol (Federal Trade Commission 1958, pp. 245-257). These five firms accounted for more than half of all antibiotics sold in the U.S. in 1950 and all had a good reputation with the medical community (FTC 1958). Lederle, the first company to launch a tetracycline formulation, deployed a very aggressive marketing program. Broad-spectrum antibiotics enjoyed the largest promotional budgets in the pharmaceutical industry (FTC 1958). Even by those standards, Lederle’s marketing efforts for its tetracycline brand Achromycin were exceptional. As a pharmaceutical representative visiting physicians or “detail man” remarked, “Lederle was interested in bombarding physicians with the Achromycin name and we did just that and got the name across. We swamped them with Achromycin” (FTC 1958, p. 130). Lederle used an array of marketing efforts. Coleman, Katz and Menzel (1966, pp. 44 and 181) mention the “blanket exposure of all doctors to the detail man.”


Lederle’s direct mail budget for tetracycline permitted 105 mailings, an average of two per week, to every physician in the United States during its first year of introduction. Medical journal advertising for the first twelve months consisted of 26 insertions in the Journal of the American Medical Association (JAMA), and monthly insertions in the highly circulated Modern Medicine and Medical Economics, as well as in all state journals, 116 county journals, and most specialty journals (FTC 1958). Tetracycline also received wide positive coverage in the professional media (Ben Gaffin 1956). Pfizer was much less aggressive in pushing tetracycline. Fearing that strongly promoting its own brand of tetracycline, Tetracyn, would undercut its sales of oxytetracycline marketed under the name Terramycin, Pfizer relegated the marketing of Tetracyn to a small subsidiary it had just acquired in 1953, J.B. Roerig and Company. As Pfizer’s general sales manager tried to explain the situation to his own sales force (Mines 1978, p. 202): “Under no circumstances is the present and future of Terramycin, our ‘bread and butter,’ to be jeopardized. … We do not suggest that anything should be done or said that will deter or discourage the Roerig boys in their attempt to get their product established. Rather, it may be suggested you help them where you can—except where it would jeopardize your TM [Terramycin] volume” (emphasis in original). Only in January 1955, possibly alarmed by the tremendous success of Lederle’s Achromycin, did Pfizer start to market Tetracyn through its main sales organization. We have no detailed information on how aggressively the other three players marketed their own tetracycline brands. They did not face the same fear of product cannibalization as Pfizer and appear to have been more aggressive, though not as extremely so as Lederle (Pearson 1969).


Tetracycline was not only extensively promoted, but also aggressively priced. Although tetracycline was superior to other broad-spectrum antibiotics, its price to consumers was the same as for the three other types of broad-spectrum antibiotics (FTC 1958). To the extent that physicians took price into consideration in their prescription behavior, it would have favored rapid adoption. In sum, tetracycline was marketed by a small group of companies enjoying a solid reputation. The first company to enter the market, Lederle, deployed a very intensive marketing campaign. The product also enjoyed a large amount of free publicity. Although superior in therapeutic effects, the product did not carry a price premium. Such a market environment is conducive to rapid initial diffusion (Bauer 1961; Hahn et al. 1994). Again, there is no reason to expect physicians’ adoption to be influenced by colleagues. In their monograph, Coleman, Katz and Menzel (1966, pp. 13-14) argued that the physicians’ problem was not too little but too much information, and suggested that physicians turned to their colleagues as a way to cope with this information overload. However, such a simplifying strategy is necessary only if actors experience ambiguity or uncertainty, a condition which we have documented to be unlikely: while there was indeed a deluge of information, it all pointed in the same direction, toward adopting tetracycline.

Physicians’ sources of information A number of studies provide insight into the relative importance of the physicians’ various sources of information about new drugs around the time of the tetracycline study. Many of them were conducted in the Midwest or in relatively small cities, and can thus be expected to be representative of the four Illinois towns in Medical Innovation.


A 1952 survey of Midwestern physicians reported that the physicians found detail men, direct mail, journal articles, and journal advertising to be much more important sources of information than colleagues (Caplow and Raymond 1954). Menzel and Katz (1955) conducted the pilot study to Medical Innovation in a New England city of size comparable to the four cities in the main study. They also found that physicians rated colleagues as less important than detail men, articles in journals, and direct mail. Using a sample of Chicago physicians, Ferber and Wales (1958) found similar results, as did Winick (1961) in a study of an ethical drug introduced in 1957. National scale studies by the National Opinion Research Center (Hawkins 1959), Ben Gaffin and Associates (1959), and Harris (1966) reported similar findings. One study deserves special consideration. Between May and July 1955, Ben Gaffin and Associates (1956) investigated the adoption of five new drugs by physicians in the Fond du Lac area in Wisconsin. That study is of particular interest because Lederle’s Achromycin was one of the drugs studied, the time period analyzed covers that in Medical Innovation, and the community is similar to the four cities studied by Coleman and his associates. Hence, the Fond du Lac study provides an empirical check of the arguments developed above. The product diffused very rapidly: 50 of the 55 physicians claimed to have prescribed tetracycline within 19 months after its release and 32 of them said they had used it within the first two months it was on the market. Doctors gave little credit to colleagues. To the question “where did you happen to get the information about Achromycin which led you to prescribe it?” 30 of the 50 prescribers mentioned detail men, 12 mentioned journal articles, 5 direct mail, 4 journal advertising, and only 3 mentioned other physicians. 
All these studies indicate that physicians in the 1950s typically did not report peer influence as an important information source for new pharmaceuticals, but noted commercial communication efforts and medical journals to be more important and valuable.

Conclusion from the situational analysis Overall, tetracycline’s product characteristics, the way it was marketed, and the sources of information physicians typically used for adoption decisions do not paint a case for strong contagion effects. Table 1, reconstructed from original reports on the Medical Innovation study, provides additional evidence that physicians did not consider colleagues an important source of either information or influence. [ Table 1 about here ] Skeptics may raise the following question: If advertising and detailing were indeed as important as our situational analysis suggests, wouldn’t Coleman, Katz and Menzel have taken these factors into account? That is, doesn’t the very fact that they did not include these factors in their analysis reduce the face validity of our own conclusions? We do not believe so. Appendix A argues that the way Medical Innovation came about may have led its authors away from looking into the effects of detailing and journal advertising. Appendix A also presents a reminiscence by Coleman indicating that he and his fellow authors may not have been very familiar with the institutional details of their research site.

A NEW STATISTICAL ANALYSIS INCORPORATING MARKETING EFFORT Our situational analysis indicates that the commercial efforts by the drug manufacturers, and especially Lederle, may have been a key driver of the diffusion process. Earlier analyses, however, have ignored this factor. Their results may therefore have been based on a confound. We investigate this possibility empirically below.


Data Coleman, Katz and Menzel (1966) provide a detailed description of the population, the sample, and data collection procedures. Burt (1986) placed the portion of the original data set that we use in the public domain. Since the data are accessible for public scrutiny, we limit our discussion to the variables we used or constructed for our own analysis. Physician characteristics. We included five covariates to account for heterogeneity in physicians’ tendency to adopt early. Professional age measures (on a 1-6 scale) how long ago the physician graduated. We included both a linear and quadratic term to account for a possible inverse U-shaped relationship between professional age and adoption proneness: compared to mid-career physicians, older physicians may be more conservative and very inexperienced physicians more risk-averse. We mean-centered age to avoid extreme collinearity. We used the number of journals a physician receives or subscribes to as a measure of media exposure. Journals included both newsletters sent by pharmaceutical companies and scientific-professional publications. We used the logarithm to reflect decreasing returns to scale. We expected physicians having a chief or honorary position in their hospital, captured as a dummy variable, to be less involved in actual medical practice than active or regular staff, and hence to adopt later. We also included an attitudinal measure: scientific orientation, coded as 1 if the physician agreed with the statement that it is more important for a physician to "keep himself informed of new scientific developments [rather than to] devote more time to his patients," and as 0 otherwise. We also estimated models including the number of nominations a physician received as advisor or as discussant as measures of status. Although sociometric status figures prominently


in the analysis by Coleman and his associates, it did not contribute significantly to model fit or change the coefficients of the contagion variables once we controlled for the number of journals received. Burt (1987), Marsden and Podolny (1990) and Strang and Tuma (1993) reported similar findings. It appears that opinion leaders adopted early as a result of their cosmopolitan perspective and media habits rather than out of pressure to maintain their status among their colleagues. We report results only for models excluding sociometric status variables. Seasonal effects. We included a seasonal dummy variable for the summer months July and August. We expected fewer adoptions of a new antibiotic in these two months because the weather is milder and schools are closed, limiting the spread of contagious diseases, and thus the demand for antibiotics (Cliff et al. 1981). Advertising volume. The Medical Innovation data do not contain information on the amount of marketing effort targeted towards the physicians whose prescriptions were tracked. We use the number of advertising pages in three leading advertising outlets, Modern Medicine, Medical Economics and GP, as our measure of marketing effort. These three publications were preferred by pharmaceutical advertisers and widely read by physicians in the 1950s (Ben Gaffin 1953, 1956). Our attempts to collect data on the number of ads appearing in JAMA were unsuccessful, as librarians removed the advertising supplement before binding the issues for storage. We distinguish between the marketing efforts by the first entrant, Lederle, and those of the later entrants. We do so for two reasons. First, the first entrant’s marketing efforts are often more effective than those of later entrants when the latter do not offer an important therapeutic advantage (Bond and Lean 1977; Hurwitz and Caves 1988; Shankar et al. 1998). 
Second, Lederle had a very large sales force and was strongly committed to aggressively building a dominant position while other companies were less well endowed and less aggressive.


We matched the number of advertisements in each issue to the 4-week sampling periods in the data set prepared by Burt (1986). Because the data are monthly observations and previous research in the pharmaceutical industry documents the presence of sizable spillover effects over time (Berndt et al. 1997; Montgomery and Silk 1972; Rangaswamy and Krishnamurthi 1991), we expected the effects of marketing communication efforts to span multiple periods. We therefore constructed measures of depreciation-adjusted stock of marketing effort (Berndt et al. 1997; Kalish and Lilien 1986; Rizzo 1999). Let $m_t$ be the amount of advertising in month $t$ (in hundreds of pages), and let $\delta$ be the monthly decay rate ($0 \le \delta \le 1$). The stock of marketing effort $M_t$ is then defined as:

$$M_t = m_t + (1 - \delta)\, M_{t-1} = \sum_{\tau=0}^{t} (1 - \delta)^{t-\tau}\, m_\tau. \qquad [10]$$
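As a minimal sketch of equation [10], the following Python snippet computes the depreciation-adjusted stock both recursively and through the equivalent weighted sum. It assumes the stock is zero before the first period. The function names and the advertising series are hypothetical illustrations, not the data used in this paper.

```python
import numpy as np

def stock_recursive(m, delta):
    """M_t = m_t + (1 - delta) * M_{t-1}, assuming a zero stock before t = 0."""
    stock = np.zeros(len(m))
    prev = 0.0
    for t, m_t in enumerate(m):
        prev = m_t + (1.0 - delta) * prev
        stock[t] = prev
    return stock

def stock_closed_form(m, delta):
    """M_t = sum over tau = 0..t of (1 - delta)**(t - tau) * m_tau."""
    m = np.asarray(m, dtype=float)
    return np.array([
        sum((1.0 - delta) ** (t - tau) * m[tau] for tau in range(t + 1))
        for t in range(len(m))
    ])

# Hypothetical advertising volumes (hundreds of pages), for illustration only.
m = [0.4, 0.6, 0.5, 0.2]
print(stock_recursive(m, delta=0.25))
print(stock_closed_form(m, delta=0.25))
```

Both functions return the same series; the recursive form is the more convenient one when the stock has to be recomputed for many candidate values of the decay rate.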

We constructed one such variable for Lederle and one for all other competitors combined. We assumed the decay parameter δ equal across companies. We have not been able to locate data on the amount of detailing effort by various companies marketing tetracycline, but do not believe this to be a problem. Detailing effort and journal advertising are so highly correlated in pharmaceutical markets that either variable can be used to represent overall marketing effort (Berndt et al. 1997; Gatignon et al. 1990; Lilien et al. 1981; Rangaswamy and Krishnamurthi 1991; Rizzo 1999). We do not include an interaction effect between marketing effort and the number of journals received. Including such an interaction would generally provide a sharper test of advertising effects, but not in this particular case. There are two reasons. First, the number of journals received includes many in-house publications by other pharmaceutical companies that would


carry advertising only for the sponsoring company and not for others, such as Lederle.9 Second, though we operationalize marketing effort using advertising data, true marketing effort also captures detailing effort and direct mail, the effects of which do not depend on journals received. Contagion variables. Coleman, Katz and Menzel had 228 physicians interviewed. Each physician was asked to name up to three other physicians he discussed medical practice with, and up to three physicians he sought advice from about medical practice. However, the researchers had prescription data collected only for general practitioners (N=125) and not for specialists. There are two approaches to this missing data problem. Burt (1987) and Marsden and Podolny (1990) assumed that the adoption of tetracycline by specialists affected generalists’ decision to adopt, and consequently imputed adoption dates for specialists (though they only analyzed the generalists’ adoptions). In contrast, Coleman et al. (1966) and Strang and Tuma (1993) assumed that generalists did not take specialists' adoption behavior into account, and did not consider the latter’s missing adoption data when constructing the social influence variables. We use both approaches, as Strang and Tuma (1993) suggest that different assumptions about the effects of specialists on generalists may have caused re-analyses to find evidence of social contagion or not. We constructed two types of exposure variables, each assuming a different influence mechanism represented by the wij weights. The direct ties weights reflect whether i nominated j as an interaction partner for advice or discussions. We constructed weights of structural equivalence indicating whether i might mimic j out of fear of losing out in the competition for status. We operationalized structural equivalence as the proportion of exact matches between two

9 Distinguishing between 0 and 1 or more journals received when constructing the interaction term allows one to circumvent this problem. Unfortunately, no physician reported receiving zero journals. We thank David Strang for suggesting the 0/1 interaction.


physicians’ set of relationships with third parties: thus, the more their portfolio of relationships overlap, the higher the weight they give to one another. Appendix B details how we constructed the network exposure variables. We also used network exposure variables constructed by Burt: one capturing word-of-mouth operating over direct ties and one capturing competition for status between structurally equivalent physicians. Burt incorporated specialists’ imputed adoption data in his exposure variables. He also used different operationalizations of the influence weights than we did. Because of imputation problems, Burt could not compute influence from structurally equivalent colleagues for seven physicians. For one of these seven, Burt was also unable to compute a measure of social cohesion influence. For these few physicians, we substituted our measures of influence through structural equivalence and direct ties for Burt’s missing values. After constructing the variables, we deleted four physicians, due to missing covariates. The data set for estimation contains 17 monthly observations for 121 individuals, 105 of whom had adopted by the last observation period. Table 2 presents descriptive statistics for the data, after excluding post-adoption observations irrelevant to explain adoption. [ Table 2 about here ]
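The two kinds of influence weights described above can be sketched in Python. This is an illustration under stated assumptions, not the construction documented in Appendix B: it assumes a binary nomination matrix, counts matches on both outgoing and incoming ties with third parties, and normalizes exposure by the sum of each physician's weights. The function names and the four-physician network are hypothetical.

```python
import numpy as np

def structural_equivalence_weights(ties):
    """Proportion of exact matches in two actors' ties with third parties.

    ties[i, k] = 1 if i nominated k. For each pair (i, j), the outgoing and
    incoming ties of i and j are compared for every third party k.
    Requires at least three actors.
    """
    ties = np.asarray(ties)
    n = ties.shape[0]
    w = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            third = [k for k in range(n) if k not in (i, j)]
            matches = sum(
                int(ties[i, k] == ties[j, k]) + int(ties[k, i] == ties[k, j])
                for k in third
            )
            w[i, j] = matches / (2 * len(third))
    return w

def network_exposure(w, adopted_before):
    """Weighted share of alters who had adopted by the previous period."""
    w = np.asarray(w, dtype=float)
    row_sums = w.sum(axis=1)
    safe = np.where(row_sums > 0, row_sums, 1.0)  # isolates get zero exposure
    return (w @ np.asarray(adopted_before, dtype=float)) / safe

# Hypothetical network: physicians 0 and 1 both nominate 2; 2 nominates 3.
ties = np.array([[0, 0, 1, 0],
                 [0, 0, 1, 0],
                 [0, 0, 0, 1],
                 [0, 0, 0, 0]])
w_se = structural_equivalence_weights(ties)
adopted = [0, 0, 1, 0]  # only physician 2 adopted before the current period
print(network_exposure(ties, adopted))   # exposure through direct ties
print(network_exposure(w_se, adopted))   # exposure through structural equivalence
```

In this toy network physicians 0 and 1 have identical ties with all third parties and so receive the maximum equivalence weight with respect to each other, even though neither nominated the other.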

Estimation and Specification Tests We estimated our models using maximum likelihood, with one exception. We estimated the marketing effort decay parameter δ using a grid search (cf. Berndt et al. 1997) in a model featuring no social network exposure variables. A value of .25 led to the highest model likelihood. Model fit was not very sensitive to changes in the range between .15 and .30. In


subsequent analyses of models featuring both marketing effort and social network exposure, we kept δ fixed at .25. We checked for unobserved heterogeneity in both probit and logit models. In the probit models, we estimated a normal mixture while allowing the base hazard to vary freely every three months (cf. Han and Hausman 1990). In the logit models, we used the score tests developed by Hamerle (1990) and Commenges et al. (1994). None of the tests suggested the presence of significant unobserved heterogeneity (p > 0.10). To avoid redundancy and cluttering, we present the results for the logit specifications only and omit the test statistics for unobserved heterogeneity.
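The grid search over the decay parameter can be illustrated with a short Python sketch. It is an assumption-laden toy version, not the estimation code behind the reported results: the data are simulated, the model contains only an intercept and one marketing stock, and the logit is fitted with a plain Newton-Raphson routine written for this example. All names are hypothetical.

```python
import numpy as np

def stock(m, delta):
    """Depreciation-adjusted stock: M_t = m_t + (1 - delta) * M_{t-1}, zero start."""
    out, prev = [], 0.0
    for m_t in m:
        prev = m_t + (1.0 - delta) * prev
        out.append(prev)
    return np.array(out)

def logit_max_loglik(X, y, n_iter=50):
    """Maximized log-likelihood of a logit model, fitted by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        # A tiny ridge term guards against a numerically singular Hessian.
        step = np.linalg.solve(hessian + 1e-8 * np.eye(X.shape[1]), X.T @ (y - p))
        beta = beta + step
    p = np.clip(1.0 / (1.0 + np.exp(-X @ beta)), 1e-12, 1 - 1e-12)
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

def grid_search_delta(m, period, y, grid):
    """For each delta, rebuild the stock and re-estimate the logit by ML."""
    logliks = []
    for delta in grid:
        M = stock(m, delta)
        X = np.column_stack([np.ones(len(y)), M[period]])
        logliks.append(logit_max_loglik(X, y))
    return grid[int(np.argmax(logliks))], np.array(logliks)

# Simulated person-period data, for illustration only (not the paper's data).
rng = np.random.default_rng(0)
T, N, true_delta = 17, 121, 0.25
m = rng.uniform(0.5, 1.5, size=T)
hazard = 1.0 / (1.0 + np.exp(-(-3.0 + 0.5 * stock(m, true_delta))))
period, y = [], []
for _ in range(N):
    for t in range(T):
        adopt = rng.random() < hazard[t]
        period.append(t)
        y.append(float(adopt))
        if adopt:
            break  # post-adoption observations are excluded
period, y = np.array(period), np.array(y)

grid = np.linspace(0.05, 0.95, 19)
best_delta, logliks = grid_search_delta(m, period, y, grid)
print(best_delta)
```

Plotting `logliks` against `grid` shows how flat the likelihood is around its maximum, which is the kind of sensitivity check reported above for the range between .15 and .30.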

Results Table 3 reports the results for each of the four social network exposure variables: two for exposure through direct ties (both our and Burt’s measures) and two for exposure through structural equivalence (again our and Burt’s measure). The first four columns (1a-4a) report the logit coefficients in models without marketing effort. Social contagion is significant in all four cases. Exponentiating the social contagion coefficients indicates that the odds of adoption by someone with maximum exposure were 2 to 3 times the odds of adoption by someone without any social network exposure. Also, the coefficients of all physician characteristics except for age have the expected sign and are significant at 90% confidence or higher. Age does not have the expected positive sign, suggesting that younger physicians did not delay adoption. The summer dummy has the expected negative sign, but is either not significant or only marginally so. [ Table 3 about here ] The next four columns (1b-4b) report the results for models incorporating marketing efforts. The coefficients of the physician characteristics barely change, but the social contagion effects are now all insignificant. As expected from our situational analysis, marketing efforts by Lederle appear to have affected physicians more than peer influence or marketing efforts by later entrants, neither of which shows a significant effect. Note also that incorporating Lederle’s marketing effort improves the model likelihoods. Also, a model with marketing effort but without social network exposure fits about equally well as models with both types of variables (column 5). Overall, the results indicate that Lederle’s marketing effort, not social contagion, was the dominant driver increasing physicians’ hazard of adoption over time.
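The odds-ratio reading of the contagion coefficients in the models without marketing effort can be made explicit. As a sketch, write the logit hazard with a single exposure covariate $x_{it}$ and, as an assumption for this illustration, let exposure be scaled from 0 (no exposure) to 1 (maximum exposure). The symbols $h_{it}$, $\alpha$ and $\beta$ are notation introduced here, not necessarily that of the earlier equations.

$$\ln\frac{h_{it}}{1-h_{it}} = \alpha + \beta x_{it} \quad\Longrightarrow\quad \frac{h(x=1)/\bigl(1-h(x=1)\bigr)}{h(x=0)/\bigl(1-h(x=0)\bigr)} = e^{\beta}.$$

Under this scaling, an odds ratio between 2 and 3 corresponds to a coefficient $\beta$ between $\ln 2 \approx 0.69$ and $\ln 3 \approx 1.10$.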

Discussion Our results about the absence of network effects once one controls for advertising not only contradict the received view of strong network effects in Medical Innovation (Rogers 1995), but at the same time also explain the “weak” results obtained more recently by Marsden and Podolny (1990) and Strang and Tuma (1993). Marsden and Podolny estimated a Cox proportional hazard model, which is very similar to a discrete-time logit hazard model with a dummy variable for each time period. These dummies capture all cross-temporal variation in the mean adoption hazard, and leave only variance within particular time periods to be explained by network exposure. Strang and Tuma incorporate lagged penetration as a covariate, besides lagged network terms. Lagged penetration assumes that any physician interacts with any other physician (i.e., a constant wij for all i and j). Thus it ignores network structure but captures the cross-temporal variation in average network exposure. Similarly, our marketing variables vary over time but not across physicians. Hence, all three studies show that differences in adoption across physicians within any particular time period are not statistically significantly associated with differences in lagged social network exposure.10 Our study, however, is the only one to provide an explanation for this finding grounded in a detailed situational analysis.
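The point about period dummies absorbing all cross-temporal variation can be checked numerically. The sketch below is a simplified illustration with hypothetical values and a balanced panel (no observations dropped after adoption). It compares the rank of a design matrix of period dummies with and without an added covariate.

```python
import numpy as np

T, N = 5, 4  # hypothetical numbers of periods and physicians
period = np.repeat(np.arange(T), N)   # period index of each person-period row
dummies = np.eye(T)[period]           # one dummy variable per time period

# A covariate that varies over time but not across physicians, such as a
# marketing stock or lagged market penetration (hypothetical values).
time_only = np.array([0.4, 0.9, 1.2, 1.1, 1.5])[period]

# A covariate that also varies across physicians within a period, such as
# network exposure (random values, for illustration only).
rng = np.random.default_rng(1)
within_period = rng.random(T * N)

rank_dummies = np.linalg.matrix_rank(dummies)
rank_with_time_only = np.linalg.matrix_rank(np.column_stack([dummies, time_only]))
rank_with_within = np.linalg.matrix_rank(np.column_stack([dummies, within_period]))
print(rank_dummies, rank_with_time_only, rank_with_within)
```

The time-only covariate does not raise the rank of the design matrix, so its effect cannot be separated from the period dummies. Only a covariate with within-period variation adds information, which is why such a specification leaves network exposure to explain differences across physicians within a period.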

CONCLUSION This paper offers three insights:

• Discrete-time hazard models of innovation adoption can be developed from two theoretical traditions in the study of innovation adoption: network threshold modeling and random utility modeling;

• Deriving a theoretical and statistical model from assumptions of rational action requires that the researcher understand the actors’ point of view, from which perspective the action is rational. This implies an account of the actors’ preferences, beliefs and constraints;

• The Medical Innovation data do not document that diffusion is driven by contagion operating over social networks. Social contagion effects have been confounded with marketing effects.

We elaborate below.

10 We thank Keith Ord and David Krackhardt for raising the issue of cross-temporal versus cross-sectional variation.

Theoretical and statistical models We have shown how hazard models for the analysis of innovation adoption can be developed from two theories of rational choice: social network threshold theory and random utility theory. This result should appeal to rational choice sociologists searching “to combine the advantages of theory-guided research, as found in economics, with the strong empirical tradition of sociology” (Lindenberg 1992, p. 3). Ours is far from the first effort to incorporate network effects into a utility model. Burt’s (1982) monograph Toward a Structural Theory of Action is a notable contribution to build a theory of purposive, utility-maximizing action, in which perceived utility is a function of the opinions and behaviors of socially relevant others. Our approach builds on Burt’s work relating network dynamics and purposive action. We extend that work by showing how such theoretical models, when expanded into stochastic variants and applied to the realm of binary decisions, result analytically in binary dependent variable models and discrete-time hazard models. These hazard models are not just empirical operationalizations of the underlying theoretical model, but follow directly from it. Such hazard models appropriately handle typical diffusion data in which right censoring is a common issue, thus avoiding a key problem in earlier attempts to empirically apply Burt’s theory of structural action and Granovetter’s threshold theory to new product diffusion (Burt 1987; Valente 1996). We see many avenues to adapt and further develop the simple models we presented. For instance, the model structure itself can be enriched. BDV models for transitions in a two-state discrete-time Markov chain can easily be extended to allow for disadoption (see Amemiya 1985). The case of multiple competing innovations can be handled by using a multinomial rather than binomial logit or probit specification (e.g., Yamaguchi 1991). This transforms the model from a traditional hazard rate model into a competing risk model, but multinomial models can also be based on a random utility framework (Ben-Akiva and Lerman 1985). Models can also be extended to incorporate multiple stages, such as awareness followed by evaluation conditional upon awareness (e.g., Andrews and Srinivasan 1995; Van den Bulte and Lilien 1999). Such separation may help gain a better understanding of the differential effects of advertising and social contagion, as the former is believed to operate mainly early in the decision process and the latter mainly in later stages (Rogers 1995; Van den Bulte and Lilien 1999).
All these extensions in which the theoretical and statistical models are tightly linked may advance the empirical analysis of rational choice models using large-scale longitudinal data (Goldthorpe 1998).
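To make the handling of right censoring concrete, the following is a minimal sketch of how adoption records can be expanded into actor-period rows and scored with a logit discrete-time hazard. The function names, the toy data, and the single-covariate setup are illustrative assumptions, not the estimation code used in this paper.

```python
import math

def person_period_rows(adoption_times, last_period):
    """Expand one record per actor into one row per actor-period at risk.

    adoption_times[i] is the (1-based) period in which actor i adopted,
    or None if actor i had not adopted by last_period (right-censored).
    Each row is (actor, period, y) with y = 1 only in the adoption period.
    """
    rows = []
    for i, t_adopt in enumerate(adoption_times):
        end = last_period if t_adopt is None else t_adopt
        for t in range(1, end + 1):
            y = 1 if t_adopt is not None and t == t_adopt else 0
            rows.append((i, t, y))
    return rows

def logit_hazard(x, beta):
    """P(adopt in t | not adopted before t) under a logit link."""
    z = sum(b * v for b, v in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(rows, covariates, beta):
    """Binary-logit log-likelihood summed over all actor-period rows.

    Censored actors contribute only 'no adoption' terms, so no
    observation has to be dropped or given an imputed adoption time.
    """
    ll = 0.0
    for i, t, y in rows:
        h = logit_hazard(covariates[(i, t)], beta)
        ll += math.log(h) if y == 1 else math.log(1.0 - h)
    return ll

# Hypothetical toy data: actor 0 adopts in period 2, actor 1 is censored
# after period 3, actor 2 adopts in period 1.
rows = person_period_rows([2, None, 1], last_period=3)
covariates = {(i, t): [1.0] for i, t, _ in rows}  # intercept only
ll_at_zero = log_likelihood(rows, covariates, [0.0])
```

In this layout the censored actor simply contributes three rows with y = 0, which is why a standard binary logit routine applied to the stacked rows yields the discrete-time hazard estimates.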


Situational analysis and model specification

Analyzing the social situation from the actors’ point of view has a long tradition in sociology and is a key theme in Weber’s work. In this paper, we emphasized Popper’s view that such analyses are not just desirable, but necessary for researchers applying models of purposive action. Correctly setting up the utility maximization problem requires that the researcher specify the actors’ preferences, beliefs and constraints appropriately. Of course, one never knows whether one has done so, but that should not be a reason not to try. We illustrated this using the physicians’ choice in Medical Innovation. Though we believe our analysis of the situation was careful and its predictions borne out in a quantitative analysis, any research effort involves both modeling compromises and data restrictions, and we made several restrictive behavioral assumptions when constructing the set of covariates entering the utility functions in our application. We limited our analysis to interpersonal influence associated with others’ past behavior only. We thereby ignored contemporaneous and anticipated future actions. Ignoring contemporaneous actions, we did not capture possible joint decision making among physicians. Anticipated future actions are likely to be important in competitive contexts, where preemptive adoption improves one’s competitive position. For individuals, these positional concerns may take the form of worrying about one’s social status from being regarded as a trendsetter. In organizational fields, preemptive adoption may help actors develop or sustain a competitive advantage through higher status, lower costs, or better product quality. The situational analysis suggests contemporaneous social contagion and prospective behavior were unlikely to be important, but we did not assess this in the statistical analysis. There are both statistical (Besag 1975) and theoretical problems (Coleman 1990) that need to be resolved before one can empirically analyze the effects of contemporaneous and anticipated actions in complex, non-random network structures.¹¹

Our analysis was also limited by the richness of the data about what gets communicated through the network. Social interaction, we assumed, informs potential adopters only about others’ choices, not their expected or achieved utility, post-adoption attitudes, or other evaluations. This is defensible when actors do not discern internal states or outcomes of others. Sometimes, however, outcomes are actually communicated by adopters (e.g., satisfaction) or can be observed (e.g., market share gains or increased fundraising by organizations that implement a new technology). Researchers having such data available can modify the model quite easily by substituting the relevant variable (say, $q_{jt}$) for the $y_{j,t-1}$ indicator and computing network exposure as $\sum_j w_{ij} q_{jt}$.
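The substitution described above can be sketched in a few lines. The weight matrix, the adoption indicators, and the satisfaction scores below are hypothetical values chosen only for illustration; the function name is likewise an assumption.

```python
def network_exposure(weights, signal):
    """Exposure of each actor i, computed as sum_j weights[i][j] * signal[j]."""
    return [sum(w_ij * s_j for w_ij, s_j in zip(row, signal)) for row in weights]

# Hypothetical row-normalized weights for three actors; actor 2 names no one.
W = [[0.0, 0.5, 0.5],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]

y_prev = [1, 0, 0]     # adoption indicators y_{j,t-1}: only actor 0 has adopted
q = [0.8, 0.0, 0.0]    # hypothetical outcome signal, e.g. adopter satisfaction

exposure_to_choices = network_exposure(W, y_prev)   # choice-based exposure
exposure_to_outcomes = network_exposure(W, q)       # outcome-based exposure
```

Only the vector multiplied by the weight matrix changes; the rest of the hazard model is left as it is.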

11 Strang and Tuma (1993) do model contemporaneous contagion. Their estimation procedure, however, does not seem to take into account the interdependence among observations that is explicitly present once one allows for contemporaneous contagion. Traditional likelihood functions for BDV models do not account for this endogeneity, and maximizing them leads to invalid results (Besag 1975).

Medical Innovation and Social Contagion

Our last result is twofold: (1) the Medical Innovation data do not provide statistical evidence of network effects in new product diffusion, and (2) there are good reasons not to have expected such effects for this particular innovation. Prior evidence of social contagion was an artifact, capturing the effect of Lederle’s aggressive marketing efforts. Our findings cast doubt on a small but important part of the empirical base underlying the belief that innovations diffuse through social contagion. Our findings must not be interpreted as suggesting that social network effects do not matter in general. Tetracycline, our situational analysis suggests, simply is not a suitable case for assessing the importance of social contagion. Its product characteristics and the way it was marketed made it unlikely for sizable social network effects to be at work. The absence of social network effects in such a foundational study, still, gives credence and salience to earlier calls for sound skepticism and for wariness of confounds when studying social contagion. We hope that the relationships between social network threshold theory, random utility theory, situational analysis, and hazard modeling will help diffusion researchers to tighten what Merton (1968, p. 73) called the “triple alliance between theory, method and data.”


Appendix A. The Original Study’s Silence on Marketing Factors Does Not Mean That They Were Irrelevant

Doesn’t the very fact that the authors of Medical Innovation did not take marketing factors into account reduce the face validity of our situational analysis’s conclusions? We do not believe so for two reasons. First, the way Medical Innovation came about may have led its authors away from looking into the effects of detailing and journal advertising. Second, Coleman, Katz and Menzel may not have been very familiar with all the institutional details of their research site.

The genesis of Medical Innovation. Ironically enough, Medical Innovation started off as an advertising effectiveness study for Pfizer, one of the sponsors of Paul Lazarsfeld’s Bureau of Applied Social Research at Columbia University. As reported by a former affiliate of the Bureau, Pfizer “wanted to find out whether or not it should continue to advertise a new drug in the Journal of the American Medical Association” (Glock 1979, p. 27). Typical of the scholarly entrepreneurship with which Lazarsfeld funded his institute, this rather humdrum marketing question was converted into a sociological study of scholarly interest showing very few surface traces of its mercantile origins (Rogers 1994). It is important to note that, in those days, Pfizer did not place ads in the regular advertising section in JAMA, but had its own newsletter Spectrum (which contained both ads and articles) inserted in each issue of JAMA. This, we believe, explains why the study includes multiple questions about specific newsletters and about JAMA but none about regular journal advertising or other medical journals mentioned by name (the questionnaire is reprinted in Coleman et al. 1966, pp. 195-205). The genesis of Medical Innovation as a study on the effectiveness of drug house newsletters also explains the rather small amount of attention given to detailing as a source of influence.


Familiarity with the research setting. Consider the following 1993 reminiscence by James Coleman of the Medical Innovation project: “I never saw the communities. It [Medical Innovation] was one of those research projects that happens while you are busy with more important projects. I designed the research with Herb [Menzel] and Elihu [Katz]. Herb and Elihu reviewed the literature on medical innovation. A team of interviewers came out to Illinois from the Bureau to talk to the doctors and Sidney Spivik searched the prescription records. The questionnaires went back to Columbia to be keypunched, a set of cards were sent to me for analysis, and the research report was published in Sociometry [1957]” (Coleman cited in Burt 1997; square brackets added by Burt). This recollection indicates that Coleman did not have first-hand knowledge of medical practice in the four Illinois communities. It suggests that his two fellow authors did not either. Hence, the fact that Medical Innovation does not emphasize the role of marketing efforts in the diffusion of tetracycline cannot be used to infer that they were unimportant.


Appendix B. Procedures Used to Create Social Influence Weights

Our analysis uses both discussion and advice relationships. Using the network data of all 228 physicians, we constructed the social weight matrices for each city separately in a series of steps.

Step 1. First, we created adjacency matrices with element $a_{ij}$ equal to 1 if i mentions j, and zero otherwise. We created two such adjacency matrices for each city: one for discussion ties and one for advice ties.

Step 2. Since being discussion partners is a naturally reciprocal relationship, we symmetrized the discussion adjacency matrix (Alba and Kadushin 1976).

Step 3. We constructed a pooled adjacency matrix by adding the symmetrized discussion matrix and the advice matrix, treating discussion and advice as indicators of a common underlying variable “interacting with.” We also performed analyses (not reported here) keeping discussion and advice separate. This did not affect the results.

Step 4. We constructed four different weight matrices to account for various network contagion mechanisms. Direct tie matrices are identical to the adjacency matrices. We computed structural equivalence weights as the proportion of exact matches between two physicians’ sets of relationships with third parties. A valid match required that the physicians had at least one common third party, which implies that actors without any common third party did not put any weight on each other's actions.

Step 5. We deleted all rows and columns referring to physicians who were not among the 125 included in the prescription sample.

Step 6. We set all diagonal elements to zero and normalized all rows such that (1) $w_{ii} = 0$, and (2) $\sum_j w_{ij} = 1$ if and only if $w_{ij} \neq 0$ for some $j$, and $\sum_j w_{ij} = 0$ otherwise. This row-normalization implies that physicians are sensitive to the proportion rather than the number of relevant others who have adopted, and ensures that each network exposure variable is bounded between 0 and 1.

Actor i’s social network exposure at time t can then be computed as $\sum_j w_{ij} y_{j,t-1}$.
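The direct-tie part of these steps can be sketched as follows. This is an illustration under stated assumptions, not the original procedure: symmetrization is taken to mean that a tie exists if either party mentions the other, the pooled matrix is left as an element-wise sum, the structural equivalence weights of Step 4 are omitted, and all function names and the four-physician toy network are hypothetical.

```python
def adjacency(n, mentions):
    """Step 1: a[i][j] = 1 if i mentions j, else 0."""
    a = [[0] * n for _ in range(n)]
    for i, j in mentions:
        a[i][j] = 1
    return a

def symmetrize(a):
    """Step 2 (assumed rule): tie present if either party mentions the other."""
    n = len(a)
    return [[1 if (a[i][j] or a[j][i]) else 0 for j in range(n)] for i in range(n)]

def pool(a, b):
    """Step 3: element-wise sum of two adjacency matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def restrict(m, keep):
    """Step 5: keep only rows and columns of physicians in the sample."""
    return [[m[i][j] for j in keep] for i in keep]

def row_normalize(m):
    """Step 6: zero the diagonal, then scale each nonzero row to sum to 1."""
    out = []
    for i, row in enumerate(m):
        row = [0 if j == i else v for j, v in enumerate(row)]
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out

def exposure(w, y_prev):
    """Network exposure of actor i: sum_j w[i][j] * y_prev[j]."""
    return [sum(w_ij * y_j for w_ij, y_j in zip(row, y_prev)) for row in w]

# Hypothetical toy network of four physicians; physician 3 is not in the
# prescription sample and is dropped in Step 5.
discussion = symmetrize(adjacency(4, [(0, 1), (2, 1)]))
advice = adjacency(4, [(0, 2), (3, 0)])
pooled = pool(discussion, advice)
w = row_normalize(restrict(pooled, keep=[0, 1, 2]))
exposure_t = exposure(w, y_prev=[1, 0, 0])  # only physician 0 adopted before t
```

Because every nonzero row of `w` sums to 1, each exposure value is the weighted proportion of a physician's contacts who had adopted by the previous period.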


Figure 1. Relations between theory, models and data in a social network/adoption framework (a)

[Diagram not reproduced. Elements in order of appearance: Theory: random social network threshold theory; Specify; Theoretical model: probability model of adoption; Interpret; Formal identity: BDV model; Estimate and test; Statistical model: discrete-time hazard model; Summarize; Data: panel data on adoption.]

(a) Based on a diagram by Skvoretz (1998, p. 241) of the general relationships among theory, theoretical models, methodological or statistical models, and data.


Table 1. Doctors did not consider colleagues an important source of influence or information when adopting tetracycline
_______________________________________________________________________________________
                        Percentage of physicians     Percentage of physicians mentioning
                        crediting a source with (a)  a source of information (b)

Source                  Original     Most            First     Intermediate  Final
                        influence    influence       source    source        source
_______________________________________________________________________________________
Detail men              57           38              52        27            5
Journal articles        7            23              22        16            14
Direct mail             18           8               6         21            21
Drug house periodicals  4            5               3         11            21
Colleagues              7            20              10        15            28
Meetings                3            __              3         4             8
All other media         4            6               3         7             3
_______________________________________________________________________________________
(a) Based on Katz (1961, p. 77). A crosscheck against the Medical Innovation network data set prepared by Burt (1986) indicates that the base for these percentages is the 141 physicians (out of a total of 216 interviewed after the 12 exploratory interviews) whose most recent adoption was tetracycline.
(b) Based on Coleman, Katz, and Menzel (1966, p. 59). Data were available for 87 adopters, who generated 131 mentions of sources intermediate to first and last source. Thus, the base for the percentages in the first and third column is 87, that for those in the middle one is 131.


Table 2. Descriptive statistics of model variables
__________________________________________________________________________________________________________________________________
                                                                Correlations
    Variable                        Mean     SD    Min    Max     1     2     3     4     5     6     7     8     9    10    11    12
__________________________________________________________________________________________________________________________________
 1. Y                               .111   .314      0      1
 2. Summer                          .084   .278      0      1  -.05
 3. Age                             .000  1.706  -2.54   2.46  -.09   .05
 4. Age²                           2.907  2.561   0.21   6.48  -.08   .05   .17
 5. Journals (log)                 1.484   .392   0.69   2.20   .10  -.07  -.16  -.03
 6. Science                         .268   .443      0      1   .12  -.08  -.01   .01  -.02
 7. Chief                           .098   .298      0      1  -.05   .00   .31   .26   .06   .29
 8. Direct ties                     .384   .413      0      1   .01   .25   .15   .18  -.05  -.14   .17
 9. Structural equivalence          .379   .321      0      1   .01   .25   .13   .14  -.22  -.16   .05   .75
10. Direct ties (Burt)              .334   .321      0      1   .07   .18   .15   .07  -.04  -.14   .09   .68   .70
11. Structural equivalence (Burt)   .451   .419      0      1   .02   .27   .18   .09  -.15  -.16   .15   .86   .82   .72
12. Advertising by Lederle          .155   .087      0   0.24   .05   .30   .13   .12  -.17  -.19   .05   .68   .80   .69   .75
13. Advertising by others           .191   .224      0   0.98   .03   .02   .16   .12  -.14  -.17   .06   .57   .71   .63   .60   .68
__________________________________________________________________________________________________________________________________


Table 3. Logit coefficients for discrete-time hazard models showing that marketing effort, and not social contagion, is associated with adoption behavior
___________________________________________________________________________________________________________________________
                              Without marketing effort                With marketing effort
                              1a        2a        3a        4a        1b        2b        3b        4b        5
                              direct    direct    struct.   struct.   direct    direct    struct.   struct.   no
                              ties      ties      equi.     equi.     ties      ties      equi.     equi.     contagion
                                        (Burt)              (Burt)              (Burt)              (Burt)
___________________________________________________________________________________________________________________________
Common tendencies
  Intercept                   -3.48 d   -3.69 d   -3.82 d   -3.78 d   -4.46 d   -4.31 d   -4.14 d   -4.14 d   -4.42 d
  Summer                      -0.61     -0.63     -0.65     -0.69     -0.77     -0.82     -0.76     -0.80     -0.79

Intrinsic tendencies
  Professional age            -0.12     -0.14 a   -0.12     -0.13 a   -0.13 a   -0.14 a   -0.13 a   -0.13 a   -0.13 a
  Professional age squared    -0.09 b   -0.09 a   -0.10 b   -0.09 a   -0.10 b   -0.10 b   -0.10 b   -0.10 b   -0.10 b
  Chief or honorary position  -0.92 a   -0.94 a   -0.86 a   -1.02 b   -0.95 a   -1.00 b   -0.98 b   -0.98 b   -0.97 b
  Number of journals (log)     0.76 c    0.77 c    0.91 c    0.87 c    0.96 d    0.90 c    0.92 c    0.95 c    0.95 c
  Scientific orientation       0.97 d    0.99 d    0.97 d    1.01 d    1.11 d    1.10 d    1.13 d    1.12 d    1.12 d

Social contagion               0.64 b    1.18 d    0.98 c    0.81 c   -0.16      0.53     -0.44      0.03

Marketing effort
  Lederle                                                              5.61 c    4.10 a    6.11 c    5.13 b    5.22 c
  Others                                                               0.29      0.02      0.44      0.22      0.23

Fit
  -2 Log Likelihood            619.2     611.5     617.1     615.8     608.0     606.8     607.6     608.2     608.2
___________________________________________________________________________________________________________________________
a: p