Direct Utility Models for Asymmetric Complements

Dissertation

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By
Sanghak Lee, B.S., M.S.
Graduate Program in Business Administration

The Ohio State University
2012

Dissertation Committee:
Greg M. Allenby, Advisor
H. Rao Unnava
Patricia M. West

© Copyright by
Sanghak Lee
2012
Abstract
Asymmetric complements refer to goods where one is more dependent on the other, yet consumers receive enhanced utility from consuming both. This dissertation proposes a direct utility model in which the gains a consumer receives from a purchase (i.e., utility) are separated from the costs (i.e., constraints). The asymmetries of a cross-category relationship are identified by introducing a latent sequence of a consumer’s decisions across multiple categories together with a super-additive utility structure. An integral restriction in a consumer’s decision space is included as a constraint in order to accommodate the indivisible nature of data. The proposed models are applied to scanner panel data and estimated using Bayesian methods. The empirical analyses provide evidence of asymmetric complementarity between milk and cereal and indicate the importance of accommodating demand indivisibility.
To Doin, and Sieun
Acknowledgments
First of all, I thank God for walking with me throughout my doctoral program. I can whole-heartedly confess that “The Lord is my shepherd, I shall not be in want” (Psalm chapter 23 verse 1). I cannot give enough thanks to my advisor, Greg Allenby, for his devoted effort and warm-hearted care. Greg is a great advisor who not only trains me to be equipped with academic knowledge and skill-sets but also has a significant influence on my personal life, acting sometimes like a friend and other times like a father. He has truly been a great role model that I want to follow in my future career in academia. I am very grateful for the people I’ve met in the marketing department. I would like to thank the marketing faculty members for their contribution and support to my doctoral studies. Special thanks need to be given to Rao Unnava and Patricia West for their guidance and support during my dissertation research. I want to thank current and past doctoral students including Sandeep Chandukala, Jeffrey Dotson, Tatiana Dyachenko, Karthik Easwar, John Howell, Hyojin Lee, Jenny Stewart, Christopher Summers, and Lifeng Yang. The friendship with them has provided me with a great joy and comfort and encouraged me even in hard times. I also want to express my sincere appreciation to Jaehwan Kim for his guidance and care. I could not have completed my doctoral studies without the dedicated support of my wife, Doin. I would like to express my deepest gratitude to her for all the sacrifices she has made and her wholehearted support over the past five years. Doin and Sieun, my daughter, have always been a great source of encouragement, joy, and inspiration. I also want to thank my parents, Chunbae Lee and Myunghee Cho, and my parents-in-law, Cheehang Park and Yesook Cho, for their love and encouragement. Finally, I am also grateful to my friends in the Grace Korean UMC, especially to Rev. Miran and Kenneth Lee for their love and prayer.
Vita
May 24, 1977 . . . . . Born – Seoul, Korea
2003 . . . . . B.S. Chemical Engineering, Seoul National University
2007 . . . . . M.S. Korea Advanced Institute of Science and Technology
2007-Present . . . . . Graduate Teaching and Research Associate, The Ohio State University

Fields of Study
Major Field: Business Administration
Table of Contents
Abstract
Dedication
Acknowledgments
Vita
List of Figures
List of Tables

1. INTRODUCTION

2. A Direct Utility Model for Asymmetric Complements
   2.1 Introduction
   2.2 Model Development
       2.2.1 Direct Utility Model for Sequential Decisions
       2.2.2 Model Likelihood
       2.2.3 Extensions to N Categories
       2.2.4 Heterogeneity
       2.2.5 Simulation Study
   2.3 Empirical Analysis
       2.3.1 Data
       2.3.2 Estimation Results
       2.3.3 Model Comparison
   2.4 Discussion
       2.4.1 Cross-price Elasticity
       2.4.2 Spillover Effect of Marketing Activity
   2.5 Concluding Remarks

3. Modeling Indivisible Demand
   3.1 Introduction
   3.2 Literature Review
   3.3 Model Development
       3.3.1 Direct Utility and Constraints
       3.3.2 A Discrete Likelihood for Indivisible Data
       3.3.3 Comparison to a Continuous Likelihood
       3.3.4 General Case
   3.4 Simulation Study
   3.5 Irregular Regions of Integration
       3.5.1 Sufficient Conditions for Optimality
       3.5.2 Estimation by Bayesian Error Augmentation
   3.6 Empirical Analysis
       3.6.1 Data
       3.6.2 Estimates
   3.7 Discussion
       3.7.1 Decomposition of Corner Probability
       3.7.2 Price Elasticity and Compensating Value
   3.8 Concluding Remarks

BIBLIOGRAPHY

Appendices
   A. Probability Mass for Zero (Discrete vs. Continuous)
   B. Estimation Procedure (with Nonlinear Sub-utility for an Outside Good)
   C. Simulation Study (with Nonlinear Sub-utility for an Outside Good)
List of Figures
2.1 Asymmetric Interdependency in Utility
3.1 Sub-utility for Blueberry Yogurt: Discrete vs. Continuous
3.2 Impact of Package Size on Corner Probability for Blueberry Yogurt
List of Tables
2.1 Simulation Study
2.2 Data Description
2.3 Estimation Results
2.4 Impact of Household Size on Parameters
2.5 Model-fit Comparison
2.6 Estimation Results for Benchmark Models
2.7 Price Elasticity
2.8 Slutsky Decomposition
2.9 Spillover Effect of Merchandising Activity
3.1 Simulation Study
3.2 Data Description
3.3 Estimation Results
3.4 Price Elasticity and Compensating Value
3.5 Simulation Study with Non-linear Sub-utility for an Outside Good
Chapter 1: INTRODUCTION
Economic models for consumer demand provide a structured ground for analyzing data collected from a marketplace and understanding consumer behavior. The general premise is that a consumer makes a purchase decision in order to maximize his/her utility under a certain set of constraints. The utility implies what a consumer gains from a purchase, while the constraints represent what needs to be given up. The two essays in this dissertation emphasize the importance of understanding both aspects of a consumer’s decision making: utility and constraints. Direct utility models are proposed (i) to account for the cross-category relationship between asymmetric complements, which is captured by a super-additive utility structure, and (ii) to incorporate the presence of demand indivisibility, which gives rise to the restriction in a consumer’s decision space.

In Essay 1 (Chapter 2), a direct utility model for asymmetric complements is developed for investigating the cross-category relationships. Asymmetric complements refer to goods where one is more dependent on the other, yet consumers receive enhanced utility from consuming both. Examples include garden hoses and sprinklers, chips and dip, and routine versus personalized services where the former has a broader base for utility generation and the latter is more dependent on the other’s presence. Measuring the presence of asymmetries is difficult because the direction of influence is not observed. A latent sequence of a consumer’s decision across multiple categories is introduced together with the utility structure capable of identifying the origin of demand variation. Scanner panel data of milk and cereal purchases are used to investigate the presence of asymmetric complementarity and implications are explored through counterfactual analyses involving cross-price elasticities and spillover effects of merchandising variables.

Essay 2 (Chapter 3) sheds light on the importance of accommodating constraints in consumers’ decision-making. Disaggregate demand in the marketplace exists on a grid determined by the package sizes offered by manufacturers and retailers. While consumers may want to purchase a continuous-valued amount of a product, realized purchases are constrained by available packages. This constraint might not be problematic for high-volume demand, but it is potentially troubling when demand is small. Despite the prevalence of packaging constraints on choice, economic models of choice have been slow to deal with their effects on parameter estimates and policy implications. A general framework for dealing with indivisible demand is proposed together with the Bayesian methods for estimation. Analyses of simulated data and a scanner-panel dataset of yogurt purchases indicate that ignoring packaging constraints can severely bias parameter estimates and measures of model fit, which results in inaccurate measures of metrics such as price elasticity and compensating value. It is also shown that 9.52% of non-purchase in the data reflects the restriction of indivisibility, not the lack of preference.

Collectively, the two essays enrich our understanding of consumers’ purchase behavior by developing direct utility models in which utility is separated from constraints. This dissertation provides empirical evidence of asymmetric complements that are likely to be purchased together with asymmetric degrees of dependence. It is also shown that consumers are restricted to choose a non-negative integer number of an indivisible good, which gives rise to a non-purchase decision even when purchase intention is present. Methodologically, merits of direct utility models are highlighted by developing a quantitative model for discrete demand of asymmetric complements and corresponding estimation routines are developed using Bayesian methods.
Chapter 2: A Direct Utility Model for Asymmetric Complements
2.1 Introduction
The simultaneous demand for goods in multiple categories can be due to many reasons. Demand can be simultaneously high because goods are complements, from cross-category timing of promotions and price discounts, from relaxed budgets corresponding to large shopping trips, and from the presence of joint demand shocks reflecting unobserved spikes in demand. Until recently, multi-category demand models have relied extensively on modified versions of the same models used to study substitute goods. Modifications include the introduction of correlated error terms (Manchanda et al. (1999), Li et al. (2005)), correlated coefficients (Duvvuri et al. (2007), Hansen et al. (2006), Erdem (1998), Ainslie and Rossi (1998)), the presence of common covariates (Nair et al. (2004), Basu et al. (2003), Manchanda et al. (1999)), and the distinction between consumption and purchase (Dubé (2004)) that helps account for the likelihood of simultaneous purchases. While these modifications have the capability of representing patterns of correlated demand, they are limited in their ability to explain its origin. Correlated errors and correlated coefficients do not
distinguish between inherently complementary products, defined in terms of super-additive utility, versus the presence of other reasons for simultaneous purchase and negative cross-price effects. An interesting aspect of modeling cross-category demand lies in detecting its asymmetric utility structure. One category might be more dependent on the other while the other less dependent on the first, which is not reflected by introducing simple interaction terms into a consumer’s utility function. In models of quadratic utility $U(x_1, x_2) = \alpha' x + \frac{1}{2} x' B x$, for example, off-diagonal elements of the coefficient matrix, $B$, are constrained to be symmetric, e.g., $\beta_{ij} = \beta_{ji}$, because utility is scalar and depends on the sum of coefficients in the term $(\beta_{ij} + \beta_{ji})\, x_i x_j$, not each coefficient separately. The symmetry constraint is present in all models employing quadratic forms - e.g., translog utility functions (Christensen et al. (1975)). The marketing literature has recently begun to move to more formal models of cross-category demand. Song and Chintagunta (2007) propose a model structure for cross-category analysis based on economic theory that relies on a Taylor series approximation to an underlying indirect utility function. Their model captures households’ multi-category purchase incidence, brand choice, and purchase quantity decisions by considering a joint utility function of multi-category purchases. Mehta (2007)’s model is similar to Song and Chintagunta (2007) in that the basic translog indirect utility function is employed and the logit choice probability forms the basis of the model. However, the model is restricted to account for the purchase incidence only, ignoring the possibly confounding effect of satiation on market basket demand. We note that both models rely on an indirect translog utility function (Pollak and Wales (1992), Christensen et al. (1975)), which in turn relies on a Taylor series approximation of an
arbitrary utility or indirect utility function. One of the advantages of this approach is that it does not rely on any specific form of a utility function. However, while offering some flexibility, it is limited in describing cross-category relationships due to the symmetry restriction. We propose a direct utility model in which a consumer makes a sequence of purchase decisions across multiple categories. Utility asymmetries are identified from the latent sequence of a consumer’s purchases, where purchases in one category are affected by earlier purchases in other categories. We posit a latent model for the purchase sequence, allowing the probability of a particular sequence to be influenced by product inventory and observed merchandising variables. Our data likelihood is obtained by integrating over the latent sequences. We contribute to the literature on complementary demand by developing a direct utility model that identifies asymmetric interdependencies among goods. Our model has the following properties: i) an order of sequential purchase decisions is estimated; ii) the model allows for asymmetric interdependencies among categories including complementarity as well as substitutability; iii) the model provides a unified framework to handle brand-level purchase quantities, accommodating both corner and interior solutions; and iv) the model explicitly deals with the indivisibility of data, not assuming a continuous decision space for a consumer. The model is applied to scanner panel data of milk and cereal purchases. The parameter estimates provide evidence of asymmetric complementarity between milk and cereal such that the impact of milk on the utility of cereal is greater than the impact of cereal on the utility of milk. The presence of asymmetric spillover effects has implications for marketing management, particularly for the sequencing of category investment decisions and
the cross-category coordination of promotional campaigns. We show that asymmetric interdependency in utility leads to asymmetric spillover effects of promotions, and that assuming symmetric effects will bias the parameter estimates and distort policy implications. In the next section, we develop our direct utility model that is assumed to be maximized subject to budgetary and packaging restrictions. In section 3, our model is applied to scanner panel data and the estimation results are summarized. Section 4 discusses the implications of our model and concluding remarks are offered in section 5.
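The symmetry restriction on quadratic utility noted earlier in this section can be checked numerically. The sketch below is only an illustration with made-up coefficients (the function name and all numbers are hypothetical, not taken from the dissertation): two coefficient matrices with different off-diagonal elements but the same off-diagonal sum yield the same utility, so the two coefficients cannot be separated.

```python
def quadratic_utility(alpha, B, x):
    """U(x) = alpha'x + 0.5 * x'Bx, written with plain Python lists."""
    n = len(x)
    linear = sum(alpha[i] * x[i] for i in range(n))
    quad = sum(x[i] * B[i][j] * x[j] for i in range(n) for j in range(n))
    return linear + 0.5 * quad


alpha = [1.0, 0.5]   # hypothetical linear coefficients
x = [2.0, 3.0]       # hypothetical consumption bundle

# Asymmetric off-diagonals (0.9 and 0.1) versus symmetric ones (0.5 and 0.5):
# both have the same sum, 1.0, which is all that enters the utility.
B_asym = [[-0.2, 0.9], [0.1, -0.3]]
B_sym = [[-0.2, 0.5], [0.5, -0.3]]

u_asym = quadratic_utility(alpha, B_asym, x)
u_sym = quadratic_utility(alpha, B_sym, x)
```

Because the two utilities coincide for every bundle, purchase data generated under a quadratic form carry no information about which off-diagonal coefficient is larger.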
2.2 Model Development
Complementary products are defined in our model in terms of a super-additive utility structure that favors joint as opposed to separate purchase. Our definition is different from the standard economic definition of complementarity (e.g., Pashigian (1998)) where negative cross-price effects are used to identify complementary products. That is, if a price decrease in one good leads to increased demand for another, then the goods are considered complements. Such an effect, however, can be due to many factors such as a person’s attention to prices being more acute during a weekly, large budgeted shopping trip versus a filler trip for specific basket items. Our goal is to derive insight into how utility is formed, which we believe to be more informative than studying how consumer demand reacts to prices. Asymmetric interdependency between goods is difficult to identify because the cross-category effect is often bi-directional. Utility of beer, for example, is enhanced by salty snacks, and the reverse is also true. When we observe the joint purchase
of both goods without knowing the order of purchase decisions, it is hard to tell which one influences the other. We introduce a latent sequence of purchase decisions by which the direction of the cross-category effect is determined. A model of sequential decisions might reflect the physical path of a consumer’s trip through a store, or it might represent the order of decisions made in a consumer’s mind, which may differ from the temporal sequence of actual purchases. We begin our model development with the utility specification, and then discuss how the sequence of shopping decisions is determined. The statistical specification of our model is then discussed. The model is developed in terms of two product categories for exposition, and then generalized to an arbitrary number of categories.
2.2.1 Direct Utility Model for Sequential Decisions
Suppose there are two product categories (i.e., $X$ and $Y$) and two brands are available for each category (i.e., $x_1, x_2$ and $y_1, y_2$) with their unit prices being $(p^x_1, p^x_2, p^y_1, p^y_2)$ and unit volumes being $(w^x_1, w^x_2, w^y_1, w^y_2)$. In a shopping trip at time $t$, a consumer has a fixed budget $E$ that implies the maximum willingness-to-pay for the two categories and is aware of the levels of the inventories of the two categories (i.e., $\eta^x_t, \eta^y_t$). We assume that consumer purchases reflect a sequential process of constrained utility maximization across product categories. If category $X$ is considered first and the decision on category $Y$ is made conditional on the decision on $X$, then it can be described as a two-step procedure. First, a consumer makes a purchase decision on $X$ to maximize the joint utility of $X$ and an outside good $z$ under a set of constraints: i) the budget constraint defined by
$E$ and the unit prices $p^x_{it}$; ii) the discreteness imposed by package sizes; and iii) non-negativity of purchase quantities. Here, the outside good is defined as the remaining dollars that are saved rather than spent on the inside goods ($X$ and $Y$), and a linear sub-utility for the outside good is assumed, implying that the value of the outside good does not satiate.

$$
\begin{aligned}
\max_{x_{1t}, x_{2t}, z_t} \quad & U(x_{1t}, x_{2t}, z_t \mid \eta^x_t, \eta^y_t) = \sum_{i=1}^{2} \psi^x_{it} \log\left(w^x_i x_{it} + \eta^x_t\right) + \psi^z z_t \\
\text{s.t.} \quad & \sum_{i=1}^{2} p^x_{it} x_{it} + z_t = E \\
& x_{it} \in \{0, 1, 2, \cdots\}, \quad \forall i \in \{1, 2\} \\
& z_t \ge 0
\end{aligned}
\tag{2.1}
$$

where $\psi^x_{it} = \exp\left(\alpha^x_i + \beta_{yx} \eta^y_t + \delta^x_i m^x_{it} + \varepsilon^x_{it}\right)$ with $m^x_{it}$ denoting a marketing activity on $x_{it}$. The parameter $\alpha^x_i$ represents the time-invariant baseline utility of $x_{it}$ and $\delta^x_i$ measures the impact of a marketing activity $m^x_{it}$ that might vary over time. $\beta_{yx}$ captures the impact of category $Y$ on the utility of category $X$, with positive $\beta_{yx}$ implying complementarity and negative $\beta_{yx}$ implying substitutability. $\varepsilon^x_{it}$ is a demand shock that is known to the decision maker, but not revealed to the researcher, and is assumed to follow an i.i.d. normal distribution with mean zero and unit variance. Due to the concavity of the logarithm, the marginal utility of $x_{it}$ is a decreasing function of its own purchase quantity and the inventory of $X$. The utility specification in (1) allows for the variety seeking behavior of a consumer, which is manifested by the shopping basket where multiple brands of one category are purchased simultaneously. Second, conditional on the purchase decision on $X$ (i.e., $x^*_{1t}, x^*_{2t}$), the consumer makes a purchase decision on $Y$ such that the joint utility of $Y$ and the outside good
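$z$ is maximized under the remaining budget (i.e., $E' = E - \sum_{i=1}^{2} p^x_{it} x^*_{it}$).

$$
\begin{aligned}
\max_{y_{1t}, y_{2t}, z_t} \quad & U(y_{1t}, y_{2t}, z_t \mid x^*_{1t}, x^*_{2t}, \eta^x_t, \eta^y_t) = \sum_{j=1}^{2} \psi^y_{jt} \log\left(w^y_j y_{jt} + \eta^y_t\right) + \psi^z z_t \\
\text{s.t.} \quad & \sum_{j=1}^{2} p^y_{jt} y_{jt} + z_t = E' \\
& y_{jt} \in \{0, 1, 2, \cdots\}, \quad \forall j \in \{1, 2\} \\
& z_t \ge 0
\end{aligned}
\tag{2.2}
$$

where $\psi^y_{jt} = \exp\left(\alpha^y_j + \beta_{xy}\left(\eta^x_t + \sum_{i=1}^{2} w^x_i x^*_{it}\right) + \delta^y_j m^y_{jt} + \varepsilon^y_{jt}\right)$. $\beta_{xy}$ captures the impact of category $X$ on the utility of category $Y$ and is associated with the sum of the inventory and the newly purchased amount of $X$ (i.e., $\eta^x_t + \sum_{i=1}^{2} w^x_i x^*_{it}$).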
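The two-step procedure in (2.1) and (2.2) can be sketched as a brute-force search over the integer grid. This is a minimal illustration under stated assumptions, not the estimation code of the dissertation: marketing activities and demand shocks are set to zero, inventories are assumed strictly positive so the logarithm is defined at zero purchases, and the function names, the `max_units` cap, and all numerical inputs are hypothetical.

```python
import math
from itertools import product


def best_bundle(psi, w, price, eta, budget, psi_z=1.0, max_units=10):
    """Enumerate integer bundles for one category and return the best one.

    Utility: sum_i psi_i * log(w_i * q_i + eta) + psi_z * z, with z = budget - spend.
    Assumes eta > 0 so that log() is defined when q_i = 0.
    """
    best, best_utility = None, -math.inf
    for q in product(range(max_units + 1), repeat=len(price)):
        spend = sum(p_i * q_i for p_i, q_i in zip(price, q))
        if spend > budget:  # budget constraint: the outside good z must be >= 0
            continue
        utility = sum(s * math.log(w_i * q_i + eta) for s, w_i, q_i in zip(psi, w, q))
        utility += psi_z * (budget - spend)
        if utility > best_utility:
            best, best_utility = q, utility
    return best


def sequential_purchase(alpha_x, alpha_y, beta_yx, beta_xy, eta_x, eta_y,
                        w_x, w_y, p_x, p_y, budget):
    """Decide X first, then Y conditional on X (marketing and shocks set to zero)."""
    # Step 1: utility of X is shifted by the inventory of Y, as in (2.1).
    psi_x = [math.exp(a + beta_yx * eta_y) for a in alpha_x]
    x = best_bundle(psi_x, w_x, p_x, eta_x, budget)
    # Step 2: Y is chosen with the remaining budget; its utility depends on the
    # inventory of X plus the volume of X just purchased, as in (2.2).
    remaining = budget - sum(p * q for p, q in zip(p_x, x))
    stock_x = eta_x + sum(w * q for w, q in zip(w_x, x))
    psi_y = [math.exp(a + beta_xy * stock_x) for a in alpha_y]
    y = best_bundle(psi_y, w_y, p_y, eta_y, remaining)
    return x, y


# One brand per category, hypothetical values.
x_c, y_c = sequential_purchase([0.0], [0.0], 0.0, 0.5, 1.0, 1.0,
                               [1.0], [1.0], [0.5], [1.0], 5.0)
x_n, y_n = sequential_purchase([0.0], [0.0], 0.0, 0.0, 1.0, 1.0,
                               [1.0], [1.0], [0.5], [1.0], 5.0)
```

In this toy setting the purchase of X raises the marginal utility of Y only when the cross-category coefficient is positive, so the Y quantity differs between the two calls while the X quantity does not.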
When the sequence of purchase decisions is reversed, everything remains the same except for two things: i) the budgetary allotment is now $E$ for the purchase of $Y$ and $E' = E - \sum_{j=1}^{2} p^y_{jt} y^*_{jt}$ for $X$, and ii) $\psi^x_{it}$ is dependent on $\eta^y_t + \sum_{j=1}^{2} w^y_j y^*_{jt}$, while $\psi^y_{jt}$ is only influenced by the current inventory of $X$ and not by the purchase amount of $X$ at time $t$. Note that the optimality conditions under the two sequences are indistinguishable if the observed purchases are all zero. Positive purchase quantities differentiate the optimality conditions under the two different sequences, informing which sequence is more likely than the other. We introduce a latent level of category interest, $\nu_t$, by which the order of the purchase sequence is determined.

$$
\begin{aligned}
\nu^x_t &= \gamma^x_0 + \gamma_1 \eta^x_t + \gamma_2 \sum_{i=1}^{2} m^x_{it} + \xi^x_t \\
\nu^y_t &= \gamma^y_0 + \gamma_1 \eta^y_t + \gamma_2 \sum_{j=1}^{2} m^y_{jt} + \xi^y_t
\end{aligned}
\tag{2.3}
$$
The parameter $\gamma_0$ is the category-specific baseline of interest, and $\gamma_1$, $\gamma_2$ capture the influence of inventory and marketing activities, respectively. We expect the interest in a category to increase when the inventory is low or when marketing activities draw more attention from a consumer. $\xi_t$ is a stochastic component of latent interest and is assumed to follow an i.i.d. type I extreme-value distribution¹, which produces a closed-form expression for the sequence probability. Category $X$ is considered first if its interest level is greater than that of $Y$ (i.e., $\nu^x_t \ge \nu^y_t$), and the reverse is true otherwise. Because household inventory is often not observed by an analyst, a model for consumption is required to approximate the inventory of goods. We employ an exponential decay model for inventory depletion, in which a fixed proportion of goods is consumed in each period (Ailawadi et al. (2007)). The inventory of category $X$ at time $t$ is given by:

$$
\eta^x_t = \lambda_x^{(t-1,t)} \left( \eta^x_{t-1} + \sum_{i=1}^{2} w^x_i x_{i,t-1} \right)
\tag{2.4}
$$

where $(t-1, t)$ denotes the time interval between the two shopping trips, and $\lambda_x$ indicates the per-period rate at which the inventory of category $X$ is carried over, which lies between 0 and 1.
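The inventory recursion in (2.4) and the ordering rule based on (2.3) can be illustrated with a few lines of code. This is a sketch under stated assumptions: the function and argument names are hypothetical, and the logit form of the ordering probability is the standard consequence of comparing two utilities with i.i.d. type I extreme-value errors rather than a quotation of the author's expression.

```python
import math


def inventory_update(eta_prev, purchased_volume, lam, elapsed):
    """Eq. (2.4): carry over a fraction lam**elapsed of last trip's stock.

    eta_prev is the inventory at the previous trip, purchased_volume is the
    volume bought on that trip (sum of w_i * x_i), and elapsed is the time
    between the two trips.
    """
    return (lam ** elapsed) * (eta_prev + purchased_volume)


def prob_x_first(gamma0_x, gamma0_y, gamma1, gamma2, eta_x, eta_y, m_x, m_y):
    """Pr(nu_x >= nu_y) when xi_x and xi_y are i.i.d. type I extreme value."""
    v_x = gamma0_x + gamma1 * eta_x + gamma2 * sum(m_x)  # deterministic interest in X
    v_y = gamma0_y + gamma1 * eta_y + gamma2 * sum(m_y)  # deterministic interest in Y
    return 1.0 / (1.0 + math.exp(v_y - v_x))


# Hypothetical values: half of the stock survives each period, two periods elapse.
eta_now = inventory_update(eta_prev=2.0, purchased_volume=1.0, lam=0.5, elapsed=2)

# With a negative inventory coefficient, the category with less stock is more
# likely to be considered first.
p_first = prob_x_first(0.0, 0.0, -0.5, 1.0, eta_x=0.0, eta_y=2.0,
                       m_x=[0, 0], m_y=[0, 0])
```

A negative value of the inventory coefficient in this sketch reproduces the expectation stated in the text that interest in a category rises when its inventory is low.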
2.2.2 Model Likelihood
The model likelihood comprises two stochastic components: i) the probability of each sequence, and ii) the conditional probability of purchase quantities given a sequence. Because we only observe the purchase quantities without knowing the sequence of decisions, we first derive the conditional likelihood for the purchase quantities, and then integrate over the probabilities of the sequences to obtain the marginal likelihood of the purchase quantities. There are two possible sequences with two categories: i) first $X$ then $Y$ (i.e., $\nu^x_t \ge \nu^y_t$), or ii) first $Y$ then $X$ (i.e., $\nu^x_t < \nu^y_t$). The probability of each sequence can be derived in closed form as follows:

$$
\begin{aligned}
\Pr(\nu^x_t \ge \nu^y_t) &= \frac{1}{1 + \exp\left[ (\gamma^y_0 - \gamma^x_0) + \gamma_1 (\eta^y_t - \eta^x_t) + \gamma_2 \left( \sum_{j=1}^{2} m^y_{jt} - \sum_{i=1}^{2} m^x_{it} \right) \right]} \\
\Pr(\nu^x_t < \nu^y_t) &= 1 - \Pr(\nu^x_t \ge \nu^y_t)
\end{aligned}
\tag{2.5}
$$

¹ $F(x) = \exp(-\exp(-x))$
[…] $> E$), $\varepsilon^x_{it}$ is bounded only from below (i.e., $ub^x_{it} = \infty$). Assuming $\varepsilon^x_{it}$ follows an i.i.d. probability distribution $f$, we can compute the likelihood of $(x^*_{1t}, x^*_{2t})$ by integrating the joint density of the errors over the region specified in (8).

$$
L(x^*_{1t}, x^*_{2t} \mid \nu^x_t \ge \nu^y_t) = \int_{lb^x_{1t}}^{ub^x_{1t}} f(\varepsilon^x_{1t})\, d\varepsilon^x_{1t} \times \int_{lb^x_{2t}}^{ub^x_{2t}} f(\varepsilon^x_{2t})\, d\varepsilon^x_{2t}
\tag{2.9}
$$
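The likelihood of $(y^*_{1t}, y^*_{2t} \mid x^*_{1t}, x^*_{2t})$ can be computed in a similar way. The budget is now $E'$ ($= E - \sum_{i=1}^{2} p^x_{it} x^*_{it}$), and the utility of $y_{jt}$ ($\psi^y_{jt}$) is influenced not only by the inventory ($\eta^x_t$), but also by the purchase quantities of $X$ ($\sum_{i=1}^{2} w^x_i x^*_{it}$). The bounds for the error realizations $(\varepsilon^y_{1t}, \varepsilon^y_{2t})$ are given by:

$$
lb^y_{jt} < \varepsilon^y_{jt} < ub^y_{jt}, \quad \forall j \in \{1, 2\}
\tag{2.10}
$$

where

$$
\begin{aligned}
lb^y_{jt} &= \log\left(\alpha_0 p^y_{jt}\right) - \log\log\left( \frac{w^y_j y^*_{jt} + \eta^y_t}{w^y_j (y^*_{jt} - 1) + \eta^y_t} \right) - \alpha^y_j - \beta_{xy}\left( \eta^x_t + \sum_{i=1}^{2} w^x_i x^*_{it} \right) - \delta^y_j m^y_{jt} \\
ub^y_{jt} &= \log\left(\alpha_0 p^y_{jt}\right) - \log\log\left( \frac{w^y_j (y^*_{jt} + 1) + \eta^y_t}{w^y_j y^*_{jt} + \eta^y_t} \right) - \alpha^y_j - \beta_{xy}\left( \eta^x_t + \sum_{i=1}^{2} w^x_i x^*_{it} \right) - \delta^y_j m^y_{jt}
\end{aligned}
$$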
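Integrating the joint density of $(\varepsilon^y_{1t}, \varepsilon^y_{2t})$ over the region in (10) produces the likelihood of $(y^*_{1t}, y^*_{2t} \mid x^*_{1t}, x^*_{2t})$.

$$
L(y^*_{1t}, y^*_{2t} \mid x^*_{1t}, x^*_{2t}, \nu^x_t \ge \nu^y_t) = \int_{lb^y_{1t}}^{ub^y_{1t}} f(\varepsilon^y_{1t})\, d\varepsilon^y_{1t} \times \int_{lb^y_{2t}}^{ub^y_{2t}} f(\varepsilon^y_{2t})\, d\varepsilon^y_{2t}
\tag{2.11}
$$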
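The bounds in (10) and the integral in (11) reduce to differences of normal CDF values when the errors are standard normal. The sketch below evaluates one factor of that product for a single brand. It is an illustration under stated assumptions: the function names and inputs are hypothetical, the lower bound at a zero purchase is taken to be minus infinity (there is no smaller quantity to compare against), the inventory is assumed strictly positive, and the case in which the budget rules out one more unit is ignored.

```python
import math


def norm_cdf(v):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def error_bounds(q, w, eta, price, alpha0, det_utility):
    """Bounds on the demand shock for an observed integer quantity q.

    det_utility collects the deterministic part of log(psi), i.e.
    alpha_j + beta_xy * (eta_x + sum_i w_i x_i*) + delta_j * m_jt.
    Assumes eta > 0 so that the ratios below are well defined.
    """
    upper_ratio = (w * (q + 1) + eta) / (w * q + eta)
    ub = math.log(alpha0 * price) - math.log(math.log(upper_ratio)) - det_utility
    if q == 0:
        lb = -math.inf  # assumption: a zero purchase has no lower neighbour
    else:
        lower_ratio = (w * q + eta) / (w * (q - 1) + eta)
        lb = math.log(alpha0 * price) - math.log(math.log(lower_ratio)) - det_utility
    return lb, ub


def quantity_probability(q, w, eta, price, alpha0, det_utility):
    """Probability mass of observing q: one factor of the product in (2.11)."""
    lb, ub = error_bounds(q, w, eta, price, alpha0, det_utility)
    return norm_cdf(ub) - norm_cdf(lb)


# Hypothetical inputs: unit volume 1, inventory 1, price 1, alpha0 = 1.
masses = [quantity_probability(q, 1.0, 1.0, 1.0, 1.0, 0.0) for q in range(201)]
total_mass = sum(masses)
```

The upper bound for a quantity equals the lower bound for the next quantity, so the intervals tile the real line and the probability masses over the integer grid sum to (nearly) one. This is the sense in which each observed purchase corresponds to a range of error realizations rather than a single point.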
The conditional likelihood of $(x^*_{1t}, x^*_{2t}, y^*_{1t}, y^*_{2t})$ given the specific sequence is equal to the product of the likelihood of $(x^*_{1t}, x^*_{2t})$ and the likelihood of $(y^*_{1t}, y^*_{2t} \mid x^*_{1t}, x^*_{2t})$.

$$
L(x^*_{1t}, x^*_{2t}, y^*_{1t}, y^*_{2t} \mid \nu^x_t \ge \nu^y_t) = L(x^*_{1t}, x^*_{2t} \mid \nu^x_t \ge \nu^y_t) \times L(y^*_{1t}, y^*_{2t} \mid x^*_{1t}, x^*_{2t}, \nu^x_t \ge \nu^y_t)
\tag{2.12}
$$

Consequently, the joint likelihood of the sequence and the purchase quantities is given by multiplying the probability of the sequence in (5) and the conditional likelihood of the purchase quantities given the sequence in (12).

$$
L(x^*_{1t}, x^*_{2t}, y^*_{1t}, y^*_{2t}, \nu^x_t \ge \nu^y_t) = \Pr(\nu^x_t \ge \nu^y_t) \times L(x^*_{1t}, x^*_{2t}, y^*_{1t}, y^*_{2t} \mid \nu^x_t \ge \nu^y_t)
\tag{2.13}
$$

Because the joint likelihood of the opposite sequence and the purchase quantities is given in a similar manner, we can compute the marginal likelihood of the purchase quantities by integrating out the purchase orders:

$$
L(x^*_{1t}, x^*_{2t}, y^*_{1t}, y^*_{2t}) = L(x^*_{1t}, x^*_{2t}, y^*_{1t}, y^*_{2t}, \nu^x_t \ge \nu^y_t) + L(x^*_{1t}, x^*_{2t}, y^*_{1t}, y^*_{2t}, \nu^x_t < \nu^y_t)
\tag{2.14}
$$
2.2.3 Extensions to N Categories
Although the model likelihood in (14) is derived for the simplest case with two categories only, it can be extended to a more general case where $N$ categories are related to one another. A consumer has a latent level of interest for category $i$, which is affected by the level of inventory and the sum of marketing activities conducted on the $n_i$ brands under the category:

$$
\nu^i_t = \gamma^i_0 + \gamma_1 \eta^i_t + \gamma_2 \sum_{k=1}^{n_i} m^i_{kt} + \xi^i_t, \quad i \in \{1, \cdots, N\}
\tag{2.15}
$$

If $\xi^i_t$ is assumed to follow an i.i.d. type I extreme-value distribution, the probability of a complete sequence is given by an exploded logit form as follows (Chapman and Staelin (1982)):

$$
\Pr\left( \nu^1_t > \nu^2_t > \cdots > \nu^N_t \right) = \prod_{i=1}^{N-1} \frac{\exp\left( \gamma^i_0 + \gamma_1 \eta^i_t + \gamma_2 \sum_{k=1}^{n_i} m^i_{kt} \right)}{\sum_{j=i}^{N} \exp\left( \gamma^j_0 + \gamma_1 \eta^j_t + \gamma_2 \sum_{k=1}^{n_j} m^j_{kt} \right)}
\tag{2.16}
$$
Conditional on a specific sequence $S_i$ (e.g., $\nu^1_t > \nu^2_t > \cdots > \nu^N_t$), it is straightforward to compute the conditional likelihood of purchase quantities. Let $x^{i*}$ denote the observed purchase quantities for brands in category $i$. The conditional likelihood of the purchase quantities across $N$ categories $(x^{1*}, \cdots, x^{N*})$ is given by:

$$
L\left( x^{1*}, \cdots, x^{N*} \mid S_i \right) = L\left( x^{1*} \mid S_i \right) \times L\left( x^{2*} \mid x^{1*}, S_i \right) \times \cdots \times L\left( x^{N*} \mid x^{1*}, \cdots, x^{N-1*}, S_i \right)
\tag{2.17}
$$

Consequently, the likelihood of purchase quantities is computed by summing over all the joint probabilities of the purchase quantities and sequence as follows:

$$
L\left( x^{1*}, \cdots, x^{N*} \right) = \sum_{i=1}^{N!} \Pr(S_i) \times L\left( x^{1*}, \cdots, x^{N*} \mid S_i \right)
\tag{2.18}
$$
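The exploded logit in (2.16) and the sum over sequences in (2.18) can be sketched as follows. The code is illustrative only: the function names are hypothetical, `v` stands for the deterministic part of each category's latent interest, and `conditional_likelihood` is a placeholder callable that would return the conditional likelihood in (2.17) for a given ordering.

```python
import math
from itertools import permutations


def sequence_probability(order, v):
    """Exploded-logit probability that categories are considered in `order`.

    v[i] is the deterministic interest of category i, i.e.
    gamma0_i + gamma1 * eta_i + gamma2 * sum_k m_ik.
    """
    prob = 1.0
    remaining = list(order)
    while len(remaining) > 1:
        denom = sum(math.exp(v[j]) for j in remaining)
        prob *= math.exp(v[remaining[0]]) / denom  # first remaining category wins
        remaining.pop(0)
    return prob


def marginal_likelihood(v, conditional_likelihood):
    """Eq. (2.18): sum over all N! sequences of Pr(S) * L(data | S)."""
    categories = range(len(v))
    return sum(sequence_probability(s, v) * conditional_likelihood(s)
               for s in permutations(categories))


v_demo = [0.2, -0.5, 1.0]  # hypothetical interest levels for three categories
total_prob = sum(sequence_probability(s, v_demo) for s in permutations(range(3)))
flat = marginal_likelihood(v_demo, lambda s: 0.3)  # constant likelihood stand-in
```

With two categories the product in the exploded logit has a single factor, so the expression collapses to a binary logit. The number of terms in the sum grows as N!, which is manageable for a handful of categories but becomes costly quickly.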
2.2.4 Heterogeneity
Consumer heterogeneity is incorporated by introducing a random-effect specification for household parameters:

$$
\theta_h = \left( \{\alpha\}_h, \{\beta\}_h, \{\delta^*\}_h, \{\gamma\}_h, \{\lambda^*\}_h \right) \sim N\left( \bar{\theta} + \Delta' z_h, V_\theta \right)
\tag{2.19}
$$

where $h = 1, \cdots, H$ indexes the households, and $\delta = \exp(\delta^*)$, $\lambda = \frac{1}{1 + \exp(-\lambda^*)}$.² We examine household size and household income as covariates $z_h$, and center these variables so that $\bar{\theta}$ reflects the population mean.
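The reparameterization and the mean of the random-effect distribution in (2.19) can be illustrated with a short sketch. The function names, the layout of `Delta` (one row per covariate, one column per parameter), and the numbers are assumptions made for the example only.

```python
import math


def to_constrained(delta_star, lambda_star):
    """Map unconstrained draws to delta > 0 and 0 < lambda < 1."""
    delta = math.exp(delta_star)
    lam = 1.0 / (1.0 + math.exp(-lambda_star))
    return delta, lam


def household_mean(theta_bar, Delta, z_h):
    """Mean of the random-effect distribution: theta_bar + Delta' z_h.

    Delta[k][p] is the effect of covariate k on parameter p (assumed layout).
    """
    return [tb + sum(Delta[k][p] * z_h[k] for k in range(len(z_h)))
            for p, tb in enumerate(theta_bar)]


delta_demo, lambda_demo = to_constrained(0.0, 0.0)

theta_bar = [1.0, 2.0]                 # hypothetical population means
Delta = [[0.1, 0.2], [0.3, 0.4]]       # hypothetical covariate effects
mean_centered = household_mean(theta_bar, Delta, [0.0, 0.0])
mean_shifted = household_mean(theta_bar, Delta, [1.0, -1.0])
```

When the covariates are centered, a household at the average covariate values has a zero covariate vector, so its mean coincides with the population mean, which is the interpretation given in the text.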
2.2.5 Simulation Study
A simulation study is conducted to illustrate the performance of our model. Data are generated according to the model likelihood with randomly generated prices and feature advertising variables for 150 respondents and 150 observations per respondent. These choices were made so that the simulated prices and data are of similar length to our empirical study reported below. We illustrate the model using three product categories and two brands under each category, and various cross-category relationships are imposed among the three categories. X and Y are asymmetric complements, while Y and Z are substitutes. X and Z are assumed to be independent. A random-effect distribution was employed without covariates with covariance equal to 0.5I. Table 1 reports the estimated mean of the random-effect distribution, along with the true values.

[Table 1]
² The reparameterization guarantees $0 < \delta$ and $0 < \lambda < 1$.
All the parameters are well-recovered, having the true values within 95% credible intervals. The simulation study confirms that we can estimate models of various cross-category relationships including independence, substitutes, and asymmetric complementarity.
2.3 Empirical Analysis

2.3.1 Data
We applied our model to the IRI panel dataset described in Bronnenberg et al. (2008). Milk and cereal categories are chosen to illustrate the asymmetric cross-category inferences afforded by our model specification. The data contain each household’s grocery shopping history, in which purchase quantities, prices, and marketing activities (e.g., feature and display) are available. We include all purchase occasions in our analysis, including those when no purchases are made in either category. Our model is general enough to be applied to a brand-level analysis. However, the cereal category contains many brands with relatively small volume, with the top 40 brands accounting for 70% of total demand and the share of the leading brand being just 8.8%. We therefore aggregate stock keeping units (SKUs) to keep the number of model parameters at a reasonable level. We retain the two leading manufacturers in each product category for analysis, and aggregate the products to form three package-size groups within each manufacturer. This allows us to move from an analysis of 157 product codes (UPCs) to twelve aggregates, six in milk and six in cereal. Therefore, the subscript $i$ in (1) corresponds to each aggregate, and $\alpha^x_i$ is determined by the following four indicators:

$$
\alpha^x_i = \kappa^x_1 D1^x_i + \kappa^x_2 D2^x_i + \kappa^x_l DL^x_i + \kappa^x_s DS^x_i
\tag{2.20}
$$
where D1xi , D2xi , DLxi , DSix are the indicators of manufacturer1, manufacturer2, large pack-size, and small pack-size, respectively. The top two manufacturers are Guida and Private Label for milk, and General Mills and Kellogg for cereal, and they account for 60∼80% of the total sales with each category. A share-weighted average of the UPClevel store variables (e.g., unit price, unit volume, feature, and display) is computed to represent those of each package-size group. We select households who have at least 20 purchases of milk and cereal during the two-year data period, resulting in 140 households. The unit volume of the most frequently purchased item is used as the base unit of volume for each category (128oz for milk, 15oz for cereal). Basic description of the data is provided in Table 2. [Table 2] Households make an average of 151 shopping trips during the two-year data period, with 42% of the trips involving the purchase of milk and 27% of the trips involving cereal. Forty-seven percent of the trips contain purchases in neither category, while 16% of the trips contain both. We find that large-sized packages of milk and medium-sized packages of milk are more frequently promoted (e.g., feature advertisement) than other package sizes, and that the data show variety seeking behavior of consumers in both milk and cereal categories. Thirteen percent of the purchases of cereal exhibit purchases from multiple manufacturers, which is rarely observed in milk. Joint purchases of multiple pack-sizes (within the same manufacturer) are observed 5% of the time for milk, and 16% for cereal. We use 90% of the data for the model calibration, leaving 10% for prediction.
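As a rough illustration of how equation (2.20) builds a baseline parameter from the four indicators, the sketch below plugs in the posterior means of the population-level κ parameters reported in the estimation-results table. The dictionary layout and the function name `baseline_alpha` are hypothetical, and using population means in place of household-level draws is a simplification made only for illustration.

```python
# Posterior means of the kappa hyper-parameters (Table 2.3, panels a-i and a-ii).
KAPPA = {
    "milk":   {"m1": -2.426, "m2": -1.721, "large": 0.781,  "small": -2.600},
    "cereal": {"m1": -1.188, "m2": -1.212, "large": -0.240, "small": -0.009},
}

def baseline_alpha(category: str, manufacturer: int, pack_size: str) -> float:
    """Sum the kappa terms switched on by the four 0/1 indicators."""
    k = KAPPA[category]
    d1 = 1 if manufacturer == 1 else 0      # manufacturer 1 indicator
    d2 = 1 if manufacturer == 2 else 0      # manufacturer 2 indicator
    dl = 1 if pack_size == "large" else 0   # large pack-size indicator
    ds = 1 if pack_size == "small" else 0   # small pack-size indicator
    return k["m1"] * d1 + k["m2"] * d2 + k["large"] * dl + k["small"] * ds

# Guida (milk manufacturer 1), large pack: both indicators contribute.
guida_large = baseline_alpha("milk", 1, "large")
# Private Label (milk manufacturer 2), medium pack: only the manufacturer term.
private_label_medium = baseline_alpha("milk", 2, "medium")
print(guida_large, private_label_medium)
```

Medium packages act as the reference pack-size, since neither the large nor the small indicator is switched on for them.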
2.3.2 Estimation Results
We assume that consumers maximize utility subject to budget and packaging constraints, and employ the estimation strategy of Lee and Allenby (2012) for indivisible demand. Demand discreteness arises because transactions are constrained to lie on a grid of support defined by the available package sizes. The resulting model likelihood has interior points defined in terms of point masses instead of the usual density contribution to the likelihood. As discussed earlier, there are many error realizations associated with any observed purchase when demand is indivisible. We estimate the model using Bayesian MCMC methods with proper but relatively diffuse priors (see footnote 3). The chain was run for 70,000 iterations, with the first 20,000 draws discarded as burn-in. Because sequential draws from the joint posterior are autocorrelated, we thinned the chain by keeping every 10th draw of the remaining 50,000 draws (Raftery and Lewis (1995), Geyer (1992)). Table 3 reports posterior means and standard deviations of the hyper-parameters, $\bar{\theta}$. [Table 3]

Footnote 3: $\theta_h \equiv (\alpha_h, \beta_h, \delta_h^*, \gamma_h, \lambda_h^*) \sim N(\bar{\theta}, V_\theta)$, where $\bar{\theta} \sim N(0, 100I)$ and $V_\theta \sim IW(10, 10I)$.

There is ordinal agreement between the estimates of baseline preferences for manufacturers reported in Table 3 and the market shares reported in Table 2. Consumers prefer the 1-gallon package size in milk and disfavor milk packaged in quart containers. In cereal, the larger package size (20 oz) is less preferred than medium (15 oz) or small (11 oz). More interesting are the estimates of the complementarity parameters that reflect cross-category effects. The two categories show a complementary relationship in which the marginal utility of one good is enhanced by the presence of the other. The impact of milk on the utility of cereal is greater than the impact of cereal on the utility of milk (i.e., $\bar{\beta}_{mc} > \bar{\beta}_{cm}$), indicating the presence of asymmetric complementarity. Figure 1 illustrates the asymmetric effect, where the sub-utility of cereal is heightened by the presence of milk while the sub-utility of milk is only weakly enhanced by cereal. These results have face validity because milk is often consumed independently while cereal is more likely to be consumed with milk. [Figure 1]

Marketing activities of the retailer (e.g., feature advertisement and merchandising display) enhance the marginal utility of the promoted item, which in turn increases the likelihood of purchase. Table 3 shows differences in the effectiveness of various marketing activities. Feature advertisement is effective for Guida milk and Kellogg cereal, while its influence is small for Private Label milk and General Mills cereal. The merchandising display of cereal has a greater impact on marginal utility than the feature advertisement. The inventory rate is greater for cereal than for milk, implying that consumers tend to stockpile cereal, while milk is more frequently depleted. The baseline interest of cereal is higher than that of milk, which is set to zero for identification. However, when a consumer stockpiles cereal, the purchase decision for milk is made first because the latent interest of cereal is lowered by the inventory. Marketing activities such as feature and display do not have a significant impact on the latent interests of the categories. Finally, we find that household size is significantly related to the individual-level parameters, as summarized in Table 4. Larger households prefer larger packages of milk, but not of cereal. We do not find any significant relationship between household income and individual-level parameters. [Table 4]
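The burn-in and thinning scheme described in this section (70,000 iterations, 20,000 discarded, every 10th of the remainder kept) can be sketched as follows. The draws below are random placeholders standing in for the output of a real sampler, and all variable names are hypothetical.

```python
import random
import statistics

N_ITER, BURN_IN, THIN = 70_000, 20_000, 10

# Placeholder chain: in practice each element would be one MCMC draw of a parameter.
rng = random.Random(0)
chain = [rng.gauss(0.0, 1.0) for _ in range(N_ITER)]

# Discard the burn-in draws, then keep every 10th draw of the remainder.
kept = chain[BURN_IN::THIN]

posterior_mean = statistics.mean(kept)
posterior_sd = statistics.stdev(kept)
print(len(kept), posterior_mean, posterior_sd)
```

With these settings 5,000 draws remain for computing posterior summaries such as the means and standard deviations reported in the tables.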
2.3.3 Model Comparison
The proposed model is compared with four benchmark models. The first model is identical to the proposed model except for the symmetric restriction on complementarity (i.e., βij = βji). The second model assumes there is no interdependency between the two categories (i.e., βij = βji = 0). When categories are independent, the conditional likelihood of purchase quantities given a sequence stays the same across different sequences. Therefore, the parameters capturing the latent level of category interest (i.e., γ) are not identified in benchmark model 2. In the third and fourth benchmark models, consumers' purchase sequence is assumed to be fixed, and thus pre-imposed by the analyst. Benchmark model 3 assumes that the milk decision is made independently, and that the decision on cereal is then made conditionally. The opposite sequence holds in benchmark model 4. Table 5 compares the proposed model with the benchmark models in terms of model fit, and the parameter estimates of the four benchmark models are presented in Table 6. [Table 5] [Table 6]

Table 5 shows that the proposed model outperforms all the benchmark models in both in-sample and predictive fit. In Table 6, imposing symmetric restrictions on the nature of complementarity, as in the first benchmark model, leads to a systematic bias such that the impact of milk on the utility of cereal is underestimated while the impact of cereal on the utility of milk is overestimated. Accordingly, the baseline parameters of the more dependent category (i.e., cereal) are inflated, while those of the independent category (i.e., milk) are reduced. Benchmark model 2, in which no complementarity is assumed, shows the worst performance in fitting the data. This indicates that allowing for some complementarity is important, and we also find that ignoring complementarity results in the overestimation of the baseline parameters, especially for the more dependent category (i.e., cereal). Comparison to the third and fourth benchmark models indicates that the sequence of purchase decisions varies over time and across households.
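The fit comparison can be made concrete by differencing the log marginal densities (LMD) reported in the model-fit table. For the in-sample column, the difference in LMD between two models is the log Bayes factor; the helper name `lmd_gap` is hypothetical, and the numbers are copied from the table.

```python
# Log marginal densities from the model-fit comparison table (Table 2.5).
LMD_IN_SAMPLE = {
    "Proposed": -49458.78,
    "Benchmark 1": -50010.32,
    "Benchmark 2": -51035.61,
    "Benchmark 3": -50203.07,
    "Benchmark 4": -50435.24,
}
LMD_PREDICTION = {
    "Proposed": -5208.81,
    "Benchmark 1": -5261.48,
    "Benchmark 2": -5364.50,
    "Benchmark 3": -5314.83,
    "Benchmark 4": -5361.54,
}

def lmd_gap(lmd: dict, reference: str = "Proposed") -> dict:
    """LMD of the reference model minus the LMD of each other model."""
    return {m: lmd[reference] - v for m, v in lmd.items() if m != reference}

in_sample_gap = lmd_gap(LMD_IN_SAMPLE)
prediction_gap = lmd_gap(LMD_PREDICTION)
for model in in_sample_gap:
    print(model, round(in_sample_gap[model], 2), round(prediction_gap[model], 2))
```

Every gap is positive, which is the sense in which the proposed model is favored over each benchmark in both columns.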
2.4 Discussion
Asymmetric complements are products that drive utility in other categories at differential rates, and their presence is potentially important for product design, pricing, distribution and advertising decisions. In the extreme case of tied goods such as a computer printer and ink, it is often profitable to enhance, advertise, and discount the dominant good (e.g., the printer) with the goal of inducing sales of the dependent good. Our model informs such decisions through the estimated model parameters. In our analysis of scanner panel data, prices and merchandising variables are observed to vary, and in this section we explore the implications of asymmetric complements for these variables. We use our sequential model of asymmetric complements to investigate two issues. First, we compute model-based cross-price elasticity estimates by aggregating individual-level demand estimates under various price discounts, and show that the estimates are influenced by the presence of asymmetric effects. We then investigate the cross-category effect of merchandising variables.
2.4.1 Cross-price Elasticity
Complementarity produces a positive interaction in the demand for products, resulting in negative cross-price effects where a decrease in the price of one category leads to an increase in the demand of another. To estimate the cross-price elasticity between milk and cereal, we calculate the expected demand of each category for various prices. The expected demand is computed by integrating over the posterior distributions of parameters as well as the distribution of error terms for each of the actual (observed) prices versus prices that are increased or decreased by the indicated amount. Feature and display variables are set to their average values in the data to reflect an average level of promotion in the product category. The level of inventory is similarly set to the average value of the inventory trajectory that is computed from the household-specific estimate of the inventory rate parameter and the time-series data of purchase history. Table 7 reports changes in expected demand when prices are changed by 10%. Also reported, in parentheses, is the expected change in demand for the first two benchmark models. The first benchmark model assumes that complementary effects are symmetric (i.e., βij = βji), and the second benchmark model assumes no complementary relationships (i.e., βij = βji = 0). [Table 7]

As the price of a product decreases, its own demand increases (i.e., negative own-price elasticity) and the demand of the complementary product increases while the demand of the outside good decreases. The own-price elasticity of cereal (i.e., -1.69) is greater in magnitude than that of milk (i.e., -1.33). The demand for cereal is sensitive to the price change of milk (i.e., -0.68), while the impact of cereal prices on the demand of milk is smaller (i.e., -0.34), showing that the asymmetric effects in utility lead to asymmetric effects in price. The comparison to the benchmark models indicates that the restriction of symmetric or no interdependency distorts the own-price as well as the cross-price elasticities. The model of symmetric complementarity underestimates the spillover effect of a price change in milk and overestimates the spillover effect of a price change in cereal.

One might argue that the complementarity between the two goods is not the only source of the negative cross-price elasticity of aggregate demand. When there is a price discount in one category, it not only changes the rate of exchange among goods (i.e., the substitution effect) but also enhances the purchasing power (i.e., the income effect) of a consumer's budget. The Slutsky equation addresses this issue by decomposing the impact of a price change into two effects: a substitution effect and an income effect.

\[ \frac{\partial x_i(p, E)}{\partial p_j} = \frac{\partial h_i(p, u)}{\partial p_j} - x_j(p, E)\,\frac{\partial x_i(p, E)}{\partial E} \tag{2.21} \]
where h(p, u) is the Hicksian demand and x(p, E) is the Marshallian demand, at price p, budget E, and utility level u. While the substitution effect results from the change in relative prices, the income effect captures an increase or decrease in the consumer's purchasing power as a result of the price change. The results of the Slutsky decomposition are reported in Table 8. They indicate that the income effect is negligible in our categories, and that the changes in demand reported in Table 7 primarily reflect the substitution effects.
[Table 8]
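The elasticities quoted in the text follow from dividing the percentage change in demand by the percentage change in price, and the decomposition table can be checked by adding the two effects. The sketch below does both with the reported numbers; the dictionary keys and variable names are hypothetical, and the arc approximation (ratio of percentage changes) is an assumption about how the quoted elasticities were obtained.

```python
PRICE_CHANGE_PCT = -10.0  # a 10% price decrease

# (demand category, category whose price is cut) -> % change in demand (Table 2.7).
PCT_CHANGE = {
    ("milk", "milk"): 13.26,
    ("cereal", "milk"): 6.78,
    ("milk", "cereal"): 3.39,
    ("cereal", "cereal"): 16.89,
}
elasticity = {k: v / PRICE_CHANGE_PCT for k, v in PCT_CHANGE.items()}

# (substitution effect, income effect, total effect) in % (Table 2.8).
SLUTSKY = {
    ("milk", "milk"): (12.71, 0.55, 13.26),
    ("cereal", "milk"): (5.93, 0.85, 6.78),
    ("outside", "milk"): (-0.48, 0.14, -0.34),
    ("milk", "cereal"): (2.75, 0.64, 3.39),
    ("cereal", "cereal"): (14.74, 2.14, 16.89),
    ("outside", "cereal"): (-0.52, 0.18, -0.34),
}
# Largest gap between substitution + income and the reported total (rounding only).
max_gap = max(abs(s + i - t) for s, i, t in SLUTSKY.values())

for key, value in elasticity.items():
    print(key, round(value, 2))
print("largest rounding gap:", round(max_gap, 2))
```

The two effects add up to the reported totals to within rounding in every cell.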
2.4.2 Spillover Effect of Marketing Activity
Table 9 reports estimates of the effect of changes in feature advertisement. Reported is the percentage change in demand when feature advertisement is enhanced by 20% in each category. We find, as with the price elasticity estimates, that the spillover effect is greater when the independent category is promoted. Enhanced feature advertisement in milk increases the demand of cereal by 0.57%, while the demand of milk increases by 0.24% with feature advertisement in cereal. [Table 9]

In summary, we find that allowing for asymmetric parameters in a direct utility model is important for measuring cross-category spillover effects. Demand of the more dependent category is influenced to a greater extent by marketing activities on its complement, while the spillover effect is small in the opposite case. Not accounting for asymmetric parameters in the utility structure, as in benchmark models 1 and 2, leads to systematic biases in the estimated effects of prices and merchandising variables: (i) the cross-category effects of marketing activities are underestimated for a less dependent category and overestimated for a more dependent category, and (ii) the effects of marketing activities on a category's own demand are distorted.
2.5 Concluding Remarks
A model of asymmetric complements is developed within a formal model of direct utility maximization. Complementary effects occur when the marginal utility of one good is enhanced by the presence of a second good. A challenge in developing models of such asymmetry is that the complementary effect equals the cross-partial derivative of the utility function, $\partial^2 U / (\partial x_1 \partial x_2)$, which is symmetric because the order of differentiation does not matter. Additional information is therefore needed to identify asymmetric utility effects, and this paper uses the longitudinal variation of marginal utility as revealed in observed purchases across product categories. A latent decision sequence is hypothesized and used to identify a sequential purchase decision where the purchase of goods early in the sequence is not affected by goods later in the sequence. We allow the order of the purchase sequence to be partially dependent on current product inventory and merchandising variables that change over time. The model likelihood is obtained by integrating over all possible decision sequences, and is shown to fit our data better than models that restrict effects to be either symmetric or non-existent. We find evidence of asymmetric spillover effects between milk and cereal, where milk influences the consumption of cereal but cereal has little effect on milk consumption. Identifying the presence of asymmetric complements is important whenever decisions have multiple components and the decision sequence is not fixed. Cross-selling is one example where the order of product purchase may affect the likelihood of purchasing other items. This occurs in financial services, where it is believed that some initial exposure to a firm through a simple product (e.g., a checking account) aids in the successful sale of more complicated products (e.g., insurance products). The same rationale is used to offer introductory items at a discount to encourage additional purchases in the future. The identification of which introductory items have larger spillover effects is an important consideration in this strategy. We therefore believe the model has potentially wide application.
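Integrating the likelihood over the latent decision sequences amounts, for a finite set of sequences, to a probability-weighted sum of sequence-specific likelihoods. The sketch below is a minimal illustration under stated assumptions: the sequence probabilities and conditional log-likelihood values are made-up inputs rather than quantities from the estimated model, and the function name is hypothetical.

```python
import math

def log_marginal_over_sequences(log_liks, seq_probs):
    """log( sum_s P(s) * L(data | s) ), computed with the log-sum-exp trick."""
    terms = [math.log(p) + ll for p, ll in zip(seq_probs, log_liks) if p > 0.0]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

# Two categories give two sequences: milk first, or cereal first (illustrative inputs).
probs = [0.7, 0.3]
mixed = log_marginal_over_sequences([-2.0, -3.0], probs)

# If the conditional likelihood is the same under every sequence, the sequence
# probabilities drop out of the marginal likelihood.
flat_a = log_marginal_over_sequences([-2.5, -2.5], [0.7, 0.3])
flat_b = log_marginal_over_sequences([-2.5, -2.5], [0.1, 0.9])
print(mixed, flat_a, flat_b)
```

The second case mirrors the earlier remark about the no-complementarity benchmark: when the conditional likelihood does not vary with the sequence, the data carry no information about the parameters governing the sequence.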
Additional research is needed to identify the attributes and attribute-levels that give rise to asymmetric effects. Our analysis based on observed demand data did not include information on product attributes, and we view the ability to identify the origin of asymmetric effects as a potentially profitable area of marketing research. The development of asymmetric models of stated preference data is therefore needed.
Figure 2.1: Asymmetric Interdependency in Utility
[Four panels, each plotting sub-utility (vertical axis, 0.0 to 1.2) against purchase quantity (horizontal axis, 0 to 5): utility of Guida milk and utility of Private Label milk, each drawn without and with cereal; utility of General Mills cereal and utility of Kellogg cereal, each drawn without and with milk.]
* The figures are drawn based on 128oz milk and 15oz cereal.
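Table 2.1: Simulation Study

(a) Baseline

| parameter | $\bar{\alpha}_{1x}$ | $\bar{\alpha}_{2x}$ | $\bar{\alpha}_{1y}$ | $\bar{\alpha}_{2y}$ | $\bar{\alpha}_{1z}$ | $\bar{\alpha}_{2z}$ |
|---|---|---|---|---|---|---|
| true value | 1.000 | 0.800 | 0.700 | 0.900 | 0.900 | 0.850 |
| post.mean | 1.000 (0.040) | 0.815 (0.043) | 0.699 (0.043) | 0.898 (0.042) | 0.887 (0.046) | 0.848 (0.042) |

(b) Cross-category Relationship

| parameter | $\bar{\beta}_{xy}$ | $\bar{\beta}_{yx}$ | $\bar{\beta}_{xz}$ | $\bar{\beta}_{zx}$ | $\bar{\beta}_{yz}$ | $\bar{\beta}_{zy}$ |
|---|---|---|---|---|---|---|
| true value | 0.200 | 0.400 | 0.000 | 0.000 | -0.100 | -0.200 |
| post.mean | 0.194 (0.042) | 0.371 (0.044) | -0.008 (0.040) | 0.014 (0.037) | -0.118 (0.043) | -0.209 (0.041) |

(c) Impact of Marketing Activity on Utility

| parameter | $\bar{\delta}^*_{1x}$ | $\bar{\delta}^*_{2x}$ | $\bar{\delta}^*_{1y}$ | $\bar{\delta}^*_{2y}$ | $\bar{\delta}^*_{1z}$ | $\bar{\delta}^*_{2z}$ |
|---|---|---|---|---|---|---|
| true value | -1.204 | -1.204 | -1.204 | -1.204 | -1.204 | -1.204 |
| post.mean | -1.292 (0.085) | -1.292 (0.098) | -1.319 (0.106) | -1.285 (0.098) | -1.359 (0.121) | -1.230 (0.109) |

(d) Inventory Rate

| parameter | $\bar{\lambda}^*_{x}$ | $\bar{\lambda}^*_{y}$ | $\bar{\lambda}^*_{z}$ |
|---|---|---|---|
| true value | 0.000 | 0.000 | 0.000 |
| post.mean | -0.021 (0.074) | -0.047 (0.084) | -0.101 (0.082) |

(e) Decision Sequence

| parameter | $\bar{\gamma}_{0y}$ | $\bar{\gamma}_{0z}$ | $\bar{\gamma}_{1}$ | $\bar{\gamma}_{2}$ |
|---|---|---|---|---|
| true value | 0.000 | 0.000 | -1.000 | 1.000 |
| post.mean | -0.027 (0.148) | -0.295 (0.245) | -0.849 (0.291) | 0.823 (0.136) |

* Posterior standard deviations are given in parentheses ( ).
* $\delta = \exp(\delta^*)$, $\lambda = \frac{1}{1+\exp(-\lambda^*)}$.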
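Table 2.2: Data Description

(a) Purchase Frequency

| Category | Number of Trips (%) |
|---|---|
| Milk and Cereal | 3312 (15.7%) |
| Milk only | 5625 (26.6%) |
| Cereal only | 2323 (11.0%) |
| None | 9882 (46.7%) |

(b) Descriptive Statistics

| Category | Manufacturer (Market Share) | Pack-size | Number of UPCs | Unit Volume (16oz) | Unit Price (USD) | Price Per Volume | Feature | Display | Purchase Incidence | Purchase Quantity |
|---|---|---|---|---|---|---|---|---|---|---|
| Milk | Guida (21.02%) | large | 3 | 8.00 | 3.38 | 0.42 | 0.15 | 0.00 | 7.15% | 1.35 |
| Milk | Guida (21.02%) | medium | 3 | 4.00 | 1.92 | 0.48 | 0.04 | 0.00 | 2.57% | 1.28 |
| Milk | Guida (21.02%) | small | 6 | 2.00 | 1.07 | 0.54 | 0.00 | 0.00 | 0.09% | 1.11 |
| Milk | Private Label (54.12%) | large | 6 | 8.00 | 3.27 | 0.41 | 0.11 | 0.00 | 20.23% | 1.26 |
| Milk | Private Label (54.12%) | medium | 9 | 4.00 | 1.80 | 0.45 | 0.08 | 0.00 | 13.16% | 1.35 |
| Milk | Private Label (54.12%) | small | 7 | 2.00 | 1.05 | 0.53 | 0.03 | 0.00 | 1.50% | 1.06 |
| Cereal | General Mills (33.48%) | large | 20 | 1.27 | 4.30 | 3.38 | 0.13 | 0.05 | 4.38% | 1.49 |
| Cereal | General Mills (33.48%) | medium | 21 | 0.93 | 3.58 | 3.84 | 0.17 | 0.08 | 8.53% | 1.60 |
| Cereal | General Mills (33.48%) | small | 14 | 0.70 | 3.51 | 4.99 | 0.12 | 0.05 | 4.58% | 1.29 |
| Cereal | Kellogg (32.68%) | large | 23 | 1.25 | 3.82 | 3.07 | 0.10 | 0.05 | 5.54% | 1.32 |
| Cereal | Kellogg (32.68%) | medium | 22 | 0.91 | 3.46 | 3.81 | 0.12 | 0.06 | 5.69% | 1.34 |
| Cereal | Kellogg (32.68%) | small | 23 | 0.68 | 3.27 | 4.79 | 0.10 | 0.05 | 5.86% | 1.35 |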
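Table 2.3: Estimation Results

(a-i) Baseline Preference for Manufacturer

| parameter | $\bar{\kappa}^m_1$ | $\bar{\kappa}^m_2$ | $\bar{\kappa}^c_1$ | $\bar{\kappa}^c_2$ |
|---|---|---|---|---|
| post.mean | -2.426 (0.146) | -1.721 (0.121) | -1.188 (0.103) | -1.212 (0.090) |

(a-ii) Baseline Preference for Pack-size

| parameter | $\bar{\kappa}^m_l$ | $\bar{\kappa}^m_s$ | $\bar{\kappa}^c_l$ | $\bar{\kappa}^c_s$ |
|---|---|---|---|---|
| post.mean | 0.781 (0.142) | -2.600 (0.121) | -0.240 (0.070) | -0.009 (0.070) |

(b) Cross-category Relationship

| parameter | $\bar{\beta}_{mc}$ | $\bar{\beta}_{cm}$ |
|---|---|---|
| post.mean | 0.552 (0.069) | 0.076 (0.046) |

(c-i) Impact of Feature Advertisement

| parameter | $\bar{\delta1}^{m*}_1$ | $\bar{\delta1}^{m*}_2$ | $\bar{\delta1}^{c*}_1$ | $\bar{\delta1}^{c*}_2$ |
|---|---|---|---|---|
| post.mean | -0.512 (0.132) | -1.410 (0.179) | -1.068 (0.154) | -0.665 (0.146) |

(c-ii) Impact of Merchandising Display

| parameter | $\bar{\delta2}^{m*}_1$ | $\bar{\delta2}^{m*}_2$ | $\bar{\delta2}^{c*}_1$ | $\bar{\delta2}^{c*}_2$ |
|---|---|---|---|---|
| post.mean | na | na | -0.106 (0.095) | -0.668 (0.163) |

(d) Inventory Rate

| parameter | $\bar{\lambda}^*_m$ | $\bar{\lambda}^*_c$ |
|---|---|---|
| post.mean | -2.006 (0.246) | 0.090 (0.114) |

(e) Decision Sequence

| parameter | $\bar{\gamma}_{0c}$ | $\bar{\gamma}_{i}$ | $\bar{\gamma}_{f}$ | $\bar{\gamma}_{d}$ |
|---|---|---|---|---|
| post.mean | 2.129 (0.251) | -7.338 (0.742) | 0.118 (0.178) | 0.363 (0.291) |

* Posterior standard deviations are given in parentheses ( ).
* $\delta = \exp(\delta^*)$, $\lambda = \frac{1}{1+\exp(-\lambda^*)}$.
* For milk, subscripts 1 and 2 indicate Guida and Private Label, respectively. For cereal, subscripts 1 and 2 indicate General Mills and Kellogg, respectively.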
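Table 2.4: Impact of Household Size on Parameters

(a-i) Baseline Preference for Manufacturer

| Impact of Household Size | on $\bar{\kappa}^m_1$ | on $\bar{\kappa}^m_2$ | on $\bar{\kappa}^c_1$ | on $\bar{\kappa}^c_2$ |
|---|---|---|---|---|
| post.mean | -0.138 (0.103) | -0.088 (0.090) | 0.058 (0.078) | -0.006 (0.066) |

(a-ii) Baseline Preference for Pack-size

| Impact of Household Size | on $\bar{\kappa}^m_l$ | on $\bar{\kappa}^m_s$ | on $\bar{\kappa}^c_l$ | on $\bar{\kappa}^c_s$ |
|---|---|---|---|---|
| post.mean | 0.560 (0.106) | -0.136 (0.099) | 0.056 (0.052) | -0.038 (0.054) |

(b) Cross-category Relationship

| Impact of Household Size | on $\bar{\beta}_{mc}$ | on $\bar{\beta}_{cm}$ |
|---|---|---|
| post.mean | -0.028 (0.053) | 0.009 (0.033) |

(c-i) Impact of Feature Advertisement

| Impact of Household Size | on $\bar{\delta1}^{m*}_1$ | on $\bar{\delta1}^{m*}_2$ | on $\bar{\delta1}^{c*}_1$ | on $\bar{\delta1}^{c*}_2$ |
|---|---|---|---|---|
| post.mean | 0.084 (0.088) | 0.204 (0.110) | 0.192 (0.089) | 0.106 (0.079) |

(c-ii) Impact of Merchandising Display

| Impact of Household Size | on $\bar{\delta2}^{m*}_1$ | on $\bar{\delta2}^{m*}_2$ | on $\bar{\delta2}^{c*}_1$ | on $\bar{\delta2}^{c*}_2$ |
|---|---|---|---|---|
| post.mean | na | na | -0.005 (0.065) | 0.150 (0.090) |

(d) Inventory Rate

| Impact of Household Size | on $\bar{\lambda}^*_m$ | on $\bar{\lambda}^*_c$ |
|---|---|---|
| post.mean | 0.204 (0.166) | -0.139 (0.081) |

(e) Decision Sequence

| Impact of Household Size | on $\bar{\gamma}_{0c}$ | on $\bar{\gamma}_{i}$ | on $\bar{\gamma}_{f}$ | on $\bar{\gamma}_{d}$ |
|---|---|---|---|---|
| post.mean | -0.319 (0.195) | -0.305 (0.393) | -0.226 (0.130) | 0.148 (0.221) |

* Posterior standard deviations are given in parentheses ( ).
* $\delta = \exp(\delta^*)$, $\lambda = \frac{1}{1+\exp(-\lambda^*)}$.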
Table 2.5: Model-fit Comparison

| Model | In-sample Fit (LMD) | Prediction Fit (LMD) |
|---|---|---|
| Proposed (Asymmetric Complementarity) | -49458.78 | -5208.81 |
| Benchmark 1 (Symmetric Complementarity) | -50010.32 | -5261.48 |
| Benchmark 2 (No Complementarity) | -51035.61 | -5364.50 |
| Benchmark 3 (Fixed Sequence: Milk First) | -50203.07 | -5314.83 |
| Benchmark 4 (Fixed Sequence: Cereal First) | -50435.24 | -5361.54 |

* LMD: Log Marginal Density.
Table 2.6: Estimation Results for Benchmark Models

| Parameter | Benchmark 1 (sym. comp.) | Benchmark 2 (no comp.) | Benchmark 3 (milk first) | Benchmark 4 (cereal first) |
|---|---|---|---|---|
| $\bar{\kappa}^m_1$ | -2.438 (0.145) | -2.270 (0.133) | -2.302 (0.152) | -2.353 (0.137) |
| $\bar{\kappa}^m_2$ | -1.773 (0.118) | -1.554 (0.109) | -1.610 (0.129) | -1.682 (0.117) |
| $\bar{\kappa}^c_1$ | -1.077 (0.106) | -0.801 (0.096) | -1.097 (0.113) | -0.865 (0.098) |
| $\bar{\kappa}^c_2$ | -1.097 (0.098) | -0.821 (0.082) | -1.116 (0.094) | -0.879 (0.083) |
| $\bar{\kappa}^m_l$ | 0.715 (0.143) | 0.769 (0.139) | 0.754 (0.141) | 0.759 (0.138) |
| $\bar{\kappa}^m_s$ | -2.479 (0.120) | -2.690 (0.160) | -2.676 (0.164) | -2.550 (0.118) |
| $\bar{\kappa}^c_l$ | -0.237 (0.069) | -0.235 (0.069) | -0.238 (0.066) | -0.241 (0.066) |
| $\bar{\kappa}^c_s$ | 0.001 (0.068) | -0.002 (0.063) | 0.000 (0.066) | -0.005 (0.065) |
| $\bar{\beta}_{mc}$ | 0.274 (0.047) | 0.000 (na) | 0.285 (0.050) | -0.091 (0.085) |
| $\bar{\beta}_{cm}$ | 0.274 (0.047) | 0.000 (na) | 0.080 (0.049) | 0.091 (0.037) |
| $\bar{\delta1}^{m*}_1$ | -0.495 (0.130) | -0.464 (0.147) | -0.547 (0.158) | -0.478 (0.124) |
| $\bar{\delta1}^{m*}_2$ | -1.376 (0.162) | -1.468 (0.168) | -1.479 (0.183) | -1.522 (0.171) |
| $\bar{\delta1}^{c*}_1$ | -1.085 (0.176) | -1.064 (0.154) | -1.091 (0.184) | -1.171 (0.189) |
| $\bar{\delta1}^{c*}_2$ | -0.628 (0.141) | -0.642 (0.166) | -0.591 (0.133) | -0.623 (0.127) |
| $\bar{\delta2}^{c*}_1$ | -0.104 (0.090) | -0.084 (0.100) | -0.079 (0.091) | -0.077 (0.100) |
| $\bar{\delta2}^{c*}_2$ | -0.650 (0.134) | -0.601 (0.121) | -0.654 (0.134) | -0.634 (0.160) |
| $\bar{\lambda}^*_m$ | -1.870 (0.209) | -1.918 (0.201) | -1.949 (0.237) | -1.827 (0.221) |
| $\bar{\lambda}^*_c$ | 0.076 (0.113) | 0.183 (0.115) | 0.141 (0.114) | 0.194 (0.115) |
| $\bar{\gamma}_{0c}$ | 1.418 (0.253) | na | na | na |
| $\bar{\gamma}_{i}$ | -8.303 (0.950) | na | na | na |
| $\bar{\gamma}_{f}$ | -0.121 (0.215) | na | na | na |
| $\bar{\gamma}_{d}$ | -0.363 (0.267) | na | na | na |

* Posterior standard deviations are given in parentheses ( ).
* $\delta = \exp(\delta^*)$, $\lambda = \frac{1}{1+\exp(-\lambda^*)}$.
Table 2.7: Price Elasticity

% Change in Demand

| Price Reduction | Milk | Cereal | Outside Good |
|---|---|---|---|
| 10% Decrease in Milk Price | 13.26% (13.85%, 11.41%) | 6.78% (5.08%, 0.00%) | -0.34% (-0.30%, -0.01%) |
| 10% Decrease in Cereal Price | 3.39% (5.24%, 0.00%) | 16.89% (15.78%, 13.13%) | -0.34% (-0.37%, -0.06%) |

* % change implies the ratio of increased demand compared with the demand under a reference condition, where neither price is changed.
* Price elasticities based on benchmark models 1 and 2 are provided in ( , ) for comparison.
Table 2.8: Slutsky Decomposition

(a) 10% Decrease in Milk Prices (% Change in Demand)

| Effect | Milk | Cereal | Outside Good |
|---|---|---|---|
| Substitution Effect | 12.71% | 5.93% | -0.48% |
| Income Effect | 0.55% | 0.85% | 0.14% |
| Total Effect | 13.26% | 6.78% | -0.34% |

(b) 10% Decrease in Cereal Prices (% Change in Demand)

| Effect | Milk | Cereal | Outside Good |
|---|---|---|---|
| Substitution Effect | 2.75% | 14.74% | -0.52% |
| Income Effect | 0.64% | 2.14% | 0.18% |
| Total Effect | 3.39% | 16.89% | -0.34% |

* % change implies the ratio of increased demand compared with the demand under a reference condition, where neither price is changed.
Table 2.9: Spillover Effect of Merchandising Activity

% Change in Demand

| Merchandising Activity | Milk | Cereal | Outside Good |
|---|---|---|---|
| Feature Advertisement of Milk | 2.23% | 0.56% | -0.11% |
| Feature Advertisement of Cereal | 0.24% | 1.83% | -0.08% |

* % change of demand is the percentage increase in demand when merchandising is enhanced by 20% as compared to existing merchandising activity.
Chapter 3: Modeling Indivisible Demand
3.1 Introduction
Marketplace demand is inherently indivisible. Demand for products in nearly every product category is constrained to lie on a grid defined by the array of available offerings and package sizes. A consumer who wants to purchase eight eggs cannot do so, nor can the consumer wanting to purchase one and a half cups of milk. The constraints imposed by packaging are also present in categories where consumers want less of an attribute, such as an automotive tire with a 30,000 mile tread life. The effects of indivisibility are most severe when demand is small, leading some consumers to not purchase in the category because the smallest available unit is too large. Demand for higher volume and aggregated purchases is less constrained, as consumers can purchase across multiple package sizes to get close to their desired demand quantity.

Despite the prevalence of indivisible demand data, models of purchase quantity have generally not addressed its effect on parameter estimates, purchase incidence and consumer welfare calculations. Exceptions include Small and Rosen (1981), who investigate the effect of simple discrete choice models on aggregate welfare calculations, and theoretical work establishing the existence of equilibria in the presence of indivisible goods (Danilov et al. (2001), Beviá et al. (1999)). In marketing, Kalyanam and Putler (1997) propose an indivisible alternatives model where each package size is treated as a separate choice alternative, and only one package is purchased at a time. For perfect substitutes, it is also possible to engage in a simple search along each demand axis to identify the point of utility maximization (Allenby et al. (2004), Arora et al. (1998)). However, when demand data exhibit multiple-discreteness (i.e., when multiple variants of a product are simultaneously purchased), accommodating indivisibility is more difficult.

Economic models of demand built on direct and indirect utility specifications rely on first-order conditions to associate model parameters with the observed data. In a direct utility model, the first-order conditions are expressed in terms of the Kuhn-Tucker conditions of constrained maximization (Chintagunta and Nair (2011)). For indirect utility models, Roy's identity is used to derive demand expressions that serve as the basis for parameter estimation. The model of Hanemann (1984) is a special case where a discrete choice model is combined with a conditional demand model whose form comes from the indirect utility function. All three approaches rely on the assumption that the observed purchase quantities reflect the point at which constrained utility is maximized. These models lead to a likelihood specified as a mixture of densities for the inside goods with positive demand and probability masses for the corner solutions where demand is zero (Kim et al. (2002), Song and Chintagunta (2007)). The density contribution to the likelihood utilizes one realization of the error term, ignoring the possibility that other realizations may also lead to the same demand quantities because of indivisibility constraints.

In this paper, we propose a general method of dealing with the indivisibility of demand data. Consumer utility and its associated budget constraint are assumed continuous, and indivisibility enters our model as a constraint in the decision space. Observed demand does not indicate the exact point, but rather the grid point, at which feasible and constrained utility is maximized. Incorporating the constraint of indivisibility is challenging because the evaluation of the data likelihood requires integration of the joint density of error terms in regions dictated by the available offerings. We use the concavity property of utility functions to identify the appropriate region of integration for computing the likelihood, and develop a variant of Bayesian data augmentation (Tanner and Wong (1987)) to simplify parameter estimation.

A simulation study is used to demonstrate the importance of dealing with data indivisibility, and a scanner panel dataset is used to explore practical implications of incorrectly assuming a continuous demand space. The simulation study provides evidence of a systematic bias in parameter estimates. It also shows agreement between the continuity assumption and our model for indivisibility when the grid size is small. We then apply our model to a scanner panel dataset of 6-ounce yogurt purchases. Ignoring indivisibility results in a downward bias in parameter estimates, which in turn distorts policy implications. The continuous approach overestimates own-price elasticities because it underestimates the satiation effect, and it underestimates compensating values because it underestimates baseline utility. The proposed discrete approach allows us to infer that 9.52% of the non-purchases in the data stem from the restriction of indivisible demand, not from a lack of preference.

In the next section we review the relevant literature in marketing and economics to summarize how the issue of data indivisibility has been dealt with. Then, we develop our method for dealing with discrete package sizes, where consumers maximize their utility by choosing a grid point subject to a budgetary allotment.
In section 4, a simulation study is used to illustrate the importance of dealing with demand indivisibility versus assuming it is continuously available. Section 5 extends our method to a case where the sub-utility function of an outside good is non-linear, resulting in an irregular region of integration. Our approach to dealing with indivisible demand is applied to scanner panel data in section 6, where comparison is made to standard analysis. Section 7 discusses implications of our model and analysis, and concluding remarks are offered in section 8.
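The claim that some non-purchases stem from indivisibility rather than a lack of preference can be illustrated with a toy example. The sketch below assumes a sub-utility of the form ψ·ln(x + 1) for one inside good and a linear outside good with unit marginal utility; this functional form, the parameter values, and the function names are hypothetical choices for illustration, not the specification estimated in this chapter.

```python
import math

def total_utility(x: float, psi: float, price: float, budget: float) -> float:
    """Assumed utility: concave sub-utility for the inside good plus a linear outside good."""
    return psi * math.log(x + 1.0) + (budget - price * x)

def continuous_demand(psi: float, price: float, budget: float) -> float:
    """Interior first-order condition psi / (x + 1) = price, clipped to [0, budget / price]."""
    return min(max(psi / price - 1.0, 0.0), budget / price)

def discrete_demand(psi: float, price: float, budget: float) -> int:
    """Best affordable whole number of packages, found by direct comparison on the grid."""
    max_units = int(budget // price)
    return max(range(max_units + 1), key=lambda x: total_utility(x, psi, price, budget))

psi, price, budget = 1.4, 1.0, 10.0
x_cont = continuous_demand(psi, price, budget)   # desired quantity if any amount could be bought
x_disc = discrete_demand(psi, price, budget)     # quantity when only whole packages exist
print(x_cont, x_disc)
```

In this example the consumer would like a fraction of a package, but a whole package is not worth its price, so the observed purchase is zero even though preference for the good is positive.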
3.2 Literature Review
Demand indivisibility, which we also refer to as discreteness, is often addressed by employing a discrete statistical distribution whose domain is consistent with the observed data. Examples include the work of Anderson and Simester (2004), who assume that the number of units ordered by a customer from future catalogs follows a Poisson distribution, and Manchanda et al. (2004), who adopt an NBD distribution to account for the number of new prescriptions written by a physician. Statistical models are frequently assumed for count data and data thought to be generated from a censored continuous model, such as cut-point models in customer satisfaction research (Rossi et al. (2001), Bradlow and Zaslavsky (1999), Gupta (1988)). More recently, copulas have been proposed as a way of describing discrete multivariate data (Danaher and Smith (2011)). While these models are useful for describing discrete demand and consumer responses, they are limited in their ability to relate these data to an underlying process of strategic behavior in which consumers are choosing from among an array of offerings (Chintagunta et al. (2006)).
Structural models for consumer demand, in which consumers are thought to be goal-directed, have not been as successful at dealing with indivisible demand. Consumers are assumed to be utility maximizers subject to budget and possibly other constraints (Satomura et al. (2011)), and first-order conditions are used to associate observed demand with utility parameters. There are two approaches for deriving this association. The first approach involves the direct utility function in which firstorder (i.e., Kuhn-Tucker) conditions are used to associate observed demand with constrained utility maximization (Bhat (2008), Bhat (2005), Kim et al. (2002)). The second approach derives the data likelihood using Roy’s identity to associate observed demand with derivatives of the indirect utility function (Song and Chintagunta (2007), Mehta (2007), Chiang (1991)). Both assume the existence of a continuous demand space. An exception is the work of Dub´e (2004) who employs a univariate grid search procedure for predictions but does not incorporate indivisibility in parameter estimation. Although demand indivisibility has not been explicitly incorporated in both direct and indirect utility models, there has been some literature that recognize the presence of indivisibility and conduct a sensitivity analysis. Nair et al. (2005) check the robustness of their model by comparing the expected conditional quantity under discrete and continuous cases, concluding that the difference between the two cases is very small at an aggregate level. Kim et al. (2002) address the issue of integer constraints by changing the limits of integration in the evaluation of the likelihood for a corner solution, acknowledging the chance of the systematic bias in the parameter estimates. While these sensitivity analyses are insightful, the discreteness constraint is not structurally incorporated into the model specification and estimation. 42
Kuriyama and Hanemann (2006) propose an integer programming approach in which a local search algorithm and a greedy method are utilized to handle the integral nature of recreation behavior data. Although their approach is similar to ours in that the optimality conditions are derived from comparisons with the adjacent grid points, there are two major differences. First, Kuriyama and Hanemann (2006) handle zeros as if they arise from a continuous decision space, while discreteness is taken care of only for positive quantities. This inconsistency in the data-generating mechanism leads to an incomplete likelihood of the data because the probabilities of all possible outcomes do not add up to one. Second, the approach of Kuriyama and Hanemann (2006) relies on an approximate likelihood that is derived from the necessary conditions for optimality, instead of the exact likelihood based on sufficient conditions for optimality. In contrast, we deal with the indivisibility that is present for both zero and positive quantities, and propose an estimation method that relies on the exact likelihood derived from the sufficient conditions for optimality.
3.3 Model Development
3.3.1 Direct Utility and Constraints
We assume that a consumer considers n goods in his/her purchase decision together with an outside good x0 and has a concave utility function for good i defined over a continuous non-negative domain (i.e., marginal utility is continuous, positive and decreasing in x). We assume linear utility for the outside good, implying the value of the outside good does not satiate. The linearity assumption is commonly made in models of consumer choice, and simplifies the integration of the likelihood as explained below. Assuming non-linear utility for the outside good complicates the
integration, which we address in section 5. We also initially assume that utility from the n goods is additively separable, so that:

\[ U(x_{1t}, \cdots, x_{nt}, x_{0t}) = \sum_{i=1}^{n} u_i(x_{it}) + \alpha_0 x_{0t} \tag{3.1} \]
Consumers maximize utility subject to three sets of constraints: (i) non-negativity constraints; (ii) budgetary constraints; and (iii) discreteness imposed by available package sizes. The first and most obvious constraint is non-negativity of purchase quantity:

\[ x_{it} \ge 0, \quad \forall i \in \{1, \cdots, n, 0\} \tag{3.2} \]
The second constraint comes from a budgetary allotment (\(M\)), which implies the maximum amount of dollar spending for the inside goods in a given shopping trip. This constraint is typically of a linear form, assuming that a unit price (\(p_{it}\)) is constant over various purchase quantities (\(x_{it}\)):

\[ \sum_{i=1}^{n} p_{it} x_{it} + x_{0t} \le M \tag{3.3} \]
The third constraint reflects the discrete nature of offerings in a marketplace, implying that a consumer has to purchase an integer number of units:

\[ x_{it} \in \{0, 1, 2, \cdots\}, \quad \forall i \in \{1, \cdots, n\} \tag{3.4} \]
The discreteness constraint is not applied to the outside good because its units are unspecified. The consumer’s choice decision is therefore formulated as:

\[
\begin{aligned}
\max_{x}\; & U(x_{1t}, \cdots, x_{nt}, x_{0t}) = \sum_{i=1}^{n} u_i(x_{it}) + \alpha_0 x_{0t} \\
\text{s.t.}\; & \sum_{i=1}^{n} p_{it} x_{it} + x_{0t} \le M \\
& x_{it} \in \{0, 1, 2, \cdots\}, \quad \forall i \in \{1, \cdots, n\} \\
& x_{0t} \ge 0
\end{aligned}
\tag{3.5}
\]
3.3.2 A Discrete Likelihood for Indivisible Data
Although our methodology is general enough to be applied to any utility function, we begin with a simple form[4] for the purpose of exposition, and consider the general case later.

\[ u_i(x_{it}) = \frac{\alpha_i e^{\varepsilon_{it}}}{\gamma_i} \log(\gamma_i s_i x_{it} + 1) \tag{3.6} \]
where \(s_i\) indicates an observed unit volume for good i. \(\alpha_i\) represents the baseline utility (i.e., the marginal utility at the point of zero purchase) and \(\gamma_i\) allows for flexibility in the degree of satiation (i.e., a higher \(\gamma_i\) implies a greater rate of satiation for good i) (see Bhat (2008)). A consumer’s decision problem in (5) can be equivalently re-expressed by substituting the binding budget constraint[5] for \(x_{0t}\) in the utility specification as follows:

\[
\begin{aligned}
\max_{x_t}\; & U^*(x_{1t}, \cdots, x_{nt}) = \sum_{i=1}^{n} \frac{\alpha_i e^{\varepsilon_{it}}}{\gamma_i} \log(\gamma_i s_i x_{it} + 1) + \alpha_0 \left( M - \sum_{i=1}^{n} p_{it} x_{it} \right) \\
\text{s.t.}\; & x_{it} \in \{0, 1, 2, \cdots\}, \quad \forall i \in \{1, \cdots, n\} \\
& M - \sum_{i=1}^{n} p_{it} x_{it} \ge 0
\end{aligned}
\tag{3.7}
\]
While \(U\) is an unconstrained utility function of inside goods and an outside good, \(U^*\) is a constrained utility function of inside goods only, satisfying the budget constraint in (3). \(U^*\) is a concave function and its derivative with respect to \(x_{it}\) is given by:

\[ \frac{\partial U^*}{\partial x_{it}} = \frac{\alpha_i e^{\varepsilon_{it}} s_i}{\gamma_i s_i x_{it} + 1} - \alpha_0 p_{it} \tag{3.8} \]
[4] Bhat (2008) indicates that the sub-utility specification in (6) corresponds to a special case of a generalized variant of the translated CES utility function, \(U(x) = \sum_{k=1}^{K} \frac{\gamma_k}{\alpha_k} \psi_k \left\{ \left( \frac{x_k}{\gamma_k} + 1 \right)^{\alpha_k} - 1 \right\}\), when \(\alpha_k \to 0\). He also notes that the flexibility of the utility function is not sacrificed much in this limiting case because “it is possible to closely approximate a sub-utility function profile based on a combination of \(\gamma_k\) and \(\alpha_k\) values with a sub-utility profile solely based on \(\gamma_k\) or \(\alpha_k\)”.
[5] At an optimal condition, the budget constraint in (3) is always binding because the marginal utility of an outside good is greater than 0.
Note that the derivative of \(U^*\) with respect to \(x_{it}\) is a function of \(x_{it}\) only. This is due to the assumption of an additively separable utility function with linear sub-utility for the outside good, and implies that we can consider one good at a time to specify the sufficient conditions for an observed purchase quantity \((x^*_{1t}, \cdots, x^*_{nt})\) to be optimal:

\[ U^*(x^*_{it}, x^*_{-it}) > \max\left\{ U^*(x^*_{it} - 1, x^*_{-it}),\; U^*(x^*_{it} + 1, x^*_{-it}) \right\}, \quad \forall i \in \{1, \cdots, n\} \tag{3.9} \]
where \(x^*_{-it}\) indicates a vector of purchase quantities of all the other inside goods. Equation (9) is true because utility functions are required to be concave in each element of the quantity vector \(x_t\). The concavity property is also present when a linear budget constraint is substituted for the outside good to obtain \(U^*\), and in the presence of a restricted solution space because of packaging constraints. Thus, \(U^*\) has a unique maximum value in the neighborhood of observed demand, taking its greatest realized value on the grid at the point of the observed data \(x^*_t\). The concavity of the utility function allows us to express optimality conditions in terms of relationships to neighboring grid points instead of all feasible points. Substituting the utility expression in (6) into (9) yields bounds on error realizations that are consistent with observed demand \((x^*_{1t}, \cdots, x^*_{nt})\) being utility maximizing subject to budget and packaging, or indivisibility, constraints:

\[ lb_{it} < \varepsilon_{it} < ub_{it}, \quad \forall i \in \{1, \cdots, n\} \tag{3.10} \]

where

\[ lb_{it} = \log\left( \frac{\alpha_0 p_{it} \gamma_i}{\alpha_i} \right) - \log\left( \log\left( \frac{\gamma_i s_i x^*_{it} + 1}{\gamma_i s_i (x^*_{it} - 1) + 1} \right) \right) \]

\[ ub_{it} = \log\left( \frac{\alpha_0 p_{it} \gamma_i}{\alpha_i} \right) - \log\left( \log\left( \frac{\gamma_i s_i (x^*_{it} + 1) + 1}{\gamma_i s_i x^*_{it} + 1} \right) \right) \]

When \(x^*_{it} = 0\), the comparison with \(x^*_{it} - 1\) is infeasible and \(\varepsilon_{it}\) is bounded only from above (i.e., \(lb_{it} = -\infty\)). When one more unit of good i is not affordable (i.e., \(p_{it}(x^*_{it} + 1) + \sum_{j=1, j \ne i}^{n} p_{jt} x^*_{jt} > M\)), \(\varepsilon_{it}\) is bounded only from below (i.e., \(ub_{it} = \infty\)). If we assume the errors are independent and follow an identical probability distribution \(f\), the likelihood can be computed by integrating the joint density of all the errors over the region specified in (10).
\[ L^{(d)}(x^*_{1t}, \cdots, x^*_{nt}) = \prod_{i=1}^{n} \int_{lb_{it}}^{ub_{it}} f(\varepsilon_{it}) \, d\varepsilon_{it} \tag{3.11} \]
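As a worked step, the lower bound in (10) can be recovered from (9) under the sub-utility in (6) and the linear outside good. The sketch below only rearranges the comparison between \(x^*_{it}\) and \(x^*_{it} - 1\), so it applies when \(x^*_{it} \ge 1\):

```latex
\begin{aligned}
U^*(x^*_{it}, x^*_{-it}) - U^*(x^*_{it} - 1, x^*_{-it})
  &= \frac{\alpha_i e^{\varepsilon_{it}}}{\gamma_i}
     \log\frac{\gamma_i s_i x^*_{it} + 1}{\gamma_i s_i (x^*_{it} - 1) + 1}
     - \alpha_0 p_{it} > 0 \\
\Longleftrightarrow\quad
e^{\varepsilon_{it}}
  &> \frac{\alpha_0 p_{it} \gamma_i}{\alpha_i}
     \left[ \log\frac{\gamma_i s_i x^*_{it} + 1}{\gamma_i s_i (x^*_{it} - 1) + 1} \right]^{-1} \\
\Longleftrightarrow\quad
\varepsilon_{it}
  &> \log\frac{\alpha_0 p_{it} \gamma_i}{\alpha_i}
     - \log\log\frac{\gamma_i s_i x^*_{it} + 1}{\gamma_i s_i (x^*_{it} - 1) + 1}
   = lb_{it}
\end{aligned}
```

The second line divides by the log ratio, which is positive because \(\gamma_i s_i > 0\). The upper bound follows in the same way from the comparison with \(x^*_{it} + 1\), with the inequality reversed.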
The presence of non-linear utility for the outside good, or a utility function involving multiplicative factors reflecting complementary goods, gives rise to a well-defined but more complicated set of restrictions. The intersection of the set of these restrictions forms the region of integration in (11).
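The bounds in (10) and the likelihood in (11) can be illustrated numerically. The following is a minimal sketch, not the estimation code used in this chapter: it assumes standard normal errors, assumes the budget is slack (so the \(ub_{it} = \infty\) case never arises), and the function names `error_bounds` and `discrete_likelihood` are hypothetical.

```python
import math
from statistics import NormalDist


def error_bounds(x, price, s, alpha, gamma, alpha0=1.0):
    """Return (lb, ub) on the error for an observed integer quantity x, as in (3.10)."""
    base = math.log(alpha0 * price * gamma / alpha)

    def log_gain(q):
        # log of the log-ratio term for moving from q units to q + 1 units
        return math.log(math.log((gamma * s * (q + 1) + 1) / (gamma * s * q + 1)))

    # At x = 0 the comparison with x - 1 is infeasible, so there is no lower bound.
    lb = -math.inf if x == 0 else base - log_gain(x - 1)
    ub = base - log_gain(x)
    return lb, ub


def discrete_likelihood(quantities, prices, sizes, alphas, gammas, alpha0=1.0):
    """Product over goods of the normal probability mass between lb and ub, as in (3.11)."""
    cdf = NormalDist().cdf  # assumption: f is the standard normal density
    likelihood = 1.0
    for x, p, s, a, g in zip(quantities, prices, sizes, alphas, gammas):
        lb, ub = error_bounds(x, p, s, a, g, alpha0)
        lower_mass = 0.0 if lb == -math.inf else cdf(lb)
        likelihood *= cdf(ub) - lower_mass
    return likelihood


# Illustrative (hypothetical) values: two goods, pack size 6, price 2 per pack.
print(discrete_likelihood([1, 0], [2.0, 2.0], [6.0, 6.0], [1.0, 0.5], [1.0, 0.7]))
```

Because the upper bound for quantity \(x\) equals the lower bound for quantity \(x + 1\), the probabilities of all quantities for a single good add up to one under these assumptions, which is the completeness property discussed in section 3.2.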
3.3.3 Comparison to a Continuous Likelihood
A continuous likelihood is less flexible than a discrete likelihood because it assumes that interior points exactly satisfy the Kuhn-Tucker conditions. Assuming the outside good is positive, the likelihood for an observed purchase quantity vector \((x^*_{1t}, \cdots, x^*_{nt})\) is given by:

\[ L^{(c)}(x^*_{1t}, \cdots, x^*_{nt}) = \prod_{i=1}^{n} \left\{ \ell(x^*_{it}) \right\} \tag{3.12} \]

\[
\ell(x^*_{it}) =
\begin{cases}
f(\nu_{it}) \times \left| J_{\varepsilon_{it} \to x_{it}} \right|, & \text{if } x^*_{it} > 0 \\
\int_{-\infty}^{\nu_{it}} f(\varepsilon_{it}) \, d\varepsilon_{it}, & \text{if } x^*_{it} = 0
\end{cases}
\]

where \(\nu_{it} = \log\left( \frac{\alpha_0 p_{it} (\gamma_i s_i x^*_{it} + 1)}{\alpha_i s_i} \right)\) and \(J_{\varepsilon_{it} \to x_{it}} = \frac{\gamma_i s_i}{\gamma_i s_i x^*_{it} + 1}\).

There are two important differences in the likelihood (12) relative to the discrete likelihood in (10) and (11). First, the discrete likelihood produces a greater probability mass for a zero (i.e., a corner) than that of the continuous likelihood given the
same set of parameters. When \(x^*_{it} = 0\), the upper boundary for \(\varepsilon_{it}\) in the discrete likelihood is equal to \(\log\frac{\alpha_0 p_{it}}{\alpha_i} - \log\frac{\log(\gamma_i s_i + 1)}{\gamma_i}\), which is greater than that for the continuous likelihood, \(\log\frac{\alpha_0 p_{it}}{\alpha_i} - \log s_i\), because \(\log(\gamma_i s_i + 1) < \gamma_i s_i\). In other words, the continuous likelihood requires smaller baseline parameters (i.e., \(\alpha_i\)) to produce the same corner probabilities. Baseline parameters will therefore suffer from a downward bias when zeros in the data are generated from the discrete data-generating mechanism but parameters are estimated with the continuous likelihood in (12). This is true for any concave utility function and the proof is provided in Appendix A. Intuitively, the difference in the estimates of the baseline parameters reflects the different interpretations of non-purchase (i.e., zero) from the two approaches. A model with a continuous decision space assumes that a consumer doesn’t purchase a product because she doesn’t like it. In contrast, when indivisibility restrictions are present, the interpretation of non-purchase is that a consumer might not like it enough to buy one unit. The second difference is that model fit statistics are not strictly comparable across the two likelihoods because (12) associates one error realization with each interior point while the discrete likelihood in (10) and (11) associates many error realizations. An adjustment is required to make the likelihoods from the two approaches comparable. We compute the discrete likelihood based on the parameter values that are estimated from the continuous approach and compare it with the likelihood of the discrete approach. Because the discreteness constraints are present in the true data-generating mechanism, the continuous likelihood should be viewed as the likelihood of the data under the incorrect model, while the discrete likelihood implies the true joint probability of the observed data. We will refer to the statistics resulting from this procedure as the “true likelihood” below.
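The gap between the two corner boundaries can be checked numerically. The snippet below is a small sketch under assumed values (standard normal errors, \(\alpha_0 = 1\), and hypothetical parameters \(\alpha_i = 1\), \(\gamma_i = 1\), \(s_i = 6\), \(p_{it} = 2\)); `corner_bounds` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def corner_bounds(price, s, alpha, gamma, alpha0=1.0):
    """Upper error bounds at x = 0 under the discrete and the continuous likelihood."""
    base = math.log(alpha0 * price / alpha)
    discrete_ub = base - math.log(math.log(gamma * s + 1) / gamma)
    continuous_ub = base - math.log(s)
    return discrete_ub, continuous_ub


cdf = NormalDist().cdf  # assumption: errors are standard normal
d_ub, c_ub = corner_bounds(price=2.0, s=6.0, alpha=1.0, gamma=1.0)
p_discrete, p_continuous = cdf(d_ub), cdf(c_ub)
print(f"discrete bound {d_ub:.4f}, corner probability {p_discrete:.4f}")
print(f"continuous bound {c_ub:.4f}, corner probability {p_continuous:.4f}")
```

With the same parameters, the discrete bound is larger, so the discrete likelihood assigns more probability to a zero. A continuous model fitted to the same zeros would have to lower \(\alpha_i\) to compensate, which is the downward bias described above.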
3.3.4 General Case
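A consumer’s decision problem in (7) can be rewritten for any concave sub-utility function, \(u_i(x_{it} | \varepsilon_{it})\), as follows:

\[
\begin{aligned}
\max_{x_t}\; & U^*(x_{1t}, \cdots, x_{nt}) = \sum_{i=1}^{n} u_i(x_{it} | \varepsilon_{it}) + \alpha_0 \left( M - \sum_{i=1}^{n} p_{it} x_{it} \right) \\
\text{s.t.}\; & x_{it} \in \{0, 1, 2, \cdots\}, \quad \forall i \in \{1, \cdots, n\} \\
& M - \sum_{i=1}^{n} p_{it} x_{it} \ge 0
\end{aligned}
\tag{3.13}
\]

Equation (9) should be satisfied for an observed quantity to be optimal, which defines the region in the error space that rationalizes the goal-directed behavior of a consumer:

\[ lb_{it} < \varepsilon_{it} < ub_{it}, \quad \forall i \in \{1, \cdots, n\} \tag{3.14} \]

where the bounds are obtained by substituting \(u_i(x_{it} | \varepsilon_{it})\) into (9). When \(p_{it}(x^*_{it} + 1) + \sum_{j=1, j \ne i}^{n} p_{jt} x^*_{jt} > M\), \(ub_{it} = \infty\). The model likelihood is computed by
integrating the joint density of the errors over the specified region in (14).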
3.4 Simulation Study
A simulation study is used to investigate the effect of employing a continuous likelihood when demand data are constrained to lie on a discrete grid. Data are generated by searching across the grid for the point that maximizes utility among a feasible set defined by a budgetary allotment and unit prices. In the simulation study,
we assume the following decision problem:

\[
\begin{aligned}
\max_{x_t}\; & U(x_{1t}, x_{2t}, x_{0t}) = \frac{1.0 e^{\varepsilon_{1t}}}{1.0} \log(1.0 s_1 x_{1t} + 1) + \frac{0.5 e^{\varepsilon_{2t}}}{0.7} \log(0.7 s_2 x_{2t} + 1) + x_{0t} \\
\text{s.t.}\; & p_{1t} x_{1t} + p_{2t} x_{2t} + x_{0t} \le 50 \\
& x_{it} \in \{0, 1, 2, \cdots\}, \quad \forall i \in \{1, 2\} \\
& x_{0t} \ge 0
\end{aligned}
\tag{3.15}
\]

where the pack sizes are set to 6 (\(s_1 = s_2 = 6\)), \(\varepsilon_{it} \sim \text{Normal}(0, 1)\), and \(p_{it} \sim \text{Uniform}(1, 3)\), \(\forall i \in \{1, 2\}\). In the simulation, we assume that the error term \(\varepsilon_{it}\) is known to the decision maker but not to the analyst. We allow for consumer heterogeneity by employing a random-effect specification for 100 consumers, and 100 observations are simulated for each respondent by searching over all the possible grid points under the constraint and finding the utility maximizer. Once simulated, the same dataset is used to calibrate five different models: one in which the true package size (=6) is assumed, one in which a continuous decision space is assumed, and the others in which smaller pack sizes (i.e., 3, 2, 1) are imposed. Table 1 summarizes the simulated data and the estimation results based on the five different models.

[Table 1]

The top portion of the table reports the average number of interior and corner solutions across the 100 observations, and the bottom portion of the table reports the estimated means of the random-effect distributions, along with the true values. Parameter estimates are based on the mean of the posterior distribution using a diffuse but proper Normal prior with mean zero and variance 100. The lower right side of the table reports the log-likelihood of the data based on the different assumptions
of the unit grid, and also the true log-likelihood of the observed data. Parameter estimates from each model are used to compute the joint probability of the observed data under the assumption of the true grid size (i.e., \(s_1 = s_2 = 6\)), which we name the “true log-likelihood”. The results of this simulation study show the following. First, parameter estimates are recovered to within the precision indicated by the posterior distribution only for the unit grid equal to six, which is the value used to generate the data. When the unit grid is assumed to be less than six, parameter estimates exhibit a downward bias that becomes increasingly severe as the unit grid diminishes. Second, parameter estimates for the continuous likelihood agree with the discrete estimates when the unit grid is small. This is expected because the continuous likelihood can be viewed as the mathematical limit of the discrete likelihood as the unit grid approaches zero. Third, the true log-likelihood statistic also converges to that of the continuous model, and, fourth, this measure of model fit correctly indicates that the true data-generating process has a unit grid of 6.0. These results indicate that data discreteness is an important problem, and that the proposed likelihood offers a viable solution.
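The data-generating step of the simulation can be sketched as an exhaustive grid search over the feasible bundles in (3.15). The code below is an illustration only: it uses the homogeneous true values in (3.15) rather than the random-effect specification of the study, and the function names (`utility`, `best_bundle`, `simulate`) are hypothetical.

```python
import math
import random


def utility(x1, x2, e1, e2, p1, p2, budget=50.0, s=6.0):
    """Utility in (3.15), with the outside good set by the binding budget constraint."""
    x0 = budget - p1 * x1 - p2 * x2
    u1 = 1.0 * math.exp(e1) / 1.0 * math.log(1.0 * s * x1 + 1)
    u2 = 0.5 * math.exp(e2) / 0.7 * math.log(0.7 * s * x2 + 1)
    return u1 + u2 + x0


def best_bundle(e1, e2, p1, p2, budget=50.0, s=6.0):
    """Search every affordable integer bundle and return the utility maximizer."""
    best, best_u = (0, 0), -math.inf
    for x1 in range(int(budget // p1) + 1):
        remaining = budget - p1 * x1
        for x2 in range(int(remaining // p2) + 1):
            u = utility(x1, x2, e1, e2, p1, p2, budget, s)
            if u > best_u:
                best, best_u = (x1, x2), u
    return best


def simulate(n_obs, seed=0):
    """Draw errors and prices as in the text and record the optimal bundle per trip."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_obs):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        p1, p2 = rng.uniform(1.0, 3.0), rng.uniform(1.0, 3.0)
        data.append((best_bundle(e1, e2, p1, p2), (p1, p2)))
    return data


for bundle, prices in simulate(5):
    print(bundle, tuple(round(p, 2) for p in prices))
```

Exhaustive search is affordable here because the budget of 50 and prices of at least 1 cap each quantity at 50 units. It also avoids relying on the neighbor conditions in (9) when generating data, so the generated data can serve as an independent check on them.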
3.5 Irregular Regions of Integration
The assumption of additive separability for the inside goods and linear utility for the outside good simplifies the evaluation of the likelihood in two ways: (i) it reduces the number of comparisons to 2n (see equation (9)); and (ii) each comparison provides a separate inequality for each error term. This results in a rectangular region of integration that is easy to calculate. However, this computational benefit comes with the cost of a potentially simplistic utility function and the budgetary allotment
playing no role in the likelihood expression. That is, the optimal quantity for good i is not influenced by the price of good j, so long as the total expenditure does not exceed the budget (see Chandukala et al. (2007) section 3.3 for more discussion). In this section, we provide a solution for the more general case where the region of integration is irregular.
3.5.1 Sufficient Conditions for Optimality
Suppose that utility for the outside good is a nonlinear function of \(x_{0t}\) (i.e., \(u_0(x_{0t}) = \alpha_0 \log(x_{0t})\)). The budget constraint can be used to substitute for the outside good, yielding a utility function written in terms of the inside goods:

\[ U^*(x_{1t}, \cdots, x_{nt}) = \sum_{i=1}^{n} \frac{\alpha_i e^{\varepsilon_{it}}}{\gamma_i} \log(\gamma_i s_i x_{it} + 1) + \alpha_0 \log\left( M - \sum_{i=1}^{n} p_{it} x_{it} \right) \tag{3.16} \]

\(U^*\) is a concave function and its derivative with respect to \(x_{it}\) is now a function of \((x_{1t}, \cdots, x_{nt})\):

\[ \frac{\partial U^*}{\partial x_{it}} = \frac{\alpha_i e^{\varepsilon_{it}} s_i}{\gamma_i s_i x_{it} + 1} - \frac{\alpha_0 p_{it}}{M - \sum_{j=1}^{n} p_{jt} x_{jt}} \tag{3.17} \]
Because the optimal quantity of good i (\(x^*_{it}\)) is influenced by the optimal quantities of the other goods, we need to compare the utility of the observed purchase quantity with the utility of all adjacent points. Therefore, the sufficient conditions for an observed purchase quantity \((x^*_{1t}, \cdots, x^*_{nt})\) to be optimal are composed of \(3^n\) inequalities.

\[ U^*(x^*_{1t}, \cdots, x^*_{nt}) \ge \max_{\Delta_i \in \{-1, 0, 1\}} \left\{ U^*(x^*_{1t} + \Delta_1, \cdots, x^*_{nt} + \Delta_n) \right\} \tag{3.18} \]
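The comparison in (18) can be sketched by enumerating every combination of \(\Delta_i \in \{-1, 0, 1\}\). This is an illustrative check for a given error vector, not the likelihood computation itself; the function names and the parameter values in the usage lines are hypothetical, and infeasible neighbors (negative quantities or a non-positive outside good) are assigned a utility of minus infinity.

```python
import math
from itertools import product


def u_star(x, eps, prices, sizes, alphas, gammas, alpha0, budget):
    """Utility in (3.16); infeasible points get -inf so they never beat a feasible one."""
    spend = sum(p * q for p, q in zip(prices, x))
    if any(q < 0 for q in x) or spend >= budget:
        return -math.inf
    inside = sum(
        a * math.exp(e) / g * math.log(g * s * q + 1)
        for q, e, s, a, g in zip(x, eps, sizes, alphas, gammas)
    )
    return inside + alpha0 * math.log(budget - spend)


def neighbors(x):
    """All 3^n grid points reachable by moving each quantity by -1, 0, or +1."""
    return [tuple(q + d for q, d in zip(x, delta))
            for delta in product((-1, 0, 1), repeat=len(x))]


def is_locally_optimal(x_obs, eps, **params):
    """Check the inequalities in (3.18) for the observed bundle x_obs."""
    u_obs = u_star(x_obs, eps, **params)
    return all(u_star(nb, eps, **params) <= u_obs for nb in neighbors(x_obs))


params = dict(prices=(2.0, 2.0), sizes=(6.0, 6.0), alphas=(1.0, 0.5),
              gammas=(1.0, 0.7), alpha0=1.0, budget=50.0)
print(len(neighbors((1, 1))), is_locally_optimal((0, 0), (0.0, 0.0), **params))
```

The number of comparisons grows as \(3^n\), and each one mixes several error terms through the logarithm of the remaining budget, which is why the region of integration is no longer rectangular.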
The likelihood computation becomes complicated not only because the number of inequalities increases but also because each inequality contains multiple error terms
that are nonlinearly associated (see Lee et al. (2012) for an application to complementary goods). We develop a variant of Bayesian data augmentation (Tanner and Wong (1987)), termed “error” augmentation (Zeithammer and Lenk (2006)) to avoid the complex integration for computing the likelihood.
3.5.2 Estimation by Bayesian Error Augmentation
Bayesian error augmentation works by treating the error realization as the augmented variable, instead of a variable such as y in a regression model. Probit models are frequently estimated with data augmentation by generating the latent values of utility (see Rossi et al. (2005)). Given the utility values, model parameter estimation involves analysis of a standard linear regression model. In our model, we treat the following variables as augmented:

\[ z_{it} = \alpha_i e^{\varepsilon_{it}}, \quad \forall i \in \{1, \cdots, n\}, \; \forall t \in \{1, \cdots, T\} \tag{3.19} \]

or \(z^*_{it} = \alpha^*_i + \varepsilon_{it}\), where \(z^*_{it} = \log(z_{it})\) and \(\alpha^*_i = \log(\alpha_i)\). We note that conditional on \(\alpha_i\), the augmented variables \(z_{it}\) and \(\varepsilon_{it}\) contain the same information and are deterministically related. Given either \(\{z_{jt}\}_{j \ne i}\) or \(\{\varepsilon_{jt}\}_{j \ne i}\), the inequalities in (18) reduce to a one-dimensional region defined by the intersection of all the constraints containing \(z_{it}\) or \(\varepsilon_{it}\). Each realization of \(\varepsilon_{it}\) either does or does not identify the observed demand vector \(x^*\) as the point of constrained utility maximization. The likelihood in error augmentation is therefore either equal to one or zero, and draws of \(\varepsilon_{it}\) can be generated as a truncated draw from its prior (i.e., normal) distribution. Values of \(z^*_{it}\) are obtained by adding \(\alpha^*_i\).
Then, conditioning on the values of {zit∗ }t=1,··· ,T just drawn, a simple regression model can be used to generate draws of αi∗ . The draw of αi∗ is unconstrained because the truncation is induced by the optimality conditions in (18), and given the {zit∗ }t=1,··· ,T no further truncation is needed to estimate αi∗ . The reason is the same as in a Bayesian estimation of a probit model using data augmentation – latent utilities are truncated by likelihood, but once they are obtained a standard regression model can be used to obtain estimates of the model parameters. Details of the estimation procedure are provided in Appendix B. A simulation study is used to verify the procedure in Appendix C.
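The two-step logic can be sketched for the simplest case. The code below is not the procedure in Appendix B: it assumes a single good, a linear outside good (so the truncation region is the interval in (10)), a known \(\gamma\), standard normal errors, \(\alpha_0 = 1\), and a Normal(0, 100) prior on \(\alpha^*\). All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def z_bounds(x, price, s, gamma, alpha0=1.0):
    """Bounds on z* = alpha* + eps implied by (3.10); they do not depend on alpha*."""
    base = math.log(alpha0 * price * gamma)

    def log_gain(q):
        return math.log(math.log((gamma * s * (q + 1) + 1) / (gamma * s * q + 1)))

    lo = -math.inf if x == 0 else base - log_gain(x - 1)
    return lo, base - log_gain(x)


def draw_truncated_normal(mean, lo, hi, rng):
    """Inverse-CDF draw from Normal(mean, 1) truncated to the interval (lo, hi)."""
    p_lo = STD_NORMAL.cdf(lo - mean) if lo > -math.inf else 0.0
    p_hi = STD_NORMAL.cdf(hi - mean) if hi < math.inf else 1.0
    u = min(max(rng.uniform(p_lo, p_hi), 1e-12), 1.0 - 1e-12)
    draw = mean + STD_NORMAL.inv_cdf(u)
    return min(max(draw, lo), hi)  # guard against rounding in the far tails


def gibbs(quantities, prices, s, gamma, n_iter=1000, prior_var=100.0, seed=0):
    """Alternate between truncated draws of z* and an unconstrained draw of alpha*."""
    rng = random.Random(seed)
    bounds = [z_bounds(x, p, s, gamma) for x, p in zip(quantities, prices)]
    alpha_star, draws = 0.0, []
    for _ in range(n_iter):
        z = [draw_truncated_normal(alpha_star, lo, hi, rng) for lo, hi in bounds]
        post_var = 1.0 / (len(z) + 1.0 / prior_var)  # conjugate normal update
        post_mean = post_var * sum(z)
        alpha_star = rng.gauss(post_mean, math.sqrt(post_var))
        draws.append(alpha_star)
    return draws
```

Working with \(z^*_{it}\) rather than \(\varepsilon_{it}\) keeps the truncation bounds fixed across iterations in this simplified setting, so the draw of \(\alpha^*\) is an ordinary conjugate update. With a nonlinear outside good, the bounds for good i would instead have to be recomputed from (18) given the current values for the other goods.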
3.6 Empirical Analysis
3.6.1 Data
We applied our methodology to the IRI household panel dataset described in Bronnenberg et al. (2008). Six popular flavors[6] of 6-ounce Yoplait yogurt are chosen to illustrate our methodology. 109 households are included in our dataset, each with more than 5 shopping trips on which at least one of the six items is purchased. The data contain trips on which multiple varieties are purchased, trips on which a single variety is purchased, and trips on which none of the inside goods is purchased. Conditional on the purchase occasion, an average of 3.58 units of 6-ounce yogurt are purchased in a single shopping trip, which comprises an average of 1.81 different flavors purchased. Approximately 92% of the observed data are zeros, indicating great potential for obtaining biased estimates from assuming a continuous likelihood. In our analysis, we use 90% of the data for the model calibration, leaving 10% for prediction.
[6] Blueberry, Strawberry, Vanilla, Raspberry, Key Lime, Peach
[Table 2] An outside good is introduced in the model specification in order to explain the variations in the total dollar spending for the inside goods. Here, the outside good is defined as the remaining dollars that are saved rather than spent on the inside goods. The sub-utility of the outside good is assumed to be linear, which implies that the value of saved money does not satiate. The linear sub-utility specification for the outside good does not allow us to infer the budgetary allotment, but it enables us to compute the exact likelihood, which decreases the computational burden of estimation. We also estimated the model with a nonlinear sub-utility for the outside good, and we find consistent results with those presented here.
3.6.2 Estimates
The model parameters are estimated both by our discrete likelihood and by the standard continuous likelihood. Heterogeneity across households is incorporated by a random-effect specification for household parameters:

\[ \theta_h = \{\alpha^*_{1h}, \alpha^*_{2h}, \alpha^*_{3h}, \alpha^*_{4h}, \alpha^*_{5h}, \alpha^*_{6h}, \gamma^*_h\} \sim \text{Normal}(\bar{\theta}, V_\theta) \tag{3.20} \]

where \(\alpha^*_{ih} = \log(\alpha_{ih})\) and \(\gamma^*_h = \log(\gamma_h)\)[7]. We use a Bayesian MCMC method for estimation with a conjugate but relatively diffuse prior distribution for the hyperparameters[8]. 100,000 iterations of the chain were used to generate parameter estimates, with the first 30,000 draws discarded as burn-in. Table 3 displays the estimation results, comparing the two methods.
[7] \(\gamma_h\) can be alternative-specific, as it is in our simulation study. However, the \(\gamma_{ih}\)’s were not differentially estimated with our empirical dataset.
[8] \(\bar{\theta} | V_\theta \sim N(0, V_\theta \otimes 100I)\), \(V_\theta \sim IW(10, 10I)\)
[Table 3]

The discrete likelihood outperforms the continuous likelihood in both in-sample and predictive fit to the data. Although the parameter estimates from the two likelihoods show an ordinal consistency, the parameters are underestimated with the continuous likelihood due to the proliferation of zeros in the data. As we vary the unit grid of the discrete likelihood from the true package size (i.e., 6-ounce) to smaller sizes (i.e., 3, 2, and 1-ounce), the parameter estimates gradually decrease, converging toward those of the continuous likelihood. The estimates of the continuous likelihood are still quite different from those of the discrete likelihood in which we assume the smallest available package size (i.e., one). This illustrates that assuming continuously available demand can be costly when zeros are common in the data.

[Figure 1]

Figure 1 depicts the estimated sub-utility for the blueberry-flavor yogurt from the discrete approach and compares it with the one estimated from the continuous likelihood. Comparison between the two sub-utilities shows two patterns of bias that stem from ignoring data indivisibility. First, the baseline utility is underestimated in the continuous approach to account for the zeros in the data, some of which are due to indivisible demand (i.e., don’t like it enough), and others due to lack of preference (i.e., don’t like it, period). The slope of the sub-utility at zero quantity is steeper in the discrete approach, implying that an observation of non-purchase in the data represents a lower level of preference in the continuous approach. This is consistent with our theoretical prediction discussed in section 3.3. Second, the degree of satiation is underestimated when the discreteness of the data is not accounted for. This can be
viewed as a by-product of the bias of the baseline utility. As the continuous approach underestimates the baseline parameter (i.e., αi ) to rationalize zeros, a smaller γi is required to increase the log-likelihood for the positive quantities (see equation (12)). In other words, because the continuous approach infers a lower level of preference from non-purchase, it rationalizes the positive quantities by attributing them to a lower degree of satiation.
3.7 Discussion
3.7.1 Decomposition of Corner Probability
When a good is indivisible, a non-purchase decision can arise for two different reasons: (i) a consumer does not like it, or (ii) a consumer likes it but not enough to buy one unit of it. While the former simply reflects the lack of preference, the latter can change if a smaller package is available. In this section, we explore how package sizes affect purchase decisions by investigating the probability of non-purchase. Based on the full posterior distribution of individual-level parameters from the 6-ounce discrete model, we compute the probability of non-purchase (i.e., corner probability) with different package sizes. These values can then be averaged across all the households to produce the average corner probability. We assume that per-volume prices remain constant in this investigation.

[Figure 2]

Figure 2 illustrates how the probability of non-purchase changes with different package sizes of the blueberry-flavor yogurt. When the true package size (=6 ounce) is assumed, the probability of a corner is 0.902, which is consistent with the frequencies of zeros in the data. This corner probability can be decomposed into the one due to
the lack of preference, and the other reflecting the restriction of demand indivisibility. While the majority of non-purchase is driven by the lack of preference (i.e. the corner probability with a continuous decision space is 0.816), there exists a significant portion of non-purchase (9.52%) that is caused by the discreteness restriction in a consumer’s decision space, which is represented by the gap between the two lines in Figure 2. As the package size decreases, the corner probability also decreases and converges to the case of a continuous decision space, which reflects that the discreteness restriction in a consumer’s decision space is reduced with a smaller package size.
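The dependence of the corner probability on package size can be sketched for a single set of parameter values. This is an illustration only, not the posterior averaging over households described above: it assumes standard normal errors, \(\alpha_0 = 1\), a constant per-volume price, and hypothetical values for \(\alpha\) and \(\gamma\). The function name `corner_probability` is also hypothetical, and a size of zero stands for the continuous limit.

```python
import math
from statistics import NormalDist


def corner_probability(size, price_per_volume, alpha, gamma, alpha0=1.0):
    """P(x = 0) for one good when a package holds `size` volume units.

    The package price is price_per_volume * size, so the upper bound in (3.10)
    at x = 0 becomes log(alpha0 * price_per_volume / alpha)
    plus log(gamma * size / log(gamma * size + 1)).
    """
    base = math.log(alpha0 * price_per_volume / alpha)
    if size == 0:
        upper = base  # continuous decision space (limit as size goes to zero)
    else:
        upper = base + math.log(gamma * size / math.log(gamma * size + 1))
    return NormalDist().cdf(upper)


for size in (0, 1, 2, 3, 6):
    prob = corner_probability(size, price_per_volume=1.0 / 3.0, alpha=1.0, gamma=1.0)
    print(size, round(prob, 4))
```

Under these assumptions the corner probability rises with package size, because \(y / \log(1 + y)\) is increasing in \(y = \gamma s\). The difference between a given size and the size-zero value corresponds to the part of non-purchase attributed to indivisibility.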
3.7.2 Price Elasticity and Compensating Value
Modeling data indivisibility is important because it yields consistent estimates of model parameters, which are the primitives for producing important metrics such as price elasticity and compensating value. In this section, we compare these metrics estimated from the discrete approach with those from the continuous approach and show how the different parameter estimates translate into differences in price elasticity and compensating value. Own-price elasticities are computed by comparing the expected demand under the regular price with the expected demand when the price is reduced by 10%. Because the maximum dollar spending for the inside goods is $11.85 in our data, we set the budgetary allotment to be $20.00 to guarantee that the quantity of the outside good is always positive[9]. The upper panel of Table 4 presents the own-price elasticities of each flavor, and compares the elasticities from the two approaches. The own-price elasticities from the two approaches show an ordinal consistency, indicating that the demand for vanilla is least elastic to price while that for raspberry responds to a price change most sensitively. However, the own-price elasticities based on the continuous approach are greater than those of the discrete approach, which we believe is driven by the biases in the parameter estimates of consumers’ utility. In particular, the underestimation of the satiation effects in the continuous approach results in the overestimation of the own-price elasticities.

[9] The budgetary allotment cannot be estimated because it does not affect the model likelihood when the sub-utility of the outside good is linear. However, we need to set the budgetary allotment for our counterfactual studies. Because the choice of $20.00 is arbitrary, we conducted a sensitivity analysis in which we vary the budgetary allotment and see how sensitive our counterfactual results are. We found that our main results regarding the differences between the discrete and continuous approaches are robust across various budget levels.

[Table 4]

Compensating value measures the value of each flavor by computing the amount of money a consumer would need to reach the initial utility level after a flavor is removed from the product line. It reflects multiple facets of a consumer’s utility (e.g., baseline utility, satiation, etc.) and provides useful information for assortment decisions. The compensating value of the ith flavor for household h is numerically computed by first evaluating the indirect utility under the full assortment condition, then equating it to the indirect utility under the new condition where flavor i is removed and the budget is increased by \(CV_h^{(i)}\) for compensation:

\[ V_h(p, M) = V_h^{(i)}\left(p, M + CV_h^{(i)}\right) \tag{3.21} \]

where

\[ V_h(p, M) = \max_{x} U(x | \theta_h) \quad \text{s.t. } p'x = M \]

\[ V_h^{(i)}(p, M) = \max_{x} U(x | \theta_h) \quad \text{s.t. } p'x = M \;\&\; x_i = 0 \]
The lower panel of Table 4 shows the per-trip compensating value of each flavor for a household. On average, deletion of one flavor is equivalent to the compensation of $0.123 based on the discrete approach, and the vanilla and blueberry yogurt show the greatest value. However, when these metrics are computed based on the continuous approach, the compensating value of each flavor is underestimated by a significant amount. Underestimation of the baseline utility leads to underestimation of the compensating value. In sum, the analyses of price elasticity and compensating value demonstrate the importance of modeling indivisibility of demand. Misrepresenting consumers’ behavior by ignoring the constraint in their decision space causes an inaccurate estimation of their preferences, which in turn distorts managerially important metrics such as price elasticity and compensating value.
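The own-price elasticity calculation described in this section can be sketched for one good. The code below is a simplified illustration, not the household-level computation behind Table 4: it assumes a single good with linear utility for the outside good, standard normal errors, \(\alpha_0 = 1\), a budget that only caps the maximum number of units, and hypothetical parameter values. Expected demand is computed as the sum over k of \(P(x > k)\), where \(P(x > k)\) is the probability that the error exceeds the upper bound in (3.10) at quantity k.

```python
import math
from statistics import NormalDist


def expected_demand(price, s, alpha, gamma, budget, alpha0=1.0):
    """Expected number of units for one good under the discrete model."""
    cdf = NormalDist().cdf  # assumption: errors are standard normal
    max_units = int(budget // price)  # the budget caps the purchase quantity
    base = math.log(alpha0 * price * gamma / alpha)
    total = 0.0
    for k in range(max_units):
        ratio = (gamma * s * (k + 1) + 1) / (gamma * s * k + 1)
        ub_k = base - math.log(math.log(ratio))
        total += 1.0 - cdf(ub_k)  # P(x > k)
    return total


def own_price_elasticity(price, s, alpha, gamma, budget, cut=0.10):
    """Percentage change in expected demand divided by the percentage price change."""
    regular = expected_demand(price, s, alpha, gamma, budget)
    reduced = expected_demand((1.0 - cut) * price, s, alpha, gamma, budget)
    return ((reduced - regular) / regular) / (-cut)


# Hypothetical parameter values; 0.72 and 20.00 mirror the price and budget in the text.
print(own_price_elasticity(price=0.72, s=6.0, alpha=0.5, gamma=0.5, budget=20.0))
```

Because the calculation uses only the bounds in (3.10), the same function can be evaluated at parameter values from the discrete and from the continuous estimates to see how differences in \(\alpha\) and \(\gamma\) carry over to the elasticity.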
3.8 Concluding Remarks
This paper proposes an approach to dealing with indivisible demand by employing inequality restrictions in the model likelihood. Assuming additively separable utility and linear sub-utility for the outside good simplifies the region of integration of the model likelihood. The region becomes irregular whenever the utility function is not additively separable, or when utility for the outside good is specified non-linearly. Bayesian error augmentation is shown to provide a viable solution for estimation when the likelihood has regular or irregular boundaries. Simulation studies are used to validate our algorithm and compare it to estimates obtained from a continuous likelihood that assumes that constrained utility is exactly maximized at the points of observed demand. The simulations show that baseline utilities as well as the degree of satiation are underestimated from the continuous likelihood, particularly in the presence of corner solutions.
We find support for our model using a scanner-panel dataset of yogurt purchases. Explicitly recognizing the indivisibility of demand leads to better in-sample and predictive fit to the data. The same pattern of biases in the parameter estimates is observed in the continuous approach as is shown in the simulation study. We show that the biased parameter estimates in the continuous approach lead to the overestimation of own-price elasticities and the underestimation of compensating values. Our analysis also indicates that a substantial amount of non-purchase is due to the effects of packaging. We believe that dealing with indivisible demand requires a model of direct utility maximization and likelihood-based estimation. Models based on indirect utility, where Roy’s identity is used to associate first-order conditions with the observed data, assume that the indirect utility is continuous and differentiable. The presence of demand discreteness is inconsistent with the notion that attainable utility is continuous, and as a result it is not possible to employ derivatives to obtain demand equations. Estimation based on the dual formulation of cost minimization is similarly unviable (Inoue (2009)). Moment-based estimation techniques such as GMM (Greene (2002) chapter 18) also require modification in the presence of inequality constraints (Pakes et al. (2006)). We leave the comparison of our method to GMM for future research. Finally, our approach can be applied to other utility functions, including models of complementarity and forward-looking behavior. These applications lead to complex regions of integration in the likelihood, for which Bayesian augmentation may be the only viable solution.
Figure 3.1: Sub-utility for Blueberry Yogurt: Discrete vs. Continuous
[Figure: Utility (vertical axis, 0.0 to 0.5) plotted against Quantity (horizontal axis, 0 to 10); series: Discrete (s=6), Continuous]
Figure 3.2: Impact of Package Size on Corner Probability for Blueberry Yogurt
[Figure: Probability of Non-purchase (vertical axis, 0.80 to 0.90) plotted against Package Size in ounces (horizontal axis, 0 to 6); series: Discrete Decision Space, Continuous Decision Space]
Table 3.1: Simulation Study

(a) Simulated Data

|                     | x1 > 0, x2 > 0 | x1 > 0, x2 = 0 | x1 = 0, x2 > 0 | x1 = 0, x2 = 0 | Total |
|---------------------|----------------|----------------|----------------|----------------|-------|
| Avg. Number of Obs. | 16.93          | 34.17          | 15.13          | 33.77          | 100   |

(b) Estimation Results

| Approach   | Unit Grid | α1∗ (=0.000)   | α2∗ (=-0.693)  | γ1∗ (=0.000)   | γ2∗ (=-0.357)  | LL     | True LL |
|------------|-----------|----------------|----------------|----------------|----------------|--------|---------|
| Discrete   | 6         | 0.015 (0.055)  | -0.685 (0.058) | -0.011 (0.068) | -0.328 (0.079) | -20560 | -20560  |
| Discrete   | 3         | -0.797 (0.050) | -1.419 (0.049) | -1.311 (0.047) | -1.648 (0.049) | -29234 | -21484  |
| Discrete   | 2         | -0.920 (0.049) | -1.512 (0.049) | -1.514 (0.042) | -1.835 (0.046) | -33229 | -21852  |
| Discrete   | 1         | -1.022 (0.046) | -1.591 (0.048) | -1.687 (0.043) | -1.993 (0.043) | -39559 | -22232  |
| Continuous |           | -1.108 (0.046) | -1.661 (0.046) | -1.843 (0.042) | -2.131 (0.044) | -25170 | -22630  |

* Posterior standard deviations are given in parentheses ( ).
* αi∗ = log (αi ), γi∗ = log (γi ).
* LL: Log Likelihood.
Table 3.2: Data Description

| Flavor | Unit Price | Purchase Incidence | Purchase Quantity | Zero | One | Two | Three | Four and Above |
|---|---|---|---|---|---|---|---|---|
| Blueberry | 0.72 | 541 | 1074 | 5046 | 243 | 192 | 41 | 65 |
| Strawberry | 0.72 | 455 | 774 | 5132 | 240 | 151 | 33 | 31 |
| Vanilla | 0.72 | 420 | 1025 | 5167 | 157 | 133 | 34 | 96 |
| Raspberry | 0.72 | 474 | 840 | 5113 | 274 | 127 | 25 | 48 |
| Key Lime | 0.72 | 383 | 813 | 5204 | 170 | 122 | 36 | 55 |
| Peach | 0.72 | 393 | 739 | 5194 | 183 | 137 | 27 | 46 |

* Total number of shopping trips included in the data is 5587.
Table 3.3: Estimation Results

| Model | Model-Fit: In-sample | Model-Fit: Predictive | Blueberry | Strawberry | Vanilla | Raspberry | Key Lime | Peach | Satiation (γ*) |
|---|---|---|---|---|---|---|---|---|---|
| Discrete (si = 6) | -8691 [-8691] | -1237 [-1237] | -3.38 (0.12) | -3.34 (0.10) | -3.76 (0.14) | -3.33 (0.10) | -3.69 (0.13) | -3.50 (0.11) | -1.66 (0.11) |
| Discrete (si = 3) | -11051 [-8935] | -1527 [-1251] | -3.73 (0.11) | -3.69 (0.09) | -4.09 (0.14) | -3.68 (0.09) | -4.03 (0.12) | -3.84 (0.10) | -2.42 (0.07) |
| Discrete (si = 2) | -12168 [-9019] | -1668 [-1259] | -3.77 (0.11) | -3.74 (0.09) | -4.13 (0.13) | -3.72 (0.08) | -4.07 (0.12) | -3.89 (0.10) | -2.54 (0.06) |
| Discrete (si = 1) | -13948 [-9107] | -1894 [-1268] | -3.81 (0.10) | -3.78 (0.09) | -4.19 (0.14) | -3.76 (0.08) | -4.11 (0.12) | -3.94 (0.10) | -2.65 (0.06) |
| Continuous | -9843 [-9192] | -1370 [-1278] | -3.85 (0.10) | -3.81 (0.09) | -4.23 (0.14) | -3.80 (0.08) | -4.15 (0.12) | -3.98 (0.10) | -2.74 (0.06) |

* The flavor and satiation columns report parameter estimates; posterior standard deviations are given in parentheses ( ).
* True log-likelihood values are given in brackets [ ].
Table 3.4: Price Elasticity and Compensating Value

(a) Price Elasticity

| Flavor | Blueberry | Strawberry | Vanilla | Raspberry | Key Lime | Peach |
|---|---|---|---|---|---|---|
| Discrete | 1.90 | 2.08 | 1.86 | 2.23 | 2.04 | 2.03 |
| Continuous | 2.13 | 2.31 | 2.10 | 2.47 | 2.23 | 2.33 |

(b) Compensating Value

| Flavor | Blueberry | Strawberry | Vanilla | Raspberry | Key Lime | Peach |
|---|---|---|---|---|---|---|
| Discrete | 0.170 | 0.108 | 0.175 | 0.084 | 0.100 | 0.102 |
| Continuous | 0.138 | 0.094 | 0.138 | 0.075 | 0.082 | 0.081 |
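The size of the gap between the two approaches in Table 3.4 can be made explicit with a short calculation. The sketch below transcribes the reported values and computes the percentage difference of the continuous estimate relative to the discrete one; the variable and function names are illustrative, and the percentage-gap summary is an added convenience rather than a statistic reported in the dissertation.

```python
# Values transcribed from Table 3.4; names below are illustrative only.
flavors = ["Blueberry", "Strawberry", "Vanilla", "Raspberry", "Key Lime", "Peach"]

elasticity_discrete = [1.90, 2.08, 1.86, 2.23, 2.04, 2.03]
elasticity_continuous = [2.13, 2.31, 2.10, 2.47, 2.23, 2.33]

cv_discrete = [0.170, 0.108, 0.175, 0.084, 0.100, 0.102]
cv_continuous = [0.138, 0.094, 0.138, 0.075, 0.082, 0.081]


def percent_gap(continuous, discrete):
    """Percentage difference of the continuous value relative to the discrete value."""
    return [100.0 * (c - d) / d for c, d in zip(continuous, discrete)]


elasticity_gap = percent_gap(elasticity_continuous, elasticity_discrete)
cv_gap = percent_gap(cv_continuous, cv_discrete)

for name, e_gap, c_gap in zip(flavors, elasticity_gap, cv_gap):
    print(f"{name:<10} elasticity {e_gap:+.1f}%   compensating value {c_gap:+.1f}%")
```

For every flavor the elasticity gap is positive and the compensating-value gap is negative, in line with the stated overestimation of own-price elasticities and underestimation of compensating values under the continuous approach.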
Table 3.5: Simulation Study with Non-linear Sub-utility for an Outside Good

(a) Simulated Data

| | x1 > 0, x2 > 0 | x1 > 0, x2 = 0 | x1 = 0, x2 > 0 | x1 = 0, x2 = 0 | Total |
|---|---|---|---|---|---|
| Avg. Number of Obs. | 16.86 | 29.18 | 19.23 | 34.73 | 100 |

(b) Estimation Results

| Approach | Unit Grid | α1* (= -3.912) | α2* (= -4.200) | γ1* (= 0.000) | γ2* (= 0.000) | M* (= 3.912) |
|---|---|---|---|---|---|---|
| Discrete | 6 | -3.885 (0.061) | -4.220 (0.054) | 0.042 (0.058) | 0.028 (0.054) | 3.860 (0.047) |
| Discrete | 3 | -3.893 (0.061) | -4.194 (0.066) | -1.662 (0.056) | -1.744 (0.058) | 2.968 (0.067) |
| Discrete | 2 | -3.895 (0.054) | -4.281 (0.061) | -1.881 (0.058) | -1.967 (0.068) | 2.950 (0.069) |
| Discrete | 1 | -3.896 (0.058) | -4.290 (0.062) | -2.010 (0.060) | -2.103 (0.066) | 2.896 (0.068) |
| Continuous | | -4.020 (0.060) | -4.297 (0.062) | -2.259 (0.058) | -2.343 (0.057) | 2.832 (0.066) |

* Posterior standard deviations are given in parentheses ( ).
* αi* = log(αi), γi* = log(γi), M* = log(M).
Bibliography
Ailawadi, Kusum L., Karen Gedenk, Christian Lutzky, Scott A. Neslin. 2007. Decomposition of the sales impact of promotion-induced stockpiling. Journal of Marketing Research 44(3) 450–467.
Ainslie, Andrew, Peter E. Rossi. 1998. Similarities in choice behavior across product categories. Marketing Science 17(2) 91–106.
Allenby, Greg M., Thomas S. Shively, Sha Yang, Mark J. Garratt. 2004. A choice model for packaged goods: Dealing with discrete quantities and quantity discounts. Marketing Science 23(1) 95–108.
Anderson, Eric T., Duncan I. Simester. 2004. Long-run effects of promotion depth on new versus established customers: Three field studies. Marketing Science 23(1) 4–20.
Arora, Neeraj, Greg M. Allenby, James L. Ginter. 1998. A hierarchical bayes model of primary and secondary demand. Marketing Science 17(1) 29–44.
Basu, Amiya, Tridib Mazumdar, S. P. Raj. 2003. Indirect network externality effects on product attributes. Marketing Science 22(2) 209–221.
Beviá, Carmen, Martine Quinzii, José A. Silva. 1999. Buying several indivisible goods. Math. Social Sci. 37(1) 1–23.
Bhat, C.R. 2005. A multiple discrete-continuous extreme value model: Formulation and application to discretionary time-use decisions. Transportation Res. 39(8) 679–707.
Bhat, C.R. 2008. The multiple discrete-continuous extreme value (mdcev) model: Role of utility function parameters, identification considerations, and model extensions. Transportation Res. 42(3) 274–303.
Bradlow, Eric T., Alan M. Zaslavsky. 1999. A hierarchical latent variable model for ordinal data from a customer satisfaction survey with "no answer" responses. Journal of the American Statistical Association 94(445) 43–52.
Bronnenberg, Bart J., Michael W. Kruger, Carl F. Mela. 2008. Database paper: The iri marketing data set. Marketing Science 27(4) 745–748.
Chandukala, Sandeep, Jaehwan Kim, Thomas Otter, Peter Rossi, Greg Allenby. 2007. Choice models in marketing: Economic assumptions, challenges and trends. Foundations and Trends in Marketing 2(2) 97–184.
Chapman, Randall G., Richard Staelin. 1982. Exploiting rank ordered choice set data within the stochastic utility model. Journal of Marketing Research 19(3) 288–301.
Chiang, Jeongwen. 1991. A simultaneous approach to the whether, what and how much to buy questions. Marketing Science 10(4) 297–315.
Chintagunta, P. K., H.S. Nair. 2011. Discrete-choice models of consumer demand in marketing. Marketing Science 30(6) 977–996.
Chintagunta, Pradeep, Tülin Erdem, Peter E. Rossi, Michel Wedel. 2006. Structural modeling in marketing: Review and assessment. Marketing Science 25(6) 604–616.
Christensen, L.R., D.W. Jorgenson, L.J. Lau. 1975. Transcendental logarithmic utility functions. Amer. Econom. Rev. 3 367–383.
Danaher, Peter J., Michael S. Smith. 2011. Modeling multivariate distributions using copulas: Application in marketing. Marketing Science 30(1) 4–21.
Danilov, Vladimir, Gleb Koshevoy, Kazuo Murota. 2001. Discrete convexity and equilibria in economies with indivisible goods and money. Math. Social Sci. 41(3) 251–273.
Dubé, Jean-Pierre. 2004. Multiple discreteness and product differentiation: Demand for carbonated soft drinks. Marketing Science 23(1) 66–81.
Duvvuri, Sri Devi, Asim Ansari, Sunil Gupta. 2007. Consumers' price sensitivities across complementary categories. Management Science 53(12) 1933–1945.
Erdem, Tülin. 1998. An empirical analysis of umbrella branding. Journal of Marketing Research 35(3) 339–351.
Geyer, C.J. 1992. Practical markov chain monte carlo. Statistical Sci. 7 473–483.
Greene, William H. 2002. Econometric Analysis. 5th ed. Prentice Hall, Upper Saddle River, New Jersey.
Gupta, Sunil. 1988. Impact of sales promotions on when, what, and how much to buy. Journal of Marketing Research 25(4) 342–355.
Hanemann, W.M. 1984. Discrete/continuous models of consumer demands. Econometrica 52 541–561.
Hansen, Karsten, Vishal Singh, Pradeep Chintagunta. 2006. Understanding store-brand purchase behavior across categories. Marketing Science 25(1) 75–90.
Inoue, Tomoki. 2009. Indivisible commodities and an equivalence theorem on the strong core. IMW working papers, ISSN 0931-6558, Bielefeld: Universität Bielefeld.
Kalyanam, Kirthi, Daniel S. Putler. 1997. Incorporating demographic variables in brand choice models: An indivisible alternatives framework. Marketing Science 16(2) 166–181.
Kim, Jaehwan, Greg M. Allenby, Peter E. Rossi. 2002. Modeling consumer demand for variety. Marketing Science 21(3) 229–250.
Kuriyama, Koichi, W. Michael Hanemann. 2006. The integer programming approach to a generalized corner-solution model: An application to recreation demand. Working paper.
Lee, Sanghak, Greg M. Allenby. 2012. Modeling indivisible demand. Working paper.
Lee, Sanghak, Jaehwan Kim, Greg M. Allenby. 2012. A direct utility model for asymmetric complements. Working paper.
Li, Shibo, Baohong Sun, Ronald T. Wilcox. 2005. Cross-selling sequentially ordered products: An application to consumer banking services. Journal of Marketing Research 42(2) 233–239.
Manchanda, Puneet, Asim Ansari, Sunil Gupta. 1999. The "shopping basket": A model for multicategory purchase incidence decisions. Marketing Science 18(2) 95–114.
Manchanda, Puneet, Peter E. Rossi, Pradeep K. Chintagunta. 2004. Response modeling with nonrandom marketing-mix variables. Journal of Marketing Research 41(4) 467–478.
Mehta, Nitin. 2007. Investigating consumers' purchase incidence and brand choice decisions across multiple product categories: A theoretical and empirical analysis. Marketing Science 26(2) 196–217.
Nair, Harikesh, Pradeep K. Chintagunta, Jean-Pierre Dubé. 2004. Empirical analysis of indirect network effects in the market for personal digital assistants. Quantitative Marketing and Economics 2(1) 23–58.
Nair, Harikesh, Jean-Pierre Dubé, Pradeep Chintagunta. 2005. Accounting for primary and secondary demand effects with aggregate data. Marketing Science 24(3) 444–460.
Pakes, A., J. Porter, Kate Ho, Joy Ishii. 2006. Moment inequalities and their application. Working paper.
Pashigian, B.P. 1998. Price Theory and Applications. Irwin/McGraw-Hill: Boston, Mass.
Pollak, R.A., T.J. Wales. 1992. Demand System Specification and Estimation. Oxford, UK.
Raftery, A.E., S.M. Lewis. 1995. [practical markov chain monte carlo]: Comment: One long run with diagnostics: Implementation strategies for markov chain monte carlo. Statistical Sci. 7 493–497.
Rossi, P.E., G.M. Allenby, R. McCulloch. 2005. Bayesian Statistics and Marketing. John Wiley and Sons Ltd.
Rossi, Peter E., Zvi Gilula, Greg M. Allenby. 2001. Overcoming scale usage heterogeneity: A bayesian hierarchical approach. Journal of the American Statistical Association 96(453) 20–31.
Satomura, Takuya, Jaehwan Kim, Greg Allenby. 2011. Multiple-constraint choice models with corner and interior solutions. Marketing Science 30(3) 481–490.
Small, Kenneth A., Harvey S. Rosen. 1981. Applied welfare economics with discrete choice models. Econometrica 49(1) 105–130.
Song, I., P.K. Chintagunta. 2007. A discrete-continuous model for multicategory purchase behavior of households. J. Marketing Res. 44 595–612.
Tanner, Martin A., Wing Hung Wong. 1987. The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association 82(398) 528–540.
Zeithammer, Robert, Peter Lenk. 2006. Bayesian estimation of multivariate-normal models when dimensions are absent. Quantitative Marketing and Economics 4(3) 241–265.
Appendix A: Probability Mass for Zero (Discrete vs. Continuous)
A consumer’s utility function is given as follows (time subscript is suppressed for simplicity):
$$U^*(x_1, \cdots, x_n) = \sum_{i=1}^{n} u_i(x_i) + \alpha_0 \left( M - \sum_{i=1}^{n} p_i x_i \right)$$
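When $x_i^* = 0$ in the observation (i.e., $(x_i^* = 0, x_{-i}^*)$), the probability mass for $x_i^* = 0$ is computed differently in the two methods:

(1) Discrete approach

$$
\begin{aligned}
& U^*\left(x_i^* = 0, x_{-i}^*\right) \ge U^*\left(x_i^* = 1, x_{-i}^*\right) \\
\Leftrightarrow\ & u_i(x_i^* = 0) + \sum_{j=1, j \ne i}^{n} u_j(x_j^*) + \alpha_0 \left( M - \sum_{j=1, j \ne i}^{n} p_j x_j^* \right) \\
& \ge u_i(x_i^* = 1) + \sum_{j=1, j \ne i}^{n} u_j(x_j^*) + \alpha_0 \left( M - p_i - \sum_{j=1, j \ne i}^{n} p_j x_j^* \right) \\
\Leftrightarrow\ & u_i(x_i^* = 0) \ge u_i(x_i^* = 1) - \alpha_0 p_i
\end{aligned}
$$

Therefore, the corresponding region in error space is:

$$D_{\varepsilon_i} = \left\{ \varepsilon_i \mid u_i(x_i^* = 1, \varepsilon_i) - u_i(x_i^* = 0, \varepsilon_i) \le \alpha_0 p_i \right\}$$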
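(2) Continuous approach

$$\left. \frac{\partial U^*}{\partial x_i} \right|_{x_i = x_i^* = 0} = \left. \frac{\partial u_i(x_i)}{\partial x_i} \right|_{x_i = x_i^* = 0} - \alpha_0 p_i \le 0$$

Therefore, the corresponding region in error space is:

$$C_{\varepsilon_i} = \left\{ \varepsilon_i \;\middle|\; \left. \frac{\partial u_i(x_i, \varepsilon_i)}{\partial x_i} \right|_{x_i = x_i^* = 0} \le \alpha_0 p_i \right\}$$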
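The two error regions can be compared numerically under a specific functional form. The sketch below assumes a sub-utility of the form (alpha exp(eps) / gamma) log(gamma x + 1) with a normally distributed error eps; the functional form, the error distribution, the parameter values, and the function name are all assumptions made for illustration and are not taken from the appendix. Under this form both regions reduce to an upper threshold on eps, so each probability mass is a normal CDF evaluated at that threshold.

```python
import math
from statistics import NormalDist


def corner_probabilities(alpha, gamma, alpha0, price, sigma=1.0):
    """Return (P_discrete, P_continuous) for x_i = 0.

    Assumed sub-utility: u_i(x, eps) = (alpha * exp(eps) / gamma) * log(gamma * x + 1),
    with eps ~ N(0, sigma^2). One unit of x is one package.
    """
    log_cost = math.log(alpha0 * price)

    # Continuous region: du/dx at x = 0 equals alpha * exp(eps) <= alpha0 * price.
    eps_continuous = log_cost - math.log(alpha)

    # Discrete region: u(1) - u(0) = alpha * exp(eps) * log(gamma + 1) / gamma <= alpha0 * price.
    eps_discrete = log_cost - math.log(alpha) - math.log(math.log(gamma + 1.0) / gamma)

    dist = NormalDist(mu=0.0, sigma=sigma)
    return dist.cdf(eps_discrete), dist.cdf(eps_continuous)


if __name__ == "__main__":
    # Hypothetical parameter values; only the price matches the unit price in Table 3.2.
    p_discrete, p_continuous = corner_probabilities(alpha=1.0, gamma=0.7, alpha0=1.0, price=0.72)
    print(f"P(x_i = 0), discrete:   {p_discrete:.3f}")
    print(f"P(x_i = 0), continuous: {p_continuous:.3f}")
```

Because log(gamma + 1) / gamma is below one for any positive gamma, the discrete threshold in this sketch lies above the continuous one, so the discrete approach assigns more probability mass to zero.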
Since ui (xi ) is concave,
ui (x∗i =1)−ui (x∗i =0) 1−0