The Probit Choice Model under Sequential Search with an Application to Online Retailing

Jun B. Kim

Paulo Albuquerque

Bart J. Bronnenberg∗

July 22, 2014

Abstract We develop a probit-based choice model under optimal sequential search and apply the model to study aggregate demand of consumer durable goods. In our joint model of search and choice, we fully characterize optimal sequential search and derive a semi-closed form expression for the probability of choice that obeys the full set of restrictions imposed by optimal sequential search. Our joint model leads to a partial simulation-based estimation that avoids demanding high-dimensional, simulated integrations in evaluating choice probabilities and that is particularly attractive when the consumer search set is large. We demonstrate the applicability of the proposed model using aggregate search and choice data from the camcorder product category at Amazon.com. We show that the joint use of search and choice data provides better predictions than using search data alone and leads to more realistic estimates of consumer substitution patterns.

Keywords: Optimal sequential search, aggregate demand models, information economics, discrete choice, market structure, consumer heterogeneity

∗ Jun B. Kim is an Assistant Professor of Marketing at Hong Kong University of Science and Technology. Paulo Albuquerque is an Associate Professor of Marketing at the Simon School of Business, University of Rochester. Bart J. Bronnenberg is a Professor of Marketing, and CentER Research Fellow, Tilburg University. We thank seminar participants at the 2012 Marketing Science Conference in Boston, MA, and the 2013 Invitational Choice Symposium in Noordwijk, the Netherlands. Bart Bronnenberg gratefully acknowledges EU funding from the Marie Curie Program (IRG 230962).

1 Introduction

Online retailers routinely collect and publish data on consumer choice behavior. More recently, they have also started displaying additional details about browsing behavior of shoppers at their online stores. Data on product browsing patterns - for example, in the form of “consumers that viewed this item also view these other items” - are available at online retailers such as Amazon.com, Target.com, Staples.com, and Kmart.com. Data on purchases in terms of sales rank and purchases conditional on browsing specific products - “consumers that viewed this item purchased these other items” - are shown at Amazon.com and Walmart.com.1 This proliferation of online consumer data provides new ways for marketers to better understand consumer decisions in a variety of product categories. With these publicly available aggregate browsing and choice data in mind, the present study sets out to analyze consumer choice and pre-choice browsing behavior in a unified manner. We propose a theory-based empirical model that fully characterizes consumer optimal search and choice decisions in a costly search environment. We apply this model to choice-based aggregate demand models in highly differentiated consumer durable products, in which consumer valuations are often complex. By combining aggregate search and choice data, we aim to achieve a better estimation of consumer preferences and heterogeneity of tastes, which is crucial for segmentation and targeting in marketing. One important methodological advantage of considering both search and choice decisions is that the explicit inclusion of optimal sequential search into a model of choice imposes constraints on the values that the overall utility of products can take pre- and post-search. To highlight the nature of such constraints, consider the following simple illustration of choice under optimal sequential search.

1 For a more elaborate list of available data on consumer browsing and purchase, please refer to the Appendix.

Example. Suppose we model the joint probability that, out of $J = 4$ options, a consumer searches a set of three options in the order $S_K = [1, 2, 3]$ and chooses $j = 2$. Let option $\ell$'s utility be $u_\ell = V_\ell + e_\ell$, where $V_\ell$ and $e_\ell$ are the expected and random utility components for $\ell$, respectively. Next, $\ell$'s reservation utility is denoted by $z_\ell$, which is a measure of the attractiveness of searching $\ell$ and equals the hypothetical, in-hand utility that makes the consumer indifferent about searching option $\ell$ (see Weitzman, 1979, or Section 2.2 in this paper). From the observed search sequence in $S_K$, option 1 was the most attractive to search while option 4 was the least attractive, i.e., $z_1 > z_2 > z_3 > z_4$. Collectively, optimal sequential search and choice generate a set of restrictions on $e_\ell$. First, the choice of $j = 2$ generates the following set of inequalities on the utilities of the searched options,

$$V_2 + e_2 > V_1 + e_1, \qquad V_2 + e_2 > V_3 + e_3. \tag{1}$$

Second, the sequence and composition of the optimal search set, $S_K$, impose the following set of inequalities on the random utility components $e_1$ and $e_2$,

$$e_1 < z_2 - V_1, \qquad e_2 < z_3 - V_2, \qquad e_2 > z_4 - V_2. \tag{2}$$

Intuitively speaking, the decision to search for an extra product after searching 1 implies that the utility draw for 1 was not attractive enough to forego the expected attractiveness of the unsearched set. For instance, in Equation (2), the second inequality, $e_2 < z_3 - V_2$, means that option 3 is attractive to search since its reservation utility $z_3$ is greater than the realized utility of the best alternative so far, $u_2 = V_2 + e_2$.

If we ignore search, accounting solely for the choice conditions in Equation (1) leads to a conditional choice probability of

$$\Pr(j = 2 \mid S_K) = \Pr\left(e_1 < V_2 - V_1 + e_2,\; e_3 < V_2 - V_3 + e_2\right),$$

which, assuming that the $e_\ell$ are i.i.d. random variables that follow a normal distribution, gives a probit conditional choice probability expressed as

$$\Pr(j = 2 \mid S_K) = \int_{-\infty}^{\infty} \prod_{\ell \neq 2}^{3} \Phi_\ell\left(V_2 - V_\ell + e_2\right) \phi_2(e_2)\, de_2. \tag{3}$$
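As a numerical illustration of Equation (3), the univariate integral can be evaluated with a simple quadrature rule. The sketch below is only one possible way to do this: the values in `V`, the common standard deviation `sigma`, the truncation of the infinite limits at plus or minus `width` standard deviations, and the helper names are all assumptions made for illustration, not part of the paper.

```python
import math

def norm_pdf(x, sigma=1.0):
    """Density of N(0, sigma^2) at x."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def norm_cdf(x, sigma=1.0):
    """CDF of N(0, sigma^2) at x."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def probit_choice_prob(V, j, sigma=1.0, width=10.0):
    """Equation (3): probability that option j (0-based) has the highest utility in V."""
    def integrand(e_j):
        value = norm_pdf(e_j, sigma)
        for l, V_l in enumerate(V):
            if l != j:
                value *= norm_cdf(V[j] - V_l + e_j, sigma)
        return value
    # The infinite limits are truncated at +/- width standard deviations (assumption).
    return simpson(integrand, -width * sigma, width * sigma)

V = [1.0, 1.2, 0.8]               # hypothetical V_1, V_2, V_3
p2 = probit_choice_prob(V, j=1)   # option 2 has 0-based index 1
print(p2)
```

With three options of equal expected utility the function returns one third for each option, which is a convenient sanity check.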

When considering both optimal search and choice, we simultaneously account for the inequalities in (1) and (2). Under both sets of conditions, the joint probability $\Pr(j = 2 \cap S_K)$ is

$$\Pr\left(\left[e_1 < V_2 - V_1 + e_2,\; e_3 < V_2 - V_3 + e_2\right] \cap \left[e_1 < z_2 - V_1,\; e_2 < z_3 - V_2,\; e_2 > z_4 - V_2\right]\right). \tag{4}$$
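The joint probability in Equation (4) can be approximated by brute-force simulation, which also makes it possible to check that the inequalities in (1) and (2) describe the same event as the search and choice rules themselves. The following is a minimal sketch under stated assumptions: the values of `V` and `z` are hypothetical, the shocks are standard normal, and the function names are illustrative.

```python
import random

def weitzman_outcome(u, z):
    """Apply the selection, stopping, and choice rules to one consumer.

    Options are assumed to be sorted by descending reservation utility z.
    Returns (number of options searched, 0-based index of the chosen option).
    """
    best, best_idx = u[0], 0
    K = 1
    while K < len(u) and best < z[K]:   # keep searching while z of the next option exceeds u*
        if u[K] > best:
            best, best_idx = u[K], K
        K += 1
    return K, best_idx

def count_events(V, z, n_draws=20000, seed=0):
    """Count draws with S_K = [1, 2, 3] and choice j = 2, by the rules and by the inequalities."""
    rng = random.Random(seed)
    n_rules = n_ineq = 0
    for _ in range(n_draws):
        u = [v + rng.gauss(0.0, 1.0) for v in V]
        K, chosen = weitzman_outcome(u, z)
        n_rules += (K == 3 and chosen == 1)
        # Inequalities (1) and (2), written in terms of u = V + e.
        n_ineq += (u[1] > u[0] and u[1] > u[2]
                   and u[0] < z[1] and u[1] < z[2] and u[1] > z[3])
    return n_rules, n_ineq

V = [1.0, 1.2, 0.8, 0.5]   # hypothetical expected utilities
z = [2.5, 2.2, 1.9, 1.0]   # hypothetical reservation utilities, z_1 > z_2 > z_3 > z_4
n_rules, n_ineq = count_events(V, z)
print(n_rules / 20000, n_ineq / 20000)
```

Both counts are computed from the same draws, so they coincide whenever the two descriptions of the event agree.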

Due to the various restrictions on $e_\ell$, the evaluation of this joint probability typically calls for a numerical simulation. However, as the size of the optimal search set $S_K$ increases and as the location of choice $j$ varies within $S_K$, the number and nature of the search restrictions in the form of Equation (2) on choice become more complex, which poses an estimation challenge. As an alternative approach, this paper characterizes sufficient conditions for choice under optimal sequential search and yet develops a concise expression for choice probabilities such as the one in Equation (4), using only univariate integration. This greatly broadens the applicability of models that simultaneously consider search and choice.

On a more substantive side, the simultaneous use of choice and search data allows researchers to improve estimation of consumer preferences and heterogeneity, which has been one of the main challenges in a wide class of empirical models that adopt choice-based demand models using aggregate-level data (Berry, Levinsohn, and Pakes 1995). To better estimate heterogeneity, researchers have traditionally augmented aggregate choice data with extra data such as second-choice surveys (Berry, Levinsohn, and Pakes 2004), consumer awareness data (Draganska and Klapper, 2011; Goeree, 2008), aggregate-level consumer switching rates (Albuquerque and Bronnenberg, 2009), and consumer demographic information (Petrin, 2002). The use of extra data is motivated by the key notion that the identification of random coefficients comes from variation in the choice set, either over time or across markets (Berry, Levinsohn, and Pakes, 2004, page 90). In much prior research, these extra data sources serve as additional moment conditions for estimation. We adopt a similar approach that combines data sets to better identify the demand parameters of interest. However, browsing or search data are more useful because variation in search sets across consumers directly reflects heterogeneous consumer tastes. For example, observing consumers searching almost exclusively for camcorders with hard-drive storage reveals a strong preference for this attribute. Alternatively, if consumers search for some products with hard-drive storage and others with DVD storage, that signals more diffuse preferences for storage type. This notion is empirically supported by the existence of a large degree of heterogeneity among consumer search sets (Bronnenberg, Kim, and Mela 2014).

This paper contributes to the empirical search literature in several ways. Given recent advances in the empirical search literature in studying search costs in homogeneous goods (e.g., De los Santos, Hortacsu, and Wildenbeest 2012; Seiler 2013), we narrow our discussion to models in the differentiated goods context, where consumer valuations are complex. First, comparing our work to a related paper by Kim, Albuquerque, and Bronnenberg (2010), we model both search and choice decisions, whereas the former models search decisions only. One key improvement from the joint use of search and choice data is that we can explicitly model that some consumers do not buy upon search.
Demand estimates based solely on search data can be biased because some consumers search but do not buy; when this is unaccounted for, it may lead to poor predictions and misleading inferences. Note that the choice of the outside option cannot be modeled using search data alone. Second, Ghose, Ipeirotis, and Li (2013) and Chen and Yao (2012) study the identification of search costs in the context of heterogeneous goods and characterize the composition of the optimal search set. Our approach has the benefit of additionally modeling the optimal search sequence, which allows more complete use of search data and more efficient demand estimates. Third, in two recent papers, Honka and Chintagunta (2014) and Koulayev (2014) model both the sequence and the composition of consumer search, as we do in this paper. However, our approach leads to a semi-closed form, partial-simulation solution that does not require high-dimensional numerical integration and that scales to larger choice sets, which are the norm and not the exception in consumer durable goods categories. Many researchers (e.g., Train, 2009) advocate the adoption of such methods whenever possible, due to higher accuracy and lower computational cost.

Our empirical model also adds to the body of choice-based, aggregate demand models. Unlike previous approaches in which researchers assume exogenous variations in the choice set, we introduce variation in the choice set as an outcome of an endogenous consumer search process. In doing so, we simultaneously describe the constraints on the unobserved term of utilities imposed by the search stage, develop an estimation approach that is scalable to large data sets, and achieve better estimates of consumer preferences, including those of consumer heterogeneity.

Substantively, we demonstrate the applicability of the proposed approach using aggregate search and choice data from Amazon.com and study substitution patterns and market structure in the camcorder industry. The wide availability of aggregate-level data from many online sources makes our aggregate demand model applicable to many different product categories in consumer durables.

The rest of the paper is organized as follows. In Section 2, we propose our model of search, choice, and choice conditional on search. Section 3 discusses the data used in our empirical application and presents the corresponding identification and estimation approaches. Section 4 discusses the results of the empirical application. We conclude in Section 5.

2 A Probit Model of Sequential Search and Choice

2.1 Setup

We model consumer search and choice data as the outcome of optimal sequential search (Weitzman, 1979) and choice decisions. In this model, the consumer keeps searching products as long as the expected marginal benefit of doing so is greater than the marginal cost. Upon termination of search, the consumer chooses the highest utility product among the searched products. The utility for consumer $i$ and for product $\ell = 1, \ldots, J$ is

$$u_{i\ell} = V_{i\ell} + e_{i\ell}, \tag{5}$$

with

$$V_{i\ell} = X_\ell b_i, \qquad b_i \sim N(b, B), \qquad e_{i\ell} \sim N\left(0, \sigma_{i\ell}^2\right),$$

where $X_\ell$ is a row vector of product characteristics, $b_i$ is a vector that represents individual-specific tastes for product characteristics, and the variance matrix $B$ is assumed to be diagonal. We interpret the consumer search process as a costly consumer effort to obtain the full match value of a specific option $\ell$. Prior to search, we assume that consumers know an expected match value, $V_{i\ell}$, but that the exact realization of the match value is subject to a shock, $e_{i\ell}$, drawn from a known distribution. To fully resolve the unknown match value of $V_{i\ell} + e_{i\ell}$, consumers engage in costly search.2

2 Note that our interpretation of $e_{i\ell}$ is similar to those found in Anderson and Renault (1999) and Kim, Albuquerque, and Bronnenberg (2010).

In our empirical setting, the values of important attributes, which are captured in $V_{i\ell}$, are readily accessible by consumers at zero search cost prior to searching for $\ell$. Given $V_{i\ell}$, the goal of search is to resolve the value of $e_{i\ell}$ by incurring a search cost of $c_{i\ell}$. Upon search, consumers have access to more details about the option and receive the realized value of $e_{i\ell}$. A large positive value of $e_{i\ell}$ obtained after search means that consumers found a good match between the resolved utility value and their idiosyncratic preferences. Finally, the utility of the outside good is represented as

$$u_{i0} \sim N\left(V_0, \sigma_0^2\right).$$

The demand primitives in our model are the consumer utility and search cost parameters. We now present three models for aggregate data: (1) optimal sequential search, (2) choice under optimal sequential search, and (3) choice of $j$ given search of $\ell$. The first model is very close to Kim, Albuquerque, and Bronnenberg (2010), whereas models 2 and 3 are new and constitute the main focus of this paper.
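To make the utility primitives in Equation (5) concrete, the sketch below draws the tastes, expected utilities, and match shocks for one pseudo-consumer. The attribute matrix `X`, the taste means `b_mean`, the diagonal of the variance matrix `b_var`, and the shock standard deviations `sigma` are hypothetical values chosen only for illustration.

```python
import random

def draw_consumer(X, b_mean, b_var, sigma, rng):
    """Draw one consumer: tastes b_i ~ N(b, diag(B)), then V_i = X b_i and e_i ~ N(0, sigma^2).

    X is a list of attribute rows (one per product); b_var holds the diagonal of B.
    Returns (V_i, e_i); realized utilities are u_il = V_il + e_il as in Equation (5).
    """
    b_i = [rng.gauss(m, v ** 0.5) for m, v in zip(b_mean, b_var)]
    V_i = [sum(x * b for x, b in zip(row, b_i)) for row in X]
    e_i = [rng.gauss(0.0, s) for s in sigma]
    return V_i, e_i

# Hypothetical inputs: 3 products, 2 attributes (e.g., a dummy and a scaled price).
X = [[1.0, 0.5], [0.0, 0.3], [1.0, 0.8]]
b_mean = [0.4, -1.0]
b_var = [0.25, 0.10]
sigma = [1.0, 1.0, 1.0]

rng = random.Random(1)
V_i, e_i = draw_consumer(X, b_mean, b_var, sigma, rng)
u_i = [v + e for v, e in zip(V_i, e_i)]
print(V_i, u_i)
```

Before search the consumer knows only `V_i`; searching product $\ell$ reveals the corresponding entry of `e_i`.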

2.2 Optimal Sequential Search

For the search part of our model, we use the theoretical framework from Weitzman (1979) and an empirical model of optimal search similar to the one proposed in Kim, Albuquerque, and Bronnenberg (2010). We refer to the latter where necessary to avoid repetition here. Define $u^*$ at any stage of the search as the highest utility among the products searched thus far. Conditional on $u^*$, a consumer's expected marginal benefit from search of product $\ell$ is3

$$B_\ell(u^*) = \int_{u^*}^{\infty} (u_\ell - u^*)\, f(u_\ell)\, du_\ell, \tag{6}$$

where $f(\cdot)$ is the probability density function of $u_\ell$. Intuitively, $B_\ell(u^*)$ captures the expected utility increment from alternative $\ell$ over the utility $u^*$ in hand.

3 We omit the individual index $i$ for clarity.

When the stochastic components of the utility are uncorrelated across alternatives, Weitzman (1979) proves that the optimal sequential search decision of a consumer relies on his or her reservation utility of search. The reservation utility, which we denote by $z_\ell$, is the hypothetical in-hand utility that makes the consumer indifferent between searching and not searching option $\ell$. Mathematically, it is defined by

$$B_\ell(z_\ell) = c_\ell, \tag{7}$$

where $c_\ell$ is the search cost for option $\ell$. Prior to search, each consumer computes her reservation utilities for all options. Armed with these reservation utilities, the consumer engages in a three-stage search and choice process. First, she searches products in the order of descending reservation utility (selection rule). Second, search stops when the highest utility obtained thus far, $u^*$, is greater than the highest reservation utility among the items not yet searched (stopping rule). Finally, the product with the highest utility within the searched set is chosen (choice rule). Because the rank of the reservation utility, $r(\ell \mid \theta)$, is a one-to-one mapping into the index $\ell$, we cast the model using $\ell$ as the order of reservation utilities. The following result holds for the probability to search.

Proposition 1. Rank products on reservation utility. The probability that the option with the $k$th highest reservation utility is searched is equal to

$$\pi_k = \Pr\left(\max_{\ell = 1, \ldots, k-1} (V_\ell + e_\ell) < z_k\right) = \prod_{\ell=1}^{k-1} \Phi_\ell\left(z_k - V_\ell\right), \quad k > 1, \tag{8}$$

where $\Phi_\ell(\cdot)$ is the cumulative distribution function (CDF) of the error term $e_\ell$.

Proof. This follows immediately from Kim, Albuquerque, and Bronnenberg (2010), who show that the probability of inclusion for option $k$ is equal to the probability that the utility draws of the $k - 1$ options with the highest reservation utilities all fall short of $z_k$.
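The reservation utilities in Equation (7) and the search probabilities in Equation (8) can be computed in a few lines. The sketch below relies on a standard closed form for the expected gain under normality, $B_\ell(u^*) = \sigma_\ell\left[\phi(a) - a\,(1 - \Phi(a))\right]$ with $a = (u^* - V_\ell)/\sigma_\ell$, and solves $B_\ell(z_\ell) = c_\ell$ by bisection. The bisection routine, the bracketing interval, and all numerical inputs are assumptions made for illustration; the paper does not prescribe a particular solver. The first-ranked option is given search probability 1, which corresponds to the empty product in Equation (8).

```python
import math

def norm_pdf(a):
    return math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)

def norm_cdf(a):
    return 0.5 * math.erfc(-a / math.sqrt(2.0))

def expected_gain(u_star, V, sigma):
    """B(u*) of Equation (6) for u ~ N(V, sigma^2), via sigma * [phi(a) - a * (1 - Phi(a))]."""
    a = (u_star - V) / sigma
    return sigma * (norm_pdf(a) - a * norm_cdf(-a))

def reservation_utility(V, sigma, c, iters=200):
    """Solve B(z) = c (Equation 7) by bisection; B is strictly decreasing in z."""
    lo = V - c - sigma        # B(lo) >= V - lo = c + sigma > c
    hi = V + 10.0 * sigma     # B(hi) is negligible; assumes c is not astronomically small
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if expected_gain(mid, V, sigma) > c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def search_probabilities(V, sigma, z):
    """pi_k of Equation (8); inputs must already be sorted by descending z."""
    pis = []
    for k in range(len(V)):
        p = 1.0               # empty product for the first-ranked option
        for l in range(k):
            p *= norm_cdf((z[k] - V[l]) / sigma[l])
        pis.append(p)
    return pis

# Hypothetical expected utilities, shock standard deviations, and search costs.
V = [1.0, 0.8, 0.5]
sigma = [1.0, 1.0, 1.0]
c = [0.3, 0.05, 0.1]

z = [reservation_utility(v, s, ck) for v, s, ck in zip(V, sigma, c)]
order = sorted(range(len(z)), key=lambda k: z[k], reverse=True)   # selection rule
V_s = [V[k] for k in order]
sigma_s = [sigma[k] for k in order]
c_s = [c[k] for k in order]
z_s = [z[k] for k in order]
pis = search_probabilities(V_s, sigma_s, z_s)
print(order, z_s, pis)
```

Note that a product with a low expected utility can still rank first in the search order when its search cost is low, as happens with these illustrative inputs.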

2.3 Choice Under Sequential Search Constraints

To model choice, we derive the unconditional probability of choice subject to optimal sequential search. Proposition 2 shows that under optimal sequential search, the choice model does not suffer from the curse of having to evaluate all possible choice sets. Proposition 3 shows that the choice probability under sequential search has a parsimonious form.

Proposition 2. Rank products on reservation utility. The probability that the product ranked $j$th is chosen equals

$$\Pr(j) = \sum_{K=j}^{J} \Pr(j, S_K), \tag{9}$$

where $S_K = [1, \ldots, K]$ is an ordered set such that if $z_k \geq z_\ell$ then $1 \leq k < \ell \leq K$, and $\Pr(j, S_K)$ is the joint probability that the $j$th ranked product is chosen from $S_K$.

Proof. This follows directly from the ordering on reservation values. If the consumer has a set of unique reservation values, then only one sequence exists that is optimal (Weitzman, 1979). After alternatives are ranked in the order of descending reservation values, the super-set of all possible optimal search sets consists of only $J$ member sets containing the $K = 1, \ldots, J$ products with the highest reservation values.4

Thus, this proposition states that the aggregation in Equation (9) is not over all $2^{J-1}$ possible sets that contain a particular product.5 Instead, given the set of reservation values, the choice probability sums over at most $J$ admissible sets, depending on the location of the chosen option $j$. This dramatically reduces the number of sets to be evaluated in the unconditional choice probability computation.

4 If there are ties in the reservation values, the number of possible sets increases.

5 This result is one of the reasons responsible for making the approach computationally feasible. With $J = 90$ the number of sets that contain $j$ is approximately $6 \times 10^{26}$. Evaluating a sum over this many terms is impossible.

Next, we develop a model for the summand in Equation (9) and prove that the joint probabilities of search and choice have a surprisingly parsimonious form. This makes our proposed approach practical for a relatively large scale model. The major advantage of this proposition is that even if the optimal ordered search set $S_K$ is large, its joint probability requires a univariate expression instead of a high-dimensional integration.

Proposition 3. The joint probability $\Pr(j, S_K)$ that $S_K$ is the optimal set and $j$ is chosen contains two parts, with the second part relevant only if $j = K$, and equals

$$\Pr(j, S_K) = \int_{z_{K+1} - V_j}^{z_K - V_j} \prod_{\ell \neq j}^{K} \Phi_\ell\left(V_j - V_\ell + e_j\right) \phi_j(e_j)\, de_j + I(j = K)\left(1 - \Phi_j\left(z_j - V_j\right)\right) \pi_j, \tag{10}$$

where $I(j = K)$ is an indicator variable that is 1 if $j = K$ and 0 otherwise. For completeness, (1) because consumers cannot purchase products that were not searched, $\Pr(j, S_K) = 0$ when $j > K$, and (2) because the consumer cannot search more than $J$ alternatives, $z_{J+1} = -\infty$.

Proof. To develop the proof, first enumerate all the restrictions that an optimal sequential search set $S_K = [1, \ldots, K]$ and a choice of the $j$th ranked alternative from the set place on the utilities of the options in the ordered $S_K$, where $j \leq K$.

1. The choice rule implies that $j = \arg\max_{\ell = 1, \ldots, K} \{u_\ell\}$, that is, $V_j + e_j > V_\ell + e_\ell$, $\ell \neq j$, $\ell = 1, \ldots, K$. With independent and normally distributed $e_\ell$, these conditions mean that, conditional on $e_j$, $\Pr\left(e_\ell < V_j - V_\ell + e_j \mid e_j\right) = \Phi_\ell\left(V_j - V_\ell + e_j\right)$, for $\ell \neq j$. The conditional probability that $j$ has the highest utility within the ordered set $S_K$ is the product of these probabilities, $\prod_{\ell \neq j}^{K} \Phi_\ell\left(V_j - V_\ell + e_j\right)$.

2. By the selection rule, a decision to search for option $K$ implies $\max\{u_1, \cdots, u_{K-1}\} < z_K$, i.e., the maximum utility in hand after searching $\{1, \ldots, K-1\}$ is less than the highest reservation utility from the unsearched set $\{K, \ldots, J\}$. This means that, for all $\ell \in \{1, \ldots, K-1\}$, $V_\ell + e_\ell < z_K$. However, it does not mean that $V_K + e_K < z_K$, and we need to condition below on whether this condition holds or not.

3. By the stopping rule, if search terminates at $K$, the utility draw of the chosen alternative $j$ is greater than the reservation utility of $K + 1$. Therefore $u_j > z_{K+1}$, i.e., $e_j > z_{K+1} - V_j$.

Next, if we ignore the selection and stopping rules in steps 2 and 3, the choice probability of $j$ would be obtained by integrating out $e_j$ from the choice rule for all $\ell \neq j$,

$$\Pr(j) = \int_{-\infty}^{\infty} \prod_{\ell \neq j}^{J} \Phi_\ell\left(V_j - V_\ell + e_j\right) \phi_j(e_j)\, de_j. \tag{11}$$

This model is a formulation of the probit model with independent error terms $e$.6 We next impose on choice the restrictions from the selection and stopping rules, which turn out to be simple integration limits in Equation (11).

6 For a similar formulation, see Train (2009) for the expression of the probit model under complete search.

Now consider as a first case that the final utility draw $u_K$ is less than its own reservation value, $V_K + e_K < z_K$. Under this case, the aforementioned selection restriction is true for all $e_\ell$. That is, having an optimal search set of $K$, it must be true that $V_\ell + e_\ell < z_K$, $\ell = 1, \ldots, K$. At the same time, the choice of $j$ implies that $V_\ell + e_\ell < V_j + e_j$ for all $\ell \neq j$. Note that this yields another $K - 1$ restrictions on $e_\ell$, $\ell \neq j$. For the joint probability, $\Pr(j, S_K)$, the selection and choice restrictions on a given $e_\ell$ must both be true. Since they are both one-sided conditions, they are both true when the most restrictive one is true. Next, observe that $V_j + e_j$ is smaller than $z_K$. If not, search would have been terminated before $K$. However, the choice inequalities $V_\ell + e_\ell < V_j + e_j$ for all $\ell \neq j$ imply that $V_\ell + e_\ell < z_K$ for all $\ell \neq j$ too. Put differently, the joint occurrence of a choice of $j$ and a search set of $S_K$ satisfies the selection restrictions for all $\ell \neq j$. The only selection restriction that remains is that the utility draw on $j$ was low enough to continue search until $K$, $V_j + e_j < z_K$, or $e_j < z_K - V_j$.

Still considering the first case, from the stopping rule at option $K$, we also have that $e_j > z_{K+1} - V_j$. In sum, the selection and stopping constraints from sequential search in the probit model are a very simple set of additional lower and upper bounds on the utility shock of the chosen item, $e_j$. Combining this with Equation (11) gives the following joint probability,

$$\Pr(j, S_K, u_K < z_K) = \int_{z_{K+1} - V_j}^{z_K - V_j} \prod_{\ell \neq j}^{K} \Phi_\ell\left(V_j - V_\ell + e_j\right) \phi_j(e_j)\, de_j, \tag{12}$$

for $j \in \{1, \ldots, K\}$ and $K \neq 1$ (when $K = 1$, $\Pr(j = 1, S_1) = \Pr(S_1) = 1 - \Phi_1(z_2 - V_1)$). Computationally, Equation (12) is very similar to the probit in Equation (11) except with a lower bound on the distribution of unobservables from termination of search at $K$ and an upper bound from continuation of search until $K$. To apply this equation to the case of $K = J$, it suffices to set $z_{K+1} = -\infty$.

We now continue with the second case, which is identical to the first one, except that the utility draw for $K$ is higher than $z_K$, i.e., $u_K \geq z_K$. Combined with the conditions from the selection rule, $\max\{u_1, \cdots, u_{K-1}\} < z_K$, this must mean that the choice is $K$. Thus, when $u_K \geq z_K$ is true and $S_K$ is the optimal search set, the choice of $K$ occurs with probability 1. Consequently, the joint probability $\Pr(j = K, S_K, u_K \geq z_K)$ is equal to the joint probability $\Pr(S_K, u_K \geq z_K)$. This probability can be computed using the search probability in Equation (8). Conditional on $u_K \geq z_K$, search stops at $K$, and the probability that $S_K$ is the optimal set is the same as the probability that $K$ is included in the search set. This means that $\Pr(S_K \mid u_K \geq z_K) = \pi_K$. Therefore also,

$$\Pr(K, S_K \mid u_K \geq z_K) = \pi_K. \tag{13}$$

Note that $\pi_K$ does not depend on $e_K$. To obtain the unconditional probability, we use that the condition $u_K \geq z_K$ is equivalent to $e_K \geq z_K - V_K$, and integrate the conditional probabilities (13) over $e_K$ to obtain

$$\Pr(K, S_K, e_K \geq z_K - V_K) = \int_{z_K - V_K}^{\infty} \pi_K\, \phi_K(e_K)\, de_K = \pi_K \left(1 - \Phi_K(z_K - V_K)\right). \tag{14}$$

Combining Equations (12) and (14), we can write for $j \in \{1, \ldots, K\}$

$$\Pr(j, S_K) = \int_{z_{K+1} - V_j}^{z_K - V_j} \prod_{\ell \neq j}^{K} \Phi_\ell\left(V_j - V_\ell + e_j\right) \phi_j(e_j)\, de_j + I(j = K) \cdot \pi_j \left(1 - \Phi_j\left(z_j - V_j\right)\right), \tag{15}$$

which proves the proposition. Taking both parts of Equation (10) together, the computation of the choice probability involves only unidimensional integration and summing over at most $J$ terms.
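Equation (10) lends itself to a direct numerical check: because the consumer always ends up with exactly one search set and one choice, the joint probabilities $\Pr(j, S_K)$ should sum to one over all admissible pairs $(j, K)$ when there is no outside option. The sketch below evaluates each term by univariate Simpson quadrature. The inputs `V`, `sigma`, and `z` are hypothetical, the truncation of the lower limit when $z_{K+1} = -\infty$ is an implementation assumption, and indices are 0-based for `j` while `K` counts the searched options.

```python
import math

def norm_pdf(x, s):
    return math.exp(-0.5 * (x / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def norm_cdf(x, s):
    return 0.5 * math.erfc(-x / (s * math.sqrt(2.0)))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def search_prob(k, V, sigma, z):
    """pi_k of Equation (8) for the 0-based rank k (equal to 1 for the first option)."""
    p = 1.0
    for l in range(k):
        p *= norm_cdf(z[k] - V[l], sigma[l])
    return p

def joint_prob(j, K, V, sigma, z, tail=10.0):
    """Pr(j, S_K) of Equation (10); inputs are sorted by descending z."""
    if j >= K:
        return 0.0                      # unsearched products cannot be chosen
    upper = z[K - 1] - V[j]             # e_j < z_K - V_j (search continued until K)
    if K < len(V):
        lower = z[K] - V[j]             # e_j > z_{K+1} - V_j (search stopped at K)
    else:
        # z_{J+1} = -infinity: truncate far in the lower tail (assumption)
        lower = min(-tail * sigma[j], upper - tail * sigma[j])

    def integrand(e):
        value = norm_pdf(e, sigma[j])
        for l in range(K):
            if l != j:
                value *= norm_cdf(V[j] - V[l] + e, sigma[l])
        return value

    prob = simpson(integrand, lower, upper) if upper > lower else 0.0
    if j == K - 1:                      # second part of Equation (10), only when j = K
        prob += (1.0 - norm_cdf(z[j] - V[j], sigma[j])) * search_prob(j, V, sigma, z)
    return prob

V = [1.0, 1.2, 0.8, 0.5]       # hypothetical expected utilities, ranked by z
sigma = [1.0, 1.0, 1.0, 1.0]   # hypothetical shock standard deviations
z = [2.5, 2.2, 1.9, 1.0]       # hypothetical reservation utilities, descending

J = len(V)
total = sum(joint_prob(j, K, V, sigma, z) for K in range(1, J + 1) for j in range(K))
print(total)
```

Filling the whole table of joint probabilities takes $J(J+1)/2$ univariate integrals, which is 4,095 for $J = 90$.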

2.4 Choice conditional on search

To model conditional purchase, we compute the choice probability of option $j$ conditional on searching an option $\ell$, $\Pr(j \mid \ell)$, $1 \leq j, \ell \leq J$.

Proposition 4. Rank products on reservation utility. The probability that option $j$ is chosen conditional on searching option $\ell$ is equal to

$$\Pr(j \mid s(\ell)) = \frac{\sum_{K=\max(j,\ell)}^{J} \Pr(j, S_K)}{\pi_\ell}, \tag{16}$$

where $\pi_\ell$ is the probability that the $\ell$th option is searched (see Equation 8) and $\Pr(j, S_K)$ is the probability that $j$ is chosen from the optimal set $S_K$ (see Equation 10).

Proof. We write the conditional choice probability as

$$\Pr(j \mid s(\ell)) = \frac{\Pr(j, s(\ell))}{\Pr(s(\ell))}, \tag{17}$$

where $\Pr(s(\ell))$ is the search probability for $\ell$ and $\Pr(j, s(\ell))$ is the joint probability of search and choice. Note that the denominator, $\Pr(s(\ell))$, is equal to the probability that $\ell$ is in the optimal search set, Equation (8),

$$\Pr(s(\ell)) = \pi_\ell. \tag{18}$$

Our approach for computing $\Pr(j, s(\ell))$ is to decompose the joint probability under two conditions. First, we consider a case in which $j$ was searched prior to $\ell$ ($j \leq \ell$). Then the joint probability is decomposed as

$$\Pr(j, s(\ell), j \leq \ell) = \sum_{K=\ell}^{J} \Pr(j, S_K), \tag{19}$$

where the optimal search set $S_K$ in the summand always contains both $\ell$ and $j$ since $j \leq \ell \leq K \leq J$. Applying the same logic, the joint probability of the complementary case, where $j > \ell$, is decomposed as

$$\Pr(j, s(\ell), j > \ell) = \sum_{K=j}^{J} \Pr(j, S_K), \tag{20}$$

where the optimal search set $S_K$ in the summand must contain both $\ell$ and $j$. Combining Equations (19) and (20), the joint probability of search and choice is

$$\Pr(j, s(\ell)) = \sum_{K=\max(j,\ell)}^{J} \Pr(j, S_K), \tag{21}$$

where the expression for the summand, $\Pr(j, S_K)$, is given in Equation (15). Finally, we obtain the conditional choice probability $\Pr(j \mid s(\ell))$ by substituting Equations (18) and (21) into Equation (17) to obtain Equation (16),

$$\Pr(j \mid s(\ell)) = \frac{\sum_{K=\max(j,\ell)}^{J} \Pr(j, S_K)}{\pi_\ell},$$

which proves the proposition.

The result in Proposition 4 relies on individual-level probabilities already computed for the model of search and choice in Equation (10). Therefore, computing the conditional choice shares from our individual-level model adds no computational burden other than taking a sum over at most $J$ terms. The max operator in the numerator ensures that the purchase probability of $j$ is taken only when both $j$ and $\ell$ are searched together. Equation (16) expresses that the probability that $j$ is chosen given search of $\ell$ is equal to this sum divided by the probability that $\ell$ is searched.
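Once the joint probabilities $\Pr(j, S_K)$ and the search probabilities $\pi_\ell$ are available, Equation (16) reduces to a short sum and a division. The sketch below illustrates this with a hand-made two-product example; the dictionaries `joint` and `pi` hold hypothetical numbers (keyed by the 1-based ranks used in the text), not estimates from the paper.

```python
def conditional_choice_prob(j, l, joint, pi, J):
    """Pr(j | s(l)) of Equation (16), with 1-based ranks j and l.

    joint[(j, K)] holds Pr(j, S_K) for 1 <= j <= K <= J and pi[l] holds the
    probability that the l-th ranked option is searched.
    """
    start = max(j, l)   # both j and l must belong to the search set S_K
    numerator = sum(joint[(j, K)] for K in range(start, J + 1))
    return numerator / pi[l]

# Hypothetical values for J = 2 with no outside option: the joint probabilities
# sum to one, and pi_2 equals the probability that the search set is S_2.
J = 2
joint = {(1, 1): 0.40, (1, 2): 0.25, (2, 2): 0.35}
pi = {1: 1.0, 2: 0.60}

for l in (1, 2):
    row = [conditional_choice_prob(j, l, joint, pi, J) for j in (1, 2)]
    print(l, row)
```

In this example the conditional choice probabilities sum to one for each searched option, which holds here because no outside option is included.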

3 Data, Estimation, and Identification

3.1 Data

We illustrate our model by combining three aggregate data sets from Amazon.com for the camcorder industry: view rank data, conditional share data, and sales rank data. First, the view rank data refer to a set of ordered lists of products whose webpages were viewed by past consumers, conditional on viewing the webpage of a focal product. The view rank provides information on whether a pair of products were viewed together, i.e., their webpages were browsed in the same session by shoppers. In a similar manner, a conditional choice share list includes choice shares of products in the category, conditional on viewing a focal product. Lastly, Amazon.com posts the sales rank of each product in the category, which we refer to as sales rank data. These are aggregate data sets, but they capture patterns of individual consumer search and choice decisions at Amazon.com. For instance, the view rank data provide rich information about the set of products consumers searched together, while the conditional choice share and sales rank data capture crucial information on consumer choices. For a more detailed description of the data, please refer to the Appendix and Section 3.2.

For our empirical illustration, we use camcorder search and choice data during the month of June 2007. We extracted the three sets of aforementioned data and product characteristics for the 200 best-selling camcorders. After removing camcorders in the lowest-price tier and of professional grade, and limiting ourselves to the top six manufacturers and top four media formats, 90 camcorders remain for analysis. The summary statistics for these products are found in Table 1. In addition, regarding the view rank data, products appear on average 26.3 (out of a possible 89) times on other products' view rank lists, with a standard deviation of 20.3. In terms of conditional share data, the mean and standard deviation are 0.236 and 0.229, respectively.

Attributes          Ranges
Brand               Sony (31), Panasonic (20), Canon (14), JVC (14), other (11)
Media Formats       MiniDV (34), DVD (30), Flash Memory (FM) (9), Hard Drive (17)
Price               $532 (mean), $258 (std. dev.)
Form                Compact (8), Conventional (82)
High-Definition     Yes (14), No (76)
Pixel               1.74M (mean), 1.45M (std. dev.)
Zoom                20.1 (mean), 11.2 (std. dev.)
Number of Links     2.96 (mean), 6.00 (std. dev.)
Age (days)          266 (mean), 243 (std. dev.)

Table 1: Description of the choice options in the empirical data (occurrence frequency in parentheses)

Finally, we discuss the size of outside goods in our empirical application. The outside goods share refers to the fraction of consumers who search but do not buy in the category. Our view rank data are unconditional since they reflect the search behavior of all consumers regardless of whether they buy or not. However, conditional choice share and sales rank data reflect consumers who made a choice at Amazon.com. Hence, if this difference is left unaccounted for, a joint estimation of unconditional view rank data and conditional share and sales rank data may bias our model estimates. In order to estimate the share of outside good choice in our empirical application, we reference various online reports on conversion rates at Amazon.com, ranging from 9% to 16% in recent years (Hancox 2008, Eisenberg 2009). We chose a conservative value of 10% for our empirical application and assume that 90% of searchers do not buy a camcorder at Amazon.com.

3.2 Estimation

Our approach to estimating parameters is in the spirit of Kim, Albuquerque, and Bronnenberg (2010). However, we complement their estimation equations with the ones needed to use the sales rank and conditional share data. Our general approach is to use the equations derived in Section 2 to construct model predictions that we can match with the three data sets that we collected. Given our model and a set of candidate parameters, we simulate individual-level optimal search and choice decisions for a set of heterogeneous pseudo-consumers, aggregate their search sets and choices, and compute the predictions from these aggregations. We then look for parameters that minimize the squared differences between the actual and predicted view ranks, conditional choice shares, and sales ranks, subject to matching the fraction of consumers who browse but do not purchase.

Predicting view rank data.7 For each product $j$, Amazon.com lists a set of other products that were viewed together in the form of a commonality index, $CI_{j\ell}$, defined as

$$CI_{j\ell} = \frac{n_{j\ell}}{\sqrt{n_j} \cdot \sqrt{n_\ell}}, \tag{22}$$

where n j and n` are the numbers of consumers who viewed products j and `, respectively, and n j` denotes the number of consumers who viewed products j and ` together. If CI j` > CI jk , it means that ` appears before k on the view rank list for j. More specifically, the view rank indicator

7 Our

discussion in this subsection on search aggregation is similar to Kim, Albuquerque, and Bronnenberg (2010) and we direct readers to that paper for details.

18

variable, IVj,`k , is defined as IVj,`k =

   1 if CI j` > CI jk

(23)

  0 otherwise. Using our model and candidate parameters, we forecast the commonality index between j and ` as,

$$CI_{j\ell} = \widehat{CI}_{j\ell} + \varepsilon_{j\ell} = \frac{\hat{n}_{j\ell}}{\sqrt{\hat{n}_j}\,\sqrt{\hat{n}_\ell}} + \varepsilon_{j\ell}, \tag{24}$$

where $\widehat{CI}_{j\ell}$ is the commonality index forecast and $\varepsilon_{j\ell} \overset{iid}{\sim} N(0, \upsilon^2)$. Further, entries in this equation are computed from Equation (8), such as $\hat{n}_j = \sum_i \pi_{ij}$ and $\hat{n}_{j\ell} = \sum_i \min(\pi_{ij}, \pi_{i\ell})$, for $j, \ell = 1, \ldots, J$ and $j \neq \ell$.8 The error term captures potential measurement errors by Amazon.com or aggregate-level prediction errors, similar to Bresnahan (1987) and Bajari, Fox, and Ryan (2007). The probability of observing product $\ell$ at a higher place than $k$ on the view rank list of $j$, i.e., that product $j$ is viewed more often in the same search session with $\ell$ than with $k$, is

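$$\Pr(I^V_{j,\ell k} = 1) = \Pr(CI_{jk} < CI_{j\ell}) = \Pr(\varepsilon_{j,\ell k} < \widehat{CI}_{j\ell} - \widehat{CI}_{jk}), \tag{25}$$

where $\varepsilon_{j,\ell k} = \varepsilon_{jk} - \varepsilon_{j\ell}$. The error term $\varepsilon_{j,\ell k}$ is treated as a random variable with mean 0 and variance $\upsilon^2$, $\varepsilon_{j,\ell k} \sim N(0, \upsilon^2)$, so that any constant factor introduced by differencing two errors is absorbed into the scale parameter $\upsilon$. Hence,

$$\Pr(I^V_{j,\ell k} = 1) = \Phi\left(\frac{\widehat{CI}_{j\ell}(\theta, X) - \widehat{CI}_{jk}(\theta, X)}{\upsilon}\right), \tag{26}$$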

where $\theta$ are model parameters, $X$ are data, and $\Phi$ is the CDF of the standard normal distribution. The variable $I^V_{j,\ell k}$ is directly observed in the view rank data from Amazon.com. For each product $j$, we observe at most $\frac{1}{2} \times (J - 1) \times (J - 2)$ unique inequalities defined by Equation (23). This leads to a large number of restrictions on aggregate viewing imposed by the data. Across $J = 90$ products, the total number of observed pairwise ranks that arise from the view rank data in our empirical analysis is 176,825.

8 Note that, if $z_j > z_k$, the probability that $j$ and $k$ occur together in a set is equal to the probability that $k$ is in the set. That is, $\pi_{i,\{j \text{ and } k\}} = \pi_{ik} = \min(\pi_{ij}, \pi_{ik})$. See Kim, Albuquerque, and Bronnenberg (2010) for details.
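The view rank prediction in Equations (24) to (26) can be illustrated with a minimal sketch. The function names and the toy matrix of search probabilities below are assumptions made purely for illustration; in the paper the search probabilities come from Equation (8), which is not reproduced here.

```python
from math import sqrt
from statistics import NormalDist


def forecast_commonality(pi, j, l):
    """Forecast CI_hat_{jl} from search probabilities pi[i][j] (Equation 24)."""
    n_j = sum(row[j] for row in pi)
    n_l = sum(row[l] for row in pi)
    n_jl = sum(min(row[j], row[l]) for row in pi)
    return n_jl / (sqrt(n_j) * sqrt(n_l))


def view_rank_prob(pi, j, l, k, upsilon):
    """Pr(I^V_{j,lk} = 1): l ranks above k on j's view list (Equation 26)."""
    diff = forecast_commonality(pi, j, l) - forecast_commonality(pi, j, k)
    return NormalDist().cdf(diff / upsilon)


def max_inequalities(num_products):
    """Upper bound on pairwise view rank inequalities over all focal products."""
    return num_products * (num_products - 1) * (num_products - 2) // 2


# Made-up search probabilities: 3 pseudo-consumers (rows) x 3 products (columns).
pi = [
    [0.9, 0.8, 0.1],
    [0.7, 0.6, 0.2],
    [0.2, 0.1, 0.9],
]
p_l_above_k = view_rank_prob(pi, j=0, l=1, k=2, upsilon=0.05)
bound = max_inequalities(90)  # J focal products, each with (J-1)(J-2)/2 pairs
```

With $J = 90$ the bound equals 352,440, so the 176,825 observed pairwise ranks correspond to roughly half of the inequalities that could in principle be observed.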

Predicting sales rank data

In our empirical setting, we observe sales rank data from the camcorder category and treat them as an aggregate outcome of consumer choices at the online retailer. We link the (observed) sales ranks to the (unobserved) market shares as follows,

$$I^S_{j\ell} = \begin{cases} 1 & \text{if } s_j > s_\ell \\ 0 & \text{otherwise,} \end{cases} \tag{27}$$

where $s_j$ is $j$'s unobserved true market share. Given $X$ and $\theta$ in our joint model, we predict $j$'s market share by aggregating individual choice probabilities for $j$ in Equation (9). The market share, $s_j$, is modeled as its prediction, $\hat{s}_j$, plus a random error term,

$$s_j = \hat{s}_j + \varepsilon_j, \tag{28}$$

where $\varepsilon_j \overset{iid}{\sim} N(0, \tau^2)$, $j \in \{1, \ldots, J\}$. The probability of observing a pairwise sales rank inequality between $j$ and $\ell$ is computed as

$$\Pr(I^S_{j\ell} = 1) = \Pr(s_\ell < s_j) = \Pr(\varepsilon_{j\ell} < \hat{s}_j - \hat{s}_\ell), \tag{29}$$

where $\varepsilon_{j\ell} = \varepsilon_\ell - \varepsilon_j$ is treated as a random variable with $\varepsilon_{j\ell} \sim N(0, \tau^2)$. Thus,

$$\Pr(I^S_{j\ell} = 1) = \Phi\left(\frac{\hat{s}_j(\theta, X) - \hat{s}_\ell(\theta, X)}{\tau}\right), \tag{30}$$

where $\Phi$ is the CDF of the standard normal distribution. Lastly, we model the share of consumers who do not buy, $s_0$, as

$$s_0 = \hat{s}_0(\theta, X) + \varepsilon_0,$$

where $\hat{s}_0(\theta, X)$ is the model prediction for the share of consumers who do not buy in the category.
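To give a sense of scale for Equation (30), consider a purely hypothetical pair of products whose predicted shares differ by 0.004, evaluated at $\tau = 0.002$ (the order of magnitude of the estimate reported in Table 3). The share gap is an assumed value chosen only for this illustration:

$$\Pr(I^S_{j\ell} = 1) = \Phi\left(\frac{0.004}{0.002}\right) = \Phi(2) \approx 0.977.$$

Under these assumed numbers, a gap of two standard deviations in predicted shares makes the observed rank ordering very likely, while products with nearly identical predicted shares yield probabilities close to one half.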

Predicting conditional share data

Lastly, we discuss the prediction of conditional choice shares. The observed choice share of $j$ conditional on searching $\ell$, $s_{j|\ell}$, is modeled as

$$s_{j|\ell} = \hat{s}_{j|\ell}(\theta, X) + \eta_{j\ell}, \tag{31}$$

where $\hat{s}_{j|\ell}(\theta, X)$ is the model prediction of conditional choice shares aggregated among consumers and $\eta_{j\ell}$ is an error term. Note that the conditional share prediction is obtained by aggregating individual conditional choice probabilities in Equation (4) across consumers.

Weights in the objective function

The aggregation errors in the view rank, sales rank, and conditional choice share data vary in magnitude.9 We take into account such potential differences by weighting the squared error terms derived from the view rank, sales rank, and conditional choice shares by the inverse of their variance, as in the Generalized Method of Moments. Define the errors on view rank and sales rank inequalities by $y^V_{j,\ell k} = \Pr(I^V_{j,\ell k} = 1) - 1$ and $y^S_{j\ell} = \Pr(I^S_{j\ell} = 1) - 1$, respectively. The error for the conditional choice share observation, $y^C_{j|\ell} = s_{j|\ell} - \hat{s}_{j|\ell}(X, \theta)$, is the difference between the actual and predicted conditional choice share values. Our weighted nonlinear least squares (WNLS) estimator (Horowitz 2010; Demidenko 2004) is

$$\{\theta^*, \upsilon^*, \tau^*\} = \arg\min_{\{\theta, \upsilon, \tau\}} \; w_V \cdot \left( \sum_{(j,k,\ell) \in S} \left[ y^V_{j,\ell k} \right]^2 \right) + w_S \cdot \left( \sum_{(j,\ell) \in T} \left[ y^S_{j\ell} \right]^2 \right) + w_C \cdot \left( \sum_{(j,\ell)} \left[ y^C_{j|\ell} \right]^2 \right) + w_{s_0} \cdot \left[ s_0 - \hat{s}_0(\theta, X) \right]^2, \tag{32}$$

where $S = \{(j,k,\ell) \mid I^V_{j,\ell k} = 1,\ j \neq k \neq \ell\}$, $T = \{(j,\ell) \mid I^S_{j\ell} = 1,\ j \neq \ell\}$, and where $w_V$, $w_S$, and $w_C$ are weight values for the view rank, sales rank, and conditional choice share terms, respectively. We impose the share of outside goods as a hard constraint in the objective function by assigning a very large weight value to its squared error term.10 The estimator $\{\theta^*, \upsilon^*, \tau^*\}$ is consistent for any values of $w_V$, $w_S$, and $w_C$. To obtain an efficient estimator, we use weight values that are the inverse of the variance of each summand,

$$w_V = (\sigma_V^2)^{-1} = \left[ \mathrm{var}(y^V_{j,\ell k}) \right]^{-1}, \qquad w_S = (\sigma_S^2)^{-1} = \left[ \mathrm{var}(y^S_{j\ell}) \right]^{-1}, \qquad w_C = (\sigma_C^2)^{-1} = \left[ \mathrm{var}(y^C_{j|\ell}) \right]^{-1}. \tag{33}$$

Since we do not know the values of $\sigma_V^2$, $\sigma_S^2$, and $\sigma_C^2$ in advance, we adopt an iterative estimation strategy (Greene 2003), similar to the approach used in the Generalized Method of Moments. Under this strategy, we first estimate $\hat{\theta}$ using unit weight values. In the following step, using $\hat{\theta}$, we obtain the estimates $\hat{w}_V = (\hat{\sigma}_V^2)^{-1}$, $\hat{w}_S = (\hat{\sigma}_S^2)^{-1}$, and $\hat{w}_C = (\hat{\sigma}_C^2)^{-1}$. Using these inverse estimated variances as weights in Equation (32), we fully re-estimate the model. We iterate this re-estimation process until convergence.11 To obtain standard errors of the parameter estimates, we use the bootstrap resampling method proposed in Efron and Tibshirani (1994).

9 In a similar context, Imbens and Lancaster (1994) support the view that micro and macro data will be subject to different levels of aggregation errors.
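The iterative weighting strategy can be sketched as follows. This is only an illustration under strong simplifying assumptions: the "model" is a toy in which every observation is a noisy reading of a single scalar parameter, the variance of each block is approximated by its mean squared residual, and all names (`fit`, `residuals_at`, `iterate_weights`) are hypothetical rather than taken from the paper.

```python
def mean_square(values):
    """Mean squared residual, used here as a stand-in for var(y)."""
    return sum(v * v for v in values) / len(values)


def inverse_variance_weights(residuals):
    """One weight per data block: inverse of its mean squared residual."""
    return {name: 1.0 / mean_square(res) for name, res in residuals.items()}


def iterate_weights(fit, residuals_at, blocks, n_iter=2):
    """Fit with unit weights, then alternate weight updates and re-fits."""
    weights = {name: 1.0 for name in blocks}
    theta = fit(weights)
    for _ in range(n_iter):
        weights = inverse_variance_weights(residuals_at(theta))
        theta = fit(weights)
    return theta, weights


# Toy data blocks standing in for view rank, sales rank, and conditional shares.
data = {
    "view": [0.9, 1.1, 1.0, 1.0],
    "sales": [0.0, 2.0, 1.5, 0.9],
    "cond": [1.2, 0.8, 1.1, 0.9],
}


def residuals_at(theta):
    return {name: [y - theta for y in ys] for name, ys in data.items()}


def fit(weights):
    # Closed-form minimizer of the weighted sum of squares for this toy problem.
    num = sum(weights[name] * sum(ys) for name, ys in data.items())
    den = sum(weights[name] * len(ys) for name, ys in data.items())
    return num / den


theta_hat, w_hat = iterate_weights(fit, residuals_at, blocks=data)
```

In this toy setting the noisier block receives the smallest weight, so the re-estimated parameter moves toward the value implied by the more precise blocks. In the actual application, `fit` would be a full minimization of Equation (32), which is far more costly than the closed form used here.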

10 We use $w_{s_0}$ = 1 × 10E6 for its weight value in the empirical analysis.

11 In practice, we fully iterated the model estimation twice without the outside option. Two iterations yield weight values of 1.0, 9.6, and 0.77 for $w_V$, $w_S$, and $w_C$, respectively, which we use for the final estimation including the outside good share.

3.3 Identification

In this section, we provide an informal discussion of model identification. The model parameters include the mean utility and consumer heterogeneity parameters, and the mean and product-specific search cost parameters. The pre-search uncertainty $\sigma^2_{ij}$ for all products and the outside good utility variance $\sigma_0^2$ are fixed to 1 for identification reasons, which is common practice in aggregate choice models. Since our joint model of search and choice is calibrated on aggregate search and choice data, we condition our discussion on the general availability of such data. With additional data on individual pre-search decisions, we expect that $\sigma^2_{ij}$ can be identified as well.

We start by noting that overall search and choice popularity are instrumental to identifying the mean utility and product-specific search cost parameters. Under our model, mean utility parameters are identified by the correlation between the variation in product popularity - such as search shares or market shares - and the variation in underlying product characteristics. That is, under our joint model of search and choice, a popular product will be searched more and chosen more by consumers. Therefore, either choice or search popularity alone will identify mean utility parameters, but their joint use does so more efficiently. In contrast, any discrepancies between search and choice popularity help identify product-specific search cost. If a product is searched infrequently but its choice share is relatively high, then its search cost must be higher to justify this difference - consumers search for products due to either high utility or low search costs. However, conditional on search, consumers will only choose products with higher utility. Therefore, contrasting search and choice data and their gaps will be informative about product-specific search cost. In our empirical context, the outside good share can also explain the gap between search and choice data at the aggregate level. Therefore, if unconditional search data are used for joint estimation, as in our empirical application, an estimation of product-specific search cost calls for an explicit accounting of the outside good in the joint model.

The identification of consumer heterogeneity mainly comes from the view rank and conditional share data. These data provide observed measures of coincidence in viewership. Note that if all consumers have the same preferences, there will be just one optimal search path for all consumers. This implies that the view lists will be identical for all focal products, or $I_{j,k\ell} = I_{j',k\ell}$, in our empirical data. That is, the relative search popularity of $k$ and $\ell$ on the view list of $j$ should be the same as on the view list of $j'$. This is obviously not true in general in our data. Consider $j$ and $k$ as professional-grade, hard-drive camcorders and $j'$ and $\ell$ as inexpensive, flash-memory camcorders in a costly search environment. Among the consumers who like professional-grade camcorders and search $j$, $k$ will be more popular in search than $\ell$, whereas among the consumers who like the inexpensive flash-memory camcorders and viewed $j'$, it will more than likely be the reverse. In order to fit such patterns, consumer preferences need to be heterogeneous. Put differently, variation in the $I_{j,k\ell}$ across $j$ identifies preference heterogeneity. Based on a very similar argument, variation in the conditional shares $s_{j|\ell}$ across $\ell$ contributes further to the identification of preference heterogeneity.

Finally, the mean search cost is identified by the sparsity in the view rank data, where some products have a long list of co-searched products, while others are almost never browsed. In the following section, we demonstrate the recovery of the model parameters from a simulated data set.

3.4 Data experiment

We conduct a numerical experiment to test the recovery of model parameters from the joint model. To this end, we create 32 product options with 5 binary attributes that were arbitrarily named as brand, “Sony” or “Panasonic”; zoom, “> 10×” or “< 10×”; media, “mini DV” or not; and form, “compact” or not; and a continuous attribute, price. The chosen names for the simulated attributes in this section match the names of the attributes in our empirical application. The utility of $i$ for $j$ is expressed as

$$u_{ij} = V_{ij} + e_{ij} = X_j \cdot \beta_i + e_{ij},$$

where $X_j$ is a row vector of levels of the aforementioned attributes, $\beta_i$ is the consumer-specific sensitivity to the attribute values, and $\mathrm{var}(e_{ij}) = \sigma^2$. We also generate product-specific search costs as a function of two attributes: one that enters mean utility, zoom of “> 10×”, and one that enters exclusively search cost and not the mean utility, named “number of links”. We parametrize the product-specific search cost as

$$c_j = \exp\left(\gamma_0 + \gamma_1 I_j^{ZOOM} + \gamma_2 L_j\right),$$

where $I_j^{ZOOM}$ is an indicator variable for whether product $j$ is of type zoom “> 10×” and $L_j$ is the number of links to $j$. Further, parameter $\gamma_0$ is the base search cost and $\gamma_1$ and $\gamma_2$ are search cost sensitivities to $I_j^{ZOOM}$ and $L_j$, respectively. We set the outside good utility as

$$u_{i0} \sim N(V_0, \sigma_0^2),$$

where we set the value of $V_0$ such that the outside good share is 89%. Note that this is the share of consumers who search but do not choose. Finally, we normalize the utility variance terms $\sigma^2$ and $\sigma_0^2$ to 1 for identification purposes.

To generate data, we assign a set of values to the model parameters, draw 250,000 “pseudo-households” from the joint distribution of parameters, and compute $V_{ij}$, $z_{ij}$, and other relevant quantities to obtain the optimal search set and choice for each individual $i$. We then aggregate individual-level search and choice decisions according to Amazon.com's recipe and generate the commonality indices, conditional choice shares, and market shares. Next, we add stochastic errors to these quantities using Equations (24), (28), and (31) and generate view rank, sales rank, and conditional share data. This is used as one set of dependent variables in the numerical experiment.

To estimate the model, we take $I = 3{,}000$ draws from the distributions of random coefficients. For each individual draw, we compute the expected utilities $V_{ij}$, reservation utilities $z_{ij}$, and search probabilities $\pi_{ij}$. The 3,000 values of $\{\pi_{ij}\}$, $i = 1, \ldots, I$, are aggregated and used to forecast the commonality indices, which are then used to forecast the view ranks. Given $i$'s optimal search sequence, we compute $i$'s choice probabilities and conditional choice probabilities using Equations (9) and (16), respectively. Finally, by aggregating the individual choice probabilities across all optimal search sets and across consumers, we compute and forecast the market shares and the corresponding sales ranks. Lastly, we compute and forecast the conditional shares following the procedures in Subsection 2.4. During the estimation, the sum of squared errors in Equation (32) is minimized.
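The search cost parametrization and the reservation utilities used in the experiment can be sketched as below. The sketch assumes the standard reservation-utility condition of optimal sequential search with normally distributed match values, namely that $z$ solves $c = E[(u - z)^+]$ with $u \sim N(V, \sigma^2)$; the paper's own derivation is in Section 2 and is not reproduced here. Function names, the bisection bounds, and the example inputs are illustrative assumptions; the $\gamma$ values are the true values listed in Table 2.

```python
from math import exp
from statistics import NormalDist

STD_NORMAL = NormalDist()


def search_cost(gamma0, gamma1, gamma2, is_zoom, links):
    """c_j = exp(gamma0 + gamma1 * I_zoom + gamma2 * L_j)."""
    return exp(gamma0 + gamma1 * is_zoom + gamma2 * links)


def expected_gain(m):
    """E[(e - m)^+] for e ~ N(0, 1); strictly decreasing in m."""
    return STD_NORMAL.pdf(m) - m * STD_NORMAL.cdf(-m)


def reservation_utility(v, cost, sigma=1.0, lo=-10.0, hi=10.0, tol=1e-10):
    """Solve sigma * expected_gain((z - v) / sigma) = cost for z by bisection."""
    target = cost / sigma
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_gain(mid) > target:
            lo = mid  # gain still too large: the root lies to the right
        else:
            hi = mid
    return v + sigma * 0.5 * (lo + hi)


# True search cost parameters from the numerical experiment (Table 2).
c_base = search_cost(-7.0, -3.0, -2.0, is_zoom=0, links=0.0)
c_zoom = search_cost(-7.0, -3.0, -2.0, is_zoom=1, links=0.0)

# Reservation utilities for a hypothetical product with expected utility 0.
z_base = reservation_utility(0.0, c_base)
z_zoom = reservation_utility(0.0, c_zoom)
```

A lower search cost raises the reservation utility, which is why products with a negative search cost shifter (here the zoom indicator) are searched earlier, all else equal.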

Parameter               Variable           True value    Estimated value (s.e.)
Mean utility            Sony                   1           0.72 (0.14)
                        Panasonic             -1          -1.20 (0.19)
                        Zoom > 10×             1           0.85 (0.24)
                        Media: mini DV        -1          -1.24 (0.20)
                        Form: Compact          1           0.87 (0.13)
                        Price                 -1          -1.17 (0.41)
Heterogeneity           Sony                   1           0.96 (0.08)
                        Panasonic              2           1.89 (0.13)
                        Zoom > 10×             1           0.86 (0.07)
                        Media: mini DV         2           1.84 (0.11)
                        Form: Compact          1           0.90 (0.06)
                        Price                  1           0.81 (0.34)
Search cost             Base cost             -7          -7.00 (0.41)
                        Zoom > 10×            -3          -3.06 (0.39)
                        Effect of links       -2          -2.23 (0.74)
Outside good            V0                     7           6.75 (0.20)
Aggregation st. dev.    υ                      0.05        0.034 (0.003)
                        τ                      0.005       0.067 (0.038)
Sum of squared errors                                      1543.17

Table 2: Estimation results from the numerical experiment.

The recovered parameters and their corresponding standard errors are shown in Table 2. The standard errors were computed from repeated estimations over different simulated data sets obtained from re-drawing the sampling errors in Equations (24), (28), and (31). That is, we re-estimate the model with 40 different simulated data sets, each with different aggregation errors.12 Table 2 shows that the recovered parameters are all close to their actual values and within sampling errors. We also note that the product-specific search cost parameters are well identified. Therefore, we conclude that data similar to those used in our empirical analysis can identify the model parameters and move to our main empirical application.

12 In this exercise, we used a value of 1 for the three weights $w_V$, $w_S$, and $w_C$, and 1 × 10E5 for $w_{s_0}$.

4 Empirical Illustration

4.1 Specification

For our empirical purpose, we represent a product as a bundle of characteristics. The utility of product $j$ for individual $i$ is modeled as

$$u_{ij} = X_j \beta_i + e_{ij}, \tag{34}$$

where $X_j$ is a $C$-dimensional row vector of product $j$'s characteristics and $\beta_i$ is a $C$-dimensional (column) vector that represents the individual-specific sensitivities to product characteristics. We include $C = 8$ product characteristics in the utility specification: brand name, price, media format, form factor, high definition, zoom, screen size, and number of pixels. We assume that $e_{ij}$ is a normally distributed random error, with mean 0 and variance $\sigma^2 = 1$, and that this error term is independent across $i$ and $j$. For reasons of parsimony, we use one common heterogeneity parameter for all brands, as well as one for all media formats. Further, we impose a theory-driven restriction on the price coefficient, assuming a log-normal distribution to avoid arriving at positive price sensitivity for some individuals. In addition, we specify $j$'s search cost as

$$c_j = \exp(\gamma_0 + X_j^c \cdot \gamma_1),$$

where $\gamma_0$ is the mean search cost parameter, $X_j^c$ is a row vector of product attributes that are likely to affect $j$'s search cost, and $\gamma_1$ is a column vector that captures the search cost sensitivity with respect to $X_j^c$. In our application, the row vector $X_j^c$ is defined as

$$X_j^c = \left[ \log(L_j),\ I_j^{Hi},\ \log(A_j) \right],$$

where $L_j$ is the number of times that product $j$ appears on other products' pages in the category, $I_j^{Hi}$ is an indicator variable for whether $j$ is a high-definition product, and $A_j$ is the age of product $j$. We choose these product attributes for the following two reasons. First, a retailer's frequent recommendation of a certain product within its domain may lower consumer search cost. Second, we conjecture that an online retailer's typical promotion of newer products and products with advanced features may help reduce consumer search costs. Note that in order to avoid potential endogeneity, we use lagged values for $L_j$ in our empirical application. Although we have chosen these three variables in our application, researchers can use other variables in other application contexts.

4.2 Model Fit

We first discuss the fit of our empirical model. We base our discussion on internal and external validations. For internal validation, we investigate how well our proposed model predicts the search and choice patterns in the sample data. The correlation between predicted and actual sales ranks for all $J = 90$ options is 0.79. The hit rate of pairwise rank inequalities, in which we compare the relative positions of two options in the actual and predicted data, is 79% for sales ranks and 87% for view ranks. Compared to Kim, Albuquerque, and Bronnenberg (2010), who report 0.63 and 0.72 for in-sample sales rank correlation and sales rank hit rates, respectively, our fit statistics are substantially higher.13 The prediction improvement comes from the use of both search and choice data and the use of an outside good, while Kim, Albuquerque, and Bronnenberg (2010) use search data only, without an outside good.

Next, as an external validation, we repeat the same exercise using a data set from a different month, September 2007. Across $J = 86$ products, the correlation between predicted and actual sales ranks is 0.73, and the pairwise hit rates are 76% for sales ranks and 83% for view ranks. From both validation exercises, we conclude that our model explains the search and choice patterns in both in-sample and out-of-sample data very well.

We estimated the empirical model with an outside good share of 90%. As a robustness check, we also re-estimated the model with an outside good share of 88%. The correlation between the two sets of parameter estimates is very high at 0.99, and we conclude that our estimates are robust to small changes in the outside good share.

13 We cannot rule out that part of the discrepancy is due to a small difference in the time window of both studies.
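The pairwise hit rate reported above can be computed along the following lines. This is a minimal sketch that assumes the hit rate is the fraction of product pairs whose relative order agrees between actual and predicted ranks; the function name and the toy ranks are hypothetical.

```python
from itertools import combinations


def pairwise_hit_rate(actual_rank, predicted_rank):
    """Share of product pairs ordered the same way in both rankings."""
    pairs = list(combinations(range(len(actual_rank)), 2))
    hits = sum(
        (actual_rank[a] < actual_rank[b]) == (predicted_rank[a] < predicted_rank[b])
        for a, b in pairs
    )
    return hits / len(pairs)


# Four hypothetical products: the model swaps the second and third ranks.
actual = [1, 2, 3, 4]
predicted = [1, 3, 2, 4]
rate = pairwise_hit_rate(actual, predicted)
```

In this toy case one of the six pairs is misordered, so the hit rate is 5/6. A model with no predictive power would be expected to score about 50% on this measure, which is the natural benchmark for the 76% to 87% figures reported in this section.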

4.3 Parameter Estimates

We present the parameter estimates in Table 3. First, we note that the brand intercepts have face validity. For instance, Sony, one of the best-known brands during our data collection period, exhibits the highest mean brand coefficient of 2.00, while Panasonic, another popular brand, has the second highest mean value of 1.78.14 We also find significant heterogeneity in brand preferences, given the estimate of 0.71 for the standard deviation of its distribution. In terms of other product characteristics, the hard-drive media option is the most preferred, with a coefficient normalized at zero compared to the negative coefficients of other media options: MiniDV is the next preferred option (−1.04) and Flash Memory (−1.32) is the least preferred. Heterogeneity for media formats is also high, with an estimate of 0.96. Compactness negatively influences the mean utility, while consumers prefer a higher number of pixels (0.12). At the time of our analysis, high-definition products were still at an early stage of their product life cycle, negatively impacting their desirability for the majority of consumers. For instance, the high price of high-definition TVs may have worked against the general popularity of high-definition camcorders - a high-definition TV set is required to take full advantage of the high-definition feature of camcorders. In December 2007, the same year as our data, the average price of a 40-inch LCD HD TV was about $1,500 (Magid 2007).15 Interestingly, Samsung, a major camcorder manufacturer, did not offer any high-definition options during our data collection period. Heterogeneity in preferences for high-definition is especially high compared to other attributes, which is justified by how innovative this attribute was at the time of analysis. Finally, we report that our price coefficient estimates imply an average own price elasticity of −1.73.

In terms of the search cost, a higher number of links to a product page reduces its search cost (−1.18), as a page becomes easier to access from other product pages. High-definition products have a lower search cost as well, which is likely driven by their innovative nature, standing out versus the more traditional products, and by more mentions in initial category pages at Amazon. Using our estimated parameters, we forecast a median and modal search set size conditional on choice of 13 and 10, respectively. This suggests that the average search set includes a small fraction of the 90 products in our data, even among those who end up buying a camcorder. In addition, a relatively large consumer search set size implies that our parsimonious joint model is a practical and feasible estimation framework for similar categories, while full-simulation-based estimation will be challenging to implement in this case.

14 The coefficient for brand Sharp is normalized at zero.

15 In contrast, casual browsing reveals that the most popular 42-inch LCD HD TV sold at Amazon.com in summer 2014 is priced between $350 and $450.

Primitive      Variable                                           Mean (s.e.)        Heterogeneity (s.e.)
Utility        Sony                                                2.00 (0.38)       0.71 (0.03)a
               Panasonic                                           1.78 (0.43)       0.71 (0.03)
               Canon                                               1.64 (0.43)       0.71 (0.03)
               JVC                                                 1.33 (0.42)       0.71 (0.03)
               Samsung                                             1.38 (0.32)       0.71 (0.03)
               Media: MiniDV                                      −1.04 (0.11)       0.96 (0.06)b
               Media: DVD                                         −1.18 (0.18)       0.96 (0.06)
               Media: Flash Memory                                −1.32 (0.18)       0.96 (0.06)
               Compact                                            −0.70 (0.27)       1.03 (0.12)
               High-Definition                                    −0.83 (0.21)       1.36 (0.10)
               Zoom                                               −0.01 (0.003)      0.02 (0.003)
               Pixel                                               0.12 (0.07)       0.22 (0.03)
               log(Price)                                          0.93 (0.11)       0.98 (0.06)
               V0 (outside goods)                                  5.12 (0.39)
Search cost    Base search cost (γ0)                              −5.55 (0.36)
               Number of links                                    −1.18 (0.11)
               High-Definition                                    −2.96 (0.34)
               Age                                                 0.44 (0.07)
St. dev. of measurement error, CI (υ)                              0.04 (0.002)
St. dev. of measurement error, sales ranks (τ)                     0.002 (0.0001)
Sum of squared errors                                              23,983
a Random effects variance is common across brands.
b Random effects variance is common across media formats.

Table 3: Estimates of the model parameters for the camcorder category.

4.4 Prediction Exercises

In this section, we conduct two prediction exercises: we predict how consumers substitute to different products when manufacturers increase prices and when firms withdraw products from their product lines. Companies can use the first simulation exercise to identify competing products in the market. The second exercise demonstrates that companies can use our proposed model as a product line management tool. This second case may be especially relevant to Sony, which decided to scale down its operations in 2012 and 2013 for many consumer electronics categories, including digital cameras (Hofilena 2014). As a part of its effort to streamline its product portfolio, Sony may use the proposed model to study the impact of product withdrawals on its overall market share.

First, we predict how consumers substitute away from selected products when their prices increase. For that purpose, we increase the price of each product by 10% and compute the cross price elasticity of other products.16 Table 4 shows the best and second best substitutes for a few selected products. We note that the predicted substitution pattern is in general consistent with the implications from the parameter estimates. Indeed, we see in Table 4 that the best substitute products share some or all attribute levels or similar attribute values with the focal product. For instance, the two closest substitutes for a Sony HD product with a retail price of $601 are other Sony products with the same and other media types in the similar price range of $539 and $677. As another example, two best substitute candidates for a compact Sharp product with FM media type selling at $436 are other Sharp products with same attribute levels selling at $399 and $594. We believe the implied consumer substitution patterns are realistic in the camcorder market. Note that the table also lists the percentage of consumers who choose not to buy any option after the price change. Overall, manufacturers can use the proposed model to identify their competing products and quantify their impact.

16 In order to study substitution patterns based on native consumer preferences only, we set the search cost to be identical for all options in this set of exercises.

Focal product        Price   Own          Best substitute       Price   Cross        Second best           Price   Cross        % to outside
                             elasticity                                 elasticity   substitute                    elasticity   goods
Sony, HD             $601    -2.12        Sony, HD              $539    0.15         Sony, DVD             $677    0.11         4.4
Panasonic, MiniDV    $289    -1.13        JVC, MiniDV           $299    0.19         JVC, MiniDV           $241    0.11         3.7
Canon, DVD           $300    -2.17        Panasonic, DVD        $323    0.13         Samsung, DVD          $294    0.11         4.1
Sharp, FM Compact    $436    -1.10        Samsung, FM Compact   $399    0.17         Sharp, FM Compact     $594    0.05         6.3
Samsung, MiniDV      $219    -0.66        Panasonic, MiniDV     $234    0.04         Sony, MiniDV          $297    0.03         4.8

Table 4: Price-induced substitution patterns of selected camcorders.

As a second exercise, we predict market share changes if Sony decides to withdraw some of its least popular products and streamline its product portfolio. In the past, Sony seemed to have experienced product proliferation in the camcorder category, offering 31 products, or about one third of all the options in our empirical data.17 Given its poor performance in many of its consumer electronics categories (Hofilena 2014), one option Sony may pursue is to rationalize its product line and discontinue some of its least popular products. Sony can use the proposed empirical model to investigate how its own market share changes under such a decision. As an illustration, we simultaneously withdraw the seven least popular products from Sony's product line and study the redistribution of their market shares between Sony and other brands. Table 5 shows the results. At its full product line, Sony's total market share from 31 products is 41.1%, of which the top 24 and bottom 7 options account for 37.6% and 3.5%, respectively.

17 We also note that the average sales rank of Sony's products is 35 with a standard deviation of 27, with the least popular product ranked at 87.

Sony's share changes before and after the decision
                   Old share    New share    Share gain
Top 24 Sony        37.6%        39.5%         1.9%
Bottom 7 Sony       3.5%         0%          −3.5%
All Sony           41.1%        39.5%        −1.6%

Products with the largest share changes
Brand           Media     Price    Old share    New share    Share gain
Sony            miniDV    529       2.93%        3.10%       0.17%
Sony            miniDV    254       1.69%        1.83%       0.14%
JVC             HD        677       6.26%        6.39%       0.13%
Sony            miniDV    297       1.92%        2.04%       0.12%
Outside good                       92.29%       92.40%       0.11%
Sony            HD        778       3.43%        3.54%       0.11%

Table 5: Top market share gainers from Sony's short product portfolio.

After the decision to drop the bottom seven products, Sony's new overall market share is estimated to be 39.5%. Therefore, instead of losing the 3.5% of market share held by its bottom seven products, Sony is predicted to lose 1.6%. This less-than-expected loss is due to the prediction that more than half of the consumers who originally chose one of the bottom seven options switch to other Sony products. The list of products that gain market share is also shown in the same table. For instance, a Sony miniDV camcorder with a price of $529 gains additional market share of 0.17%. A JVC HD product with a retail price of $677 also gains 0.13%. Note that an additional 0.11% of consumers choose the outside good, and hence the market will be less covered after Sony's product withdrawal. Overall, we predict that Sony's overall share loss will be much smaller than the original market share of the seven least popular options.
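The "more than half" statement can be checked directly from the first panel of Table 5. Treating the share gained by the remaining 24 Sony products as share recaptured from the withdrawn products (an interpretive assumption, since the table reports only aggregate shares), the implied retention rate is

$$\frac{39.5\% - 37.6\%}{3.5\%} = \frac{1.9}{3.5} \approx 0.54,$$

and the net loss is $41.1\% - 39.5\% = 1.6$ percentage points, consistent with the figures quoted in the text.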

4.5 Market Structure Analysis

In this section, we discuss the market structure in the camcorder category. In our context, analyzing the market structure is informative for product line managers who are concerned with the relative positioning of their brands in the marketplace (Van Heerde, Mela, and Manchanda 2004). Although there is a large volume of research analyzing market structure in many consumer packaged goods categories using consumer panel data (e.g., Elrod, 1988; Erdem, 1996), market structure analysis of consumer durable goods categories is less common. We demonstrate in this section that one can use the proposed model and public data to investigate many durable goods categories.

In our analysis, we use the framework of clout and vulnerability to represent the asymmetric competitive positions of each brand (Kamakura and Russell, 1989; Van Heerde, Mela, and Manchanda, 2004; Sonnier, McAlister, and Rutz, 2011).18 Figure 1 shows the resulting clout and vulnerability for the six major camcorder manufacturers in our data. Each circle in the chart represents a manufacturer, and the x and y coordinates represent clout and vulnerability, respectively. The size and color of the circle represent the average selling popularity and average selling price of an average product for a manufacturer. A larger size represents higher popularity and a darker color represents a higher selling price.

We note that there is a negative association between clout and vulnerability among the brands, which implies asymmetric competitive positions among the brands (Kamakura and Russell 1989). In Figure 1, Sony shows the lowest level of vulnerability, which may in part be attributable to its highest brand value. Next, Canon has the highest level of clout, closely followed by JVC and Sony. JVC offers 14 products, eight of which are based on the hard drive (HD) media type, the most valued among the four media types. Given its small circle size, we infer that JVC mainly serves a small consumer segment with strong preferences for the hard drive media type, which leads to higher clout and lower vulnerability.19 Samsung and Sharp are the two weakest brands, with low clout and high vulnerability. Lastly, Panasonic, the second largest manufacturer in terms of product offerings, has a vulnerability level similar to other well-known Japanese brands, but it exhibits noticeably lower clout, about half that of other manufacturers. This means that Panasonic cannot effectively lure away buyers from other manufacturers if it lowers its price. In summary, the clout and vulnerability graph - constructed from search and choice data and with estimates from the proposed model - helps managers understand the market structure and obtain essential insights regarding product line management in their large category.

18 For the operationalization of clout and vulnerability, we follow Kamakura and Russell (1989).

19 Note that the mean utility of hard drive is normalized at 0 in our estimates, while those of all other media types are negative. In addition, the total number of hard drive products is 17. JVC offers 47% of all hard drive products in the camcorder market.

Figure 1: Clout and vulnerability for camcorder manufacturers. Larger circles represent higher average popularity and darker circles represent higher average selling price.
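For readers who want to reproduce a chart like Figure 1 from an estimated elasticity matrix, the following sketch shows one common operationalization, assumed here to be in the spirit of Kamakura and Russell (1989): clout sums the squared cross-price elasticities that a brand's price exerts on rivals, and vulnerability sums the squared cross-price elasticities that rivals' prices exert on the brand. The function name and the 3 × 3 elasticity matrix are made up for illustration and are not estimates from this paper.

```python
def clout_and_vulnerability(elasticity):
    """elasticity[i][j]: % change in brand i's share per 1% change in brand j's price."""
    n = len(elasticity)
    # Clout of j: how strongly j's price moves the shares of all other brands.
    clout = [
        sum(elasticity[i][j] ** 2 for i in range(n) if i != j) for j in range(n)
    ]
    # Vulnerability of i: how strongly other brands' prices move i's share.
    vulnerability = [
        sum(elasticity[i][j] ** 2 for j in range(n) if j != i) for i in range(n)
    ]
    return clout, vulnerability


# Hypothetical brand-level elasticities (diagonal entries are own elasticities).
E = [
    [-2.0, 0.2, 0.1],
    [0.4, -1.5, 0.1],
    [0.3, 0.2, -1.0],
]
clout, vulnerability = clout_and_vulnerability(E)
```

In this toy matrix the first brand has the highest clout and the lowest vulnerability, the kind of asymmetric position that the negative association in Figure 1 describes.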

5 Conclusion

In this paper, we propose a theory-based, joint model of optimal sequential search and choice. Based on the premise that consumers look for information to fully resolve match values about products in a costly search environment, we conceptualize the consideration or search set as an outcome of an optimal sequential search process, with consumers making their choices from the resulting set of alternatives. Our model fully characterizes the costly search and choice decisions driven by the same demand primitives. For our empirical analysis, we use aggregate consumer search and choice data on the digital camcorder category from Amazon.com and study consumer substitution patterns and market structure.

We aim to make the following contributions to the existing literature on consumer information search and choice-based aggregate demand models. First, methodologically, our model fully characterizes costly search and consumer choice subject to the sufficient conditions induced by optimal sequential search, and derives a parsimonious expression for the choice probability. By doing so, the model leads to a partial-simulation estimation framework and avoids full simulation-based estimation. This feature is particularly attractive for consumer durable goods, where the consumer search set may be large. In addition, given that consumer search data capture rich consumer substitution patterns and that the identification of consumer heterogeneity has been one of the main challenges in choice-based aggregate demand models, the joint use of search and choice data helps researchers to better study consumer demand, including consumer heterogeneity.

Substantively, we apply the proposed model to aggregate-level browsing and sales data from Amazon.com and illustrate the applicability of our empirical model. Once we estimate the demand primitives, we study consumer substitution patterns under price changes for a few selected products and under the withdrawal of a few products, as well as the market structure at the brand level. Consistent with our parameter estimates, we find that consumer substitution patterns are realistic and are strong along brands, price, and technical features including media type and high definition. We believe the proposed model will help product managers identify their competing products and understand the competitive positions of their brands using public data available at many online retailers.


References

Albuquerque, Paulo and Bart J. Bronnenberg. 2009. “Estimating Demand Heterogeneity Using Aggregated Data: An Application to the Frozen Pizza Category.” Marketing Science 28 (2):356–372.

Anderson, Simon P and Regis Renault. 1999. “Pricing, product diversity, and search costs: A Bertrand-Chamberlin-Diamond model.” The RAND Journal of Economics :719–735.

Bajari, Patrick, Jeremy T Fox, and Stephen P Ryan. 2007. “Linear regression estimation of discrete choice models with nonparametric distributions of random coefficients.” The American Economic Review :459–463.

Berry, Steven, James Levinsohn, and Ariel Pakes. 1995. “Automobile prices in market equilibrium.” Econometrica: Journal of the Econometric Society :841–890.

Berry, Steven, James Levinsohn, and Ariel Pakes. 2004. “Differentiated Products Demand Systems from a Combination of Micro and Macro Data: The New Car Market.” Journal of Political Economy 112 (1):68–105.

Bresnahan, Timothy F. 1987. “Competition and collusion in the American automobile industry: The 1955 price war.” The Journal of Industrial Economics :457–482.

Bronnenberg, Bart J., J. Kim, and Carl F. Mela. 2014. “Zooming In on Choice: How Do Consumers Search for Cameras Online?”

Chen, Yuxin and Song Yao. 2012. “Search with Refinement.” Working paper.

De los Santos, Babur, Ali Hortacsu, and Matthijs R Wildenbeest. 2012. “Testing models of consumer search using data on web browsing and purchasing behavior.” The American Economic Review 102 (6):2955–2980.

Demidenko, Eugene. 2004. “Mixed Models: Theory and Applications (Wiley Series in Probability and Statistics).”

Draganska, Michaela and Daniel Klapper. 2011. “Choice set heterogeneity and the role of advertising: An analysis with micro and macro data.” Journal of Marketing Research 48 (4):653–669.

Efron, Bradley and Robert J Tibshirani. 1994. An introduction to the bootstrap, vol. 57. CRC Press.

Eisenberg, Bryan. 2009. “Top 10 Online Retailers by Conversion Rate: June 2009.” URL http://www.grokdotcom.com/2009/08/03/top-10-online-retailers-by-conversion-rate-june-2

Elrod, Terry. 1988. “Choice map: Inferring a product-market map from panel data.” Marketing Science 7 (1):21–40.


Erdem, Tülin. 1996. “A dynamic analysis of market structure based on panel data.” Marketing Science 15 (4):359–378.

Ghose, Anindya, Panagiotis G. Ipeirotis, and Beibei Li. 2013. “Surviving Social Media Overload: Predicting Consumer Footprints on Product Search Engines.”

Goeree, Michelle Sovinsky. 2008. “Limited information and advertising in the US personal computer industry.” Econometrica 76 (5):1017–1074.

Greene, William H. 2003. Econometric analysis. Pearson Education India.

Hancox, Paul. 2008. “Is Amazon’s 9.6% Conversion Rate Low? Here’s Why I Think So...” URL http://www.conversionblogger.com/is-amazons-96-conversion-rate-low-hereswhy-i-think-so/.

Hofilena, John. 2014. “Sony to cut more jobs at failing electronic equipment division.” URL http://japandailypress.com/sony-to-cut-more-jobs-at-failing-electronicequipment-division-0341844/.

Honka, Elisabeth and Pradeep K. Chintagunta. 2014. “Simultaneous or Sequential? Search Strategies in the US Auto Insurance Industry.” Working paper.

Horowitz, Joel L. 2010. Semiparametric and nonparametric methods in econometrics, vol. 692. Springer.

Imbens, Guido W and Tony Lancaster. 1994. “Combining micro and macro data in microeconometric models.” The Review of Economic Studies 61 (4):655–680.

Kamakura, Wagner A and Gary J Russell. 1989. “A probabilistic choice model for market segmentation and elasticity structure.” Journal of Marketing Research :379–390.

Kim, Jun B, Paulo Albuquerque, and Bart J Bronnenberg. 2010. “Online demand under limited consumer search.” Marketing Science 29 (6):1001–1023.

Koulayev, Sergei. 2014. “Estimating demand in online search markets, with application to hotel bookings.” Forthcoming.

Magid, Larry. 2007. “Today’s HDTV, or Next Year’s?” URL http://www.nytimes.com/2007/12/20/technology/personaltech/20basics.html.

Petrin, Amil K. 2002. “Quantifying the Benefits of New Products: The Case of the Minivan.” Journal of Political Economy 110 (4):705–729.

Seiler, Stephan. 2013. “The impact of search costs on consumer behavior: A dynamic approach.” Quantitative Marketing and Economics 11 (2):155–203.


Sonnier, Garrett P, Leigh McAlister, and Oliver J Rutz. 2011. “A dynamic model of the effect of online communications on firm sales.” Marketing Science 30 (4):702–716.

Train, Kenneth. 2009. Discrete choice methods with simulation. Cambridge, UK: Cambridge University Press.

Van Heerde, Harald J, Carl F Mela, and Puneet Manchanda. 2004. “The dynamic effect of innovation on market structure.” Journal of Marketing Research 41 (2):166–183.

Weitzman, Martin L. 1979. “Optimal search for the best alternative.” Econometrica: Journal of the Econometric Society :641–654.


A Public Data on Consumer Search and Choice

Many popular online retailers provide data that summarize aggregate-level consumer search and choice. Such data can be used to study aggregate demand in a vast array of product categories. The following table summarizes the availability of such data for the largest online retailers, mainly in the US, as of April 2014.

Online Store              Search Data                                     Choice Data         Choice Conditional On Search
Amazon.com, Amazon.co.uk  Customers who viewed this item also viewed ...  Sales Rank          Customers who viewed this item bought ...
Bestbuy.com               Customers who viewed this item also viewed ...  Best Seller (Sort)  NA
Walmart.com               NA                                              Best Seller (Sort)  Customers who viewed this item bought ...
Target.com                Guests who viewed this item also viewed         Best Seller (Sort)  NA
Overstock.com             Customers who viewed this item also ...         Best Seller (Sort)  NA
Kmart.com                 Customers who viewed this item also ...         Best Seller (Sort)  NA
QVC.com                   Customers who viewed this item also ...         Best Seller (Sort)  NA

Table A.1: Summary of online retailers and data availability.

B Data Description

The view rank data consist of lists of products that past consumers viewed, conditional on viewing a focal product in the same browsing session. Consumers “view” a product if they visit the web page that contains the detailed information on the product. For example, if option B was viewed frequently with option A, option B will appear in A’s view rank list. In addition, a product that was viewed more frequently with option A will appear higher in A’s view rank list than a product that was viewed less frequently with A. Amazon.com provides this view rank list for all products, and we refer to the set of all view rank lists as the view rank data.

The conditional share data consist of the choice shares of products in the category, conditional on viewing the focal product and on category incidence. That is, if option B is frequently chosen among consumers who viewed option A, B will appear on A’s conditional share list with its numeric share value. Amazon.com provides these data for all focal products. We refer to the set of all conditional share lists as the conditional share data. In principle, the conditional share data provide detailed information on market shares within specific groups of limited choice sets. Amazon.com’s conditional share lists are often truncated, i.e., they list only the four most popular choices conditional on searching for A. However, their cumulative shares usually add up to close to unity and hence provide rich information on conditional consumer choices.
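The two data objects described above can be illustrated with a short Python sketch that builds view rank lists and truncated conditional share lists from session-level records. This is only an illustration under stated assumptions: we observe the aggregate lists, not individual sessions, and Amazon.com's actual procedure is not known to us. The session format (a set of viewed products plus an optional purchased product), the function names, and the alphabetical tie-breaking rule are all hypothetical.

```python
from collections import Counter, defaultdict


def view_rank_lists(sessions):
    """Map each focal product to the products co-viewed with it, most frequent first."""
    co_views = defaultdict(Counter)
    for session in sessions:
        viewed = set(session["viewed"])
        for focal in viewed:
            for other in viewed - {focal}:
                co_views[focal][other] += 1
    # Rank by co-view count (descending); ties are broken by product id (assumption).
    return {
        focal: [p for p, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]
        for focal, counts in co_views.items()
    }


def conditional_shares(sessions, top_n=4):
    """Map each focal product to its top_n (product, share) pairs among buyers who viewed it."""
    purchases = defaultdict(Counter)
    for session in sessions:
        bought = session["bought"]
        if bought is None:
            continue  # condition on category incidence: keep purchasing sessions only
        for focal in set(session["viewed"]):
            purchases[focal][bought] += 1
    shares = {}
    for focal, counts in purchases.items():
        total = sum(counts.values())
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        # Truncate the list, mirroring the top-four truncation described in the text.
        shares[focal] = [(p, n / total) for p, n in ranked[:top_n]]
    return shares


if __name__ == "__main__":
    # Hypothetical browsing sessions, for illustration only.
    sessions = [
        {"viewed": {"A", "B"}, "bought": "B"},
        {"viewed": {"A", "B", "C"}, "bought": "A"},
        {"viewed": {"A", "C"}, "bought": None},
        {"viewed": {"A", "B"}, "bought": "B"},
    ]
    print(view_rank_lists(sessions)["A"])
    print(conditional_shares(sessions)["A"])
```

Because shares are computed before truncation, the listed shares of a truncated list can sum to less than one, which matches the observation that cumulative shares are usually close to, but not exactly, unity.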
