Sequential Testing of Product Designs: Implications for Learning

Sanjiv Erat (Rady School of Management, University of California at San Diego. Email: [email protected])

Stylianos Kavadias (College of Management, Georgia Institute of Technology. Email: [email protected])

July 18, 2007

The authors thank Christoph Loch, Beril Toktay, the Associate Editor, and two anonymous reviewers for their valuable suggestions and comments that have greatly helped improve the manuscript.

Abstract

Past research in new product development (NPD) has conceptualized prototyping as a ‘design-build-test-analyze’ cycle to emphasize the importance of the analysis of test results in guiding the decisions made during the experimentation process (Thomke, 1998). New product designs often involve complex architectures and incorporate numerous components, and this makes the ex ante assessment of their performance difficult. Still, design teams often learn from test outcomes during iterative test cycles, enabling them to infer valuable information about the performances of (as yet) untested designs. We conceptualize the extent of useful learning from analysis of a test outcome as depending on two key structural characteristics of the design space, namely whether the designs are ‘close’ to each other (i.e., the designs are similar on an attribute level) and whether the design attributes exhibit non-trivial interactions (i.e., the performance function is complex). This study explicitly considers the design space structure and the resulting correlations among design performances, and examines their implications for learning. We derive the optimal dynamic testing policy, and we analyze its qualitative properties. Our results suggest optimal continuation only when the previous test outcomes lie between two thresholds. Outcomes below the lower threshold indicate an overall low performing design space, and, consequently, continued testing is suboptimal. Test outcomes above the upper threshold, on the other hand, merit termination since they signal to the design team that the likelihood of obtaining a design with a still higher performance (given the experimentation cost) is low. We find that accounting for the design space structure splits the experimentation process into two phases: the initial exploration phase where the design team focuses on obtaining information about the design space, and the subsequent exploitation phase where the design team, given their understanding of the design space, focuses on obtaining a ‘good enough’ configuration. Our analysis also provides useful contingency-based guidelines for managerial action as information gets revealed through the testing cycle. Finally, we extend the optimal policy to account for design spaces that contain distinct design subclasses.

(KEYWORDS: SEQUENTIAL TESTING; DESIGN SPACE; COMPLEXITY; CONTINGENCY ANALYSIS)

1 Introduction

Before launching a new product, firms often test multiple product prototypes to gather performance and market data, and to decrease the technical and market uncertainty. Various studies emphasize the importance of testing and experimentation for successful new product development (NPD) and, at the same time, recognize that testing may consume substantial resources (Allen, 1977; Clark & Fujimoto, 1989; Cusumano & Selby, 1995; Thomke, 2003). Thus, these studies highlight a key trade-off involved in experimentation: performance benefits versus the costs of additional experiments. Over the years, a number of studies have examined this fundamental trade-off and its associated drivers. Early on, Simon (1969) observes the need to undertake multiple trials (experiments) to obtain a high performing outcome (design). DeGroot (1968) models this process as the repeated sampling of a random variable representing the design performance. In a seminal study, Weitzman (1979) recognizes that the different design performances are not necessarily ‘realizations’ of the same random variable. Instead, he views experimentation as a search over a finite set of statistically independent alternatives (designs) with distinct and known prior distributions of performances. His simple and elegant result is summarized as follows: continue testing as long as the best performance obtained so far is below a ‘reservation performance.’ Clark & Fujimoto (1989) present evidence from practice that design configuration tests are not simply arbitrary draws from independent performance distributions, but rather are iterative “design-build-test” tasks. Their study forms the basis for Thomke’s conceptualization of the experimentation process as an iterative “design-build-test-analyze” cycle (Thomke, 1998; Thomke, 2003). Thomke’s research is the first to emphasize the aspect of learning during experimentation, and to highlight that the analysis phase enables a design team to understand the design space better and revise the test schedules accordingly. Building on Thomke’s (1998) empirical findings, Loch et al. (2001) analytically examine which mode of experimentation (parallel versus sequential experimentation) minimizes the overall experimentation costs. In their model, the design team gradually learns through imperfect sequential experiments, and identifies

which design is the ‘best’ in fulfilling an exogenously specified functionality. In this study, we recognize that the extent of transferable learning (i.e., relevance of the outcome from testing one design configuration in understanding and predicting the performance of another configuration) is directly related to the design space structure (e.g. the similarities among different design configurations). Thomke et al. (1999) offer a tangible illustration of how such transferable learning occurs in practice: the design team, conducting simulation-based side-impact crash tests of BMW cars, observed during the course of testing that weakening a particular part of the body structure (specifically, a pillar) led to superior crash performance of the vehicle. This insight allowed them to infer the same relationship in other related (similar) designs, and to improve overall safety of BMW’s product offerings. This article seeks to answer the following research questions: (i) How does the design space structure influence the optimal number of experiments? and (ii) How does the design space structure drive the design team’s learning? We motivate our formal treatment of the problem with the following numerical example.

Motivating Example

Consider a product that can be realized through two design configurations, A and B, and let a subset of design attributes be identical across the two configurations. The design team assesses, a priori, that the performance of configuration A (B) is distributed normally around a mean µA (µB) with a standard deviation of σA (σB). The similarities in the design attributes between the two configurations lead the design team to expect that, if A exhibits high (low) performance, then B has a higher likelihood of a high (low) performance. Thus, the scalar performances of configurations A and B are jointly multivariate normal and exhibit positive correlation ρAB = θ. We assume that testing reveals the true performance of design configurations (i.e., perfect testing), and suppose that the cost of testing design configuration A (B) is CA (CB). Consider the following example of parameters:

µA = 0.1,  µB = 0,  σA = σB = 1,  CA = CB = 0.1,  θ = 0.8

Baseline Case with No Learning

First, consider the case where the performance dependencies, and thus the possibility of learning, are disregarded, i.e., θ = 0. Furthermore, suppose that the best performance found so far is x (x = 0 if no test has been conducted). Given x, continuing on to test design A at a cost CA results in the (expected) performance E[max{x, A}]. Hence, the marginal value from testing A is E[max{x, A}] − x − CA. Weitzman (1979) shows that this marginal value is positive and further experimentation is beneficial as long as the current best solution (i.e., x) is lower than a threshold (i.e., the reservation price of A). Furthermore, Weitzman’s (1979) reservation price (denoted by WRP) of A is obtained by equating the above marginal value to zero, i.e., solving for x in the equation E[max{x, A}] − x − CA = 0. In our example, the WRP for A is 1.01 and for B is 0.91. Thus, if the performance dependencies between the two configurations are disregarded, the team optimally tests A, and then continues on to test B if and only if the outcome from testing A is less than 0.91.

Testing Cycle with Learning

Now consider the setting where the design team accounts for performance dependencies (i.e., correlation θ = 0.8). Suppose that design A is tested and results in performance a. The conditional distribution of B (i.e., conditional on the realization a) is normal with mean µB + θ(a − µA) and standard deviation σB √(1 − θ²) (Johnson & Wichern, 2002). Since no subsequent learning is possible (i.e., B is the only other design), Weitzman’s (1979) result applies from this point on. Hence, the team continues to test B if and only if the WRP of B (conditional on the realization a) is greater than the best performance found so far, i.e., 0.8a + 0.28 > max(a, 0). In summary, when θ = 0.8, the team optimally tests A, and then continues to B if and only if a ∈ (−0.35, 1.4). A graphical comparison between the two cases is illustrated in Figure 1.

1. Weitzman’s (1979) results presented here have been adapted to our setting. His reservation price result is valid under more general settings, including different costs and arbitrary distributions, as long as the alternatives are statistically independent.

2. It is suboptimal not to test any designs, since by testing A (B) and stopping immediately the experimenter makes a strictly positive payoff = E[max{0, A}] − CA (E[max{0, B}] − CB). Furthermore, since µA > µB, it is strictly better to test A first.
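These reservation-price calculations are straightforward to reproduce numerically. The sketch below is our own illustration (not code from the paper): it solves E[max{x, A}] − x − C = 0 with a SciPy root finder for the parameters stated above, and then applies the same rule to the conditional distribution of B given a to recover the continuation interval of roughly (−0.35, 1.4); the function names are ours.

```python
# Sketch (ours): reservation prices and the continuation interval of the
# motivating example, under the parameters given in the text.
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def wrp(mu, sigma, c):
    """Weitzman reservation price: the x solving E[max{x, X}] - x - c = 0
    for X ~ N(mu, sigma)."""
    def marginal_value(x):
        z = (x - mu) / sigma
        return (mu - x) * (1 - norm.cdf(z)) + sigma * norm.pdf(z) - c
    return brentq(marginal_value, mu - 10 * sigma, mu + 10 * sigma)

mu_A, mu_B, sigma, c, theta = 0.1, 0.0, 1.0, 0.1, 0.8

# Baseline without learning: test A first, continue to B iff the outcome a < WRP(B).
print(wrp(mu_A, sigma, c), wrp(mu_B, sigma, c))   # approx. 1.0 and 0.9 (text: 1.01 and 0.91)

# With learning: after observing A = a, the conditional distribution of B is
# N(mu_B + theta*(a - mu_A), sigma*sqrt(1 - theta**2)); continue iff its
# reservation price exceeds max(a, 0).
def continue_after(a):
    cond_mu = mu_B + theta * (a - mu_A)
    cond_sigma = sigma * np.sqrt(1 - theta ** 2)
    return wrp(cond_mu, cond_sigma, c) > max(a, 0.0)

# Continuation holds roughly on the interval (-0.35, 1.4).
print([continue_after(a) for a in (-0.4, -0.3, 1.3, 1.5)])   # [False, True, True, False]
```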


[Figure 1: Differences between the baseline case without learning and the testing cycle with learning. In the baseline case, the team continues testing for outcomes a below 0.91 and terminates otherwise. Under the optimal policy with learning, the team continues only for a in (−0.35, 1.4): outcomes below −0.35 are bad news (design B is probably inferior; terminate testing), while outcomes between 0.91 and 1.4 are good news (design B probably has high performance; continue testing).]

The figure points toward two key benefits of learning: (i) cost savings due to early termination, when the acquired information and the associated analysis suggest that the remaining configurations would exhibit inferior performances (i.e., the a ≤ −0.35 region), and (ii) gains by additional experimentation when the information indicates untapped potential from the untested designs (i.e., the 0.91 ≤ a < 1.4 region). In this article, we model the experimentation process as an iterative search over a set of feasible design configurations. We develop a stochastic dynamic program, and we characterize the optimal testing policy. Our goal is to describe the effects of design space structure on learning, and, by extension, on the optimal test schedules. Furthermore, we also aim to develop and prescribe intuitive rules-of-thumb to aid managerial decision making during experimentation. Our contribution is three-fold. First, our conceptualization allows us to demonstrate how the design space structure (i.e., the similarities among the design configurations and the complexity of the performance function) critically determines the optimal amount of experimentation. Furthermore, we offer a more realistic approach to understanding learning in NPD experimentation, beyond Weitzman’s (1979) assumption of independent designs/alternatives, and Loch et al.’s (2001) highly specific assumption that testing a design alternative signals whether or not the design is the ‘best’. However, to maintain analytical tractability, we rely on specific assumptions about some facets of the testing process (e.g., we assume that design performances are normally distributed and that testing costs are equal within a class of designs). Yet, these assumptions do not fundamentally relate to the learning aspect of experimentation, and we consider them a fair price to pay


for extending the analysis to interrelated design configurations. In addition, in §4, we offer a set of results that illustrate the mathematical intractability in obtaining closed-form solutions once we move beyond our assumptions. Second, we identify and characterize the qualitative effects on the testing process due to the explicit consideration of the design space structure. Design teams may pursue additional tests not only to identify configurations with higher performance, but also to explore and to gain greater understanding of the design space. We term this pattern the ‘exploration phase.’ Such aggressive exploration appears in the initial stages of testing and is stronger when the interactions between design attributes are weaker and/or similarities between designs are greater. Once the team gains sufficient understanding of the design space, the likelihood of terminating the testing cycle increases. We term this the ‘exploitation phase.’ We also find that the length of the exploration phase depends on the extent of similarities between the designs. More similarities lead to a shorter exploration phase. Thus, the experimentation process is split into two phases: an exploration phase where the ‘design-build-test-analyze’ iterations are primarily conducted based on what you may learn, and an exploitation phase where the ‘design-build-test-analyze’ cycles are driven by what you have learned. Third, we outline rules-of-thumb for guiding managerial decisions contingent on the initial test outcomes. When worse-than-expected performance outcomes are observed, the design team should transition sooner to an exploitation regime if the interactions between the design attributes are deemed weak. Thus, the length of the exploration regime depends critically on how the first few test results compare to the a priori performance expectations, and on the extent of interactions between the design attributes. The model formulation is presented in §2. §3.1 presents the solution to the base case together with several important properties of the optimal policy. §3.2 and §3.3 generalize the base case model to settings where the parameter estimates are uncertain and to the case where the designs belong to different sub-classes, respectively. §4 concludes with managerial insights, a discussion of the limitations of the current search model, and some paths for future research.


2 A Formal Model of Learning in Experimentation

Consider a design team that develops and tests different configurations before launching a new product. The performance of the product design depends on M distinct design attributes. Hence, each design configuration i (i = 1, 2, ..., N) is a point in the M-dimensional design space and is represented as a unique combination of the M attributes, namely x_ij (j = 1, 2, ..., M). For example, within the premises of Thomke et al.’s (1999) example, the thickness of the pillar for a design configuration i is one design attribute value x_i1. The team selects the values for the attributes x_ij, and determines the performance P_i = P(x_i1, ..., x_iM) of configuration i; P(·) is the performance function that maps the design space to a real number representing the design performance. As in most new product designs, the performance function P(·) cannot be ex ante specified, due to incomplete knowledge or inherently uncertain and complex relationships between the attributes (Ethiraj & Levinthal, 2004; Loch & Terwiesch, 2007). Thomke et al.’s (1999) crash test example offers a vivid illustration of how complexity naturally emerges from unforeseen interactions between the pillar and the rest of the design attributes. Despite the presence of such complexity, the design team can often estimate, at a high level, the extent of potential interactions between the design attributes. Furthermore, since a specific choice of attribute values constitutes a design configuration, limited changes in a subset of the design attributes result in similarities among designs. As the degree of attribute commonality increases, the design configurations appear more similar to one another. We model these two aspects of the design space - potential interactions between attributes and the similarity between configurations - through a performance correlation between the designs. Specifically, the performance levels of two designs are likely to be more related if there is only a small difference between the attribute values of the two designs (a smaller ‘distance’ between the designs) and/or if the extent of interactions between the design attributes is low (a simpler performance function). On the other hand, widely

3. This is similar to assessing the interaction parameter K in an NK landscape model (Kauffman, 1993).


different design attributes across configurations (a larger ‘distance’ between designs) and/or a design space where the design attributes exhibit a greater degree of interactions (a more complex performance function) limit the performance dependence among different configurations. To summarize, two designs are more performance correlated if they are less distant and if the performance function is less complex. The design team may, at least roughly, estimate the correlation between any two designs, as it understands the distance between the two designs and the complexity of the performance function. Furthermore, we assume that these correlations are non-negative. That is, a high (low) test outcome for a design configuration signals a higher likelihood for the performance of the other designs to be high (low). This assumption is natural in the context of performance functions (fitness landscapes). For instance, Stadler (1992) demonstrates how families of performance functions (landscapes), including those generated from many combinatorial optimization problems, can be characterized by their correlations and exhibit positive correlations. Furthermore, our assumption of positive correlations conforms to NK landscapes and AR(1) processes, both of which are widely used to represent performance landscapes in the complexity literature (Kauffman, 1993). Let P̄ = (p_1, p_2, ..., p_N) be the vector of the performances of the N potential design configurations. P̄ is random because of the design team’s limited knowledge; we assume that the joint distribution of the scalar performances is multivariate normal. Our distributional assumption is realistic for a wide range of performance landscapes. For instance, in fitness landscapes such as the widely used NK landscapes, the performance at each point is approximately normally distributed (Kauffman, 1993, p. 53). The normality assumption also results in a tractable model and allows the characterization of the optimal policy properties (our discussion in §4 illustrates that index policies may fail to exist for arbitrary distributions). Let the design team’s a priori expectations of the design performances be µ̄ = (µ_1, µ_2, ..., µ_N).

4. Intuitively, if the distance between two designs is low and the performance function is simple, we would expect the performances to be positively correlated. Alternatively, if the distance between designs is high and the performance function is complex, we would expect the performances to be unrelated (i.e., correlation = 0). Note that in either case the correlations are always non-negative.


The a priori performance uncertainty is summarized in a performance covariance matrix Σ_N. We assume that the a priori variances of the performances of all the designs are identical (= σ²). Our assumption reflects relatively homogeneous design spaces with configurations that differ with respect to a small, limited subset of design attributes. We offer a limited relaxation of this assumption in §3.3. However, as we demonstrate in §4, the equal variance assumption is not merely convenient, but is also necessary for mathematical tractability. In summary, our conceptualization assumes that the design team has a multivariate normal prior on the design performances. Thus, the design team knows the following parameters of the multivariate normal distribution of the design performances: (i) the performance expectations; (ii) the performance uncertainty (represented through the common standard deviation); and (iii) the performance correlations between the design configurations. Note that this does not imply knowledge of the exact performance levels, but only represents the a priori knowledge of design commonalities (based on, for example, shared components). Suppose that the team undertakes sequential testing, and that the design configurations 1 through k have been tested. We assume that the tests are perfect, i.e., the measurement errors, if any, are negligible, and the true performance of a design is revealed when the design is tested. Then, the conditional joint distribution (conditional on the k test outcomes observed so far) of the remaining N − k designs is again a multivariate normal distribution (Rencher, 2002). Furthermore, the conditional distribution (conditional on the k test outcomes observed so far) of each of the remaining N − k designs is univariate normal with the following conditional mean and standard deviation (Rencher, 2002):

µ̂_i^{k+1} = µ_i + r̄_k^{i′} Q_k^{−1} (P̄_k − µ̄_k),   i = k+1, ..., N        (1)

σ̂_i^{k+1} = σ √(1 − r̄_k^{i′} Q_k^{−1} r̄_k^{i}),   i = k+1, ..., N        (2)

where µ̂_i^{k+1} and σ̂_i^{k+1} are the conditional mean and the standard deviation for design configuration i; r̄_k^{i} is a k × 1 column vector of a priori covariances of design i with all of the k tested designs; Q_k is a k × k submatrix of the covariance matrix Σ_N such that it contains the a priori covariances of all the tested designs; µ̄_k is a k × 1 column vector of a priori expectations of all the k tested prototypes (= (µ_1, µ_2, ..., µ_k)); P̄_k is the

k × 1 column vector of performances of the k tested prototypes (= (p_1, p_2, ..., p_k)). After the kth test, let U be the set of all untested configurations, and T the set of all the tested ones. If the cost of testing design i is c_i, and p = max_{j∈T} p_j is the maximum performance observed until now, the team faces the following dynamic decision problem:

V_k(p, µ̂_k, U) = max { p ,  max_{j∈U} ( E[ V_{k+1}(max(p, p_j), µ̂_{k+1}, U − {j}) ] − c_j ) }        (3)

V_N(p, ·, ∅) = p        (4)

Equations (3) and (4) are the Bellman optimality equations corresponding to the decision process. The state is described by the maximum performance realized thus far, the set of untested design configurations, and the vector of the updated mean performance values of the remaining tests. The upper branch of (3) describes the payoff if the team terminates the experimentation; it is equal to p, the highest performance observed. Upon continuation, the expected payoff results from the choice of one of the untested configurations. Note that the final payoffs are given by p = max_{j∈T} p_j. Thus, we assume that a design is available for market launch only if it has been tested. The investment involved in setting up a manufacturing process and a distribution channel makes any launch approval unlikely, unless a profitable business case, including testing data, is presented.
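To make the recursion concrete, the following sketch is our own illustration under the assumptions of §2 (none of the function names come from the paper): it evaluates the value function by brute force for a very small design space, using the standard multivariate-normal conditioning behind equations (1)-(2) and a Monte Carlo approximation of the expectation over the next test outcome. It is exponential in the number of designs and is meant only to illustrate the structure of equations (3)-(4), not as a practical solver.

```python
# Sketch (ours): brute-force evaluation of the Bellman recursion (3)-(4)
# for a small design space, with Monte Carlo approximation of expectations.
import numpy as np

rng = np.random.default_rng(0)

def cond_mean_std(mu, Sigma, tested, outcomes, i):
    """Conditional mean/std of untested design i given the tested outcomes
    (standard multivariate-normal conditioning, as behind equations (1)-(2))."""
    mu = np.asarray(mu, dtype=float)
    r = Sigma[i, tested]                     # covariances of design i with tested designs
    Q = Sigma[np.ix_(tested, tested)]        # covariance matrix of the tested designs
    w = np.linalg.solve(Q, r)
    mean = mu[i] + w @ (np.asarray(outcomes) - mu[tested])
    std = np.sqrt(Sigma[i, i] - r @ w)
    return mean, std

def value(mu, Sigma, cost, tested, outcomes, best, n_samples=40):
    """V_k: either stop and collect `best`, or test one more untested design."""
    untested = [i for i in range(len(mu)) if i not in tested]
    if not untested:
        return best                          # equation (4): no designs left
    best_continue = -np.inf
    for j in untested:
        if tested:
            m, s = cond_mean_std(mu, Sigma, tested, outcomes, j)
        else:
            m, s = mu[j], np.sqrt(Sigma[j, j])
        draws = rng.normal(m, s, n_samples)
        cont = np.mean([value(mu, Sigma, cost, tested + [j], outcomes + [p],
                              max(best, p), max(10, n_samples // 4))
                        for p in draws]) - cost
        best_continue = max(best_continue, cont)
    return max(best, best_continue)          # equation (3)

# Example with hypothetical parameters: three designs, unit variance, correlation 0.4.
mu = [0.2, 0.1, 0.0]
Sigma = 0.4 * np.ones((3, 3)) + 0.6 * np.eye(3)
# best = 0.0 follows the convention of the motivating example (no test yet).
print(value(mu, Sigma, cost=0.1, tested=[], outcomes=[], best=0.0))
```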

3 Effect of Learning on Optimal Testing

In this section, we derive the optimal dynamic search policy and we demonstrate its key structural properties. We proceed in three stages: In §3.1, we assume that the performance dependencies between any pair of design configurations are identical, that is, the correlations among the design performances are identical. The assumption represents settings where the design configurations lie within a specific scientific domain, and they share a substantial number of attributes (i.e., each design configuration is of the form (x_1, x_2, ..., x_M), and the first r attributes x_1, x_2, ..., x_r have the same value across all configurations). The design team knows

the performance correlations among the designs, but not the test outcome. As stated in the introduction, this reflects knowledge about the distance between design configurations, and the complexity of the performance function. In §3.2, we relax this assumption and allow for uncertain correlations. In §3.3, we relax the assumption of identical correlations between the design configurations.

3.1 Base Case: Identical Correlations

For now, we assume that any pair of design configurations has the same correlation factor. The overall performance uncertainty of the design configurations is summarized by the covariance matrix Σ_N (defined in §2). It also summarizes the dependencies across the performance random variables, that is, the ij-th element a_ij is the covariance between the performance random variables of design configurations i and j. Then,

a_ij = σ²  if i = j,   and   a_ij = θσ²  otherwise   (0 < θ < 1),

where θ is the correlation factor. The experimentation cost is composed of two elements (Thomke, 2003): (i) the direct cost of specialized resources required to conduct the experiments; and (ii) the indirect opportunity cost resulting from product introduction delays. We assume that the direct testing costs are constant and independent of the tested configuration (= c_0). This is realistic when the cost is driven by the experimentation process rather than the experimentation subject. For example, in some athletic gear products, the designers conduct wind tunnel tests to refine a product’s aerodynamic properties (The Globe and Mail 6/24/2004). These costs are substantial and are usually independent of the tested design configuration. Furthermore, our constant cost assumption coincides with similar assumptions made in the extant literature (Loch et al., 2001). In §4, we offer a formal discussion of the limitations imposed by the constant cost assumption.

5. “Before the 2003 Tour (de France), Armstrong spent about $15,000 an hour, for wind tunnel tests to create a new helmet.” San Jose Mercury News (06/28/2004).


The second component of the experimentation cost, the opportunity cost from delayed product introductions, can be significant. Clark (1989) notes that “each day of delay in market introduction costs an automobile firm over $1 million in lost profits” (Clark, 1989, p. 1260). We account for such losses as follows: a firm that tests k designs before terminating the test cycle incurs a delay loss L(k). Using the total effective cost of the kth test, c_k = c_0 + L(k) − L(k − 1), allows us to derive the optimal policy for general loss function structures (see Appendix, p. 30). This explicit loss function has two advantages over a discount factor: (i) we posit that factors, such as competition and the short product life-cycle, affect the opportunity loss significantly more than the time value of money captured through a discount factor (Kavadias & Loch, 2003); and (ii) using identical discounting for both the costs and the performance levels is misleading in an NPD context, as the risks associated with the two may be very different (for instance, technical risks versus market commercialization/acceptance risks). Suppose that testing is sequential, and configurations 1 to k have been tested. Due to the performance dependencies, the outcomes of the tested configurations also convey information about the untested configurations. Proposition 1 outlines this dynamic information updating.

Proposition 1 Assume that k configurations have been tested. The updated (marginal) performance distribution of an untested configuration i is normal with mean and standard deviation

(i) µ̂_i^{k+1} = µ̂_i^{k} + θ(p_k − µ̂_k^{k}) / (kθ + 1 − θ) = µ_i + [θ/(kθ + 1 − θ)] Σ_{j=1}^{k} (p_j − µ_j)

(ii) σ̂_i^{k+1} = σ √(1 − kθ² / (kθ + 1 − θ))

All the proofs are given in the Appendix. Given the learning and the evolution of knowledge outlined in Proposition 1, Theorem 1 presents the optimal testing policy for the base case.

Theorem 1 Index the design configurations in descending order of their a priori expected performances, i.e., i > j ⇒ µ_i ≤ µ_j. Define χ_i^{k+1} = max_{j≤k} {p_j} − (µ_i + [θ/(kθ + 1 − θ)] Σ_{j=1}^{k} (p_j − µ_j)). Then, there exist constant thresholds Θ_i (i = 1, ..., N) such that after the completion of the kth configuration test, stop if and only if χ_i^{k+1} ≥ Θ_{k+1} for all i > k; else test the design configuration with the largest expected performance.
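As a quick numerical sanity check (ours, not part of the paper), the closed-form update in Proposition 1 can be compared against direct conditioning on the equicorrelated covariance matrix of §3.1; the same snippet also evaluates the statistic χ used in Theorem 1. All parameter values below are arbitrary illustrations.

```python
# Sketch (ours): Proposition 1's closed-form posterior vs. direct conditioning,
# plus the Theorem 1 statistic chi for the remaining untested design.
import numpy as np

def prop1_update(mu_i, mu_tested, p_tested, sigma, theta):
    """Posterior mean/std of an untested design under identical correlations."""
    k = len(p_tested)
    w = theta / (k * theta + 1 - theta)
    mean = mu_i + w * np.sum(np.asarray(p_tested) - np.asarray(mu_tested))
    std = sigma * np.sqrt(1 - k * theta ** 2 / (k * theta + 1 - theta))
    return mean, std

sigma, theta = 1.5, 0.4                       # arbitrary illustrative values
mu = np.array([0.3, 0.2, 0.1, 0.0])           # designs ordered by prior mean
p = np.array([0.8, -0.2, 0.5])                # outcomes of the first three tests

# Direct conditioning on the equicorrelated covariance matrix.
Sigma = sigma ** 2 * (theta * np.ones((4, 4)) + (1 - theta) * np.eye(4))
r, Q = Sigma[3, :3], Sigma[:3, :3]
mean_direct = mu[3] + r @ np.linalg.solve(Q, p - mu[:3])
std_direct = np.sqrt(Sigma[3, 3] - r @ np.linalg.solve(Q, r))

mean_cf, std_cf = prop1_update(mu[3], mu[:3], p, sigma, theta)
print(np.allclose([mean_cf, std_cf], [mean_direct, std_direct]))   # True

# Theorem 1: stop if chi = max(tested outcomes) - posterior mean of the next
# design exceeds the constant threshold (which the theorem leaves implicit).
chi = np.max(p) - mean_cf
print(chi)
```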

An interesting insight emerges from the following interpretation of the optimal policy. The term −χ_i^{k+1} (= µ̂_i^{k+1} − max_{j≤k} {p_j}) expresses a normalized value that accounts for the remaining potential from testing, i.e., the difference between the expected performance of the next configuration and the best performance achieved so far. The term −Θ_{k+1} is a constant threshold. Thus, our policy has an intuitive economic interpretation: the design team terminates experimentation when the remaining potential drops below a threshold value (i.e., −χ_i^{k+1} ≤ −Θ_{k+1}). If testing continues, the configuration with the largest remaining potential (or alternatively the design with the largest mean) is tested next. Our optimal policy exhibits certain structural similarities with the Weitzman (1979) policy, adapted for our specific distributional assumptions to enable a meaningful comparison. Θ_{k+1} represents a normalized value that makes management indifferent between proceeding or stopping, a quantity very similar to the classic definition of a reservation price. Reinterpreting χ_i^{k+1} (= p − µ̂_i^{k+1}) as the normalized value obtained so far (normalized by accounting for the mean) allows us to also see the parallel between our policy and Weitzman’s (1979) policy: testing stops when the normalized value χ_i^{k+1} obtained from testing exceeds a reservation price. Our choice rule, i.e., which configuration to test next, is the same as Weitzman’s rule. We can verify the claim as follows: consider two search opportunities X ∼ N(µ_x, σ) and Y ∼ N(µ_y, σ), and testing costs c_x = c and c_y = c. The difference between the Weitzman reservation prices of X and Y is equal to the difference between their means, i.e., WRP(X) − WRP(Y) = µ_x − µ_y. Hence, the priority ranking induced by the reservation prices is identical to the one induced by the means, and Weitzman’s (1979) adapted policy translates into testing the design with the highest mean. However, our stopping rule, i.e., when experimentation should be terminated, exhibits key differences

6. Let WRP(X) = R_x and WRP(Y) = R_y. Then by definition, 0 = E[max{X, R_x}] − R_x − c and 0 = E[max{Y, R_y}] − R_y − c. Since X and Y are different only in their means, the distribution of X is identical to the distribution of (Y − µ_y + µ_x). Hence, 0 = E[max{X, R_x}] − R_x − c = E[max{Y − µ_y + µ_x, R_x}] − R_x − c = E[max{Y, R_x + µ_y − µ_x}] − (R_x + µ_y − µ_x) − c. That is, the reservation price of Y is equal to R_x + µ_y − µ_x, i.e., R_y = R_x + µ_y − µ_x, which implies R_x − R_y = µ_x − µ_y.


from Weitzman’s rule. We rewrite our policy statement in Corollary 1 to enable easier comparison to Weitzman’s (1979) stopping conditions.

Corollary 1 Index the design configurations in descending order of the a priori performance expectations, i.e., i > j ⇒ µ_i ≤ µ_j. Define Ω_{k+1} = µ_{k+1} − [θ/(kθ + 1 − θ)] µ_k + [θ/(kθ + 1 − θ)] Σ_{i=1}^{k−1} (p_i − µ_i) + Θ_{k+1}, and λ_k = θ/(kθ + 1 − θ). After the kth test, optimally continue if and only if p_k ∈ ( (max_{i≤k−1} {p_i} − Ω_k)/λ_k , Ω_k/(1 − λ_k) ). Otherwise terminate testing.

Corollary 1 shows that accounting for performance dependencies changes the stopping rule. Continuation is optimal only when the current test outcome is within an interval. This result stands in contrast to Weitzman (1979), where continuation occurs only when the test outcome is below a threshold. Thus, the structure illustrated in Figure 1 generalizes. The key difference between the two policies is driven by the ability to learn during the experimentation process. Low test outcomes signal low future performances and, hence, decrease the attractiveness of additional experimentation. High design performances, on the other hand, signal a high remaining potential making additional experimentation worthwhile. Our results add to the discussion initiated by Loch et al. (2001) concerning the impact of learning on experimentation strategies. Our stopping conditions emerge endogenously, and they involve balancing the performance benefits and learning against experimentation costs. The results also demonstrate that the stopping conditions depend on the design space structure and the realized performance levels as opposed to an exogenously pre-specified confidence level. In Proposition 2, we characterize the properties of the stopping thresholds Θ_k. They qualitatively describe the optimal test termination decisions of the design team.

Proposition 2 Define WRP_k = {x : E[max{x, X_k}] − x = c} where X_k ∼ N(0, σ²(1 − kθ²/(kθ + 1 − θ))). WRP_k expresses the Weitzman stopping threshold for search opportunity X_k. Then,
(A) Θ_k is a real constant, and is decreasing in k.
(B) Θ_k is increasing in σ.
(C) Θ_k is decreasing in θ.
(D) Θ_k ≥ WRP_k.
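The benchmark WRP_k in claim (D) is easy to compute numerically: it is the Weitzman threshold for a zero-mean normal variable whose variance is the residual uncertainty after k tests (Proposition 1). The snippet below is our illustration with arbitrary parameter values; it tabulates WRP_k for a few k and shows it shrinking as the residual uncertainty falls.

```python
# Sketch (ours): the Weitzman benchmark WRP_k of Proposition 2 for a few k.
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def wrp_threshold(sigma_k, c):
    """The x solving E[max{x, X}] - x = c for X ~ N(0, sigma_k)."""
    f = lambda x: sigma_k * norm.pdf(x / sigma_k) - x * (1 - norm.cdf(x / sigma_k)) - c
    return brentq(f, -10 * sigma_k, 10 * sigma_k)

sigma, theta, c = 1.0, 0.4, 0.1               # arbitrary illustrative values
for k in range(5):
    resid_std = sigma * np.sqrt(1 - k * theta ** 2 / (k * theta + 1 - theta))
    print(k, round(wrp_threshold(resid_std, c), 3))   # WRP_k decreases in k
```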

Claim (A) characterizes the normalized stopping thresholds, Θ_k, as decreasing in the number of tests conducted. This occurs for two reasons: (i) fewer design configurations remain to be tested, and hence the learning possibilities are fewer, and (ii) the residual uncertainty is lower. Claims (B) & (C) demonstrate the effect of the residual uncertainty on the optimal stopping values. Higher residual uncertainty may arise from fewer commonalities between design configurations (lower θ), greater complexity of the performance function (lower θ), or greater a priori uncertainty concerning the configuration performances (higher σ). Irrespective of the source, higher residual uncertainty makes testing more attractive by increasing the normalized reservation price. The result resembles the value-enhancing nature of uncertainty (variance) in a traditional option setting: since the design team is assured of retaining the highest performing design configuration upon stopping, the downside loss of conducting one more test is bounded by the testing cost. At the same time, the upside benefits from experimentation depend on the amount by which the performance can vary upward, which increases with higher residual uncertainty (variance). Claim (D) demonstrates that the stopping thresholds in the optimal policy may be larger than Weitzman’s optimal stopping thresholds WRP_k, which do not consider design correlations and learning. This result illustrates the relevance of learning in experimentation and highlights one of our key contributions. As opposed to myopic stopping proposed in past studies (Weitzman, 1979; Loch et al., 2001), we find that testing may continue to provide information about the design space. Our results stand in contrast to Adam (2001), who examines the search problem where the experimenter has priors about the distributions of the search alternatives and updates these priors as the search proceeds. However, he assumes that “learning exhibits enough monotonicity to prevent the searcher from optimally


going through a ‘payoff valley’ to reach a ‘payoff-mountain’ later on” (Adam, 2001, p. 264). Under this assumption, an extension of Weitzman’s (1979) policy, where stopping is still myopic, remains optimal. In contrast, we do not constrain the form of learning, and we find that optimal stopping becomes non-myopic. To summarize, experimentation offers two benefits: (i) the direct benefit of identifying configurations with higher performance (a key goal identified in the past NPD literature), and (ii) the indirect benefit from learning about the untested design configurations. Because of the second benefit, the design team may keep experimenting for the purpose of learning about the design space even when the remaining untested alternatives by themselves do not justify experimentation. We illustrate the result with a modified version of the introductory numerical example. Consider the parameters:

µA = µB = −0.91,  σA = σB = 1,  CA = CB = 0.1,  θ = 0.1

The expected one-step benefit from testing A (= E[max{0, A}] − C_A) is 0. Thus, the myopic stopping rule (Weitzman, 1979; Adam, 2001) dictates immediate termination without testing any of the configurations. However, we can learn about the performance of configuration B from analyzing the test result of configuration A (since θ = 0.1). Hence, testing A and obtaining a test result a prompts the design team to optimally test B if and only if the reservation price of B, 0.89 + (−0.91) + 0.1(a − (−0.91)), is greater than the maximum obtained so far, max{a, 0}. Hence, testing B is optimal if and only if a ∈ (−0.71, 0.079). Since there is a non-zero probability of testing B, this implies that the optimal policy involves testing A and then testing B if and only if a ∈ (−0.71, 0.079). Our result exhibits properties of ‘strong learning’ (Adam, 2001), that is, experimentation for the purpose of learning. Figure 2 illustrates this phenomenon. It plots the probability of continuation as a function of the number of tests conducted, for different levels of design performance dependencies.

7. The argument is identical to that in §1.
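The numbers in this example can be checked with the same reservation-price computation used for the introductory example. The short sketch below (ours, with the helper repeated so the snippet is self-contained) reproduces the near-zero one-step benefit of testing A and a continuation interval close to the (−0.71, 0.079) reported in the text; the small difference comes from rounding the conditional reservation price to 0.89 in the text.

```python
# Sketch (ours): checking the modified example (mu_A = mu_B = -0.91, sigma = 1,
# C = 0.1, theta = 0.1) that illustrates 'strong learning'.
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def wrp(mu, sigma, c):
    """Weitzman reservation price for X ~ N(mu, sigma) with test cost c."""
    f = lambda x: (mu - x) * (1 - norm.cdf((x - mu) / sigma)) \
                  + sigma * norm.pdf((x - mu) / sigma) - c
    return brentq(f, mu - 10 * sigma, mu + 10 * sigma)

mu_A = mu_B = -0.91
sigma, c, theta = 1.0, 0.1, 0.1

# One-step benefit of testing A: E[max{0, A}] - C, essentially zero,
# so the myopic rule would stop without testing anything.
one_step = mu_A * (1 - norm.cdf(-mu_A / sigma)) + sigma * norm.pdf(mu_A / sigma) - c
print(round(one_step, 3))

# After observing A = a, test B iff the conditional reservation price of B
# exceeds max(a, 0); this holds on an interval around zero.
def test_B(a):
    cond_mu = mu_B + theta * (a - mu_A)
    cond_sigma = sigma * np.sqrt(1 - theta ** 2)
    return wrp(cond_mu, cond_sigma, c) > max(a, 0.0)

print([test_B(a) for a in (-0.8, -0.6, 0.05, 0.12)])   # [False, True, True, False]
```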


[Figure 2: Learning Effects Arising from Performance Dependencies. The figure plots the probability of continuation (vertical axis) against the number of tests conducted, 1 through 6 (horizontal axis), for θ = 0.0, 0.1, and 0.4, with the exploration and exploitation phases marked for θ = 0.1 and θ = 0.4.]

We observe that the dependencies across configurations (a higher correlation θ) lower the likelihood of early termination, but increase the probability of later termination. Define the case with no dependencies (i.e., θ = 0 or no learning) as the ‘benchmark curve’. Then, we may call the phase with a continuation probability greater than in the benchmark setting the exploration phase, since in this phase tests are conducted for the purpose of learning about the design space. Similarly, we call the phase with a lower continuation probability the exploitation phase, where continuation, if pursued, is for the explicit purpose of obtaining a better performance. The precise transition from ‘exploration’ to ‘exploitation’ would be difficult to measure in practice, as the parameters are never precisely known. Still, an important qualitative insight for managers emerges: early on, conduct more ‘design-build-test’ iterations than what the remaining alternatives seem

8. Figure 2 was generated for the specific problem instance with N = 8, σ² = 80, c = 30, µ_i = 140 for all i = 1, ..., 8, and θ ∈ {0, 0.1, 0.4}. For each parameter setting of θ, 10,000 simulation experiments were conducted and the probability of continuation after the kth test was calculated as the fraction of experiments where the optimal policy proceeds beyond the kth test.


to justify, in order to learn. Later on, conduct ‘design-build-test’ iterations only to exploit the remaining promising alternatives. Our insight offers an explanation of Ward et al.’s (1995) and Sobek et al.’s (1999) earlier observations about Toyota’s product development process, where a large number of different design options are explored upfront before converging to a final solution. Toyota follows this approach “to explore broad regions of the design space simultaneously” (Ward et al., 1995, p. 49) and to “refine [their] map of the design space before making decisions” (Ward et al., 1995, p. 56). While the mechanism of experimentation we examine in this article is different (sequential as opposed to simultaneous testing), the underlying driver of the exploration phase is similar, namely, the potential to learn about the design space. The following additional insights can be drawn from Figure 2: the likelihood of continuation during the initial tests and the duration of the exploration phase decrease in the complexity of the performance function and the distance between designs. This occurs for the following reason: the test outcomes hold greater information in design spaces where the complexity of the performance function is lower and/or where the distance between designs is lower. However, intensive testing continues for a shorter period of time as learning accumulates after fewer tests. In order to further build intuition about the drivers of the total number of tests, we perform comparative statics with respect to the design space characteristics. It is important, from a practitioner perspective, to outline such contingency-based guidelines (Loch et al., 2006). For ease of exposition, assume that the performance expectations of all the design configurations are identical (= µ). Let N(µ, σ) be the total number of tests conducted under the premises of the optimal policy, when the expected configuration performance is µ and the uncertainty is σ².

Proposition 3
(i) The number of tests increases in the initial expectations of design performance, i.e., ∂N(µ, σ)/∂µ ≥ 0.
(ii) The number of tests increases in the initial design performance uncertainty, i.e., ∂N(µ, σ)/∂σ ≥ 0.


Proposition 3 states formally that more experiments should be conducted on a design space where configurations have higher a priori performance expectations and higher performance uncertainty. These results confirm intuition, but they emerge for different reasons. High initial expectations lead to the design team foreseeing a higher potential value (as expressed by the −χ_i^k term), and thus result in additional testing. In contrast, higher initial performance uncertainty leads to an increase in the stopping threshold (Θ_k), and leads to increased experimentation so as to reduce the (residual) uncertainty. Next, we examine the impact of performance correlations (i.e., similarities between design configurations and the complexity of the performance function) on the number of experiments. The effect is not uni-directional. It depends on the performance realizations of the initially tested designs. Therefore, we perform a contingency analysis: we assume that k configurations have been tested and have yielded performances p_1, ..., p_k. The following definitions describe different contingency scenarios.

Definition 1 Suppose k design configurations have been tested yielding the following performances: p_1, ..., p_k. Then, we define the scenario (test outcomes) as:
• worse-than-expected if the average configuration performance realized is lower than the average expected one, i.e., (1/k) Σ_{i=1}^{k} p_i < (1/k) Σ_{i=1}^{k} µ_i;
• better-than-expected if the average configuration performance realized is higher than the average expected one, i.e., (1/k) Σ_{i=1}^{k} p_i > (1/k) Σ_{i=1}^{k} µ_i.
Given these two contingencies, Proposition 4 describes the optimal actions:

Proposition 4 Assume k configurations have been tested. (A) If the test outcomes are worse-than-expected, the design team should transition to exploitation phase when the design performances are strongly correlated (high θ). (B) If the test outcomes are better-than-expected, there is no uni-directional result. 18

(C) If the test outcomes are better-than-expected with a sufficiently high difference, the design team should continue exploration phase when the design performances are strongly correlated (high θ).

Proposition 4 highlights the subtle moderating role of the performance correlations (due to the design space structure) on optimal contingency actions. Suppose that the test outcomes are, on average, poorer than initially expected. The information that these test results convey about the rest of the configurations depends on the underlying structure of the design space. When the performance correlations are greater, reflecting similar design configurations and a simple performance function, the initial test outcomes are more informative about the performances of the remaining configurations. Hence, the design team extrapolates the initial poor performances to the rest of the design space, and, consequently, transitions to the exploitation mode (i.e., increased likelihood of termination). On the other hand, if the performance correlations are weak, the initial poor test outcomes are not necessarily indicative of the remaining design performances. Consequently, the team may continue exploration even when worse-than-expected test outcomes are encountered at the early stages. Claims (B) and (C) follow a similar (but more complex) line of reasoning. If the test results are betterthan-expected, the design team uses the initial test outcomes as a signal that the remaining designs have higher than the (initially) expected performances. This signal is stronger when the design performances are more correlated, and, thus, results in continued exploration. At the same time, greater dependencies also result in faster learning (rapid uncertainty reduction) and an earlier transition to exploitation phase. Which of these opposing effects is stronger depends on the specific parameter values. When the realized performance levels are much better than expected, the effect of the ‘high performance potential’ dominates, leading to continued exploration. Table 1 summarizes the potential managerial contingency actions: After testing a few designs, the design team assesses the actual performance realizations and their difference from the initial expectations. If the test outcomes are below the a priori expectations, then the design team assesses that a worse-than-expected 19

High Correlation θ

(

Simple Performance Function Less Distance between Designs

)

Low Correlation θ

(

Complex Performance Function Greater Distance between Designs

)

Much better than expected

Worse than expected

(Very good news)

(Bad news)

Continue Exploration

Transition to Exploitation

Increased Continuation Probability

Lower Continuation Probability

Transition to Exploitation

Continue Exploration

Lower Continuation Probability

Increased Continuation Probability

Table 1: Impact of design-space structure on the contingency plan scenario has emerged. In the event that performance function is simple and/or the distance between design configurations is lesser, such initial bad news implies that there is not much more to expect from the design space at hand, thus the team should most probably attempt to reduce the remaining experimentation cost by transitioning to exploitation phase. If, on the other hand, the performance function is complex and/or the distance between design configurations is greater, then targeting reduction of the remaining experimentation costs is probably misplaced, as the initial bad news is not indicative of the remainder of the design space. Similarly, if test outcomes emerge significantly better than expected, then the design team should plan on further exploring the design space due to the high remaining potential. As before, this continued exploration is more likely when the initial good news is more indicative of the design space (i.e., when the performance function is simple and/or when the distance between designs is lesser). Table 1’s importance stems from the fact that it outlines guidelines for decision making and it demonstrates the non-trivial impact of the design space structure on the optimal test schedule. In that light, the contingency actions outlined may also be viewed as guidelines to achieve more reliable project monitoring and control (Loch et al., 2006).

3.2

Uncertain Performance Dependencies

Thus far, we have implicitly assumed that the design team can at least estimate the performance correlations between the design configurations. However, when the designs involve highly innovative technologies, the 20

performance correlations arising from the configuration similarities may be unknown. In such cases, we assume that the design team specifies a likelihood for the performance correlations. In this section, we analyze the impact of uncertainty in performance correlations on optimal experimentation.9 The design team holds the belief that the performance correlations are weak (low θL ) with probability p or strong (high θH ) with probability (1 − p). The sequential experiments yield an additional signal S , in addition to the design performance result. We assume that the signal Sk received after the kth test reveals the true correlation with probability λ, and it is independent from past performances p1 , .., pk . In other words, understanding the true performance correlations require additional information, such as exploring relevant scientific theories, and formulating abstract explanatory models of the design space. These signals are far more general than the performance outcome information; therefore, we assume that the dependence between Sk and the past performances is negligible. Theorem 2 presents the optimal policy. Theorem 2 If the uncertainty regarding the performance correlations has been resolved, then use Theorem 1. Otherwise, index the design configurations in the descending order of their a priori expected performance, © ª i.e., i > j ⇒ µi ≤ µ j . Define pk+1 max = max p j as the highest realized configuration performance. There exist j≤k

path dependent thresholds, Γk+1 (∑kj=1 p j ) (i = 1, .., N) such that after the kth test the design team should stop if and only if pk+1 max ≥ Γk+1 . Else they should proceed with the design configuration with the highest updated expected performance.

The optimal policy shows marked similarity in structure to Theorem 1. However, the stopping thresholds are no longer path independent, and they depend on the sum of the realized configuration performances. Mirroring Corollary 1, the following corollary gives an alternative formulation of the optimal policy. Corollary 2 There exists thresholds τl and τh such that there is an optimal policy where after the kth test termination is optimal if pk ≤ τl or pk ≥ τh . 9

We thank the referee team for suggesting this extension.

21

Corollary 2 suggests a surprising robustness of our initial results. Termination of the experimentation process occurs either because: (i) the test outcome is high enough so that the design team does not expect to receive a much higher value from additional experimentation (pk ≥ τh ); or (ii) the test outcome is so low that the design team expects the remaining designs to be low performing as well (pk ≤ τl ).

3.3

General Case: Multiple Design Classes

In this section, we extend our base case model to more general design space topographies. We consider that there exist distinct subclasses with stronger configuration dependencies within classes and weaker across. The new setup represents design problems that can be solved through substantially different approaches (e.g. recent efforts to substitute bar-codes with RFID tags imprinted on different materials like paper, polymers, etc; see PRC, 2004). Consider M distinct classes of design configurations. Each class j contains N j different configurations. A priori, the design configurations of class j have unknown performances with identical10 expectations (µ j ) and performance uncertainties ((σ j )2 ). In addition, the attributes that define a class (for instance the underlying technology) lead to significantly correlated performances within a class (θ = θ j within a class), and negligible dependencies across classes (θ = 0 across classes). Suppose the design team has sequentially tested k design configurations, Theorem 3 provides the dynamic optimal policy for the new setting. Theorem 3 Define the surplus performance observed thus far as the following normalization, i.e. χk+1 = j · ¸ © ª max p j − µˆ k+1 . Then, there exist constants Θij j = 1, ..., M, i = 1, .., N j such that after the kth test the j j≤k

k

design team should stop if and only if χk+1 ≥ Θ j j for all design classes, else continue with the class that has j k

the highest Θ j j − χk+1 j . 10

We restrict the performance expectations to be identical within a class to reduce the notational burden. Extensions to a more

general setting are straightforward.

22

Theorem 3 demonstrates that the stopping decision (‘how many more experiments should we undertake’) is relatively robust and is structurally similar to Theorem 1. Our policy permits an interesting analogy to Weitzman (1979), if we view the design subclasses as Weitzman’s boxes. Under this interpretation, the optimal decision is equivalent to terminating the test cycle only if the best performance obtained so far is greater than the (updated) reservation prices and continuing with the box which has the highest (updated) reservation price. However, note that this analogy can only be taken so far, since in our multiple class setting, repeated sampling of a box (design class) is feasible, unlike in the one-trial-per-box case analyzed by Weitzman. Also, similar to Corollary 1, it can be verified that the design team should stop testing the designs within a given subclass when an observed outcome from the subclass is either too high or too low (i.e., continuation only when the outcome is within an interval). Furthermore, the overall test cycle is terminated when any of the observed test outcomes is very high, since this would indicate that the likelihood of obtaining a still higher performance from any of the design classes is low, given the experimentation cost.

4

Discussion

This study examines the learning aspect of iterative testing cycles during the experimentation stage of a NPD process. We conceptualize each design configuration as a vector of attributes. These attributes jointly affect the final performance of each configuration in complex ways, due to possible coupling effects that are not fully known to the design team. Designs also may exhibit similarities with respect to their attributes, thus making some designs closer to others in the design space. Past experience and domain knowledge may allow the design team to approximate such a design space structure through a surrogate metric that accounts for both the complexity of the performance function and the distance between the designs: the correlation between design performances. We develop a normative model and we derive the optimal dynamic testing policy for design configurations that have different and uncertain performance levels. 23

Test continuation is optimal when the previous test outcomes range between two thresholds. Test outcomes below the lower threshold indicate an overall low-performing design space, where subsequent tests are unprofitable. Test outcomes above the upper threshold, on the other hand, indicate that the design team cannot expect a much higher value (than the highest value already found) by undertaking additional experiments. The upper threshold result is consistent with the optimal policy of Weitzman (1979). Furthermore, we also demonstrate that the design team may optimally pursues experimentation even after a ‘good’ design is found, which, in the absence of performance correlations, would justify stopping. The potential to learn about the design space adds value to the testing process above and beyond the immediate performance benefits. These results help us understand the role of the design space structure on learning and on optimal experimentation: less experiments with cost savings, when the information suggests that the untested design configurations may be performing low; or additional experimentation seeking higher performance outcomes, when the information indicates untapped potential. Furthermore, we demonstrate that the consideration of learning splits the overall experimentation process into two distinct phases. An exploration phase that occurs at the beginning of the testing process and is characterized by a strong tendency to continue testing even when outcomes are good. During this phase, the ‘design-build-test-analyze’ iterations allow the design team to learn about the remaining design options. We also find that greater correlations among the design performances enable faster learning and thus reduce the length of the exploration phase. Once the design team gains sufficient understanding about the design space, experimentation transitions into an exploitation phase. At that point, the likelihood of terminating testing and choosing the best design (found so far) increases. During this phase, the design team, based on what they have learned, conducts the ‘design-build-test-analyze’ iterations only to test the remaining promising alternatives. Finally, we analyze how the optimal continuation decisions change with the initial test outcomes. Our

24

findings are summarized in Table 1, and they provide a useful guideline for contingency planning. When the performance correlations of designs are high, poor test outcomes at the initial stages imply that there is not much more to expect from the remaining design space. Thus, the team should stop or transition to the exploitation phase to minimize the experimentation cost. If, on the other hand, the design correlations are low, then extrapolating from initial poor results is premature, and exploration may continue. Our policy exhibits a robust structure along two extensions: (i) the design team uncertainty about the performance dependencies (i.e. unknown correlation), and (ii) non-homogeneous performance dependencies (i.e. a design space with distinct subclasses). Limitations & Future Research In our analytical model, we employed a set of assumptions in order to develop a qualitative understanding of the impact of the design space structure on learning and on the optimal amount of experimentation. However, three of these assumptions challenge the mathematical generality of our model and thus merit further discussion: [A1] Normal distribution of design performances; [A2] Equal testing cost within a class; and [A3] Equal variance (uncertainty) in design performances within a class. Theorem 4 offers an impossibility result that demonstrates that the three key assumptions (listed above) are not merely made for convenience, but are, in fact, necessary to achieve mathematical tractability. Theorem 4 Index policies are sub-optimal if any one of the three assumptions, A1, A2, or A3 is relaxed to account for arbitrary structures. Theorem 4 demonstrates that the three key assumptions do indeed place our model at the boundaries of mathematical tractability (defined as the existence of an index policy). Despite of this inherent limit of generalizability, the model does offer an increased understanding of the key implications of design space structure on optimal experimentation. We consider our balance between realism and mathematical tractability to be a fair price to pay for developing this understanding. Loch et al. (2001) highlight the effect of problem parameters, such as test fidelity and the delay costs, on the total experimentation cost. Dahan & 25

Dahan & Mendelson (2001) examine the tradeoff between performance benefits and experimentation cost when experiments are run in parallel (thus eliminating any potential for learning). Our approach complements these past efforts by explicitly accounting for the design space structure and its effects on learning and on the amount of experimentation. Finally, the limitations of the current conceptualization of optimal experimentation, together with some emerging industry practices, suggest a need to reevaluate the search-theoretic model as a whole. Firms increasingly seek innovation through collaborative efforts; such settings present challenges in understanding how firms manage distributed (or delegated) experimentation (Terwiesch & Loch, 2004). At the same time, it is imperative to recognize that experimentation and search are performed by individuals and, hence, are subject to behavioral traits (see Zwick et al., 2003, for a discussion of how certain behavioral elements affect sequential choices). Such extensions would require further field observation of how actual testing teams arrive at decisions and judgments during experimentation cycles.

References

Adam, K. 2001. Learning while searching for the best alternative. Journal of Economic Theory, 101, 252-280.
Allen, T. J. 1977. Managing the flow of technology. Cambridge, MA: MIT Press.
Clark, K. B. 1989. Project scope and project performance: The effects of parts and supplier strategy in product development. Management Science, 35(10), 1247-1263.
Clark, K. B., Fujimoto, T. 1989. Lead-time in automobile development: Explaining the Japanese advantage. Journal of Technology and Engineering Management, 6, 25-58.
Cusumano, M., Selby, R. 1995. Microsoft secrets. New York, NY: The Free Press.
Dahan, E., Mendelson, H. 2001. An extreme value model of concept testing. Management Science, 47, 102-116.
DeGroot, M. H. 1968. Some problems of optimal stopping. Journal of the Royal Statistical Society, Series B, 30, 108-122.


Ethiraj, S. K., Levinthal, D. 2004. Modularity and innovation in complex systems. Management Science, 50, 159-173.
Johnson, R. A., Wichern, D. W. 2002. Applied multivariate statistical analysis. Upper Saddle River, NJ: Prentice Hall, 5th edition.
Kauffman, S. A. 1993. The origins of order: Self-organization and selection in evolution. New York & Oxford: Oxford University Press.
Kavadias, S., Loch, C. 2003. Dynamic prioritization of projects at a scarce resource. Production and Operations Management, 12, 433-444.
Loch, C., DeMeyer, A., Pich, M. 2006. Managing the unknown: A new approach to managing high uncertainty and risk in projects. London, UK: John Wiley & Sons Inc.
Loch, C., Terwiesch, C. 2007. Coordination and information exchange. Chap. 11 in Handbook of New Product Development Management, Loch, C. H. and Kavadias, S. (eds.). Oxford, UK: Elsevier Butterworth-Heinemann.
Loch, C. H., Terwiesch, C., Thomke, S. 2001. Parallel and sequential testing of design alternatives. Management Science, 45, 663-678.
PRC 2004. Microsystems Packaging Research Center: Leading the SOP and nano paradigms in partnership with global industries. Georgia Tech Packaging Research Center, Atlanta, GA.
Rencher, A. 2002. Methods in multivariate analysis. New York, NY: Wiley Series in Probability and Statistics.
Simon, H. A. 1969. The sciences of the artificial. Cambridge, MA: MIT Press, 1st edition.
Sobek, D., Ward, A., Liker, J. 1999. Toyota's principles of set-based concurrent engineering. Sloan Management Review, 40, 67-83.
Stadler, P. F. 1992. Correlation in landscapes of combinatorial optimization problems. European Physical Letters, 20, 479-482.
Terwiesch, C., Loch, C. H. 2004. Collaborative prototyping and pricing of customized products. Management Science, 50, 145-158.
Thomke, S., Holzner, M., Gholami, T. 1999. The crash. Scientific American, March, 92-97.
Thomke, S. H. 1998. Managing experimentation in the design of new products. Management Science, 44, 743-762.

Thomke, S. H. 2003. Experimentation matters: Unlocking the potential of new technologies for innovation. Cambridge, MA: Harvard Business School Press.
Ward, A., Liker, J., Cristiano, J., Sobek, D. 1995. The second Toyota paradox: How delaying decisions can make better cars faster. Sloan Management Review, 36, 43-61.
Weitzman, M. L. 1979. Optimal search for the best alternative. Econometrica, 47, 641-654.
Zwick, R., Rapoport, A., Lo, A. K. C., MuthuKrishnan, A. V. 2003. Consumer sequential search: Not enough or too much. Marketing Science, 22, 503-519.


Appendix

Proof of Proposition 1

Define

$$R_n = \begin{pmatrix} 1 & r & \cdots & r \\ r & 1 & \cdots & r \\ \vdots & \vdots & \ddots & \vdots \\ r & r & \cdots & 1 \end{pmatrix}_{n \times n}.$$

Then it can be shown that

$$R_n^{-1} = \frac{1}{|R_n|}\begin{pmatrix} |R_{n-1}| & -r(1-r)^{n-2} & \cdots & -r(1-r)^{n-2} \\ -r(1-r)^{n-2} & |R_{n-1}| & \cdots & -r(1-r)^{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ -r(1-r)^{n-2} & -r(1-r)^{n-2} & \cdots & |R_{n-1}| \end{pmatrix}_{n \times n}$$

and that $|R_n| = (1-r)^{n-1}(nr + 1 - r)$.

Using the results outlined above, and equations (1) and (2) from §2, it can be shown that the updating rules are equivalent to the equations given in the proposition.
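As a quick numerical sanity check (a sketch, not part of the proof), the closed forms for $R_n^{-1}$ and $|R_n|$ can be verified against direct computation for arbitrary $n$ and $r$:

```python
# Sketch: numerically verify the stated inverse and determinant of the
# n-by-n equicorrelation matrix R_n (1 on the diagonal, r elsewhere).
import numpy as np

def check_Rn_identities(n=5, r=0.3):
    Rn = (1 - r) * np.eye(n) + r * np.ones((n, n))
    Rn_minus_1 = (1 - r) * np.eye(n - 1) + r * np.ones((n - 1, n - 1))

    det_formula = (1 - r) ** (n - 1) * (n * r + 1 - r)           # |R_n|
    # Claimed inverse: |R_{n-1}| on the diagonal, -r(1-r)^(n-2) off it, all over |R_n|
    off_diag = -r * (1 - r) ** (n - 2)
    inv_formula = (np.linalg.det(Rn_minus_1) * np.eye(n)
                   + off_diag * (np.ones((n, n)) - np.eye(n))) / det_formula

    assert np.isclose(np.linalg.det(Rn), det_formula)
    assert np.allclose(np.linalg.inv(Rn), inv_formula)
    return True

print(check_Rn_identities(n=7, r=0.4))  # True if both identities hold
```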

Proof of Theorem 1

Let $V_k(p, \bar{\mu}_k)$ be the value function after testing $k-1$ design configurations, where $p$ is the maximum performance seen so far and $\bar{\mu}_k$ is the $1 \times (N-k+1)$ row vector of updated mean performances of the untested designs ($\bar{\mu}_k = [\mu^k_k, \ldots, \mu^N_k]$). This is sufficient information, since the mean performance already incorporates all relevant information about the individual test results. First we prove some structural properties of the value function $V_k(\cdot,\cdot)$; using these properties we then derive the optimal policy.

Lemma 1 The value function satisfies $V_k(p, \bar{\mu}_k) = V_k(p - \alpha, \bar{\mu}_k - \alpha \cdot \mathbf{1}) + \alpha$ for every $\alpha \in \mathbb{R}$.

Proof. Consider the auxiliary search problem obtained by subtracting a constant quantity $\alpha$ from the mean and maximum observed performances. Apply the policy that is optimal for the original problem to the auxiliary problem (this is always possible since, by the finite mean and variance assumption, the original problem has an optimal policy). Regardless of where we stop, the payoffs from the auxiliary problem are $\alpha$ less than the optimal payoffs from the original problem; that is, $V_k(p - \alpha, \bar{\mu}_k - \alpha \cdot \mathbf{1}) \geq V_k(p, \bar{\mu}_k) - \alpha$. Similarly, if we use the policy that is optimal for the auxiliary problem in the original problem, then, regardless of where we stop, the payoffs from the auxiliary problem are $\alpha$ less than the corresponding payoffs from the original problem; that is, $V_k(p - \alpha, \bar{\mu}_k - \alpha \cdot \mathbf{1}) \leq V_k(p, \bar{\mu}_k) - \alpha$. Combining the two inequalities yields the statement of the lemma.

Lemma 2 Define $V_k(p, \bar{\mu}_k) = \Omega_k(p, \bar{\mu}_k) + p$. Then $\Omega_k(p, \bar{\mu}_k) = \Omega_k(p - \alpha, \bar{\mu}_k - \alpha \cdot \mathbf{1})$.

Proof. From Lemma 1 we have $V_k(p, \bar{\mu}_k) = V_k(p - \alpha, \bar{\mu}_k - \alpha \cdot \mathbf{1}) + \alpha$. Substituting the definition $V_k = \Omega_k + p$ on both sides gives $\Omega_k(p, \bar{\mu}_k) + p = \Omega_k(p - \alpha, \bar{\mu}_k - \alpha \cdot \mathbf{1}) + (p - \alpha) + \alpha$, and hence $\Omega_k(p, \bar{\mu}_k) = \Omega_k(p - \alpha, \bar{\mu}_k - \alpha \cdot \mathbf{1})$.

Lemma 3 Define $V_k(p, \bar{\mu}_k) = \Omega_k(p, \bar{\mu}_k) + p$. Then $\Omega_k(\cdot,\cdot)$ satisfies the functional equation (6) derived below.[11]

Proof. From the Bellman equation of our dynamic decision process we obtain

$$V_k(p, \bar{\mu}_k) = p \vee \max_{j \geq k}\left( E\left[V_{k+1}(p \vee p^j, \bar{\mu}_{k+1})\right] - c \right), \qquad \text{where } p = \max_{j \leq k-1} p^j.$$

Suppose that, if we undertake further testing, we test design $j$, i.e., $j$ maximizes the expected future value: $E\left[V_{k+1}(p \vee p^j_k, \bar{\mu}_{k+1})\right] = \max_{i \geq k} E\left[V_{k+1}(p \vee p^i_k, \bar{\mu}_{k+1})\right]$. The Bellman equation becomes

$$\begin{aligned}
V_k(p, \bar{\mu}_k) &= p \vee \left( E\left[V_{k+1}\left(p \vee p^j_k - \mu^j_{k+1},\; \bar{\mu}_{k+1} - \mu^j_{k+1}\cdot\mathbf{1}\right)\right] + \mu^j_{k+1} - c \right) \quad \text{(using Lemma 1)}\\
&= p \vee \left( E\left[V_{k+1}\left(p \vee p^j_k - \mu^j_k - \tfrac{\theta}{k\theta+1-\theta}\left(p^j_k - \mu^j_k\right),\; \bar{\mu}_{k+1} - \mu^j_{k+1}\cdot\mathbf{1}\right)\right] + \mu^j_k - c \right)\\
&= p \vee \left( E\left[V_{k+1}\left(\left(p - \mu^j_k\right) \vee \left(p^j_k - \mu^j_k\right) - \tfrac{\theta}{k\theta+1-\theta}\left(p^j_k - \mu^j_k\right),\; \bar{\mu}_{k+1} - \mu^j_{k+1}\cdot\mathbf{1}\right)\right] + \mu^j_k - c \right)\\
&= p \vee \left( E\left[V_{k+1}\left(\chi^j_k \vee X_k - \tfrac{\theta}{k\theta+1-\theta} X_k,\; \bar{\mu}_{k+1} - \mu^j_{k+1}\cdot\mathbf{1}\right)\right] + \mu^j_k - c \right),
\end{aligned}$$

where $\chi^j_k = p - \mu^j_k$ and $X_k \sim N\!\left(0,\, \sigma^2\left(1 - \tfrac{k\theta^2}{k\theta+1-\theta}\right)\right)$. Hence

$$V_k(p, \bar{\mu}_k) - \mu^j_k = \chi^j_k \vee \left( E\left[V_{k+1}\left(\chi^j_k \vee X_k - \tfrac{\theta}{k\theta+1-\theta} X_k,\; \bar{\mu}_{k+1} - \mu^j_{k+1}\cdot\mathbf{1}\right)\right] - c \right),$$

that is,

$$V_k(\chi^j_k, \bar{\mu}_k - \mu^j_k\cdot\mathbf{1}) = \chi^j_k \vee \left( E\left[V_{k+1}\left(\chi^j_k \vee X_k - \tfrac{\theta}{k\theta+1-\theta} X_k,\; \bar{\mu}_{k+1} - \mu^j_{k+1}\cdot\mathbf{1}\right)\right] - c \right). \qquad (5)$$

Using the definition $V_k(p, \bar{\mu}_k) = p + \Omega_k(p, \bar{\mu}_k)$, we rewrite the previous equation to obtain

$$\chi^j_k + \Omega_k\!\left(\chi^j_k, \bar{\mu}_k - \mu^j_k\cdot\mathbf{1}\right) = \chi^j_k \vee \left( E\left[\Omega_{k+1}\left(\chi^j_k \vee X_k - \tfrac{\theta}{k\theta+1-\theta} X_k,\; \bar{\mu}_{k+1} - \mu^j_{k+1}\cdot\mathbf{1}\right)\right] + E\left[\chi^j_k \vee X_k\right] - c \right).$$

Using Lemma 2 and simplifying, we get

$$\Omega_k\!\left(\chi^j_k, \bar{\mu}_k - \mu^j_k\cdot\mathbf{1}\right) = 0 \vee \left( E\left[\Omega_{k+1}\left(\chi^j_k \vee X_k - \tfrac{\theta}{k\theta+1-\theta} X_k,\; \bar{\mu}_{k+1} - \mu^j_{k+1}\cdot\mathbf{1}\right)\right] + E\left[\chi^j_k \vee X_k\right] - \chi^j_k - c \right),$$

and therefore

$$\Omega_k(p, \bar{\mu}_k) = \max_{j \geq k}\left\{ 0 \vee \left( E\left[\Omega_{k+1}\left(\chi^j_k \vee X_k - \tfrac{\theta}{k\theta+1-\theta} X_k,\; \bar{\mu}_{k+1} - \mu^j_{k+1}\cdot\mathbf{1}\right)\right] + E\left[\chi^j_k \vee X_k\right] - \chi^j_k - c \right) \right\}. \qquad (6)$$

[11] In the case of more complex delay costs, define $c_k = c + L(k) - L(k-1)$ as the effective cost of conducting the $k$th test. Then $\Omega_k(\cdot,\cdot)$ satisfies the functional equation

$$\Omega_k(p, \bar{\mu}_k) = \max_{j \geq k}\left\{ 0 \vee \left( E\left[\Omega_{k+1}\left(\chi^j_k \vee X_k - \tfrac{\theta}{k\theta+1-\theta} X_k,\; \bar{\mu}_{k+1} - \mu^j_{k+1}\cdot\mathbf{1}\right)\right] + E\left[\chi^j_k \vee X_k\right] - \chi^j_k - c_k \right) \right\},$$

where $\chi^j_k = p - \mu^j_k$ and $X_k \sim N\!\left(0,\, \sigma^2\left(1 - \tfrac{k\theta^2}{k\theta+1-\theta}\right)\right)$. The proof remains identical with $c$ replaced by $c_k$. We use the loss function instead of a discount factor to capture the loss due to delays since, in our specific context, the potential losses due to delays far outweigh any potential changes in the valuation of incurred costs. That is, delaying expenditure during a testing cycle does not offer much value, given the relatively short time period in which the tests have to be completed, whereas delaying the product introduction would almost surely have an adverse effect in most, if not all, markets.
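The functional equation above can be evaluated numerically by backward induction. The sketch below (ours, for illustration only, with hypothetical parameter values) approximates the single-class stopping thresholds via Monte Carlo, using the recursion $F_k(x)$ that appears in the proof of Proposition 2 below, whose smallest root defines $\Theta_k$.

```python
# Sketch (illustrative, hypothetical parameters): approximate the stopping
# thresholds Theta_k by backward induction on
#   F_k(x) = max(0, E[x v X_k] - x - c + E[F_{k+1}(x v X_k - theta/(k*theta+1-theta)*X_k)]),
# where X_k ~ N(0, sigma^2 * (1 - k*theta^2/(k*theta+1-theta))) and F_{N+1} = 0.
import numpy as np

def approximate_thresholds(N=6, c=0.1, sigma=1.0, theta=0.3, n_mc=20_000, seed=0):
    rng = np.random.default_rng(seed)
    grid = np.linspace(-4 * sigma, 4 * sigma, 161)       # crude grid of x values
    F_next = np.zeros_like(grid)                          # F_{N+1}(x) = 0
    th_list = []
    for k in range(N, 0, -1):
        shrink = theta / (k * theta + 1 - theta)
        var_k = sigma**2 * (1 - k * theta**2 / (k * theta + 1 - theta))
        X = rng.normal(0.0, np.sqrt(max(var_k, 0.0)), size=n_mc)
        F_k = np.empty_like(grid)
        for i, x in enumerate(grid):
            m = np.maximum(x, X)                          # x v X_k
            cont = np.interp(m - shrink * X, grid, F_next)
            F_k[i] = max(0.0, np.mean(m) - x - c + np.mean(cont))
        # Theta_k: first grid point at which continuing is no longer worthwhile
        # (crude tolerance; F_k is non-increasing in x)
        th_list.append(grid[np.argmax(F_k <= 1e-9)])
        F_next = F_k
    return list(reversed(th_list))                        # Theta_1, ..., Theta_N

print(approximate_thresholds())
```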

Lemma 4 If we choose to continue testing (according to some stopping rule), it is optimal to test the remaining prototype with the highest mean.


Proof. Suppose at the $k$th stage $\mu^i_{k+1} > \mu^j_{k+1}$, but the optimal policy dictates that we test design $j$ instead of $i$. We prove the lemma by showing that this policy cannot be optimal. The proof proceeds by constructing an auxiliary problem and demonstrating that we cannot do worse by testing the design with the highest mean. Consider the auxiliary problem AP in which we reduce whatever we obtain from testing $i$ by the quantity $\alpha = \mu^i_{k+1} - \mu^j_{k+1}$ (notice that $\alpha$ is a constant independent of the sample path, since $\mu^i_{k+1}$ and $\mu^j_{k+1}$ have been identically updated). Later on, if we do test $j$, we add back this quantity $\alpha$ to the results obtained from testing $j$ (i.e., after testing $i$, if we continue on to test $j$ we add $\alpha$ back; otherwise, if $j$ is never tested, we never add it back). This auxiliary problem AP is equivalent to the original problem OP, since the distributions are identical (the distribution of rewards from $j$ ($i$) in the auxiliary problem AP is identical to the distribution of rewards from $i$ ($j$) in the original problem OP). Hence, the optimal policies are identical, except that when the optimal policy for the original problem dictates testing $i$ we test $j$ in the auxiliary problem, and when it dictates testing $j$ we test $i$ in the auxiliary problem. Let this optimal policy for the auxiliary problem be $\psi$. Also observe that, since AP and OP are equivalent in distribution, their optimal payoffs (values) are identical as well. Now consider using the auxiliary problem's optimal policy $\psi$ in the original problem. At stage $k$ it dictates testing $i$. The value obtained in the original problem is $\alpha$ more than the value obtained in the auxiliary problem (since AP was constructed by subtracting $\alpha$ from design $i$'s performance in OP). Hence, irrespective of where we stop and of the values obtained, the payoff from using policy $\psi$ in the original problem is always greater than the optimal payoff of the auxiliary problem; that is, $V(\text{OP}) > V(\text{AP})$. This is a contradiction; hence we must always test the design with the highest mean.

Lemma 5 It is optimal to stop testing when $\chi^j_k \geq \Theta_k$.

Proof. We employ an induction argument. $\Omega_k(\cdot,\cdot)$, defined in Lemma 3, gives us an equivalent auxiliary search problem, and from equation (5) we see that stopping in this problem amounts to stopping in the original problem. Hence, we prove the claim for the auxiliary problem. To prove the stopping rule, consider $\Omega_N(x, \bar{Y}) = 0$, and note that $\Omega_{N-1}(x, \bar{Y}) = E[x \vee X_{N-1}] - x - c$ is a monotonically decreasing function of $x$ (and is independent of $\bar{Y}$). Suppose $\Omega_k(x, \bar{Y})$ is a monotonically decreasing function of $x$ for each $k > n$ (and is independent of $\bar{Y}$). Since $\chi^j_k \vee X_k - \tfrac{\theta}{k\theta+1-\theta} X_k$ is increasing in $\chi^j_k$, $\Omega_{k+1}\!\left(\chi^j_k \vee X_k - \tfrac{\theta}{k\theta+1-\theta} X_k,\, \bar{\mu}_{k+1} - \mu^j_{k+1}\right)$ is decreasing in $\chi^j_k$ (and is independent of $\bar{Y}$). Therefore, $E\!\left[\Omega_{k+1}\!\left(\chi^j_k \vee X_k - \tfrac{\theta}{k\theta+1-\theta} X_k,\, \bar{\mu}_{k+1} - \mu^j_{k+1}\right)\right]$ is monotonically decreasing in $\chi^j_k$ (and is independent of $\bar{Y}$). Hence, there exists a value $\Theta_k$ (independent of $\bar{Y}$) such that for $\chi^j_k \geq \Theta_k$ we stop, and otherwise we continue. (Also notice that, since we already test in decreasing order of $\mu$, the threshold need not depend on the untested prototypes and can depend only on the number of tests completed.) To complete the induction we observe that

$$\Omega_k\!\left(\chi^i_k, \bar{\mu}_k - \mu^i_{k+1}\right) = 0 \vee \left( E\!\left[\Omega_{k+1}\!\left(\chi^i_k \vee X_k - \tfrac{\theta}{k\theta+1-\theta} X_k,\, \bar{\mu}_{k+1} - \mu^i_{k+1}\right)\right] + E\!\left[\chi^i_k \vee X_k - \chi^i_k\right] - c \right)$$

is also monotonically decreasing in $\chi^i_k$.

Lemma 6 Let $F(x, \lambda)$ be non-decreasing in $\lambda$ and non-increasing in $x$. Let $x(\lambda)$ be the smallest solution to the equation $F(x, \lambda) = 0$. Then $x(\lambda)$ is non-decreasing in $\lambda$.

Proof. Assume the converse: suppose $\lambda_2 > \lambda_1$ and $x(\lambda_2) < x(\lambda_1)$. Then $F(x(\lambda_2), \lambda_2) \geq F(x(\lambda_2), \lambda_1)$ (since $F(x, \lambda)$ is non-decreasing in $\lambda$). Moreover, the inequality must be strict, since $x(\lambda_1)$ is the smallest solution to $F(x, \lambda_1) = 0$ and $x(\lambda_2) < x(\lambda_1)$; that is, $F(x(\lambda_2), \lambda_2) > F(x(\lambda_2), \lambda_1)$. Then

$$0 = F(x(\lambda_2), \lambda_2) > F(x(\lambda_2), \lambda_1) \geq F(x(\lambda_1), \lambda_1) = 0,$$

since $F(x, \lambda)$ is non-increasing in $x$. This contradiction shows that $x(\lambda)$ is non-decreasing in $\lambda$.

Proof of Corollary 1

By Theorem 1 the optimal policy specifies continuation iff $\chi^i_{k+1} < \Theta_{k+1}$. That is, we continue iff the following holds for some $i > k$:

$$\max_{j \leq k}\{p_j\} - \left( \mu^i + \frac{\theta}{k\theta+1-\theta}\sum_{j=1}^{k}\left(p_j - \mu^j\right) \right) < \Theta_{k+1},$$

i.e.,

$$\max\!\left\{ \max_{j \leq k-1}\{p_j\},\, p_k \right\} - \left( \mu^i + \frac{\theta}{k\theta+1-\theta}\sum_{j=1}^{k-1}\left(p_j - \mu^j\right) + \frac{\theta}{k\theta+1-\theta}\left(p_k - \mu^k\right) \right) < \Theta_{k+1}.$$

Define $\Omega_{k+1} = \mu^{k+1} + \frac{\theta}{k\theta+1-\theta}\sum_{i=1}^{k-1}\left(p_i - \mu^i\right) - \frac{\theta}{k\theta+1-\theta}\mu^k + \Theta_{k+1}$, and $\lambda_k = \frac{\theta}{k\theta+1-\theta}$. Thus, the continue decision is optimal iff

$$\max\!\left\{ \max_{j \leq k-1}\{p_j\},\, p_k \right\} - \lambda_k p_k < \Omega_{k+1},$$

i.e., $\max\{\max_{j \leq k-1}\{p_j\}, p_k\} < \lambda_k p_k + \Omega_{k+1}$, i.e.,

$$\max_{j \leq k-1}\{p_j\} < \lambda_k p_k + \Omega_{k+1} \quad \text{and} \quad p_k < \lambda_k p_k + \Omega_{k+1},$$

i.e.,

$$p_k \in \left( \frac{\max_{i \leq k-1}\{p_i\} - \Omega_{k+1}}{\lambda_k},\; \frac{\Omega_{k+1}}{1 - \lambda_k} \right).$$

Proof of Proposition 2

We develop our argument based on the following recursion:

$$F_k(x) = 0 \vee \left( E[x \vee X_k] - x - c + E\!\left[F_{k+1}\!\left(x \vee X_k - \tfrac{\theta}{k\theta+1-\theta} X_k\right)\right] \right), \qquad F_{N+1}(x) = 0,$$

where $X_k \sim N\!\left(0, \sigma^2\left(1 - \tfrac{k\theta^2}{k\theta+1-\theta}\right)\right)$.

From the proof of Theorem 1, we observe that $\Theta_k$ is the (minimum) solution of the equation $F_k(x) = 0$, and that $F_k(x)$ is a non-increasing function of $x$. By the definition of $R_k$, $E[x \vee X_k] - x - c = 0$ at $x = R_k$. Hence the solution of $F_k(x) = 0$ must be greater than or equal to $R_k$; that is, $\Theta_k \geq R_k$.

We prove (iii) by induction. $F_{N+1}(x) = 0$ is a non-decreasing (constant) function of $\sigma$. Suppose $F_{k+1}(x)$ is a non-decreasing function of $\sigma$. Then $E\!\left[F_{k+1}\!\left(x \vee X_k - \tfrac{\theta}{k\theta+1-\theta} X_k\right)\right]$ is a non-decreasing function of $\sigma$, and the same holds for $E[x \vee X_k] - x - c + E\!\left[F_{k+1}\!\left(x \vee X_k - \tfrac{\theta}{k\theta+1-\theta} X_k\right)\right]$ (since $E[x \vee X_k]$ is a non-decreasing function of $\sigma$). This implies that $F_k(x)$ is a non-decreasing function of $\sigma$.

Similarly, $F_{N+1}(x) = 0$ is a non-increasing (constant) function of $x$. Suppose $F_{k+1}(x)$ is a non-increasing function of $x$. Then $E\!\left[F_{k+1}\!\left(x \vee X_k - \tfrac{\theta}{k\theta+1-\theta} X_k\right)\right]$ is non-increasing in $x$, and hence so is $E[x \vee X_k] - x - c + E\!\left[F_{k+1}\!\left(x \vee X_k - \tfrac{\theta}{k\theta+1-\theta} X_k\right)\right]$. This implies that $F_k(x)$ is non-increasing in $x$.

Thus, $F_k(x)$ is a non-increasing function of $x$ and a non-decreasing function of $\sigma$. Since $\Theta_k$ is the (minimum) solution of the equation $F_k(x) = 0$, $\Theta_k$ is a non-decreasing function of $\sigma$ (using Lemma 6). To prove (iv) we use the same induction argument as for (iii); notice also that the updated variance at each stage is a decreasing function of $\theta$.

Proof of Proposition 3

Let the a priori mean and variance be $\bar{\mu}$ and $\sigma^2$, respectively. Suppose that on a given sample path the number of tests undertaken equals $n$: at the $n$th test the reservation price $\Theta_n$ dropped below $\chi_n$, that is, $\chi_n \geq \Theta_n$ and $\chi_i < \Theta_i$ for all $i < n$. From Proposition 2 we know that $\Theta_n$ is an increasing function of $\sigma$ (and is independent of $\bar{\mu}$). Also, $\chi_n$ is a decreasing function of $\bar{\mu}$. Keeping the sample path fixed, let $\chi'_n$, $\Theta'_n$ be the new values corresponding to an increased a priori expectation or an increased a priori uncertainty. Then $\chi'_n \leq \chi_n$ and $\Theta'_n \geq \Theta_n$. This implies that $\chi'_i < \Theta'_i$ for all $i < n$; and if $\chi'_n$ has decreased enough (due to increased a priori expectations) and/or $\Theta'_n$ has increased enough (due to increased a priori uncertainty), then $\chi'_n < \Theta'_n$ and continuation would be optimal. Hence, the number of tests conducted remains the same or increases with increased expectations or uncertainty. Furthermore, since the number of tests conducted remains the same or increases on every sample path with increased expectations or uncertainty, the identical monotonic property holds for the expected number of tests as well.

Proof of Proposition 4

(a) $\Theta_n$ decreases as $\theta$ increases. In the update equation $\hat{\mu}_{n+1} = \mu + \frac{\theta}{n\theta+1-\theta}\sum_{i=1}^{n}\left(p_i - \mu_i\right)$, if the a priori expectation is optimistic then the term $\sum_{i=1}^{n}\left(p_i - \mu_i\right)$ is negative, and $\hat{\mu}_{n+1}$ decreases in $\theta$. Hence, $\chi_n$ is an increasing function of $\theta$. Therefore, the number of tests conducted will decrease or remain the same.

(b) and (c) The effect of increasing correlation in the case of pessimistic a priori expectations is more complex. There are two counteracting effects: (a) a decrease in $\Theta_n$ because of increased correlation, and (b) an increase in the mean. Thus the reservation price at the $n$th stage can go up only if the coefficient of $\theta$ in the update equation of $\hat{\mu}_{n+1}$ is sufficiently positive, that is, $p_i \gg \mu_i$.

Proof of Theorem 2

The probability that $\theta = \theta_L$ is $p$ (i.e., $1 - p$ is the probability that $\theta = \theta_H$). Further, $\lambda$ is the probability of the true correlation getting revealed after any test. Denote the state at each stage (i.e., before each test) by $\{x, P, s\}$, where $x$ is the maximum performance obtained so far, $P$ is the vector of performances of all tested prototypes, and $s$ represents whether or not the uncertainty regarding $\theta$ has been resolved. Specifically, let $s = H, L$ denote the states in which the true correlation is known to be $\theta_H$, $\theta_L$, respectively, and let $s = U$ denote the state in which the true correlation is unknown.

Picking rule: With a matching argument very similar to the one presented in Lemma 4 of Theorem 1, it can be shown that if the experimenter decides to continue testing, it is always optimal to test the prototype with the highest a priori mean.

Stopping rule: Next, we prove the stopping rule in Theorem 2 by induction. Let $V_k(x, P, s)$ represent the value when, after testing $k-1$ prototypes, the experimenter has observed performances $P$ and the knowledge about the true correlation is $s$. Obviously, if the uncertainty regarding $\theta$ has been resolved (i.e., $s = H$ or $s = L$), the experimenter uses the policy corresponding to Theorem 1. For the second case, where the uncertainty has not been resolved (i.e., $s = U$), after the $(k-1)$st test the value function can be written as

$$V_k(x, P, s_k) = x \vee \left( E\left[V_{k+1}\left(x \vee p_k, [P, p_k], s_{k+1}\right)\right] - c \right),$$

where the first branch represents the payoff if the experimenter stops, and the second branch the payoff if the experimenter continues.

Induction hypothesis: $V_k(x, P, s) - x = \Upsilon_k(x, \sum p_i, s)$ is a decreasing function of $x$. From the proof of Theorem 1, it can be seen that when $s = H$ or $L$ the induction hypothesis is true. Consider the case $s = U$.

When $k = N-1$,

$$\begin{aligned}
V_{N-1}(x, P, s = U) &= \max\left\{ x,\; E\left[V_N\left(x \vee p_k, [P, p_k], s_{k+1}\right)\right] - c \right\}\\
&= \max\left\{ x,\; p\,E\left[x \vee p_k(\theta_L, P)\right] + (1-p)\,E\left[x \vee p_k(\theta_H, P)\right] - c \right\},
\end{aligned}$$

where $p_k(\theta, P)$ is a random variable with the updated mean corresponding to the true correlation $\theta$ and the observed performances $P$. Hence

$$\begin{aligned}
V_{N-1}(x, P, s = U) - x &= \max\left\{ x,\; p\,E\left[x \vee p_k(\theta_L, P)\right] + (1-p)\,E\left[x \vee p_k(\theta_H, P)\right] - c \right\} - x\\
&= \max\left\{ 0,\; p\left(E\left[x \vee p_k(\theta_L, P)\right] - x\right) + (1-p)\left(E\left[x \vee p_k(\theta_H, P)\right] - x\right) - c \right\}.
\end{aligned}$$

Each term $E\left[x \vee p_k(\theta, P)\right] - x$ is a decreasing function of $x$ and depends on $P$ only through $\sum_{i=1}^{N-1} p_i$ (since the variance term is independent of the observed performances, and the mean depends only on the running sum). Hence, there exists a function $\Upsilon_{N-1}(x, \sum p_i, s)$, decreasing in $x$, such that

$$V_{N-1}(x, P, s = U) - x = \Upsilon_{N-1}\!\left(x, \sum p_i, s\right).$$

For an arbitrary $k$,

$$\begin{aligned}
E\left[V_{k+1}\left(x \vee p_k, [P, p_k], s_{k+1}\right)\right] - x &= (1-\lambda)\,E\left[V_{k+1}\left(x \vee p_k, [P, p_k], s = U\right)\right]\\
&\quad + \lambda(1-p)\,E\left[V_{k+1}\left(x \vee p_k, [P, p_k], s = H\right)\right]\\
&\quad + \lambda p\,E\left[V_{k+1}\left(x \vee p_k, [P, p_k], s = L\right)\right] - x\\
&= (1-\lambda)\,E\left[V_{k+1}\left(x \vee p_k, [P, p_k], s = U\right) - x\right]\\
&\quad + \lambda(1-p)\,E\left[V_{k+1}\left(x \vee p_k, [P, p_k], s = H\right) - x\right]\\
&\quad + \lambda p\,E\left[V_{k+1}\left(x \vee p_k, [P, p_k], s = L\right) - x\right],
\end{aligned}$$

which (with some abuse of notation) can be written as follows. (The expectations are taken with respect to the random variable $p_k$: the second line represents an expectation taken with respect to $p_k$ when the true correlation is $\theta_H$, the third line with respect to $p_k$ when the true correlation is $\theta_L$, and the first line when the true correlation is $\theta_L$ with probability $p$ and $\theta_H$ with probability $1-p$.)

$$\begin{aligned}
&= (1-\lambda)\left( E\left[V_{k+1}\left(x \vee p_k, [P, p_k], s = U\right) - x \vee p_k\right] + E\left[x \vee p_k\right] - x \right)\\
&\quad + \lambda(1-p)\left( E\left[V_{k+1}\left(x \vee p_k, [P, p_k], s = H\right) - x \vee p_k\right] + E\left[x \vee p_k\right] - x \right)\\
&\quad + \lambda p\left( E\left[V_{k+1}\left(x \vee p_k, [P, p_k], s = L\right) - x \vee p_k\right] + E\left[x \vee p_k\right] - x \right),
\end{aligned}$$

which, using the induction hypothesis, can be written as

$$\begin{aligned}
&= (1-\lambda)\left( E\left[\Upsilon_{k+1}\!\left(x \vee p_k, \textstyle\sum p_i, s = U\right)\right] + E\left[x \vee p_k\right] - x \right)\\
&\quad + \lambda(1-p)\left( E\left[\Upsilon_{k+1}\!\left(x \vee p_k, \textstyle\sum p_i, s = H\right)\right] + E\left[x \vee p_k\right] - x \right)\\
&\quad + \lambda p\left( E\left[\Upsilon_{k+1}\!\left(x \vee p_k, \textstyle\sum p_i, s = L\right)\right] + E\left[x \vee p_k\right] - x \right).
\end{aligned}$$

Since $\Upsilon_{k+1}(x \vee p_k, \sum p_i, s)$ is decreasing in $x \vee p_k$, it must be that $E\left[\Upsilon_{k+1}(x \vee p_k, \sum p_i, s)\right]$ is decreasing in $x$ (since $x \vee p_k$ is increasing in $x$) and depends on $P$ only through $\sum p_i$. Furthermore, $E[x \vee p_k] - x$ is decreasing in $x$. Thus, $E\left[V_{k+1}(x \vee p_k, [P, p_k], s_{k+1})\right] - x$ must be decreasing in $x$ and depends on $P$ only through $\sum p_i$. Since

$$V_k(x, P, s_k) - x = \max\left\{ x,\, E\left[V_{k+1}(x \vee p_k, [P, p_k], s_{k+1})\right] - c \right\} - x = \max\left\{ 0,\, E\left[V_{k+1}(x \vee p_k, [P, p_k], s_{k+1})\right] - x - c \right\},$$

there must exist a function $\Upsilon_k(x, \sum p_i, s)$, decreasing in $x$, such that $V_k(x, P, s_k) - x = \Upsilon_k(x, \sum p_i, s_k)$. Thus the induction hypothesis is proved. To prove the stopping property it suffices to notice that we stop iff $E\left[V_{k+1}(x \vee p_k, [P, p_k], s_{k+1})\right] - c \leq x$. This is equivalent to stopping when $V_k(x, P, s_k) - x \leq 0$, that is, stopping when $\Upsilon_k(x, \sum p_i, s_k) \leq 0$. Since $\Upsilon_k(x, \sum p_i, s_k)$ is decreasing in $x$, there exists a function $\Gamma_k(\sum p_i)$ such that we stop iff $x \geq \Gamma_k(\sum p_i)$. (Note that $\Gamma_k(\sum p_i)$ is given by the root of $\Upsilon_k(x, \sum p_i, s_k = U) = 0$.)

Proof of Corollary 2

If the true correlation is $\theta_L$, then continuation is optimal iff $p_k$ lies in an open interval $I(\theta_L)$. Similarly, if the true correlation is $\theta_H$, continuation is optimal iff $p_k$ lies in (another) open interval $I(\theta_H)$. Hence, when the true $\theta$ is uncertain, it is optimal to continue if $p_k \in I(\theta_L) \cap I(\theta_H)$, and to stop if $p_k \in I(\theta_L)^C \cap I(\theta_H)^C$. Choosing $\tau_l = \inf\{I(\theta_L)^C \cap I(\theta_H)^C\}$ and $\tau_h = \sup\{I(\theta_L)^C \cap I(\theta_H)^C\}$ proves the corollary.

Proof of Theorem 3

Proof of the stopping rule: Case (a) shows that it is never optimal to stop if the stopping rule is not satisfied. Case (b) proves a stronger statement: testing the $j$th class ceases once $\chi^j \geq \Theta^j$.

Case (a): If $\chi^i_k < \Theta^i_k$ for some $i$, it is never optimal to stop, since continuing to test only the $i$th class yields strictly greater payoffs.


Case (b): We construct an induction argument on the number of untested prototypes. If there is only one untested prototype remaining, with $\chi^i_k \geq \Theta^i_k$, it is obviously never optimal to test within that class again (see Theorem 1). Suppose $\chi^i_k \geq \Theta^i_k$ for all $i$, and consider the impact of testing one more prototype from some class $j$. After the test, $\chi^i_{k+1} \geq \chi^i_k \geq \Theta^i_k \geq \Theta^i_{k+1}$ for all $i \neq j$, which implies, from the induction hypothesis, that we will not test any of the classes $i \neq j$ from here on. That is, if we are to continue testing, we can only test prototypes from class $j$. But this yields lower payoffs since, from Theorem 1, we know that if testing were restricted to only one class $j$ with $\chi^j_k \geq \Theta^j_k$, it would be better to stop at the $k$th stage.

Proof of the picking rule: Define $\Gamma^i = \Theta^i + \hat{\mu}^i_{k+1}$. Then the proposed optimal policy, in which we pick the class $i$ that has the highest $\Theta^i - \chi^i_{k+1}$, is equivalent to testing a design within the class $i$ that has the greatest $\Gamma^i$ (since $\Theta^i - \chi^i_{k+1} = \Theta^i - \max_{j \leq k}\{p_j\} + \hat{\mu}^i_{k+1} = \Theta^i + \hat{\mu}^i_{k+1} + \text{const}$, where the constant is independent of the class). We use induction to demonstrate the optimality of the picking rule. Our induction hypothesis is as follows: the picking rule proposed in Theorem 3 is optimal if the number of untested prototypes is less than $n$.

Suppose there are $M$ classes remaining. Without loss of generality assume that (a) the reservation prices at some stage are such that $\Gamma^1 > \Gamma^2$, (b) according to the hypothesized optimal policy we choose to test class 2, and (c) class 1 is such that $\Gamma^1 \geq \Gamma^i$ for all $i$. These assumptions entail no loss of generality, since we can always index the classes so that they are satisfied. We construct an alternate policy that gives a better payoff than the hypothesized optimal policy, which demonstrates that a policy that does not test a prototype from the highest-reservation-price class cannot be optimal. The alternate policy is as follows: test 1; if the stopping rule is satisfied then stop, else test 2, and then continue testing according to the hypothesized optimal rule.

The hypothesized optimal policy: after testing 2 we have two cases, (a) $p_2 \geq \Gamma^1$, in which case we stop and obtain payoff $p_2 \vee p$, or (b) $p_2 < \Gamma^1$, in which case we test 1. If we then go on to test 1, the outcomes are as follows. The maximum reservation price changes to $\Gamma^i$ ($i$ may change because the highest-reservation-price class may change after the test). If $\Gamma^1 > p_2 \geq \Gamma^i$ and $p_1 \geq \Gamma^i$, we stop and the payoff is $p \vee p_1 \vee p_2$. If $\Gamma^1 > p_2 \geq \Gamma^i$ and $p_1 < \Gamma^i$, we stop and the payoff is $p \vee p_2$. If $p_2 < \Gamma^i$ and $p_1 \geq \Gamma^i$, we stop and the payoff is $p \vee p_1$. If $p_2 < \Gamma^i$ and $p_1 < \Gamma^i$, we continue and the expected payoff is $\Psi$.

The proposed alternate policy: we first test 1; if the stopping rule is not satisfied we test 2, and then we continue with the optimal policy. After testing 1 we have two cases: (a) $p_1 \geq \Gamma^i$, in which case we stop with payoff $p \vee p_1$, or (b) $p_1 < \Gamma^i$, in which case we test 2. If we then go on to test 2, the outcomes are as follows: if $p_2 \geq \Gamma^i$, we stop and the payoff is $p \vee p_2$; if $p_2 < \Gamma^i$, we continue and the expected payoff is $\Psi$.

Define the following quantities:

$$\begin{aligned}
a_0 &= \Pr(p_2 \geq \Gamma^1), \quad a_1 = \Pr(\Gamma^1 \geq p_2 \geq \Gamma^i), \quad a_2 = \Pr(p_1 \geq \Gamma^i), \quad a_3 = \Pr(\Gamma^i > p_2 \geq \Gamma^2), \quad a_4 = \Pr(\Gamma^1 > p_1 \geq \Gamma^i),\\
e_0 &= E[p \vee p_2 \mid p_2 \geq \Gamma^1], \quad e_1 = E[p \vee p_1 \vee p_2 \mid \Gamma^1 \geq p_2 \geq \Gamma^i,\, p_1 \geq \Gamma^i], \quad e_2 = E[p \vee p_2 \mid \Gamma^1 \geq p_2 \geq \Gamma^i],\\
e_3 &= E[p \vee p_1 \mid p_1 \geq \Gamma^i], \quad e_4 = E[p_2 \mid p_2 \geq \Gamma^1], \quad e_5 = E[p_2 \mid \Gamma^1 > p_2 \geq \Gamma^i], \quad e_6 = E[p_2 \mid \Gamma^i > p_2 \geq \Gamma^2].
\end{aligned}$$

Then the expected value from using the hypothesized optimal policy is

$$V = -c + a_0 e_0 + a_1\left(-c + a_2 e_1 + (1 - a_2) e_2\right) + (1 - a_0 - a_1)\left(-c + a_2 e_3 + (1 - a_2)\Psi\right).$$

Notice also that $\Pr(p_2 \geq \Gamma^i) = a_0 + a_1$ and

$$E[p \vee p_2 \mid p_2 \geq \Gamma^i]\,\Pr(p_2 \geq \Gamma^i) = E[p \vee p_2 \mid \Gamma^1 \geq p_2 \geq \Gamma^i]\,\Pr(\Gamma^1 \geq p_2 \geq \Gamma^i) + E[p \vee p_2 \mid p_2 \geq \Gamma^1]\,\Pr(p_2 \geq \Gamma^1) = e_2 a_1 + e_0 a_0,$$

and the value from the alternate policy is

$$\hat{V} = -c + a_2 e_3 + (1 - a_2)\left(e_0 a_0 + e_2 a_1 + (1 - a_0 - a_1)\Psi\right).$$

The proof from here on is identical to Weitzman's (1979) proof of the optimality of the reservation price policy, with the main difference that, since the reservation prices have changed, we use $\Gamma^i$ to represent the newly modified highest-reservation-price class. The difference between the payoffs of the two policies can be written as

$$\hat{V} - V = (\Gamma^1 - \Gamma^2) a_2 a_0 + (e_5 - \Gamma^2) a_1 a_2 + (e_2 - \Gamma^2) a_3 a_4 + (e_6 - \Gamma^2) a_3 (a_2 + a_4) + (e_2 + e_6 - \Gamma^2 - e_1) a_4 a_1.$$

Using the definition of $e_1$ (similar to Weitzman 1979, p. 654), it can be shown that all the terms are non-negative. That is, the alternate policy gives greater payoffs, which is a contradiction.
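To summarize the policy that Theorem 3 establishes, the following sketch (ours, illustrative only) applies the stopping and picking rules to a set of design classes; the class thresholds $\Theta^i$ and the adjusted best performances $\chi^i$ are supplied as inputs, since they would be computed from the single-class dynamic program above.

```python
# Sketch: the multi-class stopping and picking rules of Theorem 3.
# Theta[i] is class i's threshold and chi[i] its "best performance so far
# minus updated class mean"; both are assumed to be computed elsewhere,
# e.g., by the backward recursion sketched earlier.

def next_action(Theta, chi):
    """Return 'stop' or the index of the class to test next."""
    # Stopping rule: stop only when chi_i >= Theta_i for every class i.
    if all(c >= t for c, t in zip(chi, Theta)):
        return "stop"
    # Picking rule: test a design from the class with the largest Theta_i - chi_i.
    gaps = [t - c for t, c in zip(Theta, chi)]
    return gaps.index(max(gaps))

# Hypothetical values purely for illustration:
print(next_action(Theta=[0.8, 0.5, 0.3], chi=[0.2, 0.6, 0.4]))  # 0: class 0 has the largest gap
print(next_action(Theta=[0.8, 0.5, 0.3], chi=[0.9, 0.6, 0.4]))  # 'stop'
```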

Appendix B

In this appendix, we demonstrate that the main assumptions we have adopted in formulating the model are also essential for mathematical tractability.[12] The main assumptions of our model are:

A1: The joint distribution is multivariate normal.
A2: Search costs are equal.
A3: Variances are identical.

The main theorem characterizing the criticality of our assumptions is stated next.

Theorem 5 If any one of the three assumptions A1, A2, and A3 is violated, there may exist no optimal index policy.[13]

In Propositions 5, 6, and 7 we relax assumptions (A1), (A2), and (A3), respectively, and demonstrate that the optimal policy may no longer be an index policy.

Proposition 5 Suppose assumptions A2 and A3 are true, whereas assumption A1 is false. Then there may exist no optimal index policy.

Proof. The proposition is proved with a set of counter-examples. Consider the following problem instance:

Case 1 (joint probabilities; search cost: A = 15, B = 15)
                    B high (135)   B low (19.53)
A high (100)           0.126          0.374         0.5
A low (0)              0.124          0.376         0.5
                       0.25           0.75

The expected performance and variance of the two designs are: A: mean 50, variance 2500; B: mean 48.3975, variance 2500; and the correlation between the two random variables is 0.0046. Thus, the problem instance conforms to all our modelling assumptions except for multivariate normality. The expected value from testing A first is 51.58, and from testing B first it is 52.24. Thus, in the optimal policy B should be tested first; that is, the index assigned to B (by the optimal index policy, if it exists) is greater than the index for A.

Now consider the following modification of the previous case. Note that only the joint distribution is modified; the unconditional distributions remain unchanged.

[12] We define mathematical tractability as the existence of an index policy, since without an index policy the problem is non-separable and may require combinatorial techniques to achieve the optimal solution.

[13] Note that the theorem only states that for arbitrary distributions there may exist no optimal index policy. It is indeed feasible that some other restriction of the set of allowed distributions (instead of our restriction to multivariate normal) guarantees index policies. Hence, our theorem only claims that any other set of restrictions/assumptions cannot be much more general than the one used in our model.


Case 2 (joint probabilities; search cost: A = 15, B = 15)
                    B high (135)   B low (19.53)
A high (100)           0.127          0.373         0.5
A low (0)              0.123          0.377         0.5
                       0.25           0.75

The expected performance and variance are, as before: A: mean 50, variance 2500; B: mean 48.3975, variance 2500; and the correlation between the two random variables is 0.0092. Thus, as before, the instance satisfies the model assumptions except for multivariate normality. The expected value from testing A first is 51.47, and from testing B first it is 52.16. Thus, in the optimal policy A should be tested first; that is, the index assigned to A is greater than the index for B.

The two cases demonstrate that, to avoid a contradiction, the index must necessarily depend on the joint distribution. However, an index by definition represents a computable quantity that depends on the parameters of the specific search opportunity and does not depend on the parameter values of the other designs (i.e., the problem is separable). The two cases demonstrate that such a separable index does not exist, and thus the impossibility of an index policy is established.
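The 'test A first' and 'test B first' values in these counter-examples can be computed by enumerating the two-stage decision tree (test one design, observe its outcome, then either stop or pay to test the other). The sketch below (ours, not part of the proof) does this for an arbitrary two-by-two joint distribution; with the Case 1 cell probabilities it returns approximately 51.58 and 52.24.

```python
# Sketch: expected value of "test `first` design, then optimally continue"
# for two designs with a discrete 2x2 joint performance distribution.
# The joint pmf below corresponds to Case 1 as tabulated above.

def value_of_testing_first(first, values, joint, cost):
    """values: {'A': (high, low), 'B': (high, low)}; joint: pmf over (A_level, B_level)."""
    other = "B" if first == "A" else "A"
    idx_first = 0 if first == "A" else 1
    idx_other = 1 - idx_first
    total = -cost[first]
    for lvl in (0, 1):                       # 0 = high outcome, 1 = low outcome
        p_first = values[first][lvl]
        prob_obs = sum(p for out, p in joint.items() if out[idx_first] == lvl)
        # conditional expectation of max(first, other) if we pay to test the other design
        cond_max = sum(p * max(p_first, values[other][out[idx_other]])
                       for out, p in joint.items() if out[idx_first] == lvl) / prob_obs
        total += prob_obs * max(p_first, cond_max - cost[other])   # stop vs. continue
    return total

values = {"A": (100.0, 0.0), "B": (135.0, 19.53)}
joint = {(0, 0): 0.126, (0, 1): 0.374, (1, 0): 0.124, (1, 1): 0.376}  # Case 1 pmf
cost = {"A": 15.0, "B": 15.0}
for d in ("A", "B"):
    print(d, round(value_of_testing_first(d, values, joint, cost), 2))
```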

Proposition 6 Suppose assumptions A1 and A3 are true, whereas assumption A2 is false. Then there may exist no optimal index policy.

Proof. Consider the motivating example given in §1. Suppose the cost of testing B is greater than or equal to the cost of testing A (i.e., $C_A \leq C_B$). Thus, in the optimal testing policy (index or otherwise), if we were to test any designs, we should first test A (since it has a lower testing cost). If there is an optimal index policy, then there exist indices $\Omega(\mu, \sigma, C, \theta)$ such that we make decisions about testing A in the optimal policy based on the index $\Omega(\mu_A, \sigma_A, C_A, \theta)$. Now consider the following two sets of problem parameters:

Parameter set 1: $\mu_A = \mu_B = -0.91$, $\sigma_A = \sigma_B = 1$, $C_A = 0.1$, $C_B = 0.1$, $\theta = 0.1$.
Parameter set 2: $\mu_A = \mu_B = -0.91$, $\sigma_A = \sigma_B = 1$, $C_A = 0.1$, $C_B = 1$, $\theta = 0.1$.

The index for both sets of problem parameters is identical, $\Omega(\mu_A, \sigma_A, C_A, \theta)$. However, it can be shown that (i) for the first set of parameters it is optimal to test A, and (ii) for the second set of parameter values it is optimal not to test A. Hence, the index cannot be used to make the decision on whether or not to test A, and thus such an index policy is impossible.

Proposition 7 Suppose assumptions A1 and A2 are true, whereas assumption A3 is false. Then there may exist no optimal index policy.

Proof. Similarly to Proposition 6, consider the following two examples:


Parameter set 1: $\mu_A = \mu_B = -0.91$, $\sigma_A = 1$, $\sigma_B = 1$, $C_A = C_B = 0.1$, $\theta = 0.1$.
Parameter set 2: $\mu_A = \mu_B = -0.91$, $\sigma_A = 1$, $\sigma_B = 0.1$, $C_A = C_B = 1$, $\theta = 0.1$.

The index for both sets of problem parameters is identical, $\Omega(\mu_A, \sigma_A, C_A, \theta)$. However, it can be shown that (i) for the first set of parameters it is optimal to test A, and (ii) for the second set of parameter values it is optimal not to test A. Hence, the index cannot be used to make the decision on whether or not to test A, and thus such an index policy is impossible.

