Automatica 42 (2006) 1357 – 1362 www.elsevier.com/locate/automatica
Boundary value problems in stochastic optimal control of advertising

Kalyan Raman
Loughborough Business School, Loughborough University, Leicestershire LE11 3TU, UK

Received 17 May 2005; received in revised form 24 March 2006; accepted 12 April 2006. Available online 13 June 2006.
Abstract

Temporal patterns for advertising include constant spending over time, decreasing spending over time and increasing spending over time. This research shows that all these spending patterns emerge at optimality for the same response function dynamics, due to differences in salvage value assumptions. I use these results to develop a methodology for determining the optimal planning horizon length for each pattern of spending.
© 2006 Elsevier Ltd. All rights reserved.

Keywords: Stochastic optimal control; Hamilton–Jacobi–Bellman; Boundary value; Nerlove–Arrow; Advertising
This paper was not presented at any IFAC meeting. This paper was recommended for publication in revised form by Associate Editor Ashutosh Prasad under the direction of Editor Suresh Sethi.
Tel.: +44 1509 223114; fax: +44 1509 223961. E-mail address: [email protected].
0005-1098/$ - see front matter © 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.automatica.2006.04.016

1. Introduction

Advertisers spread their budget out over time because memory effects cause the influence of advertising to decay over time. In response to advertising decay, marketers have developed different temporal scheduling patterns for advertising. But these temporal patterns do not maximize profits because they are driven by managerial judgment rather than rigorous mathematical reasoning. What is the profit-maximizing way to allocate an advertising budget over time? Is it better to spread the budget evenly over time, to decrease spending over time, to increase spending over time, or to do something more elaborate? Past research has identified conditions favoring one or another of these options. The best option is determined by the interplay of at least six different factors: the dynamics of demand, the dynamics of production cost, the dynamics of temporal preference for money, the forces of competition, uncertainty, and salvage value (Bass, Krishnamoorthy, Prasad, & Sethi, 2005; Dockner & Jorgensen, 1988; Fruchter & Kalish, 1997; Fruchter, 1999; Tapiero, 1978). Sasieni (1971) established that it is dynamically optimal to spread advertising expenses evenly over an infinite planning horizon for a large class of response models. But the influences of salvage value and uncertainty have received scant attention in the extant literature.

The salvage value of a dynamic process is the value of the final level of the state variable at the end of the planning horizon. Salvage value constraints are the terminal-time or boundary conditions in a finite-horizon stochastic control problem. Salvage values reflect the decision-maker's assumptions about the nature of the market and product. High-tech markets are characterized by rapid product obsolescence and short product life cycles; under such conditions, the salvage value at the end of the planning horizon would be zero. For some products, the availability of a secondary market imparts non-zero value to the decision maker at the end of the planning horizon. Leased cars are an example: they can be sold in the market for used cars at the end of the leasing period. Planning horizons may reflect the degree of long-range outlook of the decision maker. For example, a decision maker with a far-sighted outlook can be modeled by an infinite planning horizon, and the optimal control under an infinite planning horizon coincides with the optimal control under the natural boundary conditions at the terminal time (Appendix A). Questions remain about the influence of salvage value on optimal advertising because finite-horizon problems are mathematically more challenging than infinite-horizon problems, and thus numerical analysis is the norm in previous work (Bass et al., 2005). Yet it is important to understand salvage value effects analytically because they are a fundamental determinant of the temporal pattern of advertising spending
(Dockner & Jorgensen, 1988). Bass et al. (2005) and Sethi (1974) solve finite-horizon problems, but the influence of salvage value constraints on advertising policies remains significantly under-researched. Salvage values influence the choice of the best horizon in dynamic decision-making. Sethi and Chand (1979), Chand, Sethi, and Sorger (1992) and Sethi and Sorger (1991) have made contributions to this area, but they do not address the following questions. Is it better (in an expected profit-maximizing sense) to use a short or a long planning horizon? Should the planning horizon be larger or smaller when advertising effectiveness (decay) is larger? A short planning horizon is inconsistent with dynamic optimization, but a long planning horizon is undesirable in a rapidly changing industry or, as Starr (1966) notes, in an uncertain environment. Much extant research is deterministic, but market response is stochastic (Prasad & Sethi, 2004; Raman & Naik, 2004). Thus, the extent to which the conclusions of extant research are robust to uncertainty is unknown. I analyze the joint impact of salvage value and uncertainty upon the structure of dynamically optimal advertising policies, and apply the results to determine optimal planning horizons. Salvage value constraints are fundamental in determining the temporal behavior of the optimal advertising policy, dictating whether or not it is best to spend evenly over time. Furthermore, salvage value constraints influence the length of the optimal planning horizon, a linkage that has been unrecognized and unexploited in prior research.

2. A stochastic dynamic model of advertising response

Nerlove and Arrow (1962) conceptualized the long-term effect of advertising through the construct of goodwill. Following Rao (1986), I postulate the following stochastic differential equation for the goodwill process G(t), driven by advertising u(t):

dG = (βu − δG) dt + σ dW,
(1)
where β is the effectiveness of advertising, δ is the decay rate of goodwill, σ is the infinitesimal standard deviation and W(t) is a standard Brownian motion process. The decision-maker's objective is to find a trajectory u(t) to achieve the following maximization:

Max_{u(t)} E_{G0} { ∫_0^T e^{-ρs} π(s) ds + Φ(G(T), T) }    (2)

subject to the evolution of the stochastic differential equation (1), where ρ is the discount rate and π(s) = mG(s) − u²(s) is the instantaneous profit at time s, with m the margin (see Erickson, 1991 for justification of the quadratic cost of advertising).

3. Derivation of stochastic optimal controls
3.1. Definition of value function and Hamilton–Jacobi–Bellman equation

The value function V(g, t, T, θ) denotes the optimal expected performance over the remaining time horizon [t, T] when using an optimal policy and is defined as

V(g, t, T, θ) = Max_{u(t)} E_g { ∫_t^T e^{-ρs} π(s) ds + Φ(G(T), T) },    (3)

where E_g denotes the expectation operator, given G(t) = g, and θ = (β, δ, ρ, σ, m, c). The boundary condition is

V(g, T, T, θ) = Φ(G(T), T) for any G(T) = g at t = T,    (4)

where Φ(G(T), T) is the salvage value of terminal goodwill. V(g, t, T, θ) satisfies the Hamilton–Jacobi–Bellman (HJB) partial differential equation (Appendix B). Given (1) and (3), the HJB equation is

V_t − δgV_g + (e^{ρt} β² V_g²)/4 + (σ²/2) V_gg + e^{-ρt} gm = 0,    (5)

where V_t = ∂V/∂t, V_g = ∂V/∂g and V_gg = ∂²V/∂g². The non-linear PDE (5) is solved subject to boundary conditions prescribed by appropriate choices of Φ(g, T).

3.2. Optimality analysis

The static solution is the outcome of maximizing the profit mg − u² = mβu − u² to get u = mβ/2. The static solution neglects the dynamics of advertising (δ = 0), ignores dynamics due to temporal constraints such as terminal time conditions and discounting (ρ = 0), and assumes deterministic response (σ = 0). The static advertising level u = mβ/2 is often called the myopic policy in the literature. Stochastic dynamic optimization takes into account the factors ignored by the myopic policy.

3.3. Sketch of solution strategy

The Riccati structure of (5) suggests the following functional form for V(g, t, T, θ):

V(g, t, T, θ) = e^{-ρt} {k1(t)g² + k2(t)g + k3(t)}.    (6)

Substitution of (6) into the HJB equation generates three coupled non-linear differential equations satisfied simultaneously by the functions k1(t), k2(t), and k3(t). These are solved subject to the boundary conditions given by Φ(g, T) at t = T (Appendix B).

3.4. Specification of a family of salvage values

I consider the following family of salvage values, multiplicatively separable in g and T:

Φ(G(T), T) = λ e^{-ρT} mg.    (7)

The parameter λ ≥ 0 captures a number of substantively interesting scenarios.
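The solution strategy of Section 3.3 can be exercised numerically for the family (7). The sketch below (parameter values are my own, purely illustrative) integrates the three coupled equations for k1, k2, k3 in the forms derived in Appendix B, backward from the terminal conditions implied by (7): k1(T) = 0, k2(T) = λm, k3(T) = 0. Because the salvage value is linear in g, k1 remains zero and k2 then has a simple closed form, which the sketch uses as a check.

```python
import math

def solve_k(lam, beta, delta, rho, sigma, m, T, n_steps=20000):
    """Backward integration of the coupled Riccati system of Appendix B:
         k1' = (2*delta + rho)*k1 - beta**2 * k1**2
         k2' = (delta + rho)*k2 - m - beta**2 * k1 * k2
         k3' = rho*k3 - sigma**2 * k1 - (beta**2 / 4) * k2**2
       with terminal conditions k1(T) = 0, k2(T) = lam*m, k3(T) = 0."""
    dt = T / n_steps
    k1, k2, k3 = 0.0, lam * m, 0.0
    for _ in range(n_steps):
        # explicit Euler step taken backward in time: k(t - dt) = k(t) - k'(t)*dt
        d1 = (2.0*delta + rho)*k1 - beta**2 * k1**2
        d2 = (delta + rho)*k2 - m - beta**2 * k1 * k2
        d3 = rho*k3 - sigma**2 * k1 - 0.25 * beta**2 * k2**2
        k1, k2, k3 = k1 - d1*dt, k2 - d2*dt, k3 - d3*dt
    return k1, k2, k3  # coefficient values at t = 0

beta, delta, rho, sigma, m, T = 0.5, 0.2, 0.1, 1.0, 1.0, 10.0
lam = 0.0  # zero salvage value specification
k1_0, k2_0, k3_0 = solve_k(lam, beta, delta, rho, sigma, m, T)

# Linear salvage => k1 = 0, and k2 solves a linear ODE with closed form:
k2_exact = m/(delta+rho) + (lam*m - m/(delta+rho))*math.exp(-(delta+rho)*T)
print(k1_0, k2_0, k2_exact)
```

With λ = 1/(δ+ρ) the same integration returns k2(t) = m/(δ+ρ) for all t, the constant coefficient behind the even advertising level u = βk2/2 = mβ/(2(δ+ρ)).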
Specification I: natural salvage value, λ = 1/(δ + ρ). This boundary specification is a natural consequence of Nerlove–Arrow dynamics because the accumulated goodwill at the end of the planning horizon will decay in the absence of advertising thereafter. Therefore, for any arbitrary level G(T) = g, the discounted value of the profit stream over [T, ∞) with u(t) = 0 for t > T is (e^{-ρT} mg)/(δ + ρ) (Appendix A).

Specification II: zero salvage value, λ = 0. A zero salvage value specification would be appropriate for an industry characterized by rapid product obsolescence or short product life cycles (the latter is typically, though not always, a consequence of the former), so that the residual goodwill at the end of the horizon is worth nothing to the firm.

Specification III: secondary market salvage value, λ = 1. This boundary specification attaches no value whatsoever to any level of goodwill G(t) for t > T but recognizes that goodwill accumulated until time T may have value in a secondary market. The firm can dispose of its accumulated goodwill G(T) at $m/unit in a secondary market, but since it has to wait till t = T, it discounts the value of G(T) back to time t = 0.

Specification IV: high equity salvage value, λ > 1. This specification is appropriate for a firm with high brand equity such as Coca-Cola in consumer non-durables or Microsoft Windows in consumer durables. The goodwill enjoyed by such brands could arise from strong brand loyalty (as for Coke) or a captive customer base due to high switching costs (as for Microsoft Windows).

3.5. Optimal solution for the family of salvage values Φ(G(T), T) = λ e^{-ρT} mg
We will solve the stochastic control problem for the family of salvage values parametrized by λ, thereby obtaining the solutions for all four market conditions in one fell swoop. The value function is

V(g, t, T, λ) = e^{-ρt} { gm(1 + e^{(δ+ρ)(t−T)}(λ(δ+ρ) − 1))/(δ + ρ) + (β²m²/(4(δ+ρ)²)) [ (1 − e^{ρ(t−T)})/ρ + (2(λ(δ+ρ) − 1)/δ)(e^{ρ(t−T)} − e^{(δ+ρ)(t−T)}) + ((λ(δ+ρ) − 1)²/(2δ+ρ))(e^{ρ(t−T)} − e^{2(δ+ρ)(t−T)}) ] }    (8)

(Appendix B). Denoting the general optimal policy by u(t, T, λ) to indicate that it is open-loop, we obtain:

u(t, T, λ) = mβ/(2(δ+ρ)) + e^{(δ+ρ)(t−T)} mβ(λ(δ+ρ) − 1)/(2(δ+ρ)).    (9)

Define:

EvenPolicy = mβ/(2(δ+ρ)).    (10)

3.6. Properties of the general optimal policy

Eq. (9) is the optimal policy for the general family of salvage values specified in (7).

3.6.1. Asymptotic behavior of the general optimal policy
From (9), the policy is asymptotically even, ∀λ. Thus,

Lim_{T→∞} u(t, T, λ) = EvenPolicy.    (11)

3.6.2. Temporal behavior of the general optimal policy
Differentiating u(t, T, λ) with respect to t:

∂u(t, T, λ)/∂t = (1/2) e^{(δ+ρ)(t−T)} mβ(λ(δ+ρ) − 1).    (12)

Define λ_crit = 1/(δ + ρ) and derive the sign of ∂u(t, T, λ)/∂t:

∂u(t, T, λ)/∂t { < 0 for 0 ≤ λ < λ_crit; = 0 for λ = λ_crit; > 0 for λ > λ_crit. }    (13)

3.6.3. Terminal value of the general optimal policy

u(T, T, λ) = { 0 for λ = 0; mβ/(2(δ+ρ)) for λ = λ_crit; mβ/2 for λ = 1; mβλ/2 for λ > λ_crit. }

The optimal policy for each of the four market conditions corresponds to the specific value of λ characterizing that market condition. Table 1 summarizes the key results.

3.7. Natural specification (λ = 1/(δ + ρ))

u(t, T) = mβ/(2(δ+ρ)).    (14)
Thus it is optimal to maintain a constant advertising level, called an Even policy (Feinberg, 1992; Sasieni, 1989). The optimal policy increases with the margin (m) and effectiveness (β), and decreases with the decay rate (δ) and the discount rate (ρ). Since
the finite horizon problem with the natural boundary specification yields the same optimal policy as the infinite horizon problem, this result means that a decision-maker with a long-term perspective should keep her advertising level constant in markets described by Nerlove–Arrow dynamics.

Table 1
Optimal policies

Salvage value at t = T, given G(T) = g    Nature of optimal policy           Robustness of policy with respect to σ²
(e^{-ρT} mg)/(δ + ρ)                      Even                               Robust
0                                         Open loop, monotonic decreasing    Robust
e^{-ρT} mg                                Open loop, monotonic decreasing    Robust
λ e^{-ρT} mg, 1 < λ < 1/(δ + ρ)           Open loop, monotonic decreasing    Robust
λ e^{-ρT} mg, λ > 1/(δ + ρ)               Open loop, monotonic increasing    Robust

3.7.1. Zero salvage value specification (λ = 0)

u(t, T) = mβ/(2(δ+ρ)) − e^{(δ+ρ)(t−T)} mβ/(2(δ+ρ)).    (15)

From (10) and (15):

ZeroSalvagePolicy = EvenPolicy (1 − e^{(δ+ρ)(t−T)}).

3.7.2. Secondary market salvage value specification (λ = 1)

u(t, T) = mβ/(2(δ+ρ)) + e^{(δ+ρ)(t−T)} mβ((δ+ρ) − 1)/(2(δ+ρ)).    (16)

From (10) and (16):

SecondaryMarketPolicy = EvenPolicy (1 − (1 − (δ+ρ)) e^{(δ+ρ)(t−T)}).

Summing up, it is optimal to spend less than the even policy in both the zero salvage and secondary market conditions. Thus a decision-maker ignoring the zero salvage or secondary market constraint will overspend relative to an optimal decision maker. Under zero salvage and secondary market conditions, the qualitative nature of the policy is time-varying rather than even.

3.7.3. High equity salvage value (λ > 1)

u(t, T, λ) = mβ/(2(δ+ρ)) + e^{(δ+ρ)(t−T)} mβ(λ(δ+ρ) − 1)/(2(δ+ρ)), λ > 1.    (17)

From (10) and (17):

HighEquityPolicy = EvenPolicy (1 + (λ(δ+ρ) − 1) e^{(δ+ρ)(t−T)}).

Eq. (13) shows that the optimal control is monotonically decreasing for λ < λ_crit and monotonically increasing for λ > λ_crit. HighEquityPolicy < EvenPolicy ∀t for λ < λ_crit, and HighEquityPolicy > EvenPolicy ∀t for λ > λ_crit. Thus the high equity assumption mandates spending less than the even policy for λ < λ_crit and spending more than the even policy for λ > λ_crit. A decision-maker ignoring the high equity salvage value constraint will overspend relative to an optimal decision maker for λ < λ_crit and underspend relative to an optimal decision maker for λ > λ_crit.

4. Optimal length of planning horizon

Many decision-makers use planning horizons of 3, 5 or 10 years, but Starr (1966) observes that the practice is based on executive judgment rather than quantitative analysis. What is the best planning horizon length for advertising spending decisions, and how is the answer influenced by market parameters? Determining the planning horizon length to maximize the expected profit makes the issue unambiguous. Given an exogenously predetermined T, the value function V(g, t, T, λ) evaluated at t = 0 gives the maximum expected profit over [0, T]. Let C(T) be the cost associated with a horizon of length T, such that C(T) = cT (Karlin & Taylor, 1975). The problem is to endogenously determine T to maximize the value, net of the cost associated with the planning horizon length. Thus the problem is to find arg max_T (V(g, 0, T, λ) − cT). The value function V(g, t, T, λ) is shown in Eq. (8). Evaluating V(g, t, T, λ) at t = 0, we obtain the optimal expected value over [0, T] for fixed T and any arbitrary initial level g:

V(g, 0, T, λ) = gm(1 + e^{-(δ+ρ)T}(λ(δ+ρ) − 1))/(δ + ρ) + (β²m²/(4(δ+ρ)²)) [ (1 − e^{-ρT})/ρ + (2(λ(δ+ρ) − 1)/δ)(e^{-ρT} − e^{-(δ+ρ)T}) + ((λ(δ+ρ) − 1)²/(2δ+ρ))(e^{-ρT} − e^{-2(δ+ρ)T}) ].
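The value function V(g, 0, T, λ) can be evaluated directly for the different salvage specifications. The sketch below (parameter values are illustrative choices of mine, not from the paper) tabulates it for the zero salvage, secondary market, and natural cases, and confirms that at λ = 1/(δ+ρ) the λ-dependent terms vanish and V collapses to the simple natural-specification form.

```python
import math

def value_at_zero(g, T, lam, beta, delta, rho, m):
    """Evaluate V(g, 0, T, lambda) for the salvage family Phi = lam*e^{-rho*T}*m*g,
    writing A = lam*(delta+rho) - 1 for the deviation from the natural case."""
    A = lam*(delta + rho) - 1.0
    dr = delta + rho
    term_g = g*m*(1.0 + math.exp(-dr*T)*A)/dr
    bracket = ((1.0 - math.exp(-rho*T))/rho
               + (2.0*A/delta)*(math.exp(-rho*T) - math.exp(-dr*T))
               + (A**2/(2.0*delta + rho))*(math.exp(-rho*T) - math.exp(-2.0*dr*T)))
    return term_g + (beta**2 * m**2/(4.0*dr**2))*bracket

g, T, beta, delta, rho, m = 1.0, 10.0, 0.5, 0.1, 0.1, 1.0
lam_crit = 1.0/(delta + rho)  # natural specification
V_zero = value_at_zero(g, T, 0.0, beta, delta, rho, m)       # zero salvage
V_secondary = value_at_zero(g, T, 1.0, beta, delta, rho, m)  # secondary market
V_natural = value_at_zero(g, T, lam_crit, beta, delta, rho, m)

# At lam = 1/(delta+rho) the A-terms drop out and V reduces to:
V_natural_closed = (g*m/(delta+rho)
                    + (beta**2*m**2/(4.0*rho*(delta+rho)**2))*(1.0 - math.exp(-rho*T)))
print(V_zero, V_secondary, V_natural, V_natural_closed)
```

For these parameter values the expected value rises with λ, as one would expect: a larger salvage coefficient can only add to the value of terminal goodwill.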
4.1. Profit function under general boundary conditions (Φ(G(T), T) = λ e^{-ρT} mg)

The expected profit function is the difference between the expected value over [0, T] and the cost associated with a horizon of length T:

Π(T) = V(g, 0, T, λ) − cT.    (18)
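The profit function (18) can also be maximized directly on a grid. A minimal sketch for the natural specification λ = 1/(δ+ρ), under illustrative parameter values of my own: there the stationarity condition ∂Π/∂T = 0 reduces to e^{-ρT} = 4c(δ+ρ)²/(m²β²), and the grid search should recover that root.

```python
import math

def profit_natural(T, g, beta, delta, rho, m, c):
    """Pi(T) = V(g, 0, T) - c*T under the natural specification lam = 1/(delta+rho),
    where the value function collapses to
    V(g, 0, T) = g*m/(delta+rho) + (beta**2*m**2/(4*rho*(delta+rho)**2))*(1 - e^{-rho*T})."""
    V = (g*m/(delta+rho)
         + (beta**2*m**2/(4.0*rho*(delta+rho)**2))*(1.0 - math.exp(-rho*T)))
    return V - c*T

g, beta, delta, rho, m, c = 1.0, 1.0, 0.1, 0.1, 1.0, 0.5

# Closed-form stationary point: exp(-rho*T) = 4*c*(delta+rho)**2 / (m**2 * beta**2)
T_star = math.log(m**2 * beta**2 / (4.0*c*(delta+rho)**2)) / rho

# Grid search over horizon lengths in (0, 100]
grid = [i*0.01 for i in range(1, 10001)]
T_grid = max(grid, key=lambda T: profit_natural(T, g, beta, delta, rho, m, c))
print(T_star, T_grid)
```

The grid maximizer agrees with the closed-form horizon to within the grid resolution, and the concavity of Π(T) here makes the stationary point a global maximum.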
Setting ∂Π/∂T = 0 gives the optimality equation for the expected profit-maximizing T, and its solution yields the optimal T provided that the sufficiency condition ∂²Π/∂T² < 0 holds at the root of the optimality equation. Successively setting λ = 1/(δ+ρ), 0, 1, and λ > 1, we get the optimality equations for T for the natural, zero salvage, secondary market and high equity cases, respectively. A closed-form solution can be found for T in the case of natural boundary conditions. For the other three market conditions, no closed-form solution is available.

4.2. Optimal T (T*) under natural boundary conditions

T* = Log[m²β²/(4c(δ + ρ)²)]/ρ,

∂T*/∂β > 0, ∂T*/∂m > 0, ∂T*/∂δ < 0, ∂T*/∂ρ < 0, ∂T*/∂c < 0.

The derivations obtain in a straightforward manner. Should planning horizons be longer or shorter in markets with greater advertising effectiveness (decay)? Comparative statics provides answers to these questions. Naik, Mantrala, and Sawyer (1998) show that advertising quality erodes with time in some markets. In such markets, the average effectiveness of advertising over [0, T] is lower, and comparative statics shows that shorter planning horizons are better. The optimal planning horizon length increases with advertising effectiveness (β) and margin (m), and decreases with the decay rate (δ), discount factor (ρ), and cost parameter (c).

5. Impact of uncertainty

All the policies are robust with respect to uncertainty, a consequence of the linearity of the value function in g (Sethi, 1983). From the linearity of the value function, it follows that the policies would remain robust with respect to uncertainty for a more general specification of σ, say if σ were a function of the state variable. Since the value function does not depend upon σ, the optimal lengths of the planning horizons are also robust with respect to uncertainty.

6. Conclusions

The central issues in this research were the optimality of different temporal patterns of advertising spending and their implications for the optimal planning horizon. The focal questions were: is it better to spend evenly over time or to use more elaborate spending patterns, and how is the optimal planning horizon related to the spending pattern? Although an even policy is optimal under a specific boundary constraint, other boundary constraints generate patterns of optimal spending quite different from the even policy. Across different market conditions (as captured in the salvage values), longer planning horizons are
optimal when the advertising effectiveness increases or when the decay rate decreases.

Acknowledgement

The author gratefully acknowledges the helpful comments of two anonymous reviewers and the associate editor.

Appendix A

Let the process X_t be an Ito diffusion: dX = f(X) dt + σ(X) dW. For α > 0 and g(.) a bounded continuous function on R^n, define the resolvent operator R_α by R_α g(x) = E^x [∫_0^∞ e^{-αt} g(X_t) dt], where E^x is the expectation operator given X(0) = x; then Oksendal (2003, p. 141) shows that R_α g(x) = ∫_0^∞ e^{-αt} E^x [g(X_t)] dt.

Apply Oksendal's (2003) result to evaluate E_g (∫_T^∞ π(t) dt) with u(t) = 0 over [T, ∞). To evaluate E_g[G(t)], set u(t) = 0 in the stochastic differential equation (SDE) satisfied by G(t) to get dG = −δG dt + σ dW; apply E_g to both sides of the SDE for G(t), interchange the d and E_g operators (allowed by Fubini's Theorem) and solve the resulting ordinary differential equation for E_g[G(t)] using the condition at the boundary t = T for G(t) to get E_g[G(t)] = G(T) e^{-δ(t−T)}; finally, ∫_T^∞ e^{-ρt} E_g[mG(t)] dt = ∫_T^∞ e^{-ρt} mG(T) e^{-δ(t−T)} dt ⇒ Φ(G(T), T) = (e^{-ρT} mg)/(δ + ρ) for any G(T) = g.

Appendix B

Consider the family of salvage values Φ(G(T), T) = λ e^{-ρT} mg, for any G(T) = g. Determine an optimal control u(t) such that:

u(t) = Arg max_{u(t)} E_{G0} { ∫_0^T e^{-ρs} π(s) ds + Φ(G(T), T) },

where G(t) satisfies dG = f(G, u) dt + σ(G) dW. Define the value function V(g, t, T, θ), where θ = (β, δ, ρ, σ, m, c):

V(g, t, T, θ) = Max_{u(t)} E_g { ∫_t^T e^{-ρs} π(s) ds + Φ(G(T), T) }.

Then the HJB equation for the optimal control is (Fleming & Rishel, 1975):

−V_t = max_{u(t)} { e^{-ρt} π(t) + f(g, u) V_g + (σ²/2) V_gg },

where V_t = ∂V/∂t, V_g = ∂V/∂g and V_gg = ∂²V/∂g². The above non-linear partial differential equation is solved subject to G(t) = G_0 and the boundary condition Φ(G(T), T) = λ e^{-ρT} mg. Substituting V(g, t, T, θ) = e^{-ρt} [k1(t)g² + k2(t)g + k3(t)] into the HJB, we obtain

σ² k1(t) + (1/4) β² k2(t)² − ρ k3(t) + k3'(t) + g(m − (δ + ρ) k2(t) + β² k1(t) k2(t) + k2'(t)) + g² (k1'(t) + β² k1(t)² − (2δ + ρ) k1(t)) = 0,
where k_i'(t) = dk_i/dt. The k_i(t) satisfy three coupled non-linear differential equations, solved subject to the terminal conditions k1(T) = 0, k2(T) = λm and k3(T) = 0 implied by the salvage value, and finally V(g, t, T, θ) = e^{-ρt} {k1(t)g² + k2(t)g + k3(t)}. Solving for V(g, t, T, λ):

V(g, t, T, λ) = e^{-ρt} { gm(1 + e^{(δ+ρ)(t−T)}(λ(δ+ρ) − 1))/(δ + ρ) + (β²m²/(4(δ+ρ)²)) [ (1 − e^{ρ(t−T)})/ρ + (2(λ(δ+ρ) − 1)/δ)(e^{ρ(t−T)} − e^{(δ+ρ)(t−T)}) + ((λ(δ+ρ) − 1)²/(2δ+ρ))(e^{ρ(t−T)} − e^{2(δ+ρ)(t−T)}) ] }.
Next, substitute V_g = ∂V/∂g into u(g, t, T, λ) = (1/2) β e^{ρt} V_g to obtain:

u(g, t, T, λ) = mβ/(2(δ+ρ)) + e^{(δ+ρ)(t−T)} mβ(λ(δ+ρ) − 1)/(2(δ+ρ)).

V(g, T, T, λ) = λ e^{-ρT} mg at t = T, and so the boundary condition is indeed satisfied.

References

Bass, F. M., Krishnamoorthy, A., Prasad, A., & Sethi, S. P. (2005). Advertising competition with market expansion for finite horizon firms. Journal of Industrial and Management Optimization, 1(1), 1–21.
Chand, S., Sethi, S. P., & Sorger, G. (1992). Forecast horizons in the discounted dynamic lot size model. Management Science, 38(7), 1034–1048.
Dockner, E., & Jorgensen, S. (1988). Optimal advertising policies for diffusion models of new product innovation in monopolistic situations. Management Science, 34(1), 119–130.
Erickson, G. M. (1991). Dynamic models of advertising competition. Boston, MA: Kluwer Academic Publishers.
Feinberg, F. (1992). Pulsing policies for aggregate advertising models. Marketing Science, 11(3), 221–234.
Fleming, W. H., & Rishel, R. (1975). Deterministic and stochastic optimal control. New York: Springer.
Fruchter, G. E. (1999). The many player advertising game. Management Science, 45(11), 1609–1611.
Fruchter, G. E., & Kalish, S. (1997). Closed-loop advertising strategies in a duopoly. Management Science, 43(1), 54–63.
Karlin, S., & Taylor, H. (1975). A first course in stochastic processes (2nd ed.). New York: Academic Press.
Naik, P. A., Mantrala, M. K., & Sawyer, A. (1998). Planning pulsing media schedules in the presence of dynamic advertising quality. Marketing Science, 17(3), 214–235.
Nerlove, M., & Arrow, K. J. (1962). Optimal advertising policy under dynamic conditions. Economica, 29, 129–142.
Oksendal, B. (2003). Stochastic differential equations. New York: Springer.
Prasad, A., & Sethi, S. P. (2004). Competitive advertising under uncertainty: A stochastic differential game approach. Journal of Optimization Theory and Applications, 123(1), 163–185.
Raman, K., & Naik, P. A. (2004). Long-term profit impact of integrated marketing communications program. Review of Marketing Science, 2, Article 8.
Rao, R. C. (1986). Estimating continuous time advertising-sales models. Marketing Science, 5, 125–142.
Sasieni, M. W. (1971). Optimal advertising expenditure. Management Science, 18(4), 64–72.
Sasieni, M. W. (1989). Optimal advertising strategies. Marketing Science, 8(4), 358–370.
Sethi, S. P. (1974). Optimal institutional advertising: A minimum-time problem. Journal of Optimization Theory and Applications, 14, 213–231.
Sethi, S. P., & Chand, S. (1979). Planning horizon procedures for machine replacement models. Management Science, 25, 140–151.
Sethi, S. P. (1983). Deterministic and stochastic optimization of a dynamic advertising model. Optimal Control Applications and Methods, 4(2), 179–184.
Sethi, S. P., & Sorger, G. (1991). A theory of rolling horizon decision making. Annals of Operations Research, 29, 387–416.
Starr, M. (1966). Planning models. Management Science, 13(4), 115–141.
Tapiero, C. S. (1978). Optimal advertising and goodwill under uncertainty. Operations Research, 26, 450–462.

Kalyan Raman is Professor of Marketing at Loughborough Business School, Loughborough University, Leicestershire LE11 3TU, England, UK. He holds a Ph.D. in Management Science (Marketing concentration) from the University of Texas at Dallas. He attended graduate school at Purdue University and holds an M.S. in Statistics. He has published articles in Marketing Science, Management Science, Journal of Marketing Research, European Journal of Operational Research, Optimal Control Applications & Methods, Applied Mathematics Letters and other scholarly journals. He specializes in optimizing marketing decision making for problems with long-term and uncertain consequences. Kalyan is a US citizen and lives in Nottingham, England.