A multiobjective steepest descent method with applications to optimal ...

Comput Geosci (2016) 20:355–374 DOI 10.1007/s10596-016-9562-7

ORIGINAL PAPER

A multiobjective steepest descent method with applications to optimal well control

Xin Liu1 · Albert C. Reynolds1

Received: 19 March 2015 / Accepted: 1 February 2016 / Published online: 16 March 2016 © Springer International Publishing Switzerland 2016

Abstract Multiobjective optimization deals with mathematical optimization problems where two or more objective functions (cost functions) are to be optimized (maximized or minimized) simultaneously. In most cases of interest, the objective functions are in conflict, i.e., there does not exist a decision (design) vector (vector of optimization variables) at which every objective function takes on its optimal value. The solution of a multiobjective problem is commonly defined as a Pareto front, and any decision vector which maps to a point on the Pareto front is said to be Pareto optimal. We present an original derivation of an analytical expression for the steepest descent direction for multiobjective optimization for the case of two objectives. This leads to an algorithm which can be applied to obtain Pareto optimal points or, equivalently, points on the Pareto front when the problem is the minimization of two conflicting objectives. The method is in effect a generalization of the steepest descent algorithm for minimizing a single objective function. The steepest-descent multiobjective optimization algorithm is applied to obtain optimal well controls for two example problems where the two conflicting objectives are the maximization of the life-cycle (long-term) net-present-value (NPV) and the maximization of the short-term NPV. The results strongly suggest the multiobjective steepest-descent (MOSD) algorithm is more efficient than competing multiobjective optimization algorithms.

Albert C. Reynolds · Xin Liu

1 McDougall School of Petroleum Engineering, University of Tulsa, Tulsa, OK 74104, USA

Keywords Multiobjective optimization · Long- and short-term production optimization · Optimal well control

1 Introduction

Multiobjective optimization methods aim to find optimal solutions for conflicting objectives, where an optimal solution generally refers to a point on the Pareto front. Heuristic multiobjective optimization methods are the ones most commonly used in the engineering domain. Among these heuristic methods, the strength Pareto evolutionary algorithm-II (SPEA II) and the non-dominated sorting genetic algorithm-II (NSGA II) [17] are among the most popular and have been applied in numerous applications. Because all of these heuristic derivative-free algorithms are computationally feasible only when the number of optimization variables (design variables) is reasonably small, heuristic methods will not be discussed further. Other classes of multiobjective optimization algorithms, such as the weighted sum method [7, 16], the lexicographic method [14] and the normal-boundary intersection (NBI) method [2], attempt to obtain optimal solutions (points on the Pareto front) by transforming the original multiobjective optimization problem into a series of single-objective sub-problems. Each single-objective sub-problem can then be solved by an efficient gradient-based method, although a less computationally efficient derivative-free method could still be used. Instead of transforming the original optimization problem into single-objective sub-problems, the multiobjective descent algorithms of Fliege and Svaiter [4] and Qu et al. [10] provide theoretically sound methods for computing a search direction which is downhill for all objective


functions; a line search method can then be applied to choose a step size which guarantees a decrease in all objective functions. The algorithm presented here was motivated by the work of Fliege and Svaiter [4], but unlike their work, our research results provide an analytical expression for computing the search direction at each iteration. Fliege and Svaiter and Qu et al. provide no computational results, but our goal is to compute solutions of reservoir engineering multiobjective problems.

In production optimization, we normally maximize the NPV over the expected reservoir life. This is referred to as the long-term or life-cycle production optimization problem. From the operator’s point of view, it may be in his or her best interest to obtain maximum revenue over the next 1 or 2 years. Here, this last objective is categorized as the short-term production optimization problem. Maximizing the long-term NPV and maximizing the short-term NPV are two potentially conflicting objectives. After obtaining optimal long-term well controls, van Essen et al. [11, 12] and Chen et al. [1] attempted to improve the short-term NPV without sacrificing the optimal long-term NPV. When the number of control steps is large, this is often possible because there are sufficient degrees of freedom left after optimizing the long-term NPV. More specifically, in the original version of the van Essen et al. work [11], it is shown that the null space of the Hessian evaluated at the long-term optimal well controls is nontrivial. Thus, assuming a second-order Taylor series applies at the optimum vector of controls, we can add any linear combination of vectors in the null space to the vector of optimal well controls without decreasing the long-term NPV. In van Essen et al. [11], the Hessian is calculated at every iteration, which makes the method too computationally inefficient to be used for large-scale problems. Thus, van Essen et al. [11] introduced an alternative method (the switching method) to avoid the explicit calculation of the Hessian and its null space. Following the life-cycle optimization, the authors simply optimize the short-term NPV and the life-cycle NPV alternately, i.e., if the life-cycle NPV is greater than or equal to the optimal life-cycle NPV, the short-term optimization is executed; otherwise, they switch to optimizing the life-cycle NPV. The switching method is straightforward to implement; however, the authors pointed out that the convergence of the method is slow due to infeasible solution steps. An alternate strategy for maximizing both the long-term and the short-term NPV was proposed by Chen et al. [1]; in this procedure, they first solve the life-cycle constrained optimization problem and then optimize the short-term NPV subject to the constraint that the long-term NPV is greater than or equal to the optimal NPV obtained by life-cycle production optimization.


In this work, we consider the problem of determining well controls based on two conflicting criteria: maximizing the life-cycle NPV and maximizing the short-term NPV. Unlike the work of van Essen et al. [11] and Chen et al. [1], we do not attempt to optimize the short-term NPV while keeping the optimal long-term NPV unchanged. Instead, we apply multiobjective optimization to find a trade-off curve called the Pareto front. As discussed later, points on the Pareto front allow the engineer to choose how much of a decrease in the optimal life-cycle NPV he or she is willing to tolerate in order to increase the short-term NPV. Here, an individual point on the Pareto front is obtained by applying a multiobjective steepest descent (MOSD) algorithm where an analytical expression for the search direction is derived by solving a min-max problem. This MOSD algorithm is derived and applied to the general problem of determining well controls for the two-criteria problem of maximizing long-term and short-term NPV.

2 Multiobjective optimization and Pareto optimality

The multiobjective optimization problem (MOOP) has the form

$$\min\ f(u) = (f_1(u), f_2(u), \ldots, f_m(u))^T \quad \text{s.t. } u \in S, \tag{1}$$

where $f_i : \mathbb{R}^n \to \mathbb{R}$ and $S$ is the feasible region, which is a subset of $\mathbb{R}^n$. Note that a minimization problem can be converted to a maximization problem by changing the signs of all the $f_i$'s and vice versa. We refer to $u = (u_1, u_2, \ldots, u_n)^T$ as the decision or design vector. The domain of $f$, which here is the vector space $\mathbb{R}^n$, is called the decision space. We call $f(u) = (f_1(u), f_2(u), \ldots, f_m(u))^T$ the objective vector. The vector space $\mathbb{R}^m$ which contains the set of all objective vectors is called the objective space. The feasible region $S$ is defined by

$$S = \{u \in \mathbb{R}^n \mid e(u) = (e_1(u), e_2(u), \ldots, e_{n_e}(u))^T = 0,\ c(u) = (c_1(u), c_2(u), \ldots, c_{n_c}(u))^T \le 0\}, \tag{2}$$

where the $e_i$'s represent equality constraints and the $c_i$'s represent inequality constraints. The set $Z$ in the objective space is defined by

$$Z = \{f(u) = (f_1(u), f_2(u), \ldots, f_m(u))^T \mid u \in S\}. \tag{3}$$

Note that the set $Z$ is the image of the set $S$ in the objective space under the mapping of the vector function $f$.


Since we have more than one objective, unless all the objectives achieve their respective minima at the same design vector, we should expect to have some sort of trade-off in the optimal solutions. The “solution” of the multiobjective optimization problem is called the Pareto front. The Pareto front is a hyper-surface in the objective space. One important feature of this hyper-surface is that when moving from one point on this hyper-surface to another point on this hyper-surface, if one objective function value increases, at least one other objective function value must decrease. In order to define the Pareto front, we first introduce the notion of a dominance relationship.

Definition 1 Given any two design vectors $u^1, u^2 \in S \subset \mathbb{R}^n$, we say $u^1$ dominates $u^2$ ($u^1 \prec u^2$ or $f(u^1) \prec f(u^2)$) if (1) $\forall i \in \{1, 2, \ldots, m\}$, $f_i(u^1) \le f_i(u^2)$; (2) $\exists j \in \{1, 2, \ldots, m\}$ such that $f_j(u^1) < f_j(u^2)$.

Definition 2.1 A decision vector $u^* \in S \subset \mathbb{R}^n$ is Pareto optimal if there does not exist another decision vector $u \in S$ which dominates the vector $u^*$, i.e., $u^*$ is a non-dominated point in $S$.

Definition 2.2 The set $U = \{u \in S \mid u \text{ is a Pareto optimal point in } S\}$ is called the Pareto optimal set.

Definition 2.3 The Pareto front is defined as the set $F = \{(f_1(u), f_2(u), \ldots, f_m(u))^T \mid u \in U\}$.

From Definitions 2.2 and 2.3, we can see that the Pareto front is the image of the Pareto optimal set under the mapping from the design space to the objective space. It is important to note that, just as in the single-objective optimization problem, gradient-based algorithms, including the one presented here, can converge to a local Pareto optimal solution.

Definition 2.4 A decision vector $u^* \in S \subset \mathbb{R}^n$ is local Pareto optimal if there exists $\delta > 0$ such that $u^*$ is Pareto optimal in $S \cap B(u^*, \delta)$ where $B(u^*, \delta) = \{u \in \mathbb{R}^n \mid \|u - u^*\| \le \delta\}$.

Similarly, local Pareto fronts can exist. Because a gradient-based algorithm cannot distinguish between Pareto optimal and local Pareto optimal points, we assume here that there is only one Pareto optimal front and that if a decision vector is local Pareto optimal, it is Pareto optimal. The reader should bear in mind, however, that the multiobjective steepest descent algorithm presented in this work cannot distinguish between a local Pareto optimal decision vector and a (global) Pareto optimal decision vector.
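For a finite set of candidate objective vectors, the dominance relation of Definition 1 and the resulting non-dominated set can be checked directly. The following is a minimal sketch under that finite-set assumption; the function names `dominates` and `pareto_front` and the sample points are hypothetical and serve only as an illustration of the definitions for a minimization problem.

```python
import numpy as np

def dominates(fa, fb):
    """True if objective vector fa dominates fb (minimization, Definition 1)."""
    fa = np.asarray(fa, dtype=float)
    fb = np.asarray(fb, dtype=float)
    # (1) no objective is worse, and (2) at least one is strictly better
    return bool(np.all(fa <= fb) and np.any(fa < fb))

def pareto_front(points):
    """Non-dominated subset of a finite list of objective vectors."""
    pts = [np.asarray(p, dtype=float) for p in points]
    return [p for i, p in enumerate(pts)
            if not any(dominates(q, p) for j, q in enumerate(pts) if j != i)]

# Hypothetical two-objective values (f1, f2) of five candidate designs
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 5.0)]
front = pareto_front(candidates)
```

In this example, (3, 4) is dominated by (2, 3) and (5, 5) is dominated by every other candidate, so only three points remain on the discrete front.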


3 Multiobjective steepest descent method

Here, we present an original derivation of an analytical expression for “the optimal” search direction for a two-objective optimization problem and use this search direction, together with a very simple line search procedure, to obtain a multiobjective steepest descent (MOSD) method.

3.1 Background

As is well known [9], for a single-objective optimization problem, the steepest descent direction is obtained by solving

$$\min_{v}\ \nabla f(u)^T v \quad \text{s.t. } \|v\| = 1, \tag{4}$$

i.e., the steepest descent direction from the point $u$ is the direction, denoted by $d$, along which the directional derivative is minimum. Throughout, $\|\cdot\|$ refers to the $\ell_2$ norm unless specified otherwise. As is well known, for $\nabla f(u) \neq 0$, the steepest descent direction is given by $d = -\nabla f(u)/\|\nabla f(u)\|$. Thus, when there are two objective functions, it is natural to attempt to reduce the values of both objective functions by selecting a search direction in which the directional derivatives of both objective functions at a given $u$ are as small as possible. This goal can be achieved by solving

$$\min_{v}\ \max\{\nabla f_1(u)^T v,\ \nabla f_2(u)^T v\} \quad \text{s.t. } \|v\| = 1. \tag{5}$$

The solution of Eq. 5, denoted by $d$, is referred to as the multiobjective steepest descent search direction. As discussed in detail later, once $d$ is determined, $u$ is updated by conducting a line search from $u$ in the direction of $d$. If the vector $u$ is such that $\nabla f_1(u) = 0$ and $\nabla f_2(u) = 0$, then any unit vector $d$ is a solution of Eq. 5. However, if this $u$ corresponds to a strict minimum of either $f_1$ or $f_2$, then $u$ is local Pareto optimal and, hence, by the assumption made after Definition 2.4, $u$ is Pareto optimal. If $u$ corresponds to a maximum of both $f_1$ and $f_2$, then any unit vector provides a descent direction from $u$ for both $f_1$ and $f_2$. Having dealt with this degenerate case where both gradients are equal to zero at $u$, we assume in the remainder of the theoretical discussion that either $\nabla f_1(u) \neq 0$ or $\nabla f_2(u) \neq 0$. In solving Eq. 5, we also assume that there does not exist a positive constant $a$ such that

$$\nabla f_2(u) = -a \nabla f_1(u). \tag{6}$$

Under this assumption, the assumption that the gradients of the two objective functions are not both zero and the assumption that $u$ is not Pareto optimal, the problem specified by Eq. 5 has a unique solution $d$. This does not necessarily prove that the multiobjective steepest descent algorithm presented in the next section will converge to a Pareto optimal point; see the convergence discussion in the last paragraph of Section 3.2.
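The min-max problem of Eq. 5 can be explored numerically before any closed-form solution is available. The sketch below is an illustration only, restricted to two design variables: it scans unit vectors $v = (\cos\beta, \sin\beta)^T$ on a fine grid of angles and keeps the one with the smallest maximum directional derivative. The function name `minmax_direction_2d`, the grid size, and the two gradient pairs are assumptions made for this example.

```python
import numpy as np

def minmax_direction_2d(g1, g2, n_angles=360000):
    """Approximate the solution of Eq. 5 in R^2 by a brute-force angle scan."""
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    V = np.column_stack((np.cos(angles), np.sin(angles)))  # candidate unit vectors
    worst = np.maximum(V @ g1, V @ g2)  # larger of the two directional derivatives
    k = int(np.argmin(worst))
    return V[k], float(worst[k])

# Orthogonal gradients of equal norm: the scan returns (approximately) the
# bisector of the two negative gradients.
d_a, val_a = minmax_direction_2d(np.array([1.0, 0.0]), np.array([0.0, 1.0]))

# Gradients for which the smaller gradient alone determines the direction.
d_b, val_b = minmax_direction_2d(np.array([1.0, 0.0]), np.array([3.0, 1.0]))
```

In the first case both directional derivatives along the returned direction are equal and negative; in the second case the returned direction is the negative of the smaller gradient, normalized.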

3.2 Derivation of MOSD search direction

We establish theory which provides a rigorous basis for our subsequent multiobjective steepest descent (MOSD) algorithm, and we derive an analytical form for the “steepest” descent direction for a two-objective optimization problem. As noted previously, we assume that, at $u$, the gradients of the two objective functions are not both zero and seek a vector $v$ of unit length which solves Eq. 5. If $\nabla f_1(u)$ and $\nabla f_2(u)$ are linearly dependent, then either one of the gradients is zero or $\nabla f_1(u)$ is a multiple of $\nabla f_2(u)$. For the first case, without loss of generality, we assume that $\nabla f_1(u) = 0$; then, the solution of Eq. 5 reduces to the solution of Eq. 4 with $f = f_2$, i.e., is given by $d = -\nabla f_2(u)/\|\nabla f_2(u)\|$. However, for the case where $\nabla f_1(u) = 0$ and $u$ corresponds to a strict local minimum of $f_1(u)$, $u$ will be Pareto optimal and $(f_1(u), f_2(u))$ will be on the Pareto front. Thus, the algorithm has converged to a Pareto optimal point and a small step in the direction $d$ will not yield a point that dominates $u$. For the second case where $\nabla f_1(u) = c\nabla f_2(u)$, if $c > 0$, then the solution to Eq. 5 is $-\nabla f_2(u)/\|\nabla f_2(u)\|$; if $c < 0$, then $\nabla f_1(u)$ and $\nabla f_2(u)$ are in opposite directions. In this case, there is no direction along which both objective functions can be decreased, i.e., the current solution point $(f_1(u), f_2(u))$ is stationary and very likely on the Pareto front; hence, we do not need to solve Eq. 5. This issue is discussed in more detail after Theorem 1. The preceding discussion is summarized in part (i) of the following lemma:

Lemma 1 Assume that (a) $\nabla f_2(u) \neq 0$, (b) $u$ is not Pareto optimal and (c) $\nabla f_2(u) \neq -a\nabla f_1(u)$ for any positive constant $a$. (i) If $\nabla f_1(u)$ and $\nabla f_2(u)$ are linearly dependent, then $d = -\nabla f_2(u)/\|\nabla f_2(u)\|$ is the solution of Eq. 5. (ii) If $\{\nabla f_1(u), \nabla f_2(u)\}$ is a linearly independent set and $d$ is a solution of Problem 5, then $d$ is a linear combination of $\nabla f_1(u)$ and $\nabla f_2(u)$.

Proof Part (i) has been established previously. Thus, we need only prove (ii). Let $S$ be the hyperplane spanned by the two gradients evaluated at $u$, i.e., $S = \operatorname{span}\{\nabla f_1(u), \nabla f_2(u)\}$. Then any unit vector $v$ can be written as the sum of a vector $v^{\parallel} \in S$ and a vector $v^{\perp}$ which is normal to $S$, i.e.,

$$v = v^{\parallel} + v^{\perp}. \tag{7}$$

As $v^{\perp}$ is normal to $S$,

$$\nabla f_1(u)^T v^{\perp} = 0 \quad \text{and} \quad \nabla f_2(u)^T v^{\perp} = 0; \tag{8}$$

it follows that

$$\nabla f_1^T v = \nabla f_1^T v^{\parallel} \quad \text{and} \quad \nabla f_2^T v = \nabla f_2^T v^{\parallel}. \tag{9}$$

Thus, the problem specified in Eq. 5 becomes

$$\min_{v^{\parallel},\, v^{\perp}}\ \max\{\nabla f_1^T v^{\parallel},\ \nabla f_2^T v^{\parallel}\} \quad \text{s.t. } \|v^{\parallel} + v^{\perp}\|^2 = \|v^{\parallel}\|^2 + \|v^{\perp}\|^2 = 1. \tag{10}$$

The equality constraint in Eq. 10 implies that $\|v^{\perp}\| = 0$ if and only if $\|v^{\parallel}\| = 1$. Next we show that when the two gradients are linearly independent, a solution, $d$, of Eq. 5 satisfies

$$\nabla f_1(u)^T d < 0 \quad \text{and} \quad \nabla f_2(u)^T d < 0. \tag{11}$$

Consider

$$d' = -\frac{1}{2}\left(\frac{\nabla f_1}{\|\nabla f_1\|} + \frac{\nabla f_2}{\|\nabla f_2\|}\right). \tag{12}$$

Then

$$\nabla f_1^T d' = -\frac{1}{2}\,\frac{\nabla f_1^T \nabla f_1 \|\nabla f_2\| + \|\nabla f_1\| \nabla f_1^T \nabla f_2}{\|\nabla f_1\|\|\nabla f_2\|} = -\frac{1}{2}\,\frac{\|\nabla f_1\|\left(\|\nabla f_1\|\|\nabla f_2\| + \nabla f_1^T \nabla f_2\right)}{\|\nabla f_1\|\|\nabla f_2\|}. \tag{13}$$

Since $\nabla f_1$ and $\nabla f_2$ are linearly independent,

$$|\nabla f_1^T \nabla f_2| = \|\nabla f_1\|\|\nabla f_2\||\cos(\alpha)| < \|\nabla f_1\|\|\nabla f_2\|, \tag{14}$$

which is simply the Cauchy-Schwarz inequality in its strict form. Equation 14 indicates $\nabla f_1^T \nabla f_2 + \|\nabla f_1\|\|\nabla f_2\| > 0$. Hence, from Eq. 13, we have $\nabla f_1^T d' < 0$. With a similar derivation, we can show that $\nabla f_2^T d' < 0$. Since $d$ is a solution to Eq. 5,

$$\max\{\nabla f_1^T d,\ \nabla f_2^T d\} \le \max\left\{\nabla f_1^T \frac{d'}{\|d'\|},\ \nabla f_2^T \frac{d'}{\|d'\|}\right\} < 0, \tag{15}$$

so Eq. 11 holds, which implies that any solution $d$ to Eq. 5 is a descent direction for both objective functions. Let us rewrite Eq. 10 (equivalent to Eq. 5) as

$$\min_{\|v^{\parallel} + v^{\perp}\| = 1}\ \|v^{\parallel}\| \max\{\|\nabla f_1\| \cos\hat{\alpha},\ \|\nabla f_2\| \cos\hat{\beta}\}, \tag{16}$$

where $\hat{\alpha}$ is the angle between $\nabla f_1$ and $v^{\parallel}$ and $\hat{\beta}$ is the angle between $\nabla f_2$ and $v^{\parallel}$. Let $d = v^{\parallel} + v^{\perp}$ be a solution to Eq. 16, which is equivalent to Eq. 10; then Eq. 11 implies that

$$\cos\hat{\alpha} < 0 \quad \text{and} \quad \cos\hat{\beta} < 0. \tag{17}$$

If $\|v^{\parallel}\| < 1$, then $d = v^{\parallel} + v^{\perp}$ would not be a solution of Eq. 16 because we could obtain a smaller value of $\|v^{\parallel}\| \max\{\|\nabla f_1\| \cos\hat{\alpha}, \|\nabla f_2\| \cos\hat{\beta}\}$ by increasing $\|v^{\parallel}\|$ and keeping $\hat{\alpha}$ and $\hat{\beta}$ fixed. Thus, if $d = v^{\parallel} + v^{\perp}$ is a solution, we must have $v^{\perp} = 0$ and $d = v^{\parallel}$. Thus, $d$ is a linear combination of the two gradients, which completes the proof.

Next, we proceed to construct the solution of Problem 5 for the case where $\nabla f_1(u)$ and $\nabla f_2(u)$ are linearly independent. Because our derivation yields a specific formula for $d$, it implies that the solution of Eq. 5 is unique. The situation where the two gradients are linearly independent is depicted schematically in Fig. 1, where we define the x-axis to coincide with the direction of $\nabla f_1(u)$, denote the angle between $\nabla f_1(u)$ and $\nabla f_2(u)$ as $\alpha$, and denote the angle between $v$ and $\nabla f_1(u)$ as $\beta$. Because $\|v\| = 1$, the directional derivatives of the two objective functions in the direction of $v$ are given by

$$\nabla f_1(u)^T v = \|\nabla f_1(u)\| \cos(\beta), \qquad \nabla f_2(u)^T v = \|\nabla f_2(u)\| \cos(\beta - \alpha), \tag{18}$$

where $\beta$ ranges from 0 to $2\pi$, and $\alpha$ ranges from 0 to $\pi$. If $\alpha$ is greater than $\pi$, we can choose the x-axis to coincide with $\nabla f_2(u)$, interchange the roles of $f_1$ and $f_2$ and end up with the same problem with $\alpha$ ranging from 0 to $\pi$.

Fig. 1 The hyperplane constructed from $\nabla f_1$ and $\nabla f_2$

Equation 18 indicates that both directional derivatives decrease as $\beta$ increases from $\alpha$ to $\pi$ or as $\beta$ decreases from $2\pi$ to $\pi + \alpha$. Thus, the optimal value of $\beta$ is not in these open intervals, i.e., a value of $\beta$ that corresponds to a solution of Problem 5 must be in the interval $[0, \alpha]$ or the interval $[\pi, \pi + \alpha]$. From Eq. 18, it is clear that there does not exist a $\beta \in [0, \alpha]$ such that $\nabla f_1(u)^T v < 0$ and $\nabla f_2(u)^T v < 0$, whereas there exist values of $\beta$ in $[\pi, \pi + \alpha]$ such that both directional derivatives are negative. Therefore, the solution should correspond to $\beta \in [\pi, \pi + \alpha]$. Since $\cos(\beta)$ is an increasing function of $\beta$ as $\beta$ increases from $\pi$ to $\pi + \alpha$ while $\cos(\beta - \alpha)$ is a decreasing function of $\beta$ as $\beta$ increases from $\pi$ to $\pi + \alpha$, it follows that the behavior of the directional derivatives on $[\pi, \pi + \alpha]$ must be qualitatively as shown in Fig. 2a or Fig. 2b, i.e., either the two directional derivatives intersect (Fig. 2b) or they do not intersect (Fig. 2a). The labels on the two subfigures will be clarified later. At this point, we have converted Eq. 5 to the following equivalent problem:

$$\min_{\beta}\ \max\{\|\nabla f_1(u)\| \cos(\beta),\ \|\nabla f_2(u)\| \cos(\beta - \alpha)\} \quad \text{s.t. } \pi \le \beta \le \pi + \alpha \tag{19}$$

for any given $\alpha \in (0, \pi]$ where $\alpha$ is the angle between the two gradients; see Fig. 1. Without loss of generality, let us assume that $\|\nabla f_1(u)\| \le \|\nabla f_2(u)\|$. To find the $\beta$ that gives a solution to Problem 19, it suffices to consider the two scenarios depicted in Fig. 2. (There would be two other similar scenarios if we did not assume $\|\nabla f_1(u)\| \le \|\nabla f_2(u)\|$.) Note from Fig. 1 that the unit vector $v$ is given by

$$v = (\cos(\beta), \sin(\beta))^T. \tag{20}$$

For the scenario depicted in Fig. 2a, in the interval $[\pi, \pi + \alpha]$, it can be observed that $\nabla f_1^T v$ is always higher than $\nabla f_2^T v$ so that the problem $\min_v \max\{\nabla f_1(u)^T v, \nabla f_2(u)^T v\}$ becomes $\min_v \nabla f_1(u)^T v$. It is clear that $\beta = \pi$ is the solution, i.e., the optimal $v$ is in the direction of $-\nabla f_1(u)$, i.e., $d = -\nabla f_1(u)/\|\nabla f_1(u)\|$. Since $\beta = \pi$ minimizes the directional derivative of $f_1(u)$ and corresponds to the solution of the problem defined by Eq. 19, or, equivalently, the problem defined by Eq. 5, it follows that the directional derivative of $f_1$ at $u$ must be greater than or equal to the directional derivative of $f_2$ at $u$. Thus,

$$-\|\nabla f_1(u)\| \ge \|\nabla f_2(u)\| \cos(\pi - \alpha) = -\|\nabla f_2(u)\| \cos(\alpha) = -\|\nabla f_2(u)\|\,\frac{\nabla f_1(u)^T \nabla f_2(u)}{\|\nabla f_1(u)\|\,\|\nabla f_2(u)\|}, \tag{21}$$

which is equivalent to both of the two following equations:

$$\nabla f_1(u)^T \nabla f_1(u) \le \nabla f_1(u)^T \nabla f_2(u), \tag{22}$$

and

$$\nabla f_1(u)^T (\nabla f_1(u) - \nabla f_2(u)) \le 0. \tag{23}$$

Conversely, if Eq. 23 holds, then Eq. 21 holds, i.e.,

$$-\|\nabla f_1(u)\| \ge \|\nabla f_2(u)\| \cos(\pi - \alpha). \tag{24}$$

The left-hand side of Eq. 24 is the directional derivative of $f_1(u)$ when $\beta = \pi$, or, equivalently, when $v = -\nabla f_1(u)/\|\nabla f_1(u)\|$. This $v$ corresponds to the minimum directional derivative of $f_1$; thus, for any other unit vector $v$, it follows from Eq. 24 that

$$\nabla f_1^T v > -\|\nabla f_1(u)\| \ge \|\nabla f_2(u)\| \cos(\pi - \alpha) > \nabla f_2^T v. \tag{25}$$


Thus, we have proved the following result:

Lemma 2 Assume without loss of generality that $\|\nabla f_1(u)\| \le \|\nabla f_2(u)\|$ and $\nabla f_1(u) \neq 0$. Then the solution of the min-max problem defined by Eq. 5 (or equivalently Eq. 19) is given by

$$d = -\frac{\nabla f_1(u)}{\|\nabla f_1(u)\|}$$

if and only if Eq. 23 holds.

Fig. 2 The two possible scenarios for determining the optimal multiobjective search direction

It remains to consider the scenario depicted qualitatively in Fig. 2b. From Fig. 2b, it is clear that the solution of the min-max problem defined by Eq. 5 is the $d$ which satisfies

$$\nabla f_1(u)^T d = \nabla f_2(u)^T d. \tag{26}$$

From Lemma 1, $d$ must be in the hyperplane spanned by the two linearly independent gradients. Hence, $d$ can be written as

$$d = d_1(-\nabla f_1) + d_2(-\nabla f_2). \tag{27}$$

From Fig. 1, we observe that when $\beta = \pi$, $v = a[-\nabla f_1(u)]$ where $a > 0$, and when $\beta = \pi + \alpha$, $v = b[-\nabla f_2(u)]$ where $b > 0$. Thus, for any $\beta \in [\pi, \pi + \alpha]$, $d$ can be written as a nonnegative linear combination of $-\nabla f_1(u)$ and $-\nabla f_2(u)$, i.e., $d_1 \ge 0$ and $d_2 \ge 0$ where $d_1$ and $d_2$ cannot both be zero as $d \neq 0$. Without loss of generality, we assume that $d_2 \neq 0$. Then, we may divide Eq. 27 by $d_2$ and redefine $d_1$ as $d_1/d_2$ to obtain

$$D = d_1(-\nabla f_1(u)) + (-\nabla f_2(u)), \tag{28}$$

where $D = d/d_2$. If we determine $d_1$ in Eq. 28 such that the resulting $D$ satisfies Eq. 26, we simply have to normalize $D$ by dividing it by its norm to obtain a vector, $d$, of length one which satisfies Eq. 26. Substituting Eq. 28 into Eq. 26 and solving for $d_1$ gives

$$d_1 = \frac{\|\nabla f_2(u)\|^2 - \nabla f_1(u)^T \nabla f_2(u)}{\|\nabla f_1(u)\|^2 - \nabla f_1(u)^T \nabla f_2(u)}. \tag{29}$$

Substituting Eq. 29 into Eq. 28 gives

$$D = \frac{\|\nabla f_2(u)\|^2 - \nabla f_1(u)^T \nabla f_2(u)}{\|\nabla f_1(u)\|^2 - \nabla f_1(u)^T \nabla f_2(u)}\,(-\nabla f_1(u)) - \nabla f_2(u); \tag{30}$$

thus, the solution for $d$ when the qualitative behavior of Fig. 2b applies is given by

$$d = \frac{D}{\|D\|}, \tag{31}$$

where $D$ is given by Eq. 30. Combining both scenarios under the assumption that $\|\nabla f_1(u)\| \le \|\nabla f_2(u)\|$, it follows that the solution to the min-max problem of Eq. 5 is

$$d = \begin{cases} -\dfrac{\nabla f_1}{\|\nabla f_1\|}, & \text{if } \nabla f_1^T(\nabla f_1 - \nabla f_2) \le 0; \\[2ex] \dfrac{D}{\|D\|}, & \text{otherwise.} \end{cases} \tag{32}$$

Note that to obtain the analogous solution of Problem 5 when $\|\nabla f_1(u)\| > \|\nabla f_2(u)\|$, we simply interchange $f_1$ and $f_2$ in the equation for $D$ and the equations for $d$. All preceding results, including the results of Lemma 1 and Lemma 2, are summarized in the following theorem.

Theorem 1 Assume that (a) $\nabla f_1(u)$ and $\nabla f_2(u)$ are not both zero, (b) $u$ is not Pareto optimal and (c) $\nabla f_2(u) \neq -a\nabla f_1(u)$ for any positive constant $a$. Consider the min-max problem

$$\min_{v}\ \max\{\nabla f_1(u)^T v,\ \nabla f_2(u)^T v\} \quad \text{s.t. } \|v\| = 1,$$

the solution of which is by definition the biobjective steepest descent direction. If $\|\nabla f_1(u)\| \le \|\nabla f_2(u)\|$, let $g_s = \nabla f_1(u)$ and $g_\ell = \nabla f_2(u)$; otherwise, let $g_s = \nabla f_2(u)$ and $g_\ell = \nabla f_1(u)$. If $\{\nabla f_1(u), \nabla f_2(u)\}$ is a linearly independent set, then the solution $d$ to this min-max problem can be written as

$$d = \begin{cases} -\dfrac{g_s}{\|g_s\|}, & \text{if } g_s^T(g_s - g_\ell) \le 0; \\[2ex] \dfrac{D}{\|D\|}, & \text{otherwise,} \end{cases} \tag{33}$$

where

$$D = \frac{\|g_\ell\|^2 - g_s^T g_\ell}{\|g_s\|^2 - g_s^T g_\ell}\,(-g_s) - g_\ell.$$

Theorem 1 gives the “optimal” search direction for our multiobjective steepest descent (MOSD) method. An equivalent formula for a so-called multiple-gradient descent direction has been presented in [3]. The derivation in [3] originates from a very different theoretical perspective, while the derivation presented here is quite simple. As indicated in the algorithm given in the next section, we assume that the MOSD algorithm has converged if the gradients of the two objective functions are in opposite directions, i.e., we assume that $u^*$ is a Pareto optimal solution when $\nabla f_1(u^*) + \lambda \nabla f_2(u^*) = 0$, $\lambda > 0$. Theoretically, for a solution to be Pareto optimal, it is also necessary to require that, for any $d_n$ normal to $\nabla f_1(u^*)$ (hence, also normal to $\nabla f_2(u^*)$), i.e., $d_n^T \nabla f_1(u^*) = 0$ (and $d_n^T \nabla f_2(u^*) = 0$), at least one of the two terms $d_n^T \nabla^2 f_1(u^*) d_n$ and $d_n^T \nabla^2 f_2(u^*) d_n$ is greater than zero (see Theorem 3.5 in [15]). Otherwise, it may be possible to decrease both objective functions along the direction $d_n$ even though Theorem 1 does not provide a biobjective steepest descent direction. In practice, it appears to be very unlikely that the line search algorithm converges to a solution where both Hessians are either indefinite or negative definite in the subspace spanned by vectors normal to both gradients.

3.3 Algorithm for multiobjective minimization

After obtaining the search direction for the multiobjective steepest descent (MOSD) method, we use the simplest backtracking method (cutting the step-size by half) to determine the line search step-size. If the gradients are not zero, the angle between them should be 180 degrees ($\alpha = \pi$ radians) at a Pareto optimal solution. Thus, if $k$ denotes the iteration index in the MOSD algorithm, the ideal convergence criterion is of the form

$$\frac{\nabla f_1(u^k)^T \nabla f_2(u^k)}{\|\nabla f_1(u^k)\|\,\|\nabla f_2(u^k)\|} < -\epsilon_\alpha, \tag{34}$$

where $\epsilon_\alpha \approx 1$. Our default value of $\epsilon_\alpha$ is 0.99; however, because gradients are evaluated by the implementation of an adjoint method coupled with our reservoir simulator, the gradients are only numerical approximations of the true gradients. Hence, it is often difficult to satisfy Eq. 34 with $\epsilon_\alpha \ge 0.98$. In this case, we terminate the algorithm when we are unable to find a step length that results in a decrease in both objective functions along the multiobjective steepest descent direction.

Algorithm 1 Unconstrained multiobjective steepest descent algorithm

Set $u^0$, $\alpha^0$, maxcut, $\epsilon_\alpha$ (default is $\epsilon_\alpha = 0.99$)
FOR $k = 0, 1, 2, \cdots$
    set ncut = 0
    evaluate $\nabla f_1(u^k)$, $\nabla f_2(u^k)$; calculate $d^k$ using Eq. 33
    IF $\cos(\theta^k) = \dfrac{\nabla f_1(u^k)^T \nabla f_2(u^k)}{\|\nabla f_1(u^k)\|\,\|\nabla f_2(u^k)\|} < -\epsilon_\alpha$
        stop
    ENDIF
    DO
        IF $f_1(u^k + \alpha^k d^k) < f_1(u^k)$ and $f_2(u^k + \alpha^k d^k) < f_2(u^k)$
            $u^{k+1} = u^k + \alpha^k d^k$, $f_1(u^{k+1}) = f_1(u^k + \alpha^k d^k)$, $f_2(u^{k+1}) = f_2(u^k + \alpha^k d^k)$
            $\alpha^{k+1} = \min\{2\alpha^k, \alpha^0\}$ (factor 2 restores step-size)
            exit
        ELSE
            $\alpha^k = \alpha^k/2$, ncut = ncut + 1
            IF ncut > maxcut
                stop
            ENDIF
        ENDIF
    ENDDO
ENDFOR

4 Weighted sum and NBI methods

For the completeness of this paper, we provide a short introduction to the weighted sum method and the NBI method. For a detailed description of the two methods, we refer the readers to [8]. For the weighted sum method, we assign two weights, $w_1$ and $w_2$, to the two functions, where $w_1$ and $w_2$ are nonnegative and $w_1 + w_2 = 1$. We obtain the aggregate function $F$ by adding the two objective functions with their corresponding weights, i.e.,

$$F(u) = w_1 f_1(u) + w_2 f_2(u). \tag{35}$$
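To make the contrast between the weighted-sum aggregate of Eq. 35 and the MOSD iteration of Section 3 concrete, the following sketch applies both to a toy problem with two convex quadratic objectives. It is an illustration under stated assumptions, not the implementation used for the reservoir examples: the names `mosd_direction`, `mosd` and `weighted_sum_descent`, the `maxiter` safeguard, the fixed step size in the weighted-sum loop, and the test functions $f_1(u) = \|u - a\|^2$, $f_2(u) = \|u - b\|^2$ are all hypothetical choices.

```python
import numpy as np

def mosd_direction(g1, g2):
    """Search direction of Eq. 33 (gs is the gradient with the smaller norm)."""
    gs, gl = (g1, g2) if g1 @ g1 <= g2 @ g2 else (g2, g1)
    if gs @ (gs - gl) <= 0.0:
        return -gs / np.linalg.norm(gs)
    coef = (gl @ gl - gs @ gl) / (gs @ gs - gs @ gl)  # d1 of Eq. 29
    D = coef * (-gs) - gl
    return D / np.linalg.norm(D)

def mosd(f1, f2, grad1, grad2, u0, alpha0=1.0, eps_alpha=0.99,
         maxcut=20, maxiter=100):
    """Steepest descent loop with step halving, following Algorithm 1."""
    u, alpha = np.asarray(u0, dtype=float), alpha0
    for _ in range(maxiter):  # maxiter is an added safeguard (assumption)
        g1, g2 = grad1(u), grad2(u)
        n1, n2 = np.linalg.norm(g1), np.linalg.norm(g2)
        # Stop at a zero gradient or when the gradients are nearly opposite (Eq. 34)
        if n1 == 0.0 or n2 == 0.0 or (g1 @ g2) / (n1 * n2) < -eps_alpha:
            break
        d = mosd_direction(g1, g2)
        ncut = 0
        # Halve the step until both objectives decrease
        while not (f1(u + alpha * d) < f1(u) and f2(u + alpha * d) < f2(u)):
            alpha, ncut = alpha / 2.0, ncut + 1
            if ncut > maxcut:
                return u
        u = u + alpha * d
        alpha = min(2.0 * alpha, alpha0)
    return u

def weighted_sum_descent(grad1, grad2, w1, u0, step=0.25, iters=60):
    """Fixed-step steepest descent on F = w1*f1 + (1 - w1)*f2 (Eq. 35)."""
    u = np.asarray(u0, dtype=float)
    for _ in range(iters):
        u = u - step * (w1 * grad1(u) + (1.0 - w1) * grad2(u))
    return u

# Toy problem: the Pareto optimal set is the segment between a and b.
a, b = np.array([0.0, 0.0]), np.array([2.0, 0.0])
f1 = lambda u: float((u - a) @ (u - a))
f2 = lambda u: float((u - b) @ (u - b))
grad1 = lambda u: 2.0 * (u - a)
grad2 = lambda u: 2.0 * (u - b)

u_start = [0.5, 1.5]
u_mosd = mosd(f1, f2, grad1, grad2, u_start)
u_ws = weighted_sum_descent(grad1, grad2, 0.75, u_start)
```

For this toy problem, the weighted-sum minimizer is $w_1 a + w_2 b$, so the choice $w_1 = 0.75$ gives the point $(0.5, 0)$; the MOSD iteration started from the same initial guess reaches that same point on the segment without any weight having to be chosen in advance.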

Then by minimizing (or maximizing) F (u), we obtain a solution on the Pareto front. By varying the value of w1 , we can obtain different points on the Pareto front. To implement the NBI method, we first need to perform one optimization for each individual objective function. Let

362

Comput Geosci (2016) 20:355–374

is to provide the engineer a choice of optimal well controls which represent a trade off between the two conflicting objectives of maximizing the long-term (or life-cycle) NPV and maximizing the short-term NPV. 5.1 Problem formulation We define the NPV of production from a two phase (oil and water) reservoir as ⎛⎡ Np Nt t n n n ⎝ ⎣ (ron qo,i − rwn qw,i ) J (u) = n (1 + b)t /365 n=1 i=1 ⎤⎞ Nwi n t n n − rwi qwi,i , ⎦⎠ , (37) n (1 + b)t /365

Fig. 3 Schematic plot of the NBI method

i=1

us denote the optimal point for the first objective function as u∗1 and the optimal point for the second objective function as u∗2 . We define f1∗ = (f1 (u∗1 ), f2 (u∗1 ))T , f2∗ = (f1 (u∗2 ), f2 (u∗2 ))T and a 2 × 2 matrix = (f1∗ , f2∗ ). The line segment connecting f1∗ and f2∗ in the objective space is given by {(f1∗ , f2∗ )(β1 , β2 )T = β | β ∈ R 2 , β1 > 0, β2 > 0, β1 + β2 = 1}. We call this line segment the utopia line. Conceptually speaking, starting from a point on the utopia line, we attempt to search along the unit normal, n, to the utopia line in the objective space until we find a point on the boundary of Z. The schematic plot of the NBI method is shown in Fig. 3. With NBI, we obtain a boundary point by choosing β and solving the following sub-problem:

max t s.t. e(u, t) = β + tn − f (u) = 0.

(36)

Then we can obtain different points on the boundary by varying the values of β. The NBI method generates solutions that are distributed uniformly on the boundary. Therefore, in terms of finding the “best” representation of the Pareto front, the NBI method outperforms the MOSD introduced here. However, since the NBI method has additional equality constraints, it is normally more computationally expensive than the MOSD and the weighted sum methods. The advantage of the MOSD method is that, when we have a vector of design variables (say from an engineering guess), then the MOSD method can quickly provide a new vector of design variables which is better than the original vector of design variables for all objective functions.

5 Application to reservoir engineering The focus of this section is the application of the algorithm to two optimal well control problem examples. The goal

where u is the control vector; N_t is the total number of time steps; N_p is the total number of producers; N_wi is the total number of water injection wells; the superscript n denotes the nth time step; r_o is the oil revenue ($/STB); r_w is the disposal cost of produced water ($/STB); r_wi is the water injection cost ($/STB); q^n_{o,i} is the average oil production rate over the nth time step of the ith producer (STB/day); q^n_{w,i} is the average water production rate over the nth time step of the ith producer (STB/day); q^n_{wi,i} is the average water injection rate over the nth time step of the ith water injector (STB/day); b is the annual discount rate; t^n is the cumulative time up to the nth simulator time step (days); and Δt^n is the length of the nth time step (days). Define

\[
J_L(u) = \sum_{n=1}^{N_{tL}} \left[ \sum_{i=1}^{N_p} \left( r_o^n q_{o,i}^n - r_w^n q_{w,i}^n \right) \frac{\Delta t^n}{(1+b)^{t^n/365}} - \sum_{i=1}^{N_{wi}} r_{wi}^n q_{wi,i}^n \frac{\Delta t^n}{(1+b)^{t^n/365}} \right] \tag{38}
\]

and

\[
J_S(u) = \sum_{n=1}^{N_{tS}} \left[ \sum_{i=1}^{N_p} \left( r_o^n q_{o,i}^n - r_w^n q_{w,i}^n \right) \frac{\Delta t^n}{(1+b)^{t^n/365}} - \sum_{i=1}^{N_{wi}} r_{wi}^n q_{wi,i}^n \frac{\Delta t^n}{(1+b)^{t^n/365}} \right], \tag{39}
\]

where N_tL is the number of time steps for long-term or life-cycle optimization and N_tS is the number of time steps for short-term optimization. When the conflicting criteria are to maximize the long-term NPV and maximize the short-term NPV at the same time, the multiobjective optimization problem is

\[
\max_{u} \left( J_L(u),\, J_S(u) \right) = \min_{u} \left( -J_L(u),\, -J_S(u) \right) \quad \text{s.t.} \quad u_i^{low} \le u_i \le u_i^{up}. \tag{40}
\]
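As a minimal sketch of how the discounted sums in Eqs. 38 and 39 might be evaluated from simulator output, the function below accumulates the NPV over the leading time steps. The function name `npv`, the list-of-lists data layout, the use of time-independent prices, and the choice of t^n as the time at the end of step n are all assumptions made for illustration.

```python
def npv(q_o, q_w, q_wi, dt, r_o, r_w, r_wi, b, n_steps=None):
    """Discounted NPV in the form of Eqs. 38 and 39.

    q_o[n][i], q_w[n][i]: average oil / water rate of producer i over step n (STB/day)
    q_wi[n][i]: average injection rate of injector i over step n (STB/day)
    dt[n]: length of step n (days)
    r_o, r_w, r_wi: oil revenue, water disposal cost, injection cost ($/STB),
                    assumed constant in time in this sketch
    b: annual discount rate
    n_steps: number of leading steps to include (N_tL or N_tS); default is all
    """
    if n_steps is None:
        n_steps = len(dt)
    total = 0.0
    t = 0.0
    for n in range(n_steps):
        t += dt[n]  # cumulative time, taken at the end of step n (assumption)
        discount = dt[n] / (1.0 + b) ** (t / 365.0)
        revenue = sum(r_o * qo - r_w * qw for qo, qw in zip(q_o[n], q_w[n]))
        injection_cost = sum(r_wi * q for q in q_wi[n])
        total += (revenue - injection_cost) * discount
    return total

# Hypothetical rates: one producer, one injector, two one-year steps
q_o = [[100.0], [80.0]]
q_w = [[0.0], [20.0]]
q_wi = [[120.0], [120.0]]
dt = [365.0, 365.0]

J_L = npv(q_o, q_w, q_wi, dt, r_o=50.0, r_w=5.56, r_wi=0.0, b=0.1)             # all steps
J_S = npv(q_o, q_w, q_wi, dt, r_o=50.0, r_w=5.56, r_wi=0.0, b=0.1, n_steps=1)  # first step only
```

Using one function with a step cutoff reflects that J_S and J_L share the same integrand and differ only in the upper summation limit.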

Note that we only consider bound constraints in Eq. 40.

5.2 Example 1, fluvial reservoir

In this example, we apply the MOSD algorithm to simultaneously maximize both the life-cycle and the short-term NPV for water flooding of a very simple 2D fluvial synthetic reservoir. The reservoir model is based on a 25 × 25 × 1 grid with Δx = Δy = 200 ft and Δz = 20 ft. The porosity is homogeneous throughout the reservoir with a value of 0.2. As shown in Fig. 4, there are 13 vertical wells consisting of 4 producers and 9 injectors arranged in a five-spot pattern. The initial pressure of the reservoir is 3800 psi; the connate water saturation is 0.2 and the residual oil saturation is 0.25. The oil price is $50/STB, the water injection costs are $0/STB, the water disposal costs are $5.56/STB, and the annual discount rate is 0.1. We set each injector under injection rate control with 0 ≤ q_{wi,i} ≤ 2000 STB/D and set each producer under BHP control with 1500 ≤ p_{wf,i} ≤ 4000 psi. We set the control time step size to 180 days. For long-term optimization, we maximize the NPV over 1800 days while for short-term optimization, we maximize the NPV over 360 days.

Fig. 4 Permeability field and well locations.

5.2.1 Base case

The base case refers to the case where only the long-term NPV is maximized without consideration of the short-term NPV. The short-term NPV is evaluated by simply using the optimal well controls obtained by optimizing the long-term NPV. For the base case, maximization of J_L(u) is accomplished using the trust-region quasi-Newton method developed by Chen et al. [1]. The resulting estimate of the maximum long-term NPV is $3.7085 × 10^8. At the end of 360 days, the associated short-term NPV based on the long-term optimal well control is $2.685 × 10^8. The optimal well controls obtained from long-term optimization are displayed in Fig. 5 where, in this and similar figures, the well index corresponds to the ordinate and the control step index corresponds to the abscissa. Whenever a well control is depicted in white, the well is shut in for that control step. The remaining oil saturation distributions in the reservoir after 360 days and after 1800 days of production are displayed in Fig. 6.

Fig. 5 Optimal well control for long-term optimization only

Fig. 6 Oil saturation after 360 and 1800 days obtained by optimizing only long-term NPV

5.2.2 Log transform and parameter initialization

In this and the next example, the multiobjective steepest descent (MOSD) method is applied to generate points on the Pareto front when the objectives are to maximize the long-term and the short-term NPV. Before applying the algorithm, the bound constraints in the problem specified by Eq. 40 are removed by applying the log-transformation used by Gao and Reynolds [5]. Thus, each design variable u_i is transformed to a new optimization variable s_i by using

\[
s_i = \ln\left( \frac{u_i - u_i^{low}}{u_i^{up} - u_i} \right), \quad \text{for } i = 1, 2, \ldots, N_{cL}. \tag{41}
\]

The unconstrained problem with design vector s is then solved with the MOSD algorithm presented above. After the optimization, the estimated design variables are then

transformed back to the u_i's using the inverse of the transformation of Eq. 41. In both examples, the initial step size, α^0, is set equal to 1 and the maximum number of step size cuts at each iteration is defined by maxcut = 6. Finally, we set α = 0.99. As the optimization is done in terms of the variable s, the iterate at the kth iteration of the MOSD algorithm is s^k instead of u^k. As there is a one-to-one correspondence between u and s, we use u and s interchangeably in the following discussions. In figures similar to Fig. 5, all controls are plotted in terms of the original control variables, i.e., the components of u.

5.2.3 Multiobjective steepest descent

To obtain the Pareto front using the MOSD method, we apply this method for different initial guesses. To generate initial guesses for u (hence, s), we first obtain u*_L and u*_S by performing long-term optimization only and short-term optimization only, respectively. Here, for short-term optimization, after performing optimization for the first two control steps (360 days in total), we perform another optimization for the rest of the control steps (the next 1440 days) to obtain u*_S. Then linear combinations of u*_L and u*_S with different weights are used as initial guesses for the MOSD method. We use w1 and w2 to denote the weights assigned to u*_L and u*_S, respectively, so that w1 = 1 in Table 1 corresponds to long-term optimization only. Here, both w1 and w2 are positive and they sum to 1. The results obtained by applying the MOSD method for different initial guesses are shown in Table 1 and Fig. 7. Note that, in Table 1, the results displayed for w1 = 1 and w1 = 0 are not obtained using the MOSD method; instead, they are obtained by applying long-term and short-term optimization separately. In Fig. 7, we use bolder circles to denote the non-dominated solutions. From the results of Fig. 7 and Table 1, it is clear that we have obtained eight non-dominated points out of all nine solutions; the solution for w1 = 0.1 is dominated by the point obtained by short-term optimization only. In fact, all nine points would be non-dominated if we did not include the point corresponding to the NPV obtained by short-term optimization only.

Table 1 Solutions obtained by the MOSD method for the fluvial case; w1 and w2 are used to denote different initial guesses

w1     w2     J_S ($)           J_L ($)
1      0      2.685 × 10^8      3.7085 × 10^8
0.9    0.1    2.7444 × 10^8     3.7076 × 10^8
0.8    0.2    2.8089 × 10^8     3.7066 × 10^8
0.7    0.3    2.8508 × 10^8     3.7047 × 10^8
0.6    0.4    2.8788 × 10^8     3.7038 × 10^8
0.5    0.5    2.9017 × 10^8     3.7014 × 10^8
0.4    0.6    2.9237 × 10^8     3.6955 × 10^8
0.3    0.7    2.9367 × 10^8     3.6902 × 10^8
0.2    0.8    2.9488 × 10^8     3.6803 × 10^8
0.1    0.9    2.9541 × 10^8     3.6714 × 10^8
0      1      2.9705 × 10^8     3.6726 × 10^8

Fig. 7 Solutions obtained by applying the MOSD method for the fluvial reservoir; bolder data points represent non-dominated solutions. [Axes: short-term NPV ($), long-term NPV ($)]

If we operate the fluvial reservoir using the well controls shown in Fig. 8, which are obtained by applying the MOSD method with w1 = 0.7, then, compared to the life-cycle and the short-term NPV values obtained by maximizing only the life-cycle production, the results of Table 1 indicate that we reduce the long-term NPV by only 0.38 million dollars (0.1 %), but we improve the short-term NPV by 16.5 million dollars (6.1 %). Thus, the controls corresponding to w1 = 0.7 might be a good choice. More importantly, the generation of the Pareto front of Fig. 7 and Table 1 provides the engineer the opportunity to choose how much long-term NPV to sacrifice in order to increase short-term NPV significantly.

Comparing the results of Figs. 8 and 5, we see that we inject slightly more water at injectors 1, 2, and 7 at the first two control steps (360 days in total) when the short-term NPV is considered, i.e., when the controls obtained by MOSD with w1 = 0.7 are applied. For this MOSD solution, we also lower the bottom-hole pressure for producers 1 and 4 at the first two control steps in order to obtain higher production rates. The oil saturation fields at 360 days and 1800 days that correspond to the well controls obtained by the MOSD method with w1 = 0.7 are shown in Fig. 9. Comparing the results of Fig. 9 with those of Fig. 6, we see that the well controls obtained by the MOSD method with w1 = 0.7 enable us to sweep more oil from the region around injectors 1, 2, and 7. At the same time, we produce more oil from producers 1 and 2 during the first 360 days for the MOSD solution because the BHP controls are lower than the corresponding optimal BHPs obtained by maximizing the long-term NPV only.

Among all nine solution points, six of the converged points reach values of cos θ less than −0.99, where θ is the angle between the gradients of the two objective functions at the converged points. For the other three solutions, the MOSD method was terminated because the maximum number of step size cuts allowed was exceeded. Final values of the cosines of the angles for these three solutions are around −0.96. The values of the cosine of the angle between the two gradients as a function of iteration for some selected weights are shown in Fig. 10.

At every iteration, we need to run the simulator forward to evaluate the life-cycle and the short-term NPVs. Also, we need to run the adjoint simulator backwards in time to obtain the life-cycle and the short-term adjoint gradients. The long-term simulation run counts as 1 equivalent simulation run. Calculating the long-term gradient takes about 3/10 the time of an equivalent simulation run, and calculating the short-term gradient takes about 1/5 the time of an equivalent simulation run. Therefore, the algorithm requires about 1.5 equivalent simulation runs every time the gradient is evaluated. Note that this does not mean we only require 1.5 equivalent simulation runs per iteration, since in every iteration we may need to cut the step size, and after a step size cut it is necessary to rerun the simulator although the gradient does not need to be recalculated; see Algorithm 1. For this example, it takes 826 equivalent simulation runs for the MOSD method to obtain the nine solution points.
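The role of cos θ as a convergence indicator can be illustrated with a minimal sketch. When the two gradients point in nearly opposite directions, no direction decreases both objectives, which is why cos θ < −0.99 is used as a stopping test. The direction computed below is the standard minimum-norm convex combination of two gradients known from the multiobjective descent literature; it is an assumption that this matches the analytical expression derived earlier in the paper, and the function names are hypothetical. The gradients `g1` and `g2` are those of the minimized functions (here −J_L and −J_S) and are assumed to be nonzero.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cos_angle(g1, g2):
    """Cosine of the angle between two nonzero gradient vectors."""
    return dot(g1, g2) / (math.sqrt(dot(g1, g1)) * math.sqrt(dot(g2, g2)))

def common_descent_direction(g1, g2):
    """Negative of the minimum-norm point of the segment between g1 and g2.

    lam minimizes |lam*g1 + (1-lam)*g2|^2 over [0, 1].
    """
    diff = [x - y for x, y in zip(g1, g2)]
    denom = dot(diff, diff)
    if denom == 0.0:
        lam = 0.5  # identical gradients: any convex combination is the same
    else:
        g2_minus_g1 = [y - x for x, y in zip(g1, g2)]
        lam = min(1.0, max(0.0, dot(g2, g2_minus_g1) / denom))
    return [-(lam * x + (1.0 - lam) * y) for x, y in zip(g1, g2)]

def is_converged(g1, g2, tol=0.99):
    """Stopping test of the form cos(theta) < -tol."""
    return cos_angle(g1, g2) < -tol

# Orthogonal gradients: a common descent direction exists
d = common_descent_direction([1.0, 0.0], [0.0, 1.0])
# Nearly opposite gradients: the stopping test is satisfied
stop = is_converged([1.0, 0.05], [-1.0, 0.05])
```

In the first case `d` has a negative inner product with both gradients, so a small step along it reduces both minimized objectives; in the second case the returned direction is close to zero.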

Fig. 8 Optimal well controls obtained by applying the MOSD with w1 = 0.7

Fig. 9 Oil saturation fields after 360 and 1800 days of production obtained by applying the MOSD method with w1 = 0.7

5.2.4 Comparison to other methods

In this subsection, the optimal solutions obtained by the MOSD method are compared with those from the adjusted weighted sum method and the NBI method. The application of the adjusted weighted sum method and the NBI method to this reservoir is discussed in [8]. The solutions obtained by all three methods are shown in Fig. 11. The constraint (Eq. 36) imposed in the NBI method is designed to ensure that the method gives optimal points that are distributed uniformly on the Pareto front. However, for the MOSD method, there is no built-in mechanism to force the final solutions to be uniformly distributed on the Pareto front. With MOSD, which Pareto optimal points are found is determined solely by the suite of initial guesses used. Thus, it is reasonable to expect that the NBI method may find a better Pareto front than does the MOSD method. Nevertheless, from Fig. 11, we see that the Pareto front obtained by the multiobjective steepest descent method is comparable to those obtained by the adjusted weighted sum and the NBI methods. In terms of computational cost, in this study, the MOSD method requires 826 equivalent simulation runs to obtain all solutions while the adjusted weighted sum method requires 1012 equivalent simulation runs and the NBI method requires 1392 equivalent simulation runs. Thus, the multiobjective steepest descent method is the most computationally efficient method to generate the Pareto front for this example.
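The non-dominated check used to mark points in Figs. 7 and 11 can be sketched as follows for two objectives that are both maximized. The function name `non_dominated` is hypothetical; the data are the (J_S, J_L) pairs of Table 1 in units of 10^8 dollars.

```python
def non_dominated(points):
    """Indices of points not dominated by any other point (both coordinates maximized)."""
    keep = []
    for i, (a1, a2) in enumerate(points):
        dominated = any(
            (b1 >= a1 and b2 >= a2) and (b1 > a1 or b2 > a2)
            for j, (b1, b2) in enumerate(points)
            if j != i
        )
        if not dominated:
            keep.append(i)
    return keep

# Table 1, (J_S, J_L) in units of 1e8 dollars, ordered w1 = 1, 0.9, ..., 0
w1 = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
table1 = [
    (2.685, 3.7085), (2.7444, 3.7076), (2.8089, 3.7066), (2.8508, 3.7047),
    (2.8788, 3.7038), (2.9017, 3.7014), (2.9237, 3.6955), (2.9367, 3.6902),
    (2.9488, 3.6803), (2.9541, 3.6714), (2.9705, 3.6726),
]

kept = non_dominated(table1)
dominated_w1 = [w1[i] for i in range(len(table1)) if i not in kept]
# dominated_w1 -> [0.1]
```

On the Table 1 values, only the w1 = 0.1 solution is dominated, and it is dominated by the short-term-only point (w1 = 0), which has both a higher J_S and a higher J_L. The pairwise check is quadratic in the number of points, which is negligible for the handful of solutions considered here.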

Fig. 10 Cosine of the angle as a function of iteration number; here (0.3, 0.7) denotes applying the MOSD method with w1 = 0.3 [Axes: iterations, cos(α)]

Fig. 11 Comparison of solutions obtained by the MOSD method with solutions obtained by the adjusted weighted sum and NBI methods; thicker data points denote non-dominated solutions when all solutions obtained from all three methods are considered [Axes: short-term NPV ($), long-term NPV ($); legend: mosd, ws, nbi]

5.3 Example 2, PUNQ reservoir

The geostatistical parameters of the PUNQ reservoir used here are the ones that were determined in [6]. We consider a single reservoir model for performing life-cycle and short-term optimization. The horizontal and vertical log permeability fields are shown in Figs. 12 and 13, respectively. The porosity field has features similar to the horizontal (or vertical) log permeability fields. For the PUNQ reservoir considered in this example, all producers are completed only in the top layer while all injectors are completed only in the bottom layer. The positions of all wells are shown in Figs. 12 and 13. We use a 20 × 30 × 5 rectangular grid system to represent this reservoir. The initial reservoir pressure of the top layer is 3402 psi while the average reservoir pressure, which coincides with the initial pressure of the third layer, is 3413 psi. All wells are operated under bottom-hole pressure control. For producers, the lower and upper bounds on the bottom-hole pressure controls are 1000 and 3300 psi, respectively, while for injectors the lower and upper bounds on the bottom-hole pressure controls are 3430 and 6000 psi, respectively. The oil revenue is $80/STB and the disposal cost of water is $8.9/STB. We neglect the water injection cost and the discount rate is zero. We set the life cycle of production for this example equal to 1900 days. We divide this time span into 10 control steps with each control step lasting for 190 days. For short-term optimization, we consider maximizing the NPV over the first two control steps (380 days).

Fig. 12 Log horizontal permeability distribution

5.3.1 Base case

We first perform long-term optimization using the trust-region quasi-Newton method implemented by Chen et al. [1]. The long-term NPV obtained is $1.0415 × 10^9. At the end of 380 days, the well controls obtained by maximizing only long-term production predict a short-term NPV of $3.4292 × 10^8. The optimal well controls obtained from long-term optimization only are displayed in Fig. 14. Average injection rates and average oil production rates corresponding to the optimal long-term well control are shown in Fig. 15. The oil saturation fields at the end of 380, 760, 1330, and 1900 days, respectively, are shown in Figs. 16, 17, 18, and 19.


Fig. 13 Log vertical permeability distribution

Note that the results of Figs. 14 and 15 indicate that for the first four control steps (950 days), only injection wells 2 and 3 inject a significant amount of water, and wells 5 and 1 produce the most oil. This corresponds to using injected water to displace oil in the north to north-east direction, which can be deduced from the results of Figs. 16, 17, 18, and 19. As time increases, the injection rates of water at injectors 4 and 6 and the production rate at producer 1 tend to be higher, and more oil is displaced from the northeast part of the reservoir towards well 1.

Fig. 14 Optimal well controls obtained by only maximizing the long-term NPV; PUNQ example

5.3.2 Multiobjective steepest descent method

As in Example 1, to obtain different initial guesses for the MOSD method, long-term and short-term optimizations are performed individually to obtain u*_L and u*_S, respectively.

Fig. 15 Average well rates obtained by maximizing only the long-term NPV; PUNQ example

Then, different linear combinations of these two vectors serve as initial guesses for the MOSD method. Again, w1 denotes the weight assigned to the vector u*_L obtained by maximizing only long-term NPV. By varying w1 from 0.1 to 0.9 in increments of 0.1, the results shown in Table 2 and Fig. 20 are obtained. As shown in Fig. 20, all solutions are non-dominated points.

It is important to point out that none of the nine solutions obtained are such that the cosine of the angle (cos(θ)) between the two gradients is less than −0.99. Two of the nine cos(θ) values are less than −0.98 whereas all others are around −0.96. We conjecture that cos(θ) < −0.99 is not attained because of the truncation error that occurs during reservoir simulation and solution of the adjoint system. When using different values for the maximum allowable time step size (from 1 day to 15 days) in the simulation run, cos(θ) was evaluated for the final well controls obtained with w1 = 0.5. The value of cos(θ) varies from −0.89 to −0.96. Therefore, the truncation error seems to play a vital role in determining the value of cos(θ). Volkov and Voskov [13] also observed that the truncation error associated with selecting different time steps during simulation will affect the accuracy of the estimates of both the NPV function and the adjoint gradient.

Table 2 Solutions obtained by the MOSD method for the PUNQ case; w1 and w2 are used to denote different initial guesses, w1 multiplies u*_L

w1     w2     J_S ($)           J_L ($)
1      0      3.4292 × 10^8     1.0415 × 10^9
0.9    0.1    4.4725 × 10^8     1.0406 × 10^9
0.8    0.2    5.3795 × 10^8     1.0395 × 10^9
0.7    0.3    5.8919 × 10^8     1.0346 × 10^9
0.6    0.4    6.4583 × 10^8     1.0338 × 10^9
0.5    0.5    6.856 × 10^8      1.0325 × 10^9
0.4    0.6    7.1918 × 10^8     1.0287 × 10^9
0.3    0.7    7.4715 × 10^8     1.0256 × 10^9
0.2    0.8    7.6621 × 10^8     1.0241 × 10^9
0.1    0.9    7.8594 × 10^8     1.0184 × 10^9
0      1      7.9569 × 10^8     1.008 × 10^9

Fig. 16 Saturation field at the end of 380 days, obtained by optimizing the long-term NPV

Fig. 17 Saturation field at the end of 760 days, obtained by optimizing the long-term NPV

Fig. 18 Saturation field at the end of 1330 days, obtained by optimizing the long-term NPV

Fig. 19 Saturation field at the end of production, obtained by optimizing the long-term NPV

Fig. 20 Solutions obtained by applying the MOSD method for the PUNQ reservoir; all solutions are non-dominated [Axes: short-term NPV ($), long-term NPV ($)]

We investigate the practical value of the solutions obtained here. When operating the PUNQ reservoir with the well controls shown in Fig. 21, which are obtained by applying the MOSD method with w1 = 0.3, we are able to significantly improve the short-term value of NPV over the short-term value of NPV obtained with the life-cycle controls (Fig. 14) without radically lowering the value of long-term NPV. Specifically, we are able to improve the short-term NPV by 405 million dollars (118 %) while the long-term NPV decreases only by 15.9 million dollars (1.5 %). Comparing well controls shown in Figs. 14 and 21 and their corresponding well rates shown in Figs. 15 and 22, we see that, to improve the short-term NPV, we inject significantly more water at injector 2 at the first two control steps. We also inject somewhat more water at injector 6. In addition, almost all producers are operated at low BHP during the first two control steps. As a result, producers 1, 4, and 5 produce significantly more oil from the reservoir during the first two control steps. The oil saturation field at the end of 380 days, which corresponds to the well controls obtained by the MOSD method with w1 = 0.3, is shown in Fig. 23. Comparing Fig. 23 with Fig. 16, it is clear that much more oil has been produced from the central and northern region of the reservoir when applying the well controls shown in Fig. 21, which are obtained from the MOSD algorithm. Therefore, when the short-term NPV is considered along with the life-cycle NPV, instead of sweeping only the southern part of the reservoir during the first 380 days, we displace oil from most parts of the reservoir during the first two control steps.

Fig. 21 Optimal well controls obtained by applying the MOSD method with w1 = 0.3; PUNQ example

Fig. 22 Average well rates obtained by applying the MOSD method with w1 = 0.3; PUNQ example

Fig. 23 Saturation field at the end of 380 days obtained by the MOSD method with w1 = 0.3

Fig. 24 Optimal well controls obtained by various methods

To conclude this subsection, we note that the MOSD algorithm can be applied to biobjective optimization under geological uncertainty where the two objectives are to maximize the expectation of the life-cycle NPV and to maximize the expectation of the short-term NPV.

5.3.3 Comparison to other methods

Again, we compare the solutions obtained by the multiobjective steepest descent method with the solutions obtained by the adjusted weighted sum method and the NBI method [8]. Solutions obtained by all three methods are shown in Fig. 25. From Fig. 25, we see that the Pareto front obtained by the multiobjective steepest descent method is comparable to the Pareto front obtained by the adjusted weighted sum and the NBI methods. In Fig. 24, we compare well controls at the injection wells obtained by the three methods. Note that, although the three methods do not give identical controls, the well controls obtained with the three algorithms exhibit similar qualitative features. Table 3 shows the computational resources required by all three algorithms. From Table 3, we again see that the multiobjective steepest descent method is the most computationally efficient method for jointly maximizing the life-cycle and the short-term NPV.

Fig. 25 Comparison of solutions obtained by the MOSD method with solutions obtained by the adjusted weighted sum and NBI methods; thicker data points denote non-dominated solutions when all solutions obtained from all three methods are considered [Axes: short-term NPV ($), long-term NPV ($); legend: mosd, ws, nbi]

Table 3 Total equivalent simulation runs for the MOSD, the adjusted weighted sum (WS), and the NBI methods; PUNQ reservoir

Method    Total equivalent simulation runs
MOSD      502
WS        952
NBI       825
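To put the run counts in perspective, the relative savings of MOSD can be worked out directly from the totals quoted in Section 5.2.4 (826, 1012, and 1392 runs) and in Table 3 (502, 952, and 825 runs). The short calculation below is only arithmetic on those reported numbers and says nothing about other problems.

```latex
% Example 1 (fluvial reservoir): savings relative to WS and NBI
\[
1 - \frac{826}{1012} \approx 0.18, \qquad 1 - \frac{826}{1392} \approx 0.41
\]
% Example 2 (PUNQ reservoir): savings relative to WS and NBI
\[
1 - \frac{502}{952} \approx 0.47, \qquad 1 - \frac{502}{825} \approx 0.39
\]
```

In both examples MOSD needs roughly 20 to 50 % fewer equivalent simulation runs than the other two methods, although which of WS and NBI is the more expensive differs between the two examples.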

6 Conclusions

We provided an original derivation to obtain the analytical solution of the optimal search direction for the multiobjective steepest descent (MOSD) method for the case where there are only two objective functions and the only constraints are bounds on the controls. By combining this analytical solution with a line search procedure, we obtained a two-objective steepest descent algorithm and applied it to two synthetic reservoirs to jointly maximize the long-term and the short-term NPV. We were unable, however, to extend the derivation of the descent direction to the case where one wishes to minimize an arbitrary number of objective functions. The computational results obtained with the MOSD method were compared with those obtained from the adjusted weighted sum (WS) and the normal boundary intersection (NBI) methods. On the basis of the theoretical and computational results, the following conclusions are warranted:

1. The Pareto front obtained by the multiobjective steepest descent method is very similar to those obtained with the WS and NBI methods.
2. The MOSD algorithm is more computationally efficient than the WS and NBI methods.
3. Generation of the Pareto front provides the engineer a method for determining a trade-off between maximizing the life-cycle and the short-term NPV. By accepting on the order of a 1 % decrease in the life-cycle NPV, one may gain a large increase in the short-term NPV.

References

1. Chen, C., Li, G., Reynolds, A.C.: Robust constrained optimization of short- and long-term net present value for closed-loop reservoir management. SPE J. 17, 849–864 (2012)
2. Das, I., Dennis, J.E.: Normal-boundary intersection: a new method for generating the Pareto surface in nonlinear multicriteria optimization problems. SIAM J. Optimization 8, 631–657 (1998)
3. Désidéri, J.-A.: Multiple-gradient descent algorithm. Research report, Project-Team Opale (2009)
4. Fliege, J., Svaiter, B.F.: Steepest descent methods for multicriteria optimization. Math. Meth. Oper. Res. 51, 479–494 (2000)
5. Gao, G., Reynolds, A.C.: An improved implementation of the LBFGS algorithm for automatic history matching. SPE J. 11(1), 5–17 (2006)
6. Gao, G., Zafari, M., Reynolds, A.C.: Quantifying uncertainty for the PUNQ-S3 problem in a Bayesian setting with RML and EnKF. SPE J. 11(4), 506–515 (2006)
7. Gass, S., Saaty, T.: The computational algorithm for the parametric objective function. Naval Research Logistics Quarterly 2, 39–45 (1955)
8. Liu, X., Reynolds, A.C.: Gradient-based multi-objective optimization with applications to waterflooding optimization. Computational Geosciences (2015)
9. Nocedal, J., Wright, S.J.: Numerical Optimization. Springer, New York (2006)
10. Qu, S., Goh, M., Chan, F.T.S.: Quasi-Newton methods for solving multiobjective optimization. Oper. Res. Lett. 39, 397–399 (2011)
11. van Essen, G., den Hof, P.V., Jansen, J.: Hierarchical long-term and short-term production optimization. SPE J. 16(1), 191–199 (2011)
12. van Essen, G., Zandvliet, M., den Hof, P.V., Bosgra, O., Jansen, J.: Robust waterflooding optimization of multiple geological scenarios. SPE J. 14(1), 202–210 (2009)
13. Volkov, O., Voskov, D.: Advanced strategies of forward simulation for adjoint-based optimization. In: SPE Reservoir Simulation Symposium (2013)
14. Waltz, F.: An engineering approach: hierarchical optimization criteria. IEEE Trans. Autom. Control 12, 179–180 (1967)
15. Wang, S.: Second-order necessary and sufficient conditions in multi-objective programming. Numer. Functional Anal. Optimization 12, 237–252 (1991)
16. Zadeh, L.: Optimality and non-scalar-valued performance criteria. IEEE Trans. Autom. Control 8, 59–60 (1963)
17. Zitzler, E., Deb, K., Thiele, L.: Comparison of multi-objective evolutionary algorithms. Evol. Comput. 8, 173–195 (2000)
