Series "Mathematics"
Vol. 2 (2009), No. 1, pp. 118-131
Online access to the journal: http://isu.ru/izvestia
IZVESTIYA of Irkutsk State University
UDC 517.977
Optimal control in terms of smooth and bounded functions
O. V. Vasilieva
University of Valle, Cali, Colombia
Abstract. In this paper, the author examines the properties of interior variations and shows how to use them to formulate a necessary condition of optimality (in the form of a maximum principle) for optimal control problems in the class of smooth and bounded functions with fixed end-points.
Keywords: optimal control, smooth and bounded controls, interior variations, maximum principle.
1. Introduction
This paper studies a particular type of optimal control problem governed by an initial-value ODE system without phase constraints, where all admissible controls are smooth real functions that are bounded in amplitude and have fixed end-points. Our approach is based on internal perturbations of the reference admissible control function and provides a verifiable necessary condition for optimality in the form of the maximum principle. The latter then serves as a basis for developing an optimization technique, represented by an iterative algorithm, that makes it possible to find an optimal solution among smooth and bounded functions. By optimal solutions we understand here the Pontryagin extremals, that is, admissible control functions satisfying the necessary condition of optimality. It should be noted that the majority of traditional methods cannot cope with this problem since they are designed to deal only with piecewise continuous bounded functions; however, feasible solutions may exist in the class of smooth functions. The approach is illustrated by two examples.
2. Problem formulation and preliminaries
Our goal is to minimize the objective functional
$$J(u) = \varphi(x(t_1)) + \int_{t_0}^{t_1} F(x,u,t)\,dt \to \min \qquad (2.1)$$
subject to the initial-value system of n nonlinear ordinary differential equations
$$\dot{x} = f(x,u,t), \quad x(t_0) = x_0, \quad x_0 \in \mathbb{R}^n, \quad t \in T. \qquad (2.2)$$
Here $T = [t_0, t_1]$ is a specified domain of the independent variable $t$, which characterizes the fixed duration of the dynamic process; $x(t) \in \mathbb{R}^n$ is referred to as the state variable vector or as the trajectory of the system (2.2) and defines the state of the dynamic process at any time $t$. A smooth real function $u(\cdot) \in C^1(T)$ is called the control variable and determines the course of the action performed over the dynamic process; the domain of $u(t)$ is $T = [t_0, t_1]$ and its range $U \subset \mathbb{R}$ is referred to as the set of admissible controls.
In this paper we consider a particular structure of the set of admissible controls U:
$$\alpha \le u(t) \le \beta, \quad t \in T, \qquad (2.3)$$
where $\alpha$ and $\beta$ are specified real numbers. All admissible control functions $u(t)$ must also have their end-points fixed:
$$u(t_0) = u_0, \quad u(t_1) = u_1, \qquad (2.4)$$
where $u_0, u_1 \in U$ are real numbers such that $\alpha \le u_0 \le \beta$ and $\alpha \le u_1 \le \beta$.
A feasible trajectory $x(t) \in \mathbb{R}^n$ over the time interval $T = [t_0, t_1]$ is a solution profile of the ODE system (2.2) calculated for an admissible control $u(t)$ satisfying (2.3)-(2.4). In other words, for each admissible control, the solution to the initial-value system (2.2) is uniquely determined and so is the feasible trajectory.
The problem to which this paper is devoted is that of choosing a smooth and bounded control function u(t) with end-point constraints out of the set of admissible controls U, which, together with its corresponding feasible trajectory x(t), minimizes the objective functional (2.1).
A pair $\{u, x\}$, where $u(t) \in U \subset \mathbb{R}$ and $x = x(t,u) \in \mathbb{R}^n$ stands for its feasible trajectory, is usually called an admissible process. The control function $u^*(t)$ on which the objective functional (2.1) attains its minimum is called the optimal control, and the process $\{u^*, x^*\}$ is referred to as the optimal process.
For the data of the problem (2.1)-(2.4) formulated above, we assume the following:
1. The vector function $f = (f_1, \ldots, f_n)$ is continuous in $(x,u,t)$ together with its partial derivatives with respect to $x$ and satisfies the Lipschitz condition in $x$ with the same constant $L$ for all $u(t) \in U$, $t \in T$:
$$\|f(x + \Delta x, u, t) - f(x,u,t)\| \le L \|\Delta x\|. \qquad (2.5)$$
2. The scalar functions $\varphi$ and $F$ are continuous in their arguments together with their partial derivatives with respect to $x$.
3. The vector function $f(x,u,t)$ and the scalar function $F(x,u,t)$ are differentiable with respect to $u$.
The above assumptions are essential for justifying the maximum principle, which will be formulated later as a necessary condition of optimality for the problem (2.1)-(2.4). It is well known that the maximum principle gives rise to numerical procedures that decrease the objective functional at each iteration. This decrease is achieved by suitably perturbing an initial control function with other functions called variations.
It should be emphasized that the solution to the problem (2.1)-(2.4) is sought among smooth and bounded real functions; therefore our approach must differ from the traditional one described, e.g., in the book [1]. The difference consists principally in the use of an unconventional type of functional variation called interior variation. A detailed overview of different types of functional variations is presented in [2]. Therefore, we only outline here the core features of interior variations.
Traditionally, a variation $\Delta u(t)$ of a reference function $u(t)$ is understood as a small perturbation of this function that creates a new function $\tilde{u}(t) = u(t) + \Delta u(t)$. Thus, $\Delta u(t)$ must itself be a function. This is the so-called “outer variation”, since the time scale is unchanged and the perturbation value at each $t$ is added to the value of $u(t)$ at the same $t$.
On the other hand, one may define “inner variations”, which correspond to a formal representation of the perturbed function $\tilde{u}(t) = (u \circ Y)(t) = u(Y(t))$, where $u$ is the same reference function defined on $T = [t_0, t_1]$ and $Y : T \to T$ must be some particular smooth real function.
As far back as the 1850s, M. V. Ostrogradski suggested supplementing the classic Lagrange variation with a smooth simultaneous perturbation of the independent variable. The earliest reference to the idea of such “simultaneous varying” can be found in [3], where the previously mentioned function $Y$ was defined by
$$Y(t) = t + \varepsilon \cdot \delta(t). \qquad (2.6)$$
Here the second addend represents a kind of lag in the argument $t$ that looks rather similar to the classic variation of Lagrange. The addend $\varepsilon \cdot \delta(t)$ later received the name of interior variation. This term implies that the variation is incorporated into the reference function, that is, it is added to the function’s independent argument, thereby performing an “internal perturbation”.
It should be noted that, using the arbitrariness of the function $\delta$ together with the value of $\varepsilon$, one may find a way to represent a neighboring family $u_\varepsilon(t) = u(t + \varepsilon \cdot \delta(t))$ of some bounded reference function $u(t) \in U$, $t \in T$, so that the same constraints are valid for all members of such a family, i.e., $u_\varepsilon(t) \in U$, $t \in T$. Moreover, if this reference function $u(t)$ is smooth within $T$, then by choosing $\delta(\cdot) \in C^1(T)$ we can guarantee the smoothness of all $u_\varepsilon(t)$. Consequently, interior variations are potentially capable of preserving both the smoothness and the boundedness of all perturbed functions $u_\varepsilon(t)$. This is shown in Figure 1, where $u_\varepsilon(t) \in U$ while $u_\varepsilon(\cdot) \in C^1(T)$.
Fig. 1. “Inner” perturbations $u_\varepsilon(t)$ of an admissible smooth and bounded function $u(t)$.
The purpose of this paper is to design an optimization algorithm for solving the problem (2.1)-(2.4) in which the sequence of admissible controls is constructed using internal perturbations of a reference control function by means of interior variations. Therefore, it is essential to control the time lag in (2.6) by choosing $\delta(\cdot) \in C^1(T)$ and the parameter $\varepsilon$ adequately, in order to guarantee the fulfillment of the control constraints (2.3)-(2.4).
In effect, it was proved in [2] that $u_\varepsilon(t) = u(t + \varepsilon \cdot \delta(t))$ will have the same range as $u : T \to U \subset \mathbb{R}$ for all $\varepsilon \in [0,1]$ and for any smooth real function $\delta(t)$ satisfying the condition
$$\delta(t_0) = \delta(t_1) = 0, \quad t_0 - t \le \delta(t) \le t_1 - t, \quad t \in T. \qquad (2.7)$$
In particular, conditions (2.7) are fulfilled for
$$\delta(t) = \frac{(t - t_0)(t_1 - t)}{t_1 - t_0} \cdot \frac{g(t)}{\max_{t \in T} |g(t)|}, \qquad (2.8)$$
where $g(t)$, $t \in T$, is an arbitrary smooth real function.
The total arbitrariness of $g(t)$ in formula (2.8) gives us a certain freedom to choose the parameters of the internal perturbation (2.6) (that is, $\delta(t)$ and $\varepsilon \in [0,1]$) in order to achieve a significant decrease of the value of $J(u_\varepsilon)$ in comparison with $J(u)$. How to do this in practice is discussed in Section 4.
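As an illustration of (2.7)-(2.8), the following minimal Python sketch builds the interior variation from a given $g(t)$ and forms the perturbed control $u(t + \varepsilon \cdot \delta(t))$. The names `make_delta` and `perturb`, the grid-based estimate of the maximum of $|g|$, and the sample functions are assumptions made for this illustration only.

```python
import math

def make_delta(g, t0, t1, n=1000):
    """Build delta(t) from g(t) following formula (2.8).

    max|g| is only estimated on a uniform grid of n + 1 points (an assumption
    of this sketch), so (2.7) is guaranteed up to that approximation.
    """
    grid = [t0 + (t1 - t0) * i / n for i in range(n + 1)]
    g_max = max(abs(g(t)) for t in grid)
    if g_max == 0.0:
        raise ValueError("g vanishes on the whole grid, so (2.8) is undefined")

    def delta(t):
        return (t - t0) * (t1 - t) / (t1 - t0) * g(t) / g_max

    return delta

def perturb(u, delta, eps):
    """Return the internally perturbed control u_eps(t) = u(t + eps * delta(t))."""
    return lambda t: u(t + eps * delta(t))

# Sample data (hypothetical): T = [0, 1], g(t) = cos(3t), u(t) = 1 - t.
delta = make_delta(lambda t: math.cos(3.0 * t), 0.0, 1.0)
u_eps = perturb(lambda t: 1.0 - t, delta, eps=1.0)
```

With these sample choices the shifted argument stays inside $[0,1]$ for every value of the parameter in $[0,1]$, so the perturbed control keeps both the end-point values and the bounds of the reference control.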
3. Optimality conditions via interior variations
According to Vasiliev [1, p. 127], for two admissible processes, the basic one $\{u, x = x(t,u)\}$ and the varied one $\{\tilde{u} = u + \Delta u, \tilde{x} = x + \Delta x = x(t, \tilde{u})\}$, the formula for the increment of the objective functional (2.1) can be written as
$$\Delta J(u) = J(\tilde{u}) - J(u) = -\int_{t_0}^{t_1} \Delta_{\tilde{u}} H(\psi, x, u, t)\,dt + \eta_{\tilde{u}}, \qquad (3.1)$$
separating the dominant term (in integral form) from the remainder term $\eta_{\tilde{u}}$. Here $H(\psi,x,u,t) = \langle \psi(t), f(x,u,t) \rangle - F(x,u,t)$ is the maximal Hamiltonian function, $\langle \cdot, \cdot \rangle$ stands for the inner product in the finite-dimensional Euclidean space $\mathbb{R}^n$, and $\Delta_{\tilde{u}} H(\psi,x,u,t)$ denotes the partial increment of $H$ with respect to $u$, that is, $\Delta_{\tilde{u}} H(\psi,x,u,t) = H(\psi,x,\tilde{u},t) - H(\psi,x,u,t)$, while $\psi(t) \in \mathbb{R}^n$ is the so-called “conjugate” (also referred to as “adjoint”) vector function that defines the “co-state” of the dynamic process and satisfies the terminal-value linear ODE system given by
$$\dot{\psi} = -\frac{\partial H(\psi,x,u,t)}{\partial x}, \quad \psi(t_1) = -\frac{\partial \varphi(x(t_1))}{\partial x}. \qquad (3.2)$$
The same book [1, p. 129] provides the following estimate for the state increment $\Delta x(t)$ caused by the control perturbation $\Delta u(t)$:
$$\|\Delta x(t)\| \le K_1 \int_{t_0}^{t_1} \|\Delta_{\tilde{u}} f(x,u,t)\|\,dt, \quad K_1 = \exp[L(t_1 - t_0)] = \mathrm{const} > 0, \qquad (3.3)$$
where $\|\cdot\|$ stands for the vector norm in the finite-dimensional Euclidean space $\mathbb{R}^n$ and $\Delta_{\tilde{u}} f(x,u,t) = f(x,\tilde{u},t) - f(x,u,t)$ denotes the partial increment of $f$ with respect to $u$. The estimate (3.3) was obtained using the Lipschitz condition (2.5) together with the Gronwall-Bellman lemma. In [1], Vasiliev considered control perturbations in the form of elementary needles (see more details in [4]) and applied the estimate (3.3) in order to justify that $\eta_{\tilde{u}} \sim o(\varepsilon)$, using the fact that all perturbed functions are bounded, that is, $\tilde{u}(t) \in U$, $t \in T$. The same argument remains valid if we consider internal perturbations of the form (2.6), since for adequately chosen $\delta(t)$ and $\varepsilon \in [0,1]$ we can guarantee that $\tilde{u}(t) = u_\varepsilon(t) \in U$ for all $t \in T$. Therefore, in our case it is also appropriate to write $\eta_{\tilde{u}} \sim o(\varepsilon)$.
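Taking into account the third condition formulated in Section 2,
$$\Delta_{\tilde{u}} H(\psi,x,u,t) = \frac{\partial H(\psi,x,u,t)}{\partial u}\,\Delta u(t) + o(|\Delta u|), \qquad (3.4)$$
where $\Delta u(t) = u_\varepsilon(t) - u(t) = u(t + \varepsilon \cdot \delta(t)) - u(t) = \varepsilon \cdot \dot{u}(t)\,\delta(t) + o(\varepsilon)$, we can follow the deductions of Vasiliev [1, pp. 129-132] and arrive at
$$J(u_\varepsilon) - J(u) = -\varepsilon \int_{t_0}^{t_1} W(u,t)\,\delta(t)\,dt + o(\varepsilon), \quad \varepsilon \in [0,1], \qquad (3.5)$$
where
$$W(u,t) = \frac{\partial H(\psi,x,u,t)}{\partial u}\,\dot{u}(t) \qquad (3.6)$$
is referred to as the deviation function. Formula (3.5) is valid for all $\varepsilon \in [0,1]$ and for all $\delta(t)$ satisfying either (2.7) or (2.8).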
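The first-order expansion of the control increment used in this derivation can be checked numerically. The sketch below is only an illustration under assumed sample data (a sine reference control on $[0,1]$ and a quadratic bump as interior variation); the helper name `expansion_gap` is hypothetical.

```python
import math

def expansion_gap(u, du, delta, eps, t):
    """Size of the remainder |u(t + eps*delta(t)) - u(t) - eps*du(t)*delta(t)|."""
    exact = u(t + eps * delta(t)) - u(t)
    linear = eps * du(t) * delta(t)
    return abs(exact - linear)

# Sample data (hypothetical): u(t) = sin t on T = [0, 1], delta(t) = t(1 - t).
delta = lambda t: t * (1.0 - t)
gaps = [expansion_gap(math.sin, math.cos, delta, eps, 0.5)
        for eps in (0.1, 0.05, 0.025)]
ratios = [gaps[i] / gaps[i + 1] for i in range(2)]
```

Halving the parameter divides the remainder by roughly four for this smooth sample control, which is the behaviour expected of a second-order remainder and is consistent with the $o(\varepsilon)$ term in the expansion.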
Theorem 1. (Necessary condition of optimality) Suppose that $u^*(t)$ is optimal in the problem (2.1)-(2.4) and that $x^*(t)$, $\psi^*(t)$ are the corresponding solutions of the direct and adjoint systems (2.2) and (3.2), respectively. Then
$$W(u^*, t) = 0, \quad t \in T. \qquad (3.7)$$
Proof. The statement of Theorem 1 follows from the increment formula (3.5) considered on the optimal control, that is,
$$J(u^*_\varepsilon) - J(u^*) = -\varepsilon \int_{t_0}^{t_1} W(u^*,t)\,\delta(t)\,dt + o(\varepsilon) \ge 0$$
for any admissible $\delta(t) \not\equiv 0$, which may take either sign within $T$. For variations of the form (2.8), $-\delta(t)$ is admissible together with $\delta(t)$, so the integral must vanish for every such $\delta(t)$, which yields (3.7). □
Remark 1. Since the necessary condition of optimality (3.7) holds trivially if $\dot{u}^*(t) \equiv 0$, we should not deal with constant control functions.
4. Optimization technique based on interior variations
Traditional optimization techniques for solving optimal control problems with a bounded control domain rely on successive approximations that diminish the value of the objective functional. That is, starting from some admissible control $u^k(t)$ and the corresponding solutions $x^k(t)$, $\psi^k(t)$ of the direct (2.2) and adjoint (3.2) systems, one must define a nominal admissible control $\bar{u}^k(t)$ that satisfies the necessary condition of optimality (e.g., the maximum principle or the linearized maximum principle) for the same $x^k(t)$, $\psi^k(t)$ and for almost all $t \in T$:
$$\bar{u}^k(t) = \arg\max_{v \in U} H(\psi^k, x^k, v, t). \qquad (4.1)$$
Due to the presence of control constraints, $u^k(t)$ is usually perturbed with a needle-shaped variation $\Delta u^k(t)$ that inevitably involves the nominal control $\bar{u}^k(t)$, in order to guarantee the non-negativity of the dominant term in the increment formula (3.1) and, consequently, to diminish the value of $J$. Thus, the majority of such techniques rest on the compulsory assumption of explicit solvability of the condition (4.1).
This assumption is rather strong and may therefore present quite an obstacle for numerical calculations. On the other hand, if a smooth admissible control $u^k(t)$ is perturbed by an interior variation, the resulting control function will be smooth and admissible in the sense of the control constraints. Moreover, the non-negativity of the dominant term in the increment formula (3.5) can be achieved by choosing $\delta(t)$ according to (2.8) with $g(t) = W(u^k, t)$, since in that case $\delta(t)$ and $W(u^k, t)$ carry the same sign. Thus, under this approach there is no need to search for the nominal admissible control (4.1), and the assumption of explicit solvability of the maximum condition for the Hamiltonian function with respect to the control variable becomes superfluous.
For computational purposes it is helpful to rewrite the increment formula (3.5) as
$$J(u_\varepsilon) - J(u) = -\varepsilon \cdot \mu(u) + o(\varepsilon), \quad \varepsilon \in [0,1], \qquad (4.2)$$
where
$$\mu(u) = \int_{t_0}^{t_1} W(u,t)\,\delta(t)\,dt \ge 0, \qquad (4.3)$$
and to formulate the necessary condition for optimality (3.7) in a simplified form: for the optimal control $u^*(t)$ it holds that $\mu(u^*) = 0$. We now briefly describe the optimization method based on interior variations.
STEP 1 For some admissible control $u^k$ that contains no constant section (see Remark 1), integrate both the direct (2.2) and adjoint (3.2) systems and store their profiles $x^k = x(t, u^k)$, $\psi^k = \psi(t, u^k, x^k)$.
STEP 2 Compose the deviation function $W(u^k, t)$ and calculate the value of $\mu(u^k) \ge 0$ according to (4.3).
STEP 3 Check whether $u^k(t)$ satisfies the necessary condition of optimality within the limits of the given precision: IF $\mu(u^k) = 0$ THEN STOP; IF $\mu(u^k) > 0$ THEN CONTINUE.
STEP 4 (internal varying) Define $\delta^k(t)$ according to (2.8) with $g(t) = W(u^k, t)$ and obtain a parametric family of admissible controls $u^k_\varepsilon(t) = u^k(t + \varepsilon \cdot \delta^k(t))$, $t \in T$, $\varepsilon \in [0,1]$, according to (2.6).
STEP 5 Choose the optimal value of the variational parameter $\varepsilon$ according to
$$\varepsilon_k = \arg\min_{\varepsilon \in [0,1]} J(u^k_\varepsilon). \qquad (4.4)$$
STEP 6 Determine the successive approximation as $u^{k+1}(t) = u^k_{\varepsilon_k}(t)$, set $k := k + 1$ and GO TO STEP 1.
The whole process must be repeated until the condition $\mu(u^k) = 0$ is satisfied within the limits of the required precision. It should be noted that the described algorithm yields a Pontryagin extremal $u^*(t)$, that is, an admissible control function satisfying the necessary condition for optimality $\mu(u^*) = 0$.
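STEPS 1-6 can be sketched in Python as follows. This is a minimal illustration rather than the implementation used for the examples of Section 5: the dynamics and cost of Example 2 with $p = 0.5$ are hard-coded, the ODEs are integrated by Heun's method on a uniform grid, the deviation function is interpolated linearly between grid points (so the resulting variation is only piecewise smooth), and the minimization (4.4) is replaced by a search over ten candidate values. All function names and numerical parameters are assumptions.

```python
P = 0.5                      # parameter p of Example 2, case (b)
N = 400                      # number of grid intervals on T = [0, 1] (assumption)
STEP = 1.0 / N
GRID = [i / N for i in range(N + 1)]

def trapezoid(vals):
    return STEP * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def solve_state(u):
    """Heun integration of x' = p*x*u(t), x(0) = 1."""
    x = [1.0]
    for i in range(N):
        k1 = P * x[i] * u(GRID[i])
        k2 = P * (x[i] + STEP * k1) * u(GRID[i + 1])
        x.append(x[i] + 0.5 * STEP * (k1 + k2))
    return x

def solve_adjoint(u):
    """Heun integration, backward from psi(1) = 0, of psi' = -dH/dx = -p*u*psi + u - 1."""
    psi = [0.0] * (N + 1)
    for i in range(N, 0, -1):
        k1 = -P * u(GRID[i]) * psi[i] + u(GRID[i]) - 1.0
        guess = psi[i] - STEP * k1
        k2 = -P * u(GRID[i - 1]) * guess + u(GRID[i - 1]) - 1.0
        psi[i - 1] = psi[i] - 0.5 * STEP * (k1 + k2)
    return psi

def cost(u):
    """J(u) = integral of (u - 1)*x over [0, 1], trapezoidal rule."""
    x = solve_state(u)
    return trapezoid([(u(t) - 1.0) * xi for t, xi in zip(GRID, x)])

def interpolate(vals):
    """Piecewise-linear interpolant of grid values, clamped to [0, 1]."""
    def f(t):
        t = min(max(t, 0.0), 1.0)
        i = min(int(t * N), N - 1)
        w = t * N - i
        return (1.0 - w) * vals[i] + w * vals[i + 1]
    return f

def perturb(u, delta, eps):
    """u_eps(t) = u(t + eps*delta(t)); the clamp only absorbs rounding errors."""
    return lambda t: u(min(max(t + eps * delta(t), 0.0), 1.0))

def interior_variation_method(u, iterations=3, tol=1e-8):
    history = [cost(u)]
    for _ in range(iterations):
        x, psi = solve_state(u), solve_adjoint(u)               # STEP 1
        uv = [u(t) for t in GRID]
        du = []
        for i in range(N + 1):                                  # finite-difference slope of u
            lo, hi = max(i - 1, 0), min(i + 1, N)
            du.append((uv[hi] - uv[lo]) / ((hi - lo) * STEP))
        W = [x[i] * (P * psi[i] - 1.0) * du[i] for i in range(N + 1)]   # STEP 2, (3.6)
        w_max = max(abs(w) for w in W)
        if w_max == 0.0:
            break
        dv = [GRID[i] * (1.0 - GRID[i]) * W[i] / w_max for i in range(N + 1)]  # (2.8)
        mu = trapezoid([W[i] * dv[i] for i in range(N + 1)])    # (4.3)
        if mu <= tol:                                           # STEP 3
            break
        delta = interpolate(dv)                                 # STEP 4
        best = min((cost(perturb(u, delta, j / 10)), j / 10)    # STEP 5 on a finite set
                   for j in range(1, 11))
        if best[0] >= history[-1]:
            break
        u = perturb(u, delta, best[1])                          # STEP 6
        history.append(best[0])
    return u, history

u_final, history = interior_variation_method(lambda t: 1.0 - t)
```

Here `history` collects the functional values of the accepted iterates, so it is strictly decreasing by construction, and `u_final` is a composition of the initial control with the shifted arguments, which keeps it within the bounds and end-point values of (5.7). A production version would replace the linear interpolation by a smooth approximation of the deviation function, as discussed in Remark 2.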
Remark 2. Due to (3.6), the function $W(u^k, t)$ may have a rather complicated structure in terms of $t$, since it always includes both $x(t)$ and $\psi(t)$. Quite often, however, these solution profiles of the direct (2.2) and conjugate (3.2) systems can only be recovered by means of numerical integration. In that case, the explicit forms of (3.6) and (4.3) may seem useless. On the other hand, by defining
$$\delta^k(t) = \frac{(t - t_0)(t_1 - t)}{t_1 - t_0} \cdot \frac{W(u^k, t)}{\max_{t \in T} |W(u^k, t)|} \qquad (4.5)$$
we only wanted to guarantee that $\delta^k(t)$ and $W(u^k, t)$ carry the same sign and thus to ensure the non-negativity of the dominant term of (3.5). Therefore, the exact form of $W(u^k, t)$ is not actually required and, for computational purposes, we can approximate $W(u^k, t)$ in (4.5) with a minimum-degree interpolating polynomial $P_k(t)$ carrying the same sign as $W(u^k, t)$ everywhere on $T$. In other words, $P_k(t)$ must interpolate all zeros of $W(u^k, t)$ and carry the same sign as $W(u^k, t)$ for all $t \in T$.
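One simple way to obtain such a sign-matching polynomial is sketched below. It assumes that the zeros of the deviation function inside $T$ are known (for instance, located numerically) and that the function changes sign at each of them; the helper name `sign_polynomial` is hypothetical.

```python
def sign_polynomial(zeros, sign_at_t0):
    """Return P(t) = s * prod(t - z) over the given zeros.

    s = +1 or -1 is chosen so that P has the sign `sign_at_t0` to the left of
    the smallest zero. Assumes every listed zero is a simple zero at which the
    deviation function changes sign.
    """
    s = sign_at_t0 * (-1) ** len(zeros)

    def P(t):
        value = float(s)
        for z in zeros:
            value *= (t - z)
        return value

    return P

# Hypothetical deviation function that is positive near t0 with sign changes at 0.3 and 0.7.
P_k = sign_polynomial([0.3, 0.7], +1)
```

When the deviation function has no zeros at all, the polynomial reduces to the constant $\pm 1$, so the interior variation becomes $\pm (t - t_0)(t_1 - t)/(t_1 - t_0)$.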
Theorem 2. Suppose that $J(u)$ in the problem (2.1)-(2.4) is bounded from below for all admissible controls. Then the sequence of admissible controls $\{u^k\}$ generated by the described algorithm is strictly relaxational, i.e., $J(u^{k+1}) < J(u^k)$, $k = 0, 1, 2, \ldots$, and converges to the necessary condition of optimality in the sense that
$$\lim_{k \to \infty} \mu(u^k) = 0. \qquad (4.6)$$
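Proof. Taking into account the estimate $|o(\varepsilon)| \le K_2 \varepsilon^2$, the increment formula (4.2) should be examined for $u = u^k$, $u_\varepsilon = u^k_\varepsilon$:
$$J(u^k_\varepsilon) - J(u^k) \le -\varepsilon\,\mu(u^k) + K_2 \varepsilon^2.$$
By virtue of the inequality $\mu(u^k) > 0$, the strict relaxation for small $\varepsilon > 0$ becomes obvious. Hence, taking into consideration the minimization problem (4.4), we have
$$J(u^{k+1}) - J(u^k) \le -\varepsilon\,\mu(u^k) + K_2 \varepsilon^2, \quad \varepsilon \in [0,1].$$
This inequality can be transformed into
$$0 \le \varepsilon\,\mu(u^k) \le J(u^k) - J(u^{k+1}) + K_2 \varepsilon^2, \quad \varepsilon \in [0,1]. \qquad (4.7)$$
Due to the relaxation property and the boundedness of $J(u)$ from below, we have $0 < J(u^k) - J(u^{k+1}) \to 0$ as $k \to \infty$. Then, passing to the limit in (4.7) as $k \to \infty$, we obtain
$$0 \le \varepsilon \lim_{k \to \infty} \mu(u^k) \le K_2 \varepsilon^2, \quad \varepsilon \in [0,1].$$
Dividing by $\varepsilon > 0$ and letting $\varepsilon \to 0$ shows that the latter is valid only if (4.6) holds. □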
5. Examples
In order to illustrate the performance and the convergence of the optimization technique described in the previous section, we consider two examples.
Example 1. It is desired to minimize the cost of the waste products $x(t)$ produced in the course of some chemical reaction by choosing an optimal temperature policy $u(t)$ over the time interval $[0,T]$. The cost of the waste products is proportional to the square of the amount of waste produced, and there is a cost associated with the temperature policy which is proportional to the square of the temperature applied to the reaction. The objective functional is
$$J(u) = \frac{1}{2} \int_0^T \left[ q x^2(t) + r u^2(t) \right] dt \to \min, \qquad (5.1)$$
where $q > 0$ is the cost coefficient for waste and $r > 0$ is the cost coefficient for temperature.
The rate of change $\dot{x}(t)$ of the production of waste at time $t$ is linearly related to the production of waste $x(t)$ and to the temperature of the reaction $u(t)$ at time $t$ by the equation
$$\dot{x} = a x(t) - b u(t), \quad a, b > 0. \qquad (5.2)$$
The temperature $u(t)$ is maintained between its minimal value $u_{\min}$ and its maximal value $u_{\max}$:
$$u_{\min} \le u(t) \le u_{\max}, \quad \forall t \in [0,T]. \qquad (5.3)$$
This restriction is natural because of the physical limitations of the equipment that regulates the temperature of the reaction. Moreover, both the initial and the final temperature of the reaction should take preassigned values, that is,
$$u(0) = u_0, \quad u(T) = u_T, \qquad (5.4)$$
where $u_0$ and $u_T$ are constants within the range $[u_{\min}, u_{\max}]$. These restrictions are also appropriate since the reaction must start and end at some specified temperatures in order to avoid possible alterations in the final product.
Fig. 2. Example 1, numerical results: (a) three successive control strategies; (b) corresponding integrand functions.
This optimal control problem is linear-quadratic, and its sub-variant (5.1)-(5.2) without control constraints can be solved using the so-called “feedback control” technique attributed to Kalman [5]:
$$u^*(t) = -\frac{b}{r}\,\xi(t)\,x(t),$$
where $\xi(t)$ stands for the solution to the Riccati equation
$$\dot{\xi}(t) = -2a\,\xi(t) - \frac{b^2}{r}\,\xi^2(t) + q, \quad \xi(T) = 0.$$
The presence of the control constraints (5.3)-(5.4) does not allow us to use the above feedback scheme. However, we can try to improve the temperature regime and obtain a significant decrease of the functional (5.1) by performing several iterations of the algorithm proposed in the previous section.
For the following data set
$$T = 1, \quad a = 2, \quad b = 1, \quad r = 2, \quad q = 1, \quad u_{\min} = 0, \quad u_{\max} = 1, \quad u_0 = 0, \quad u_T = 0.75$$
we can start with some admissible smooth control function, e.g.,
$$u^0(t) = -\frac{9}{4}\,t \left( t - \frac{4}{3} \right), \quad u^0(0) = u_0 = 0, \quad u^0(T) = u_T = 0.75,$$
and successively perform three iterations of the algorithm based on interior variations. The results of this numerical simulation are shown in Table 1 and Figure 2, where $F^k(t) = \frac{1}{2}\left[ q \left( x^k(t) \right)^2 + r \left( u^k(t) \right)^2 \right]$ stands for the integrand function of (5.1), while the dashed line is used for $k = 1$, the thin solid line for $k = 2$ and the thick solid line for $k = 3$.
The control strategy $u^3(t)$ suggests implementing a small jump of temperature in the proximity of the initial time $t = 0$ in order to achieve a considerable decrease of the area below the curve $F^3(t)$, which is equal to the value of the functional $J(u^3)$.
We can also observe the presence of errors on the graph of $F^3(t)$ when $t$ approaches $T = 1$. They originate and accumulate due to the numerical integration of the system (5.2). This clearly indicates that it is not advisable to perform more iterations of the method.
Iteration, k    Functional, J(u^k)    Decrease (%)
k = 0           0.977143              -
k = 1           0.681681              30 %
k = 2           0.468412              31 %
Table 1. Example 1: functional values for successive approximations $u^k(t)$, $k = 1, 2, 3$.
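The first functional value of Table 1 can be cross-checked with a few lines of Python. The sketch below rests on two assumptions that are not stated explicitly in the text: the initial state is taken as $x(0) = 0$, and the initial control is the quadratic $-\frac{9}{4} t (t - \frac{4}{3})$, which satisfies the prescribed end-point values and the bounds (5.3). The fourth-order Runge-Kutta scheme and the names `u0` and `cost` are likewise choices made for this illustration.

```python
A, B, R, Q, T_END = 2.0, 1.0, 2.0, 1.0, 1.0   # data set of Example 1
X_INIT = 0.0   # ASSUMPTION: the initial state is not stated in the text

def u0(t):
    """Assumed initial control: a quadratic with u0(0) = 0 and u0(1) = 0.75."""
    return -2.25 * t * (t - 4.0 / 3.0)

def cost(u, n=2000):
    """Evaluate (5.1) along x' = a*x - b*u by RK4 on the augmented system."""
    h = T_END / n
    x, total = X_INIT, 0.0
    rate = lambda t, x: A * x - B * u(t)
    running = lambda t, x: 0.5 * (Q * x * x + R * u(t) ** 2)
    for i in range(n):
        t = i * h
        # RK4 stages for the state equation (5.2)
        k1 = rate(t, x)
        k2 = rate(t + 0.5 * h, x + 0.5 * h * k1)
        k3 = rate(t + 0.5 * h, x + 0.5 * h * k2)
        k4 = rate(t + h, x + h * k3)
        # matching stages for the running cost, i.e. the integrand of (5.1)
        c1 = running(t, x)
        c2 = running(t + 0.5 * h, x + 0.5 * h * k1)
        c3 = running(t + 0.5 * h, x + 0.5 * h * k2)
        c4 = running(t + h, x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        total += h * (c1 + 2.0 * c2 + 2.0 * c3 + c4) / 6.0
    return total

J0 = cost(u0)
```

Under these assumptions the computed value agrees with the first row of Table 1 (0.977143) to the printed precision.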
Example 2. Consider the following scalar nonlinear problem of optimal control:
$$J(u) = \int_0^1 [u(t) - 1] \cdot x(t)\,dt \to \min, \qquad (5.5)$$
$$\dot{x}(t) = p \cdot x(t) \cdot u(t), \quad x(0) = 1, \quad p > 0, \quad t \in T = [0,1], \qquad (5.6)$$
with direct constraints for all control functions
$$u \in C^1[0,1], \quad 0 \le u(t) \le 1, \quad u(0) = 1, \quad u(1) = 0. \qquad (5.7)$$
For this problem we have $H(\psi,x,u,t) = p \cdot \psi(t) \cdot x(t) \cdot u(t) - (u(t) - 1) \cdot x(t)$, where $\psi(t)$ can be found by solving the conjugate problem (3.2) for any admissible $u(t)$, that is,
$$\dot{\psi}(t) = -p \cdot u(t) \cdot \psi(t) + u(t) - 1, \quad \psi(1) = 0. \qquad (5.8)$$
Actually, there are two separate cases to be considered: (a) $p > 1$ and (b) $0 < p \le 1$. In case (a), the problem can be easily solved using the classical maximum principle (see more details in [6, pp. 123-126]). However, this solution is not smooth and has a piecewise constant structure (see Figure 3 (a)):
$$u^*(t) = \begin{cases} 1, & \text{if } t \in \left[ 0, \frac{p-1}{p} \right), \\ 0, & \text{if } t \in \left[ \frac{p-1}{p}, 1 \right], \end{cases} \quad u^*(0) = 1, \quad u^*(1) = 0.$$
Fig. 3. Example 2, solutions via the maximum principle: (a) feasible non-smooth solution; (b) non-feasible solution.
Fig. 4. (a) Members of the sequence $\{u^k(t)\}$ for $k = 0, 1, \ldots, 7$; (b) functional values $J(u^k)$ for $k = 0, 1, \ldots, 7$.
In case (b), the maximum principle provides a constant solution $u^*(t) = 0$ which is, in effect, non-feasible since the end-point condition $u(0) = 1$ is not satisfied (see Figure 3 (b)).
Here we also have that, according to [6],
$$\inf_{u(t) \in U} J(u) = -1,$$
where $U$ is defined by (5.7). Our task is to construct a relaxational sequence of smooth admissible controls $\{u^k(t)\}$ for case (b) using the optimization algorithm described in the previous section.
First, we have to choose an initial approximation from the set of smooth real functions satisfying (5.7). A convenient choice is to work with polynomials and make use of the wide variety of computational tools based on polynomial interpolation, as already mentioned in Remark 2. A segment of the straight line passing through the end-points is an admissible control in minimum-degree polynomial form: $u^0(t) = 1 - t$, $u^0(0) = 1$, $u^0(1) = 0$.
For this initial control, plotted in Figure 3 (b) with a dashed line, we set $p = 0.5$ and then employ the numerical algorithm using the same $\delta^k(t)$ and $\varepsilon_k$ for all iterations, that is, $\delta^k(t) = t(1 - t)$, $\varepsilon_k = 1$, $k = 0, 1, 2, \ldots$. Since $1 - (t + t(1 - t)) = (1 - t)^2$, this choice yields a compact form of the sequence $\{u^k(t)\}$ with $u^k(t) = (1 - t)^{2^k}$. The limit of this sequence as $k \to \infty$ gives a feasible solution to our problem:
$$u^*(t) = \lim_{k \to \infty} (1 - t)^{2^k}, \quad t \in [0,1], \quad u^*(0) = 1, \quad u^*(1) = 0.$$
Iteration    Control                      Functional
k = 0        $u^0(t) = 1 - t$             -0.61654223956
k = 1        $u^1(t) = (1 - t)^2$         -0.77167189561
k = 2        $u^2(t) = (1 - t)^4$         -0.89664786101
k = 3        $u^3(t) = (1 - t)^8$         -0.93708411719
k = 4        $u^4(t) = (1 - t)^{16}$      -0.96848132167
k = 5        $u^5(t) = (1 - t)^{32}$      -0.98428241605
k = 6        $u^6(t) = (1 - t)^{64}$      -0.99670285341
k = 7        $u^7(t) = (1 - t)^{128}$     -0.99860249872
Table 2. Example 2: functional values for $u^k(t)$, $k = 0, 1, \ldots, 7$.
Figure 4(a) shows the first eight terms of the sequence $\{u^k(t)\}$, whereas Figure 4(b) clearly indicates the relaxation property of $\{u^k(t)\}$, that is,
$$J(u^{k+1}) < J(u^k), \quad k = 0, 1, 2, \ldots, 7.$$
More detailed data are summarized in Table 2.
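For controls of the form $(1 - t)^m$ the state equation (5.6) can be integrated in closed form, $x(t) = \exp\left( p\,\frac{1 - (1 - t)^{m+1}}{m + 1} \right)$, so the functional (5.5) reduces to a one-dimensional quadrature. The Python sketch below evaluates it with the composite Simpson rule for $m = 2^k$ and $p = 0.5$; the function name `cost` and the number of subintervals are assumptions of this illustration.

```python
import math

P = 0.5   # value of the parameter p used in Example 2

def cost(k, n=20000):
    """J(u^k) for u^k(t) = (1 - t)**(2**k), by the composite Simpson rule (n even)."""
    m = 2 ** k

    def integrand(t):
        s = 1.0 - t
        # closed-form solution of x' = p*x*(1 - t)**m with x(0) = 1
        x = math.exp(P * (1.0 - s ** (m + 1)) / (m + 1))
        return (s ** m - 1.0) * x

    h = 1.0 / n
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

values = [cost(k) for k in range(8)]
```

For $k = 0$, $1$ and $3$ the values computed this way agree with the corresponding rows of Table 2 to about five decimal places (the other rows were not used as a check), and the list `values` is strictly decreasing, in line with the relaxation property.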
References
1. Vasiliev O. V. Optimization methods. World Federation Publishers Company, Atlanta, GA, 1996.
2. Vasilieva O. Interior variations in dynamic optimization problems. Optimization, 2008, Vol. 57, P. 807-825.
3. Letnikov A. V. Kurs variatzionnogo ischisleniya [Course on variational calculus]. Moscow Imperial Technical College, 1981 [in Russian].
4. Alekseev V. M., Tikhomirov V. M., Fomin S. V. Optimal’noe upravlenie [Optimal Control]. Moscow: Nauka, 1979 [in Russian].
5. Kalman R. E. Contributions to the theory of optimal control. Boletin de la Sociedad Matematica Mexicana, 1960, Vol. 2, P. 102-119.
6. Vasiliev O. V., Arguchintsev A. V. Metody optimizacii v zadachah i uprazhneniyah [Optimization methods: problems and exercises]. Moscow: Fizmatlit, 1999 [in Russian].
Vasilieva Olga, Professor, Department of Mathematics, University of Valle, Ciudad Universitaria Melendez, Calle 13 No. 100-00, Cali, Colombia; phone: +57 (2) 339 3227, ([email protected])