Numerical Study of a Linear Differential Game with Two Pursuers and One Evader*
Sergey S. Ganebny1, Sergey S. Kumkov1, Stephane Le Menec2 and
Valerii S. Patsko1
1 Institute of Mathematics and Mechanics,
S.Kovalevskaya str., 16, Ekaterinburg, 620990, Russia,
E-mail: [email protected]
2 EADS/MBDA France,
1 avenue Reaumur, 92358 Le Plessis-Robinson Cedex, France,
E-mail: stephane.le-menec@mbda-systems.com
Abstract. A linear pursuit-evasion differential game with two pursuers and one evader is considered. The pursuers try to minimize the final miss (the ideal outcome is exact capture); the evader counteracts them. Two cases are investigated. In the first case, each pursuer is dynamically stronger than the evader; in the second, each pursuer is weaker. Results of a numerical study of the value function level sets (Lebesgue sets) are given for these cases. A method for constructing optimal feedback controls on the basis of switching lines is suggested. Results of numerical simulation are shown.
Keywords: pursuit-evasion differential game, linear dynamics, value function, optimal feedback control.
1. Introduction and Problem Formulation
1. In the paper, a model differential game with two pursuers and one evader is studied. Three inertial objects move along a straight line. The dynamics of the pursuers $P_1$ and $P_2$ are described by
$$
\begin{aligned}
&\ddot z_{P_1} = a_{P_1}, && \ddot z_{P_2} = a_{P_2},\\
&\dot a_{P_1} = (u_1 - a_{P_1})/l_{P_1}, && \dot a_{P_2} = (u_2 - a_{P_2})/l_{P_2},\\
&|u_1| \le \mu_1, && |u_2| \le \mu_2,\\
&a_{P_1}(t_0) = 0, && a_{P_2}(t_0) = 0.
\end{aligned}
\tag{1}
$$
Here, $z_{P_1}$ and $z_{P_2}$ are the geometric coordinates of the pursuers, and $a_{P_1}$ and $a_{P_2}$ are their accelerations generated by the controls $u_1$ and $u_2$. The time constants $l_{P_1}$ and $l_{P_2}$ define how fast the controls affect the systems.
The dynamics of the evader $E$ is similar:
$$
\ddot z_E = a_E, \quad \dot a_E = (v - a_E)/l_E, \quad |v| \le \nu, \quad a_E(t_0) = 0.
\tag{2}
$$
Let us fix some instants $T_1$ and $T_2$. At the instant $T_1$, the miss of the first pursuer with respect to the evader is computed, and at the instant $T_2$, the miss of the second one is computed:
$$
r_{P_1,E}(T_1) = |z_E(T_1) - z_{P_1}(T_1)|, \quad r_{P_2,E}(T_2) = |z_E(T_2) - z_{P_2}(T_2)|.
\tag{3}
$$
* This work was supported by the Russian Foundation for Fundamental Researches under grants No. 09-01-00436, 10-01-96006, and by the Program “Mathematical control theory” of the Presidium of RAS.
Assume that the pursuers act in coordination. This means that we can join them into one player $P$ (which will be called the first player). This player governs the vector control $u = (u_1, u_2)$. The evader is regarded as the second player. The resulting miss is the value
$$
\varphi = \min\{ r_{P_1,E}(T_1),\ r_{P_2,E}(T_2) \}.
\tag{4}
$$
At any instant $t$, all players know the exact values of all state coordinates $z_{P_1}$, $\dot z_{P_1}$, $a_{P_1}$, $z_{P_2}$, $\dot z_{P_2}$, $a_{P_2}$, $z_E$, $\dot z_E$, $a_E$. The first player, choosing its feedback control, minimizes the miss $\varphi$; the second one maximizes it.
Relations (1)-(4) define a standard antagonistic differential game. One needs to construct the value function of this game and optimal strategies of the players.
2. There are many publications dealing with differential games where one group of objects pursues another group; see, for example, (Stipanovic et al., 2009), (Blagodatskih and Petrov, 2009), (Chikrii, 1997), (Levchenkov and Pashkov, 1990), (Abramyantz and Maslov, 2004), (Pschenichnyi, 1976), (Grigorenko, 1991), (Hagedorn and Breakwell, 1976). The problem under consideration has two pursuers and one evader, so in terms of the number of objects it is the simplest one. On the other hand, rigorous mathematical studies of “group-on-group” problems usually rely on quite strong assumptions about the dynamics of the objects, the dimension of the state vector, and the termination conditions. In contrast, this paper considers the problem without any assumptions of this type. The solution of the problem may be of interest for group differential games.
3. Now, let us describe a practical problem whose reasonable simplification gives model game (1)-(4). Suppose that two pursuing objects attack the evading one on collision courses. They can be rockets or aircraft in the horizontal plane. A nominal motion of the first pursuer is chosen such that exact capture occurs at the instant $T_1$. A nominal motion of the second pursuer is chosen in the same way (the capture is at the instant $T_2$). In reality, however, the positions of the objects differ from the nominal ones. Moreover, the evader can use its control to change its trajectory in comparison with the nominal one (but not drastically, without sharp turns). Correcting coordinated efforts of the pursuers are computed during the process by the feedback method to minimize the resulting miss, which is the minimum of the absolute values of the deviations at the instants $T_1$ and $T_2$ from the first and second pursuers, respectively, to the evader.
The passage from the original non-linear dynamics to a dynamics linearized with respect to the nominal motions gives (Shima and Shinar, 2002), (Shinar and Shima, 2002) the problem under consideration.
4. The paper includes results of numerical study of game (1)-(4) for two marginal cases: 1) both pursuers P1 and P2 are dynamically stronger than the evader E; 2) both pursuers are dynamically weaker. Results for intermediate situations will be published in another work.
The difficulty of the solution stems from the fact that the payoff function $\varphi$ is not convex (even for the case $T_1 = T_2$). In the paper (Le Menec, 2011), a case of “stronger” pursuers is considered, and analytical methods are applied to the problem of constructing the solvability set in the game with zero resulting miss. For $T_1 = T_2$, an exact solution is obtained; if $T_1 \ne T_2$, then some upper approximation of the set is given. In the general case, an exact analytical solution cannot be obtained, in the authors' opinion.
The numerical study is based on algorithms and programs for solving linear differential games developed at the Institute of Mathematics and Mechanics (Ural Branch of Russian Academy of Sciences, Ekaterinburg, Russia). The central procedure is the backward construction of level sets (Lebesgue sets) of the value function. Optimal strategies of the players are constructed by processing the level sets.
2. Passage to Two-Dimensional Differential Game
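First, let us pass to the relative geometric coordinates $y_1 = z_E - z_{P_1}$, $y_2 = z_E - z_{P_2}$ in dynamics (1), (2) and payoff function (4). After this, we have the following relations:
$$
\begin{aligned}
&\ddot y_1 = a_E - a_{P_1}, && \ddot y_2 = a_E - a_{P_2},\\
&\dot a_{P_1} = (u_1 - a_{P_1})/l_{P_1}, && \dot a_{P_2} = (u_2 - a_{P_2})/l_{P_2},\\
&\dot a_E = (v - a_E)/l_E,\\
&|u_1| \le \mu_1, \quad |u_2| \le \mu_2, \quad |v| \le \nu,\\
&\varphi = \min\{|y_1(T_1)|,\ |y_2(T_2)|\}.
\end{aligned}
\tag{5}
$$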
The state variables of system (5) are $y_1$, $\dot y_1$, $a_{P_1}$, $y_2$, $\dot y_2$, $a_{P_2}$, $a_E$; $u_1$ and $u_2$ are the controls of the first player; $v$ is the control of the second one. The payoff function $\varphi$ depends on the coordinate $y_1$ at the instant $T_1$ and on the coordinate $y_2$ at the instant $T_2$. From a general point of view (existence of the value function, positional type of the optimal strategies), differential game (5) is a particular case of a differential game with a positional functional (Krasovskii and Krasovskii, 1993).
A standard approach, set forth in (Krasovskii and Subbotin, 1974) and (Krasovskii and Subbotin, 1988) for studying linear differential games with a fixed terminal instant and a payoff function depending on some state coordinates at the terminal instant, is to pass to new state coordinates. They can be treated as the values of the target coordinates forecasted to the terminal instant under zero controls. In our situation, we have two instants $T_1$ and $T_2$, but the coordinates computed at these instants are independent; namely, at the instant $T_1$ we should take into account $y_1(T_1)$ only, and at the instant $T_2$ we use the value $y_2(T_2)$. This fact allows us to use the mentioned approach when solving differential game (5). Thus, we pass to new state coordinates $x_1$ and $x_2$, where $x_1(t)$ is the value of $y_1$ forecasted to the instant $T_1$, and $x_2(t)$ is the value of $y_2$ forecasted to the instant $T_2$.
The forecasted values are computed by the formula
$$
x_i = y_i + \dot y_i \tau_i - a_{P_i} l_{P_i}^2 h(\tau_i/l_{P_i}) + a_E l_E^2 h(\tau_i/l_E), \quad i = 1, 2.
\tag{6}
$$
Here, $x_i$, $y_i$, and $\dot y_i$ depend on $t$; $\tau_i = T_i - t$; $h(\alpha) = e^{-\alpha} + \alpha - 1$. Note that the values $\tau_1$ and $\tau_2$ are connected by the relation $\tau_1 - \tau_2 = \mathrm{const} = T_1 - T_2$. Since $h(0) = 0$, one has $x_i(T_i) = y_i(T_i)$.
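The forecast can be illustrated by a short Python sketch. It takes $h(\alpha) = e^{-\alpha} + \alpha - 1$ and a minus sign in front of the pursuer term, which is the form consistent with $y_i = z_E - z_{P_i}$ and with the property $x_i(T_i) = y_i(T_i)$. The function and argument names are illustrative assumptions, not part of the original text.

```python
import math


def h(alpha: float) -> float:
    """h(alpha) = exp(-alpha) + alpha - 1; note that h(0) = 0."""
    return math.exp(-alpha) + alpha - 1.0


def forecast(y, y_dot, a_p, a_e, tau, l_p, l_e):
    """Zero-control forecast x_i of the relative coordinate y_i (hypothetical helper).

    y, y_dot : relative position z_E - z_P and its time derivative
    a_p, a_e : current accelerations of the pursuer and the evader
    tau      : backward time T_i - t
    l_p, l_e : time constants of the pursuer and the evader
    """
    return (y + y_dot * tau
            - a_p * l_p ** 2 * h(tau / l_p)
            + a_e * l_e ** 2 * h(tau / l_e))


# At the terminal instant (tau = 0) the forecast coincides with y itself.
print(forecast(y=3.0, y_dot=-1.0, a_p=0.5, a_e=0.2, tau=0.0, l_p=0.5, l_e=1.0))
```

With zero accelerations the forecast reduces to the straight-line extrapolation $y_i + \dot y_i \tau_i$.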
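The dynamics in the new coordinates $x_1$, $x_2$ is the following (Le Menec, 2011):
$$
\begin{aligned}
&\dot x_1 = -l_{P_1} h(\tau_1/l_{P_1})\, u_1 + l_E h(\tau_1/l_E)\, v,\\
&\dot x_2 = -l_{P_2} h(\tau_2/l_{P_2})\, u_2 + l_E h(\tau_2/l_E)\, v,\\
&|u_1| \le \mu_1, \quad |u_2| \le \mu_2, \quad |v| \le \nu,\\
&\varphi(x_1(T_1), x_2(T_2)) = \min\{|x_1(T_1)|,\ |x_2(T_2)|\}.
\end{aligned}
\tag{7}
$$
The first player governs the controls $u_1$, $u_2$ and minimizes the payoff $\varphi$; the second one has the control $v$ and maximizes $\varphi$.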
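As a consistency check, the two-dimensional dynamics can be recovered by differentiating the forecast $x_i = y_i + \dot y_i \tau_i - a_{P_i} l_{P_i}^2 h(\tau_i/l_{P_i}) + a_E l_E^2 h(\tau_i/l_E)$ along system (5). The sketch below uses only $\dot\tau_i = -1$ and the identity $h(\alpha) + h'(\alpha) = \alpha$, which holds for $h(\alpha) = e^{-\alpha} + \alpha - 1$:

```latex
\begin{aligned}
\dot x_i &= \dot y_i + \ddot y_i \tau_i - \dot y_i
   - \dot a_{P_i} l_{P_i}^2 h(\tau_i/l_{P_i}) + a_{P_i} l_{P_i} h'(\tau_i/l_{P_i})
   + \dot a_E l_E^2 h(\tau_i/l_E) - a_E l_E h'(\tau_i/l_E) \\
 &= (a_E - a_{P_i})\tau_i
   - (u_i - a_{P_i})\, l_{P_i} h(\tau_i/l_{P_i}) + a_{P_i} l_{P_i} h'(\tau_i/l_{P_i})
   + (v - a_E)\, l_E h(\tau_i/l_E) - a_E l_E h'(\tau_i/l_E) \\
 &= -l_{P_i} h(\tau_i/l_{P_i})\, u_i + l_E h(\tau_i/l_E)\, v
   + a_{P_i}\bigl[\, l_{P_i} (h + h')(\tau_i/l_{P_i}) - \tau_i \bigr]
   - a_E \bigl[\, l_E (h + h')(\tau_i/l_E) - \tau_i \bigr] \\
 &= -l_{P_i} h(\tau_i/l_{P_i})\, u_i + l_E h(\tau_i/l_E)\, v .
\end{aligned}
```

Both brackets vanish because $l\,(h + h')(\tau/l) = l \cdot \tau/l = \tau$, so the accelerations drop out and only the control terms remain.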
Note that the control $u_1$ ($u_2$) affects only the horizontal (vertical) component $\dot x_1$ ($\dot x_2$) of the velocity vector $\dot x = (\dot x_1, \dot x_2)$. When $T_1 = T_2$, the second summand in dynamics (7) is the same for $\dot x_1$ and $\dot x_2$. Thus, the component of the velocity vector $\dot x$ depending on the second player's control is directed at any instant $t$ along the bisectrix of the first and third quadrants of the plane $x_1$, $x_2$. When $v = +\nu$, the angle between the axis $x_1$ and the velocity vector of the second player is 45°; when $v = -\nu$, the angle is 225°. This property simplifies the dynamics in comparison with the case $T_1 \ne T_2$.
Let $x = (x_1, x_2)$ and let $V(t, x)$ be the value function at the position $(t, x)$. For any $c \ge 0$, the value function level set
$$
W_c = \{(t, x) : V(t, x) \le c\}
$$
coincides with the maximal stable bridge (see (Krasovskii and Subbotin, 1974) and (Krasovskii and Subbotin, 1988)) built from the terminal set
$$
M_c = \{(t, x) : t = T_1,\ |x_1| \le c\} \cup \{(t, x) : t = T_2,\ |x_2| \le c\}.
$$
The set $W_c$ can be treated as the solvability set for the considered game with the result not greater than $c$. When $c = 0$, one has the situation of exact capture. Exact capture means that at least one of $x_1(T_1)$ and $x_2(T_2)$ equals zero.
Comparing the dynamic capabilities of each of the pursuers $P_1$ and $P_2$ with those of the evader $E$, one can introduce the parameters (Le Menec, 2011) $\eta_i = \mu_i/\nu$ and $\varepsilon_i = l_E/l_{P_i}$, $i = 1, 2$. They define the shape of the maximal stable bridges in the individual games $P_1$ against $E$ and $P_2$ against $E$.
Fig. 1. Different variants of the stable bridges evolution in an individual game (left panel: strong pursuer, $\eta > 1$, $\eta\varepsilon > 1$; right panel: weak pursuer, $\eta < 1$, $\eta\varepsilon < 1$)
Consider two cases: 1) $\eta_i > 1$, $\eta_i\varepsilon_i > 1$, $i = 1, 2$; 2) $\eta_i < 1$, $\eta_i\varepsilon_i < 1$, $i = 1, 2$. In the first case, each of the pursuers $P_1$ and $P_2$ is stronger than the evader $E$; in the second one, both pursuers are weaker. The maximal stable bridges in the individual games in the first case look as shown in Fig. 1 (left); the right subfigure in Fig. 1 gives the outline for the second case. The horizontal axis is the backward time $\tau$, the vertical axis is the one-dimensional state variable $x$.
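The two cases can be checked mechanically from the game parameters. The following Python sketch computes $\eta = \mu/\nu$ (ratio of control bounds) and $\varepsilon = l_E/l_P$ (ratio of time constants) for one pursuer and applies the two conditions above; the function name and the label "intermediate" for all remaining parameter combinations are assumptions made for illustration.

```python
def pursuer_strength(mu: float, nu: float, l_p: float, l_e: float) -> str:
    """Classify one pursuer against the evader (hypothetical helper).

    mu, nu   : control bounds of the pursuer and the evader
    l_p, l_e : time constants of the pursuer and the evader
    """
    eta = mu / nu      # ratio of control bounds
    eps = l_e / l_p    # ratio of time constants
    if eta > 1 and eta * eps > 1:
        return "strong"
    if eta < 1 and eta * eps < 1:
        return "weak"
    return "intermediate"


# Parameter sets used later in the paper.
print(pursuer_strength(2.0, 1.0, 1 / 2, 1.0))      # first pursuer, strong case
print(pursuer_strength(3.0, 1.0, 1 / 0.857, 1.0))  # second pursuer, strong case
print(pursuer_strength(0.9, 1.0, 1 / 0.7, 1.0))    # first pursuer, weak case
```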
3. Level Sets of the Value Function
As mentioned above, a level set $W_c$ of the value function is the maximal stable bridge for dynamics (7) built in the space $t$, $x$ from the target set $M_c$. A time section ($t$-section) $W_c(t)$ of the bridge $W_c$ at the instant $t$ is a set in the plane of the two-dimensional variable $x$.
To be definite, let $T_1 > T_2$. Then, for any $t \in (T_2, T_1]$, the set $W_c(t)$ is a vertical strip around the axis $x_2$. Its width along the axis $x_1$ equals the width of the bridge in the individual game $P_1$-$E$ at the instant $\tau = T_1 - t$ of the backward time. At the instant $t = T_1$, the half-width of $W_c(T_1)$ is equal to $c$.
Denote by $W_c(T_2 + 0)$ the right limit of the set $W_c(t)$ as $t \to T_2 + 0$. Then, the set $W_c(T_2)$ is cross-like, obtained as the union of the vertical strip $W_c(T_2 + 0)$ and a horizontal strip around the axis $x_1$ with the width equal to $2c$ along the axis $x_2$.
When $t \le T_2$, the backward construction of the sets $W_c(t)$ is made starting from the set $W_c(T_2)$.
The algorithm suggested by the authors for constructing the approximating sets $W_c(t)$ uses a time grid in the interval $[0, T_1]$: $t_N = T_1$, $t_{N-1}$, ..., $t_S = T_2$, $t_{S-1}$, $t_{S-2}$, .... For any instant $t_k$ of the grid, the set $W_c(t_k)$ is built on the basis of the previous set $W_c(t_{k+1})$ and a dynamics obtained from (7) by fixing its value at the instant $t_{k+1}$. So, dynamics (7), which varies in the interval $(t_k, t_{k+1}]$, is replaced by a dynamics with simple motions (Isaacs, 1965). The set $W_c(t_k)$ is treated as the collection of all positions at the instant $t_k$ from which the first player guarantees guiding the system to the set $W_c(t_{k+1})$ under the “frozen” dynamics (7) and discrimination of the second player, that is, when the second player announces its constant control $v$, $|v| \le \nu$, in the interval $[t_k, t_{k+1}]$.
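The backward step is easiest to see in the one-dimensional individual game of a single pursuer against the evader, where every $t$-section is an interval $[-w, w]$. The Python sketch below is only this one-dimensional analogue, not the authors' two-dimensional algorithm. With the dynamics frozen on a step of length `dt`, a point $x$ belongs to the new section if for every evader shift there is a pursuer shift bringing it into the old section, which gives the half-width update $w \leftarrow w + \Delta t\,\mu\, l_P h(\tau/l_P) - \Delta t\,\nu\, l_E h(\tau/l_E)$, with $h(\alpha) = e^{-\alpha} + \alpha - 1$. All names are illustrative assumptions.

```python
import math


def h(alpha: float) -> float:
    return math.exp(-alpha) + alpha - 1.0


def bridge_half_widths(c, mu, nu, l_p, l_e, horizon, dt):
    """Half-widths of the sections of a 1-D bridge on the backward-time grid k*dt.

    Returns a list w[k]; None marks a collapsed (empty) section.
    """
    n = int(round(horizon / dt))
    w = c
    widths = [w]
    for k in range(n):
        tau = k * dt  # dynamics frozen at the later instant of the step
        if w is not None:
            gain = dt * mu * l_p * h(tau / l_p)  # what the pursuer can cover
            loss = dt * nu * l_e * h(tau / l_e)  # what the evader can add
            w = w + gain - loss
            if w < 0:
                w = None  # the section has become empty
        widths.append(w)
    return widths


# Strong pursuer (mu = 2, l_P = 1/2): the bridge expands in backward time.
strong = bridge_half_widths(0.0, 2.0, 1.0, 0.5, 1.0, horizon=6.0, dt=0.01)
# Weak pursuer (mu = 0.9, l_P = 1/0.7), c = 2: the bridge contracts and collapses.
weak = bridge_half_widths(2.0, 0.9, 1.0, 1 / 0.7, 1.0, horizon=7.0, dt=0.01)
print(strong[-1], weak[-1])
```

In two dimensions the sections are non-convex, so the same step needs set operations on polygons rather than a scalar update.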
Due to the symmetry of dynamics (7) and of the sets $W_c(T_1)$, $W_c(T_2)$ with respect to the origin, for any $t \le T_1$ the $t$-section $W_c(t)$ is also symmetric.
3.1. Maximal Stable Bridges for the Case of Strong Pursuers
The simultaneous dynamic advantage of $P_1$ and $P_2$ with respect to $E$ implies that for any $c \ge 0$, $W_c(t_2) \subset W_c(t_1)$ if $t_1 < t_2$. This means that the bridge $W_c$ expands in the backward time. The latter allows one to make independent constructions in all four quadrants. Due to the central symmetry, it is sufficient to make the constructions in the I and II quadrants only.
Let us give results of constructing t-sections Wc(t) for the following values of game parameters:
$$
\mu_1 = 2, \quad \mu_2 = 3, \quad \nu = 1, \quad l_{P_1} = 1/2, \quad l_{P_2} = 1/0.857, \quad l_E = 1.
$$
Equal terminal instants. Let $T_1 = T_2 = 6$. Fig. 2 shows results of constructing the set $W_0$ (that is, with $c = 0$). In the figure, one can see several time sections $W_0(t)$ of this set. The bridge has a quite simple structure. At the initial instant $\tau = 0$ of the backward time (when $t = 6$), its section coincides with the target set $M_0$, which is the union of the two coordinate axes. Further, at the instants $t = 4, 2, 0$, the cross thickens, and two triangles are added to it. The widths of the vertical and horizontal parts of the cross correspond to the sizes of the maximal stable bridges in the individual games with the first and second pursuers. The triangles, located in the II and IV quadrants (where the signs of $x_1$ and $x_2$ are different, in other words, where the evader is between the pursuers), give the zone where capture is possible only under collective actions of both pursuers.
Fig. 2. Two strong pursuers, equal terminal instants: time sections of the bridge $W_0$
Fig. 4. Two strong pursuers, different terminal instants: time sections of the bridge $W_0$
Time sections $W_c(t)$ of other bridges $W_c$, $c > 0$, have a shape similar to $W_0(t)$. In Fig. 3, one can see the sections $W_c(t)$ at $t = 2$ ($\tau = 4$) for a collection $\{W_c\}$ corresponding to some series of values of the parameter $c$. For other instants $t$, the structure of the sections $W_c(t)$ is similar.
Different terminal instants. Let $T_1 = 7$, $T_2 = 5$. Results of constructing the set $W_0$ are given in Fig. 4. When $t \le 5$, the time sections $W_0(t)$ grow both horizontally and vertically; two additional triangles appear, but now they are curvilinear.
Total structure of the sections Wc(t) at t = 2 is shown in Fig. 5.
3.2. Maximal Stable Bridges for the Case of Weak Pursuers
Now, we consider a variant of the game when both pursuers are weaker than the evader. Let us take the parameters
$$
\mu_1 = 0.9, \quad \mu_2 = 0.8, \quad \nu = 1, \quad l_{P_1} = l_{P_2} = 1/0.7, \quad l_E = 1.
$$
Let us show results for the case of different terminal instants only: T1 = 7, T2 = 5.
Since in this variant the evader is more maneuverable than the pursuers, they cannot guarantee the exact capture.
Fix some level of the miss, namely, $|x_1(T_1)| \le 2.0$, $|x_2(T_2)| \le 2.0$. Time sections $W_{2.0}(t)$ of the corresponding maximal stable bridge are shown in Fig. 6. The upper-left subfigure corresponds to the instant when the first pursuer stops pursuing. The upper-right subfigure shows the picture for the instant when the second pursuer finishes its pursuit. At this instant, the horizontal strip is added, which is a bit wider than the vertical one, which has contracted during the elapsed period of the backward time. Then, the bridge contracts in both the horizontal and vertical directions, and two additional curvilinear triangles appear (see the middle-left subfigure). The middle-right subfigure gives the view of the section when the vertical strip collapses, and the lower-left subfigure shows the configuration just after the collapse of the horizontal strip. At this instant, the section loses connectivity and splits into two parts symmetric with respect to the origin. Further, these parts continue to contract (as can be seen in the lower-right subfigure) and finally disappear.
Time sections $\{W_c(t)\}$ are given in Fig. 7 at the instant $t = 0$ ($\tau_1 = 7$, $\tau_2 = 5$).
4. Optimal Feedback Control
Using the knowledge of the value function provided by its level sets $W_c$, we can construct optimal strategies of the first and second players. We do this by dividing the plane $x_1$, $x_2$ into cells for every instant $t$. Inside each cell, the optimal control takes an extremal value.
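Rewrite system (7) as
$$
\dot x = D_1(t) u_1 + D_2(t) u_2 + E(t) v, \qquad |u_1| \le \mu_1, \quad |u_2| \le \mu_2, \quad |v| \le \nu.
$$
Here, $x = (x_1, x_2)$; the vectors $D_1(t)$, $D_2(t)$, and $E(t)$ are
$$
D_1(t) = \bigl(-l_{P_1} h((T_1 - t)/l_{P_1}),\ 0\bigr), \quad D_2(t) = \bigl(0,\ -l_{P_2} h((T_2 - t)/l_{P_2})\bigr),
$$
$$
E(t) = \bigl(l_E h((T_1 - t)/l_E),\ l_E h((T_2 - t)/l_E)\bigr).
$$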
Fig. 6. Two weak pursuers, different terminal instants: time sections of the maximal stable bridge $W_{2.0}$
Fig. 7. Two weak pursuers, different terminal instants: level sets of the value function, $t = 0$
We see that the vector $D_1(t)$ ($D_2(t)$) is directed along the horizontal (vertical) axis; when $T_1 = T_2$, the angle between the axis $x_1$ and the vector $E(t)$ equals 45°; when $T_1 \ne T_2$, the angle changes in time.
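These geometric properties can be checked numerically. The following Python sketch evaluates $D_1(t) = (-l_{P_1} h((T_1 - t)/l_{P_1}), 0)$, $D_2(t) = (0, -l_{P_2} h((T_2 - t)/l_{P_2}))$ and $E(t) = (l_E h((T_1 - t)/l_E), l_E h((T_2 - t)/l_E))$ with $h(\alpha) = e^{-\alpha} + \alpha - 1$, and prints the angle of $E(t)$ for equal and different terminal instants. The helper names are assumptions introduced for illustration; the sketch is meant for $t \le \min(T_1, T_2)$.

```python
import math


def h(alpha: float) -> float:
    return math.exp(-alpha) + alpha - 1.0


def game_vectors(t, T1, T2, l_p1, l_p2, l_e):
    """Return the vectors D1(t), D2(t), E(t) as tuples (hypothetical helper)."""
    tau1, tau2 = T1 - t, T2 - t
    D1 = (-l_p1 * h(tau1 / l_p1), 0.0)
    D2 = (0.0, -l_p2 * h(tau2 / l_p2))
    E = (l_e * h(tau1 / l_e), l_e * h(tau2 / l_e))
    return D1, D2, E


def angle_deg(vec):
    """Angle between the axis x1 and the vector, in degrees."""
    return math.degrees(math.atan2(vec[1], vec[0]))


# Equal terminal instants: E(t) stays on the bisectrix.
_, _, E_equal = game_vectors(0.0, 6.0, 6.0, 0.5, 1 / 0.857, 1.0)
print(angle_deg(E_equal))

# Different terminal instants: the angle of E(t) depends on t.
for t in (0.0, 2.0, 4.0):
    _, _, E_diff = game_vectors(t, 7.0, 5.0, 0.5, 1 / 0.857, 1.0)
    print(t, angle_deg(E_diff))
```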
4.1. Switching Lines in the Case of Strong Pursuers
Feedback control of the first player. Analyzing the change of the value function along a horizontal line in the plane $x_1$, $x_2$ for a fixed instant $t$, one can conclude that the minimum of the function is reached on the segment of intersection of this line with the set $W_0(t)$. Moreover, the function is monotonic on both sides of the segment. For points to the right (to the left) of the segment, the control $u_1 = \mu_1$ ($u_1 = -\mu_1$) directs the vector $D_1(t)u_1$ toward the minimum.
Splitting the plane into horizontal lines and extracting for each line the segment of minimum of the value function, one can gather these segments into a set in the plane and draw a switching line through this set, which separates the plane into two parts at the instant $t$. To the right of this switching line, we choose the control $u_1 = \mu_1$, and to the left the control is $u_1 = -\mu_1$. On the switching line, the control $u_1$ can be arbitrary, obeying the constraint $|u_1| \le \mu_1$. The easiest way is to take the vertical axis $x_2$ as the switching line.
In the same way, using the vector D2(t), we can conclude that the horizontal axis x1 can be taken as the switching line for the control u2 .
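Thus,
$$
u_i^*(t, x) =
\begin{cases}
\mu_i, & \text{if } x_i > 0,\\
-\mu_i, & \text{if } x_i < 0,\\
\text{any } u_i \in [-\mu_i, \mu_i], & \text{if } x_i = 0,
\end{cases}
\qquad i = 1, 2.
\tag{8}
$$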
Fig. 8. Two strong pursuers, equal terminal instants: switching lines for the first player
The switching lines (the coordinate axes) at any $t$ divide the plane $x_1$, $x_2$ into 4 cells. In each of these cells, the optimal control of the first player is constant. The synthesis of the first player's optimal control is the same for all time instants and is shown in Fig. 8. Arrows denote the direction of the vectors $D_i(t)u_i^*$, $i = 1, 2$.
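For strong pursuers, the synthesis therefore reduces to a sign rule: the component $u_i$ equals its upper bound when $x_i > 0$ and its lower bound when $x_i < 0$. A minimal Python sketch follows; the function name is hypothetical, and the choice of zero on the switching line is an arbitrary admissible value, not something prescribed by the paper.

```python
def first_player_control(x, mu):
    """Feedback control of the first player in the strong-pursuer case.

    x  : pair (x1, x2) of forecasted coordinates
    mu : pair (mu1, mu2) of control bounds
    """
    def component(x_i, mu_i):
        if x_i > 0:
            return mu_i
        if x_i < 0:
            return -mu_i
        return 0.0  # on the switching line any value in [-mu_i, mu_i] is allowed

    return tuple(component(x_i, mu_i) for x_i, mu_i in zip(x, mu))


# Evader above the first pursuer and below the second one (quadrant IV).
print(first_player_control((40.0, -25.0), (2.0, 3.0)))
```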
Feedback control of the second player. For a fixed instant $t$, consider a split of the plane $x_1$, $x_2$ into lines parallel to the vector $E(t)$. Take the segments of local minimum and local maximum of the value function on all these lines. One can easily see that for any line (except lines passing near the origin), there are two segments of local minimum and one of local maximum located between them. The segments of minimum arise as intersections of the line with the set $W_0(t)$. For the case $T_1 = T_2$, the segment of maximum coincides with the rectilinear part of the boundary of some set $W_c(t)$ and has a slope angle of 45°. If $T_1 \ne T_2$, then the segment of maximum degenerates to a point coinciding with the corner point of a curvilinear triangle. For any point of the line outside all the segments, the control $v$ is chosen so that the vector $E(t)v$ points in the direction of growth of the value function. So, there are two parts of the line where $v = \nu$ and two parts where $v = -\nu$.
For a fixed instant $t$, the switching lines for the second player consist of the coordinate axes and some line $\Pi(t)$, which passes through the middles of the segments of local maximum if $T_1 = T_2$, and through the corner points of the curvilinear triangles if $T_1 \ne T_2$. An unpleasant peculiarity is that if $T_1 \ne T_2$, then one should take $v = \pm\nu$ on the switching line $\Pi(t)$; choices $|v| < \nu$ are not optimal.
Inside each of the 6 cells into which the plane is separated by the switching lines of the second player, the control is taken as either $v = \nu$ or $v = -\nu$.
Fig. 9. Two strong pursuers, equal terminal instants: switching lines for the second player, t = 0
The second player's optimal synthesis for the case $T_1 = 7$, $T_2 = 5$ is shown in Fig. 9 for $t = 0$. Arrows denote the direction of the vectors $E(t)v^*$.
4.2. Switching Lines in the Case of Weak Pursuers
In the case of pursuers weaker than the evader, the structure of the sets Wc is more complex in some neighborhood of the origin. This leads to more complicated shape of the switching lines both for the first and second players.
Switching lines of the first player are given in Fig. 10 at the instant $t = 0$ ($\tau_1 = 7$, $\tau_2 = 5$). The dashed line is the switching line for the component $u_1$; the dotted one is for the component $u_2$. The switching lines are obtained by analyzing the function $x \mapsto V(t, x)$ along horizontal lines (for $u_1$, in accordance with the direction of the vector $D_1(t)$) and vertical lines (for $u_2$, in accordance with the direction of the vector $D_2(t)$). If on the considered horizontal (vertical) line the minimum of the value function is attained on a segment, then the middle of this segment is taken as a point of the switching line. Arrows show the directions of the vectors $D_1(t)u_1^*$ and $D_2(t)u_2^*$ in 4 cells.
In Fig. 11 switching lines and the directions of the vectors E(t)v* are shown for t = 0. In this picture, we have 4 cells with constant values of the second player control.
4.3. Generating Feedback Controls. Discrete Scheme of Control
Switching lines are built by processing the boundaries of the sets $W_c(t)$. For this, a grid of instants $t_k$ is used, at which the $t$-sections $W_{c_j}(t_k)$ of the maximal stable bridges $W_{c_j}$ are constructed by the backward procedure. The values $c_j$ are also taken on some grid. For any instant $t_k$, the approximating switching lines are stored in computer memory as polygonal lines.
Fig. 10. Two weak pursuers, equal terminal instants: switching lines for the first player, $t = 0$
Having a position $x(t_k)$ at the instant $t_k$, it is possible to compute the controls $u_1^*(t_k, x(t_k))$ and $u_2^*(t_k, x(t_k))$ by analyzing the location of the point $x(t_k)$ with respect to the switching lines for $u_1$ and $u_2$. The vectors $D_1(t_k)$ and $D_2(t_k)$ are used for this. In the case of strong pursuers, the axis $x_2$ is the switching line for the control $u_1$, and the axis $x_1$ is the switching line for the control $u_2$. The values of $u_1^*$ and $u_2^*$ are defined by formula (8). In the case of weak pursuers, the switching line is also unique for each component $u_i$ of the control. Drawing a ray from the point $x(t_k)$ with the direction vector $D_i(t_k)$, one can decide whether it crosses the switching line corresponding to the index $i$. If it does not, then $u_i^*(t_k, x(t_k)) = -\mu_i$; if it crosses, then $u_i^*(t_k, x(t_k)) = \mu_i$.
The first player control chosen at the instant tk is kept until the instant tk+1. At the position (tk+1, x(tk+1)), a new control value is chosen, etc. So, the feedback control generated by the switching lines is applied in a discrete control scheme (Krasovskii and Subbotin, 1974, Krasovskii and Subbotin, 1988).
To construct $v^*(t_k, x(t_k))$, we use the vector $E(t_k)$. Compute how many times (an even or odd number) a ray starting at the point $x(t_k)$ with the direction vector $E(t_k)$ crosses the second player's switching lines. If the number of crossings is even (absence of crossings means that the number equals zero and is even), then we take $v^*(t_k, x(t_k)) = +\nu$; otherwise, $v^*(t_k, x(t_k)) = -\nu$. The chosen control is kept until the next instant $t_{k+1}$. At the position $(t_{k+1}, x(t_{k+1}))$, a new control is built, etc.
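The ray tests for both players can be sketched in Python as follows. A switching line is assumed to be stored as a polygonal line, that is, a list of vertices. The first player's component takes its upper bound if the ray along $D_i(t_k)$ crosses its switching line and its lower bound otherwise; the second player's control is positive if the ray along $E(t_k)$ crosses the switching lines an even number of times and negative otherwise. All function names are hypothetical, and degenerate cases (a ray passing exactly through a vertex or along a segment) are handled only crudely.

```python
def count_ray_crossings(origin, direction, polyline, eps=1e-12):
    """Count crossings of the ray origin + s*direction (s >= 0) with a polyline."""
    px, py = origin
    dx, dy = direction
    count = 0
    for (ax, ay), (bx, by) in zip(polyline, polyline[1:]):
        ex, ey = bx - ax, by - ay
        denom = dx * ey - dy * ex
        if abs(denom) < eps:
            continue  # the ray is parallel to this segment
        wx, wy = ax - px, ay - py
        s = (wx * ey - wy * ex) / denom  # parameter along the ray
        r = (wx * dy - wy * dx) / denom  # parameter along the segment
        if s >= 0.0 and 0.0 <= r < 1.0:  # half-open: shared vertices count once
            count += 1
    return count


def first_player_component(x, D_i, switching_line, mu_i):
    """u_i = mu_i if the ray along D_i crosses the switching line, else -mu_i."""
    return mu_i if count_ray_crossings(x, D_i, switching_line) > 0 else -mu_i


def second_player_control(x, E, switching_lines, nu):
    """v = +nu for an even number of crossings of the ray along E, else -nu."""
    total = sum(count_ray_crossings(x, E, line) for line in switching_lines)
    return nu if total % 2 == 0 else -nu


# Illustration with the coordinate axes as switching lines (made-up extents).
x1_axis = [(-100.0, 0.0), (100.0, 0.0)]
x2_axis = [(0.0, -100.0), (0.0, 100.0)]
print(first_player_component((3.0, 2.0), (-1.0, 0.0), x2_axis, 2.0))
print(second_player_control((-3.0, -1.0), (1.0, 1.0), [x1_axis, x2_axis], 1.0))
```

With the axis $x_2$ as the switching line for $u_1$ and $D_1$ pointing in the negative $x_1$ direction, the ray test gives the upper bound for $x_1 > 0$ and the lower bound for $x_1 < 0$, in agreement with the sign rule of the strong-pursuer case.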
This synthesis for the first (second) player is suboptimal. Analysis of its closeness to an optimal one needs an additional study. Namely, it is necessary to show that under a coordinated choice of the diameters $\Delta t$ and $\Delta c$ of the grids in $t$ and $c$, the feedback control of the first (second) player built on the basis of the switching lines guarantees, in the limit as $\Delta t \to 0$ and $\Delta c \to 0$, a result that is not greater (not less) than $V(t_0, x_0)$ for any initial position $(t_0, x_0)$. Such a study for linear differential games with convex $t$-sections $W_c(t)$ of the maximal stable bridges is made in the works (Botkin and Patsko, 1982, Zarkh, 1990, Patsko, 2006). In the problem under consideration, the sections $W_c(t)$ are not convex, and this fact is the source of the difficulty of the problem.
5. Simulation Results
Let the pursuers $P_1$, $P_2$ and the evader $E$ move in the plane. At the initial instant $t_0 = 0$, the velocities of all objects are parallel (Fig. 12) and significantly greater than the possible changes of the lateral velocity components. The instant of longitudinal coincidence of the objects $P_1$ and $E$ is $T_1$; the instant of coincidence of the objects $P_2$ and $E$ is $T_2$. The dynamics of the lateral motion is described by relations (1), (2); the resulting miss is given by formula (4).
Fig. 12. Schematic initial positions of the pursuers and evader (objects $P_1$, $E$, $P_2$)
In all the following results, the initial lateral velocities and accelerations are assumed to be zero:
$$
\dot z^0_{P_1} = \dot z^0_{P_2} = \dot z^0_E = 0, \qquad a^0_{P_1} = a^0_{P_2} = a^0_E = 0.
$$
Fig. 13. Two strong pursuers, equal termination instants: trajectories in the original space
In Fig. 13, one can see the trajectories of the objects in the original space for the case of strong pursuers and equal terminal instants for the following game parameters:
$$
\mu_1 = 2, \quad \mu_2 = 3, \quad \nu = 1, \quad l_{P_1} = 1/2, \quad l_{P_2} = 1/0.857, \quad l_E = 1, \quad T_1 = T_2 = 6.
$$
The pursuers $P_1$, $P_2$ and the evader $E$ act optimally. The trajectories drawn by solid lines correspond to the following initial data:
$$
z^0_{P_1} = -40, \quad z^0_{P_2} = 25, \quad z^0_E = 0.
$$
The dashed lines denote the trajectories for the following initial lateral positions:
$$
z^0_{P_1} = -25, \quad z^0_{P_2} = 50, \quad z^0_E = 0.
$$
In the first case, the evader is successfully captured (at the terminal instant, the positions of both pursuers are the same as the position of the evader). In the second variant of initial positions, the evader escapes: at the terminal instant, neither of the pursuers coincides with the evader. In this case, one can see how the evader aims at the middle between the terminal positions of the pursuers (this guarantees it the maximum of the payoff function $\varphi$).
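A much simplified closed-loop experiment can be sketched in Python directly in the forecasted coordinates $x_1$, $x_2$ for strong pursuers and equal terminal instants. The sketch integrates $\dot x_i = -l_{P_i} h(\tau/l_{P_i}) u_i + l_E h(\tau/l_E) v$, $h(\alpha) = e^{-\alpha} + \alpha - 1$, by the Euler method, with the pursuers applying the sign rule $u_i = \mu_i\,\mathrm{sign}(x_i)$ in a discrete scheme. As an assumption made for brevity, the evader holds a constant extremal control instead of its optimal feedback, so this is not a reproduction of Fig. 13. With zero initial velocities and accelerations, the lateral positions $-40$, $25$, $0$ of $P_1$, $P_2$, $E$ give $x(0) = (40, -25)$.

```python
import math


def h(alpha: float) -> float:
    return math.exp(-alpha) + alpha - 1.0


def simulate(x0, v_const, mu=(2.0, 3.0), l_p=(0.5, 1 / 0.857),
             l_e=1.0, T=6.0, dt=0.01):
    """Euler simulation in the forecasted coordinates with T1 = T2 = T.

    Pursuers use the sign rule; the evader keeps the constant control v_const
    (a simplifying assumption, not the optimal strategy of the second player).
    Returns (x1(T), x2(T), miss).
    """
    def sign_rule(x_i, mu_i):
        return mu_i if x_i > 0 else (-mu_i if x_i < 0 else 0.0)

    x1, x2 = x0
    n = int(round(T / dt))
    for k in range(n):
        tau = T - k * dt                 # backward time at the start of the step
        u1 = sign_rule(x1, mu[0])        # controls are held over the whole step
        u2 = sign_rule(x2, mu[1])
        e = l_e * h(tau / l_e) * v_const
        x1 += dt * (-l_p[0] * h(tau / l_p[0]) * u1 + e)
        x2 += dt * (-l_p[1] * h(tau / l_p[1]) * u2 + e)
    return x1, x2, min(abs(x1), abs(x2))


for v in (+1.0, -1.0):
    x1_T, x2_T, miss = simulate((40.0, -25.0), v)
    print(v, round(x1_T, 3), round(x2_T, 3), miss)
```

Under either constant control, one of the two coordinates is driven to a small neighborhood of zero, so the miss is close to zero for this initial position. Reproducing the escape from the second initial position would require the evader's feedback based on its switching lines.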
Figs. 14, 15, and 16 correspond to the case of weak pursuers and different terminal instants:
$$
\mu_1 = 0.9, \quad \mu_2 = 0.8, \quad \nu = 1, \quad l_{P_1} = l_{P_2} = 1/0.7, \quad l_E = 1, \quad T_1 = 7, \quad T_2 = 5.
$$
The initial positions are taken as follows:
$$
z^0_{P_1} = -12, \quad z^0_{P_2} = 12, \quad z^0_E = 0.
$$
The trajectories in Fig. 14 are built for the optimal controls of all objects. At the beginning of the pursuit, the evader approaches the first (lower) pursuer. This is done to increase the miss from the second (upper) pursuer at the instant $T_2$. Further approach is not reasonable, and the evader switches its control to increase the miss from the first pursuer at the instant $T_1$.
Fig. 15 gives the trajectories when the pursuers use their optimal feedback controls generated by the switching lines, but the evader applies the constant control $v = +\nu$, escaping from $P_1$ and ignoring $P_2$. Fig. 16 shows the opposite situation, when the evader keeps the control $v = -\nu$, escaping from $P_2$ and ignoring $P_1$. In both situations, the payoff is less than in the case when the second player uses its optimal control. When the constant control $v = +\nu$ is applied, the miss to the second pursuer at the instant $T_2$ is smaller; when the second player keeps $v = -\nu$, the miss to the first pursuer at the instant $T_1$ decreases.
Fig. 14. Two weak pursuers, different termination instants: trajectories of the objects in the original space, optimal control of the second player
Fig. 15. Two weak pursuers, different termination instants: trajectories of the objects in the original space, constant control of the second player $v = +\nu$
6. Conclusion
A problem of pursuit-evasion with two pursuing objects and one evading object is considered as a two-dimensional antagonistic differential game. Difficulty of numerical solution of this problem is conditioned by the fact that the t-sections of the value function level sets are not convex. For two qualitatively different types of parameters (“strong” pursuers, “weak” pursuers), an analysis of the value function level sets is worked out in the paper. On the basis of this analysis, optimal strategies of players are built.
References
Abramyantz, T. G. and E. P. Maslov (2004). A differential game of pursuit of a group target. Izv. Akad. Nauk Teor. Sist. Upr., 5, 16-22 (in Russian).
Blagodatskih, A.I. and N. N. Petrov (2009). Conflict Interaction Controlled Objects Groups. Izhevsk: Udmurt State University (in Russian).
Botkin, N. D. and V. S. Patsko (1982). Universal strategy in a differential game with fixed terminal time. Problems of Control and Inform. Theory, 11, 419-432.
Chikrii, A. A. (1997). Conflict-Controlled Processes. Mathematics and its Applications, Vol. 405. Dordrecht: Kluwer Academic Publishers Group.
Grigorenko, N. L. (1991). The problem of pursuit by several objects. In: Differential games — developments in modelling and computation (Espoo, 1990), Lecture Notes in Control and Inform. Sci., Vol. 156, pp. 71-80. Berlin: Springer.
Hagedorn, P. and J. V. Breakwell (1976). A differential game with two pursuers and one evader. Journal of Optimization Theory and Applications 18(2), 15-29.
Isaacs, R. (1965). Differential Games. New York: Wiley & Sons.
Krasovskii, N. N. and A. N. Krasovskii (1993). A differential game for the minimax of a positional functional. In: Adv. Nonlin. Dynamics. and Control: A report from Russia. pp. 41-73. Berlin: Birkhauser.
Krasovskii, N. N. and A.I. Subbotin (1974). Positional Differential Games. Moscow: Nauka (in Russian).
Krasovskii, N. N. and A.I. Subbotin (1988). Game-Theoretical Control Problems. New York: Springer-Verlag.
Levchenkov, A. Y. and A. G. Pashkov (1990). Differential game of optimal approach of two inertial pursuers to a noninertial evader. Journal of Optimization Theory and Applications, 65, 501-518.
Le Menec, S. (2011). Linear differential game with two pursuers and one evader. In: Annals of the International Society of Dynamic Games, Vol. 11: Advances in Dynamic Games. Theory, Applications, and Numerical Methods for Differential and Stochastic Games. M. Breton, K. Szajowski (eds.), pp. 209-226. Boston: Birkhauser.
Patsko, V. (2006). Switching surfaces in linear differential games. Journal of Mathematical Sciences, 139(5), 6909-6953.
Pschenichnyi, B.N. (1976). Simple pursuit by several objects. Kibernetika, 3, 145-146 (in Russian).
Shima, T. and J. Shinar (2002). Time varying linear pursuit-evasion game models with bounded controls. Journal of Guidance, Control and Dynamics, 25(3), 425-432.
Shinar, J. and T. Shima (2002). Non-orthodox guidance law development approach for the interception of maneuvering anti-surface missiles. Journal of Guidance, Control, and Dynamics, 25(4), 658-666.
Stipanovic, D.M., A. A. Melikyan, and N. Hovakimyan (2009). Some sufficient conditions for multi-player pursuit-evasion games with continuous and discrete observations. In: Annals of the International Society of Dynamic Games (P. Bernhard, V. Gaitsgory, O. Pourtallier, eds.), Vol. 11: Advances in Dynamic Games and Applications, pp. 133-145. Berlin: Springer.
Zarkh, M. A. (1990). Universal strategy of the second player in a linear differential game. Prikl. Math. Mekh., 54(3), 395-400 (in Russian).