

ISSN 2079-3316 PROGRAM SYSTEMS: THEORY AND APPLICATIONS vol. 10, no. 2(41), pp. 3-31

UDC 004.023::519.853.4


Kirill V. Pushkaryov

Global optimization via neural network approximation of inverse coordinate mappings with evolutionary parameter control

Abstract. A hybrid method of global optimization NNAICM-PSO is presented. It uses neural network approximation of inverse mappings of objective function values to coordinates combined with particle swarm optimization to find the global minimum of a continuous objective function of multiple variables with bound constraints. The objective function is viewed as a black box.

The method employs groups of moving probe points attracted by goals like in particle swarm optimization. One of the possible goals is determined via mapping of decreased objective function values to coordinates by modified Dual Generalized Regression Neural Networks constructed from probe points.

The parameters of the search are controlled by an evolutionary algorithm. The algorithm forms a population of evolving rules each containing a tuple of parameter values. There are two measures of fitness: short-term (charm) and long-term (merit). Charm is used to select rules for reproduction and application. Merit determines survival of an individual. This two-fold system preserves potentially useful individuals from extinction due to short-term situation changes.

Test problems of 100 variables were solved. The results indicate that evolutionary control is better than random variation of parameters for NNAICM-PSO. With some problems, when rule bases are reused, error progressively decreases in subsequent runs, which means that the method adapts to the problem.

Key words and phrases: global optimization, heuristic methods, evolutionary algorithms, neural networks, parameter setting, parameter control, particle swarm optimization.

2010 Mathematics Subject Classification: 90C26; 90C59.

© K. V. Pushkaryov, 2019

© Institute of Space and Information Technologies, Siberian Federal University, 2019

© Program Systems: Theory and Applications (design), 2019

DOI: 10.25209/2079-3316-2019-10-2-3-31

Introduction

Global optimization of multivariate objective functions (OFs) viewed as black boxes with minimal restrictions on the function's properties is an important problem because it promises a universal way to solve many practical problems of engineering, machine learning, etc.

Nowadays, nature-inspired computing is actively developing. It solves computational problems through imitation of natural phenomena. In optimization this approach gave birth to evolutionary algorithms (evolutionary programming, evolutionary strategies, genetic algorithms, differential evolution) [1,2] and a lot of other optimization algorithms based on various physical, chemical and biological phenomena [3], such as simulated annealing and particle swarm optimization (PSO).

Although for absolutely unrestricted OFs the efficiency of all methods, averaged over all possible problems, is equivalent [4], methods that work with the more restricted problems encountered in practice are of great interest. These methods must make the best use of the information they obtain by evaluating an OF at probe points. Long-term preservation of information relevant for global minimum search is an important step towards this goal.

Proper parameter setting is crucial for optimization algorithm efficiency. Optimal choice of parameters is an optimization problem in itself, where some measure of quality dependent on parameters must be optimized. Parameters may be set manually or automatically before the start of an algorithm (the parameter tuning problem) or during its execution (the parameter control problem) [5].

Manual parameter setting requires a good understanding of how parameters affect an algorithm in various situations, which in turn entails expert knowledge and experience. The data on effects and relations of parameters gathered for one problem may be inapplicable to another. Experimental exploration of parameters is difficult because the problem at hand must be solved multiple times with different parameters, so resource requirements increase manyfold.

The optimal values of parameters may be different not only for different problems but also for different stages of a search [6, p. 1]. For example, gradual reduction of the inertia weight or increasing of the neighbourhood size was suggested for PSO [7, pp. 36, 40]. A parameter control mechanism can actively interact with the search process and learn. As a result, the problem of parameter control is particularly important.

The problem of automatic parameter setting attracted a lot of attention in evolutionary optimization. A review of various approaches is presented

in [5, 6]. The range of methods is diverse: from random variation and deterministic heuristics to controllers based on fuzzy logic [8,9], predefined heuristic rules [10], reinforcement learning [11-13] or an auxiliary meta-evolutionary algorithm [12]. In [5] parameter setting methods are classified into deterministic (no feedback from the algorithm being configured, parameters are set according to a predetermined schedule), adaptive (feedback from the algorithm is taken into account), and self-adaptive (parameter values are encoded in the genome along with the solutions and evolve together).

Neural network approximation of inverse coordinate mappings (NNAICM) is a heuristic method for finding the global minimum of a continuous multivariate "black box" objective function with simple bounds on the variables. The method is described in detail in [14,15]. It was included in the hybrid heuristic parallel method [16], where fixed heuristics without learning were employed to control its parameters.

Here we present a hybrid global optimization method based on NNAICM and PSO with evolutionary parameter control. NNAICM helps to find the goal to which a point (a particle in PSO terms) moving by the rules of PSO is attracted. An evolutionary algorithm controls the parameters of NNAICM at the same time.

The parameter control task in this work is special in that the parameters pertain to individual agents (particles), not to the algorithm as a whole. Hence, at each moment there may be multiple optimal parameter sets for different agents.

A numerical global optimization problem is stated as follows. Consider an objective function Φ(x) defined on a bounded set Ω, where

(1) $\Omega = \{x \colon L_i \le x_i \le U_i,\ i = \overline{1,D}\} \subset \mathbb{R}^D$.

Find an approximate minimum Φ*_min of the function and a point x*_min where it is attained:

(2) $\Phi_{\min} = \min_{x \in \Omega} \Phi(x) = \Phi(x_{\min})$,

(3) $\Phi^*_{\min} = \Phi(x^*_{\min}) \le \Phi_{\min} + \varepsilon_\Phi$,

where ε_Φ > 0 is a tolerance that specifies the required accuracy of the solution.

1. Neural Network Approximation of Inverse Coordinate Mappings

NNAICM is based on iteratively lowering OF values and mapping them to coordinates by a Generalized Regression Neural Network (GRNN).

The GRNN is trained on samples (Φ(x), x), which consist of an OF value and the corresponding coordinates. Therefore, the GRNN is said to approximate an inverse coordinate mapping. One-step learning is an important advantage of GRNN: it may be constructed from the samples in one step.
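As an illustration of this one-step construction, the inverse mapping can be sketched as a Nadaraya-Watson style kernel regression over the stored samples (Φ(x), x). This is only a simplified stand-in for the GRNN described here; the Gaussian kernel and the way s_v enters are assumptions of the sketch, and the function name grnn_inverse_map is illustrative.

```python
import numpy as np

def grnn_inverse_map(f_query, f_samples, x_samples, s_v):
    """Map an OF value to coordinates with a GRNN-like kernel regression.

    'Training' is one step: the samples are simply stored.
    f_samples : (N,) array of OF values, x_samples : (N, D) array of points.
    """
    d = (f_query - f_samples) / s_v   # scaled distances in OF-value space
    w = np.exp(-0.5 * d ** 2)         # Gaussian kernel weights
    w /= w.sum()                      # normalize to a weighted average
    return w @ x_samples              # weighted sum of stored coordinates

# toy usage: invert y = x^2 near its minimum
xs = np.linspace(-3.0, 3.0, 13).reshape(-1, 1)
fs = (xs ** 2).ravel()
print(grnn_inverse_map(0.5, fs, xs, s_v=1.0))
```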

The basic NNAICM algorithm is as follows:

(1) Initialization. Set k := 1. Fill P^[k], a starting set of probe points from the search space, which may be chosen at random.

(2) Train the network GRNN^[k] to map OF values to coordinates on the samples {(Φ(x), x) : x ∈ P^[k]}.

(3) Take the point x_min^[k] with the minimum OF value from the set P^[k]. Lower the value by some decrement ε_f^[k] and map it to coordinates by the GRNN: x_new^[k] = GRNN^[k](Φ(x_min^[k]) − ε_f^[k], s_v^[k]), where s_v^[k] is an approximation smoothing parameter.

(4) The new point is included in the set P^[k+1] if its OF value is less than the current minimum:

(4) $P^{[k+1]} = \begin{cases} P^{[k]} \cup R^{[k]} \cup \{x_{new}^{[k]}\}, & \text{if } \Phi(x_{new}^{[k]}) < \Phi(x_{min}^{[k]}), \\ P^{[k]} \cup R^{[k]}, & \text{otherwise}, \end{cases}$

where R^[k] is a set of random probe points.

(5) k := k + 1.

(6) Repeat steps 2-5 until the stop conditions are satisfied.

These steps for the function y = x² are shown in Figure 1.

In Figure 1 we made a GRNN approximation of an inverse coordinate mapping based on probe points P1-P5. Then we used it to obtain a new point P6 and refined the approximation taking P6 into account.
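A compact sketch of steps 1-6, reusing the hypothetical grnn_inverse_map helper from the previous listing, may make the loop easier to follow. The values of ε_f, s_v, the number of random points and the clipping to the box are simplifying assumptions of the sketch, not the settings used by the author.

```python
import numpy as np

def nnaicm_minimize(phi, lo, hi, n_start=20, n_rand=5, eps_f=0.1, s_v=0.5,
                    n_iter=50, rng=np.random.default_rng(0)):
    """Minimal sketch of the basic NNAICM loop (steps 1-6).
    Uses grnn_inverse_map from the previous sketch."""
    D = len(lo)
    P = rng.uniform(lo, hi, size=(n_start, D))      # step 1: starting probe points
    for _ in range(n_iter):                          # step 6: iterate
        F = np.array([phi(x) for x in P])
        # steps 2-3: map the lowered best OF value back to coordinates
        x_new = np.clip(grnn_inverse_map(F.min() - eps_f, F, P, s_v), lo, hi)
        R = rng.uniform(lo, hi, size=(n_rand, D))    # random probe points
        if phi(x_new) < F.min():                     # step 4: keep x_new only if it improves
            P = np.vstack([P, R, x_new])
        else:
            P = np.vstack([P, R])
    F = np.array([phi(x) for x in P])
    return P[F.argmin()], F.min()

# toy usage on a 2-D sphere function
sol, val = nnaicm_minimize(lambda x: float(np.sum(x ** 2)),
                           lo=np.array([-5.0, -5.0]), hi=np.array([5.0, 5.0]))
print(sol, val)
```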

Practical application of the method described above meets with some difficulties. Its efficiency quickly diminishes when the number of variables goes up. It can get stuck because each iteration starts with a single (best) probe point and random generation is not an effective source of new probe points. To overcome this problem, NNAICM was included in the hybrid heuristic parallel method [16], in which multiple optimization agents search for the global minimum. The agents iteratively execute a set of optimization algorithms (search tools), which share a common set of probe points.

Figure 1. Principle of the basic NNAICM algorithm

When an OF has a number of minima close by value, the inverse mapping line can miss all of them by far. To overcome this issue, a special

modification of GRNN was proposed — the Dual GRNN (DGRNN) [17]. Besides the input f for OF values, it has an additional input x_f taking the coordinates of a point — a search focus. A training sample of DGRNN consists of an input exemplar f_i and an output exemplar x_i. The farther the memorized exemplar is from the focus, the lower its activation and thus its contribution to the output, so exemplars near the focus have an advantage. DGRNN has an additional smoothing parameter s_x, which is an analog of s_v for the second input. The first hidden layer of DGRNN measures similarity between input values and exemplars. Similarity coefficients are multiplied for each pair of an input and an output exemplar in the second layer. The resulting coefficients are used for weighted summation of output exemplars in the third and fourth layers.

In Figure 2 the DGRNN schematic for two pairs of exemplars is shown. The connection weights not equal to 1 are printed nearby.

Figure 2. DGRNN schematic for two pairs of exemplars

Finally, control of the ε_f, s_v, s_x parameters and the search focus poses a problem. In [16] DGRNN wasn't used, and fixed heuristics without learning were employed to control ε_f and s_v. The decrement of the OF value was computed as ε_f^[k] = 3(Φ_max^[k] − Φ_min^[k])/N_p, where Φ_max^[k] and Φ_min^[k] are the maximum and the minimum OF values over the probe point set at the kth iteration and N_p is the number of probe points. To simplify setting of s_v, the distance function of the first GRNN layer was changed to:

(5) $\mathrm{dist}^{[k]}(l_1, l_2) = \dfrac{|l_1 - l_2|}{\Phi_{\max}^{[k]} - \Phi_{\min}^{[k]}}.$

This makes the distances and s_v relative quantities.

2. Hybrid NNAICM-PSO Method with Evolutionary Parameter Control

To improve its efficiency, NNAICM was combined with PSO and evolutionary parameter control.

Base points (BPs) are counterparts of PSO particles in the discussed method. The number of BPs N_b is a parameter. Each BP at the kth iteration has the position b^[k], the velocity v^[k] and the private goal g^[k]. Initially,

(6) $b^{[1]} = g^{[1]} = U[L, U]$,

(7) $v^{[1]} = U[-k_{v1}(U - L),\ k_{v1}(U - L)]$,

where L = (L_1, …, L_D), U = (U_1, …, U_D) are the bounds on the search space; U[l, u] is a vector of numbers uniformly distributed on the set {x : l_i ≤ x_i ≤ u_i, i = 1,…,D}; k_v1 is a parameter.

BPs are arranged in disjoint groups of S_bg points each. Every group has the group goal g_g^[k], which is selected at each iteration as the position with the lowest OF value from the list: g_g^[k−1] and the private goals of the group. A group goal may be improved by local search and trajectory extrapolation as explained below.

To prevent search stagnation, inactive BPs are restarted every I_rest iterations by the random reinitialization described above. A BP is restarted at the kth iteration if both conditions are satisfied:

(8) $\|v^{[k]}\| < v_{\min}$,

(9) $\Phi_{\min,b}(k - I_{rest}) - \Phi_{\min,b}(k) < \varepsilon_b I_{rest}$,

where $\Phi_{\min,b}(k) = \min_{i=\overline{1,k}} \Phi(b^{[i]})$ is the minimum OF value over all positions of the BP from the first to the kth iteration; v_min, ε_b, I_rest are parameters of the algorithm.

A variant of DGRNN, QDGRNN (Quantile DGRNN), was created for NNAICM-PSO. To limit the parameter ranges, the differences between input and exemplar OF values f − f_i, which are computed inside QDGRNN, are divided by the p_f1 sample quantile of the absolute differences when negative and by the p_f2 quantile when nonnegative. The new parameters p_f1, p_f2 of the neural network replace s_v. The OF decrement ε_f is another input parameter. It is relative too because it is subtracted after the division. Pointwise distances ‖x_f − x_i‖ are divided by their own p_x sample quantile, which replaces s_x. Thus the mapping performed by QDGRNN is as follows:

(10) $x^* = \dfrac{\sum_{i=1}^{N_e} C_i^{(vx)} x_i}{\sum_{j=1}^{N_e} C_j^{(vx)}}$,

(11) $C_i^{(vx)} = C_i^{(v)} \cdot C_i^{(x)}$,

(12) $C_i^{(v)} = \exp\left(\ln 0.5 \left(\dfrac{f - f_i}{Q\big((|f - f_j| : j = \overline{1, N_e}),\ p_f(f, f_i)\big)} - \varepsilon_f\right)^2\right)$,

(13) $p_f(f, f_i) = \begin{cases} p_{f1}, & \text{if } f < f_i, \\ p_{f2}, & \text{if } f \ge f_i, \end{cases}$

(14) $C_i^{(x)} = \exp\left(\ln 0.5 \left(\dfrac{\|x_f - x_i\|}{Q\big((\|x_f - x_j\| : j = \overline{1, N_e}),\ p_x\big)}\right)^2\right)$,

where x* is the output vector; f is the input OF value; x_f is the input focus value; C_i^(v), C_i^(x) are the similarity coefficients between the inputs and the corresponding exemplars f_i, x_i; C_i^(vx) is the merged similarity coefficient of the ith pair of exemplars; N_e is the number of exemplar pairs; Q(A, p) is the p quantile estimation for the sample A.
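A sketch of the QDGRNN mapping, following the reconstruction of (10)-(14) given above; the exact placement of the quantile normalization and the handling of degenerate scales are assumptions, so this should be read as an approximation of the described computation rather than the reference implementation.

```python
import numpy as np

def qdgrnn(f, x_f, eps_f, p_f1, p_f2, p_x, f_ex, x_ex):
    """Quantile DGRNN mapping of an OF value to coordinates.

    f    : input OF value,  x_f : focus point, shape (D,)
    f_ex : (Ne,) exemplar OF values,  x_ex : (Ne, D) exemplar coordinates
    """
    df = f - f_ex                                              # signed OF-value differences
    q_neg = np.quantile(np.abs(df), p_f1)                      # scale for negative differences
    q_pos = np.quantile(np.abs(df), p_f2)                      # scale for nonnegative differences
    scale = np.where(df < 0.0, q_neg, q_pos)
    c_v = np.exp(np.log(0.5) * (df / scale - eps_f) ** 2)      # value similarity, eq. (12)

    dx = np.linalg.norm(x_f - x_ex, axis=1)                    # distances to the focus
    c_x = np.exp(np.log(0.5) * (dx / np.quantile(dx, p_x)) ** 2)  # focus similarity, eq. (14)

    c = c_v * c_x                                              # merged similarity, eq. (11)
    return (c[:, None] * x_ex).sum(axis=0) / c.sum()           # weighted output, eq. (10)
```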

Parameter control is based on evolutionary principles. The parameters are set by rules, to which crossover, mutation and selection operators are applied. The population consists of N_r rules. A rule contains application conditions (not used in this work), measures of fitness and a tuple of parameters (genotype): ε_f, p_f1, p_f2, p_x, a_b, P_r.

Fitness of each rule is measured by charm and merit. Charm is a short-term measure and regulates selection of rules for reproduction and application. Merit is a long-term measure. It's based on an exponential moving average of charm and determines survival of an individual.

This two-fold system preserves potentially useful individuals from extinction due to short-term situation changes. While currently efficient rules are actively reproducing and being applied, rules that became inefficient recently are "sleeping" waiting for an opportunity to become efficient again.

Iterations of the algorithm are divided into big, where application, evaluation and evolution of all rules take place, and small, where only some rules selected by charm are applied. The first iteration and every I_big-th iteration are big.

The population is initialized randomly or loaded from a saved rule base.

At each big iteration all rules are applied to the predefined number N_top of BPs. At the ith big iteration the BPs with numbers from 1 + (i − 1)·N_top to i·N_top are selected. The count wraps around to the start when it goes past the end. For every unselected BP a randomly chosen rule is applied such that the selection probability is proportional to k_sel^r, where k_sel ∈ (0, 1) is a parameter and r is the zero-based rank of the rule in descending order of charm. Hence, all rules are periodically evaluated for various BPs, but rules with greater charm are applied more frequently.

Application of a rule at the BP b consists of the following steps:

(1) First, an attached set P_b of probe points is formed:

(16) $P_b = \{b + i \cdot p \colon i = \overline{-N_s, N_s},\ p \in P_r\}$,

where P_r is the pattern of the rule; p is a pattern vector; N_s is an algorithm parameter.

(2) QDGRNN is trained to map OF values to coordinates on the attached set, then it maps the BP's OF value:

(17) $b^* = \mathrm{QDGRNN}(\Phi(b),\ b,\ \varepsilon_f,\ p_{f1},\ p_{f2},\ p_x)$.

The BP's position is the focus of QDGRNN, and ε_f, p_f1, p_f2, p_x are set by the rule. Let us call the point b̂ = b + a_b(b* − b) the rule candidate, where the parameter a_b is set by the rule. The difference ρ = Φ(b) − Φ(b̂) is the candidate progress. Points from the attached set are possible solutions too. After application of the rules the private goal g^[k] for each BP is determined. At each iteration it is chosen as the point with the minimum OF value from the list: g^[k−1], the current rule candidates for the BP, and b^[k]. Finally, the BP moves towards the goal according to the PSO-like rule:

(18) $v^{[k]} = w_i v^{[k-1]} + w_l R_1 \odot (g^{[k]} - b^{[k]}) + w_g R_2 \odot (g_g^{[k]} - b^{[k]})$,

(19) $b^{[k+1]} = b^{[k]} + v^{[k]}$,

where k is the iteration number; w_i, w_l, w_g are parameters of the algorithm; g^[k] is the private goal of the BP; g_g^[k] is the group goal; R_1, R_2 are vectors of random numbers uniformly distributed between 0 and 1; ⊙ denotes the componentwise vector product.
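The movement rule (18)-(19) in code (a minimal sketch; goal selection and bound handling are omitted):

```python
import numpy as np

def move_bp(b, v, g_private, g_group, w_i, w_l, w_g, rng=np.random.default_rng()):
    """One PSO-like step of a base point, eqs. (18)-(19)."""
    r1 = rng.uniform(size=b.shape)   # R1, componentwise random factors
    r2 = rng.uniform(size=b.shape)   # R2
    v_new = w_i * v + w_l * r1 * (g_private - b) + w_g * r2 * (g_group - b)
    return b + v_new, v_new
```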

Rules are evaluated by comparison of their candidate progress for the selected BPs at big iterations. A single first place (maximum progress) means higher charm than any number of second places, etc. Moreover, a rule that produced only negative progresses has lower charm than a rule that produced at least one positive progress. To achieve this, the vector of scores (θ_1, …, θ_{2N_r}) is associated with each rule at each big iteration:

(20) $\theta_j = \sum_{i=1}^{N_{top}} \delta_{j-1,\, \tilde r_i}, \qquad \tilde r_i = \begin{cases} r_i, & \text{if } p_i \ge 0, \\ r_i + N_r, & \text{if } p_i < 0, \end{cases}$

where N_top is the number of selected BPs (an algorithm parameter); δ_ij is the Kronecker delta; p_i is the progress of the rule candidate for the ith selected BP; r_i is the zero-based rank of p_i in descending order for the ith selected BP.

Then rules are sorted in descending lexicographical order of their score vectors. Let r be the zero-based rank of a rule in this list; then the rule's charm

(21) $\mathrm{charm}^{[k+1]} = k_{cd}^{\,r}$,

where k_cd ∈ (0, 1) is an algorithm parameter. Merit is calculated as follows:

(22) $\mathrm{merit}^{[k+1]} = \mathrm{merit}^{[k]} \cdot k_{md}^{\,a} + \mathrm{charm}^{[k]} + \mathrm{ppe}^{[k]}$,

where k_md ∈ (0, 1) is a parameter; a is the age of the rule in iterations (it starts from zero and grows by one at the end of each iteration); ppe^[k] is the ratio of the average candidate progress of the rule to the maximum average progress over all rules.
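A sketch of the fitness bookkeeping of (20)-(22): per-BP progress ranks are aggregated into score vectors, charm follows from the lexicographic rank, and merit accumulates decayed charm. The aggregation details and the k_md^a decay follow the reconstruction above and are therefore assumptions.

```python
import numpy as np

def charm_and_merit(progress, merit, age, k_cd, k_md):
    """progress: (n_rules, n_top) candidate progresses of each rule on the
    selected BPs; merit, age: (n_rules,) current merit values and rule ages."""
    n_rules, n_top = progress.shape
    scores = np.zeros((n_rules, 2 * n_rules), dtype=int)
    for i in range(n_top):                        # rank rules per selected BP
        order = np.argsort(-progress[:, i])       # descending progress
        rank = np.empty(n_rules, dtype=int)
        rank[order] = np.arange(n_rules)
        rank = np.where(progress[:, i] < 0, rank + n_rules, rank)  # negative progress counts less
        for r_idx, pos in enumerate(rank):
            scores[r_idx, pos] += 1               # eq. (20): score vector entries
    # lexicographic sort of score vectors, best first
    order = sorted(range(n_rules), key=lambda r: tuple(scores[r]), reverse=True)
    charm = np.empty(n_rules)
    for lex_rank, r_idx in enumerate(order):
        charm[r_idx] = k_cd ** lex_rank           # eq. (21)
    ppe = progress.mean(axis=1)
    ppe = ppe / ppe.max() if ppe.max() > 0 else np.zeros(n_rules)
    merit_new = merit * k_md ** age + charm + ppe # eq. (22), as reconstructed
    return charm, merit_new
```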

New rules are created by random generation, mutation and crossover. The probability of a rule being selected as a parent is proportional to its charm.

The procedure of random rule generation is as follows. Each scalar rule parameter λ is set to a random number that is uniformly distributed on $[l_{rand}^{(\lambda)},\ u_{rand}^{(\lambda)}]$. The number of pattern vectors is chosen at random from 1 to D. To produce a pattern vector, uniformly distributed random numbers p_01, …, p_0D ∈ [−1, 1] are generated. Then the components of the pattern vector are

$p_i = k_{mxp} \min_{j=\overline{1,D}} \dfrac{U_j - L_j}{|p_{0j}|}\, p_{0i}$,

where k_mxp is a coefficient equal to k_mxp1 for the proportion k_mxpf of the total number of random rules and to k_mxp2 for the others. Therefore, the pattern vector has at least one component equal in magnitude to k_mxp(U_j − L_j), and the others are less than or equal in magnitude. By using two values of k_mxp we allow both small and big pattern vectors to be generated.
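Under the reconstruction p_i = k_mxp · min_j((U_j − L_j)/|p_0j|) · p_0i given above, one pattern vector can be generated as follows (a sketch; the guard against division by zero is an addition of the sketch):

```python
import numpy as np

def random_pattern_vector(L, U, k_mxp, rng=np.random.default_rng()):
    """Generate one pattern vector: scale a random direction so that at least
    one component has magnitude k_mxp * (U_j - L_j) and none exceeds that bound."""
    p0 = rng.uniform(-1.0, 1.0, size=len(L))
    p0[np.abs(p0) < 1e-12] = 1e-12             # sketch-level guard against division by zero
    scale = k_mxp * np.min((U - L) / np.abs(p0))
    return scale * p0
```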

Mutation produces a new rule by random perturbation of a parent rule's parameters. Akin to evolutionary programming, in the discussed algorithm mutation creates a new individual, so it is asexual reproduction. Mutation sets a scalar parameter of an offspring to a random value having a truncated normal distribution

(23) $\lambda_o \sim N_{[l_{mut}^{(\lambda)},\ +\infty)}\big(\lambda_p,\ \sigma_{mut}^{(\lambda)}\big)$,

where λ_o, λ_p are the values of the parameter of the offspring and of the parent respectively; $N_{[l,u]}(\mu, \sigma)$ is a truncated normal distribution with the mean μ and the standard deviation σ on [l, u]; $l_{mut}^{(\lambda)}$, $\sigma_{mut}^{(\lambda)}$ are algorithm parameters defined independently for each rule parameter λ. Pattern vectors are distributed normally:

(24) $p_{o,ij} \sim N\big(p_{p,ij},\ \sigma_{mut}^{(p)}(U_j - L_j)\big)$,

where p_{o,ij}, p_{p,ij} are the jth components of the ith pattern vectors of the offspring and of the parent respectively; N(μ, σ) is a normal distribution with the mean μ and the standard deviation σ; $\sigma_{mut}^{(p)}$ is an algorithm parameter.

Rules produced by mutation are tested for viability. An offspring is eliminated if any of its parameters p_f1, p_f2, p_x is less than or equal to zero.
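Mutation of a single scalar parameter per (23) can be sketched with scipy.stats.truncnorm; note that its truncation bounds are expressed in standard-deviation units. Treating the truncation interval as [l_mut, +∞) follows the reconstruction above and is an assumption.

```python
import numpy as np
from scipy.stats import truncnorm

def mutate_scalar(value, l_mut, sigma_mut):
    """Draw the offspring value from a normal distribution centred at the
    parent value and truncated below at l_mut (eq. (23), as reconstructed)."""
    a = (l_mut - value) / sigma_mut   # lower bound in standard-deviation units
    b = np.inf                        # no upper bound
    return truncnorm.rvs(a, b, loc=value, scale=sigma_mut)
```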

Crossover of scalar parameters is defined by the formula λ_o = u λ_p1 + (1 − u) λ_p2, where λ_o, λ_p1, λ_p2 are the parameter values of the offspring, first and second parents respectively, and u ∈ [0, 1] is a random uniformly distributed coefficient determined independently for each parameter. Crossover of two pattern vectors is performed as componentwise scalar crossover described above. The vectors are selected at random, one from the first parent and one from the second. The offspring pattern size is a random variable following the binomial distribution B(n, 0.5), where n is the sum of the pattern sizes of the parents.

A new population of N_c rules is formed at each big iteration. First, N_nr = ⌊(1 − k_elt) N_c⌋ new rules are created, among them ⌊k_rand N_nr⌋ random rules, ⌊k_cros N_nr⌋ crossover rules and ⌊k_mut N_nr⌋ mutation rules. Pattern vectors with norms less than ε_pat are removed from the new rules. Next, old rules are sorted in descending order of merit. The first ⌊k_elt N_c⌋ rules from this list (the elite) are put into the new population. All new rules are checked for correctness. A rule is correct if p_f1, p_f2, p_x are positive and the pattern is not empty. Incorrect rules are eliminated.

Local search and group goal trajectory extrapolation supplement the main algorithm.

Local search is performed at each group goal every I_loc iterations. In this work we used a modified BFGS algorithm [18, p. 136]. The algorithm stops after 100 iterations or when the norm of the gradient falls below 10^{-12}. The last inverse Hessian approximation H_k computed by BFGS is saved and restored when local search is run the next time. This memorized value becomes obsolete if the group goal shifts a lot between runs of local search. A modification of the algorithm made in this work allows for soft resetting of H_k in the following way: if the solution is not improved at the current iteration, then the next inverse Hessian approximation H_{k+1} is k_h H_k + (1 − k_h)E, where k_h ∈ [0, 1) is an algorithm parameter (equal to 0.75 in the experiments here) and E is the identity matrix. The golden section method of line search is used; it stops after 10 iterations or when the Wolfe conditions with the coefficients c_1 = 10^{-4}, c_2 = 0.9 are satisfied [18, p. 33]. The line search restarts with the interval shortened 10 times if the OF value at the current point is greater than at the starting point.
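The soft reset of the memorized inverse Hessian approximation is a one-line blend with the identity matrix (a minimal sketch):

```python
import numpy as np

def soft_reset(H, k_h):
    """Blend the stored inverse Hessian approximation with the identity:
    H_{k+1} = k_h * H_k + (1 - k_h) * E (k_h = 0.75 in the experiments)."""
    return k_h * H + (1.0 - k_h) * np.eye(H.shape[0])
```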

The trajectory of every group goal is extrapolated at each iteration. Let g_g^[k] be the group goal at the kth iteration and G^[k] = {g_g^[k_1], …, g_g^[k_n]} be the set of its last n ≤ N_ext best positions such that k > k_1 > ⋯ > k_n and Φ(g_g^[k_1]) < ⋯ < Φ(g_g^[k_n]), where N_ext is a parameter. Let P_ext = {g_g^[k] + (g_g^[k] − g_g^[k_i]) : i = 1, …, n}. Now the new group goal equals $\arg\min_{x \in \{g_g^{[k]}\} \cup P_{ext}} \Phi(x)$.
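The extrapolation step amounts to reflecting the stored best positions of the group goal through the current goal and keeping the best candidate (a sketch with illustrative names; the bookkeeping of the history G^[k] is omitted):

```python
import numpy as np

def extrapolate_group_goal(phi, g_current, g_history):
    """g_history: list of up to N_ext previous best positions of the group goal."""
    candidates = [g_current] + [g_current + (g_current - g_i) for g_i in g_history]
    values = [phi(g) for g in candidates]
    return candidates[int(np.argmin(values))]
```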

The search continues until the stopping conditions are satisfied. They are checked every I_stop iterations. The algorithm stops when any of the listed conditions is satisfied: 1) the average decrease in the minimum OF value over the last I_stop iterations is less than ε_stop; 2) the average shift of the best solution point over the last I_stop iterations is less than δ_stop; 3) the number of iterations is greater than I_max. When the search stops, the resulting population is saved as a rule base that can be loaded again next time.

3. Experiments

To evaluate efficiency of the method, test minimization problems of 100 variables were solved (D = 100). They were based on well-known test objective functions [19]. In all of the problems, the global minimum is at (0,0,...) with the OF value of 0.

The search spaces are from -100 to 100 for each coordinate by default. For some problems, the spaces are ones frequently used in the literature. As for Ackley's function, its local oscillations completely hide the global tendency far from the global minimum, so efficiency of the discussed method in this case essentially depends on the space size. Hence, two spaces were used: from -100 to 100 and from -50 to 50.

The test problems are as follows:

(1) Rastrigin's function on [−100; 100]^D:

$f_1(x) = \sum_{i=1}^{D} \left[x_i^2 - 10\cos(2\pi x_i) + 10\right]$.

(2) Shifted Rosenbrock's function on [−100; 100]^D:

$f_2(x) = \sum_{i=1}^{D-1} \left[100\left((x_{i+1} + 1) - (x_i + 1)^2\right)^2 + x_i^2\right]$.

(3) Ackley's function on [−100; 100]^D:

$f_3(x) = -20\exp\left(-0.2\sqrt{\dfrac{1}{D}\sum_{i=1}^{D} x_i^2}\right) - \exp\left(\dfrac{1}{D}\sum_{i=1}^{D}\cos(2\pi x_i)\right) + 20 + e$.

(4) Ackley's function on [−50; 50]^D.

(5) Griewangk's function on [−600; 600]^D:

$f_4(x) = \sum_{i=1}^{D} \dfrac{x_i^2}{4000} - \prod_{i=1}^{D} \cos\left(\dfrac{x_i}{\sqrt{i}}\right) + 1$.

(6) Weierstrass' function on [−0.5; 0.5]^D:

$f_5(x) = \sum_{i=1}^{D} \sum_{k=0}^{k_{max}} a^k \cos\left(2\pi b^k (x_i + 0.5)\right) - D \sum_{k=0}^{k_{max}} a^k \cos(\pi b^k)$,

where a = 0.5, b = 3, k_max = 20.

(7) Katsuura's function on [−100; 100]^D:

$f_6(x) = \dfrac{10}{D^2} \prod_{i=1}^{D} \left(1 + i \sum_{j=1}^{32} \dfrac{|2^j x_i - \mathrm{round}(2^j x_i)|}{2^j}\right)^{10/D^{1.2}} - \dfrac{10}{D^2}$,

where round(x) is x rounded to the closest integral value.

(8) Shifted Levy's function on [−100; 100]^D:

$f_7(x) = \sin^2(\pi w_1) + \sum_{i=1}^{D-1} (w_i - 1)^2 \left[1 + 10\sin^2(\pi w_i + 1)\right] + (w_D - 1)^2 \left[1 + \sin^2(2\pi w_D)\right]$,

where w_i = 1 + 0.25 x_i.

(9) Sphere function on [−100; 100]^D:

$f_8(x) = \sum_{i=1}^{D} x_i^2$.

(10) Disk function on [−100; 100]^D:

$f_9(x) = 10^6 x_1^2 + \sum_{i=2}^{D} x_i^2$.

(11) Bent cigar function on [−100; 100]^D:

$f_{10}(x) = x_1^2 + 10^6 \sum_{i=2}^{D} x_i^2$.

(12) Sum of different power function on [−100; 100]^D:

$f_{11}(x) = \sum_{i=1}^{D} |x_i|^{i+1}$.

(13) High conditioned elliptic function on [−100; 100]^D:

$f_{12}(x) = \sum_{i=1}^{D} 10^{6(i-1)/(D-1)} x_i^2$.

Classic test functions frequently have "convenient" properties: the global minimum is at zero or at another point with equal coordinates, or it can be found by searching along each coordinate independently:

(25) $x_{min} = \left(\arg\min_{L_1 \le x \le U_1} \Phi(x, x_2, \ldots),\ \arg\min_{L_2 \le x \le U_2} \Phi(x_1, x, \ldots),\ \ldots\right)$.

Therefore, random shifts and random orthogonal transformations are applied to all basic OFs, that is, $f_i^*(x) = f_i(M(x - x_0))$, where $x_0 = U[L + 0.4(U - L),\ L + 0.6(U - L)]$. The matrix M is produced by Gram-Schmidt orthonormalization of a random matrix with elements uniformly distributed between 0 and 1. A similar approach was used in [19].
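The shift-and-rotate wrapper can be sketched as follows; numpy's QR factorization is used here in place of explicit Gram-Schmidt orthonormalization (it likewise yields orthonormal columns), which is a substitution made by the sketch, not by the paper:

```python
import numpy as np

def make_transformed(f, L, U, rng=np.random.default_rng()):
    """Wrap a basic test function f into f*(x) = f(M (x - x0)) with a random
    orthogonal M and a random shift x0 in the middle of the search box."""
    D = len(L)
    x0 = rng.uniform(L + 0.4 * (U - L), L + 0.6 * (U - L))
    M, _ = np.linalg.qr(rng.uniform(0.0, 1.0, size=(D, D)))   # orthonormal columns
    return lambda x: f(M @ (np.asarray(x) - x0))
```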

Experiments of the following types were conducted:

(1) Control experiments with random variation of parameters. The algorithm was run for each problem 100 times with random initialization. The rule base, M, x_0 were initialized randomly before each run.

The NNAICM-PSO algorithm described above was used with evolution disabled. Charm and merit were set to random uniformly distributed numbers between 0 and 1 at each rule evaluation. The population was restricted to random rules without mutation or crossover.

(2) Experiments to evaluate the general efficiency of NNAICM-PSO. The algorithm was run for each problem 100 times with random initialization. The rule base, M, x_0 were initialized randomly before each run.

(3) Experiments to explore the long-term adaptation capability of NNAICM-PSO. For each problem, 50 series of 10 runs of the algorithm each were performed. The rule base was initialized randomly before the first run in a series, saved after each run and reloaded for the next run (the reused rule base mode). Initialization of M, x_0 was random and took place before the first run and then every R_t runs in a series. Experiments were conducted for R_t = 0 (M, x_0 do not change within a series) and R_t = 1.

For each run, the minimum OF value (Φ*_min), the number of OF evaluations (N_eval), the number of iterations (N_iter) and the number of OF evaluations per iteration (N_epi = N_eval/N_iter) were registered.

We use random variation as a control because it is known [20] that random variation of optimization algorithm parameters can by itself sometimes improve results over static parameters. Random variation was implemented as a modification of NNAICM-PSO, so that the configurations differ only in how the parameters are set.

The parameters in the experiments were set as follows:

• the parameters of population size and composition: N_r = 100, k_elt = 0.25, k_cros = 0.5, k_mut = 0.25, k_rand = 0.25;

• the parameters of mutation: l_mut^(εf) = l_mut^(pf1) = l_mut^(pf2) = l_mut^(px) = 10^{-2}, σ_mut^(εf) = σ_mut^(pf1) = σ_mut^(pf2) = σ_mut^(px) = 0.05, l_mut^(ab) = 10^{-2}, σ_mut^(ab) = 0.05, σ_mut^(p) = 5 × 10^{-3};

• the parameters of rule evaluation: k_cd = 0.95, k_md = 0.9999;

• the parameters of rule application: I_big = 10, k_sel = 0.95, N_top = 10, N_s = 1;

• the parameters of random rule generation: k_mxp1 = 0.2, k_mxp2 = 10^{-6}, k_mxpf = 0.5, ε_pat = 10^{-6}, l_rand^(εf) = 10^{-2}, u_rand^(εf) = 10, l_rand^(pf1) = l_rand^(pf2) = l_rand^(px) = 10^{-2}, u_rand^(pf1) = u_rand^(pf2) = u_rand^(px) = 1, l_rand^(ab) = 10^{-2}, u_rand^(ab) = 10;

• the parameters of local search: I_loc = 25, k_h = 0.75;

• the parameters of BPs: N_b = 100, k_v1 = 1, S_bg = 10, v_min = 10^{-5}, ε_b = 5 × 10^{-6}, I_rest = 10, w_i = 0.9, w_l = 0.5, w_g = 0.5, N_ext = 10;

• the parameters of stopping conditions: I_stop = 100, ε_stop = 10^{-7}, δ_stop = 0, I_max.

4. Results

The statistics of the experiments with random variation (RV) and evolutionary control (EC) of parameters without reloading of the rule base (the discarded rule base mode) are shown in Tables 1 and 2 respectively.

The results of the RV and EC experiments with discarded rule base are compared in Figures 3-5 and in Table 3. In Figure 3 the results are shown as box plots: the box represents the range between the first (Q1) and third (Q3) quartiles, the orange line denotes the median, whiskers extend to values not more than 1.5(Q3 − Q1) below Q1 or above Q3, more distant values (outliers) are denoted by circles, and the green triangle shows the mean. In Figures 4 and 5 error bars denote the standard error of the mean.

The two-tailed Mann-Whitney test was used to test significance of

Table 1. Results of the RV experiments

Pro- Para-blem meter

Minimum

Median Maximum Average Standard dev.

Neval Niter

2

N eval Niter

3

Neval Niter

4

Neval Niter

5

Neval Niter

6

Neval Niter

7

Neval Niter

8 .

min

Neval

Niter

9 .

min Neval Niter

Neval Niter

11 .

min

Neval Niter

12 .

min

Neval


Niter

13

196

9.817x106 400

1.691x10-12 7.356x 106 300 20

4.451 x106 200

1.315x 10-8 4.42x106 200

1.221x10-14

4.86x106 200 71.3 1.421 x107 600 0.01705 6.731x106 300 4343 4.692x 106 200

5.492x10-16 4.961 x106 200 3.787x10

4.655 10

N eval Niter

326.8 1.238x107 500

3.744x 10-11 7.663x 106 300 20

4.681 x106 200 19.81 4.866x 106 200

3.847 10

-7

6

5.37x10 4.913 10

200

-9 6

2.172x10 4.948x10 200

1.381x 10-10 3.262x107 1100 3.87x10-7 4.89x106 200

448.7 2.468x107 1000 3.987 1.257x 107 500 20

1.206x107

500 19.91 1.771 x107 700

-14

5.033x106 200 93.86 1.9x107 800 0.0621 1.134x107 500 9905 7.331 x106 300

1.843x10-15 5.255x106 200

-7 6

2.244x10-13 5.277x106 200 106.3 5.104x107 2100 0.3598 1.409x107 600

1.225x104 9.822x107 3900

-15 6

2.566 10

5.493 10

8.014x10

5.112 10

200

-7 6

1.879x10 5.17x10' 200

1.144x10-7 5.563x107 1900 1.066x 10-6 5.096x 106 200

4.297X10 7.846x10' 300 0.000 75 1.011x108 3500 4.416x 10-6 7.596x 106 300

329.6 1.224x107 490 1.316 8.545x106 340 20

5.309x106 228 19.2 6.826x 106 285

-14

5.116x10 5.033x106 200 93.57 2.164x107 893 0.079 02 1.059x107 459 9622 1.137x 107 463

-15 6

1.764 10

5.252 10

200

200

-7

5.362x10

-7

4.909 10

200

200

-6 6

-7 6

5.649X10 5.609x10' 218

1.284X10-5 5.803X107 1976 1.185X10-6 5.137X106 202

58.45 2.162x 106 87.75 1.875 1.262x 106 50.99 0.000 200 3 1.41x106 58.45 3.378

2.828x106

113.5 3.622x 10-14

7.935x104 0

6.629 7.776 x106

316.6 0.061 23

1.497x106 61.8 1535 1.193x107 474.1 4.518x 10-16 8.844 x104 0

8.902x 10-8 9.112x104 0

8.42x10-7 9.719x105 38.42 7.662 10-5

1.656x10' 580.2 5.696x10-7 3.564x 105 14

differences in Φ*_min, N_eval, N_epi.¹ For each of the parameters separately, p-values were adjusted for multiple comparisons by the Holm-Bonferroni method.²

¹The function scipy.stats.mannwhitneyu from SciPy 0.19.1 [21] was used.

²The function p.adjust with the parameter method = "holm" from R 3.4.4 [22] was used.
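For reference, the same comparison and correction can be reproduced entirely in Python; using statsmodels' multipletests in place of R's p.adjust is a substitution made by this sketch, not the tooling used in the paper:

```python
import numpy as np
from scipy.stats import mannwhitneyu
from statsmodels.stats.multitest import multipletests

def compare_modes(results_rv, results_ec):
    """results_rv, results_ec: lists of per-problem arrays of Phi*_min values.
    Returns raw and Holm-Bonferroni adjusted p-values for each problem."""
    p_raw = [mannwhitneyu(a, b, alternative='two-sided').pvalue
             for a, b in zip(results_rv, results_ec)]
    _, p_adj, _, _ = multipletests(p_raw, method='holm')
    return np.array(p_raw), p_adj
```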

Table 2. Results of the EC experiments with discarded rule base

Minimum Median Maximum Average Standard dev.

;.862x10-14 3.428x10-13 54.72 15.68 17.44

8.175x106 9.86x106 1.677x 107 1.072x 107 2.073x106

300 300 600 358 73.73

L.606x10-12 2.991 x10-11 3.987 1.037 1.749

8.071 x106 9.312x106 1.522x107 1.002x107 1.523x106

300 300 500 339 50.78

20 20 20 20 0.000 319 6

4.15x106 5.458x106 1.441 x107 6.26x 106 1.739x106

200 200 500 239 63.08

8.689x10-9 19.76 19.93 17.58 6.182

4.211 x106 7.423x106 1.876x 107 7.969x 106 3.271 x106

200 300 800 296 118.3

L.044x 10-14 3.825x 10-14 4.118x10-13 5.061 x10-14 4.683x 10-14

5.194x106 6.228x 106 6.819x 106 6.193x 106 3.46x105

200 200 200 200 0

39.17 73.61 87.24 72.53 9.259

1.751 x107 2.312x107 1.056x108 3.031x107 1.767x 107

600 750 2900 956 498.7

0.009 948 0.066 76 0.2953 0.08215 0.052 62

7.304 x106 1.177x 107 2.132x107 1.175x 107 2.315x106

300 500 800 468 81.09

4.268 9257 1.178x104 7998 3726

4.664x 106 1.064x107 8.407x 107 1.332x107 1.2x107


200 400 2800 479 421.5

L741x10"16 1.195x 10-15 2.149x10-15 1.236x 10-15 4.056x 10-16

5.094x106 6.505x 106 7.379x106 6.483x 106 4.42x105

200 200 200 200 0

3.54x10-7 5.437x10-7 7.794x10-7 5.491x10-7 8.739x10-8

4.857x106 5.465x 106 6.317x 106 5.487x106 3.104x105

200 200 200 200 0

1.617x10-9 2.423x10-7 7.641 x10-6 5.306x10-7 9.742x10-7

5.512x106 6.435x 106 1.014x107 6.69x 106 9.355x 105

200 200 300 209 28.62

S.579x 10-11 3.468x10-7 0.004 809 8.098x10-5 0.000 542 7

3.623x107 5.498x107 1.267x108 6.038x107 1.812x107

1100 1600 3800 1778 543.1

3.427x10-7 1.03x10-6 2.895x10-6 1.085x10-6 4.042x10-7

5.155x106 5.95x106 8.479x106 6.033x 106 4.624x105

200 200 300 202 14

Pro- Para-blem meter

1 .

min

Neval Niter

2

min N eval Niter

3 .

min Neval Niter

.

min Neval Niter

.

min Neval Niter min Neval Niter

7 .

min Neval Niter

.

min Neval Niter

.

min Neval Niter

10 .

min

Neval Niter

11

min Neval Niter

12 .

min

Neval Niter

13 .

min

Neval Niter

4

5

6

8

9

The difference in Φ*_min between the RV and EC experiments is significant at the 0.05 level in problems 1, 4, 6 and 9 (see Table 3). The minimum OF values for these problems were higher in the RV mode than in the EC mode (Figure 3).


Table 3. Comparison of Φ*_min between the RV and EC experiments

Problem p-value Adjusted p-value

1 <0.0001 <0.0001*

2 0.0893 0.6850

3 0.0856 0.6850

4 0.0004 0.0042*

5 0.7545 1

6 <0.0001 <0.0001*

7 0.2785 1

8 0.0160 0.1444

9 <0.0001 <0.0001*

10 0.2433 1

11 0.9912 1

12 0.2009 1

13 0.3242 1

Note: *) statistically significant results (p < 0.05).

It is worth noting that in problem 8 the adjusted p-value of 0.14 exceeded the significance level, but 15 values in the EC mode were below 50 (with a minimum of 4.268), whereas the minimum result in the RV mode was 4343.

In the EC mode N_eval was significantly (p < 0.05) higher (in problems 2-11 and 13) or lower (in problem 1) (Figure 4). There was no significant change in problem 12. In all problems N_epi was significantly higher in the EC mode (Figure 5).

The most substantial difference between the RV and EC modes was observed in problem 1 with Rastrigin's function. There was simultaneous reduction of error, number of OF evaluations and iterations, which was not seen in other cases.

Therefore, the experimental results are consistent with the hypothesis that in some problems Φ*_min is lower with evolutionary control of NNAICM-PSO parameters than with random variation.

The statistics of the last run in a series in the EC experiments with reused rule base, with random transformation at the start (R_t = 0) or before each run (R_t = 1), are presented in Tables 4 and 5.

The results of the EC experiments with reused rule base are compared for R_t = 0 and R_t = 1 in Figures 6-9 and in Table 6. In Figures 7-9 colored areas denote the standard error of the mean.

The two-tailed paired Wilcoxon signed-rank test was used to test significance of differences in Φ*_min, N_eval, N_epi between the first and the last run in a series.³ For each of the three parameters separately, p-values were adjusted for multiple comparisons by the Holm-Bonferroni method.

³The function scipy.stats.wilcoxon with the parameter alternative = 'two-sided' from SciPy 0.19.1 [21] was used.

Figure 3. Φ*_min in the RV and EC experiments with discarded rule base

The difference in Φ*_min between the first and the last run in a series is significant at the 0.05 level for R_t = 0 and R_t = 1 (see Table 6). In problems 1, 3, 4, 6 and 8 the difference was present in both modes, while in problem 13 it was present in the R_t = 0 mode only. In these problems Φ*_min clearly decreases over a series (Figures 6, 7).

In most of the problems N_eval goes up over a series (Figure 8). The differences between the first and the last run in a series are significant (p < 0.05) in problems 1, 2, 4, 5, 6, 9, 11, 13 with R_t = 0 and in problems 1, 2, 3, 4, 5, 7, 9, 11, 13 with R_t = 1.

In both modes, in all problems except No. 10, N_epi goes up over a series (Figure 9). The differences in N_epi between the first and the last run in

Figure 4. Average N_eval in the RV and EC experiments with discarded rule base

Figure 5. Average N_epi in the RV and EC experiments with discarded rule base

series are also significant (p < 0.05) in all problems except No. 10. There are no significant differences in problem 10 in any mode.

No significant differences in Φ*_min, N_eval, N_epi between the R_t = 0 and R_t = 1 modes were discovered by the Mann-Whitney test (p > 0.05).

Therefore, the experimental results are consistent with the hypothesis that in some problems evolutionarily controlled NNAICM-PSO gives lower Φ*_min with a reused rule base than with a discarded rule base.

Table 4. Results of the last run in a series in the EC experiments with reused rule base and R_t = 0

Pro- Para-blem meter

Minimum

Median

Maximum

Average Standard dev.

-14

-14

1.634x10-13 1.234x107 300 3.987 1.675x107 500 20

1.28x107 400 19.86

2.688x107 800

2.001 x10-13

8.103x106 200 52.7 7.287x 107 1800 0.2023 2.502x107 800


1.141x104 8.605x107 2200 2.145x10-15 8.587x106 200

8.597x 10-7 6.203x 106 200

-5

1.006x10-13

1.162x107

300 1.037

1.202x107

340 20

7.36x106 258 6.682 1.344x107 414

4.908x 10-14 7.685x 106 200 42.91 3.696x107 912 0.07794 1.382x107 472 7610 1.873x 107 570

1.198x 10-15 7.883x106 200

5.253x10-7 5.469x 106 200

-7

6

2.451 x10-14 4.48x105 0

1.749

1.882x106

56.57 0.000 609 6 2.116x 106 75.07 9.31 4.848 x106 154.9 4.164x 10-14 3.045x 105 0

5.748 1.242x107 307 0.047 29 3.14x106 84.95 1625 1.412x107 392.6 3.635x 10-16 3.433x105 0

1.154x10-7 3.423x105 0

2.444x10-6 6.694x105 14

9.707x10-5 1.751 x107 523.5 3.303x10-7 4.409x105 0

N eval

Niter

.

min N eval Niter

.

min N eval

Niter

.

min N eval Niter

5

N eval

Niter

.

min N eval Niter

.

min N eval

Niter

.

min N eval Niter

9

N eval Niter

6.395 10

10 .

min

N eval Niter

11 .

min

Neval Niter

12

min Neval Niter 13 $*min Neval Niter

7.014 10

7.689 10

3.901 10

4.134x10

5.578 10

9.415 10

1.075x 107

300

2.191x 10-12 9.954x106 300 20

4.734x106 200

8.239x10-9 5.553x106 200

1.232x 10-14 6.779x 106 200 29.23 2.017x 107 500 0.01859 7.554x106 400 4573 5.139x106 200

5.241x 10-16 7.097x106 200 3.2x10-7 4.731x106 200 2.003 10

-9 6

1.171x107

300

3.458x 10-11 1.111x107 300 20

6.804 x106 200

1.779x10-8 1.306x107

400

3.531 x10-14 7.703 x106 200 42.8 3.231 x107 800 0.06355 1.332x107 500 7670 1.515x107 450

1.192x 10-15 7.904 x106 200

4.975x 10-7 5.474 x106 200 2.199 10

-7

6

200

-11

7

1100

-7

6

7.874x10' 200

2.102x10-7 6.153x107 1750

-7

6

9.624x10 6.512 10

1.65x10 1.222x107

300

0.000 694 7 1.137x108 3600 1.968x10

7.56 10

8.341x10 7.942 10

202

1.686 10

-5

6.553 10

7

1800

-6


9.968x10

-7

6.493 10

200

200

200

200

2

3

4

6

7

8

5. Conclusion

Evolutionary control of parameters improves accuracy of the NNAICM-PSO method over random variation of parameters. The progressive decline of errors in objective function values in the reused rule base mode suggests adaptation of the method to the problem at hand.

Table 5. Results of the last run in a series in the EC experiments with reused rule base and R_t = 1

Pro- Para-blem meter

Minimum

Median

Maximum

Average Standard dev.

4.796x 10—14 1.084x107 300

1.456x10-12 9.828x106 300 20

4.5x106 200

7.62x10-9 5.842x106 200

8.216x10-15

7.109x106 200 31.01 2.425x107 600 0.01514 7.564x 106 300 55.9 5.144x106 200

2.653x10-16 6.766x 106 200

3.488x10-7 4.412 106

—13

—13

1.058x 10—13 1.158x107 300 0.3987 1.14x107 316 20

7.163x 106 252 7.934 1.33x107 416

4.073x 10—14 7.674x 106 200 42.82 3.442x107 848 0.080 82 1.379x 107 468 6852 1.95x107 594

1.151x10-15 7.831x106 200

5.298x 10—7 5.498x106 200 7.629 10

3.215x 10—14 3.96x105 0

1.196 1.534x106 41.76 0.000 4517 2.146x 106 64 9.636 5.439x106

151.5 2.919x 10—14

2.754x105 0

4.86 9.846x 106 241.9 0.056 64 3.266x 106 88.18 2034 1.118x107

299.6 4.224x 10—16

3.881 x105 0

9.497x 10—8 3.806x105 0

1.571 x10—6 8.765x105 23.75 0.000 835 8 2.334x107 689.1 4.336x10-7 4.678x105 0

N eval

Niter

.

min Neval Niter

.

min Neval

Niter

.

min Neval Niter

5

N eval

Niter

.

min Neval Niter

.

min Neval

Niter

.

min Neval Niter

9

Neval Niter

10

Neval Niter

11 .

min

Neval Niter

12

min Neval Niter 13 $*min Neval Niter

3.557 10

6.495 10

1.075 10

200

-9

6

3.093x10 7.887 10

200

900

2.684x10

-7

8.417x10

5.121 10

6

6.492 10

2.363 10

1.162x107

300

1.868x 10—11 1.097x 107 300 20

6.291 x106 200

2.076x 10—8 1.275x 107 400

3.014x 10—14 7.672x 106 200 43.91 3.221 x107 800 0.061 95 1.351 x107 500 7049 1.739x 107 600

1.149x10-15 7.846 x106 200

5.145x 10—7 5.495x106 200

-7 6

200

2.277x 10—10 2.427x 10—7 3.3 107 6.262 107


1700

-7

1.241 x107

300 3.987 1.804x107 500 20

1.371 x107

400 19.86 2.632x107 700

1.402x 10—13 8.296x 106 200 54.41 6.958x107 1700 0.2733 2.24x107 800

1.001 x104 6.373x 107 1700 2.311 x10—15 8.454x106 200

8.542x 10—7 6.564x 106 200

9.728x 10—6 1.2x107 300 0.005 974 1.433x108 4200 2.753x10

-6

6

7.811 10

-7

7.972X106 206

0.0001237 6.693X107 1854

-7 6

9.329x10

6.53 10

200

200

200

200

2

3

4

6

7

8

With evolutionary control and reuse of the rule base, there is no significant difference in results between random transformation of the problem at the start (R_t = 0) and before each run of the algorithm in a series (R_t = 1). This indicates that the process is not sensitive to orthogonal transformations and limited shifts of the objective function.

Figure 6. Median Φ*_min plotted against run number in a series in the EC experiments with reused rule base

For random variation of parameters, the lowest average error was observed in problems 5 and 9-13, and the lowest median error in problem 2. Of these, only in problem 9 did evolutionary control significantly reduce error. Moreover, there was no difference between the rule base being reused or discarded in these problems, except No. 13 with R_t = 0, during 10 runs of the algorithm. It is noteworthy that in all these problems the objective functions are polynomial or almost polynomial, including quadratic. For example, the transformed Griewangk's function of 100 variables, used in problem 5, is quadratic in almost all of the search space except a small neighbourhood of the global minimum. High efficiency in such problems is

Figure 7. Average Φ*_min plotted against run number in a series in the EC experiments with reused rule base

presumably due to supplementary quasi-Newton modified BFGS search.

The method worked well with the transformed Rastrigin's function (problem 1), which is hard to minimize. Evolutionary control strongly reduced error in comparison with random variation of parameters in this case. In the reused rule base mode, the average error quickly decreased from approximately 19 at the end of the first run to 10^{-13} at the end of the 7th run in a series (Figure 7). Results were also good for the transformed Ackley's function (problem 4) in this mode, where over 10 runs the median error decreased from 20 to 2 × 10^{-8}.

Figure 8. Average N_eval plotted against run number in a series in the EC experiments with reused rule base

Statistical significance of the results was high. Usually the p-values were well below the threshold of 0.05.

Evolutionary control of NNAICM-PSO parameters has advantages over manual tuning. It allows experience of solving a problem or a class of problems to be preserved, accumulated and reused. Application of evolutionary control also allows us to exploit results from the field of evolutionary computation.

The evolutionary controller relieves the user of the burden of manual tuning of NNAICM-PSO parameters, although the ranges of random generation and mutation and other metaparameters of the controller itself

Figure 9. Average N_epi plotted against run number in a series in the EC experiments with reused rule base

must still be tuned manually. Nevertheless, this task seems easier than direct manual tuning because, after initial metaparameter setting, the process will adapt to the problem. Suboptimal parameters can hinder evolution, but can't stop it. They can be corrected based on the direction of evolution deduced from the rule base.

The subjects of future study are (1) causes of efficiency or inefficiency of evolutionary controlled NNAICM-PSO in various situations, (2) how to reduce the number of objective function evaluations, (3) how to reduce the number of metaparameters.

Table 6. Comparison of Φ*_min between the first and the last run in a series in the EC experiments with reused rule base

Problem R_t = 0 R_t = 1

p-value Adjusted p-value p-value Adjusted p-value

1 <0.0001 <0.0001* <0.0001 <0.0001*

2 0.7030 1 0.1410 0.9333

3 0.0002 0.0022* 0.0012 0.0105*

4 <0.0001 <0.0001* <0.0001 <0.0001*

5 0.5955 1 0.1333 0.9333

6 <0.0001 <0.0001* <0.0001 <0.0001*

7 0.3466 1 0.6958 1

8 0.0022 0.0180* 0.0002 0.0022*

9 0.9961 1 0.2408 1

10 0.8734 1 0.4602 1

11 0.6191 1 0.5023 1

12 0.4840 1 0.9654 1

13 <0.0001 0.0002* 0.0341 0.2728

Note: *) statistically significant results (p < 0.05).

References

[1] W. M. Spears, K. A. De Jong, T. Bäck, D. B. Fogel, H. De Garis. "An overview of evolutionary computation", European Conference on Machine Learning, Lecture Notes in Computer Science, vol. 667, Springer, 1993, pp. 442-459.

[2] F. Neri, V. Tirronen. "Recent advances in differential evolution: a survey and experimental analysis", Artificial Intelligence Review, 33:1-2 (2010), pp. 61-106.

[3] N. Siddique, H. Adeli. "Nature inspired computing: an overview and some future directions", Cognitive Computation, 7:6 (2015), pp. 706-714.

[4] D. H. Wolpert, W. G. Macready. "No free lunch theorems for optimization", IEEE Transactions on Evolutionary Computation, 1:1 (1997), pp. 67-82.

[5] G. Karafotias, M. Hoogendoorn, A. E. Eiben. "Parameter control in evolutionary algorithms: Trends and challenges", IEEE Transactions on Evolutionary Computation, 19:2 (2015), pp. 167-187.

[6] A. Aleti, I. Moser. "A systematic literature review of adaptive parameter control methods for evolutionary algorithms", ACM Computing Surveys (CSUR), 49:3 (2016), 56.

[7] R. Poli, J. Kennedy, T. Blackwell. "Particle swarm optimization", Swarm Intelligence, 1:1 (2007), pp. 33-57.

[8] V. D. Koshur. "Reinforcement swarm intelligence in the global optimization method via neuro-fuzzy control of the search process", Optical Memory and Neural Networks, 24:2 (2015), pp. 102-108.

[9] Sh. A. Akhmedova, V. V. Stanovov, E. S. Semenkin. "Cooperation of bio-inspired and evolutionary algorithms for neural network design", Journal of Siberian Federal University. Mathematics & Physics, 11:2 (2018), pp. 148-158.

[10] E. Semenkin, M. Semenkina. "Self-configuring genetic algorithm with modified uniform crossover operator", Advances in Swarm Intelligence, ICSI 2012, Lecture Notes in Computer Science, vol. 7331, Springer, 2012, pp. 414-421.

[11] G. Karafotias, A. E. Eiben, M. Hoogendoorn. "Generic parameter control with reinforcement learning", Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation, ACM, 2014, pp. 1319-1326.

[12] G. Karafotias, M. Hoogendoorn, B. Weel. "Comparing generic parameter controllers for EAs", 2014 IEEE Symposium on Foundations of Computational Intelligence (FOCI), IEEE, 2014, pp. 46-53.

[13] A. Rost, I. Petrova, A. Buzdalova. "Adaptive Parameter Selection in Evolutionary Algorithms by Reinforcement Learning with Dynamic Discretization of Parameter Range", Proceedings of the 2016 on Genetic and Evolutionary Computation Conference Companion, ACM, 2016, pp. 141-142.

[14] V. Koshur, K. Pushkaryov. "Global optimization via neural network approximation of inverse coordinate mappings", Optical Memory and Neural Networks, 20:3 (2011), pp. 181-193.

[15] V. D. Koshur, K. V. Pushkaryov. "Global'naya optimizatsiya na osnove neyrosetevoy approksimatsii inversnykh zavisimostey [Global optimization via neural network approximation of inverse mappings]", XIII Vserossiyskaya nauchno-tekhnicheskaya konferentsiya "Neyroinformatika-2011" [XIII All-Russian Scientific and Technical Conference "Neuroinformatics-2011"], 1 (2010), pp. 89-98.

[16] K. V. Pushkaryov, V. D. Koshur. "Gibridnyy evristicheskiy parallel'nyy metod global'noy optimizatsii [Hybrid heuristic parallel method of global optimization]", Vychislitel'nye metody i programmirovanie [Computational Methods and Programming], 16 (2015), pp. 242-255.

[17] V. D. Koshur, K. V. Pushkaryov. "Dual'nye obobshchenno-regressionnye neyronnye seti dlya resheniya zadach global'noy optimizatsii [Dual Generalized Regression Neural Networks for global optimization]", XII Vserossiyskaya nauchno-tekhnicheskaya konferentsiya "Neyroinformatika-2010" [XII All-Russian Scientific and Technical Conference "Neuroinformatics-2010"], 2 (2010), pp. 219-227.

[18] J. Nocedal, S. J. Wright. Numerical Optimization, Second Edition, Springer-Verlag, New York, 2006.

[19] N. H. Awad, M. Z. Ali, P. N. Suganthan, J. J. Liang, B. Y. Qu. "Problem Definitions and Evaluation Criteria for the CEC 2017 Special Session and Competition on Single Objective Real-Parameter Numerical Optimization", 2016.

[20] G. Karafotias, M. Hoogendoorn, A. E. Eiben. "Why parameter control mechanisms should be benchmarked against random variation", 2013 IEEE Congress on Evolutionary Computation, IEEE, 2013, pp. 349-355.

[21] E. Jones, T. Oliphant, P. Peterson, et al. SciPy: Open source scientific tools for Python, 2001-.

[22] R Core Team. R: A language and environment for statistical computing, 2018.

Received 27.04.2019; revised 09.05.2019; published 26.06.2019.

Recommended by prof. Vyacheslav M. Khachumov.

prof. Vyacheslav M. Khachumov

Sample citation of this publication:

Kirill V. Pushkaryov. "Global optimization via neural network approximation of inverse coordinate mappings with evolutionary parameter control". Program Systems: Theory and Applications, 2019, 10:2(41), pp. 3-31. DOI: 10.25209/2079-3316-2019-10-2-3-31. URL: http://psta.psiras.ru/read/psta2019_2_3-31.pdf

The same article in Russian: DOI: 10.25209/2079-3316-2019-10-2-33-65

About the author:

Kirill Vladimirovich Pushkaryov

Senior lecturer, Computer Science Department, Institute of Space and Information Technology, Siberian Federal University. Areas of interest include heuristic methods of global optimization and neural networks.

e-mail: cyril.pushkaryov@yandex.ru

ORCID: 0000-0002-1138-886x
