
The Algorithm of Optimization Realized by the Stochastic Processor

S.G. Svistunov

Emperor Alexander I Petersburg State Transport University St. Petersburg, Russia ssg47@mail.ru

Abstract. In measurement technology, radiolocation, and hydroacoustics one often has to solve an adaptive identification problem, i.e., to construct a model of an object from available data. Typical examples are problems of statistical parameter estimation, regression, and data analysis. This article proposes to solve such problems with an adaptive quasi-gradient algorithm implemented on a stochastic computer. The proposed approach reduces the solution time and simplifies the required hardware. A detailed algorithm for a specialized computer is presented.

The use of a stochastic computer makes it possible to eliminate "slow" operations: raising to a real power and multiplication and division of multi-digit numbers. This shortens each iteration, which, for a random signal at the device input, reduces the overall time needed to locate an extremum of the regression function.

A modified variant of the adaptive stochastic algorithm is proposed that can be implemented on a stochastic processor with a linear code-to-probability converter. A proof of convergence of the method is given.

Keywords: extremum of a regression function, adaptive algorithms of stochastic optimization, stochastic computers, code-to-probability converter.

Introduction

Many problems require finding a minimum (maximum) point of a regression function, for example, when estimating the probabilities of certain events. In [1] an effective adaptive algorithm for finding a minimum of a regression function is given; it quickly reaches a neighborhood of the minimum, a property that distinguishes it favorably from the classical stochastic approximation algorithms [2, 3].

However, this algorithm requires rather labor-consuming operations: scalar products of vectors and raising to a real power. When processing data in real time it is desirable to avoid such operations and to replace them with simpler (less labor-consuming) ones. Multiplication can be replaced by a logical operation if the data are represented not as multi-digit binary numbers but as probabilities of the occurrence of 0 or 1 [4]. Raising to a real power can be replaced by raising 2 to an integer power, so that exponentiation reduces to a simple shift operation. Based on these considerations, a modified variant of the adaptive stochastic algorithm is proposed.
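The two substitutions can be illustrated with a short simulation (a Python sketch, not part of the original hardware design; the stream length and helper names are illustrative). In the stochastic representation of [4], a number in [0, 1] becomes a Bernoulli bitstream, so multiplication reduces to a bitwise AND of independent streams, while multiplication by 2^k for integer k is a plain bit shift:

```python
import random

def to_stream(p, n, rng):
    """Represent a number p in [0, 1] as a Bernoulli bitstream of length n."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def stream_mul(a_bits, b_bits):
    """Multiplication of two independent streams is a bitwise AND:
    P(a AND b = 1) = P(a = 1) * P(b = 1)."""
    return [a & b for a, b in zip(a_bits, b_bits)]

def from_stream(bits):
    """Recover the represented number as the relative frequency of ones."""
    return sum(bits) / len(bits)

rng = random.Random(0)
n = 200_000
a, b = 0.6, 0.5
prod = from_stream(stream_mul(to_stream(a, n, rng), to_stream(b, n, rng)))
# prod approximates a * b = 0.30 up to sampling error of order 1/sqrt(n)

# Raising 2 to an integer power needs only a bit shift on integer data:
assert 5 << 3 == 5 * 2 ** 3
```

The AND gate replaces a multi-digit multiplier, at the cost of averaging over a long bitstream; this trade-off is exactly what makes each iteration of the stochastic processor cheap.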

Basic notation

LCCP — linear code-to-probability converter [4];
SQG — stochastic quasi-gradient [1];
u+ — unit step function;
a.s. — almost surely (with probability 1);
M_s[·] — conditional expectation;
‖·‖ — norm in R^n;
∂f — subdifferential of the function f;
π_X — operator of projection of a point onto the set X;
Z — set of integers;
R^n — n-dimensional real space;
U[0, c] — uniform probability distribution on the interval [0, c];
s — iteration number;
⟨·, ·⟩ — scalar product of vectors;
f_x(x) — generalized gradient of the function f(x);
ent — integer-part (floor) function;
|⟨x, y⟩| ≤ ‖x‖·‖y‖ — the Schwarz inequality;
∇ — the Hamilton operator.
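The role of the ent function can be checked directly (an illustrative Python sketch; note that ent must round toward −∞, i.e., floor, not truncation toward zero, so that 2^{ent(r)} is well defined for negative r):

```python
import math

def ent(r: float) -> int:
    """Integer part (floor) of r, as used in the step multiplier 2**ent(r)."""
    return math.floor(r)

# The step multiplier is always an exact power of two:
rho = 2.0 ** ent(-3.7)          # ent(-3.7) = -4, so rho = 2**-4 = 0.0625

# For integer data, multiplying by 2**ent(r) with ent(r) >= 0 is a left shift:
assert 13 * 2 ** ent(3.2) == 13 << 3
```

Because the multiplier is a power of two, the "slow" real-valued exponentiation in the original algorithm [1] never has to be performed on the stochastic processor.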

Algorithm of Stochastic Optimization

As shown in [4], the output of the LCCP is a vector of random variables p^s = (p_1^s, ..., p_n^s), where

p_k^s = sign(ξ_k^s) u+(|ξ_k^s| − α_s), k = 1, ..., n,

ξ_k^s are the components of the SQG at iteration s (the initial data), and α_s are uniformly distributed random numbers in the range from 0 to c = 2^l. The vector y^s = c p^s is itself an SQG. The modernized algorithm of stochastic optimization is based on Theorems 1 and 2 below; the SQG y^s serves as the initial data, the base of the power is a = 2, and instead of a real power its integer part, given by the function ent, is used. Throughout, random events ω ∈ Ω are considered, where (Ω, F, P) is a probability space.
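The statement that y^s = c p^s is itself an SQG rests on the unbiasedness of the LCCP output: M[y_k^s | ξ^s] = c·sign(ξ_k^s)·P(α_s < |ξ_k^s|) = ξ_k^s whenever |ξ_k^s| ≤ c. A quick Monte-Carlo check (an illustrative Python sketch; function names and constants are not from the article):

```python
import random

def lccp(xi, c, rng):
    """One LCCP draw: p_k = sign(xi_k) * u+(|xi_k| - alpha), alpha ~ U[0, c].
    The scaled output y = c * p is an unbiased estimate of xi when |xi_k| <= c."""
    y = []
    for x in xi:
        alpha = rng.uniform(0.0, c)
        p = (1 if x > 0 else -1) if abs(x) > alpha else 0
        y.append(c * p)
    return y

rng = random.Random(1)
c = 8.0                      # c = 2**l with l = 3
xi = [2.5, -6.0, 0.5]
n = 100_000
avg = [0.0] * len(xi)
for _ in range(n):
    for k, v in enumerate(lccp(xi, c, rng)):
        avg[k] += v / n
# avg approximates xi componentwise: each draw carries only a sign and a
# Bernoulli(|xi_k|/c) event, yet the scaled mean recovers xi.
```

Each output component takes only the three values −c, 0, c, which is what allows the subsequent arithmetic to be done with logic gates and shifts.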

Theorem 1. Let f(x) be a convex (possibly nonsmooth) function defined on a convex compact set X ⊂ R^n, satisfying a Lipschitz condition on X.

Original Russian text © S.G. Svistunov, Yu. I. Nikiforov, published in Proc. of Petersburg State Transport University, 2008, No. 3 (16), pp. 234–249.

Intellectual Technologies on Transport. 2018. № 4

Suppose that

a_s = a > 1, s = 0, 1, ...; max_{x,y∈X} ‖x − y‖ = c_1 for x, y ∈ X; ‖ξ^s‖ ≤ c_2 a.s., s = 0, 1, ...; b^s → 0 a.s. as s → ∞; ρ̄ > 0, δ > 2 log_a [M_s ‖ξ^s‖²]^{1/2} a.s., s = 0, 1, ...

Then with probability 1 all limit points of the sequence {x^s} defined by the relations

x^{s+1} = π_X(x^s − ρ_s ξ^s),  ρ_{s+1} = min(ρ̄, ρ_s a^{−⟨ξ^{s+1}, Δx^{s+1}⟩ − δρ_s}),

with M[ξ^s | F_s] = f_x(x^s) + b^s, s = 0, 1, ..., where the σ-algebra F_s is generated by the random variables (x^0, ξ^0, ..., x^{s−1}, ξ^{s−1}, x^s), ζ^s = ξ^s − f_x(x^s), and f_x(x^s) ∈ ∂f(x^s), belong to the set X* = {x* ∈ X : f(x*) = min_{y∈X} f(y)}. The proof of the theorem is given in [1].

Convergence of the Modified Variant of the Adaptive Stochastic Algorithm

Theorem 2. Let f(x) be a convex (possibly nonsmooth) function defined on a convex compact set X ⊂ R^n, satisfying a Lipschitz condition on X. Suppose that

max_{x,y∈X} ‖x − y‖ = c_1,   (1)

‖ξ^s‖ ≤ c = 2^l a.s., l ∈ Z,   (2)

M_s ξ^s = f_x(x^s), where f_x(x^s) ∈ ∂f(x^s),   (3)

ζ^s = y^s − f_x(x^s),   (4)

y_k^s = c sign(ξ_k^s) u+(|ξ_k^s| − α_s), k = 1, ..., n,   (5)

α_s ∈ U[0, c], δ > 0.   (6)

Then with probability 1 all limit points of the sequence {x^s} defined by the relations

x^{s+1} = π_X(x^s − 2^{ent(r_s)} y^s), s = 0, 1, ...,   (7)

r_{s+1} = min(q_0, r_s − ⟨y^{s+1}, Δx^{s+1}⟩ − δ 2^{ent(r_s)}),   (8)

q_0 > 0, r_0 = 0, Δx^{s+1} = x^{s+1} − x^s,   (9)

belong to the set

X* = {x* ∈ X : f(x*) = min_{y∈X} f(y)}.
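The iteration (7)–(9) can be simulated directly on a toy problem (a Python sketch; the quadratic test function, the box constraint, the noise model, and all constants are illustrative assumptions, and the LCCP of (5) is imitated with floating-point randomness rather than hardware bitstreams):

```python
import math
import random

rng = random.Random(42)
c = 2.0 ** 1                    # bound on the SQG components, condition (2)
delta, q0 = 0.5, 0.5            # delta > 0 and q0 > 0, conditions (6) and (9)
x_star = [0.5, -0.25]           # minimizer of f(x) = 0.25 * ||x - x_star||**2
lo, hi = -2.0, 2.0              # X is the box [lo, hi]^2

def proj(x):                    # projection pi_X onto the box
    return [min(max(v, lo), hi) for v in x]

def sqg(x):                     # noisy gradient of f, clipped so |xi_k| <= c
    return [max(-c, min(c, 0.5 * (v - w) + rng.gauss(0.0, 0.3)))
            for v, w in zip(x, x_star)]

def lccp(xi):                   # condition (5): y_k = c*sign(xi_k)*u+(|xi_k|-alpha)
    return [c * math.copysign(1.0, v) if abs(v) > rng.uniform(0.0, c) else 0.0
            for v in xi]

x, r = [2.0, 2.0], 0.0
y = lccp(sqg(x))
for s in range(3000):
    rho = 2.0 ** math.floor(r)                        # step multiplier 2**ent(r_s)
    x_new = proj([v - rho * w for v, w in zip(x, y)])          # iteration (7)
    y = lccp(sqg(x_new))                              # y^{s+1} at the new point
    dx = [a - b for a, b in zip(x_new, x)]
    r = min(q0, r - sum(a * b for a, b in zip(y, dx)) - delta * rho)   # (8)
    x = x_new

dist = math.dist(x, x_star)     # the iterates settle near x_star
```

The adaptive rule (8) raises r (up to the cap q_0) while successive quasi-gradients keep pointing the same way and lowers it after an overshoot, so the step multiplier shrinks roughly like 1/s, in line with Lemma 4 below.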

For the proof of the theorem several lemmas are needed. Note that from (5) it follows that

‖y^s‖ ≤ c√n = c_2 a.s.   (10)

Denote ρ_s = 2^{ent(r_s)}.

Lemma 1. The relation Σ_{s=0}^∞ 2^{ent(r_s)} = ∞ a.s. holds.

Proof. Assume the opposite, i.e., that there is a constant k for which the probability of the event A = {ω : Σ_{s=0}^∞ 2^{ent(r_s)} ≤ k} is greater than 0, i.e., P(A) > 0. From (7)–(10), by the Schwarz inequality [5] and the properties of the projection operator π_X [6],

r_{s+1} ≥ min(q_0, r_s − ‖y^{s+1}‖·‖Δx^{s+1}‖ − δ 2^{ent(r_s)}) ≥ min(q_0, r_s − c_2² 2^{ent(r_s)} − δ 2^{ent(r_s)}) ≥ min(q_0, r_s − c_3 2^{ent(r_s)}) a.s.,

where c_3 = c_2² + δ > 0. Then for elementary events ω ∈ A we have (since r_0 = 0)

r_{s+1} ≥ min(q_0, −c_3 Σ_{i=0}^s 2^{ent(r_i)}) ≥ min(q_0, −c_3 k).   (11)

Thus we obtain 2^{r_{s+1}} ≥ min(2^{q_0}, 2^{−c_3 k}) > 0. Since 2^{r_{s+1}} = 2^{ent(r_{s+1}) + Δ_{s+1}}, where 0 ≤ Δ_{s+1} < 1, it follows that

2^{ent(r_{s+1})} ≥ 2^{−Δ_{s+1}} min(2^{q_0}, 2^{−c_3 k}) ≥ 0.5 min(2^{q_0}, 2^{−c_3 k}) > 0,

which contradicts the relation Σ_{s=0}^∞ 2^{ent(r_s)} ≤ k, i.e., the convergence of the series, since its general term does not tend to 0.

Lemma 2. The relation Σ_{s=0}^∞ M 2^{2 ent(r_s)} < ∞ holds.

Proof. We first show that for fixed s the value of the random variable γ_{s+1} = r_{s+1} + f(x^{s+1}) is bounded on X. By (1) and the Lipschitz condition [7] there exist c_4 and c_5 such that c_4 ≤ f(x^{s+1}) ≤ c_5 for all x ∈ X. From (8) it follows that r_{s+1} ≤ q_0, and from (11) min(q_0, −c_3 Σ_{i=0}^s ρ_i) ≤ r_{s+1}, which proves the statement. Hence M r_{s+1} exists, by the properties of the Stieltjes integral for functions of bounded variation [8, 9]. From (4), (8), the definition of the gradient of a convex function [10], and the properties of the projection operator [6] we obtain

r_{s+1} ≤ r_s − ⟨y^{s+1}, Δx^{s+1}⟩ − δρ_s ≤ r_s + f(x^s) − f(x^{s+1}) − ⟨ζ^{s+1}, Δx^{s+1}⟩ − δρ_s.

From (3), (4), (7) it follows that

M_{s+1} ⟨ζ^{s+1}, Δx^{s+1}⟩ = M_{s+1} ⟨y^{s+1} − f_x(x^{s+1}), π_X(x^s − ρ_s y^s) − x^s⟩ = 0,

since y^{s+1} is an SQG. Hence, using the properties of conditional expectation [11], we obtain

M_{s+1} γ_{s+1} ≤ γ_s − δρ_s.   (12)

The quantity 2^{r_{s+1}} / 2^{M r_{s+1}} is bounded for all s. For finite r_{s+1} this statement is obvious. By (8), r_{s+1} ≤ q_0, so the only remaining case is r_{s+1} → −∞ as s → ∞. For r_{s+1} → −∞ and sufficiently large s we have r_{s+1} = r_s − ⟨y^{s+1}, Δx^{s+1}⟩ − δρ_s, where ρ_s = 2^{ent(r_s)} → 0. From (1), (7), (10), using the Schwarz inequality and the properties of the projection operator π_X, we obtain

r_s − ‖y^{s+1}‖·‖Δx^{s+1}‖ − δρ_s ≤ r_{s+1} ≤ r_s + ‖y^{s+1}‖·‖Δx^{s+1}‖ − δρ_s,

hence

r_s − c_2² ρ_s − δρ_s ≤ r_{s+1} ≤ r_s + c_2² ρ_s − δρ_s

and

r_s − c_3 ρ_s ≤ r_{s+1} ≤ r_s + c_3 ρ_s, where c_3 > 0.

Denote z_s = r_s − c_3 ρ_s and t_s = r_s + c_3 ρ_s. Then z_s − M r_{s+1} ≤ r_{s+1} − M r_{s+1} ≤ t_s − M r_{s+1} a.s. and

M r_{s+1} = ∫_{z_s}^{t_s} z dF_{s+1}(z),

where F_{s+1}(z) is the distribution function of the random variable r_{s+1}. Integrating by parts and using the mean value theorem, we have

M r_{s+1} = z F_{s+1}(z) |_{z_s}^{t_s} − ∫_{z_s}^{t_s} F_{s+1}(z) dz = t_s − (t_s − z_s) μ_s,

where 0 ≤ μ_s ≤ 1. Thus

−(t_s − z_s)(1 − μ_s) ≤ r_{s+1} − M r_{s+1} ≤ (t_s − z_s) μ_s,  t_s − z_s = r_s + c_3 ρ_s − r_s + c_3 ρ_s = 2 c_3 ρ_s,

and since it has been assumed that ρ_s → 0 as s → ∞, by the well-known theorem of analysis lim_{s→∞}(r_{s+1} − M r_{s+1}) = 0 and therefore lim_{s→∞}(2^{r_{s+1}} / 2^{M r_{s+1}}) = 1, i.e., the quantity 2^{r_s} / 2^{M r_s} is bounded: there is ε > 0 such that 2^{r_s}/(ε 2^{M r_s}) ≤ 1 for all s. Since 2^{r_s} = 2^{ent(r_s) + Δ_s} = ρ_s 2^{Δ_s}, where 0 ≤ Δ_s < 1, we obtain

1 ≥ ρ_s 2^{Δ_s}/(ε 2^{M r_s}) ≥ 0.5 ρ_s/(ε 2^{M r_s}).   (13)

From (12), (13), since δ > 0, we obtain

M_{s+1} γ_{s+1} ≤ γ_s − 0.5 δ ρ_s² / (ε 2^{M r_s}) a.s.   (14)

Taking the expectation of both sides gives M γ_{s+1} ≤ M γ_s − 0.5 δ M ρ_s² / (ε 2^{M r_s}). Passing to the exponential form of the record, we obtain

2^{M γ_{s+1}} ≤ 2^{M γ_s} · 2^{−0.5 δ M ρ_s² / (ε 2^{M r_s})}.   (15)

Let z = 0.5 δ M ρ_s² / (ε 2^{M r_s}); obviously z ≥ 0. Since by (8) ρ_s = 2^{ent(r_s)} ≤ 2^{ent(q_0)} = ρ̄ and in view of (13), we have 0 ≤ z ≤ δρ̄. Owing to the convexity of the function 2^{−z}, for β = (1 − 2^{−δρ̄})/(δρ̄) the inequality 1 − βz ≥ 2^{−z} holds on this interval. In this case from the relation (15) we obtain

2^{M γ_{s+1}} ≤ 2^{M γ_s}(1 − 0.5 β δ M ρ_s²/(ε 2^{M r_s})) ≤ 2^{M γ_s} − 0.5 β δ c_6 M ρ_s²/ε,   (16)

where c_6 = inf_{x∈X} 2^{f(x)} and γ_s = r_s + f(x^s). Summing the inequality (16) over i = 0, ..., s, we obtain

2^{M γ_{s+1}} ≤ 2^{M γ_0} − (0.5 β δ c_6/ε) Σ_{i=0}^s M ρ_i²,  0 ≤ 2^{M γ_{s+1}} = 2^{M r_{s+1} + M f(x^{s+1})},

and since r_{s+1} ≤ q_0 and c_4 ≤ f(x) ≤ c_5, the quantity 2^{M γ_{s+1}} is bounded. The constant c_0 = 0.5 β δ c_6/ε > 0, and we finally obtain

c_0 Σ_{i=0}^s M ρ_i² ≤ 2^{M γ_0} − 2^{M γ_{s+1}} < +∞,

which proves Lemma 2.

Corollary. ρ_s → 0 and r_s → −∞ as s → ∞ a.s.

Lemma 3. The relation lim_{s→∞}(ρ_{s−1}/ρ_s) = 1 a.s. holds.

Proof. From (8) it follows (since r_s → −∞ a.s.) that for almost every elementary event ω ∈ Ω there is a number S(ω) such that for s ≥ S(ω)

r_{s+1} = r_s + Δr_s = r_s − ⟨y^{s+1}, Δx^{s+1}⟩ − δρ_s.   (17)

From (7), Δx^{s+1} = x^{s+1} − x^s = π_X(x^s − ρ_s y^s) − x^s. Using the Schwarz inequality and the properties of the projection operator on the convex set X, we obtain

|⟨y^{s+1}, Δx^{s+1}⟩ + δρ_s| ≤ ‖y^{s+1}‖·‖π_X(x^s − ρ_s y^s) − x^s‖ + δρ_s ≤ c_2² ρ_s + δρ_s = c_3 ρ_s,   (18)

where c_3 > 0. Since ρ_s → 0 as s → ∞, for sufficiently large s we have |Δr_s| ≪ 1, i.e., r_s ∈ R, and

ρ_{s−1}/ρ_s = 2^{ent(r_{s−1}) − ent(r_{s−1} + Δr_{s−1})}.

The function ent(r) is non-decreasing and continuous at points r ∉ Z, therefore

2^{ent(r_{s−1}) − ent(r_{s−1} + c_3 ρ_{s−1})} ≤ ρ_{s−1}/ρ_s ≤ 2^{ent(r_{s−1}) − ent(r_{s−1} − c_3 ρ_{s−1})}.

From Lemma 2 it follows that ρ_{s−1} → 0 as s → ∞, and passing to the limit we obtain

1 ≤ lim_{s→∞}(ρ_{s−1}/ρ_s) ≤ 1 for r ∉ Z.   (19)

Since r_s = r_{s−1} + Δr_{s−1}, from (18) it follows that Δr_{s−1} → 0, and from Lemma 2 r_{s−1} → −∞ as s → ∞; the set Z ⊂ R is a set of measure zero, therefore lim_{s→∞}(ρ_{s−1}/ρ_s) = 1 a.s.

5—TO

The algorithm is understood as a rule of construction of sequence of the points (x5} belonging to some set X ^ Rn . It

is considered set some set of decisions X* c X . The algorithm is called as converging if it is carried out

limd(x5,X*) = 0, where d(x5,X*) = inf x5 -x .

5—to 5eX*

The theorem 3 [1]. Let following conditions are satisfied: A1. There is a compact set X such, that with probability 1

(x5(q)} CX;

A2. 3w : X — R - Continuous function;

A3. If there is such eventB cQ, asP(B) > 0 for all

Qe B there is a subsequence (x'k ( q)(q)} converging to a

pointx'(q) such, that d(x'(q),X*) > 0for any s> 0

there is a subsequence of indexes (vk (q)} such, that

x ^ U 8 (x'(a))

for

h(a) <T< vk (a)

and

lim w(xVk(q)(q)) < w(x'(q));

k—TO

A4. Function w accepts on set X* no more than counta-bly number of values;

A5.

xs (a) - xs+1(a)

^ 0 a.c. At s

If conditions A1-A5, d(xs (q), X ) — 0 a.c are satisfied.

The proof of the theorem 2 is based on the lemmas 1 proved above, 2, 3 and the theorem 3. This proof practically does not differ from the proof of the theorem 1 [1].

Estimate of the Rate of Convergence of the Computational Process

To estimate the rate of convergence it is necessary to study the asymptotic properties of the sequence of step multipliers ρ_s = 2^{ent(r_s)} from Theorem 2.

Lemma 4. If for the sequence {x^s} all conditions of Theorem 2 are satisfied and the function f(x) is twice continuously differentiable on an open set containing X, then

ρ_s = 2^{ent(r_s)} = a_s/((1 + s) δ ln 2) + o(1/(1 + s)) a.s.,

where o(1/(1 + s)) denotes a quantity infinitesimal in comparison with 1/(1 + s), and a_s ∈ (0.5; 2).

Proof. From the relation (8) and taking into account that r_s → −∞, for sufficiently large s we have ρ_s = 2^{ent(r_s)} = 2^{r_s − Δ_s}, where 0 ≤ Δ_s < 1, and

2^{r_s} = 2^{ent(r_s) + Δ_s} = 2^{ent(r_{s−1}) + Δ_{s−1} − ⟨y^s, Δx^s⟩ − δρ_{s−1}}.

Denote σ_s = 2^{r_s} = 2^{r_{s−1}} · 2^{−⟨y^s, Δx^s⟩ − δρ_{s−1}} and introduce the quantity T_s = log_2[(1 + s)σ_s]. Then, since T_{s−1} = log_2(s σ_{s−1}), log_2 σ_{s−1} = T_{s−1} − log_2 s, and 2^{T_{s−1}} = s σ_{s−1},

T_s = T_{s−1} + (2^{−Δ_{s−1}}/(s ln 2))·(2^{Δ_{s−1}} − δ ln 2 · 2^{T_{s−1}}) + log_2(1 + s) − log_2 s − 1/(s ln 2) − ⟨y^s, Δx^s⟩.

Considering that f(x) is a twice differentiable function, from (4) we have y^s = ζ^s + ∇f(x^s), and then

T_s = T_{s−1} + (2^{−Δ_{s−1}}/(s ln 2))·[(2^{Δ_{s−1}} − M 2^{Δ_{s−1}}) + M 2^{Δ_{s−1}} − δ ln 2 · 2^{T_{s−1}}] + log_2(1 + 1/s) − 1/(s ln 2) − ⟨ζ^s, Δx^s⟩ − ⟨∇f(x^s), Δx^s⟩.

The series Σ_{s=1}^∞ [log_2(1 + 1/s) − 1/(s ln 2)] converges. From Lemma 2 and Doob's theorem [11], taking into account that Δx^s = x^s − x^{s−1} = π_X(x^{s−1} − ρ_{s−1} y^{s−1}) − x^{s−1}, the convergence of the martingale series Σ_{s=1}^∞ ⟨ζ^s, Δx^s⟩ a.s. follows. The function f(x) is twice continuously differentiable, therefore

⟨∇f(x^s), Δx^s⟩ = f(x^s) − f(x^{s−1}) + γ_s ‖Δx^s‖²,

where γ_s is uniformly bounded for all s. Summing this equality term by term m times, we obtain

Σ_{s=1}^m ⟨∇f(x^s), Δx^s⟩ = f(x^m) − f(x^0) + Σ_{s=1}^m γ_s ‖Δx^s‖².

As was shown in the proof of Theorem 2, the function f(x) is bounded on the compact set X, and the series Σ γ_s ‖Δx^s‖² converges a.s. owing to Lemma 2; consequently the series Σ_{s=1}^∞ ⟨∇f(x^s), Δx^s⟩ converges a.s. The expression for T_s then takes the form

T_s = T_{s−1} + (2^{−Δ_{s−1}}/(s ln 2))·[(2^{Δ_{s−1}} − M 2^{Δ_{s−1}}) + M 2^{Δ_{s−1}} − δ ln 2 · 2^{T_{s−1}}] + τ_s,

where the series Σ_{s=1}^∞ τ_s converges a.s., 2^{Δ_{s−1}} ∈ [1; 2], 2^{−Δ_{s−1}} ∈ [0.5; 1], and thus the formula for T_s coincides with the Robbins–Monro algorithm for solving the equation b − δ ln 2 · 2^T = 0, where b = M 2^{Δ_{s−1}}.

By the results on convergence of stochastic approximation algorithms [12], T_s → log_2[b/(δ ln 2)] as s → ∞ a.s. From here we obtain that (1 + s)σ_s → b/(δ ln 2) as s → ∞ a.s., and consequently

σ_s = 2^{r_s} = ρ_s 2^{Δ_s} = b/((1 + s) δ ln 2) + o(1/(1 + s)) a.s.

Thus, with a_s = b · 2^{−Δ_s}, we have

ρ_s = a_s/((1 + s) δ ln 2) + o(1/(1 + s)) a.s., where a_s ∈ (0.5; 2).

In estimating the rate of convergence of recurrent stochastic algorithms, the criterion usually adopted is the value M‖x* − x^s‖², where x* is a minimum point of the function f(x) on the set X [10].

Theorem 4. If for the sequence {x^s} all conditions of Theorem 2 are satisfied, the function f(x) is twice continuously differentiable on an open set containing X,

M_s ‖ζ^s‖² ≤ σ², s = 0, 1, ...,  f(x) ≥ f(x*) + B ‖x* − x‖²,

where B > 0 and x* is a minimum point of f(x) on X, then for δ_0 = B/(1.25 ln 2)

M‖x* − x^{s+1}‖² ≤ 1.56 σ²/(B²(1 + s)) + o(1/(1 + s)) a.s.

Proof. Introduce the function w(x^s) = ‖x* − x^s‖². From the conditions of Theorem 2, ζ^s = y^s − f_x(x^s). As follows from (18),

g_{s−1} = 2^{ent(r_{s−1} − c_3 ρ_{s−1})} ≤ ρ_s = 2^{ent(r_{s−1} − ⟨y^s, Δx^s⟩ − δρ_{s−1})} ≤ 2^{ent(r_{s−1} + c_3 ρ_{s−1})} = h_{s−1}.   (20)

From Lemma 4 we have ρ_s = a_s/((1 + s) δ ln 2) + o(1/(1 + s)) a.s., where a_s ∈ (0.5; 2). Hence, from some number s on, for any ε > 0,

t_s = (a_s − ε)/((1 + s) δ ln 2) ≤ ρ_s ≤ (a_s + ε)/((1 + s) δ ln 2) = l_s.   (21)

Further, using the Schwarz inequality, the property of the projection operator on a convex set [5, 6], and the definition of the SQG, we obtain

w(x^{s+1}) = ‖x* − x^{s+1}‖² ≤ ‖x* − x^s + ρ_s y^s‖²,

w(x^{s+1}) ≤ w(x^s) + 2ρ_s ⟨y^s, x* − x^s⟩ + ρ_s² ‖y^s‖² ≤ w(x^s) + 2ρ_s (f* − f(x^s)) + 2ρ_s ⟨ζ^s, x* − x^s⟩ − 2g_{s−1} ⟨ζ^s, x* − x^s⟩ + 2g_{s−1} ⟨ζ^s, x* − x^s⟩ + ρ_s² ‖y^s‖²,

where f* = f(x*); here the property of the gradient of a twice continuously differentiable function is used, since y^s = f_x(x^s) + ζ^s. Thus, using the relations (20), (21), we obtain

w(x^{s+1}) ≤ w(x^s) + 2t_s (f* − f(x^s)) + 2(h_{s−1} − g_{s−1}) ‖ζ^s‖ ‖x* − x^s‖ + 2g_{s−1} ⟨ζ^s, x* − x^s⟩ + l_s² ‖y^s‖².   (22)

Consider the summand

2(h_{s−1} − g_{s−1}) ‖ζ^s‖ ‖x* − x^s‖ = 2(2^{ent(r_{s−1} + c_3 ρ_{s−1})} − 2^{ent(r_{s−1} − c_3 ρ_{s−1})}) ‖ζ^s‖ ‖x* − x^s‖.

Here ‖ζ^s‖ is bounded, ‖x* − x^s‖ → 0 by Theorem 2, and ρ_{s−1} → 0 and r_{s−1} → −∞ a.s. by the corollary of Lemma 2. For sufficiently large s the length of the interval [r_{s−1} − c_3 ρ_{s−1}, r_{s−1} + c_3 ρ_{s−1}] is ≪ 1, and then this interval can contain at most one integer. Considering the properties of the function ent, we have

(h_{s−1} − g_{s−1}) ≤ (2^{ent(r_{s−1} + c_3 ρ_{s−1}) − ent(r_{s−1} − c_3 ρ_{s−1})} − 1) · 2^{ent(r_{s−1} − c_3 ρ_{s−1})} ≤ 2^{ent(r_{s−1} − c_3 ρ_{s−1})} → 0,

since r_{s−1} → −∞ a.s. The function ent(y) is constant on the intervals where y ∉ Z, and consequently, as s → ∞, (h_{s−1} − g_{s−1}) = 0 a.s., except for a set of measure zero (Z ⊂ R).

Thus we obtain

w(x^{s+1}) ≤ w(x^s) + 2t_s (f* − f(x^s)) + 2g_{s−1} ⟨ζ^s, x* − x^s⟩ + l_s² ‖y^s‖² a.s.

By the condition of the theorem, f(x) ≥ f(x*) + B ‖x* − x‖², and with (10) we obtain

w(x^{s+1}) ≤ w(x^s) − (2(a_s − ε)B/((1 + s) δ ln 2)) w(x^s) + ((a_s + ε)²/((1 + s)² δ² ln² 2)) ‖y^s‖² + 2g_{s−1} ⟨ζ^s, x* − x^s⟩.   (23)

The quantity ε > 0 can be arbitrarily small; therefore, considering the corresponding part of the inequality (23) as a convex function of a_s ∈ [0.5; 2], it attains its maximum at a_s = 0.5 or a_s = 2.

Introduce the notation: k = 1 + s, u_k = w(x^s), u_{k+1} = w(x^{s+1}), G_k = 2g_{s−1}⟨ζ^s, x* − x^s⟩, and

C_k = 2 a_s B/(δ ln 2),  d_k = a_s² ‖y^s‖²/(δ² ln² 2).

At a_s = 0.5 this gives C_k = B/(δ ln 2) and d_k = 0.25 ‖y^s‖²/(δ² ln² 2); at a_s = 2 it gives C_k = 4B/(δ ln 2) and d_k = 4 ‖y^s‖²/(δ² ln² 2). The values C_k and d_k are chosen from the condition max(−(C_k/k) u_k + d_k/k²). From (23) we obtain

u_{k+1} ≤ u_k − (C_k/k) u_k + d_k/k² + G_k.   (24)

Take δ = B/(1.25 ln 2) and v_k = k u_k − (1.25)² ‖y^k‖²/B². Then

v_{k+1} = (k + 1) u_{k+1} − (1.25)² ‖y^{k+1}‖²/B² ≤ v_k (1 − (C_k − 1)/k − C_k/k²) + d_k (1 + 1/k)/k − (1.25)² ‖y^k‖² (C_k − 1)/(B² k) + k(1 + 1/k) G_k.

Consider the summand

d_k/k − (1.25)² ‖y^k‖² (C_k − 1)/(B² k).   (25)

At a_s = 0.5 we have δ ln 2 = B/1.25, so C_k = B/(δ ln 2) = 1.25 and d_k = 0.25 (1.25)² ‖y^k‖²/B², and the value of (25) equals

0.25 (1.25)² ‖y^k‖²/(B² k) − 0.25 (1.25)² ‖y^k‖²/(B² k) = 0.

Similarly, at a_s = 2, C_k = 4B/(δ ln 2) = 5 and d_k = 4 (1.25)² ‖y^k‖²/B², and the value of the expression (25) again equals

4 (1.25)² ‖y^k‖²/(B² k) − 4 (1.25)² ‖y^k‖²/(B² k) = 0.

Therefore from (24) we obtain the relation

v_{k+1} ≤ v_k (1 − (C_k − 1)/k − C_k/k²) + d_k (1 + 1/k)/k² + k(1 + 1/k) G_k ≤ v_k (1 − (C_k − 1)/k) + 2d_k/k² + 2k G_k.   (26)

Considering that min C_k = 1.25 > 1, that G_k = 2g_{s−1}⟨ζ^s, x* − x^s⟩, and that, since x^s = π_X(x^{s−1} − ρ_{s−1} y^{s−1}), M_s ζ^s = 0 by the definition of the SQG, we take first the conditional and then the unconditional expectation of (26) and obtain

M v_{k+1} ≤ M v_k (1 − 0.25/k − 1.25/k²) + 4(1.25)² M‖y^k‖²/(B² k²) ≤ M v_k (1 − 0.25/k) + 4(1.25)² σ²/(B² k²).

By the lemma on recurrent sequences [10], lim_{k→∞} M v_k ≤ 0. Thus, taking into account the definition of v_k, we have

M u_{k+1} ≤ (1.25)² σ²/(k B²) + o(1/k) = 1.56 σ²/(B²(1 + s)) + o(1/(1 + s)),

which proves the theorem.

Corollary of Theorem 4. From (10) we have

M‖x* − x^{s+1}‖² ≤ 1.56 n c σ̃/(B²(1 + s)) + o(1/(1 + s)).

If the minimum of the function f(x) lies inside the set X, or if the problem of unconditional optimization is solved, then ∇f(x*) = 0 and, as s → ∞, ‖∇f(x^s)‖ → 0 a.s.; i.e., for sufficiently large s, ‖∇f(x^s)‖ < ε_1, where ε_1 > 0 is a small positive number. In studies of stochastic optimization, the restriction on the variance of the components of the SQG [10] is usually set in the form

M(ξ_k^s − f_k(x^s))² ≤ σ̃², k = 1, ..., n, s = 0, 1, ...,

where n is the dimension of the space and f_k(x^s) = ∇_k f(x^s). In a neighborhood of a stationary point the value of ∇f(x^s) is small for a differentiable function; therefore, in asymptotic estimates we may assume that M[(ξ_k^s)²] ≤ σ̃² from some s on. From (5) we have

(y_k^s)² = c² u+(|ξ_k^s| − α_s), k = 1, ..., n.

Further, taking the expectation, we obtain M[(y_k^s)²] = c M|ξ_k^s| = 2^l M|ξ_k^s|. Using Jensen's inequality [11], we obtain

σ̃² ≥ M[(ξ_k^s)²] ≥ [M|ξ_k^s|]²,

hence c σ̃ ≥ M[(y_k^s)²]. By the definition of the norm, M‖y^s‖² = Σ_{k=1}^n M(y_k^s)² ≤ n c σ̃, so we put

σ² = n c σ̃.   (27)

By the condition of Theorem 2, |ξ_k^s| ≤ c a.s., k = 1, ..., n, s = 0, 1, ...; using the three-sigma rule one may take c = 3σ̃, and then

σ² = 3 n σ̃².   (28)

Carrying out similar calculations for the method of Theorem 1, taking into account that the SQG ξ^s is used instead of y^s, that a_s = 1, and that n σ̃² ≥ M‖ξ^s‖², we obtain

M‖x* − x^{s+1}‖² ≤ n σ̃²/(δ² ln² 2 (2B/(δ ln 2) − 1)(1 + s)) + o(1/(1 + s)).   (29)

From Theorem 4, δ = B/(1.25 ln 2), and then for the initial method (the analogue) the estimate becomes

M‖x* − x^{s+1}‖² ≤ n σ̃²/(B²(1 + s)) + o(1/(1 + s)).

For the investigated method, by Theorem 4 we obtain the estimate

M‖x* − x^{s+1}‖² ≤ 1.56 n c σ̃/(B²(1 + s)) + o(1/(1 + s)) = 4.7 n σ̃²/(B²(1 + s)) + o(1/(1 + s)).   (30)

Thus, roughly, the investigated method and the analogue method can be compared by the number of iteration steps. Let S be the number of iterations for the modernized algorithm and N the number of iterations for the initial algorithm needed to reach the prescribed accuracy of calculations. Then

S/N = 1.56 c/σ̃.   (31)

Or, when the relation c = 3σ̃ holds, S/N = 4.7 ≈ 5.
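The constants behind the comparison (29)–(31) are easy to recompute (a small Python check; B and σ̃ are set to 1 since they cancel out of the ratio):

```python
import math

B = 1.0                                  # B and sigma~ cancel out of the ratio
delta = B / (1.25 * math.log(2))         # the step choice delta = B/(1.25 ln 2)

# Constant of the analogue method's estimate (29):
denom = delta**2 * math.log(2)**2 * (2 * B / (delta * math.log(2)) - 1)
analogue = 1.0 / denom                   # = 1/0.96, close to 1/B**2

# Constant of the modified method's estimate (30), with c = 3*sigma~:
modified = 1.56 * 3                      # = 4.68, rounded to 4.7 in the text

ratio = modified / analogue              # iteration-count ratio S/N, about 4.5
```

So the modified method needs roughly five times as many iterations, which is the price paid for each iteration being far cheaper on the stochastic processor.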

Conclusion

Although the modified algorithm considerably reduces the time needed for one iteration while increasing the total number of iterations, it nevertheless provides an overall gain in running time. The algorithm can be used to solve stochastic optimization problems that frequently occur in practice: the stochastic resource-allocation problem, optimization of the parameters of technological processes, identification problems, signal processing in the presence of interference, and so on. An interesting application of the algorithm is its use for the primary problem of steganalysis, namely detecting the fact of hidden information transfer in multimedia data.

References

1. Uryasev S.P. Adaptive Algorithms of Stochastic Optimization and Game Theory [Adaptivnye algoritmy stokhasticheskoy optimizatsii i teorii igr]. Moscow, Nauka, 1990. 184 p.
2. Robbins H., Monro S. A Stochastic Approximation Method // Ann. Math. Statistics, 1951, vol. 22, pp. 400–407.
3. Kiefer J., Wolfowitz J. Stochastic Estimation of the Maximum of a Regression Function // Ann. Math. Statistics, 1952, vol. 23, pp. 462–466.
4. Fedorov R.F., Yakovlev V.V., Dobris G.V. Stochastic Information Converters [Stokhasticheskie preobrazovateli informatsii]. Leningrad, Mashinostroenie, 1978. 303 p.
5. Kantorovich L.V., Akilov G.P. Functional Analysis [Funktsional'nyy analiz]. Moscow, Nauka, 1977. 741 p.
6. Karmanov V.G. Mathematical Programming [Matematicheskoe programmirovanie]. Moscow, Nauka, 1986. 286 p.
7. Ermakov S.M., Zhiglyavsky A.A. The Mathematical Theory of Optimal Experiment [Matematicheskaya teoriya optimal'nogo eksperimenta]. Moscow, Nauka, 1987. 318 p.
8. Borovkov A.A. Probability Theory [Teoriya veroyatnostey]. Moscow, Nauka, 1986. 432 p.
9. Gnedenko B.V. A Course in Probability Theory [Kurs teorii veroyatnostey]. Moscow, Fizmatgiz, 1961. 406 p.
10. Polyak B.T. Introduction to Optimization [Vvedenie v optimizatsiyu]. Moscow, Nauka, 1983. 384 p.
11. Shiryaev A.N. Probability [Veroyatnost']. Moscow, Nauka, 1980. 575 p.
12. Nevelson M.B., Khasminsky R.Z. Stochastic Approximation and Recurrent Estimation [Stokhasticheskaya approksimatsiya i rekurrentnoe otsenivanie]. Moscow, Nauka, 1972. 304 p.
