Technical and economic indices of functioning distributed computer systems and realizability function of solving complex problems

Khoroshevsky V.G.; Pavsky K.V.; Nikitin D.S.

TOMSK STATE UNIVERSITY JOURNAL (VESTNIK TOMSKOGO GOSUDARSTVENNOGO UNIVERSITETA)

2008 Control, Computer Engineering and Informatics No. 2(3)

DESIGN AND DIAGNOSTICS OF COMPUTER SYSTEMS

UDC 681.324

The realizability function for solving labor-consuming problems on distributed computer systems (CSs) is estimated. A set of integral equations for calculating the realizability function of problem solution on distributed CSs is derived. A parallel algorithm for computing it is described.

Keywords: Distributed computer systems, technical and economic indices, realizability function.

A distributed computer system (CS) is an association of spatially separated concentrated CSs based on the following principles [1]:

1) parallelism of functioning of the concentrated CSs (i.e., the ability of several or all concentrated systems to solve one complex problem, presented as a parallel program, jointly and simultaneously);

2) programmable structure (i.e., the ability to configure the communication network between the concentrated CSs automatically);

3) homogeneity of structure (i.e., software compatibility of the different concentrated CSs and uniformity of the elementary machines within each of them).

The concept of distributed computer systems makes it possible to build robust data processing facilities with theoretically unlimited performance.

The quality of CS functioning is estimated by a system of indices of performance, reliability, robustness, realizability of problem solution, and technical and economic efficiency [1].

1. Technical and economic efficiency analysis of large scale distributed computer systems functioning

The problem is stated as follows. Given a distributed CS containing N elementary machines (EMs) and a restoring system containing m restoring devices (RDs), calculate the expenses and the income that arise during operation of the computer system.

The method considered here is based on stochastic models describing the process of CS functioning [1]. These models lead to simple design formulas for the coordinates of the vector-functions $\Gamma(t)$ and $D(t)$ of expenses and income, respectively.

Let $\lambda$ be the failure rate of one EM and $\mu$ the rate at which one RD restores a failed EM; let $c_1$ and $c_2$ be the costs per unit time of operating one EM and of maintaining one RD, respectively.

Let $K_i(t)$ be the average number of working machines in the CS and $M_i(t)$ the average number of busy RDs in the restoring system, where i is the number of working EMs at the initial instant. Let $\nu^{-1}$ be the time of system reconfiguration. Then the coordinates $\Gamma_i(t)$ of the expense vector-function satisfy the equations:

$$\frac{d}{dt}\Gamma_i(t) = c_1\,[N - K_i(t)] + c_2\,[m - M_i(t)], \qquad \Gamma_i(0) = 0, \quad i \in E. \tag{1}$$

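The solution of system (1) is given by the functions

$$\Gamma_i(t) = -\rho_i + \gamma t + \rho_i\,\delta(t),$$

where, for $N\lambda \le (\lambda + \mu)$,

$$\rho_i = \frac{i\lambda - (N - i)\mu}{(\lambda + \mu)^2}\,(c_1 - c_2), \qquad \gamma = \frac{N\lambda}{\lambda + \mu}\,(c_1 - c_2) + m c_2, \qquad \delta(t) = e^{-(\lambda + \mu)t}, \quad i \in E_1,$$

and, for $N\lambda > (\lambda + \mu)$,

$$\rho_i = \frac{i\lambda - m\mu}{\lambda^2}\,c_1, \qquad \gamma = \frac{N\lambda - m\mu}{\lambda}\,c_1, \qquad \delta(t) = e^{-\lambda t}, \quad i \in E_2,$$

where $E_1 = \{N-m, N-m+1, \ldots, N\}$ and $E_2 = \{0, 1, \ldots, N-m-1\}$.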

In the stationary regime, the vector-function $\Gamma(t)$ that solves the set of equations (1) can be written as

$$\Gamma_i(t) = \gamma t.$$

This follows from the fact that, for large t, $|\rho_i| \ll \gamma t$ and $\delta(t) \to 0$.
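The closed-form solution of (1) can be checked numerically. The following is a minimal Python sketch, not the authors' code: the function name `expenses` and the boolean flag `all_failed_under_repair` (which selects between the two regimes, $i \in E_1$ or $i \in E_2$) are hypothetical, and the caller is assumed to pick the regime from the conditions stated above.

```python
import math


def expenses(t, i, N, m, lam, mu, c1, c2, all_failed_under_repair=True):
    """Expected expenses Gamma_i(t) = -rho_i + gamma*t + rho_i*delta(t).

    lam, mu: failure rate of one EM and restoration rate of one RD.
    c1, c2:  costs per unit time for one EM and one RD.
    all_failed_under_repair: hypothetical flag; True selects the formulas
    for i in E1, False the formulas for i in E2.
    """
    if all_failed_under_repair:
        s = lam + mu
        rho = (i * lam - (N - i) * mu) / s ** 2 * (c1 - c2)
        gamma = N * lam / s * (c1 - c2) + m * c2
        delta = math.exp(-s * t)
    else:
        rho = (i * lam - m * mu) / lam ** 2 * c1
        gamma = (N * lam - m * mu) / lam * c1
        delta = math.exp(-lam * t)
    return -rho + gamma * t + rho * delta


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    for t in (0.0, 10.0, 100.0, 1000.0):
        print(t, expenses(t, i=30, N=32, m=4, lam=0.001, mu=0.1, c1=5.0, c2=1.0))
```

A useful sanity check is the initial slope: at t = 0 the derivative of the returned value should equal the right-hand side of (1) with $K_i(0) = i$, which the sketch reproduces by finite differences.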

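With the reconfiguration rate $\nu$ taken into account, the values of $\gamma$ are defined as follows: for $N\lambda \le (\lambda + \mu)$,

$$\gamma = \frac{c_1 - c_2}{\lambda\mu + \lambda\nu + \mu\nu}\,N\lambda\nu + c_2 m,$$

and for $N\lambda > (\lambda + \mu)$,

$$\gamma = c_1\left[N - m\mu\left(\frac{1}{\lambda} + \frac{1}{\nu}\right)\right].$$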

The coordinates of the income vector-function $D(t)$ satisfy the equations:

$$\frac{d}{dt}D_i(t) = c_1 K_i(t) - c_2' m - c_3\mu M_i(t) - c_4 N, \qquad D_i(0) = 0, \quad i \in E. \tag{2}$$

Here $c_2'$ is the maintenance cost of one restoring device per unit time, $c_3$ is the cost of the spare parts used in a single restoration of one failed EM, and $c_4$ is the fixed cost of operating one EM per unit time.

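The solutions of system (2) are the functions

$$D_i(t) = d_i + g t - d_i\,\delta(t),$$

where, for $N\lambda \le (\lambda + \mu)$,

$$d_i = \frac{i\lambda - (N - i)\mu}{(\lambda + \mu)^2}\,(c_1 + c_3\mu), \quad i \in E_1,$$

$$g = \frac{N\mu}{\lambda + \mu}\,(c_1 - c_3\lambda) - (c_2' m + c_4 N),$$

and, for $N\lambda > (\lambda + \mu)$,

$$d_i = \frac{i\lambda - m\mu}{\lambda^2}\,c_1, \quad i \in E_2,$$

$$g = \frac{m\mu}{\lambda}\,(c_1 - c_3\lambda) - (c_2' m + c_4 N),$$

where $E_1 = \{N-m, N-m+1, \ldots, N\}$ and $E_2 = \{0, 1, \ldots, N-m-1\}$.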
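As a cross-check of the first regime ($i \in E_1$), the closed form of $D_i(t)$ can be compared with a direct numerical integration of the right-hand side of (2). The Python sketch below is illustrative only. It assumes that $K_i(t)$ has the form given in (5) and that in this regime every failed machine is under repair, so that $M_i(t) = N - K_i(t)$; the function names and parameter values are hypothetical.

```python
import math


def working_machines(t, i, N, lam, mu):
    """Mean number of working EMs, in the form of formula (5)."""
    s = lam + mu
    return N * mu / s + (i * lam - (N - i) * mu) / s * math.exp(-s * t)


def income_closed_form(t, i, N, m, lam, mu, c1, c2p, c3, c4):
    """D_i(t) = d_i + g*t - d_i*delta(t) for the first regime (i in E1)."""
    s = lam + mu
    d = (i * lam - (N - i) * mu) / s ** 2 * (c1 + c3 * mu)
    g = N * mu / s * (c1 - c3 * lam) - (c2p * m + c4 * N)
    return d + g * t - d * math.exp(-s * t)


def income_numeric(t, i, N, m, lam, mu, c1, c2p, c3, c4, steps=20000):
    """Trapezoidal integration of the right-hand side of (2) from 0 to t."""
    def rate(x):
        K = working_machines(x, i, N, lam, mu)
        M = N - K  # assumption: all failed EMs are being repaired
        return c1 * K - c2p * m - c3 * mu * M - c4 * N

    h = t / steps
    total = 0.5 * (rate(0.0) + rate(t))
    for k in range(1, steps):
        total += rate(k * h)
    return total * h


if __name__ == "__main__":
    params = dict(i=32, N=32, m=4, lam=0.001, mu=0.1,
                  c1=5.0, c2p=1.0, c3=2.0, c4=0.5)
    print(income_closed_form(50.0, **params))
    print(income_numeric(50.0, **params))
```

The two printed values should agree to within the quadrature error, which supports the term-by-term form of $d_i$ and $g$ under the stated assumption on $M_i(t)$.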

Let us now take into account, in the income vector-function $D(t)$, the index $\nu^{-1}$, which is the time of system reconfiguration, and the value $c_5$, which characterizes the cost of sending data over communication channels.

The vector-function $D_i(t)$ is then described by [1]:

$$\frac{d}{dt}D_i(t) = c_1 K_i(t) - c_2' m - c_3\mu M_i(t) - c_4 N - c_5, \qquad D_i(0) = 0, \quad i \in E. \tag{3}$$

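In the stationary regime, the solution of the set of equations (3) for the profit vector-function is given by the functions

$$D_i(t) = g t,$$

where, for $N\lambda \le (\lambda + \mu)$,

$$g = \frac{N\mu}{\lambda\nu + \lambda\mu + \mu\nu}\,\bigl(c_1(\nu + \mu) - c_3\nu\lambda\bigr) - (c_2' m + c_4 N + c_5),$$

and, for $N\lambda > (\lambda + \mu)$,

$$g = c_1 m\mu\left(\frac{\lambda + \nu}{\lambda\nu}\right) - (c_2' m + c_3 m\mu + c_4 N + c_5).$$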

Thus, with this fairly simple method of calculating the vector-functions $\Gamma(t)$ and $D(t)$, one can relate the reliability indices of a computer system to its costs.

2. Realizability of solving problems by CS: the continuous approach

The theory of realizability studies the process of solving problems on computer systems that are not absolutely reliable, in three regimes. In the mono-program regime, the function $\Phi(t, i)$ of potential realizability of solving problems by a CS is used as a quantitative characteristic. This function is the probability that a CS which began functioning with i, $0 \le i \le N$, working elementary machines (EMs) solves, within the time $t > 0$, a problem presented as an adaptive parallel program. It is clear that

$$\Phi(t, i) = 1 - \exp\left[-\beta\int_0^t K(\tau, i)\,d\tau\right], \tag{4}$$

where $\beta^{-1}$ is the average time of solving the problem by one working EM and $K(\tau, i)$ is the mean number of working EMs at the moment $\tau \ge 0$. Let N be the number of EMs constituting the CS, m the number of repairing devices, and $\lambda^{-1}$ and $\mu^{-1}$ the mean time between failures of an EM and the mean time to repair a faulty EM by one device, respectively. Then, under the condition that over the same time interval the average number of failures in the CS does not exceed the average number of repairs that all m devices can perform, i.e., under the condition $N\lambda \le m\mu$, one obtains:

$$K(\tau, i) = \frac{N\mu}{\lambda + \mu} + \frac{i\lambda - (N - i)\mu}{\lambda + \mu}\,e^{-(\lambda + \mu)\tau}, \qquad (N - m) \le i \le N. \tag{5}$$

The function $\Phi(t, i)$ should be calculated when a detailed analysis of the realizability of solving problems by a CS is required, in particular when the time for the system to reach the stationary regime must be estimated. Such a situation occurs, for instance, when a specialized CS is being created. When a CS is operated commercially, such a calculation will hardly ever be necessary. Numerical analysis shows that, with the reliability parameters of modern microcomputer devices, the stationary regime is reached in about 10 hours. Therefore, one may assume that a general-purpose CS functions, as a rule, in the stationary regime. Consequently, to estimate the realizability of solving their problems, the users of a CS need only the simplest formula:

$$\Phi(t, i) = 1 - \exp\left[-\frac{\beta N\mu t}{\lambda + \mu}\right]. \tag{6}$$
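To see how close the stationary formula (6) is to the full expression (4) with (5), both can be evaluated side by side. The following Python sketch integrates (5) analytically and is meant only as an illustration; the function names and the parameter values are hypothetical and not taken from the paper.

```python
import math


def realizability(t, i, N, lam, mu, beta):
    """Formula (4) with K(tau, i) from (5), integrated in closed form."""
    s = lam + mu
    a = N * mu / s                      # stationary mean number of working EMs
    b = (i * lam - (N - i) * mu) / s    # amplitude of the transient term
    integral = a * t + b / s * (1.0 - math.exp(-s * t))
    return 1.0 - math.exp(-beta * integral)


def realizability_stationary(t, N, lam, mu, beta):
    """Formula (6): stationary approximation, independent of i."""
    return 1.0 - math.exp(-beta * N * mu * t / (lam + mu))


if __name__ == "__main__":
    N, lam, mu, beta = 32, 0.001, 0.1, 0.01
    for t in (1.0, 5.0, 20.0, 100.0):
        full = realizability(t, N, N, lam, mu, beta)
        approx = realizability_stationary(t, N, lam, mu, beta)
        print(f"t={t:6.1f}  (4)-(5): {full:.6f}  (6): {approx:.6f}")
```

When the system starts with all N machines working, the transient term in (5) is non-negative, so the value from (4) is never below the stationary estimate (6), and the two converge as t grows.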

It should be noted that one of the multiprogram regimes, the solution of a set of problems, reduces without loss of generality to the mono-program regime. Indeed, the realizability of solving any problem from the set depends on the parameters of the subsystem allocated to it, and the analysis of the efficiency of that subsystem does not differ from the analysis of the whole system solving a single problem.

The analysis of a CS functioning in the regime of serving a flow of problems reduces, in a simplified setting, to the following problem. Let a simplest (Poisson) flow of simple problems (represented by sequential programs) arrive at the CS with intensity $\alpha$. It is required to calculate the mean number $A(t)$ of problems in the system at the moment $t \ge 0$. Let the intensity $\alpha$ be such that $\alpha < N\beta$, i.e., the system always contains operating and vacant EMs to solve the arriving problems. Then

$$A(0) = j, \quad j \in \{0, 1, \ldots, i\},$$

$$A(t + \Delta t) = A(t) + \alpha\Delta t - A(t)\beta\Delta t + o(\Delta t),$$

$$\frac{d}{dt}A(t) = \alpha - \beta A(t),$$

$$A(t) = \frac{\alpha}{\beta} + \left(j - \frac{\alpha}{\beta}\right)\exp(-\beta t).$$

If the inequality $\alpha < N\beta$ does not hold, $A(t)$ is calculated in a similar way. Formulas (4)-(6) allow a simple express analysis of the functioning of CSs that are not absolutely reliable in the basic regimes of information processing.
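The step from the difference relation to the closed form can be illustrated numerically. The Python sketch below iterates the difference relation with a small time step (dropping the o(Δt) term) and compares the result with the exponential solution. The function names are hypothetical, and `alpha` and `beta` stand for the arrival and service intensities.

```python
import math


def mean_problems(t, j, alpha, beta):
    """Closed form: A(t) = alpha/beta + (j - alpha/beta) * exp(-beta*t)."""
    return alpha / beta + (j - alpha / beta) * math.exp(-beta * t)


def mean_problems_stepwise(t, j, alpha, beta, steps=10000):
    """Iterate A(t + dt) = A(t) + alpha*dt - A(t)*beta*dt from A(0) = j."""
    dt = t / steps
    a = float(j)
    for _ in range(steps):
        a += alpha * dt - a * beta * dt
    return a


if __name__ == "__main__":
    # Illustrative values only.
    alpha, beta, j = 3.0, 0.5, 0
    for t in (0.5, 2.0, 10.0):
        print(t, mean_problems(t, j, alpha, beta),
              mean_problems_stepwise(t, j, alpha, beta))
```

For large t both values approach α/β, the stationary mean number of problems in the system.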

3. Realizability function of parallel problem solving

We suppose that a labor-consuming problem is represented by an adaptable parallel program [1] and that it can be solved if the system has at least one working EM.

A model is proposed in [2] for calculating the probability of solving a problem in parallel on a distributed CS during the time $f(T)$ with not more than l failures and restorations of EMs, where T is the time of solving the problem on a single EM.

Let us introduce the notation:

$p_n(f(t), l, i)$ is the probability that during the time $f(t)$ the CS solves a part $Q \in (0, 1]$ of the problem, provided that there were n working EMs at the instant $f(t)$ and i working EMs at the initial instant, and that during the time $f(t)$ the CS had l failures and restorations of EMs, $t \in [0, T]$;

$P_{rec}(n, f(t))$ is the probability of restoration of one failed EM during the time $f(t)$ given $(N - n)$ failed EMs and $m \le N$ restoration devices;

$P_{flt}(n, f(t))$ is the probability of failure of any one of the n working EMs during the time $f(t)$;

$V_{flt}(n, f(t))$ is the probability of failure-free functioning of a CS with n working EMs during the time $f(t)$;

$V_{rec}(n, f(t))$ is the probability that no failed EM is restored during the time $f(t)$ given $(N - n)$ failed EMs and $m \le N$ restoration devices;

$W_n(f(t)) = V_{rec}(n, f(t))\,V_{flt}(n, f(t))$ is the probability that no failures or restorations happen in a CS with n EMs during the time interval $f(t)$.

Let $(l - 1)$ failures have happened in the time interval $[0, f(\tau))$, and let a failure occur at the instant $f(\tau)$ ($\tau < t$); then

$$p_n(f(t), l, i) = p_{n+1}(f(\tau), l-1, i)\;p_n\bigl(f(t) - f(\tau) + \tau_{tun}, 0, i\bigr)\;d\bigl(P_{flt}(n+1, f(\tau))\,V_{rec}(n+1, f(\tau))\bigr), \quad 1 \le n < N.$$

Further, assuming $(l - 1)$ failures in the time interval $[0, f(\tau))$ and a failure or restoration of an EM at the instant $f(\tau)$ ($\tau < t$), and also taking into account that

$$p_n\bigl(f(t) - f(\tau) + \tau_{tun}, 0, i\bigr) = W_n\bigl((t - \tau)k_n^{-1}\bigr)$$

(where $k_n$ is the acceleration of problem solution on n EMs, which, for simplicity, is assumed to be independent of the part of the problem solved), we obtain

$$p_n(f(t), l, i) = p_{n+1}(f(\tau), l-1, i)\,W_n\bigl((t - \tau)k_n^{-1}\bigr)\,d\bigl(P_{flt}(n+1, f(\tau))\,V_{rec}(n+1, f(\tau))\bigr) + p_{n-1}(f(\tau), l-1, i)\,W_n\bigl((t - \tau)k_n^{-1}\bigr)\,d\bigl(P_{rec}(n-1, f(\tau))\,V_{flt}(n-1, f(\tau))\bigr), \quad 1 < n < N.$$

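We suppose that l failures and restorations of EMs occur on the distributed CS during the time $f(t)$. The probabilities $p_n(f(t), l, i)$ are then found from the set of equations:

$$p_1(f(t), l, i) = \int_0^t p_2(f(\tau), l-1, i)\,W_1(t - \tau)\,d\bigl(P_{flt}(2, f(\tau))\,V_{rec}(2, f(\tau))\bigr),$$

$$p_n(f(t), l, i) = \int_0^t p_{n+1}(f(\tau), l-1, i)\,W_n\bigl((t - \tau)k_n^{-1}\bigr)\,d\bigl(P_{flt}(n+1, f(\tau))\,V_{rec}(n+1, f(\tau))\bigr) + \int_0^t p_{n-1}(f(\tau), l-1, i)\,W_n\bigl((t - \tau)k_n^{-1}\bigr)\,d\bigl(P_{rec}(n-1, f(\tau))\,V_{flt}(n-1, f(\tau))\bigr), \quad 2 \le n \le N-1, \tag{7}$$

$$p_N(f(t), l, i) = \int_0^t p_{N-1}(f(\tau), l-1, i)\,W_N\bigl((t - \tau)k_N^{-1}\bigr)\,d\bigl(P_{rec}(N-1, f(\tau))\,V_{flt}(N-1, f(\tau))\bigr),$$

$$p_n(f(t), 0, i) = \begin{cases} W_n(f(t)), & (T k_n^{-1} \le f(t)) \wedge (n = i), \\ 0, & (T k_n^{-1} > f(t)) \vee (n \ne i). \end{cases}$$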

To calculate $P(f(T), l, i)$, it is sufficient to use the equation:

$$P(f(T), l, i) = \sum_{n=1}^{N}\sum_{j=0}^{l} p_n(f(T), j, i). \tag{8}$$

Formulas (7) and (8) can be used to calculate the probability that a parallel problem requiring a large amount of time is solved on the distributed CS within the given time.

3.1. Modeling

For the exponential law of failures, with the failure rate $\lambda$ of each EM in a CS with n working EMs and the restoration rate $\mu(n)$ for $(N - n)$ failed EMs, the quantities $P_{flt}(n, f(t))$, $P_{rec}(n, f(t))$, $V_{flt}(n, f(t))$ and $V_{rec}(n, f(t))$ are taken to be [1]:

$$P_{rec}(n, f(t)) = \mu(n) f(t)\exp(-\mu(n) f(t)),$$

$$P_{flt}(n, f(t)) = \lambda n f(t)\exp(-\lambda n f(t)),$$

$$V_{rec}(n, f(t)) = \exp(-\mu(n) f(t)),$$

$$V_{flt}(n, f(t)) = \exp(-\lambda n f(t)).$$
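The four exponential-law quantities and their product $W_n$ translate directly into code. The Python sketch below is an illustration under stated assumptions: the restoration rate μ(n) is passed as a callable `mu_of_n` (a hypothetical name, since the dependence of μ on n is not specified here), and `f` is the already evaluated time value $f(t)$.

```python
import math


def p_rec(n, f, mu_of_n):
    """Probability of restoration of one failed EM during time f."""
    rate = mu_of_n(n)
    return rate * f * math.exp(-rate * f)


def p_flt(n, f, lam):
    """Probability of failure of one of the n working EMs during time f."""
    return lam * n * f * math.exp(-lam * n * f)


def v_rec(n, f, mu_of_n):
    """Probability that no failed EM is restored during time f."""
    return math.exp(-mu_of_n(n) * f)


def v_flt(n, f, lam):
    """Probability of failure-free functioning of n EMs during time f."""
    return math.exp(-lam * n * f)


def w(n, f, lam, mu_of_n):
    """W_n(f): no failures and no restorations during time f."""
    return v_rec(n, f, mu_of_n) * v_flt(n, f, lam)


if __name__ == "__main__":
    no_repair = lambda n: 0.0   # corresponds to mu = 0.0 (no restoration)
    for lam in (0.00001, 0.00005, 0.0001):
        print(lam, w(32, 10.0, lam, no_repair))
```

With μ = 0 the product $W_n$ reduces to $\exp(-\lambda n f)$, the probability that none of the n machines fails during the interval.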

Fig. 1 presents one of the calculations of the realizability function of parallel problem solving, with the following parameters: $k_n = T/n$, $T = 310$ hours, $N = 32$ EMs, $i = 32$ EMs, $\mu = 0.0$. The function $f_1(t)$ was estimated for a CS in which the failure rate of each EM is $\lambda = 0.00001$; for $f_2(t)$, $\lambda = 0.00005$; for $f_3(t)$, $\lambda = 0.0001$. Here $f_i(t)$ is the realizability of solving labor-consuming problems on the CS ($i = 1, 2, 3$), and $f_i(t) = 0$ for all $t \in [0, T_0)$, where $T_0$ is the minimal time of parallel problem solving on a distributed CS with all EMs serviceable.

Fig.1. Realizability function of parallel solving problem on distributed CS

3.2. Parallel algorithm

We divide the calculation of each integral for $p_n(f(t), l, i)$ in the set of equations (7) into X parts. Each part is computed on the corresponding EM of the parallel CS.

Fig. 2. Coefficient of calculation acceleration of parallel algorithm on cluster CS

The parallel algorithm for calculating (7) and (8) was implemented on the cluster CS [3] in the C programming language using MPI. The parallelization efficiency of the algorithm is shown in Fig. 2.
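The idea of splitting one integral into parts that are evaluated independently and then summed can be sketched as follows. This is not the authors' C/MPI implementation: the sketch uses Python's standard `concurrent.futures` thread pool as a stand-in for the EMs of a cluster, a plain trapezoidal rule as the quadrature, and a hypothetical test integrand. Python threads illustrate the decomposition but do not by themselves give a speed-up for CPU-bound work.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def trapezoid(func, a, b, steps):
    """Trapezoidal rule for func on [a, b] with the given number of steps."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for k in range(1, steps):
        total += func(a + k * h)
    return total * h


def split_integral(func, a, b, parts, steps_per_part=1000):
    """Integrate func on [a, b] by splitting the interval into equal parts.

    Each part is handed to a separate worker (standing in for one EM),
    and the partial results are summed, as in a reduction step.
    """
    edges = [a + (b - a) * k / parts for k in range(parts + 1)]

    def integrate_part(bounds):
        lo, hi = bounds
        return trapezoid(func, lo, hi, steps_per_part)

    with ThreadPoolExecutor(max_workers=parts) as pool:
        partial_sums = list(pool.map(integrate_part, zip(edges[:-1], edges[1:])))
    return sum(partial_sums)


if __name__ == "__main__":
    # Hypothetical integrand: an exponential kernel similar in shape to W_n.
    value = split_integral(lambda x: math.exp(-x), 0.0, 2.0, parts=4)
    print(value, 1.0 - math.exp(-2.0))
```

In an MPI setting the same decomposition would typically assign one subinterval per process and combine the partial sums with a reduction operation.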

REFERENCES

1. Khoroshevsky V.G. Architecture of computer systems (in Russian). Moscow: MSTU Publishing House, 2005. 512 p.

2. Pavsky K.V. Analysis of the time of solution of parallel problems on programmable structure computer systems // Optoelectronics, Instrumentation and Data Processing. 2000. No. 2. P. 54-62.

3. Khoroshevsky V.G., Mamoilenko S.N., Maidanov Y.S., Smirnov S.V. Robust cluster computer systems // Optoelectronics, Instrumentation and Data Processing. 2004. V. 40. No. 1.

The article was submitted by the Department of Programming of the Faculty of Applied Mathematics and Cybernetics of Tomsk State University and by the organizing committee of the 7th Russian Conference with International Participation "New Information Technologies in the Study of Complex Structures"; it was received by the scientific editorial board on May 10, 2008.
