
UDC 519.21

On Distribution of Sums of Random Variables with Invariant Links and their Modeling

Sergey V. Chebotarev*

Altai State Pedagogical University Molodezhnaya, 55, Barnaul, 656015

Russia

Received 13.02.2019, received in revised form 10.06.2019, accepted 14.07.2019

A general form of the distribution of a sum of a finite number of absolutely continuous random variables is obtained. Examples of constructing and modeling sequences with averaged links (with invariant links), based on the distribution of the sum of these random variables, are considered.

Keywords: sequences of random variables, sum of a finite number of random variables, sum of dependent random variables, distribution of sums of absolutely continuous random variables.

DOI: 10.17516/1997-1397-2019-12-5-628-636.

Introduction

In paper [1] the sums $S_n(\xi(n)) = \sum_{t=1}^{n} \xi_t$ of finite sequences $\xi(n) = (\xi_t)_{t \in I_n}$, $I_n = \{1, 2, \dots, n\}$, with Rademacher, lattice and real random variables were investigated. For Rademacher random variables $\xi_t \in \{-1, 1\}$, $t \in I_n$, the relationship between the finite-dimensional probability distribution of these sequences and the values of mixed moments was shown. Based on this study, expressions for the distribution of sums were obtained. The same paper introduced and used sequences with averaged links $\bar{\xi}(n) = (\bar{\xi}_t)_{t \in I_n}$, $I_n = \{1, 2, \dots, n\}$, constructed from the distribution of the sum $S_n(\xi(n))$ of the random variables of the original sequence (for short: sequences with averaged links, or sal). For these sequences

$$P(\bar{\xi}_1, \bar{\xi}_2, \dots, \bar{\xi}_n) = \frac{P(S_n(\xi(n)) = 2k - n)}{C_n^k} \quad \forall (\bar{\xi}_1, \bar{\xi}_2, \dots, \bar{\xi}_n) \text{ such that } \sum_{t=1}^{n} \bar{\xi}_t = 2k - n.$$

Among the properties of such sequences we note that all random variables of a sequence are identically distributed and that the joint probabilities of any sets of these random variables are invariant with respect to the replacement of random variables. That is,

$$P(\bar{\xi}_{i_1} = x_1, \bar{\xi}_{i_2} = x_2, \dots, \bar{\xi}_{i_m} = x_m) = P(\bar{\xi}_{j_1} = x_1, \bar{\xi}_{j_2} = x_2, \dots, \bar{\xi}_{j_m} = x_m)$$

is valid for any sets of indices $(i_1, i_2, \dots, i_m)$, $(j_1, j_2, \dots, j_m)$ from $I_n$, for any $1 \le m \le n$ and for any sets $(x_1, x_2, \dots, x_m)$, where $x_i \in \{-1, 1\}$.
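The construction described above for Rademacher variables can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `averaged_links_pmf` and the numbers in `sum_pmf` are hypothetical and are not taken from [1]. The sketch spreads the mass $P(S_n = 2k - n)$ evenly over the $C_n^k$ outcomes that contain exactly $k$ values equal to $+1$, then checks that the sum keeps its prescribed distribution and that one pair of joint probabilities is invariant under a change of indices.

```python
from itertools import product
from math import comb, isclose


def averaged_links_pmf(sum_pmf, n):
    """Joint pmf on {-1, 1}^n of a sequence with averaged links.

    sum_pmf[k] is P(S_n = 2k - n), where k is the number of +1 entries.
    Every outcome with k plus-ones receives the mass sum_pmf[k] / C(n, k).
    """
    pmf = {}
    for outcome in product((-1, 1), repeat=n):
        k = outcome.count(1)
        pmf[outcome] = sum_pmf[k] / comb(n, k)
    return pmf


# Hypothetical distribution of the sum for n = 3 (illustrative numbers only).
n = 3
sum_pmf = [0.1, 0.4, 0.3, 0.2]
joint = averaged_links_pmf(sum_pmf, n)

# Distribution of the sum recovered from the joint pmf.
recovered = [0.0] * (n + 1)
for outcome, prob in joint.items():
    recovered[outcome.count(1)] += prob

# Invariance check: P(v1 = 1, v2 = -1) versus P(v3 = 1, v1 = -1).
p_12 = sum(p for o, p in joint.items() if o[0] == 1 and o[1] == -1)
p_31 = sum(p for o, p in joint.items() if o[2] == 1 and o[0] == -1)
```

Because the mass of an outcome depends only on the number of plus-ones, any permutation of the coordinates leaves the joint pmf unchanged, which is the invariance property in miniature.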

All sequences for which the invariance property holds are defined as a class of sequences with invariant links. Further on, these concepts and results are extended to the case of sequences of lattice and real random variables. In particular, for absolutely continuous random variables in [1], an expression was obtained for a finite-dimensional distribution of random variables with averaged links, constructed from the distribution of sums of the original sequence. But, in contrast to the case of Rademacher and lattice random variables, the general form of distribution of sums of such random variables was not found. In this paper, we find the general form of distribution of a sum of a finite number of absolutely continuous random variables and consider some examples of modeling sequences with averaged links (with invariant links).

*svcheb@gmail.com © Siberian Federal University. All rights reserved

Distribution of sums of random variables

We consider the problem of finding the general form of distribution of a sum of a finite number of centered absolutely continuous random variables having as their sum an absolutely continuous random variable with a nontrivial distribution.

This problem is similar to the problems solved in [2, 3], so we use the results of these works. In particular, in [2] it is shown (Theorem 3) that for a sequence of Rademacher random variables $\gamma = (\gamma_t)_{t \in \mathbb{N}}$, where $\gamma_t \in \{-1, 1\}$, satisfying the conditions:

1. $\frac{1}{n} \sum_{t=1}^{n} M\gamma_t \to 0$ as $n \to \infty$,

2. there exists a weak limit $S_{1/2}(\gamma)$ of the normalized sums with a nondegenerate distribution,

$$S_{1/2}(\gamma(n)) = \frac{1}{\sqrt{n}} \sum_{t=1}^{n} \gamma_t \to S_{1/2}(\gamma), \quad n \to \infty,$$

the limiting random variable $S_{1/2}(\gamma)$ has the following distribution density:

$$\mu(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}} \sum_{m=0}^{\infty} \nu_m(\gamma) \cdot h_m(x). \quad (1)$$

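We use this result to solve the stated problem. First, similarly to [3], we approximate the random variables $\xi_t$ of the original sequence by lattice random variables $\eta_{t,s}$. For this we divide the set of real numbers $\mathbb{R}$ as follows:

$$\Delta x_s(k) = \left( \frac{2k - s - 1}{\sqrt{s}}, \frac{2k - s + 1}{\sqrt{s}} \right] \quad \text{for } k = 1, \dots, s - 1,$$

$$\Delta x_s(0) = \left( -\infty, -\frac{s - 1}{\sqrt{s}} \right], \quad \Delta x_s(s) = \left( \frac{s - 1}{\sqrt{s}}, \infty \right).$$

Set

$$P\left( \eta_{t,s} = \frac{2k - s}{\sqrt{s}} \right) = P(\xi_t \in \Delta x_s(k)) = P_{\xi_t}(\Delta x_s(k)), \quad k = 0, 1, \dots, s.$$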
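The lattice approximation can be illustrated numerically. The sketch below is an assumption-laden example rather than the construction of [3]: it assumes that the lattice values are $(2k - s)/\sqrt{s}$ with cell boundaries $(2k - s \pm 1)/\sqrt{s}$, and it uses a standard normal variable as a hypothetical stand-in for $\xi_t$. The helper names `normal_cdf` and `lattice_pmf` are illustrative.

```python
from math import erf, inf, sqrt


def normal_cdf(x):
    """Distribution function of N(0, 1); used here only as an example law."""
    if x == inf:
        return 1.0
    if x == -inf:
        return 0.0
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def lattice_pmf(cdf, s):
    """Return [(value_k, prob_k)], k = 0..s, with value_k = (2k - s)/sqrt(s).

    prob_k is the probability of the k-th cell, whose boundaries are
    -inf, (1 - s)/sqrt(s), (3 - s)/sqrt(s), ..., (s - 1)/sqrt(s), +inf.
    """
    root = sqrt(s)
    edges = [-inf] + [(2 * k - s + 1) / root for k in range(s)] + [inf]
    return [((2 * k - s) / root, cdf(edges[k + 1]) - cdf(edges[k]))
            for k in range(s + 1)]


s = 16
pmf = lattice_pmf(normal_cdf, s)
total = sum(p for _, p in pmf)
mean = sum(v * p for v, p in pmf)
variance = sum(v * v * p for v, p in pmf) - mean ** 2
```

As $s$ grows, the lattice step $2/\sqrt{s}$ shrinks while the range $\pm\sqrt{s}$ widens, so the lattice law approaches the law being approximated.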

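We also approximate the sum $S(\xi(n)) = \sum_{t=1}^{n} \xi_t$ of the random variables of the investigated sequence by the sums $S(\eta_s) = \sum_{t=1}^{n} \eta_{t,s}$ of the lattice random variables $\eta_{t,s}$. For this we divide the set of real numbers $\mathbb{R}$ as follows:

$$\Delta x_{ns}(k) = \left( \frac{2k - ns - 1}{\sqrt{s}}, \frac{2k - ns + 1}{\sqrt{s}} \right] \quad \text{for } k = 1, \dots, ns - 1,$$

$$\Delta x_{ns}(0) = \left( -\infty, -\frac{ns - 1}{\sqrt{s}} \right], \quad \Delta x_{ns}(ns) = \left( \frac{ns - 1}{\sqrt{s}}, \infty \right).$$

In this case we set

$$P\left( S(\eta_s) = \frac{2k - ns}{\sqrt{s}} \right) = P(S(\xi(n)) \in \Delta x_{ns}(k)) = P_{S(\xi(n))}(\Delta x_{ns}(k)), \quad k = 0, 1, \dots, ns.$$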

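For $\eta_s$, the existence of a finite sequence with averaged links that has the same distribution of sums is shown in [4] (see Theorem 2.4). The same article, in turn, shows the existence of a finite sequence with averaged links of Rademacher type $\gamma(sn)$ such that

$$\eta_{t,s} = \frac{1}{\sqrt{s}} \sum_{i=0}^{s-1} \gamma_{t + i \cdot n}. \quad (2)$$

For this sequence we have

$$F_{S(\eta_s)}(x) = F_{S(\gamma(sn))}(x) \quad \forall x \in \mathbb{R}, \quad \text{where } S(\gamma(sn)) = \frac{1}{\sqrt{s}} \sum_{t=1}^{sn} \gamma_t.$$

Passing to the limit as $s \to \infty$, we get

$$F_{S(\xi(n))}(x) = F_{S(\gamma)}(x) \quad \forall x \in \mathbb{R}, \quad \text{where } S(\gamma) = \lim_{s \to \infty} \frac{1}{\sqrt{s}} \sum_{t=1}^{sn} \gamma_t = \lim_{s \to \infty} \frac{\sqrt{n}}{\sqrt{sn}} \sum_{t=1}^{sn} \gamma_t. \quad (3)$$

Comparing the limiting random variables in (1) and (3), we see that $S(\xi(n)) = S(\gamma) = \sqrt{n}\, S_{1/2}(\gamma)$ in distribution, and we can formulate a statement regarding the density of $S(\xi(n))$: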

Theorem 1. Let a sequence of centered absolutely continuous random variables $\xi(n) = (\xi_t)_{t \in I_n}$ be given on a measurable space $(\mathbb{R}^{(n)}, \mathcal{B}^{(n)})$, whose sum $S_n = \sum_{t=1}^{n} \xi_t$ is a non-degenerate absolutely continuous random variable with distribution density $\mu(x)$. Then the distribution density of the sum of these random variables is as follows:

$$\mu(x) = \frac{1}{\sqrt{2\pi n}} e^{-\frac{x^2}{2n}} \sum_{m=0}^{\infty} \nu_m(\gamma) \cdot h_m\left( \frac{x}{\sqrt{n}} \right), \quad (4)$$

where $\nu_m(\gamma)$ are the mixed moments of the sequence $\gamma = (\gamma_t)_{t \in \mathbb{N}}$.

The proof follows from the above.

Corollary 1. Let a sequence of centered absolutely continuous random variables $\xi(n) = (\xi_t)_{t \in I_n}$ be given on a measurable space $(\mathbb{R}^{(n)}, \mathcal{B}^{(n)})$, whose sum $S_n = \sum_{t=1}^{n} \xi_t$ is a non-degenerate absolutely continuous random variable with distribution density $\mu(x)$. Then there exists a finite sequence with averaged links $\bar{\xi}(n) = (\bar{\xi}_t)_{t \in I_n}$ defined on the same measurable space such that its joint distribution function satisfies the following relation:

$$F_{\bar{\xi}}(x_1^0, \dots, x_n^0) = \frac{1}{\sqrt{(2\pi)^n}} \int_{-\infty}^{x_1^0} \cdots \int_{-\infty}^{x_n^0} e^{-\frac{1}{2} \sum_{t=1}^{n} x_t^2} \sum_{m=0}^{\infty} \nu_m(\gamma) \cdot h_m\left( \frac{x}{\sqrt{n}} \right) dx_1 \cdots dx_n, \quad (5)$$

where $x = \sum_{t=1}^{n} x_t$.

Proof. It suffices to substitute expression (4) for $\mu(x)$ in (6). □

Construction and modeling of sequences with invariant links. (Examples)

We use Theorem 3.3 from [1] to construct and model a sequence with averaged links based on the sum of the random variables of the original sequence. Recall that this theorem gives an expression for the n-dimensional distribution function of a sequence with averaged links $\bar{\xi}(n) = (\bar{\xi}_t)_{t \in I_n}$ constructed from the sums of the original sequence of centered absolutely continuous random variables $\xi(n) = (\xi_t)_{t \in I_n}$, where the sum of the original random variables $S_n = \sum_{t=1}^{n} \xi_t$ is a non-degenerate absolutely continuous random variable with distribution density $\mu(x)$, $x = \sum_{t=1}^{n} x_t$. The n-dimensional distribution function then satisfies the following relation:

$$F_{\bar{\xi}}(x_1^0, \dots, x_n^0) = \frac{1}{\sqrt{(2\pi)^n}} \int_{-\infty}^{x_1^0} \cdots \int_{-\infty}^{x_n^0} e^{-\frac{1}{2} \sum_{t=1}^{n} x_t^2} \frac{\mu(x)}{\varphi_n(x)} dx_1 \cdots dx_n = \frac{\sqrt{n}}{\sqrt{(2\pi)^{n-1}}} \int_{-\infty}^{x_1^0} \cdots \int_{-\infty}^{x_n^0} e^{-\frac{1}{2} \left( \sum_{t=1}^{n} x_t^2 - \frac{x^2}{n} \right)} \mu(x)\, dx_1 \cdots dx_n, \quad (6)$$

where $x = \sum_{t=1}^{n} x_t$, $\varphi_n(x) = \frac{1}{\sqrt{2\pi n}} e^{-\frac{x^2}{2n}}$.

Note that for $\bar{\xi}(n)$ the following expression is true:

$$F_{S_n(\bar{\xi})}(x) = F_{S_n(\xi)}(x) \quad \forall x \in \mathbb{R}. \quad (7)$$

Let us consider some examples of constructing a sequence with invariant links by the distribution of the sum of these random variables.

Example 1. Let $\xi(n) = (\xi_t)_{t \in I_n} \in \mathbb{R}^{(n)}$. Let the sum of these random variables have the probability distribution density $\mu(x) = \frac{1}{\sqrt{2\pi n}} e^{-\frac{x^2}{2n}}$, that is, it is a normally distributed random variable with parameters $MS_n = 0$, $DS_n = n$. Then the n-dimensional distribution density of a sequence with averaged links is

$$p_{\bar{\xi}(n)}(x_1, \dots, x_n) = \frac{1}{\sqrt{(2\pi)^n}} e^{-\frac{1}{2} \sum_{t=1}^{n} x_t^2} \frac{\mu(x)}{\varphi_n(x)} = \frac{1}{\sqrt{(2\pi)^n}} e^{-\frac{1}{2} \sum_{t=1}^{n} x_t^2} = \prod_{t=1}^{n} \varphi(x_t),$$

where $\varphi(x_t)$ is the density of the standard normal distribution. In this case, the sequence of random variables with invariant links is a sequence of independent normally distributed random variables, and simulating such a sequence reduces to simulating the required number of standard normal random variables, which causes no additional difficulties.
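The reduction in Example 1 can be checked numerically at a single point. The sketch below assumes that the density of a sequence with averaged links has the form "product of standard normal densities, multiplied by the density of the sum and divided by the $N(0, n)$ density of the sum", which is the form used in this example. The names `phi_n` and `averaged_links_density` and the test point are hypothetical.

```python
from math import exp, isclose, pi, prod, sqrt


def phi_n(x, n):
    """Density of the normal law N(0, n)."""
    return exp(-x * x / (2.0 * n)) / sqrt(2.0 * pi * n)


def averaged_links_density(xs, mu):
    """Density of a sequence with averaged links at the point xs.

    mu is the density of the sum; the formula is the one assumed in the
    lead-in: (2*pi)^(-n/2) * exp(-sum(x_t^2)/2) * mu(x) / phi_n(x).
    """
    n = len(xs)
    total = sum(xs)
    gauss = exp(-0.5 * sum(v * v for v in xs)) / (2.0 * pi) ** (n / 2.0)
    return gauss * mu(total) / phi_n(total, n)


point = (0.3, -1.2, 0.7)          # arbitrary illustrative point
n = len(point)
lhs = averaged_links_density(point, lambda x: phi_n(x, n))
rhs = prod(phi_n(v, 1) for v in point)   # product of N(0, 1) densities
```

Passing a different `mu` (for instance the uniform density of Example 2) gives the corresponding dependent construction with the same function.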

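Example 2. Let a sequence $\xi(2) = (\xi_t)_{t \in I_2} \in \mathbb{R}^{(2)}$ be given, and let the sum of these random variables have a uniform distribution with density $\mu(x) = \frac{1}{2} I_{\{x \in (-1, 1)\}}$. Here $I$ is the indicator function of the set $\{x \in (-1, 1)\}$. Then the two-dimensional distribution density of a sequence with averaged links is

$$p_{\bar{\xi}_1, \bar{\xi}_2}(x_1, x_2) = \frac{\sqrt{2}}{\sqrt{2\pi}} e^{-\frac{1}{2} \left( x_1^2 + x_2^2 - \frac{(x_1 + x_2)^2}{2} \right)} \cdot \frac{1}{2} I_{\{x_1 + x_2 \in (-1, 1)\}} = \frac{1}{2\sqrt{\pi}} e^{-\frac{1}{4} (x_1 - x_2)^2} I_{\{x_1 + x_2 \in (-1, 1)\}}.$$

From here the distribution of the random variables of the sequence can be found from the relation

$$p_{\bar{\xi}_1}(x) = p_{\bar{\xi}_2}(x) = p(x) = \frac{1}{2\sqrt{\pi}} \int_{x + x_2 \in (-1, 1)} e^{-\frac{1}{4} (x_2 - x)^2} dx_2 = \frac{1}{2\sqrt{\pi}} \int_{-1 - x}^{1 - x} e^{-\frac{1}{4} (x_2 - x)^2} dx_2 = \frac{1}{\sqrt{2\pi}} \int_{\frac{-1 - 2x}{\sqrt{2}}}^{\frac{1 - 2x}{\sqrt{2}}} e^{-\frac{1}{2} u^2} du = F_{\xi_{0,1}}\left( \frac{1 - 2x}{\sqrt{2}} \right) - F_{\xi_{0,1}}\left( \frac{-1 - 2x}{\sqrt{2}} \right).$$

Here $F_{\xi_{0,1}}(x)$ is the value of the distribution function of a normal random variable $\xi$ with parameters $M\xi = 0$, $D\xi = 1$.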
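The closed form of the marginal density can be cross-checked by direct numerical integration. The following sketch is not part of the original computation. It integrates the joint density of Example 2, $\frac{1}{2\sqrt{\pi}} e^{-(x_1 - x_2)^2/4}$ on $|x_1 + x_2| < 1$, over the second argument with a midpoint rule and compares the result with the difference of two standard normal distribution functions. The function names and the test points are illustrative.

```python
from math import erf, exp, pi, sqrt


def normal_cdf(z):
    """Distribution function of N(0, 1)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def joint_density(x1, x2):
    """Joint density of Example 2; zero outside |x1 + x2| < 1."""
    if abs(x1 + x2) >= 1.0:
        return 0.0
    return exp(-0.25 * (x1 - x2) ** 2) / (2.0 * sqrt(pi))


def marginal_closed_form(x):
    """Marginal density written through the normal distribution function."""
    return (normal_cdf((1.0 - 2.0 * x) / sqrt(2.0))
            - normal_cdf((-1.0 - 2.0 * x) / sqrt(2.0)))


def marginal_numeric(x, steps=2000):
    """Midpoint rule over x2 in (-1 - x, 1 - x), where the indicator is 1."""
    h = 2.0 / steps
    return h * sum(joint_density(x, -1.0 - x + (j + 0.5) * h)
                   for j in range(steps))


checks = [(x, marginal_closed_form(x), marginal_numeric(x))
          for x in (-1.5, -0.4, 0.0, 0.7, 2.0)]

# The marginal density should integrate to one (midpoint rule on [-8, 8]).
step = 0.01
total_mass = step * sum(marginal_closed_form(-8.0 + (i + 0.5) * step)
                        for i in range(1600))
```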

Both random variables have the same distribution, but they are dependent. To calculate the marginal density of $\bar{\xi}_1, \bar{\xi}_2$ we can use MATLAB, or rather its symbolic computation package:

    syms x y
    int(exp(-(x-y)^2/4), x, -1-y, 1-y)
    ans = -pi^(1/2)*(erf(y - 1/2) - erf(y + 1/2))

As a result, we get

$$p(y) = -\frac{1}{2\sqrt{\pi}} \cdot \sqrt{\pi} \cdot \left( \mathrm{erf}(y - 1/2) - \mathrm{erf}(y + 1/2) \right) = \frac{1}{2} \left( \mathrm{erf}\left( \frac{2y + 1}{2} \right) - \mathrm{erf}\left( \frac{2y - 1}{2} \right) \right).$$

Taking into account that

$$\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} dt \quad \text{and} \quad \mathrm{erf}\left( \frac{x}{\sqrt{2}} \right) = 2F_{\xi_{0,1}}(x) - 1,$$

we arrive at the same expression as the one obtained by direct integration.

Let us check, using symbolic calculations in MATLAB, that the mathematical expectation of the obtained random variables is zero, $M\bar{\xi} = 0$:

    int(y*(erf(y + 1/2) - erf(y - 1/2)), y, -inf, inf)
    ans = 0

We also calculate the variance:

    int(y^2*(erf(y + 1/2) - erf(y - 1/2)), y, -inf, inf)
    ans = 7/6

As a result, we get

$$D\bar{\xi} = \frac{1}{2} \int_{-\infty}^{\infty} y^2 \cdot \left( \mathrm{erf}(y + 1/2) - \mathrm{erf}(y - 1/2) \right) dy = \frac{1}{2} \cdot \frac{7}{6} = \frac{7}{12}.$$

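Consider the relationship between these random variables; namely, we calculate their covariance. Let $A = \{(x_1, x_2) : x_1 + x_2 \in (-1, 1)\}$. Since the expectations are zero,

$$\nu_2(x_1, x_2) = \mathrm{cov}(x_1, x_2) = M x_1 x_2 = \frac{1}{2\sqrt{\pi}} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x_1 x_2 e^{-\frac{1}{4} (x_1 - x_2)^2} I_A\, dx_1 dx_2 = \frac{1}{2\sqrt{\pi}} \int_{-\infty}^{\infty} x_1 dx_1 \int_{-\infty}^{\infty} x_2 e^{-\frac{1}{4} (x_1 - x_2)^2} I_A\, dx_2 =$$

$$= \frac{1}{2\sqrt{\pi}} \int_{-\infty}^{\infty} x_1 dx_1 \int_{-\infty}^{\infty} (x_2 - x_1) e^{-\frac{1}{4} (x_1 - x_2)^2} I_A\, dx_2 + \frac{1}{2\sqrt{\pi}} \int_{-\infty}^{\infty} x_1 dx_1 \int_{-\infty}^{\infty} x_1 e^{-\frac{1}{4} (x_1 - x_2)^2} I_A\, dx_2. \quad (8)$$

Consider these two integrals separately:

1. $$\frac{1}{2\sqrt{\pi}} \int_{-\infty}^{\infty} x_1 dx_1 \int_{-\infty}^{\infty} x_1 e^{-\frac{1}{4} (x_1 - x_2)^2} I_A\, dx_2 = \frac{1}{2\sqrt{\pi}} \int_{-\infty}^{\infty} x_1^2 dx_1 \int_{-\infty}^{\infty} e^{-\frac{1}{4} (x_1 - x_2)^2} I_A\, dx_2 = \int_{-\infty}^{\infty} x_1^2 p(x_1)\, dx_1 = Dx_1 = \frac{7}{12}.$$

2. With the substitution $u = (x_2 - x_1)/\sqrt{2}$, so that $(x_2 - x_1)\, dx_2 = 2u\, du$,

$$\frac{1}{2\sqrt{\pi}} \int_{-\infty}^{\infty} x_1 dx_1 \int_{-\infty}^{\infty} (x_2 - x_1) e^{-\frac{1}{4} (x_1 - x_2)^2} I_A\, dx_2 = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} x_1 dx_1 \int_{\frac{-1 - 2x_1}{\sqrt{2}}}^{\frac{1 - 2x_1}{\sqrt{2}}} u e^{-\frac{1}{2} u^2} du.$$

Taking into account that

$$\int_{\frac{-1 - 2x_1}{\sqrt{2}}}^{\frac{1 - 2x_1}{\sqrt{2}}} u e^{-\frac{1}{2} u^2} du = e^{-\frac{(1 + 2x_1)^2}{4}} - e^{-\frac{(1 - 2x_1)^2}{4}},$$

we get

$$\frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} x_1 dx_1 \int_{\frac{-1 - 2x_1}{\sqrt{2}}}^{\frac{1 - 2x_1}{\sqrt{2}}} u e^{-\frac{1}{2} u^2} du = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} x_1 \left( e^{-\frac{(1 + 2x_1)^2}{4}} - e^{-\frac{(1 - 2x_1)^2}{4}} \right) dx_1 = \frac{1}{\sqrt{\pi}} \cdot (-\sqrt{\pi}) = -1.$$

As a result, we have

$$\nu_2(x_1, x_2) = \mathrm{cov}(x_1, x_2) = -1 + \frac{7}{12} = -\frac{5}{12} \approx -0.4167.$$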
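The second moments can be cross-checked independently of the symbolic computation. In the variables $s = x_1 + x_2$, $d = x_1 - x_2$ the joint density of Example 2 factorizes into a uniform density on $(-1, 1)$ for $s$ and an $N(0, 2)$ density for $d$, so the variance of each variable equals $(1/3 + 2)/4 = 7/12$ and the covariance equals $(1/3 - 2)/4 = -5/12$. The sketch below (function names and grid sizes are illustrative choices) confirms these values with a plain two-dimensional midpoint rule.

```python
from fractions import Fraction
from math import exp, pi, sqrt


def joint_density(x1, x2):
    """Joint density of Example 2; zero outside |x1 + x2| < 1."""
    if abs(x1 + x2) >= 1.0:
        return 0.0
    return exp(-0.25 * (x1 - x2) ** 2) / (2.0 * sqrt(pi))


def moment(f, half_width=10.0, n_outer=1000, n_inner=100):
    """Midpoint-rule estimate of the integral of f(x1, x2) * density.

    For each x1 the inner variable x2 runs over (-1 - x1, 1 - x1),
    which is exactly the set where the indicator equals one.
    """
    h1 = 2.0 * half_width / n_outer
    h2 = 2.0 / n_inner
    total = 0.0
    for i in range(n_outer):
        x1 = -half_width + (i + 0.5) * h1
        for j in range(n_inner):
            x2 = -1.0 - x1 + (j + 0.5) * h2
            total += f(x1, x2) * joint_density(x1, x2)
    return total * h1 * h2


mass = moment(lambda a, b: 1.0)
variance = moment(lambda a, b: a * a)      # the mean is zero by symmetry
covariance = moment(lambda a, b: a * b)

# Exact values from Var(s) = 1/3 and Var(d) = 2 with s and d independent.
variance_exact = (Fraction(1, 3) + 2) / 4      # 7/12
covariance_exact = (Fraction(1, 3) - 2) / 4    # -5/12
```

The resulting correlation coefficient is $-5/7$, so the two variables are strongly negatively dependent, as expected for summands whose total is confined to $(-1, 1)$.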

Consider the process of modeling a sequence of random variables with invariant links from this example.

We shall proceed from

$$p(x) = F_{\xi_{0,1}}\left( \frac{1 - 2x}{\sqrt{2}} \right) - F_{\xi_{0,1}}\left( \frac{-1 - 2x}{\sqrt{2}} \right)$$

and use the inverse function method to generate random values.

First, we obtain the values of the distribution density and the distribution function of the random variables:

    dx = 0.001;
    x = -20:dx:20;
    % distribution density calculation (normcdf: Statistics Toolbox)
    p = normcdf((1 - 2*x)/sqrt(2), 0, 1) - normcdf((-1 - 2*x)/sqrt(2), 0, 1);
    % distribution function values (rectangle rule, running sum)
    F = cumsum(p)*dx;
    % value check: F(\infty) = 1 ?
    F(length(x))

Performing the above calculations in MATLAB, we obtain the values of the distribution density and the distribution function of the random variables.

Next, using the inverse function method and the obtained values of the distribution function, we generate the values of the first random variable (Fig. 1).

Fig. 1. Distribution density of $\bar{\xi}_1$, $\bar{\xi}_2$

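    % Generate n independent random numbers with the distribution
    % function F(x) using the inverse function method
    n = 20;
    g = rand(2, n);      % row 1 drives the 1st variable, row 2 the 2nd
    slv1 = zeros(1, n);
    for i = 1:n
        k = 1;
        % first grid index with F(k) >= g(1,i); the bound on k guards
        % against F(end) being slightly below 1
        while F(k) < g(1, i) && k < length(x)
            k = k + 1;
        end
        slv1(i) = x(k);
    end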

As a result of these calculations, we obtain n samples with the generated value of the first random variable.
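For readers without MATLAB, the same grid-based inverse function method can be sketched with the Python standard library. This is a port under stated assumptions, not the original listing: it uses a coarser grid step of 0.01, a binary search (`bisect_left`) instead of the linear scan, and a fixed seed. The names `grid`, `cdf` and `draw` are hypothetical.

```python
import random
from bisect import bisect_left
from itertools import accumulate
from math import erf, sqrt


def normal_cdf(z):
    """Distribution function of N(0, 1)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def marginal_density(x):
    """Marginal density of Example 2; clipped at zero against rounding."""
    value = (normal_cdf((1.0 - 2.0 * x) / sqrt(2.0))
             - normal_cdf((-1.0 - 2.0 * x) / sqrt(2.0)))
    return max(0.0, value)


dx = 0.01                                   # coarser than the 0.001 above
grid = [-20.0 + i * dx for i in range(4001)]
# Running rectangle-rule sum: cdf[k] approximates F(grid[k]).
cdf = [c * dx for c in accumulate(marginal_density(x) for x in grid)]


def draw(u):
    """Return the first grid point whose cdf value is at least u."""
    k = min(bisect_left(cdf, u), len(grid) - 1)
    return grid[k]


rng = random.Random(0)                      # fixed seed, illustrative only
sample = [draw(rng.random()) for _ in range(20)]
```

The binary search returns the same index as the linear `while` scan, because both stop at the first grid point where the tabulated distribution function reaches the uniform draw.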

The second random variable depends on the first, so its value in each sample is generated from the conditional distribution density of the second random variable given the value of the first random variable obtained in that sample.

    p2 = zeros(n, length(x));
    % n conditional distribution densities of the 2nd random variable
    for i = 1:n
        for k = 1:length(x)
            p2(i, k) = exp(-(slv1(i) - x(k))^2/4)*Ind(slv1(i), x(k))/(2*sqrt(pi));
        end
        % normalize so that each conditional density integrates to 1
        p2(i, :) = p2(i, :)/(sum(p2(i, :))*dx);
    end
    % Distribution function values for the 2nd random variable
    % from the conditional densities (running sum along each row)
    F2 = cumsum(p2, 2)*dx;
    % Generation of 2nd random variable values by the inverse function method
    slv2 = zeros(1, n);
    for i = 1:n
        k = 1;
        while F2(i, k) < g(2, i) && k < length(x)
            k = k + 1;
        end
        slv2(i) = x(k);
    end

Examples of conditional distribution densities of $\bar{\xi}_2$ are shown in Fig. 2.

Fig. 2. Conditional distribution density of $\bar{\xi}_2$ in the 1st and 2nd samples
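A grid-free alternative to the conditional inverse function method is worth noting. It is a different technique from the one used in the listings, offered here only as a sketch: since the joint density of Example 2 factorizes in the variables $s = x_1 + x_2$ and $d = x_1 - x_2$ into a uniform density on $(-1, 1)$ and an $N(0, 2)$ density, a pair can be drawn by sampling $s$ and $d$ independently and setting $x_1 = (s + d)/2$, $x_2 = (s - d)/2$. The function name, seed and sample size are arbitrary choices.

```python
import random
from math import sqrt


def sample_pair(rng):
    """Draw one pair (x1, x2) with the joint density of Example 2."""
    s = rng.uniform(-1.0, 1.0)          # the sum, uniform on (-1, 1)
    d = rng.gauss(0.0, sqrt(2.0))       # the difference, N(0, 2)
    return (s + d) / 2.0, (s - d) / 2.0


rng = random.Random(2019)               # fixed seed, illustrative only
pairs = [sample_pair(rng) for _ in range(100_000)]

count = len(pairs)
mean_x1 = sum(a for a, _ in pairs) / count
var_x1 = sum(a * a for a, _ in pairs) / count - mean_x1 ** 2
max_abs_sum = max(abs(a + b) for a, b in pairs)
```

With this sampler the sum lies in the interval $(-1, 1)$ by construction, and the sample variance of the first variable should be close to $7/12$.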

Verification of the obtained results using the Kolmogorov criterion did not reject, at the 0.01 significance level, the agreement of the modeled data with the theoretical distributions (H = 0 in each test):

    cdf = [x' F'];
    [H,P,KSSTAT,CV] = kstest(slv1, cdf, 0.01)
    H = 0
    P = 0.0179
    KSSTAT = 0.3326
    CV = 0.3524

    [H,P,KSSTAT,CV] = kstest(slv2, cdf, 0.01)
    H = 0
    P = 0.1991
    KSSTAT = 0.2317
    CV = 0.3524

    sm = slv1 + slv2;
    y = unifcdf(x, -1, 1);
    cdf = [x' y'];
    [H,P,KSSTAT,CV] = kstest(sm, cdf, 0.01)
    H = 0
    P = 0.4014
    KSSTAT = 0.1920
    CV = 0.3524
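The Kolmogorov statistic for the sum is easy to recompute by hand. The sketch below implements only the statistic, not the p-value or the critical value, and applies it to the 20 values of the sum listed in Table 1 with the uniform distribution function on $(-1, 1)$. The function names are illustrative. It returns 0.192, in agreement with the KSSTAT value reported for the sum.

```python
def ks_statistic(sample, cdf):
    """One-sample Kolmogorov statistic sup |F_n(x) - F(x)|."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Compare F with the empirical distribution function on both
        # sides of its jump at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def uniform_cdf(x, a=-1.0, b=1.0):
    """Distribution function of the uniform law on (a, b)."""
    return min(1.0, max(0.0, (x - a) / (b - a)))


# Values of the sum of the two simulated variables, copied from Table 1.
sums = [0.8780, 0.5840, -0.7260, -0.1110, 0.9710, 0.8510, 0.4670,
        -0.4860, 0.7830, 0.9460, -0.8890, 0.9210, 0.5970, 0.0190,
        -0.5310, -0.8890, -0.9290, 0.2470, -0.1770, -0.7170]

ks = ks_statistic(sums, uniform_cdf)
```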

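Simulation results are given in Table 1: the sample numbers are shown in the columns, and the values of the random variables in the rows.

Table 1.

| N | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| $\bar{\xi}_1$ | 0.6860 | -0.8730 | 0.2590 | -0.4500 | 1.3150 | -0.7690 | 1.3130 |
| $\bar{\xi}_2$ | 0.1920 | 1.4570 | -0.9850 | 0.3390 | -0.3440 | 1.6200 | -0.8460 |
| $\bar{\xi}_1 + \bar{\xi}_2$ | 0.8780 | 0.5840 | -0.7260 | -0.1110 | 0.9710 | 0.8510 | 0.4670 |

| N | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|
| $\bar{\xi}_1$ | 0.6450 | -0.1510 | 0.6230 | 0.3070 | 0.7900 | 0.3560 | 0.5000 |
| $\bar{\xi}_2$ | -1.1310 | 0.9340 | 0.3230 | -1.1960 | 0.1310 | 0.2410 | -0.4810 |
| $\bar{\xi}_1 + \bar{\xi}_2$ | -0.4860 | 0.7830 | 0.9460 | -0.8890 | 0.9210 | 0.5970 | 0.0190 |

| N | 15 | 16 | 17 | 18 | 19 | 20 |
|---|---|---|---|---|---|---|
| $\bar{\xi}_1$ | 0.3070 | 0.4150 | -0.4540 | -0.9930 | 0.3900 | 1.2580 |
| $\bar{\xi}_2$ | -0.8380 | -1.3040 | -0.4750 | 1.2400 | -0.5670 | -1.9750 |
| $\bar{\xi}_1 + \bar{\xi}_2$ | -0.5310 | -0.8890 | -0.9290 | 0.2470 | -0.1770 | -0.7170 |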

In the calculations above we used the function Ind(x,y), the indicator function of the set $|x + y| < 1$.

    function Ixy = Ind(x, y)
    % indicator function of the set |x + y| < 1
    Ixy = double(abs(x + y) < 1);
    end

References

[1] S.V.Chebotarev, On the equivalence of finite sums of random variables, Vestnik BGPU, series: natural and exact sciences, 4(2004), 108-116 (in Russian).

[2] S.V.Chebotarev, About limit distribution of sums of random variables, Journal of Siberian Federal University. Mathematics & Physics, 9(2016), no. 1, 17-29.

[3] S.V.Chebotarev, On the limit distribution of sums of real random variables, Journal of Siberian Federal University. Mathematics & Physics, 10(2017), no. 3, 310-313.

[4] S.V.Chebotarev, About sequences of random variables with averaged relationships, Vestnik AltSPA, seriya: estestvenye i tochnye nauki, 7(2011), 28-37 (in Russian).

