CC BY

UCHENYE ZAPISKI KAZANSKOGO UNIVERSITETA.

SERIYA FIZIKO-MATEMATICHESKIE NAUKI

2018, vol. 160, no. 2, pp. 317-326

ISSN 2541-7746 (Print) ISSN 2500-2198 (Online)

UDC 519.2

RISK FUNCTION AND OPTIMALITY OF STATISTICAL PROCEDURES FOR IDENTIFICATION OF NETWORK STRUCTURES

P.A. Koldanov

National Research University Higher School of Economics, Nizhny Novgorod, 603025 Russia

Abstract

Identification of network structures using the finite-size sample has been considered. The concepts of random variables network and network model, which is a complete weighted graph, have been introduced. Two types of network structures have been investigated: network structures with an arbitrary number of elements and network structures with a fixed number of elements of the network model. The problem of identification of network structures has been investigated as a multiple testing problem. The risk function of statistical procedures for identification of network structures can be represented as a linear combination of expected numbers of incorrectly included elements and incorrectly non-included elements. The sufficient conditions of optimality for statistical procedures for network structures identification with an arbitrary number of elements have been given. The concept of statistical uncertainty of statistical procedures for identification of network structures has been introduced.

Keywords: random variables network, network model, network structure, procedure for identification of network structure, additive loss function, risk function, unbiasedness, optimality, statistical uncertainty

Introduction

One approach to analyzing a complex system with $N$ elements is to consider the corresponding network model [1]. The network model can be represented as a complete weighted graph $G = (V, E, \gamma)$, where the nodes $V = \{1, 2, \ldots, N\}$ correspond to the elements of the system and the weights $\gamma_{i,j}$ of the edges $e_{i,j} \in E$ are given by a measure $\gamma$ of relation (dependence, association) between the elements. In this paper, we focus on probabilistic network models only. In probabilistic network models, nodes correspond to random variables. The Gaussian graphical model is a well-developed probabilistic network model [2]. Statistical procedures for selection (identification) of the Gaussian graphical model from observations were studied in [3-5]. The weak point of the statistical procedures proposed in these works is that they control type I errors only.

Another probabilistic network model is the network model of a financial market. Every node of the network model corresponds to a stock, and the weights of the edges are given by the selected measure of dependence between stock returns. For the financial market, the popular network structures are the threshold graph [6] and the maximum spanning tree [7].

The threshold graph is an unweighted graph obtained from the network model by removing the edges with weights less than or equal to a given threshold. The maximum spanning tree is a spanning tree of the network model with the maximum sum of edge weights. There are many publications on the calculation of such network structures and the interpretation of the obtained results. A statistical approach to threshold graph identification is proposed in [8].

This paper considers the general problem statement of network structure identification and discusses a general approach to developing statistical procedures for it. The natural quality characteristics of these procedures are the mean numbers of first-kind and second-kind errors. Two types of identification problems are introduced: problems of network structure identification with an arbitrary number of elements from the network model, and problems of network structure identification with a fixed number of elements from the network model. An example of the first type is the problem of threshold graph identification. An example of the second type is the problem of maximum spanning tree (MST) identification. It is shown in [9] that the risk function of statistical procedures for network structure identification of both types can be represented as a weighted sum of the mean numbers of first- and second-kind errors. In this paper, sufficient conditions of optimality are given for statistical procedures for identification of network structures with an arbitrary number of elements. The concept of statistical uncertainty of statistical procedures for network structure identification is introduced.

1. Basic definitions and problem statement

Let $X = (X_1, X_2, \ldots, X_N)$ be a random vector. It is assumed that the density $f(x)$ of the vector $X$ belongs to the class $\{f(x, \theta);\ \theta \in \Omega\}$, where $\Omega$ is the parametric space. A partition of the parametric space $\Omega$ into $L$ regions $\Omega_i$, $i = 1, \ldots, L$, with $\Omega_i \cap \Omega_j = \emptyset$ for $i \neq j$, is defined, and the hypotheses $H_i : \theta \in \Omega_i$, $\Omega_i \subset \Omega$, $i = 1, \ldots, L$, are formulated. There is a finite-size sample $x(1), x(2), \ldots, x(n)$ from the sample space $\mathcal{X} = R^{N \times n}$.

The general problem is to construct a statistical procedure $\delta(x)$ that defines a partition of the sample space $\mathcal{X}$ into $L$ parts, $\mathcal{X} \to D$, $D = \{D_1, D_2, \ldots, D_L\}$. The decision $d_i$ (hypothesis $H_i$ is true) is accepted if $(x(1), x(2), \ldots, x(n)) \in D_i$.

In order to formulate the problems of network structures identification, the concept of random variables network is introduced.

Definition 1. The random variables network is a pair $(X, \gamma)$, where $X = (X_1, \ldots, X_N)$ is a random vector and $\gamma = \{\gamma_{i,j} : i, j = 1, \ldots, N;\ i \neq j\}$ is a measure of dependence between the random variables $X_i$, $X_j$.

The random variables network generates a network model, which is a complete weighted graph $G = (V, E, \gamma)$, where $V = \{1, 2, \ldots, N\}$ is the set of nodes corresponding to the random variables $X_1, X_2, \ldots, X_N$, and $E$ is the set of edges with weights given by the measure $\gamma$. To investigate the network model $G = (V, E, \gamma)$, the key structures of the corresponding graph should be identified.

The key structures satisfying the following definition are investigated in the paper.

Definition 2. A network structure of the network model $G = (V, E, \gamma)$ is an unweighted subgraph $G' = (V', E')$: $V' \subseteq V$, $E' \subseteq E$.

Two types of network structures are considered. The first type consists of network structures with an arbitrary number of elements from the network model. The threshold graph and the Gaussian graphical model are network structures of the first type.

Definition 3. The threshold graph (TG) of the network model $G = (V, E, \gamma)$ is the subgraph $G'(\gamma_0) = (V', E')$: $V' = V$; $E' \subseteq E$, $E' = \{(i,j) : \gamma_{i,j} > \gamma_0\}$, where $\gamma_0$ is some threshold.

The second type consists of network structures with a fixed number of elements from the network model. The maximum spanning tree is a network structure of the second type, because it must contain exactly $N - 1$ edges.

Definition 4. The maximum spanning tree (MST) of the network model $G = (V, E, \gamma)$ is a tree $G' = (V', E')$: $V' = V$; $E' \subseteq E$; $|E'| = |V| - 1$, such that $\sum_{(i,j) \in E'} \gamma_{i,j}$ is maximum.
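Definitions 3 and 4 can be illustrated on a toy network model. The sketch below is only an illustration: the four-node weight table is hypothetical, the helper names `threshold_graph` and `maximum_spanning_tree` are introduced here, and Kruskal's algorithm is used as one standard way to obtain a maximum spanning tree, not as a method prescribed by the paper.

```python
def threshold_graph(weights, gamma0):
    """Edges (i, j) whose weight strictly exceeds the threshold gamma0 (Definition 3)."""
    return {edge for edge, w in weights.items() if w > gamma0}


def maximum_spanning_tree(weights, n_nodes):
    """Kruskal's algorithm on edges sorted by decreasing weight (Definition 4)."""
    parent = list(range(n_nodes))

    def find(v):
        # Find the component representative, shortening paths on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = set()
    for (i, j), _ in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two components, so no cycle appears
            parent[root_i] = root_j
            tree.add((i, j))
    return tree


# Hypothetical network model on N = 4 nodes (0-based labels), K = 6 weighted edges.
weights = {(0, 1): 0.9, (0, 2): 0.2, (0, 3): 0.55,
           (1, 2): 0.7, (1, 3): 0.1, (2, 3): 0.6}

tg = threshold_graph(weights, gamma0=0.5)
mst = maximum_spanning_tree(weights, n_nodes=4)
```

With these weights the threshold graph keeps four edges, while the MST keeps exactly $N - 1 = 3$ of them, which is the distinction between the two types of network structures.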

To provide more details, let us propose the following general formulation of the problem of network structures identification.

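Let $(X, \gamma)$ be a random variables network. Let the density of the random vector $X$ belong to the class $\{f(x, \theta) : \theta \in \Omega\}$. Let $G = (V, E, \gamma)$ be the network model generated by the random variables network $(X, \gamma)$. Let $\beta \in E$ ($\beta = 1, \ldots, K$, $K = N(N-1)/2$) be the elements (edges) of the network model $G = (V, E, \gamma)$. Let $G' = (V', E')$: $V' \subseteq V$, $E' \subseteq E$ be the network structure of interest, which must be identified from the observations $X_i(t)$, $i = 1, \ldots, N$, $t = 1, \ldots, n$. Let $h_\beta : \theta \in \Omega_\beta$ be the hypothesis that element $\beta$ of the network model does not belong to the network structure, let $k_\beta : \theta \in \Omega_\beta^{-1}$ be the alternative to $h_\beta$, and let $H_i : \theta \in \Omega_i$, $i = 1, \ldots, L$, be the hypothesis that exactly the elements $\{i_1, i_2, \ldots, i_M\} \subseteq \{1, 2, \ldots, K\}$ belong to the network structure. Let $M$ be the number of elements of the network structure. It is necessary to construct a statistical procedure to select one from the set of disjoint hypotheses

$$H_i : \theta \in \Omega_i,$$

where

$$\Omega_i = \Big( \bigcap_{i_l \in \{i_1, \ldots, i_M\}} \Omega_{i_l}^{-1} \Big) \cap \Big( \bigcap_{i_s \in \{1, \ldots, K\} \setminus \{i_1, \ldots, i_M\}} \Omega_{i_s} \Big), \qquad (1)$$

or, equivalently,

$$H_i = \Big( \bigcap_{i_l \in \{i_1, \ldots, i_M\}} k_{i_l} \Big) \cap \Big( \bigcap_{i_s \in \{1, \ldots, K\} \setminus \{i_1, \ldots, i_M\}} h_{i_s} \Big).$$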

Depending on $M$, there are two types of problems:

• problems with an arbitrary number of elements of the network model, $M \in \{0, 1, \ldots, C_N^2\}$;

• problems with a fixed number $M$ of elements of the network model.
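To get a feeling for the size of the two problem types, the following sketch counts candidate hypotheses for a small network. It is an illustration under stated assumptions: the function name `hypothesis_counts` is introduced here, the "arbitrary" count assumes every subset of edges is admissible, and the spanning-tree count uses Cayley's formula $N^{N-2}$, a standard fact that the paper itself does not state.

```python
from math import comb


def hypothesis_counts(n_nodes, m_fixed):
    """Count candidate structures on a complete graph with n_nodes nodes."""
    k = comb(n_nodes, 2)  # K = N(N-1)/2 edges of the network model
    return {
        "edges": k,
        "arbitrary": 2 ** k,                     # any subset of edges may be selected
        "fixed_size_subsets": comb(k, m_fixed),  # subsets with exactly m_fixed edges
        # Cayley's formula for labelled spanning trees (assumes n_nodes >= 2);
        # relevant when m_fixed = n_nodes - 1, as in the MST problem.
        "spanning_trees": n_nodes ** (n_nodes - 2),
    }


counts = hypothesis_counts(n_nodes=4, m_fixed=3)
```

For $N = 4$ there are 64 hypotheses in the arbitrary case, 20 edge subsets of size 3, and only 16 of those subsets are spanning trees, so a fixed-size structure can impose constraints beyond the edge count.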

2. Statistical procedures for network structure identification

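Let $\varphi_\beta(x)$ be the tests for testing the individual hypotheses $h_\beta$ versus $k_\beta$. Let $A_\beta$ be the acceptance region and $A_\beta^{-1}$ be the rejection region of the test $\varphi_\beta(x)$. Let $\delta(x)$ be the statistical procedure for problem (1), where $d_i$ is the decision that hypothesis $H_i$, $i = 1, \ldots, L$, is true, and let $D_i$ be the acceptance region of hypothesis $H_i$:

$$\delta(x) = d_i \ \text{if}\ x \in D_i, \qquad D_i \cap D_j = \emptyset,\ i \neq j,\ i, j = 1, \ldots, L; \qquad \bigcup_{i=1}^{L} D_i = \mathcal{X}, \qquad (2)$$

where $\mathcal{X}$ is the sample space.

According to the results of [10], any procedure for network structure identification with an arbitrary number of elements from the network model can be written in the following form:

$$D_i = \bigcap_{\beta=1}^{K} A_\beta^{\kappa_{i\beta}}, \qquad (3)$$

where $A_\beta^{1} = A_\beta$ and

$$\kappa_{i\beta} = \begin{cases} 1, & \Omega_i \cap \Omega_\beta^{-1} = \emptyset, \\ -1, & \Omega_i \cap \Omega_\beta = \emptyset, \end{cases} \qquad (4)$$

with $\Omega_\beta$ and $\Omega_\beta^{-1}$ denoting the parameter regions of $h_\beta$ and $k_\beta$, respectively.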

For statistical procedures for network structure identification with a fixed number $M$ of elements from the network model, the following compatibility condition must be satisfied.

Definition 5. The set of tests $\varphi_\beta(x)$, $\beta = 1, \ldots, K$, is compatible with the decision space of procedure $\delta(x)$ (2) if

$$\sum_{\substack{(\kappa_{i1}, \ldots, \kappa_{iK}):\\ \kappa_{i i_1} = \cdots = \kappa_{i i_M} = -1;\\ \kappa_{i i_{M+1}} = \cdots = \kappa_{i i_K} = 1}} P\Big( x \in \bigcap_{\beta=1}^{K} A_\beta^{\kappa_{i\beta}} \Big) = 1. \qquad (5)$$

If the set of tests $\varphi_\beta(x)$, $\beta = 1, \ldots, K$, is compatible with the decision space of procedure $\delta(x)$, then there is a one-to-one correspondence between procedure $\delta(x)$ (2) and the set of tests $\varphi_\beta(x)$, $\beta = 1, \ldots, K$ [10]. This correspondence has the form

$$D_i = \bigcap_{\beta=1}^{K} A_\beta^{\kappa_{i\beta}}, \qquad A_\beta = \bigcup_{i:\ \kappa_{i\beta} = 1} D_i, \qquad A_\beta^{-1} = \bigcup_{i:\ \kappa_{i\beta} = -1} D_i. \qquad (6)$$

In the case of a compatible set of tests $\varphi_\beta(x)$, relations (6) define the statistical procedures for network structure identification.

3. Risk function of statistical procedures for network structure identification

Let $w(H_i; d_j) = w_{ij}$ be the loss from decision $d_j$ when hypothesis $H_i$ is true. Let us assume that the loss from a correct decision is equal to zero, $w_{ii} = 0$ for all $i = 1, \ldots, L$. According to [11], the quality of any statistical procedure $\delta(x)$ is characterized by the risk function

$$R(H_i, \theta; \delta) = \sum_{j=1}^{L} w_{ij} P_\theta(\delta(x) = d_j), \qquad \theta \in \Omega_i,\ i = 1, \ldots, L,$$

where $P_\theta(\delta(x) = d_j)$ is the probability of decision $d_j$.

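Let $a_\beta$, $b_\beta$ be the losses from the first- and second-kind errors in testing the individual hypothesis $h_\beta$. Consider a loss function $w_{ij}$ of the following form:

$$w_{ij} = \sum_{\beta=1}^{K} (\varepsilon_{ij\beta} a_\beta + \varepsilon_{ji\beta} b_\beta), \qquad (7)$$

where

$$\varepsilon_{ij\beta} = \begin{cases} 1, & \text{if } \kappa_{i\beta} = 1,\ \kappa_{j\beta} = -1, \\ 0, & \text{otherwise}, \end{cases}$$

and $\kappa_{i\beta}$ is defined by (4).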
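The additive loss (7) can be read as a count of disagreements between two sign vectors, weighted by per-element losses. The sketch below is illustrative only: the function name `additive_loss` and the example vectors are introduced here, with an entry $+1$ meaning that the element does not belong to the structure ($h_\beta$ holds) and $-1$ meaning that it does ($k_\beta$ holds).

```python
def additive_loss(kappa_true, kappa_decided, a, b):
    """Loss w_ij of (7) for the true sign vector kappa_i and the decided one kappa_j."""
    loss = 0.0
    for k_true, k_dec, a_beta, b_beta in zip(kappa_true, kappa_decided, a, b):
        if k_true == 1 and k_dec == -1:
            # epsilon_{ij beta} = 1: element wrongly included (first-kind error)
            loss += a_beta
        elif k_true == -1 and k_dec == 1:
            # epsilon_{ji beta} = 1: element wrongly left out (second-kind error)
            loss += b_beta
    return loss


# Hypothetical example with K = 4 elements.
kappa_true = [1, 1, -1, -1]
kappa_decided = [1, -1, -1, 1]
loss = additive_loss(kappa_true, kappa_decided, a=[1.0] * 4, b=[2.0] * 4)
```

Here one element is wrongly included and one is wrongly omitted, so the loss is $a_2 + b_4 = 3$; a correct decision gives zero loss, in line with $w_{ii} = 0$.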

The following theorems [9] characterize the risk function for the problem of network structure identification.

Theorem 1. Let the loss function be defined by (7). Then the risk function of the statistical procedure for the problem of identification of the network structure with an arbitrary number of elements is

$$R(H_i, \theta, \delta) = \sum_{\beta=1}^{K} r(h_\beta, \varphi_\beta), \qquad (8)$$

where $r(h_\beta, \varphi_\beta)$ is the risk function of the test $\varphi_\beta$.

In the case $a_\beta = a$, $b_\beta = b$, $\beta = 1, \ldots, K$, one has

$$R(H_i, \theta, \delta) = a E_\theta\{Y_I(H_i, \delta)\} + b E_\theta\{Y_{II}(H_i, \delta)\}, \qquad (9)$$

where $Y_I(H_i, \delta)$ is the number of erroneously included elements (the number of first-kind errors) of procedure $\delta$ if hypothesis $H_i$ is true, and $Y_{II}(H_i, \delta)$ is the number of erroneously non-included elements (the number of second-kind errors) of procedure $\delta$ if hypothesis $H_i$ is true.

Theorem 2. Let

• the set of tests $\varphi_\beta$ for testing the individual hypotheses $h_\beta$ be compatible with the decision space of statistical procedure $\delta$ for testing the hypotheses $H_i$;

• the loss function be additive and defined by (7).

Then the risk function of statistical procedure $\delta$ for the problem of identification of the network structure with a fixed number of elements has the form

$$R(H_i, \theta, \delta) = \sum_{\beta=1}^{K} r(h_\beta, \varphi_\beta), \qquad (10)$$

where $r(h_\beta, \varphi_\beta)$ is the risk function of the test $\varphi_\beta$.

If, in addition, $a_\beta = a$, $b_\beta = b$, $\beta = 1, \ldots, K$, then the risk function of statistical procedure $\delta$ for the problem of identification of the network structure with a fixed number of elements has the form

$$R(H_i, \theta, \delta) = (a + b) E_\theta(Y_I(H_i, \delta)) = (a + b) E_\theta(Y_{II}(H_i, \delta)), \qquad (11)$$

where $Y_I(H_i, \delta)$ is the number of first-kind errors and $Y_{II}(H_i, \delta)$ is the number of second-kind errors of procedure $\delta$ when hypothesis $H_i$ is true.
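The equality of the two expectations in the fixed-size case follows from a counting argument. A short sketch, assuming that both the true structure and the structure selected by the procedure contain exactly $M$ elements (the sets $E_{\mathrm{true}}$ and $E_{\mathrm{sel}}$ are notation introduced here for illustration only):

```latex
\begin{aligned}
Y_{I}  &= |E_{\mathrm{sel}} \setminus E_{\mathrm{true}}|
        = M - |E_{\mathrm{sel}} \cap E_{\mathrm{true}}|, \\
Y_{II} &= |E_{\mathrm{true}} \setminus E_{\mathrm{sel}}|
        = M - |E_{\mathrm{sel}} \cap E_{\mathrm{true}}|, \\
\text{hence}\quad
a\,E_\theta(Y_{I}) + b\,E_\theta(Y_{II})
       &= (a + b)\,E_\theta(Y_{I}) = (a + b)\,E_\theta(Y_{II}).
\end{aligned}
```

Every wrongly included element displaces exactly one element that should have been included, so the two error counts coincide for each sample and not only on average.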


Note that theorem 1 is a simple consequence of the results of [10]. Theorem 2, on the other hand, is new and generalizes the result of [10].

4. Sufficient conditions for optimality of statistical procedures for identification of a network structure with an arbitrary number of elements

Consider the set $\mathcal{G}$ of all $N \times N$ symmetric matrices $G = (g_{i,j})$ with $g_{i,j} \in \{0, 1\}$, $i, j = 1, 2, \ldots, N$, $g_{i,i} = 0$, $i = 1, 2, \ldots, N$. The matrices $G \in \mathcal{G}$ are the adjacency matrices of all simple undirected graphs with $N$ nodes. The total number of matrices in $\mathcal{G}$ is equal to $L = 2^M$ with $M = N(N-1)/2$. The problem of identification of the network structure with an arbitrary number of elements can be formulated as a multiple decision problem of selecting one hypothesis from the set of $L$ hypotheses

$$H_G : \gamma_{i,j} \le \gamma_0 \ \text{if}\ g_{i,j} = 0, \qquad \gamma_{i,j} > \gamma_0 \ \text{if}\ g_{i,j} = 1; \qquad i \neq j. \qquad (12)$$

Let $\beta = (i, j)$. Let the individual tests of the individual edge hypotheses

$$h_{i,j} : \gamma_{i,j} \le \gamma_0 \quad \text{vs} \quad k_{i,j} : \gamma_{i,j} > \gamma_0$$

have the form

$$\varphi_{i,j}(x) = \begin{cases} 1, & t_{i,j}(x) > c_{i,j}, \\ 0, & t_{i,j}(x) \le c_{i,j}, \end{cases}$$

where $c_{i,j}$ is defined from

$$P_{\gamma_0}(T_{i,j} > c_{i,j}) = \alpha_{i,j} \qquad (13)$$

and $\alpha_{i,j}$ is the given significance level.

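According to (3), the multiple statistical procedure for identification of the network structure with an arbitrary number of elements has the form

$$\Phi(x) = \begin{pmatrix} 0 & \varphi_{1,2}(x) & \ldots & \varphi_{1,N}(x) \\ \varphi_{2,1}(x) & 0 & \ldots & \varphi_{2,N}(x) \\ \ldots & \ldots & \ldots & \ldots \\ \varphi_{N,1}(x) & \varphi_{N,2}(x) & \ldots & 0 \end{pmatrix}. \qquad (14)$$

Let us define the multiple statistical procedure for network structure identification

$$\delta(x) = d_G \ \text{iff}\ \Phi(x) = G. \qquad (15)$$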
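The procedure above combines the individual edge tests into one adjacency matrix. A minimal sketch, assuming the test statistics $t_{i,j}(x)$ and the critical values $c_{i,j}$ have already been computed and are passed in as symmetric nested lists; the function name `identify_structure` and the numbers in the example are hypothetical.

```python
def identify_structure(statistics, critical_values):
    """Adjacency matrix of individual decisions: entry (i, j) is 1 iff t_ij > c_ij."""
    n = len(statistics)
    # Zeros on the diagonal, matching the adjacency matrices of simple graphs.
    phi = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            decision = 1 if statistics[i][j] > critical_values[i][j] else 0
            phi[i][j] = phi[j][i] = decision  # keep the matrix symmetric
    return phi


# Hypothetical statistics for N = 3 nodes and a common critical value.
t = [[0.0, 2.1, 0.3],
     [2.1, 0.0, 1.7],
     [0.3, 1.7, 0.0]]
c = [[1.645] * 3 for _ in range(3)]

structure = identify_structure(t, c)
```

The returned matrix is the selected $G$, so the decision $d_G$ is read off directly from the individual test outcomes.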

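Let $S = (s_{i,j})$, $Q = (q_{i,j})$, $S, Q \in \mathcal{G}$. Denote by $w(S, Q)$ the loss from decision $d_Q$ when hypothesis $H_S$ is true:

$$w(H_S; d_Q) = w(S, Q), \qquad S, Q \in \mathcal{G}.$$

The risk function is defined by

$$R(S, \theta, \delta) = \sum_{Q \in \mathcal{G}} w(S, Q) P_\theta(\delta(x) = d_Q), \qquad S \in \mathcal{G},\ \theta \in \Omega_S,$$

where $P_\theta(\delta(x) = d_Q)$ is the probability that decision $d_Q$ is taken while the true decision is $d_S$, that is, $\theta \in \Omega_S$; here $\Omega_S$ is the set of parameters $\theta = \|\gamma_{i,j}\|$ for which hypothesis $H_S$ is true. According to [10], the multiple decision procedure $\delta(x)$ is $w$-unbiased if

$$\sum_{Q \in \mathcal{G}} w(S, Q) P_\theta(\delta(x) = d_Q) \le \sum_{Q \in \mathcal{G}} w(S', Q) P_\theta(\delta(x) = d_Q) \qquad \forall S, S' \in \mathcal{G},\ \theta \in \Omega_S. \qquad (16)$$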

Let $a_{i,j}$ be the loss from the false inclusion of edge $(i,j)$ in the network structure, and let $b_{i,j}$ be the loss from the false non-inclusion of edge $(i,j)$ in the network structure, $i, j = 1, 2, \ldots, N$, $i \neq j$.

Then the additive loss function (7) can be written as

$$w(S, Q) = \sum_{\{i,j:\ s_{i,j} = 0;\ q_{i,j} = 1\}} a_{i,j} + \sum_{\{i,j:\ s_{i,j} = 1;\ q_{i,j} = 0\}} b_{i,j}.$$

This means that the loss from the misclassification of $H_S$ is equal to the sum of the losses from the misclassification of the individual edges.

Theorem 3. Let the loss function be additive and let the tests $\varphi_{i,j}(x)$ be uniformly most powerful in the class of unbiased (UMPU) tests of level $\alpha_{i,j}$. Then statistical procedure (15) is optimal in the class of unbiased statistical procedures for identification of the network structure with an arbitrary number of elements if $\alpha_{i,j} = \dfrac{b_{i,j}}{a_{i,j} + b_{i,j}}$.

Proof. First, we prove that statistical procedure $\delta$ is unbiased. Since the individual tests $\varphi_{i,j}(x)$ are unbiased, $r(s_{i,j}, \varphi_{i,j}(x)) \le r(s'_{i,j}, \varphi_{i,j}(x))$ for any $s_{i,j}, s'_{i,j} \in \{0, 1\}$, $i, j = 1, \ldots, N$. Since the loss function is additive, according to theorem 1 the risk function of statistical procedure $\delta$ can be written as

$$R(H_S, \theta, \delta) = \sum_{i,j=1}^{N} r(s_{i,j}, \varphi_{i,j}).$$

Therefore, for all $S, S' \in \mathcal{G}$, $\theta \in \Omega_S$,

$$\sum_{Q} w(S, Q) P_\theta(\delta(x) = d_Q) \le \sum_{Q} w(S', Q) P_\theta(\delta(x) = d_Q).$$

Then $\delta(x)$ is unbiased.

Now we prove that statistical procedure $\delta$ is optimal in the class of unbiased statistical procedures. Let $\delta'(x)$ be any other unbiased procedure. Then $\delta'(x)$ defines a partition of the sample space into $L$ parts $D_G = \{x : \delta'(x) = d_G\}$. Let

$$A_{i,j} = \bigcup_{G:\ g_{i,j} = 0} D_G, \qquad A_{i,j}^{-1} = \bigcup_{G:\ g_{i,j} = 1} D_G.$$

Define

$$\varphi'_{i,j}(x) = \begin{cases} 0, & x \in A_{i,j}, \\ 1, & x \in A_{i,j}^{-1}. \end{cases}$$

The tests $\varphi'_{i,j}$ are tests of the individual hypotheses $h_{i,j}$: element $(i,j)$ does not belong to network structure $S$. Then, according to theorem 1, the risk function of statistical procedure $\delta'$ can be written as

$$R(H_S, \theta, \delta') = \sum_{i,j=1}^{N} r(s_{i,j}, \varphi'_{i,j}).$$

Since statistical procedure $\delta'(x)$ is unbiased,

$$\sum_{Q} w(S, Q) P_\theta(\delta'(x) = d_Q) \le \sum_{Q} w(S', Q) P_\theta(\delta'(x) = d_Q) \qquad \forall S, S' \in \mathcal{G},\ \theta \in \Omega_S. \qquad (17)$$

Since network structure $S$ has an arbitrary number of elements, there exists a network structure $S'$ that differs from $S$ in a single edge only:

$$\exists\, i, j :\ s_{i,j} \neq s'_{i,j},\ s_{j,i} \neq s'_{j,i}; \qquad \forall (k, l) \neq (i, j),\ (k, l) \neq (j, i):\ s_{k,l} = s'_{k,l}.$$

Then (17) takes the form

$$r(s_{i,j}, \varphi'_{i,j}(x)) \le r(s'_{i,j}, \varphi'_{i,j}(x)).$$

Hence, the tests $\varphi'_{i,j}$ are unbiased.

However, the tests $\varphi_{i,j}(x)$ are UMPU, so $r(s_{i,j}, \varphi_{i,j}(x)) \le r(s_{i,j}, \varphi'_{i,j}(x))$. Therefore, $R(H_S, \theta, \delta) \le R(H_S, \theta, \delta')$. □

Note that theorem 3 is based on the general ideas of [10]. Nevertheless, restricting the problem to identification of a network structure with an arbitrary number of elements allows a simpler proof.
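The condition of theorem 3 ties the significance level of each edge test to the two losses attached to that edge. As a small worked illustration (the function name `significance_level` and the loss values are hypothetical):

```python
def significance_level(a, b):
    """Level alpha = b / (a + b) from the losses of false inclusion (a) and false omission (b)."""
    if a <= 0 or b <= 0:
        raise ValueError("losses are assumed to be positive in this sketch")
    return b / (a + b)


# Equal losses lead to level 0.5; a false inclusion 19 times as costly
# as a false omission leads to the familiar level 0.05.
alpha_equal = significance_level(1.0, 1.0)
alpha_strict = significance_level(19.0, 1.0)
```

The more costly a false inclusion is relative to a false omission, the smaller the level at which the corresponding edge hypothesis is tested.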

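Multiple testing procedures for Gaussian graphical model (GGM) identification. Let us consider the random variables network $(X, \gamma)$, where the vector $X = (X_1, X_2, \ldots, X_N)$ has the multivariate normal distribution $N(\mu, \Sigma)$ and the measure $\gamma_{i,j} = |\rho^{i,j}|$ is the absolute value of the partial correlation coefficient $\rho^{i,j}$.

The individual hypotheses for the problem of GGM identification have the form

$$h_{i,j} : \rho^{i,j} = 0 \quad \text{vs} \quad k_{i,j} : \rho^{i,j} \neq 0. \qquad (18)$$

According to [12], the UMPU tests for testing the individual hypotheses (18) are

$$\varphi_{i,j}^{\mathrm{opt}}(x) = \begin{cases} 0, & |r^{i,j}| \le 1 - 2 d_{\alpha/2}, \\ 1, & |r^{i,j}| > 1 - 2 d_{\alpha/2}, \end{cases} \qquad (19)$$

where $r^{i,j}$ is the sample partial correlation coefficient and $d_{\alpha/2}$ is the $\alpha/2$-quantile of the Beta distribution $Be\left(\dfrac{n - N}{2}, \dfrac{n - N}{2}\right)$.

Let us define the multiple statistical procedure for concentration graph identification

$$\delta^{\mathrm{opt}}(x) = d_G \ \text{iff}\ \Phi^{\mathrm{opt}}(x) = G, \qquad (20)$$

where

$$\Phi^{\mathrm{opt}}(x) = \begin{pmatrix} 0 & \varphi^{\mathrm{opt}}_{1,2}(x) & \ldots & \varphi^{\mathrm{opt}}_{1,N}(x) \\ \varphi^{\mathrm{opt}}_{2,1}(x) & 0 & \ldots & \varphi^{\mathrm{opt}}_{2,N}(x) \\ \ldots & \ldots & \ldots & \ldots \\ \varphi^{\mathrm{opt}}_{N,1}(x) & \varphi^{\mathrm{opt}}_{N,2}(x) & \ldots & 0 \end{pmatrix}. \qquad (21)$$

According to theorem 3, it is easy to prove the following.

Theorem 4. Multiple-decision statistical procedure (20) is optimal in the class of unbiased statistical procedures for GGM identification under the additive loss function.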
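The edge test for the GGM compares the absolute sample partial correlation with a threshold obtained from a symmetric Beta quantile. The sketch below is a rough illustration, not the authors' implementation: it takes the sample partial correlation `r_ij` as given, assumes $n - N \ge 2$ so that the Beta density is bounded, and computes the quantile by Simpson integration and bisection to stay within the standard library. All function names are introduced here.

```python
def symmetric_beta_cdf(x, a, steps=2000):
    """CDF of Beta(a, a) at x via Simpson's rule (assumes a >= 1; steps must be even)."""
    def density_unnormalised(u):
        return (u * (1.0 - u)) ** (a - 1.0)

    def simpson(lo, hi):
        h = (hi - lo) / steps
        total = density_unnormalised(lo) + density_unnormalised(hi)
        for k in range(1, steps):
            total += (4 if k % 2 else 2) * density_unnormalised(lo + k * h)
        return total * h / 3.0

    return simpson(0.0, x) / simpson(0.0, 1.0)


def symmetric_beta_quantile(p, a):
    """p-quantile of Beta(a, a) by bisection on the numerically integrated CDF."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if symmetric_beta_cdf(mid, a) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def umpu_edge_decision(r_ij, n, N, alpha):
    """Return 1 (edge included) iff |r_ij| > 1 - 2 d, with d the alpha/2 Beta quantile."""
    if n - N < 2:
        raise ValueError("this sketch assumes n - N >= 2")
    d = symmetric_beta_quantile(alpha / 2.0, (n - N) / 2.0)
    return 1 if abs(r_ij) > 1.0 - 2.0 * d else 0


# Hypothetical inputs: N = 3 variables, n = 5 observations, level 0.05.
include_strong = umpu_edge_decision(0.96, n=5, N=3, alpha=0.05)
include_weak = umpu_edge_decision(0.50, n=5, N=3, alpha=0.05)
```

With $n - N = 2$ the Beta distribution is uniform, so the threshold is simply $1 - \alpha = 0.95$, which makes the example easy to check by hand. In practice a library quantile function would replace the numerical integration.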

5. Statistical uncertainty

Theorems 1 and 2 make it possible to introduce a single measure of uncertainty for statistical procedures of network structure identification.

Definition 6. The value $R(S, \theta, \delta, n)$ is called the statistical uncertainty of procedure $\delta$ for identification of network structure $S$ under $n$ observations and a distribution of the vector $X$ with $\theta \in \Omega_S$.

Definition 7. Statistical procedure $\delta_1$ for identification of network structure $S_1$ has a smaller statistical uncertainty on $\Omega_1 \subset \Omega$ than statistical procedure $\delta_2$ for identification of network structure $S_2$ if

$$R(S_1, \theta, \delta_1, n) \le R(S_2, \theta, \delta_2, n), \qquad \forall n,\ \forall \theta \in \Omega_1.$$

If $a = \dfrac{1}{2M_1}$, $b = \dfrac{1}{2M_2}$, where $M_i$ is the maximum number of type $i$ errors ($i = 1, 2$), then the measure of statistical uncertainty is equal to the average number of erroneous decisions of procedure $\delta$. The experimental results from [13] show that the uncertainty of the statistical procedure for threshold graph identification is much smaller than the uncertainty of the statistical procedure for maximum spanning tree identification.
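In a simulation study, the statistical uncertainty can be approximated by averaging the additive loss over repeated samples. A minimal sketch, assuming the true edge set is known and the decisions of the procedure on each simulated sample are already available as edge sets; the function name `empirical_uncertainty` and the data are hypothetical, and no result from [13] is reproduced here.

```python
def empirical_uncertainty(true_edges, decisions, a, b):
    """Average of a * (false inclusions) + b * (false omissions) over repeated decisions."""
    total = 0.0
    for decided in decisions:
        false_inclusions = len(decided - true_edges)  # first-kind errors
        false_omissions = len(true_edges - decided)   # second-kind errors
        total += a * false_inclusions + b * false_omissions
    return total / len(decisions)


# Hypothetical true structure and four simulated decisions.
true_edges = {(0, 1), (1, 2)}
decisions = [
    {(0, 1), (1, 2)},          # correct decision
    {(0, 1)},                  # one edge omitted
    {(0, 1), (1, 2), (0, 2)},  # one edge wrongly included
    {(0, 2)},                  # one wrong inclusion, two omissions
]

uncertainty = empirical_uncertainty(true_edges, decisions, a=1.0, b=1.0)
```

The losses of the four decisions are 0, 1, 1 and 3, so the estimate is 1.25; other weights $a$ and $b$ can be passed in to obtain a normalised measure.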

Conclusions


A general approach to the identification of network structures is proposed in the paper. In contrast to the known approach [4, 5], it accounts for both types of errors and makes it possible to investigate optimality properties and to compare different network structures by the statistical uncertainty of their identification procedures.

Acknowledgments. This work was supported in part by the Laboratory of Algorithms and Technologies for Network Analysis of National Research University Higher School of Economics and by the Russian Foundation for Basic Research (project no. 18-07-00524).

References

1. Jordan M.I. Graphical models. Stat. Sci., 2004, vol. 19, no. 1, pp. 140-155. doi: 10.1214/088342304000000026.

2. Lauritzen S.L. Graphical Models. Oxford, Oxford Univ. Press, 1996. 298 p.

3. Anderson T.W. An Introduction to Multivariate Statistical Analysis. New York, John Wiley & Sons, 2003. 752 p.

4. Drton M., Perlman M.D. Model selection for Gaussian concentration graph. Biometrika, 2004, vol. 91, no. 3, pp. 591-602. doi: 10.1093/biomet/91.3.591.

5. Drton M., Perlman M. Multiple testing and error control in Gaussian graphical model selection. Stat. Sci., 2008, vol. 22, no. 3, pp. 430-449. doi: 10.1214/088342307000000113.

6. Boginski V., Butenko S., Pardalos P.M. On structural properties of the market graph. In: Innovations in Financial and Economic Networks. Cheltenham, Edward Elgar Publ., 2003, pp. 29-45.

7. Mantegna R.N. Hierarchical structure in financial markets. Eur. Phys. J. B, 1999, vol. 11, no. 1, pp. 193-197. doi: 10.1007/s100510050929.

8. Koldanov A.P., Koldanov P.A., Kalyagin V.A., Pardalos P.M. Statistical procedures for the market graph construction. Comput. Stat. Data Anal., 2013, vol. 68, pp. 17-29. doi: 10.1016/j.csda.2013.06.005.

9. Koldanov P.A. Risk function of statistical procedures for network structures identification. Vestn. TVGU. Ser. Prikl. Mat., 2017, no. 3, pp. 45-59. doi: 10.26456/vtpmk178. (In Russian)

10. Lehmann E.L. A theory of some multiple decision problems, I. Ann. Math. Stat., 1957, vol. 28, no. 1, pp. 1-25.

11. Wald A. Statistical Decision Functions. New York, John Wiley & Sons, 1950. 179 p.

12. Koldanov P., Koldanov A., Kalyagin V., Pardalos P.M. Uniformly most powerful unbiased test for conditional independence in Gaussian graphical model. Stat. Probab. Lett., 2017, vol. 122, pp. 90-95. doi: 10.1016/j.spl.2016.11.003.

13. Kalyagin V.A., Koldanov A.P., Koldanov P.A., Pardalos P.M., Zamaraev V.A. Measures of uncertainty in market network analysis. Phys. A, 2014, vol. 413, no. 1, pp. 59-70. doi: 10.1016/j.physa.2014.06.054.

Received October 10, 2017

Koldanov Petr Alexandrovich, Candidate of Technical Sciences National Research University Higher School of Economics ul. B. Pecherskaya 2, Nizhny Novgorod, 603025 Russia E-mail: pkoldanov@hse.ru


Koldanov Petr Alexandrovich, Candidate of Technical Sciences, Associate Professor, Department of Applied Mathematics and Informatics

National Research University Higher School of Economics

ul. B. Pecherskaya 2, Nizhny Novgorod, 603025 Russia E-mail: pkoldanov@hse.ru

For citation: Koldanov P.A. Risk function and optimality of statistical procedures for identification of network structures. Uchenye Zapiski Kazanskogo Universiteta. Seriya Fiziko-Matematicheskie Nauki, 2018, vol. 160, no. 2, pp. 317-326.
