
DEVELOPMENT OF AN ALGORITHM FOR AUTOMATIC CONSTRUCTION OF A COMPUTATIONAL PROCEDURE OF LOCAL IMAGE PROCESSING, BASED ON THE HIERARCHICAL REGRESSION

V.N. Kopenkov 1, V.V. Myasnikov 1,2
1 Samara National Research University, Samara, Russia;
2 Image Processing Systems Institute of RAS, Branch of the FSRC "Crystallography and Photonics" RAS, Samara, Russia

Abstract

In this paper, we propose an algorithm for the automatic construction (design) of a computational procedure for non-linear local processing of digital signals/images. The aim of this research is to work out an image processing algorithm with a predetermined computational complexity that achieves the best quality of processing on the existing data set, while avoiding the problems of overtraining and insufficient training. To achieve this aim we use a local discrete wavelet transform for preliminary image analysis and hierarchical regression to construct a local image processing procedure on the basis of a training dataset. Moreover, we work out a method to decide whether the training process should be completed or continued. This method is based on the functional of full cross-validation, which allows us to construct a processing procedure with a predetermined computational complexity and veracity, and with the best quality.

Keywords: local processing, hierarchical regression, computational efficiency, machine learning, precedent-based processing, functional of full cross-validation.

Citation: Kopenkov VN, Myasnikov VV. Development of an algorithm for automatic construction of a computational procedure of local image processing, based on the hierarchical regression. Computer Optics 2016; 40(5): 713-720. DOI: 10.18287/2412-6179-2016-40-5-713-720.

Acknowledgments: This work was financially supported by the Russian Science Foundation (RSF), grant no. 14-31-00014 "Establishment of a Laboratory of Advanced Technology for Earth Remote Sensing".

Introduction

The tasks of image processing and signal analysis need to be solved in many fields of human activity [1-3]. Local processing of digital images is one of the most important kinds of transformation in the theory and practice of digital image processing and computer vision.

Historically, the first processing procedures used local linear methods, which allow the construction of optimal (in some sense) processing procedures [1, 3]. However, the emergence of new digital signal processing tasks (processing of video, audio, satellite images, etc.), the problems of processing large amounts of information (satellite images, remote sensing data, hyperspectral data, multi-dimensional signals), processing in real time, and the need to raise processing efficiency resulted in the necessity of using nonlinear transformations [1, 4]. One of the most common approaches currently in use is the implementation of the cybernetic "black box" principle (in other authors' terms: processing via recognition, processing based on precedents, and so on). In this case the transformation itself and its parameters are determined by analyzing the input and output signals or images.

The classic approach to the construction of approximately universal procedures of local adaptive digital signal and image processing, which implements the "black box" principle, is based on the artificial neural network technique [4]. An alternative, but substantially less researched, version of the solution of the described task is based on a hierarchical computational structure, such as a decision tree or regression tree [5, 6]. This paper develops the idea of creating a universal mechanism for the construction of local non-linear computational processing procedures based on a hierarchical scheme and on features derived from a local discrete wavelet decomposition of the image.

In addition, the methodology of decision-making on stopping the learning process, as well as the veracity of the obtained results, are also presented in the article. Usually, the Vapnik-Chervonenkis statistical theory is used to estimate the generalization capability and to select a stopping rule for learning a processing procedure [7]. This theory interconnects three parameters of training: the training error, the veracity (reliability), and the length of the training dataset. But the estimates of the statistical theory are highly overestimated and ignore the potential rearrangement of training and testing dataset elements. A more efficient way to estimate the generalization capability is to use the Vorontsov combinatorial theory [8], which is based on evaluation of the functional of full cross-validation, assuming verification of all possible divisions of the dataset into training and testing parts. The exact solution of the task of constructing an image processing procedure that takes into account all combinations of training and control data is unrealizable in practice because of the enormous number of different combinations to enumerate.

The paper is organized as follows. The first section is devoted to the introduction of the subject. The task specification and the description of the proposed solution, as well as the scheme of the processing process, are presented in the second section. The description and structure of the algorithm for constructing a local image processing procedure and its parameters are presented in the third section. The fourth section is devoted to the description of the methodology which allows us to determine the rule for stopping the formation and enumeration of various combinations of training and control samples, and for stopping the construction process in general. Finally, conclusions, recommendations, acknowledgments, and references are presented at the end of the paper.

Image local processing model

The model of the local image processing technology which implements the "black box" principle (processing through recognition, or based on precedents) suggests decomposition of the transformation into two stages: the formation of the image fragment description (local feature computation) and the calculation of the transformation result. The general scheme of image processing is shown in Fig. 1.

To formalize the local image processing problem based on the proposed scheme, let us introduce the following description of the pixel neighborhood:

Ω = {(n_1, n_2): n_1 = 0, …, N_1 - 1, n_2 = 0, …, N_2 - 1} is the image domain;

x: Ω → R, (n_1, n_2) ↦ x(n_1, n_2), is the input image;

Ω(n_1, n_2) ⊂ Ω is an image fragment: Ω(n_1, n_2) = {(n_1 + m_1, n_2 + m_2): m_1 = 0, …, M_1 - 1, m_2 = 0, …, M_2 - 1} (with x|_Ω(n_1,n_2) the restriction of the image to the region Ω(n_1, n_2));

Ω̃(n_1, n_2) = {(n_1 + m_1, n_2 + m_2): m_1 = -⌊M_1/2⌋, …, ⌊M_1/2⌋, m_2 = -⌊M_2/2⌋, …, ⌊M_2/2⌋} is a centered fragment;

N_1, N_2 are the image dimensions; M_1, M_2 are the fragment dimensions (the processing "window").

The main task of the first stage is the formation of features (a specific set of image properties) for a predetermined local image fragment, y = (y_0, y_1, …, y_{K-1})^T, y ∈ R^K, on the basis of a transformation Φ_1: R^{M_1×M_2} → R^K.

Fig. 1. A scheme of local image processing

These features are used to calculate the result of the transformation Φ_2: R^K → R (and to generate the resulting image z) during the second stage of processing.
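As an illustration, the two-stage scheme can be sketched as a sliding-window computation; the operators `phi1` and `phi2` below are hypothetical placeholders, not the transforms actually used in the paper:

```python
import numpy as np

def local_process(x, phi1, phi2, M1, M2):
    """Two-stage local processing: phi1 maps each M1 x M2 window to a
    K-dimensional feature vector, phi2 maps the features to the output."""
    N1, N2 = x.shape
    z = np.zeros((N1 - M1 + 1, N2 - M2 + 1))
    for n1 in range(N1 - M1 + 1):
        for n2 in range(N2 - M2 + 1):
            y = phi1(x[n1:n1 + M1, n2:n2 + M2])  # Phi_1: R^(M1xM2) -> R^K
            z[n1, n2] = phi2(y)                  # Phi_2: R^K -> R
    return z

# Hypothetical placeholder operators (NOT the paper's actual transforms):
phi1 = lambda w: np.array([w.mean(), w.std()])   # toy 2-feature description
phi2 = lambda y: y[0]                            # toy output: the mean feature

x = np.arange(16, dtype=float).reshape(4, 4)
z = local_process(x, phi1, phi2, 2, 2)           # z has shape (3, 3)
```

In an actual pipeline, `phi1` would be the local wavelet feature extractor and `phi2` the trained hierarchical regression described below.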

The whole construction process is based on processing precedents, a set of matched pairs {(x|_Ω(n_1,n_2), z(n_1, n_2))}: Ω(n_1, n_2) ⊂ Ω (usually called the training dataset), in order to minimize the processing error:

ε² = (1/|Ω|) Σ_{(n_1,n_2)} || z(n_1, n_2) - Φ_2(Φ_1(x|_Ω(n_1,n_2))) ||² → min over Φ_1, Φ_2,   (1)

where Ω is the image domain and Ω(n_1, n_2) ⊂ Ω is the restriction to the local fragment of size M_1 × M_2:

Ω(n_1, n_2) = {(n_1 + m_1, n_2 + m_2): m_1 = 0, …, M_1 - 1, m_2 = 0, …, M_2 - 1}.

Characteristics of the solution

The best-known solution of the described task is based on the use of artificial neural networks. Such an approach has some special features, advantages and disadvantages, which are described in detail in [4]. The alternative technology of processing procedure construction is based on special hierarchical computational structures, such as regression trees and decision trees [5, 6]. These trees are hierarchical structures consisting of two types of vertices: non-terminal vertices, which define a partition of the feature domain, and terminal vertices, which store a regression function.

The procedure based on regression trees has some advantages in comparison with neural networks:

• automatic correction of the "architecture" of the transformation;

• automatic selection of local features, which results from the partition process;

• finiteness of the building and tuning process (computational efficiency);

• ease of tuning the regression parameters in a terminal vertex.

There are some restrictions on the practical implementation. First, the most important task at the stage of calculating image features in a "sliding window" mode is to develop a computationally efficient algorithm for this calculation. Moreover, this algorithm should allow a consistent increase of the number of features up to the whole system, because traditional algorithms for defining an effective feature subset are based on iterative methods and are computationally inefficient. Second, the main task at the stage of designing the hierarchical regression is the development of an algorithm for automatic construction of processing procedures on the basis of the training dataset which is able to avoid the problems of overtraining and insufficient training.

Choosing the type of linear local features

We used a family of signal characteristics based on local discrete wavelet transforms (DWT) of signals and images as the image feature set. Such features have the following characteristics:

• existence of a computationally efficient calculation algorithm [9];

• complete description of the input signal;

• consistent obtaining and usage of features removes the problem of iterating over the feature set.

Issues related to feature formation on the basis of local DWT algorithms, as well as their advantages and specifics in relation to local image processing tasks, are considered in [9].

The classic scheme for fast calculation of the local wavelet transform (FWT) is based on the Mallat scheme [4] and, in accordance with the theory of multiresolution analysis, can be represented by the following equations:

w_{l+1}(p) = Σ_{n ∈ D_h} h(n - 2p) · w_l(n),

w̄_{l+1}(p) = Σ_{n ∈ D_g} g(n - 2p) · w_l(n),

where p = 1, …, N; N is the length of the input signal; h(n), g(n) are filters such that Σ_n h(n) = 1, g(n) = (-1)^n h(-n + 2t - 1) (t ∈ Z); D_h, D_g are the sizes of the filter domains; l = 0, …, log_2 M are the wavelet levels; and M is the processing window size.

Concerning image processing, the computational complexity of such an algorithm for the wavelet levels [L_1, L_2] can be evaluated [9] as follows:

FWT based on the Mallat scheme: U_1*(L_1, L_2) = (8/3)(2^{2L_2} - 1);

Modified FWT: U_2*(L_1, L_2) = 8L_2 - 5L_1 - 5;

Recursive FWT: U_3*(L_1, L_2) = 13(L_2 - L_1 + 1).
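A minimal numerical sketch of one analysis step of this scheme, using the Haar filter pair normalized so that Σ h(n) = 1 as in the text (the actual filters used in [9] may differ):

```python
import numpy as np

def mallat_step(w, h, g):
    """One analysis step of the Mallat scheme:
    approximation:  w_{l+1}(p)    = sum_n h(n - 2p) * w_l(n)
    detail:         wbar_{l+1}(p) = sum_n g(n - 2p) * w_l(n)"""
    P = len(w) // 2
    approx = np.array([sum(h[n - 2 * p] * w[n] for n in range(2 * p, 2 * p + len(h)))
                       for p in range(P)])
    detail = np.array([sum(g[n - 2 * p] * w[n] for n in range(2 * p, 2 * p + len(g)))
                       for p in range(P)])
    return approx, detail

# Haar pair normalized so that sum h(n) = 1; the relation
# g(n) = (-1)^n h(-n + 2t - 1) with t = 1 gives g = (0.5, -0.5)
h = [0.5, 0.5]
g = [0.5, -0.5]
w = np.array([4.0, 2.0, 6.0, 8.0])
a, d = mallat_step(w, h, g)   # a = [3.0, 7.0], d = [1.0, -1.0]
```

Iterating this step on the approximation coefficients yields the multi-level decomposition used as the feature set.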

Regression tree construction

The technology of regression tree construction consists of the following stages:

1) Selecting the parameters and method for hierarchical structure construction.

Here we need to distinguish the vertices which require a partition based on the error evaluation, as well as to determine the parameters of the vertex partition (a threshold and the number of divided vertices), while at the same time selecting "the best feature"; together these choices provide the maximum error reduction.

When using a linear regression:

f_j(y) = Σ_{k=0}^{K-1} a_{jk} · y_k + a_{jK}, (y_k ∈ [y_{k,j}^min, y_{k,j}^max]),   (2)

the processing error (1) at a terminal vertex can be represented as:

ε_j(k*, α*) = min_f || f(y) - g(y) ||_[y_{k*,j}^min, α*] + min_f || f(y) - g(y) ||_[α*, y_{k*,j}^max],

where α* is the optimal partition threshold and k* is the best feature (the pair (k*, α*) minimizes the total error over all features k and thresholds α).

2) Calculating the regression coefficients for each terminal vertex.

The elementary regression adjustment (construction) for a vertex is a regression coefficient calculation based on Ordinary Least Squares (the solution of a system of linear algebraic equations over all vertex elements of the training set):

f(y) = a^T y, y = (y_0, y_1, …, y_{K-1}, 1)^T;

ε² = E|| a^T y - g(y) ||²; ε² = (a^T U - G)^T (a^T U - G), where a = (U U^T)^{-1} U G.
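The OLS solution a = (U U^T)^{-1} U G can be checked directly on synthetic data (illustrative, not the paper's features):

```python
import numpy as np

# Synthetic vertex data: each column of U is an augmented feature vector
# y = (y_0, ..., y_{K-1}, 1)^T; G holds the targets g(y)
rng = np.random.default_rng(0)
K, N = 2, 50
U = np.vstack([rng.normal(size=(K, N)), np.ones(N)])
a_true = np.array([1.5, -2.0, 0.5])
G = a_true @ U                          # noiseless targets for the check

a = np.linalg.solve(U @ U.T, U @ G)     # a = (U U^T)^(-1) U G
err = float(np.sum((a @ U - G) ** 2))   # eps^2 = (a^T U - G)^T (a^T U - G)
```

With noiseless targets the recovered coefficients match `a_true` and the residual `err` is at machine-precision level.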

At the same time, in the case of linear regression it is possible that the number of vertex elements in a set is insufficient to calculate the regression coefficients. There are different ways to solve this problem, such as: rejection of the regression construction for the vertex, a coefficient "descent" from the upper-level vertex, a reduction of the number of features considered for the vertex, etc.

However, the most effective solution is to apply a regularization, i.e. to extend the system of linear algebraic equations with the elements of the upper-level set, and to solve the system with constraints which guarantee zero error for the target terminal vertex points:

a^T y_n = g(y_n), n = 1, …, N,

(a^T U - G)^T (a^T U - G) → min.

To solve this problem we use the method of Lagrange multipliers based on the functional of the following form:

F = Σ_{m=1}^{M} ( g(y^(m)) - Σ_{k=0}^{K} a_k · y_k^(m) )² + Σ_{n=1}^{N} λ_n · ( Σ_{k=0}^{K} a_k · y_k^(n) - g(y^(n)) ) → min.

The solutions of the respective system of equations are:

∂F/∂a_i = Σ_{m=1}^{M} 2 y_i^(m) · ( g(y^(m)) - Σ_{k=0}^{K} a_k · y_k^(m) ) - Σ_{n=1}^{N} y_i^(n) · λ_n = 0, i = 0, …, K;

∂F/∂λ_j: g(y^(j)) - Σ_{k=0}^{K} a_k · y_k^(j) = 0, j = 1, …, N.
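One practical way to solve this equality-constrained least-squares problem is through the Lagrange (KKT) linear system; a hedged sketch with toy data (the numbers are illustrative, not from the paper):

```python
import numpy as np

def constrained_ls(A, b, C, d):
    """min ||A a - b||^2 subject to C a = d, via the Lagrange/KKT system:
    [2 A^T A  C^T] [a     ]   [2 A^T b]
    [C        0  ] [lambda] = [d      ]"""
    K, m = A.shape[1], C.shape[0]
    KKT = np.block([[2 * A.T @ A, C.T], [C, np.zeros((m, m))]])
    rhs = np.concatenate([2 * A.T @ b, d])
    return np.linalg.solve(KKT, rhs)[:K]

# Toy data: fit a line to three upper-level points while passing exactly
# through the vertex's own point (y = 2, g = 5)
A = np.array([[0.0, 1.0], [1.0, 1.0], [3.0, 1.0]])   # rows: (y, 1)
b = np.array([1.0, 2.0, 4.0])
C = np.array([[2.0, 1.0]])
d = np.array([5.0])
a = constrained_ls(A, b, C, d)   # the constraint C a = d holds exactly
```

The constraints reproduce the target terminal vertex points exactly, while the quadratic term fits the upper-level points as well as possible.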

For the transformation of the second type (outputs of the form K = {0, 1, 2, …, L-1}, i.e. the solution of classification problems), the solution is formed as:

f(y) = c, where c = {l : n_l^v = max_{j=0,…,L-1} n_j^v}.

Here the analogous regularization consists in using the "upper-level" vertex classification results to determine the class number for the target terminal vertex:

l = arg max_{j ∈ S_v} n_j^{v-1},

where S_v = {l : n_l^v = max_{j=0,…,L-1} n_j^v};

v is a terminal vertex;

v-1 is the non-terminal vertex preceding the vertex v;

S_v is the set of tied class indices for the vertex v;

n_j^v is the number of objects of the j-th class in the area of the vertex v.

Here the classification error is:

ε = (1/(N_1 N_2)) Σ I(f(y), z), where I(a, b) = 0 if a = b, and 1 if a ≠ b.
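The majority-vote rule with upper-level tie-breaking can be sketched as follows (class-count vectors are illustrative):

```python
import numpy as np

def vertex_class(counts_v, counts_parent):
    """Majority class at a terminal vertex v; ties in S_v are broken using
    the class counts of the preceding (upper-level) vertex, as described
    in the text."""
    counts_v = np.asarray(counts_v)
    S_v = np.flatnonzero(counts_v == counts_v.max())   # tied class indices
    if len(S_v) == 1:
        return int(S_v[0])
    # regularization: among tied classes, prefer the parent's most frequent
    return int(S_v[np.argmax(np.asarray(counts_parent)[S_v])])
```

For example, `vertex_class([5, 2, 1], [10, 9, 8])` returns class 0 outright, while `vertex_class([3, 3, 1], [4, 9, 8])` resolves the tie between classes 0 and 1 in favor of class 1 using the parent's counts.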

3) Checking the limitations on the computational complexity and evaluating the quality of the processing procedure on the control set.

Finishing the construction process

To determine the stopping parameters for the algorithm construction process, we need to estimate the generalization capability of the local processing procedures. When the amount of available data is limited, we cannot infinitely increase the decision rule complexity; otherwise the processing procedure will "maximally adapt" to the training set and will demonstrate bad results for other images of the class under consideration. On the other hand, if the procedure is "half-taught", the transformation error will be unacceptable for both the training and control sets. Obviously, for every problem there is an optimal model complexity, which provides the best possible quality of generalization.

Statistical approach

Suppose there is a set of objects for training:

Ω^T = {ω_j = (y_j, g(y_j))}, j = 1, …, T.

To evaluate the quality of the algorithm, it is necessary to divide the set into "training" and "control" parts:

Ω^T = Ω^s ∪ Ω^t, where s + t = T, Ω^s ∩ Ω^t = ∅.

The algorithm quality for a set Ω is described by the expression:

ν(a, Ω) = (1/|Ω|) Σ_{ω_i ∈ Ω} I(ω_i, a(ω_i)),   (3)

where A is a family of algorithms;

a is a regression or classification algorithm (a computable function);

μ is a training method which constructs an algorithm from a set: μ(Ω) = a;

for regression construction problems:
I(ω_i, a(ω_i)) = 1 if |a(ω_i) - f(ω_i)| > δ(ω_i), and 0 if |a(ω_i) - f(ω_i)| ≤ δ(ω_i);

for classification problems:
I(ω_i, a(ω_i)) = 1 if a(ω_i) ≠ f(ω_i), and 0 if a(ω_i) = f(ω_i).

Therefore, according to [7], the functional of uniform deviation of the error rates on the two sets can be represented as:

P_ε^{s,t}(A) = P{ sup_{a ∈ A} ( ν(a, Ω^t) - ν(a, Ω^s) ) > ε }.   (4)

It allows us to write the following bound (for s = t): P_ε^{s,t}(A) ≤ Δ_A(2s) · 1.5 e^{-ε²s}, where Δ_A(s) is the growth function of the family of algorithms, Δ_A(s) ≤ 1.5 (s^h / h!).

The use of statistical theory for risk estimation, in the case of solving the regression construction problem, is as follows. Let a be a vector of regression coefficients, and ε²(a) be the transformation error on a training set of size s. Then, under fairly general assumptions about the form of the feature vector distribution, with probability (1 - η), for all regression functions of the family A(Ω, a) it is appropriate to use an estimate of the following form:

I(a) ≤ ε²(a) / ( 1 - √( (h (ln(s/h) + 1) - ln η) / s ) ) = J(a),   (5)

where h is the capacity of the decision function class, and I(a) is the average risk (generally speaking, its minimization is the purpose of solving the regression construction problem).

In this case, if we specify a structure of training methods μ_1 ⊂ μ_2 ⊂ … ⊂ μ_M = μ for the allowable family of algorithms A(Ω, a), then it becomes possible to minimize the average risk functional over the structure elements. For each constructed algorithm a_p = μ_p(Ω, a) we calculate the estimate a_p using Ordinary Least Squares, and then search for the best estimate among a_1, a_2, …, a_M in the sense of the minimum of the expression J(a) from (5):

a_opt = arg min_{1 ≤ p ≤ M} J(a_p),

which is the solution of the regression construction problem with the parameters a_opt.

In the case of hierarchical regression with a linear elementary regression function for terminal vertices, the structure μ_1 ⊂ μ_2 ⊂ … ⊂ μ_M = μ produces "trees" nested within one another, which store the elementary regression parameters in each vertex. Here a is regarded as a generalized vector of coefficients, which consists of the regression coefficients of each terminal vertex of the tree.

The capacity of decision function class determines "the number of degrees of freedom" (the number of class parameters). This value is indirectly related to the amount of memory required to store hierarchical regression parameters. It is easy to see that for linear regression h = K + 1, for piecewise-linear regression h = (K + 1) P, for piecewise-constant regression h = P + 1, where K is a number of features, P is a number of terminal vertices of the tree.
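For instance, the capacities above can be tabulated directly (a trivial sketch; the `kind` names are illustrative):

```python
def capacity(kind, K, P):
    """Capacity h (degrees of freedom) of the decision-function classes
    listed in the text, for K features and P terminal vertices."""
    return {"linear": K + 1,
            "piecewise_linear": (K + 1) * P,
            "piecewise_constant": P + 1}[kind]

# e.g. for K = 4 features and a tree with P = 8 terminal vertices:
# linear -> 5, piecewise-linear -> 40, piecewise-constant -> 9
```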

Furthermore, according to [1, 2], we can write the following expression:

η ≤ 4.5 ( (2s)^{h-1} / (h - 1)! ) e^{-0.5 ε² (s-1)},   (6)

which characterizes the relationship between the three parameters: ε (accuracy), η (reliability, trustworthiness), and s (the size of the training set).

The considered method can be used to develop a stopping rule for the construction process, but the basic problem of the statistical theory is overestimation due to excessive generalization. The theory is valid for any target function, an arbitrary distribution of objects in space, and a very broad class of training methods, i.e. many significant features of the training process are not taken into account.

Combinatorial approach

The combinatorial approach arose as an attempt to rebuild statistical learning theory more precisely, starting from its initial postulates [7]. It turned out that the complexity of the algorithm model is not the determining factor of training quality [8]. The way the algorithm parameters are adjusted on the set is much more important. Combinatorial theory makes it possible to justify the use of arbitrarily complex algorithm models, provided that they are tuned appropriately. Major studies in the development of combinatorial theory aim to show how parameter adjustment should be performed to avoid the risk of overtraining.

As mentioned above, a low error rate on a given training set does not, in general, mean that the constructed algorithm will also perform well on other sets. In this case the error rate on a control set Ω^t, which, in general, does not intersect with the training set Ω^s, also cannot adequately describe the training quality. The disadvantage of this approach is that, generally speaking, we fix one random partition Ω^T = Ω^s ∪ Ω^t into training and control parts, and even if the value ε_{Ω^t} is small enough, there is no guarantee that, in the case of another partition Ω^T = Ω^s' ∪ Ω^t' of the same set, the value ε_{Ω^t'} is also small.

From these considerations follows the requirement for the functional characterizing the training quality on a finite set: it must be invariant under arbitrary permutations of the set [8].

Let (Ω^s_n, Ω^t_n), n = 1, 2, …, N, be all possible partitions of the set Ω^T into training and control sets. Let ν(μ(Ω^s_n), Ω^t_n) be the error rate of the algorithm μ(Ω^s_n), constructed on the basis of the set Ω^s_n and tested on the set Ω^t_n (an analogue of the function (3) for various combinations of control and training sets). The number N of all partitions of the set is C_T^s.

The full cross-validation functional, which characterizes the quality of training by the method μ(Ω) on a finite set of objects Ω and possesses the invariance property, is:

Q^{s,t}(μ(Ω), Ω) = (1/N) Σ_{n=1}^{N} ν(μ(Ω^s_n), Ω^t_n).   (7)

It is proved in [2] that the expected value of this quality functional is bounded: E Q_ε^{s,t}(μ(Ω), Ω) ≤ P_ε^{s,t}(A), where P_ε^{s,t}(A) is the functional of uniform deviation of the error rates on two sets (4) from the Vapnik theory [7].

Next, if we consider h as the capacity of the decision function class, it becomes possible to write an estimate for (7) in the following form:

Q_ε^{s,t}(μ(Ω), Ω) ≤ (C_T^0 + C_T^1 + … + C_T^h)(C_{T-εt}^s / C_T^s).   (8)

This estimate allows one to reduce significantly (by orders of magnitude) the requirements on the size of the training set. Empirical studies show that in many cases this model selection technique is preferable to the principles of structural risk minimization and minimum description length, which are based on various formalizations of the concept of algorithm complexity. In [2] the concept of effective capacity is introduced, and it is shown that the statistical estimates remain correct if capacity is replaced with effective capacity. At the same time, in particular problems the effective capacity can be considerably lower than the total capacity of the family, e.g. in the case of linear decision rules.
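Since enumerating all C_T^s partitions is infeasible in practice, the functional (7) can be approximated by averaging over a sampled subset of partitions. A sketch, where the `fit` and `error` callables are illustrative stand-ins for the training method μ and the error ν:

```python
import numpy as np

def cv_functional(X, y, fit, error, s, n_splits, seed=0):
    """Monte-Carlo estimate of the full cross-validation functional (7):
    averages the control-set error over n_splits sampled partitions
    instead of all C_T^s possible ones."""
    rng = np.random.default_rng(seed)
    T = len(y)
    vals = []
    for _ in range(n_splits):
        idx = rng.permutation(T)
        model = fit(X[idx[:s]], y[idx[:s]])                # train on Omega^s_n
        vals.append(error(model, X[idx[s:]], y[idx[s:]]))  # test on Omega^t_n
    return float(np.mean(vals))

# Toy stand-ins for the training method mu and the error nu (illustrative):
fit = lambda X, y: np.polyfit(X, y, 1)
error = lambda m, X, y: float(np.mean((np.polyval(m, X) - y) ** 2))
X = np.linspace(0.0, 1.0, 40)
y = 2.0 * X + 1.0
q = cv_functional(X, y, fit, error, s=20, n_splits=25)  # ~0 for noiseless data
```

The sampled average is itself a random variable; the stopping methodology below treats its distribution explicitly.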

The effective capacity is defined via:

Q_ε(μ, Ω^T) = (1/N) Σ_{n=1}^{N} [ sup ( ν(μ, Ω^t_n) - ν(μ, Ω^s_n) ) > ε ] ≤ C (T^h e^{-ε²s} / h!),

where C is a constant and s = t.

Moreover, we can estimate a local effective capacity, i.e. the value of the parameter h in expression (8) for which the dependence Q_ε^{s,t}(μ(Ω), Ω) is approximated in the best possible way by the following formula:

Q(μ(Ω), Ω) ≈ C (T^h / h!) max_{m ∈ M(ε,σ)} Σ_{k ∈ K(ε,σ)} C_m^k C_{T-m}^{s-k} / C_T^s,

where M(ε, σ) = {m : εt ≤ m ≤ t + σs}, K(ε, σ) = {k : max(0, m - t) ≤ k ≤ min(σs, (m - εt) s / T)}.

Unlike the Vapnik concept of effective capacity, the local effective capacity takes into account all three factors: the features of the object distribution, the features of the target function, and the features of the training method.

Algorithm for automatic construction of a processing procedure with limitations on complexity and execution quality

The development of an effective algorithm of local image processing based on hierarchical regression and on features such as a local DWT of the image requires simultaneous consideration of different performance indicators: the computational complexity of the procedure, its quality (processing error), and its generalization capability.

The scheme of the algorithm for automatic construction of processing procedures is shown in Fig. 2. The algorithm assumes the consistent accumulation of features as long as the functional of full cross-validation is decreasing (which means that the quality of processing is improving), and the computational complexity of the procedure remains within predetermined limits.

Experimental research

As an experimental task, we consider image filtering. The solution involves the use of local image processing procedures based on a regression tree (RT) and an artificial neural network (NN). The comparison of the processing quality and computational complexity of these algorithms is presented in Table 1.

Table 1. The comparison results

NN  ε  10.92  10.78  10.69  10.67  10.63  10.62  10.62  10.64
    U     54     99    189    279    369    459    549    621
RT  ε  11.28  11.01  10.89  10.81  10.72  10.62  10.57  10.61
    U     31     38     41     44     46     48     50     52

As can be seen from the table, the proposed hierarchical regression method achieves better accuracy with essentially smaller computational complexity than the well-known neural network method.

The experimental results allow us to draw the following conclusions:

- the constructed computational procedure for local processing demonstrates significantly higher efficiency (in both speed and processing quality) in solving filtering and image restoration problems, compared with the well-known Wiener filter;

- the computational procedure for local processing and its construction method demonstrate superiority over the processing and construction method based on artificial neural networks; in terms of computational complexity, it is superior by orders of magnitude.


[Fig. 2 is a flowchart: generation of the datasets X^s and X^t from X^T; consecutive addition of a new feature (k := k + 1); regression tree construction a = μ(X^s) with calculation of ν(a, X^s) and ν(a, X^t); selection and partitioning of nodes (r := r + 1) with evaluation of the tree complexity U*; output of the procedure a with the minimal control error.]

Fig. 2. A diagram of the algorithm for constructing a computational procedure of local image processing

Methodology of stopping the training process

Taking into account the generalization capability of the local processing procedures based on the functional of full cross-validation [8], we can estimate the total number N of all possible partitions of the dataset as C_T^s. The general scheme of the construction procedure is shown in Fig. 3.

It is quite logical that, in the case of image processing, the construction of processing procedures taking into account all combinations of training and testing datasets is unrealizable because of the incredibly large number of combinations to enumerate. Therefore, we had to develop a method for determining a rule to stop the enumeration process on the basis of a finite number of samples.

For sufficiently large sample volumes it can be assumed that the error rate of the algorithm has a binomial distribution with t degrees of freedom (the length of the test dataset) and probability of "success" p (the quality of the algorithm on a control set).

In this way:

ν(μ(Ω^s_n), Ω^t_n) ~ Bin(t, p).

The probability function is specified as:

p_ν(r) = C_t^r p^r (1 - p)^{t-r}, r = 0, …, t.

Then the distribution of the full cross-validation functional is evaluated as:

Q^{s,t}(μ(Ω), Ω) = (1/N) Σ_{n=1}^{N} ν(μ(Ω^s_n), Ω^t_n) ~ Bin(N·t, p).

[Fig. 3 is a flowchart: creation of the initial dataset (x_i, z_i), i = 1, …, T; feature calculation for each pair; generation of the datasets Ω^s_n, Ω^t_n; construction of the transformation a = μ(Ω^s_n) and calculation of ν(a, Ω^s_n), ν(a, Ω^t_n).]

Fig. 3. Scheme of the construction procedure with exhaustive search over dataset partitions

The decision about continuing or stopping the generation of different combinations of training and control datasets, and about the transition to the next subset of features, can be taken based on the analysis of the functionals Q_1^{s,t} ~ Bin(N_1·t, p_1) and Q_2^{s,t} ~ Bin(N_2·t, p_2) for the different subsets of features. We decide whether to recalculate the feature space, or to stop the process of building the processing procedure, under the assumption p_2 < p_1 with veracity γ (and, correspondingly, p_2 ≥ p_1 with veracity (1 - γ)).

In such a case the contribution of each object to the quality of the algorithm on the dataset Ω can be treated as a Bernoulli variable:

I(ω_i, μ(ω_i)) = 1 with probability p, and 0 with probability 1 - p.

Moreover, if n >> 1 (which is justified, because n is the number of objects and corresponds to the image size) and λ = np is fixed, we obtain the Poisson distribution with parameter λ: Bin(n, λ/n) ≈ P(λ).
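This Poisson limit is easy to verify numerically (a sketch with illustrative parameters):

```python
import math

# Bin(n, lam/n) converges to the Poisson distribution P(lam) as n grows
def binom_pmf(k, n, p):
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam, n = 3.0, 10000
gap = max(abs(binom_pmf(k, n, lam / n) - poisson_pmf(k, lam)) for k in range(10))
# gap is small (on the order of lam^2 / n)
```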

In this case, to make the decision to stop the generation process for various combinations of training and control datasets and to move to the next feature set, we calculate confidence intervals for the expectation of the Poisson distribution of the full cross-validation functional on the datasets N_1 and N_2 in the form:

[λ̂_1 - t_{1-α/2}·√λ̂_1/√N_1, λ̂_1 + t_{1-α/2}·√λ̂_1/√N_1] × [λ̂_2 - t_{1-α/2}·√λ̂_2/√N_2, λ̂_2 + t_{1-α/2}·√λ̂_2/√N_2],

where t_{1-α/2} is the quantile of the N(0, 1) distribution at the level 1 - α/2 (α = 1 - γ).
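A sketch of this decision rule under the stated normal approximation (the λ̂ values and partition counts below are illustrative):

```python
import numpy as np
from statistics import NormalDist

def poisson_ci(lam_hat, n, gamma=0.95):
    """Normal-approximation CI for the Poisson mean of the cross-validation
    functional: lam_hat +- t_{1-a/2} * sqrt(lam_hat) / sqrt(n), a = 1 - gamma."""
    t = NormalDist().inv_cdf(1.0 - (1.0 - gamma) / 2.0)
    half = t * np.sqrt(lam_hat) / np.sqrt(n)
    return lam_hat - half, lam_hat + half

def intervals_separated(lam1, n1, lam2, n2, gamma=0.95):
    """Stop enumerating partitions once the intervals for two consecutive
    feature subsets no longer overlap."""
    lo1, hi1 = poisson_ci(lam1, n1, gamma)
    lo2, hi2 = poisson_ci(lam2, n2, gamma)
    return hi2 < lo1 or hi1 < lo2

# Clearly different error levels separate quickly, while close levels
# estimated from few partitions do not
```

For example, error levels 10.0 and 6.0 estimated from 50 partitions each are already separated at γ = 0.95, while levels 10.0 and 9.5 from 5 partitions are not, so enumeration would continue.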

Fig. 4. Calculation of confidence intervals

The decision to stop generating different combinations of training and control datasets, and to switch to the next subset of features, is taken at the moment when the calculated confidence intervals at adjacent steps become separated.

Illustration of the process of processing algorithm construction

Figure 5 shows an example of training of a regression tree for different sets of features (K = 1, 2, 3, …, 12, with a gradual increase). The graphs show the noise reduction (ε²/D_v) with increasing regression tree depth (H_av). Fig. 6 presents statistics of the process of regression tree construction on various combinations of training and control datasets (each group of points of one color corresponds to the optimum value of quality for a given set of features K = 1, 2, 3, …, 12, in the case of exhaustive search over some number of partitions of the dataset Ω into training and control parts (Ω^s_n, Ω^t_n), n = 1, 2, …, N).

Fig. 7 shows a graph of the construction process of local image processing procedures, with confidence intervals, for the optimal values of quality, and Fig. 8 shows the calculation of the required number of combinations of training/testing datasets for making the decision to switch to the next set of features (the number of combinations required for the separation of the confidence intervals at adjacent steps).

Fig. 5. Process of training of the processing procedure in various feature spaces

[Fig. 6: ε²/D_v versus K, with point groups for k = 2, 4, 6, 8, 10, 12.]

Fig. 6. Statistics of quality of procedures for different combinations of training and control datasets

Fig. 7. Construction of the processing procedure with confidence intervals (ε²/Dv vs. K)

Fig. 8. Calculation of the number of combinations (ln(N) vs. K)

Conclusions and results

The paper presents an efficient technology for the automatic construction of a computational procedure for local processing of digital signals/images. By construction, the resulting computational procedure has a specified complexity, the best quality, and generalizing ability. The proposed method for estimating the required number of algorithm training iterations and, consequently, the rule for stopping the generation of different combinations of training and control datasets once that number is reached, make it possible to exploit the full machinery of combinatorial theory and the functional of full cross-validation while constructing (training) processing procedures tuned on a training dataset. As a result, it is possible to avoid the problems of overtrained or undertrained processing algorithms and, at the same time, to construct a local processing procedure with a predetermined computational complexity and veracity, and with the best quality (for the existing training dataset).

References

[1] Soifer VA, ed, Chernov AV, Chernov VM, Chicheva MA, Fursov VA, Gashnikov MV, Glumov NI, Ilyasova NY, Khramov AG, Korepanov AO, Kupriyanov AV, Myasnikov EV, Myasnikov VV, Popov SB, Sergeyev VV. Computer Image Processing, Part II: Methods and algorithms. VDM Verlag; 2010. ISBN: 978-3-639-17545-5.

[2] Gonzalez R, Woods R. Digital Image Processing. 2nd ed. Prentice Hall; 2002. ISBN: 0-201-18075-8.

[3] Pratt W. Digital image processing. 4th ed. Wiley-Interscience; 2007. ISBN: 978-0-471-76777-0.

[4] Haykin S. Neural Networks: A Comprehensive Foundation. Upper Saddle River, NJ: Prentice Hall; 1999. ISBN: 0-13-273350-1.

[5] Breiman L, Friedman J, Olshen R, Stone C. Classification and regression trees. Monterey, CA: Wadsworth, Inc.; 1984. ISBN: 978-0412048418.

[6] Kopenkov V, Myasnikov V. An algorithm for automatic construction of computational procedure of non-linear local image processing on the base of hierarchical regression [In Russian]. Computer Optics 2012; 36(2): 257-266.

[7] Vapnik VN, Chervonenkis AYa. Theory of Pattern Recognition [In Russian]. Moscow: "Nauka" Publisher; 1974.

[8] Vorontsov K. A combinatorial approach to assessing the quality of training algorithm [In Russian]. Mathematical problems of cybernetics 2004; 13: 5-36.

[9] Kopenkov V. Efficient algorithms of local discrete wavelet transform with HAAR-like bases. Pattern Recognition and Image Analysis 2008; 18(4): 654-661. DOI: 10.1134/S1054661808040184.

[10] Kopenkov V. On halting the process of hierarchical regression construction when implementing computational procedures for local image processing. Pattern Recognition and Image Analysis 2014; 24(4): 506-510. DOI: 10.1134/S1054661814040087.

Authors' information

Vasiliy Nikolaevich Kopenkov (b. 1978) graduated from S.P. Korolyov Samara State Aerospace University (SSAU) in 2001 and received his Candidate's degree in Technical Sciences in 2011. At present he is an assistant at SSAU's Geoinformatics and Information Security sub-department, holding a part-time position of a researcher at the Image Processing Systems Institute of the Russian Academy of Sciences. His research interests include digital signal and image processing, geoinformatics, and pattern recognition. He is a co-author of 53 scientific papers, including 10 research articles. He is a member of the Russian Association of Pattern Recognition and Image Analysis. E-mail: [email protected]. Website: http://www.ipsi.smr.ru/staff/kopenkov.htm.

Information about the author Vladislav Valerievich Myasnikov can be found on page 712 of this issue.

Code of State Categories Scientific and Technical Information (in Russian - GRNTI): 27.43.51. Received September 26, 2016. The final version - October 19, 2016.
