On nonparametric modelling of multidimensional noninertial systems with delay. Scientific article in the specialty "Computer and Information Sciences"

CC BY
Keywords
NONPARAMETRIC IDENTIFICATION / DATA ANALYSIS / SAMPLE / COMPUTATIONAL MODELLING

Abstract of the scientific article in computer and information sciences; authors: Medvedev A.V., Chzhan E.A.

We consider the identification of noninertial objects under nonparametric uncertainty, when no a priori information about the parametric structure of the object is available. In many applications some output variables are measured at intervals that substantially exceed the time constant of the object, so the object must be treated as noninertial with delay. There are two basic approaches to identification. One is identification in the «narrow» sense, or parametric identification. When the a priori information is insufficient to select a parametric structure, it is natural instead to apply local approximation methods, which rely only on qualitative properties of the object. If the source data are sufficiently representative, nonparametric identification gives a satisfactory result, but if there is «sparsity» or there are «gaps» in the space of input and output variables, the quality of nonparametric models is significantly reduced. This article is devoted to a method of filling in, or generating, training samples on the basis of the currently available information, which can significantly improve the accuracy of nonparametric models of noninertial systems with delay. Computational experiments have confirmed that the quality of such models improves significantly after the original sample is «repaired». The method also increases the accuracy of the model near the boundaries of the domain of the input and output variables.


MSC 93B30

DOI: 10.14529/mmp170210

ON NONPARAMETRIC MODELLING OF MULTIDIMENSIONAL NONINERTIAL SYSTEMS WITH DELAY

A.V. Medvedev, E.A. Chzhan

Siberian Federal University, Krasnoyarsk, Russian Federation

Keywords: nonparametric identification; data analysis; computational modelling.

Introduction

Simulation of discrete-continuous systems is of considerable interest because, in practice, some components of the output variables are often measured at intervals that are substantially longer than the time constant of the object. For example, the transient process of a dynamic object may be completed in 20 minutes, while the output variable is measured only after 2 hours. Here we consider the problem of identification of multidimensional static systems with delay under nonparametric uncertainty, when the parametric structure of the process model is not known. In other words, the a priori information about the process under study is not sufficient to define the model of the process, more or less objectively, up to a parameter vector. In this case, the identification problem can be considered in the framework of nonparametric systems [1]. It should be noted that the nonparametric Nadaraya-Watson estimate of the regression function used below belongs to the category of local approximation methods, as opposed to parametric methods.

In nonparametric identification of multidimensional static objects with delay, the quality of the resulting model depends heavily on the initial data. A sample of observations of the input and output variables can have a number of defects [2]. They can be of different nature: they may arise from measurement errors, from the functioning of the investigated process, and from the different control discreteness of the input and output variables. Here we consider the problem of identification of stochastic systems when the sample of observations contains "sparsity" in the regulated area of the process.

Note that within the parametric approach this problem is not so acute, but in nonparametric identification it requires special attention. The forecast of the output variable may be inaccurate or may even be impossible to calculate because of an uncertainty of type [0/0]. This is typical of the nonparametric estimate of the regression function from observations that is used to solve the problem. To some extent, this is the "payment" for skipping the stage at which the parametric structure of the model is defined. Selecting the parametric structure up to a parameter vector is quite a difficult task and requires significant research effort.

The distribution of the sample of observations in the space of input and output variables plays an important role in nonparametric estimation. Often the initial learning sample needs to be supplemented in order to eliminate the "sparsity" in certain subregions of the investigated process. Below we discuss methods for supplementing the initial learning sample which ultimately improve the models obtained by nonparametric identification.

1. The Problem Statement

Consider a multidimensional static object with delay; its general scheme is shown in Fig. 1 [3, 4].

Fig. 1. The General Scheme of the Investigated Object

In Fig. 1 we use the following notation: $A$ is an unknown object operator; the input vector of the object $u(t) = (u_1(t), u_2(t), \dots, u_m(t)) \in \Omega(u) \subset R^m$ has dimension $m$; the output vector of the object $x(t) = (x_1(t), x_2(t), \dots, x_n(t)) \in \Omega(x) \subset R^n$ has dimension $n$; $t$ is continuous time; $\Delta t$ is the control discreteness of the input and output variables of the process; $\xi(t)$ is a random noise vector; $G^u$, $G^x$ are control units of the input and output variables with random noise $g^u(t)$, $g^x(t)$, which have zero mathematical expectations and bounded variances; $u_t$ and $x_t$ are measurements of the variables $u(t)$ and $x(t)$ at discrete time. Thus, by measuring the values of the input and output variables, we obtain a sample $\{u_i, x_i, i = \overline{1, s}\}$, where $s$ is the sample size; it is called the initial sample of observations.

Clearly, if we conduct another experiment on the same object, we get a different sample with another distribution of input and output variables in the space of observations. In particular, a subregion with a large number of observations may be replaced by sparsity.

The identification problem is to construct a nonparametric model of the investigated process from the available learning sample $\{u_i, x_i, i = \overline{1, s}\}$ under conditions of incomplete information, when it is difficult to parameterize the model. The asymptotic properties of nonparametric estimates of regression functions have been studied in detail in [5]. The smoothing properties of nonparametric estimates of regression functions are analysed in sufficient detail in the monographs [6, 7]. Note that sparsity of the sample in the space of input and output variables should not be confused with gaps in the data. The sparse areas have to be restored because of the internal properties of the nonparametric Nadaraya-Watson estimate. However, it should be clearly understood that generated elements added to the initial learning sample do not contain information about the object, because they are not actual data obtained on-site. We use these generated elements in computing the nonparametric estimate in order to eliminate the uncertainty of type [0/0]. Although the new elements are not measured on a real object, they are generated on the basis of the initial learning sample, which reflects the properties of the real object.

2. Parametric Identification

In the construction of models of discrete-continuous processes, parametric identification, or identification in the "narrow" sense, dominates [3, 4]. Parametric identification of stochastic systems consists of two main stages: parameterization of the model and estimation of the parameter vector from the sample of observations of the input and output variables of the process. In other words, at the first stage we select the parametric structure of the model, for example:

$$x_\alpha(t) = f(u(t - \tau), \alpha), \tag{1}$$

where $f$ is a certain function, $\alpha$ is a parameter vector, and $\tau$ is the delay. At the second stage, we estimate the parameters $\alpha$ on the basis of the available sample $\{u_i, x_i, i = \overline{1, s}\}$. There are many methods and algorithms for obtaining these estimates [3].

In this way, the main difficulty lies in the choice of a parametric structure of the object (1). This is the most difficult stage for researchers. Here it would be appropriate to recall the phrase of Democritus: "Even a slight deviation from the truth, in the future leads to infinite error".
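The two stages can be illustrated with a minimal sketch. It assumes, purely for illustration, that structure (1) is linear in a scalar input, $f(u, \alpha) = \alpha_0 + \alpha_1 u$, that the delay $\tau$ is known and expressed in sampling steps, and that the parameters are estimated by ordinary least squares; the function name `fit_linear_with_delay` is hypothetical and none of these choices is prescribed by the text.

```python
def fit_linear_with_delay(u, x, tau):
    """Least-squares estimate of (alpha0, alpha1) in x[t] = alpha0 + alpha1 * u[t - tau].

    The linear structure and the known integer delay `tau` are assumptions
    made for this example only.
    """
    # Stage 1 (structure): pair each output with the input delayed by tau steps.
    pairs = [(u[t - tau], x[t]) for t in range(tau, len(x))]
    n = len(pairs)
    mean_u = sum(a for a, _ in pairs) / n
    mean_x = sum(b for _, b in pairs) / n
    # Stage 2 (estimation): closed-form simple linear regression.
    s_uu = sum((a - mean_u) ** 2 for a, _ in pairs)
    s_ux = sum((a - mean_u) * (b - mean_x) for a, b in pairs)
    alpha1 = s_ux / s_uu
    alpha0 = mean_x - alpha1 * mean_u
    return alpha0, alpha1


u_meas = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
# Outputs follow x[t] = 1 + 2 * u[t - 2]; the first two values precede the delay window.
x_meas = [0.0, 0.0, 1.0, 3.0, 5.0, 7.0]
alpha0, alpha1 = fit_linear_with_delay(u_meas, x_meas, tau=2)
```

If the assumed structure is wrong (for example, the true dependence is not linear), the estimates remain biased however large the sample is, which is the difficulty the paragraph above points to.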

3. Nonparametric Identification

In most cases we have little information about the object: only a few qualitative properties are known, such as uniqueness or ambiguity, linearity or non-linearity of the dynamic object, and others. In this case, the a priori information is not sufficient to select the parametric structure of the object, and it is proposed to use identification methods in a "broad" sense. On this subject, professor N.S. Raybman mentions in the preface to the book [4]: "a priori information about the object in the identification in a broad sense is absent or very poor, so we have to previously solve the large number of additional problems. These problems include: system parametric structure selection and model class assignment... ". As a model of the object we take the nonparametric Nadaraya-Watson estimate of the regression function [1, 8]:

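$$x_s(u) = \frac{\sum_{i=1}^{s} x_i \prod_{j=1}^{m} \Phi\left(c_s^{-1}(u^j - u_i^j)\right)}{\sum_{i=1}^{s} \prod_{j=1}^{m} \Phi\left(c_s^{-1}(u^j - u_i^j)\right)}, \tag{2}$$

where the bell-shaped function $\Phi(c_s^{-1}(u^j - u_i^j))$, $i = \overline{1, s}$, and the smoothing parameter $c_s$ satisfy the convergence conditions [4, 5]. The parameter $c_s$ is found by minimizing a quadratic criterion of the difference between the object and model outputs, based on a sliding exam in which the $k$-th element is skipped by the summation index $i$ in (2):

$$R(c_s) = \sum_{k=1}^{s} \left(x_k - x_s(u_k, c_s)\right)^2 \to \min_{c_s}, \quad k \neq i. \tag{3}$$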
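Estimate (2) and the sliding-exam criterion (3) can be sketched in a few lines. The sketch assumes a triangular bell-shaped kernel, a single smoothing parameter shared by all coordinates, and a simple search over a user-supplied grid of candidate values; treating a sample on which some leave-one-out estimate is [0/0] as having an infinite criterion is also an assumption, and the function names are hypothetical.

```python
import math


def kernel(z):
    # Triangular bell-shaped kernel (assumed choice): positive only for |z| < 1.
    return max(0.0, 1.0 - abs(z))


def nw_estimate(u, U, X, c, skip=None):
    """Nadaraya-Watson estimate (2) at point u; returns NaN in the [0/0] case."""
    num = den = 0.0
    for i, (ui, xi) in enumerate(zip(U, X)):
        if i == skip:  # sliding exam: leave this observation out
            continue
        w = math.prod(kernel((a - b) / c) for a, b in zip(u, ui))
        num += xi * w
        den += w
    return num / den if den > 0 else math.nan


def sliding_exam(U, X, c):
    """Criterion (3): sum of squared leave-one-out errors (inf if any estimate is 0/0)."""
    total = 0.0
    for k, (uk, xk) in enumerate(zip(U, X)):
        est = nw_estimate(uk, U, X, c, skip=k)
        if math.isnan(est):
            return math.inf
        total += (xk - est) ** 2
    return total


def choose_c(U, X, grid):
    # Pick the candidate smoothing parameter with the smallest criterion value.
    return min(grid, key=lambda c: sliding_exam(U, X, c))


U = [(0.0,), (1.0,), (2.0,), (3.0,)]
X = [0.0, 1.0, 2.0, 3.0]
c_best = choose_c(U, X, grid=[0.5, 1.5])
```

With the four equally spaced points above, the candidate 0.5 leaves every point without neighbors during the sliding exam, so the larger value is selected.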

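The estimate $x_s(u)$ (2), based on the sample $\{u_i, x_i, i = \overline{1, s}\}$, belongs to the class of local approximations. Note that the function $\Phi(c_s^{-1}(u^j - u_i^j))$, $i = \overline{1, s}$, $j = \overline{1, m}$, has the following property:

$$\Phi\left(c_s^{-1}(u^j - u_i^j)\right) \begin{cases} > 0, & \text{if } c_s^{-1}|u^j - u_i^j| < \eta, \\ = 0, & \text{otherwise,} \end{cases} \tag{4}$$

where $i = \overline{1, s}$, $j = \overline{1, m}$, and $\eta$ is a constant depending on the choice of a particular bell-shaped function $\Phi(c_s^{-1}(u^j - u_i^j))$, whose value is determined by the values of $(u^j - u_i^j)$ and the smoothing parameter $c_s$. The value of the argument $c_s^{-1}(u^j - u_i^j)$ of the bell-shaped function thus defines a neighborhood of the point $u^j$ whose size is set by $c_s$. If we take the triangular kernel as the bell-shaped function:

$$\Phi\left(c_s^{-1}(u^j - u_i^j)\right) = \begin{cases} 1 - c_s^{-1}|u^j - u_i^j|, & \text{if } c_s^{-1}|u^j - u_i^j| < 1, \\ 0, & \text{otherwise,} \end{cases} \tag{5}$$

then $\eta = 1$. Below, we discuss the $c_s$-neighborhood of the point $u^j = u'^j$, $j = \overline{1, m}$, for fixed $c_s$. In the analysis of the nonparametric estimate of the regression function from observations (2), situations may arise in which none of the elements of the learning sample $\{u_i, x_i, i = \overline{1, s}\}$ belongs to the $c_s$-neighborhood of $u^j = u'^j$, $j = \overline{1, m}$, which leads, in view of (4), to an uncertainty in (2) of the form [0/0].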

The estimate $x_s(u')$ at the point $u' = (u'^1, u'^2, \dots, u'^m)$ is restored on the basis of the sample elements that are in the $c_s$-neighborhood of the point $u'$. Obviously, the accuracy of the estimate depends on the number of elements on which it is based. If there are no sample elements in the $c_s$-neighborhood of the point $u'$, it is impossible to give the estimate: an uncertainty of type [0/0] arises. If there are only a few elements in the $c_s$-neighborhood, the estimate can be calculated (there is no uncertainty), but the forecast $x_s(u')$ can be inaccurate.
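Whether a forecast at a given point is possible can be checked before the estimate is computed. The following minimal sketch assumes $\eta = 1$, as for the triangular kernel, so that an element belongs to the $c_s$-neighborhood when every coordinate differs by less than $c_s$; the function names are hypothetical.

```python
def in_neighborhood(u, ui, c):
    # With eta = 1 (assumed), the product of kernels is positive exactly when
    # every coordinate of u_i differs from that of u by less than c_s.
    return all(abs(a - b) < c for a, b in zip(u, ui))


def neighbor_count(u, U, c):
    """Number of sample inputs in the c_s-neighborhood of the point u."""
    return sum(in_neighborhood(u, ui, c) for ui in U)


def uncertain_points(queries, U, c):
    """Query points with an empty neighborhood, i.e. where (2) gives [0/0]."""
    return [u for u in queries if neighbor_count(u, U, c) == 0]


U = [(0.0, 0.0), (1.0, 1.0)]
queries = [(0.2, 0.1), (0.6, 0.6), (2.0, 2.0)]
no_forecast = uncertain_points(queries, U, c=0.5)
```

A point with a small but non-zero neighbor count still receives a forecast, but one that rests on very few observations, which is the inaccurate case mentioned above.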

The accuracy of the nonparametric estimate (2) depends on the sample of observations $\{u_i, x_i, i = \overline{1, s}\}$. In many practical problems, even for the same process under investigation, the samples $\{u_i, x_i, i = \overline{1, s}\}$ obtained in different time intervals may differ significantly, which affects the accuracy of the forecast $x_s(u)$. Hence the problem arises of generating a working learning sample based on the initial sample $\{u_i, x_i, i = \overline{1, s}\}$. The initial learning sample contains both sparse subregions and subregions with a large number of observations in the domain $\Omega(x, u)$ (Fig. 2).

Fig. 2. Correlation Field of Input Variables for Initial Sample

In Fig. 2, the subdomains with sparsity and the boundary points where the nonparametric Nadaraya-Watson estimate (2) is inaccurate are marked with asterisks. Our task is to convert, algorithmically, the initial sample shown in Fig. 2 into a working sample such as the one shown in Fig. 3.


Fig. 3. Correlation Field of Input Variables for Working Sample

Note that although we are interested in the case of $m$-dimensional vectors $u \in \Omega(u) \subset R^m$, for simplicity of visualization the figures show a two-dimensional vector $u \in \Omega(u) \subset R^2$.

4. Method of Working Sample Generation

The main idea of generating a working sample $\{u_i, x_i, i = \overline{1, N}\}$, $N > s$, from the initial sample $\{u_i, x_i, i = \overline{1, s}\}$ is that the sparse areas of the field are filled with new sample elements, which are included in the working sample according to a particular algorithm.

The idea of sample generation is the following: we generate working samples based on initial observations using different methods. In particular, this idea is used in the bootstrap methods [9]. Bootstrap methods are widely used in statistical analysis in estimating the distribution parameters and hypothesis testing [10, 11].

Generally speaking, it should be noted that only the initial sample of observations of input and output variables {ui,xi,i = 1, s} contains information on the investigated process. We need the newly created elements only to improve the efficiency of nonparametric models, because they are based on local approximation methods. It should be understood that the generated new points do not contain information about the object. Over time, when there are new real measurements of object variables, they are naturally included in the initial learning sample.

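The algorithm for generating a new working sample is the following.

- Using the initial sample $\{u_i, x_i, i = \overline{1, s}\}$, find the value of the smoothing parameter $c_s$ by minimizing criterion (3).

- Denote by $p_k$ the number of elements of the sample $\{u_i, x_i, i = \overline{1, s}\}$ that are located in the $c_s$-neighborhood of the $k$-th element $u_k$, i.e. the elements that satisfy the following condition: if $\prod_{j=1}^{m} \Phi(c_s^{-1}(u_k^j - u_i^j)) > 0$, then the element $u_i$ belongs to the $c_s$-neighborhood of $u_k$.

- Calculate the average number of elements $p_{av}$ in the $c_s$-neighborhoods of the original sample elements using the following formula:

$$p_{av} = \frac{1}{s} \sum_{i=1}^{s} p_i. \tag{6}$$

- Calculate the Euclidean distance between all elements of the sample $\{u_i, x_i, i = \overline{1, s}\}$:

$$d(u_i, u_j) = \sqrt{\sum_{l=1}^{m} (u_i^l - u_j^l)^2}. \tag{7}$$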


Let $\Omega'$ be the set of all distances $d(u_i, u_j)$, $i = \overline{1, s}$, $j = \overline{1, s}$, $i > j$.

- Select the elements of the initial sample $\{u_i, x_i, i = \overline{1, s}\}$ between which the distances $d$ are minimal, for example $\{u_i, x_i, i = \overline{1, s_1}\}$, $s_1 < s$. Numerous numerical studies show that the size $s_1$ of the new sample ranges from 1/2 to 2/3 of the initial sample size $s$. Among all the elements of the set $\Omega'$ we find the minimum value $d_{min}(u_i, u_j)$ and begin to form a new sample set $\tilde{\Omega} \subset \Omega$, including in it the pair $u_i$, $u_j$ and excluding them from the set $\Omega'$. Then we find in $\Omega'$ the next minimum distance between elements of the initial sample, and these elements are also included in the sample set. If $d_{min}(u_i, u_j) = 0$, the elements $u_i$ and $u_j$ coincide, and we include in the sample set only one element, one that was not previously included. This procedure is repeated until the set $\tilde{\Omega}$ contains about 70% of the elements of the initial sample; the 70% share is suggested by numerous computational experiments. Thus $\tilde{\Omega}$ is the set of all elements that make up subdomains with a sufficient number of elements.

- Check the following condition for all the elements of the sample set $\tilde{\Omega}$:

$$p_k < p_{av}, \quad k = \overline{1, s_1}. \tag{8}$$

If (8) is satisfied, we exclude the $k$-th element from the sample set $\tilde{\Omega}$.

- All elements of the sample set $\tilde{\Omega}$ are then excluded from the original sample $\{u_i, x_i, i = \overline{1, s}\}$. We obtain the sample $\{u_i, x_i, i = \overline{1, s'}\}$, $s' < s$. All the elements of this sample are located in sparsity subdomains. This sample also contains the boundary elements.
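The selection steps above can be sketched as follows. The sketch assumes $\eta = 1$ for the neighborhood test, counts each element as a member of its own neighborhood, and stops collecting close pairs once the set holds the requested share of the sample; the function names and the `share` parameter are hypothetical, and details such as tie-breaking between equal distances are not specified by the text.

```python
import math
from itertools import combinations


def neighbor_counts(U, c):
    """p_k: how many sample inputs lie in the c_s-neighborhood of u_k (u_k itself counts)."""
    return [sum(all(abs(a - b) < c for a, b in zip(uk, ui)) for ui in U) for uk in U]


def sparse_indices(U, c, share=0.7):
    """Indices of the elements left after the dense set is removed from the sample."""
    s = len(U)
    p = neighbor_counts(U, c)
    p_av = sum(p) / s  # formula (6)
    # All pairs ordered by Euclidean distance (7), smallest first.
    pairs = sorted(combinations(range(s), 2),
                   key=lambda ij: math.dist(U[ij[0]], U[ij[1]]))
    dense = []
    for i, j in pairs:
        new = [k for k in (i, j) if k not in dense]
        # Coincident elements (zero distance): include only one not yet included.
        dense.extend(new[:1] if math.dist(U[i], U[j]) == 0 else new)
        if len(dense) >= share * s:
            break
    # Elements satisfying (8) are excluded from the dense set.
    dense = [k for k in dense if p[k] >= p_av]
    sparse = [k for k in range(s) if k not in dense]
    return sparse, p, p_av


U = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05), (0.15, 0.05),
     (2.0, 2.0), (3.0, 0.0)]
sparse, p, p_av = sparse_indices(U, c=0.5)
```

In this toy sample the six clustered points form the dense set, and the two isolated points are returned as the sparse, boundary-like elements around which new elements would be generated.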

Next, consider the generation of new elements to be included in the working sample $\{u_i, x_i, i = \overline{1, N}\}$, $N > s$.

- Check the following condition for the sample of size $s'$: $p_i < p_{av}$, $i = \overline{1, s'}$. If the condition is satisfied, we generate $k_i$ elements in the $c_s$-neighborhood of the point $u_i$, where $k_i = p_{av} - p_i$, $i = \overline{1, s'}$. For example, if there are 7 elements ($p_i = 7$) of the initial sample $\{u_i, x_i, i = \overline{1, s}\}$ in the $c_s$-neighborhood of the element $u_i$ and the average is 6 ($p_{av} = 6$), we do not generate any new element, but if $p_i = 4$, we generate 2 additional elements. These elements $u_k$ are generated according to the following rule:

$$u_{ki}^j = u_i^j + \zeta_k c_s, \quad j = \overline{1, m}, \; i = \overline{1, s'}, \; k = \overline{1, k_i}, \tag{9}$$

where $\zeta_k$ is a random variable distributed uniformly on the interval $[-1; 1]$ and $u_i^j$ is the value of the $j$-th input variable of $u_i$. Thus the elements $u_k$ are generated in the $c_s$-neighborhood of the point $u_i$.

- For the generated elements $u = u_k$ the object output values $x(t)$ are not known, so for these elements we calculate the estimate of the output variable $x_s(u)$ (2) based on the initial sample $\{u_i, x_i, i = \overline{1, s}\}$. Thus, for each value of $u$ obtained by (9), we calculate the estimate $x_s(u)$. If there is a situation of uncertainty, i.e. in the $c_s$-neighborhood there are no elements of the initial sample, [...]. The generated elements and the initial learning sample form a new working sample $\{u_i, x_i, i = \overline{1, N}\}$, $N > s$. So the new working sample consists of the observations $\{u_i, x_i, i = \overline{1, s}\}$ and the generated elements.

- New elements are generated at random in the $c_s$-neighborhoods of different elements, so some of them may be located too close to each other (for example, the distance between them is less than or equal to a sufficiently small value $\varepsilon$, defined experimentally) or may even coincide. Such generated elements are not of interest and should be removed. Note that we remove only artificially generated sample elements. To do this, we calculate the average distance between the points of the main sample $\{u_i, x_i, i = \overline{1, s}\}$:

$$d_{av} = \frac{2}{s(s - 1)} \sum_{i=1}^{s} \sum_{j=1}^{s} d(u_i, u_j), \quad i < j, \tag{10}$$

where $d(u_i, u_j)$ is the distance between the elements $u_i$ and $u_j$, calculated by (7).

- Determine the distance from each artificially generated element to all elements of the new sample. Next, we find the minimum distance $d_{min}$; if $d_{min} < \varepsilon$, where $\varepsilon = \alpha d_{av}$, this element is removed from the sample. The value of the coefficient $\alpha$ is determined experimentally so that the working sample is 1.5-2 times larger than the initial sample $\{u_i, x_i, i = \overline{1, s}\}$. Thus, all the extra elements are deleted from the sample.

- After the working sample is generated, calculate estimate (2), defining a new value of the smoothing parameter $c_s$ according to (3). In (2), for the values $x_i$ we use the object output for the observations of the initial sample $\{u_i, x_i, i = \overline{1, s}\}$ and the model output for the generated elements, since for them the output value of the object cannot be obtained.
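The generation and thinning steps can be sketched as follows. The sketch assumes that the random shift in rule (9) is drawn independently for each coordinate, that $k_i$ is rounded to an integer when $p_{av}$ is fractional, and that a candidate is compared with both the initial sample and the candidates already kept; the function names are hypothetical. Outputs for the kept inputs would then be assigned with estimate (2) on the initial sample.

```python
import math
import random


def generate_candidates(U_sparse, p, p_av, c, rng):
    """Rule (9): around each sparse element u_i create k_i = p_av - p_i new inputs."""
    new = []
    for ui, pi in zip(U_sparse, p):
        k_i = max(0, round(p_av - pi))  # rounding is an assumption of this sketch
        for _ in range(k_i):
            # Independent uniform shift on [-1, 1] per coordinate (assumed), scaled by c_s.
            new.append(tuple(a + rng.uniform(-1.0, 1.0) * c for a in ui))
    return new


def average_distance(U):
    """Formula (10): mean Euclidean distance over all pairs i < j."""
    s = len(U)
    total = sum(math.dist(U[i], U[j]) for i in range(s) for j in range(i + 1, s))
    return 2.0 * total / (s * (s - 1))


def prune(candidates, U, alpha):
    """Drop generated inputs closer than eps = alpha * d_av to any retained element."""
    eps = alpha * average_distance(U)
    kept = []
    for u in candidates:
        d_min = min(math.dist(u, v) for v in list(U) + kept)
        if d_min >= eps:
            kept.append(u)
    return kept


rng = random.Random(0)
candidates = generate_candidates([(0.0, 0.0), (1.0, 1.0)], p=[4, 7], p_av=6, c=0.5, rng=rng)
kept = prune(candidates, U=[(0.0, 0.0), (3.0, 4.0)], alpha=0.01)
```

Only artificially generated elements are ever removed by `prune`; the measured sample is passed in unchanged, in line with the remark in the algorithm.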

Thus, the newly generated sample elements are located in sparsity subdomains (Fig. 3). The sample size increases, on average, by a factor of 1.5-2, depending on the original sample size. When the nonparametric estimate of the regression function is calculated, there are now many observations in the $c_s$-neighborhoods of the elements of the initial sample, which improves the accuracy of modelling and eliminates the uncertainty in calculations.

This method of the working sample generation can improve the accuracy of recovery of nonparametric estimation of the regression function (2) for the boundary points due to the fact that in the cs-neighborhood of these points we generate some new elements.

A peculiarity of the method described above is that the boundary elements of the initial sample do not need to be singled out. However, this must be done in order to estimate how the accuracy of modelling changes at the boundary. It is easily done by statistical modelling methods.

5. Computer Modelling

In computer modelling, supplementing the initial sample $\{u_i, x_i, i = \overline{1, s}\}$ gives the working sample $\{u_i, x_i, i = \overline{1, N}\}$, $N > s$. For simplicity of visualization, consider a two-dimensional vector $u \in \Omega(u) \subset R^2$. Without loss of generality, let the investigated object be described by:

$$x(u) = u_1 + u_2 + \xi, \tag{11}$$

where $\xi$ is a uniformly distributed random noise:

$$\xi = k \theta x, \tag{12}$$

where the coefficient $k$ determines the level of interference and $\theta$ is a random variable distributed uniformly, with zero expectation, on the interval $[-1; 1]$.

It should be noted that we take a uniform distribution law to toughen the simulation conditions, because the normal law and similar laws of distribution are more natural in practice. The coefficient $k$, in fact, determines the percentage of interference. Thus, for example, for 5% noise $k = 0.05$.

Let the values of the input variables be distributed in the range of [0; 3]. Thus we have a sample of observations {ui,xi,i = 1,100}. The initial observations sample is generated in such a way that in the space of input and output variables there are subdomains of sparsity with a small number of items (Fig. 2).
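A sample of this kind can be simulated with a short sketch. It assumes that $x$ in (12) is the noise-free output $u_1 + u_2$, and it creates sparsity by discarding most of the points that fall into a chosen rectangle; the rectangle, the retention probability `keep`, and the function name are hypothetical, since the text does not say how the sparse subdomains were produced.

```python
import random


def simulate_object(s, k, rng, hole=((1.0, 2.0), (1.0, 2.0)), keep=0.1):
    """Draw s observations of object (11) with inputs in [0; 3] x [0; 3].

    Points falling into the rectangle `hole` are kept only with probability
    `keep`, which leaves a sparse subdomain (an assumption of this sketch).
    """
    U, X = [], []
    while len(U) < s:
        u = (rng.uniform(0.0, 3.0), rng.uniform(0.0, 3.0))
        inside = all(lo <= a <= hi for a, (lo, hi) in zip(u, hole))
        if inside and rng.random() > keep:
            continue  # thin out the chosen subdomain
        clean = u[0] + u[1]
        noise = k * rng.uniform(-1.0, 1.0) * clean  # (12), with x taken as the clean output
        U.append(u)
        X.append(clean + noise)
    return U, X


rng = random.Random(1)
U, X = simulate_object(s=100, k=0.05, rng=rng)
```

Fixing the seed of the generator makes the "initial sample" reproducible, so that models built before and after the sample is supplemented are compared on the same data.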

We construct the nonparametric estimate of the regression function $x_s(u)$ (2) on the basis of the initial sample of observations. Note again that dependence (11) is treated as unknown; only the sample $\{u_i, x_i, i = \overline{1, 100}\}$ is given. Below we also present results for a non-linear relationship and for an input vector $u$ of dimension 10. But first, let us consider the two-dimensional case in more detail.

We use the nonparametric model (2). If the forecast at $u = u'$ based on the original sample of observations $\{u_i, x_i, i = \overline{1, 100}\}$ is uncertain, i.e. the $c_s$-neighborhood of the point $u = u'$ contains no sample elements, the forecast value $x_s(u')$ [...].

As a result of the method described above we obtain a new sample $\{u_i, x_i, i = \overline{1, 281}\}$, which includes the elements of the original sample and the generated ones and is now called the new learning sample. We then conduct an experiment with this sample in the sliding exam mode. In addition, from (11) we generate a new sample $\{u_i, x_i, i = \overline{1, 100}\}$, uniformly distributed in the space of input and output observations, and use it as the examining sample.

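The relative approximation error has the following form:

$$W = \sqrt{\frac{\frac{1}{s} \sum_{i=1}^{s} (x_i - x_{si})^2}{\frac{1}{s - 1} \sum_{i=1}^{s} (x_i - m_x)^2}}, \tag{13}$$

where $m_x$ is the estimate of the expectation of the output variable $x$, $x_i$ is the result of measuring $x$ at $u = u_i$, and $x_{si}$ is the model output at $u = u_i$.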
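A relative error of this kind can be computed with a minimal sketch. It assumes that the mean squared residual is divided by the unbiased sample variance of the measured output (normalizers $1/s$ and $1/(s-1)$), and that points without a forecast have already been excluded and counted separately; the function name is hypothetical.

```python
import math


def relative_error(x_true, x_model):
    """Square root of mean squared residual over sample variance (assumed normalizers)."""
    s = len(x_true)
    m_x = sum(x_true) / s  # estimate of the expectation of x
    mse = sum((a - b) ** 2 for a, b in zip(x_true, x_model)) / s
    var = sum((a - m_x) ** 2 for a in x_true) / (s - 1)
    return math.sqrt(mse / var)


measured = [1.0, 2.0, 3.0]
w_perfect = relative_error(measured, [1.0, 2.0, 3.0])   # model reproduces the data
w_constant = relative_error(measured, [2.0, 2.0, 2.0])  # model always returns the mean
```

A value near zero means that the model follows the measurements closely, while a value near one means that it does little better than predicting the mean output.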

In Table 1 the following notation is used: "before" is the relative error for the initial sample, "after" is the relative error for the working sample, which includes the elements of the original sample and generated with the proposed method, A is the number of elements of examining sample for which it is impossible (because of an uncertainty [0/0]) to receive the forecast based on the initial sample, B is the number of such elements based on the working sample.

Table 1

Results of modelling of object (11)

Sample                                Error "before"   Error "after"   A    B
Examining sample                      0.363            0.141           28   0
Initial sample                        0.136            0.096           4    0
Border points of the initial sample   0.135            0.084           3    0

As can be seen from Table 1, the use of the working sample reduces the relative error by a factor of about 1.4 to 2.6, depending on the sample considered. Furthermore, the working sample makes it possible to obtain a forecast for all points of the examining sample.

Since nonparametric models do not require parameterization, they are robust to the type of nonlinearity. Consider the results of modelling a nonlinear object. Let the object be described by the following equation:

$$x(u) = u_1^2 - 2 \sin u_2 + \xi, \tag{14}$$

where $\xi$ is the uniformly distributed noise (12) and the input variables $(u_1, u_2) \in [0; 3] \times [0; 3]$. We use (14) to generate the initial learning sample $\{u_i, x_i, i = \overline{1, s}\}$.

The sample also contains sparsity. The size of the initial sample is 200 elements. Using the above algorithm, we then generate new elements; the size of the working sample is 453 elements. We carry out a series of experiments similar to those for the linear object. The results are shown in Table 2.

Table 2

Results of modelling of the object (14)

Sample                                Error "before"   Error "after"   A    B
Examining sample                      0.838            0.237           4    0
Initial sample                        0.212            0.113           0    0
Border points of the initial sample   0.347            0.155           0    0

As can be seen from Table 2, with the original sample we cannot obtain the estimate for 4 elements of the examining sample. With the working sample generated by the proposed method there is no uncertainty, and the estimate is obtained for all sample elements. In addition, with the working sample the relative error of the nonparametric estimate $x_s(u)$ is about 1.9 to 3.5 times smaller.

Consider the results of the same experiments for a high-dimensional vector $u$. Assume that the investigated object has the form:

$$x(u) = 0.5 u_1 - \sin u_2 + 0.3 u_3 + u_4 - 0.3 u_5 + u_6 + 2 u_7 + 2 \cos u_8 + u_9 + u_{10} + \xi. \tag{15}$$

The initial and examining samples each contain 300 elements. As in the previous experiments, the sample contains subdomains of sparsity with a lack of observations. The results of similar experiments are shown in Table 3. As the results show, the use of the new working sample increases the accuracy of identification.

Table 3

Results of modelling of the object (15)

Sample             Error "before"   Error "after"   A    B
Examining sample   0.812            0.612           51   0
Initial sample     0.427            0.277           1    0

6. A Practical Example

Consider the results of applying the proposed method to the simulation of the oxygen-converter steel smelting process. The process is described by controlled and uncontrolled variables. The controlled input variables are the following:

- material consumption, t: raw iron ($u_1$), scrap ($u_2$), lime ($u_3$), broken electrodes ($u_4$), flux ($u_5$), fluxed agglomerate ($u_6$), coal ($u_7$);

- oxygen blowdown, m³ ($u_8$);

- heating oxygen, m³ ($u_9$);

uncontrolled variables:

- the chemical composition of raw iron, %: silicon Si ($\mu_1$), manganese Mn ($\mu_2$), sulfur S ($\mu_3$), phosphorus P ($\mu_4$);

- temperature of iron, °C ($\mu_5$);

- converter load, t ($\mu_6$);

and output variables, which are responsible for the quality of the finished steel:

- the metal turndown temperature, °C ($x_1$);

- the chemical composition of the steel, %: aluminum Al ($x_2$), carbon C ($x_3$), manganese Mn ($x_4$), sulfur S ($x_5$), phosphorus P ($x_6$).

Thus, there are 15 input and 6 output variables which describe the investigated object. We have 176 oxygen-converter steel heats. It is necessary to get a model of the process. Due to the fact that the a priori information is not sufficient, it is proposed to use a nonparametric estimation (2).

The simulation, as in the previous case, has two stages. At the first stage we use the initial sample of observations, obtained by measuring the input and output variables of the process, as a learning sample. At the second stage, using the proposed method we generate new elements. The simulation results are presented in Table 4.

Table 4

Results of modelling of the oxygen-converter steel heats

Output variable                          Error "before"   Error "after"   A    B
The metal turndown temperature ($x_1$)   0.99             0.51            19   0
Aluminum, Al ($x_2$)                     1                0.63            30   0
Carbon, C ($x_3$)                        1                0.59            24   0
Manganese, Mn ($x_4$)                    0.95             0.64            18   0
Sulfur, S ($x_5$)                        0.85             0.35            15   0
Phosphorus, P ($x_6$)                    1                0.49            18   0

We use the proposed methodology to supplement the initial sample of observations. Using the new learning sample improves the modelling accuracy. It should be noted that an estimate is now obtained for all elements of the initial sample. For example, for the variable $x_3$ (carbon concentration) the model provides a forecast for the whole sample.

Conclusion

The main purpose of this article is to improve the accuracy of nonparametric identification by supplementing the initial sample, obtained on the real object, with new elements. It should be noted once again that the identification of a noninertial process with delay is carried out under nonparametric uncertainty, when a parametric model cannot be obtained because of the lack of a priori information. In many practical problems the distribution of measurements of the input and output variables of the object is substantially non-uniform, and subdomains of sparsity can occur. Nonparametric identification algorithms based on the Nadaraya-Watson estimate then lead to rather rough models if the initial sample of observations is small. As noted earlier, the newly generated elements of the working sample do not replace observations of the object, but from a computational point of view they significantly improve the accuracy of nonparametric identification algorithms. We should also keep in mind that the new elements of the working sample are generated on the basis of the available initial observations, so they are indirectly related to the object under investigation. Finally, we presented the results of modelling the oxygen-converter steel smelting process, where the method of working sample generation significantly increased the accuracy of the model.

References

1. Medvedev A.V. Osnovy teorii adaptivnyh sistem [Fundamentals of the Theory of Adaptive Systems]. Krasnoyarsk, Sibirskiy gosudarstvennyy aerokosmicheskiy universitet, 2015.

2. Zagoruyko N.G. Prikladnye metody analiza dannykh i znaniy [Applied Methods of Data and Knowledge Analysis]. Novosibirsk, Sobolev Institute of Mathematics, 1999.

3. Tsypkin Ja.Z. Osnovy informatsionnoy teorii indentifikatsii [The Foundation of Information Identification Theory]. Moscow, Nauka, 1984.

4. Eykhoff P. System Identification Parameter and State Estimation. London, N.-Y., Sydney, Toronto, Wiley, 1975.

5. Vasilev V.A., Dobrovidov A.V., Koshkin G.M. Neparametricheskoe otsenivanie funktsionalov ot raspredeleniy statsionarnykh posledovatel'nostey [Nonparametric Estimation of Functionals of Distributions of Stationary Sequences]. Moscow, Nauka, 2004.

6. Härdle W. Applied Nonparametric Regression. Cambridge, Cambridge University Press, 1990. DOI: 10.1017/CCOL0521382483

7. Katkovnik V.Ya. Neparametricheskaya identifikatsiya i sglazhivanie dannykh: metod lokal'noy approksimatsii [Non-Parametric Identification and Data Smoothing: Local Approximation Method]. Moscow, Nauka, 1985.

8. Nadaraya E.A. Neparametricheskie otsenki plotnosti veroyatnosti i krivoy regressii [Non-Parametric Estimation of the Probability Density and the Regression Curve]. Tbilisi, Tbilisi University, 1983.

9. Efron B. Bootstrap Methods: Another Look at the Jackknife. Annals of Statistics, 1979, vol. 7, no. 1, pp. 1-26. DOI: 10.1214/aos/1176344552

10. Garcia-Soidan P., Menezes R., Rubinos O. Bootstrap Approaches for Spatial Data. Stochastic Environmental Research and Risk Assessment, 2014, vol. 28, pp. 1207-1219. DOI: 10.1007/s00477-013-0808-9

11. Loh J.M., Stein M.L. Spatial Bootstrap with Increasing Observations in a Fixed Domain. Statistica Sinica, 2008, vol. 18, no. 2, pp. 667-688.

12. Medvedev A.V. Neparametricheskie sistemy adaptacii [Nonparametric Adaptation Systems]. Novosibirsk, Nauka, 1983.

Received July 27, 2016

UDC 519.87 DOI: 10.14529/mmp170210

ON NONPARAMETRIC MODELLING OF MULTIDIMENSIONAL NONINERTIAL SYSTEMS WITH DELAY

A.V. Medvedev, E.A. Chzhan

Siberian Federal University, Krasnoyarsk

We consider the problem of identifying noninertial objects with delay under nonparametric uncertainty, i.e. when a priori information about the parametric structure of the object under study is unavailable. In many applications, some output variables are measured at long intervals, which may substantially exceed the time constant of the object. In this case the object has to be treated as noninertial with delay. In essence, two main approaches are used to solve identification problems. One is identification in the «narrow» sense, or parametric identification. Alternatively, when a priori information is insufficient to choose a parametric structure, it is natural to apply local approximation methods, which use only qualitative properties of the object as a priori information. If the initial data on the object are sufficiently representative, nonparametric identification gives a satisfactory result; if, however, there are sparse regions in the space of input and output variables, the quality of nonparametric models drops substantially. This article is devoted to a technique for filling in or generating training samples on the basis of the currently available information. This makes it possible to substantially improve the accuracy of nonparametric models when identifying noninertial systems with delay. The computational experiments confirmed that the quality of nonparametric models of noninertial systems can be substantially improved by «repairing» the initial sample. At the same time, the accuracy of the model near the boundaries of the domains of the input and output variables of the process increases significantly.

Keywords: nonparametric identification; data analysis; sample; computational modelling.

Alexander Vasilievich Medvedev, Doctor of Technical Sciences, Professor, Department of Information Systems, Siberian Federal University (Krasnoyarsk, Russian Federation).

Ekaterina Anatolyevna Chzhan, Assistant Lecturer, Department of Informatics, Siberian Federal University (Krasnoyarsk, Russian Federation), [email protected].

Received by the editorial office July 21, 2016.
