
Piecewise Continuous Segmentation of Multidimensional Experimental Signals

Alexander G. Dmitriev
Cherepovets Higher Military Engineering School of Radio Electronics
Cherepovets, Russia
dag334a@fxmail.ru

Abstract

An algorithm is proposed for piecewise continuous approximation of structural experimental multidimensional signals when the number of intervals into which the signals are split into "similar" fragments is not known in advance. The multidimensional piecewise continuous approximating function is constructed "left-to-right", which allows the dynamic programming method to be used to determine the boundaries of the partition intervals. The approximation quality criterion takes into account the number of data points on the partition intervals and the "complexity" of the local signal models used.

Introduction

In various applications, the problem arises of analyzing so-called structural experimental signals, regarded as a time-sequential combination of simpler signals (functions) that have constant properties on the corresponding time intervals [Mot79, Kos04]. Processing of such signals is in most cases reduced to a two-stage procedure: the isolation of "similar" fragments (the segmentation stage) and the subsequent construction of a description of the presented signals as a whole. Existing signal segmentation methods prove insufficiently effective under conditions of high dimensionality, limited experimental observations and an unknown number of "similar" sections (partition intervals). In addition, the approximating function usually has a discontinuity at the boundaries of "similar" sections [Dmi10, Dor84].

The aim of the work is to develop an algorithm for piecewise continuous approximation of multidimensional signals with a previously unknown number of intervals for splitting the signals into "similar" fragments, which under certain conditions delivers the optimal value of the selected approximation quality criterion.

Copyright © by the paper's authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). In: A. Khomonenko, B. Sokolov, K. Ivanova (eds.): Selected Papers of the Models and Methods of Information Systems Research Workshop, St. Petersburg, Russia, 4-5 Dec. 2019, published at http://ceur-ws.org

1 Problem Statement

Let a set of signals (a multidimensional signal) $y(t) = (y^{(1)}(t), \ldots, y^{(s)}(t))$, collectively characterizing the object under study, be presented for analysis. The values $y^{(i)}(t)$, $i = 1, \ldots, s$, are specified at discrete points in time $t = t_1, \ldots, t_N$. The criterion of approximation quality $J$ on the sample of experimental values is chosen as:

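$$J = \sum_{i=1}^{s} \mu_i \sum_{j=1}^{r} \frac{n_j}{n_j - m} \sum_{t \in (T_{j-1}, T_j]} \left( y^{(i)}(t) - F_j^{(i)}(t, a_j^{(i)}) \right)^2, \qquad (1)$$

where $\mu_i$ is the a priori "weight" of the signal $y^{(i)}$; $n_j$ is the number of discrete samples on the interval $(T_{j-1}, T_j]$; $F_j^{(i)}(t, a_j^{(i)}) = \sum_{k=1}^{m} a_{jk}^{(i)} \varphi_k(t)$ is a polynomial over a given set of basis functions $\{\varphi_k(t),\ k = 1, \ldots, m\}$; and $a_j^{(i)} = (a_{j1}^{(i)}, \ldots, a_{jm}^{(i)})$ is the vector of estimated parameters on the $j$-th interval.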
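As an illustration, criterion (1) can be evaluated for one given partition. The following is a minimal Python sketch, not the author's implementation: it assumes linear local models ($m = 2$), unconstrained least-squares fits on each interval, and a 0-based half-open index convention for the intervals. The names `linear_sse` and `criterion` are hypothetical.

```python
def linear_sse(ts, ys):
    """Residual sum of squares of the least-squares line through (ts, ys)."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    stt = sum((t - mean_t) ** 2 for t in ts)
    sty = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sty / stt
    return sum((y - mean_y - slope * (t - mean_t)) ** 2 for t, y in zip(ts, ys))


def criterion(signals, weights, bounds, m=2):
    """Criterion (1) for one partition (no continuity constraint here).

    signals: list of s equally long sample lists.
    weights: a priori weights mu_i (assumed to sum to 1).
    bounds:  [0, b_1, ..., N]; interval j holds samples bounds[j-1] .. bounds[j]-1
             (an indexing convention assumed for this sketch).
    """
    total = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        n_j = hi - lo
        if n_j <= m:
            raise ValueError("each interval needs more than m samples")
        ts = list(range(lo, hi))
        residual = sum(w * linear_sse(ts, y[lo:hi]) for w, y in zip(weights, signals))
        total += n_j / (n_j - m) * residual  # weight n_j / (n_j - m)
    return total


# Two noise-free signals that change behavior together after the fifth sample.
y1 = [0, 1, 2, 3, 4, 3, 2, 1, 0, -1]
y2 = [5, 5, 5, 5, 5, 7, 9, 11, 13, 15]
j_true = criterion([y1, y2], [0.5, 0.5], [0, 5, 10])
j_wrong = criterion([y1, y2], [0.5, 0.5], [0, 3, 10])
print(j_true, j_wrong)
```

For these noise-free signals the criterion vanishes on the true partition and is positive on the shifted one.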

Criterion (1) uses the weights $\mu_i$ $(i = 1, \ldots, s)$ and $n_j / (n_j - m)$ $(j = 1, \ldots, r)$. The weights $\mu_i$ are introduced because in practice the signals often differ in practical significance. Specific values of $\mu_i$ are chosen on substantive grounds; usually they are normalized so that $\sum_{i=1}^{s} \mu_i = 1$. The weights $n_j / (n_j - m)$ are the usual normalizing coefficients that take into account the dimension of the model.

It is required to find a partition $T = (T_0, T_1, \ldots, T_r)$, $T_0 < T_1 < \ldots < T_r$ (a synchronous change of the signals $y^{(i)}$ is assumed), of the given interval $[t_1, t_N]$, $T_0 = t_1$, $T_r = t_N$, into $r$ intervals $(T_{j-1}, T_j]$, $j = 1, \ldots, r$ ($r$ is, in general, unknown), and to determine on each of these intervals values of the parameter vectors $a_j^{(i)}$, $i = 1, \ldots, s$, such that the functional (1) takes its minimum value subject to the continuity constraints on the approximating functions at the interior boundaries:

$$F_j^{(i)}(T_j, a_j^{(i)}) = F_{j+1}^{(i)}(T_j, a_{j+1}^{(i)}), \quad j = 1, \ldots, r-1; \quad i = 1, \ldots, s. \qquad (2)$$

2 Algorithm

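Denote by $\varepsilon^{(i)}(T_{j-1}, T_j)$ the approximation error of the $i$-th signal on the interval $(T_{j-1}, T_j]$, calculated by the method of least squares. Then criterion (1) takes the form:

$$J = \sum_{i=1}^{s} \sum_{j=1}^{r} \mu_i \frac{n_j}{n_j - m} \varepsilon^{(i)}(T_{j-1}, T_j). \qquad (3)$$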

Let us make the following assumption. The boundaries $T_j$ $(j = 1, \ldots, r-1)$ of the partition $T = (T_0, T_1, \ldots, T_r)$ and the corresponding local approximating functions on each interval of the partition are found from "left to right". Let the position of the boundary $T_1$ be determined; we find the local approximating functions $F_1^{(i)}(t, a_1^{(i)})$ delivering the minimum of $\varepsilon^{(i)}(T_0, T_1)$, $i = 1, \ldots, s$. Next, we fix the position of the boundary $T_1$ and find the boundary $T_2$ and the corresponding local approximating functions $F_2^{(i)}(t, a_2^{(i)})$ delivering the minimum of $\varepsilon^{(i)}(T_1, T_2)$ $(i = 1, \ldots, s)$ subject to the continuity constraint at the boundary $T_1$:

$$F_1^{(i)}(T_1, a_1^{(i)}) = F_2^{(i)}(T_1, a_2^{(i)}), \quad i = 1, \ldots, s.$$

We then fix $T_2$, and so on.

Under this assumption, criterion (3) with constraints (2) has the following property. Let $T_j$ be some fixed position of the right boundary of the $j$-th interval. Then the boundaries $T_1^*, \ldots, T_{j-1}^*$ obtained by minimizing (3) only over $T_1, \ldots, T_{j-1}$ do not depend on the values of the boundaries $T_{j+1}, \ldots, T_{r-1}$. Indeed, the functional (3) can be represented as the sum of two nonnegative quantities, $J = J_a + J_b$, where

$$J_a = J_a(T_1, \ldots, T_{j-1} \mid T_j) = \sum_{k=1}^{j} \frac{n_k}{n_k - m} \sum_{i=1}^{s} \mu_i \varepsilon^{(i)}(T_{k-1}, T_k),$$

$$J_b = J_b(T_{j+1}, \ldots, T_{r-1} \mid T_j) = \sum_{k=j+1}^{r} \frac{n_k}{n_k - m} \sum_{i=1}^{s} \mu_i \varepsilon^{(i)}(T_{k-1}, T_k).$$

But then, obviously, $\arg\min_{T_1, \ldots, T_{j-1}} J = \arg\min_{T_1, \ldots, T_{j-1}} J_a$.

It follows from this property that if $T_j^*$ is the optimal position of the $j$-th boundary, then the boundaries $T_1^*, \ldots, T_{j-1}^*$ obtained by minimizing $J$ over $T_1, \ldots, T_{j-1}$ are also optimal. This property of criterion (3) makes it possible to use the dynamic programming procedure [Bel69] to determine the optimal boundaries of the intervals.

Let the number of intervals be equal to $r_0$.

The following recursive algorithm finds the partition and the local approximating functions that deliver the optimal value of the functional (1) (under the above assumption).

First, the functions $J_j(T_j)$ $(j = 1, 2, \ldots, r_0 - 1)$ are tabulated sequentially, where

$$J_1(T_1) = \frac{n_1}{n_1 - m} \sum_{i=1}^{s} \mu_i \varepsilon^{(i)}(T_0, T_1), \quad T_1 = t_p, \ldots, t_{N-(r_0-1)p};$$

$$J_j(T_j) = \min_{T_{j-1} = t_{(j-1)p}, \ldots, t_{\tau_j - p}} \left\{ J_{j-1}(T_{j-1}) + \frac{n_j}{n_j - m} \sum_{i=1}^{s} \mu_i \varepsilon^{(i)}(T_{j-1}, T_j) \right\},$$

$$T_j = t_{jp}, \ldots, t_{N-(r_0-j)p}, \quad j = 2, \ldots, r_0 - 1;$$

$\tau_j$ is the sample number corresponding to the boundary $T_j$, and $p$ is the specified minimum allowed number of samples on a partition interval. The approximating functions on adjacent intervals are constructed subject to the continuity conditions (2).

At the same time, the values

$$M_{j-1}(T_j), \quad T_j = t_{jp}, \ldots, t_{N-(r_0-j)p}, \quad j = 2, \ldots, r_0 - 1,$$

that is, the optimal positions of the boundary $T_{j-1}$ for each $T_j$, are stored. Next, the optimal boundaries of the intervals are determined:

$$T_{r_0-1}^* = \arg\min_{T_{r_0-1} = t_{(r_0-1)p}, \ldots, t_{N-p}} \left\{ J_{r_0-1}(T_{r_0-1}) + \frac{n_{r_0}}{n_{r_0} - m} \sum_{i=1}^{s} \mu_i \varepsilon^{(i)}(T_{r_0-1}, t_N) \right\}, \qquad (4)$$

$$T_{r_0-2}^* = M_{r_0-2}(T_{r_0-1}^*), \quad \ldots, \quad T_1^* = M_1(T_2^*).$$
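The recursion for a fixed number of intervals can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the author's program: local models are linear ($m = 2$), sample indices $0, \ldots, N-1$ serve as time points, the first sample is assigned to the first interval, and continuity is enforced by forcing each new line through the fitted end value stored with the best predecessor. It tabulates $J_j$ for all $j$ up to $r$ and reads the result at the last sample, instead of treating the final step separately as in (4). All function names are hypothetical.

```python
def fit_free(ts, ys):
    """Unconstrained least-squares line; returns (sse, fitted value at ts[-1])."""
    n = len(ts)
    mean_t, mean_y = sum(ts) / n, sum(ys) / n
    stt = sum((t - mean_t) ** 2 for t in ts)
    slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys)) / stt
    sse = sum((y - mean_y - slope * (t - mean_t)) ** 2 for t, y in zip(ts, ys))
    return sse, mean_y + slope * (ts[-1] - mean_t)


def fit_through(ts, ys, t0, v0):
    """Least-squares line forced through (t0, v0); returns (sse, value at ts[-1])."""
    slope = (sum((t - t0) * (y - v0) for t, y in zip(ts, ys))
             / sum((t - t0) ** 2 for t in ts))
    sse = sum((y - v0 - slope * (t - t0)) ** 2 for t, y in zip(ts, ys))
    return sse, v0 + slope * (ts[-1] - t0)


def segment_continuous(signals, weights, r, p, m=2):
    """Left-to-right DP for r intervals with at least p > m samples each.

    Returns (J, bounds), where bounds = [T_0, ..., T_r] are sample indices.
    """
    last = len(signals[0]) - 1
    if p <= m or last + 1 < r * p:
        raise ValueError("need p > m and at least r * p samples")
    # state[j][e] = (cost J_j(e), previous boundary, fitted end values per signal)
    state = [dict() for _ in range(r + 1)]
    for e in range(p - 1, last + 1):           # first interval: samples 0..e
        ts = list(range(e + 1))
        fits = [fit_free(ts, y[: e + 1]) for y in signals]
        n = e + 1
        cost = n / (n - m) * sum(w * f[0] for w, f in zip(weights, fits))
        state[1][e] = (cost, 0, [f[1] for f in fits])
    for j in range(2, r + 1):
        for e in range(j * p - 1, last + 1):   # interval j: samples b+1..e
            best = None
            for b, (prev_cost, _, ends) in state[j - 1].items():
                n = e - b
                if n < p:
                    continue
                ts = list(range(b + 1, e + 1))
                fits = [fit_through(ts, y[b + 1 : e + 1], b, v)
                        for y, v in zip(signals, ends)]
                cost = prev_cost + n / (n - m) * sum(
                    w * f[0] for w, f in zip(weights, fits))
                if best is None or cost < best[0]:
                    best = (cost, b, [f[1] for f in fits])
            if best is not None:
                state[j][e] = best
    bounds = [last]                            # backtrack through stored boundaries
    for j in range(r, 0, -1):
        bounds.append(state[j][bounds[-1]][1])
    return state[r][last][0], bounds[::-1]


# Two noise-free continuous signals with a common kink at index 4.
y1 = [0, 1, 2, 3, 4, 3, 2, 1, 0]
y2 = [1, 1, 1, 1, 1, 3, 5, 7, 9]
cost, bounds = segment_continuous([y1, y2], [0.5, 0.5], r=2, p=3)
print(cost, bounds)
```

On this noise-free example the sketch recovers the kink at index 4 with zero criterion value. The stored predecessor plays the role of the table $M_{j-1}(T_j)$.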

The extremal character of the dependence of $J$ on $r$ is used to find the partition $T^H = (T_0, T_1^H, \ldots, T_{r^H-1}^H, T_{r^H})$ when the number of intervals is unknown. Indeed, as $r$ increases, on the one hand the weights $n_j / (n_j - m)$ increase because the average $n_j$ decreases, which, other things being equal, leads to an increase in criterion (1). On the other hand, as the number of intervals increases, the usual quadratic residual decreases, which reduces (1). The simultaneous action of these two factors means that the functional (1) reaches its minimum at some intermediate (not boundary) value $r^H$.

To determine $r^H$, the approach proposed in [Dmi10] can be used. First, the minimum values of the functional (1), $J_j(t_N)$, $j = 2, \ldots, r_{\max}$, are calculated. At the same time, the values $M_{j-1}(T_j)$ of the optimal positions of the boundary $T_{j-1}$ for each $T_j$ are stored. Next, the number of intervals is selected and the optimal boundaries are determined. As $r^H$, the smallest number of intervals $r$ at which $J_r(t_N)$ takes its minimum value is chosen. The optimal boundaries $T^H$ are determined from expressions analogous to (4):

$$T_{r^H-1}^H = M_{r^H-1}(t_N), \quad T_{r^H-2}^H = M_{r^H-2}(T_{r^H-1}^H), \quad \ldots, \quad T_1^H = M_1(T_2^H).$$

The value $r_{\max}$ is chosen for substantive or statistical reasons. In particular, the value $[N / p]$ can be used as $r_{\max}$, where $[x]$ is the integer part of $x$.
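The selection rule itself is simple to state in code. The sketch below assumes that the minimal criterion values $J_r(t_N)$ have already been computed for each candidate $r$. The function names and the numerical values are hypothetical and serve only as an illustration.

```python
def max_intervals(n_samples, p):
    """Upper bound r_max = [N / p], the integer part of N / p."""
    return n_samples // p


def select_num_intervals(j_by_r):
    """Return the smallest r at which J_r(t_N) attains its minimum.

    j_by_r maps each candidate r to the minimal criterion value J_r(t_N).
    """
    best = min(j_by_r.values())
    return min(r for r, value in j_by_r.items() if value == best)


# Hypothetical criterion values, for illustration only.
j_by_r = {2: 1.9, 3: 0.7, 4: 0.4, 5: 0.4, 6: 0.6}
print(max_intervals(100, 7))         # 14
print(select_num_intervals(j_by_r))  # 4
```

Taking the smallest minimizer resolves ties in favor of the simpler partition, as the rule in the text prescribes.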

Modeling

To check the efficiency of the algorithm, it was tested on specially generated multidimensional signals. The developed program makes it possible to obtain a multidimensional signal with specified properties. The procedure is as follows. The vector function

$$y(t) = (y^{(1)}(t), \ldots, y^{(s)}(t)) \qquad (5)$$

is considered on the interval $[1, N]$, where each component $y^{(i)}(t)$, $i = 1, \ldots, s$, is a superposition of the piecewise linear continuous function

$$\sum_{j=1}^{r^*} e_j(t) \left[ y_{j-1}^{(i)} + \left( y_j^{(i)} - y_{j-1}^{(i)} \right) \frac{t - T_{j-1}}{T_j - T_{j-1}} \right]$$

and independent Gaussian interference with zero mean and variance $b^2$. Here $T_0 = 1$, $T_{r^*} = N$; $T_j$, $j = 1, \ldots, r^* - 1$, are the nodal points of the piecewise linear vector function; $y_j$, $j = 0, \ldots, r^*$, are the values of this function at the points $T_j$; and $e_j(t)$ is the characteristic function, equal to one if $t \in (T_{j-1}, T_j]$ and to zero if $t \notin (T_{j-1}, T_j]$.

The nodal points $T_j$, the values $y_j^{(i)}$ of the piecewise linear functions at these points, and the number of intervals $r^*$ are determined randomly, using a random number generator, by the following procedure:

1. $j = 0$; $T_0 = 1$; $y_0^{(i)} = \xi_{0,i}$, $i = 1, \ldots, s$.

2. If $N - T_j < \lambda$, then move to 4. Otherwise move to 3.

3. $j = j + 1$; $T_j = T_{j-1} + [\beta_j c + p]$ ($[x]$ is the integer part of $x$); $y_j^{(i)} = \xi_{j,i} \cdot d$, $i = 1, \ldots, s$; then move to 2.

4. $T_j = N$; $r^* = j$.

Here $\xi_{j,i}$ and $\beta_j$ are random, independent values uniformly distributed on the intervals $[-1; 1]$ and $[0; 1]$, respectively; $p$, $d$, $c$, $\lambda$ are pre-selected parameters of the algorithm.
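The generation procedure above can be sketched in Python with the standard `random` module. This is a hedged illustration, not the author's program: the stopping rule $N - T_j < \lambda$, the scaling of node values by $d$ only from step 3 onward, and the function names `generate_nodes` and `sample_signal` are assumptions of the sketch. It also assumes $0 < \lambda \le N - 1$ and an integer $p \ge 1$.

```python
import random


def generate_nodes(n, s, p, c, d, lam, rng):
    """Random nodes T_0..T_r* on [1, n] and node values for s signals.

    Assumes 0 < lam <= n - 1 and integer p >= 1 (assumptions of this sketch).
    """
    nodes = [1]                                             # step 1: T_0 = 1
    values = [[rng.uniform(-1.0, 1.0) for _ in range(s)]]
    while n - nodes[-1] >= lam:                             # step 2 (assumed stopping rule)
        step = int(rng.uniform(0.0, 1.0) * c + p)           # step 3: [beta_j * c + p]
        nodes.append(nodes[-1] + step)
        values.append([rng.uniform(-1.0, 1.0) * d for _ in range(s)])
    nodes[-1] = n                                           # step 4: last node moved to N
    return nodes, values


def sample_signal(nodes, values, i, b, rng):
    """Samples y(t), t = 1..N, of signal i: linear interpolation between
    nodes plus Gaussian noise with zero mean and variance b**2."""
    out, j = [], 1
    for t in range(1, nodes[-1] + 1):
        while t > nodes[j]:                                 # advance to the interval holding t
            j += 1
        t0, t1 = nodes[j - 1], nodes[j]
        y0, y1 = values[j - 1][i], values[j][i]
        noise = rng.gauss(0.0, b) if b > 0 else 0.0
        out.append(y0 + (y1 - y0) * (t - t0) / (t1 - t0) + noise)
    return out


rng = random.Random(0)
nodes, values = generate_nodes(n=200, s=3, p=10, c=40, d=1.0, lam=20, rng=rng)
signal = [sample_signal(nodes, values, i, b=0.1, rng=rng) for i in range(3)]
print(len(nodes) - 1, "intervals;", len(signal), "signals of length", len(signal[0]))
```

All components share the same nodes, which reproduces the synchronous change of the signals assumed in the problem statement.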

Next, the values $y^{(i)}(t)$ $(t = 1, 2, \ldots, N)$, $(i = 1, \ldots, s)$ of the vector function (5) are calculated. Linear functions were used as local approximating functions:

$$F_j^{(i)}(t, a_j^{(i)}) = a_{j1}^{(i)} + a_{j2}^{(i)} t, \quad i = 1, \ldots, s, \quad j = 1, \ldots, r.$$

In the studies, the experimental material consisted of three groups of three-component multidimensional signals obtained using the procedure described above. The first group consisted of noise-free signals; the second and third groups had, respectively, an average (low) noise level (variance $b^2 \in [0.01, 0.1]$) and an increased noise level ($b^2 \in [0.1, 0.3]$). As expected, for the noise-free signals the algorithm found the actual number of intervals, the partition, and the approximating functions without error.

Typical dependences of the criterion $J$ on $r$ for the second signal group are shown in Figure 1 (the ↑ sign in the figures marks the actual number of intervals).

Figure 1: Average noise level (plot of the criterion $J$, axis scale ×0.1, against the number of intervals $r$)

For signals with an average noise level, the minimum of the functional is, as a rule, attained at the actual number of intervals.

Typical dependences of the criterion $J$ on $r$ for the third signal group are shown in Figure 2.

Figure 2: Increased noise level (plot of the criterion $J$, axis scale ×0.1, against the number of intervals $r$)

For signals with an increased noise level, a discrepancy between the optimal number of intervals found by the algorithm and the actual number occurred more often.

For both groups of signals, this discrepancy was observed when the multidimensional signal contained adjacent intervals on which the difference in the "behavior" of the signal was insignificant; as a rule, such intervals contained a small number of samples.

It can also be noted that for the signals of the second group the decreasing section of the graph of $J$ versus $r$ passes abruptly into the increasing section, whereas for signals with an increased noise level this transition is smooth. The latter can be used to construct optimization procedures that perform a limited search over $r$.

Conclusion

Thus, the proposed "left-to-right" construction of a piecewise continuous approximating function allows the dynamic programming method to be applied to determine the boundaries of the partition intervals. The number of partition intervals is determined using the extremal behavior of the approximation quality criterion.

References

[Mot79] V. V. Mottl, I. B. Muchnik. Hidden Markov models in structural analysis of signals. M.: Fizmatlit, 1999.

[Dmi10] A. G. Dmitriev. The Algorithm of optimal structural approximation of experimental multidimensional signals. Naukoemkie technologii, No 9: 31-35, 2010.

[Kos04] A. A. Kostin, O. V. Krasotkina, M. V. Markov, V. V. Mottl, I. B. Muchnik. Dynamic programming algorithms for analysis of non-stationary signals. Computational Mathematics and Mathematical Physics, V 44, No 1: 62-77, 2004.

[Dor84] A. A. Dorofeyuk, A. G. Dmitriev. Methods for piecewise approximation of multidimensional curves. Automatics and telemechanics, No 12: 101-108, 1984.

[Bel69] R. Bellman, S. Dreyfus. Applied problems of dynamic programming. M: Nauka, 1969.
