Scientific article on the topic "Testing the algorithm of the “Caterpillar”-SSA method for time series recovery"


CC BY
Keywords
TREND ALLOCATION / PERIODICALS FINDING / SILENCING / DECOMPOSITION OF TIME SERIES INTO COMPONENTS

Abstract of the scientific article on construction and architecture; author of the work: Vohmyanin S. V.

The basic algorithm of the “Caterpillar”-SSA method is considered and tested.


Text of the scientific work on the topic "Testing the algorithm of the “Caterpillar”-SSA method for time series recovery"

the applied mathematical models are the most relevant for computing the thermal modes of honeycomb panels of complex design, because these computational experiments provide the most detailed information about the thermal mode under various external conditions.

[Figure 4: axes x, cm (0 to 160) and y, cm; colour scale from 0 to 30]

Fig. 4. Temperature field of a thermal-loaded honeycomb panel internal surface

It should be noted that computational experiments for honeycomb panels of complex construction require a significant volume of input information, such as thermophysical characteristics and geometric parameters of the constructions (sizes, coordinates of the device slots), mass, etc. Collecting this information takes much time. The time consumption increases significantly if a number of computational experiments must be made for different honeycomb panel constructions in order to find the optimal temperature mode. To reduce it, it is reasonable to automate the gathering of input information: the information should be read from the CAD-system database. To realize this approach, we are currently developing a software complex for the computation of honeycomb panel thermal modes on the basis of the presented mathematical model. This complex is integrated with a CAD system. It will significantly simplify the input, output, editing, and visualization of the input and calculated data.

The developed computational model makes it possible to compute the temperature fields of honeycomb panel hems, taking into account detailed information on the honeycomb panel design, the heat generation of the electronics, and the external thermal conditions. To find the optimal thermal modes for the electronics, this model can be used to optimize the honeycomb panel’s construction and to select the most appropriate arrangement of the devices and heat pipes in the panel’s hems.

References

1. Unsteady thermal conditions of spacecrafts of satellite system / M. V. Kraev, O. V. Zagar, V. M. Kraev, K. F Golikovskaja ; SibSAU. Krasnoyarsk, 2004.

2. Alekseev N. G., Zagar O. V., Kasjanov A. O. The system of maintenance of a thermal mode of the device with regulation of temperature in a narrow range // Reshetnev’s readings: Proc. of XI Intern. scientific conf. Krasnoyarsk, 2007. P. 213.

3. Samarskij A. A. Theory of difference scheme. M. : Nauka, 1989.

© Vasilyev E. N., Derevyanko V. V., 2010

S. V. Vohmyanin

Siberian State Aerospace University named after academician M. F. Reshetnev,

Russia, Krasnoyarsk

TESTING THE ALGORITHM OF THE “CATERPILLAR”-SSA METHOD FOR TIME SERIES RECOVERY

The basic algorithm of the “Caterpillar”-SSA method is considered and tested.

Keywords: trend allocation, periodicals finding, silencing, decomposition of time series into components.

One of the significant problems in the analysis of time series is the separation of the trend and periodic components from noise. This research concerns a robust method of time series analysis, “Caterpillar”-SSA, which is currently being developed.

Let us examine how this algorithm works and what exactly is specific to it. The variant of the algorithm described below does not differ essentially from the basic one [1]; it has only been simplified, without any change in the result.

We consider the given time series F:

$$F = (f_0, f_1, \ldots, f_{N-1}), \tag{1}$$

where N is its length. Further we assume that F is a nonzero series.

The algorithm consists of four sequential steps: embedding, singular decomposition, grouping, and diagonal averaging.

The embedding procedure converts the time series F into a sequence of multidimensional vectors, which together form the trajectory matrix.

To analyze the time series we select a parameter L, called “the length of period” (the window length), which lies in the open interval 1 < L < N. Thus K = N − L + 1 embedding vectors are created:

$$X_i = (f_{i-1}, f_i, \ldots, f_{i+L-2})^{\mathrm{T}}, \quad 1 \le i \le K. \tag{2}$$

These vectors form the trajectory matrix of the series F, the columns of which are sliding segments of the series of length L: from the first point to the L-th, from the second to the (L + 1)-th, and so on:

$$X = [X_1 : X_2 : \ldots : X_K] = \begin{pmatrix} f_0 & f_1 & \ldots & f_{K-1} \\ f_1 & f_2 & \ldots & f_K \\ \vdots & \vdots & \ddots & \vdots \\ f_{L-1} & f_L & \ldots & f_{N-1} \end{pmatrix}. \tag{3}$$

There is a one-to-one correspondence between L × K matrices of the form (3) and series (1) of length N = L + K − 1 [1].

The next step is the singular decomposition of the trajectory matrix (3) into a sum of elementary matrices.

Let $S = X X^{\mathrm{T}}$. We denote the eigenvalues of matrix S, taken in nondecreasing order, by $\lambda_1, \lambda_2, \ldots, \lambda_L$, and the orthonormal system of eigenvectors of S corresponding to the ordered eigenvalues by $U_1, U_2, \ldots, U_L$. Then the singular decomposition of the trajectory matrix X can be written as

$$X = \sum_{i=1}^{L} V_i, \tag{4}$$

where $V_i = U_i U_i^{\mathrm{T}} X$, $i = 1, \ldots, L$. Since each of the matrices $V_i$ has rank 1, they are called elementary matrices.

The initial time series is assumed to be a sum of several series. Under certain conditions, the form of the eigenvalues and eigenvectors makes it possible to determine what kind of summands these are and which combination of elementary matrices corresponds to each of them.

The next stage is grouping: on the basis of decomposition (4), the set of indexes {1, 2, ..., L} is divided into m disjoint subsets $I_1, I_2, \ldots, I_m$. Decomposition (4) can then be written as

$$X = \sum_{s=1}^{m} Y^{(s)}, \tag{5}$$

where $Y^{(s)} = \sum_{k \in I_s} V_k$ are the resultant matrices for each subset $I_s$, $s = 1, \ldots, m$.

It is precisely at the grouping stage that the initial time series is divided into periodic components, noise, and trend. The basic criterion of the grouping is the importance of each elementary matrix $V_k$, as measured by its eigenvalue $\lambda_k$.

At the last stage of the algorithm each matrix of the grouped decomposition (5) is converted into a series of length N.

Let $L^* = \min(L, K)$ and $K^* = \max(L, K)$. Let $y_{ij}$ be the elements of a resultant matrix $Y^{(s)}$, and let $y^*_{ij} = y_{ij}$ if L < K and $y^*_{ij} = y_{ji}$ if L ≥ K. Diagonal averaging converts each resultant matrix $Y^{(s)}$, s = 1, 2, ..., m, into a series $f^{(s)}_0, \ldots, f^{(s)}_{N-1}$ with the help of the following formula:

$$f^{(s)}_k = \begin{cases} \dfrac{1}{k+1} \sum\limits_{n=1}^{k+1} y^*_{n,\,k-n+2}, & 0 \le k < L^* - 1, \\[2ex] \dfrac{1}{L^*} \sum\limits_{n=1}^{L^*} y^*_{n,\,k-n+2}, & L^* - 1 \le k < K^*, \\[2ex] \dfrac{1}{N-k} \sum\limits_{n=k-K^*+2}^{N-K^*+1} y^*_{n,\,k-n+2}, & K^* \le k < N. \end{cases} \tag{6}$$

This formula corresponds to averaging the elements along the “diagonals” i + j = k + 2.

Thus, applying diagonal averaging (6) to the resultant matrices $Y^{(s)}$, we get the series $F^{(s)} = (f^{(s)}_0, f^{(s)}_1, \ldots, f^{(s)}_{N-1})$. The initial time series F is decomposed into the sum of m series:

$$F = \sum_{s=1}^{m} F^{(s)}, \qquad f_n = \sum_{s=1}^{m} f^{(s)}_n, \qquad n = 0, 1, \ldots, N-1. \tag{7}$$

So, the result of the algorithm is the decomposition of the time series into interpretable additive components. The method requires neither stationarity of the series, nor knowledge of the trend model, nor any data about the presence of periodic components in the series and their periods. Under such weak assumptions, the “Caterpillar”-SSA method is able to solve various tasks, such as trend extraction, detection of periodic components, smoothing of the series, and construction of the full decomposition of the series into the sum of trend, periodic components and noise [2].
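The four steps described above can be sketched in Python with NumPy. This is only an illustrative sketch, not the author's program: the function names are hypothetical, indexes in `groups` are 0-based (so the paper's index k corresponds to k − 1 here), and the eigenvalues are returned in nondecreasing order, as in the text.

```python
import numpy as np


def trajectory_matrix(f, L):
    """Embedding: build the L x K matrix whose columns are sliding windows of f."""
    f = np.asarray(f, dtype=float)
    N = f.size
    if not 1 < L < N:
        raise ValueError("window length must satisfy 1 < L < N")
    K = N - L + 1
    return np.column_stack([f[i:i + L] for i in range(K)])


def elementary_matrices(X):
    """Singular decomposition: V_i = U_i U_i^T X for the eigenvectors U_i of S = X X^T."""
    S = X @ X.T
    eigvals, U = np.linalg.eigh(S)  # eigh returns eigenvalues in ascending order
    V = [np.outer(U[:, i], U[:, i]) @ X for i in range(U.shape[1])]
    return eigvals, V


def diagonal_average(Y):
    """Average the elements of Y over the anti-diagonals i + j = const."""
    L, K = Y.shape
    sums = np.zeros(L + K - 1)
    counts = np.zeros(L + K - 1)
    for i in range(L):
        # row i of Y contributes to series positions i, i + 1, ..., i + K - 1
        sums[i:i + K] += Y[i]
        counts[i:i + K] += 1
    return sums / counts


def ssa_components(f, L, groups):
    """Return the eigenvalues of S and one restored series per index group."""
    X = trajectory_matrix(f, L)
    eigvals, V = elementary_matrices(X)
    series = [diagonal_average(sum(V[k] for k in group)) for group in groups]
    return eigvals, series
```

For instance, `ssa_components(f, 25, [[23, 24]])` would group the two elementary matrices with the largest eigenvalues for a series of length 50. If the groups form a partition of all L indexes, the restored series add up to the initial series, in agreement with (7).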

Certainly, the given method also has some disadvantages. First of all, there is no automatic grouping of the components of the singular decomposition of the trajectory matrix into the components of the initial series, while a successful decomposition depends on correct grouping. Secondly, the absence of a model makes it impossible to test a hypothesis about the presence of this or that component in the time series (this disadvantage is inherent in non-parametric methods in general). We should also note that in certain situations the considered non-parametric method gives results whose accuracy often differs only slightly from that of many parametric methods applied to series with a known model [3].

Let us examine how the algorithm works on three different examples in order to investigate its advantages and disadvantages. In each example the time series is the sum of generated interference $R_i$ and a given target function $x_i$:

$$f_i = x_i + R_i.$$

Further, we define the criterion of efficiency by the formula

$$W = \frac{\sum_{i=0}^{N-1} (\tilde{x}_i - x_i)^2}{\sum_{i=0}^{N-1} R_i^2} \cdot 100\,\%, \tag{8}$$

where $\tilde{x}_i$ is the restored (cleared of noise) series obtained with the help of the algorithm. In (8) the numerator is the sum of squared deviations of the restored series from the “clear” series, and the denominator is the sum of squared interferences. So, formula (8) shows what part of the interference has not been separated after applying the algorithm; we shall call it “silencing”.
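The criterion just described (sum of squared deviations of the restored series from the “clear” series, divided by the sum of squared interferences) can be computed as in the following minimal sketch. The function name and argument names are hypothetical, and the result is expressed as a percentage, as in the examples below.

```python
import numpy as np


def silencing_percent(restored, clean, noise):
    """Share of the interference energy left in the restored series, in percent."""
    restored = np.asarray(restored, dtype=float)
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)
    residual = np.sum((restored - clean) ** 2)  # what the algorithm failed to remove
    return 100.0 * residual / np.sum(noise ** 2)
```

A value of 0 % means that the restored series coincides with the “clear” one, while 100 % means that no interference has been removed at all.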

Example 1. A simple time series; weak interference:

$x_i = i + 10$; $i = 0, 1, \ldots, 49$; N = 50; L = 25.

$R_i$ is a random value uniformly distributed on the interval [-2; 2].

Matrix S has dimensions 25 × 25 and 25 eigenvalues $\lambda_i$ (Table 1).

The indexes 24 and 25 are chosen for grouping, as they correspond to the most significant components; the elementary matrices $V_{24}$ and $V_{25}$ correspond to them. Applying diagonal averaging to the resultant matrix $Y^{(1)} = V_{24} + V_{25}$, we get the restored series (Fig. 1).

Fig. 1. Graphs of series: “clear”, with noise and restored

The silencing is W = 11.4 %, i.e. 11.4 % of the initial interference remains in the restored series.
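A setting like that of Example 1 can be reproduced with the short script below. It is a sketch under stated assumptions: the random generator and its seed are arbitrary choices, so the eigenvalue contributions and the value of W will differ somewhat from those reported in the paper, whose random data are not available.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, not from the paper

N, L = 50, 25
K = N - L + 1
i = np.arange(N)
x = i + 10.0                      # "clear" series of Example 1
R = rng.uniform(-2.0, 2.0, N)     # interference, uniform on [-2, 2]
f = x + R

# Embedding and the matrix S = X X^T
X = np.column_stack([f[j:j + L] for j in range(K)])
S = X @ X.T

# Eigenvalues in nondecreasing order and their contributions in percent
eigvals, U = np.linalg.eigh(S)
shares = 100.0 * eigvals / eigvals.sum()

# Group the two most significant components (paper's indexes 24 and 25)
U2 = U[:, -2:]
Y = U2 @ U2.T @ X

# Diagonal averaging: diagonals of the row-flipped Y are the anti-diagonals of Y
restored = np.array([Y[::-1].diagonal(k).mean() for k in range(-(L - 1), K)])

# Silencing: share of interference energy left after restoration, in percent
W = 100.0 * np.sum((restored - x) ** 2) / np.sum(R ** 2)

print(np.round(shares[-3:], 2), round(W, 1))
```

The last two entries of `shares` play the role of the 24th and 25th columns of Table 1.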

Example 2. A series with a periodic component; average interference:

$x_i = \dfrac{i(i - 60)}{100} + 5 \sin(i)$; $i = 0, 1, \ldots, 59$; N = 60; L = 30.

$R_i$ is a random value uniformly distributed on the interval [-3; 3].

Matrix S has dimensions 30 × 30 and 30 eigenvalues $\lambda_i$ (Table 2).

The indexes from 27 to 30 are chosen for grouping, as they correspond to the most significant components; the elementary matrices $V_{27}$, $V_{28}$, $V_{29}$ and $V_{30}$ correspond to them. Applying diagonal averaging to the resultant matrix $Y^{(1)} = V_{27} + V_{28} + V_{29} + V_{30}$, we get the restored series (Fig. 2).

Fig. 2. Graphs of series: “clear”, with noise and restored

The silencing is W = 25.6 % of the initial interference.

Example 3. A series with several periodic components; high interference:

$x_i = 0.03 i + 1.6 \sin(0.3 i + 0.17) + 1.3 \sin(2 i + 0.57)$;

$i = 0, 1, \ldots, 49$; N = 50; L = 15.

Ri is a random value with normal distribution, c = Vs .

Matrix S has dimensions 15 × 15 and 15 eigenvalues $\lambda_i$ (Table 3).

In this situation the high interference makes the choice of components for grouping difficult, and the trend and periodic components are hard to recognize. The analysis has shown that in such a situation increasing the number of grouped indexes restores not only the additive components, but also part of the non-separated noise.

Table 1

Contribution of the eigenvalues λ_i of matrix S, as a percentage of their sum

i          1     2     3     4     5     6     7     8     9     10    11    12    13
λ_i, %     0.00  0.00  0.00  0.00  0.00  0.00  0.01  0.01  0.01  0.00  0.01  0.01  0.01

i          14    15    16    17    18    19    20    21    22    23    24    25
λ_i, %     0.02  0.02  0.02  0.02  0.03  0.03  0.03  0.04  0.08  0.08  2.76  96.8

Table 2

Contribution of the eigenvalues λ_i of matrix S, as a percentage of their sum

i          1     2     3     4     5     6     7     8     9     10    11    12    13    14    15
λ_i, %     0.03  0.03  0.03  0.03  0.04  0.05  0.02  0.07  0.07  0.07  0.08  0.01  0.00  0.00  0.12

i          16    17    18    19    20    21    22    23    24    25    26    27    28    29    30
λ_i, %     0.11  0.12  0.12  0.17  0.20  0.21  0.21  0.22  0.26  0.29  0.33  4.14  5.97  7.58  79.44

Table 3

Contribution of the eigenvalues λ_i of matrix S, as a percentage of their sum

i          1     2     3     4     5     6     7     8
λ_i, %     2.50  2.49  2.85  2.14  1.94  3.29  4.00  4.73

i          9     10    11    12    13    14    15
λ_i, %     5.26  6.83  7.78  9.99  11.31 13.83 21.07

With the 3 most significant components the silencing is W = 21.8 %, with 4 components W = 29.2 %, and with 5 components W = 34.6 %.

The results for 3 selected components are shown below as graphs (fig. 3).

Fig. 3. Graphs of series: “clear”, with noise and restored

Summing up the given examples, we can state that the basic algorithm of the “Caterpillar”-SSA method copes with the assigned task: it separates the trend and periodic components of a time series from interference, reducing the noise level by a factor of 2 to 3, even though the type of the significant components (linear, periodic, logarithmic or other) is not specified in advance. This is an advantage of the method, which will make it possible to create a powerful mechanism of non-parametric analysis of time series in the future, including computer programs.

The disadvantage of the basic algorithm is the need for manual intervention in the analysis of the separated components; there is also the problem of selecting the length of period, on which the quality of separation of the additive components depends. Further research will be dedicated to automating the analysis and to other methods that improve the quality of the algorithm's results and reduce the manual part of this process.

References

1. Golyandina N. E. The method of “Caterpillar”-SSA: the analysis of time series : textbook. Saint-Petersburg, 2004.

2. The main components of time series: the “Caterpillar” method / ed. by D. L. Danilov, A. A. Zhigliavski. Saint-Petersburg : Presscom, 1997.

3. Golyandina N., Nekrutkin V., Zhigljavsky A. Analysis of Time Series Structure: SSA and Related Techniques. London : Chapman & Hall/CRC, 2001.

© Vohmyanin S. V., 2010

Yu. Yu. Yakunin, A. A. Gorodilov Siberian Federal University, Russia, Krasnoyarsk

ALGORITHMS FOR CALCULATING COMPLEX INDICATORS IN DYNAMIC STRUCTURES OF DATA REPRESENTATION

This paper presents algorithms for calculating complex indicators on set factual data, represented in dynamic structures with the application of the graph theory.

Keywords: dynamic structures of the data, tripartite graph, algorithm of graph’s round, complex indicators.

The gap between scientific methods of representing (describing) real-world objects and the storage of this information in information systems has existed for a long time and has not yet been bridged satisfactorily. The database management system (DBMS) is the best tool available today; it allows information to be stored in the form of objects [1; 2] (the object-oriented approach [3]) or globals [1] (a hierarchical, tree-like representation of information). However, even such an approach can capture only part of the variety of information representations used in modern scientific methods [4]. This gap substantially slows the development of science and engineering in the field of information technology.

Accordingly, an essential restriction in information system design is the standard way of storing data, which is based on static structures (i.e. the data storage structure of the database is created in advance to describe the objects of the subject field). As a result, such a structure must be created by the designer of the information system at the design stage, and it cannot change during the development and maintenance of the system. It is not necessary to speak about the expenses at which changes in the system come. It is obvious that if such changes are possible, even in an insignificant part of this structure, it would come at the same expense as the original production of
