
MSC 91B84

DOI: 10.14529/mmp180104

STABLE IDENTIFICATION OF LINEAR AUTOREGRESSIVE MODEL

WITH EXOGENOUS VARIABLES ON THE BASIS

OF THE GENERALIZED LEAST ABSOLUTE DEVIATION METHOD

A.V. Panyukov, Ya.A. Mezaal

South Ural State University, Chelyabinsk, Russian Federation E-mail: paniukovav@susu.ru, yaser_ali_84@yahoo.com

The Least Absolute Deviations (LAD) method is an alternative to the Ordinary Least Squares (OLS) method. It allows one to obtain robust estimates when the OLS assumptions are violated. We consider two types of LAD: the Weighted LAD method and the Generalized LAD method. The established interrelation between the methods makes it possible to reduce the problem of determining GLAD estimates to an iterative procedure with WLAD estimates, which are computed by solving the corresponding linear programming problem. A sufficient condition imposed on the loss function is found that ensures the stability of the GLAD estimators of the coefficients of autoregressive models in the presence of outliers. Special features of applying the GLAD method to the construction of the regression equation and of the autoregressive equation without exogenous variables were considered earlier. This paper extends the previously discussed methods to the problem of estimating the parameters of autoregressive models with exogenous variables.

Keywords: algorithm; autoregressive model; linear programming; parameter identification.

Introduction

One of the important problems in measurement theory is the identification of linear autoregressive models [1-4]

$$ y_t = \sum_{j=1}^{m} a_j y_{t-j} + \sum_{j=1}^{n} b_j x_{tj} + \varepsilon_t, \quad t = 1, 2, \ldots, T, \qquad (1) $$

here $y_1, y_2, \ldots, y_T$ are the values of the state variable, $x_{t1}, x_{t2}, \ldots, x_{tn}$ are the values of the controls at time points $t = 1, 2, \ldots, T$, $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_T$ are random errors, and $a_1, a_2, \ldots, a_m$ and $b_1, b_2, \ldots, b_n$ are unknown coefficients.
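Model (1) is easy to exercise numerically. The sketch below generates data from (1) in Python under the assumption that pre-sample lags are zero; the function name `simulate_arx` and all parameter values are illustrative, not taken from the paper.

```python
import numpy as np

def simulate_arx(a, b, x, sigma=0.1, rng=None):
    """Simulate model (1): y_t = sum_j a_j*y_{t-j} + sum_j b_j*x_{tj} + e_t.
    Pre-sample lags (t - j < 0) are taken to be zero -- an assumption,
    not something fixed by the paper."""
    rng = np.random.default_rng(0) if rng is None else rng
    a = np.asarray(a, float)
    b = np.asarray(b, float)
    T = x.shape[0]
    y = np.zeros(T)
    e = rng.normal(0.0, sigma, T)         # random errors epsilon_t
    for t in range(T):
        # autoregressive part: lags before the start of the sample are zero
        ar = sum(a[j] * y[t - 1 - j] for j in range(len(a)) if t - 1 - j >= 0)
        y[t] = ar + x[t] @ b + e[t]
    return y

rng = np.random.default_rng(42)
x = rng.normal(size=(200, 2))             # two exogenous variables
y = simulate_arx([0.5, -0.2], [1.0, 0.7], x, rng=rng)
```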

We consider the estimation of the coefficients of the linear autoregressive equation (1) with exogenous variables. Ordinary Least Squares (OLS) is the parametric method most commonly used for estimating regression coefficients. OLS requires rather strict assumptions, including independence and normal distribution of the errors and determinacy of the explanatory variables [5]. Even minor violations of these assumptions dramatically lower the efficiency of the estimators. Note also the instability of the OLS estimation process in the presence of large measurement errors: in this case the estimated coefficients become inconsistent. Finding estimates of the autoregressive equation is further complicated by the poor conditioning of the system of equations representing the necessary conditions for minimizing the sum of squared deviations.
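The sensitivity of OLS to a single gross error is easy to reproduce. The following sketch fits an AR(1) model by the closed-form OLS slope; the setup (true coefficient 0.6, one corrupted observation of size 25) is our illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100
y = np.zeros(T + 1)
for t in range(1, T + 1):               # AR(1): y_t = 0.6*y_{t-1} + e_t
    y[t] = 0.6 * y[t - 1] + rng.normal(0.0, 0.1)

def ols_ar1(series):
    """Closed-form OLS slope for y_t = a * y_{t-1} + e_t."""
    Y, X = series[1:], series[:-1]
    return float(X @ Y) / float(X @ X)

a_clean = ols_ar1(y)                    # close to the true value 0.6
y_bad = y.copy()
y_bad[50] += 25.0                       # a single gross measurement error
a_bad = ols_ar1(y_bad)                  # estimate is dragged far from 0.6
```

The corrupted observation enters both the regressor and the response, so its squared magnitude dominates the normal equations and pulls the slope estimate toward zero.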

The Least Absolute Deviations (LAD) method is an alternative to OLS. It allows one to obtain robust estimates when the OLS assumptions are violated [5].

We consider two types of LAD: the Weighted LAD (WLAD) method [6] and the Generalized LAD (GLAD) method [9]. The interrelation between the methods established in [10] makes it possible to reduce the problem of determining GLAD estimates to an iterative procedure with WLAD estimates, which are computed by solving the corresponding linear programming problem. The sufficient condition imposed on the loss function in [9] ensures the stability of the GLAD estimators of the coefficients of autoregressive models in the presence of outliers. Special features of applying the GLAD method to the construction of the regression equation are considered in [10]; its use for constructing the autoregressive equation without exogenous variables is considered in [9]. This paper extends the previously discussed methods to the problem of estimating the parameters of autoregressive models with exogenous variables.

1. The Relationship between WLAD and GLAD Estimates

One can get the WLAD estimates of the coefficients by solving the problem

$$ \left( a_1^*, \ldots, a_m^*, b_1^*, \ldots, b_n^* \right) = \arg\min_{\substack{(a_1,\ldots,a_m) \in \mathbb{R}^m \\ (b_1,\ldots,b_n) \in \mathbb{R}^n}} \sum_{t=1}^{T} p_t \left| y_t - \sum_{j=1}^{m} a_j y_{t-j} - \sum_{j=1}^{n} b_j x_{tj} \right|, \qquad (2) $$

where $p_t > 0$, $t = 1, 2, \ldots, T$, are predetermined weight coefficients. This is a convex piecewise linear optimization problem, and the introduction of additional variables reduces it to the linear programming problem

$$ \min_{\substack{(a_1,\ldots,a_m) \in \mathbb{R}^m \\ (b_1,\ldots,b_n) \in \mathbb{R}^n \\ (u_1,\ldots,u_T) \in \mathbb{R}^T}} \left\{ \sum_{t=1}^{T} p_t u_t \; : \; -u_t \le y_t - \sum_{j=1}^{m} a_j y_{t-j} - \sum_{j=1}^{n} b_j x_{tj} \le u_t, \; u_t \ge 0, \; t = 1, 2, \ldots, T \right\}. \qquad (3) $$

This problem has a canonical form with $m + n + T$ variables and $3T$ inequality constraints, including the non-negativity conditions for the variables $u_t$, $t = 1, 2, \ldots, T$. The main difficulty in using the WLAD method is the absence of general formal rules for choosing the weight coefficients; consequently, this approach requires additional research.
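Problem (3) maps directly onto a standard LP solver. A minimal sketch using `scipy.optimize.linprog` is given below; the function name `wlad` and the row convention (only rows with all $m$ lags available enter the problem) are our assumptions, not the paper's.

```python
import numpy as np
from scipy.optimize import linprog

def wlad(y, x, m, p=None):
    """WLAD estimate, problem (3), solved as a linear program (a sketch).
    y: length-T series, x: (T, n) exogenous matrix, m: AR order."""
    T, n = x.shape
    rows = np.arange(m, T)                 # rows with all m lags available
    # design matrix: m lagged values of y, then the n exogenous variables
    Z = np.column_stack([y[rows - j] for j in range(1, m + 1)]
                        + [x[rows, j] for j in range(n)])
    r = y[rows]
    N, k = Z.shape
    p = np.ones(N) if p is None else np.asarray(p, float)
    # variables: theta = (a_1..a_m, b_1..b_n) free, then u_t >= 0
    c = np.concatenate([np.zeros(k), p])
    A_ub = np.vstack([np.hstack([-Z, -np.eye(N)]),    # -u_t <= residual_t
                      np.hstack([ Z, -np.eye(N)])])   #  residual_t <= u_t
    b_ub = np.concatenate([-r, r])
    bounds = [(None, None)] * k + [(0, None)] * N
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    theta, u = res.x[:k], res.x[k:]
    return theta[:m], theta[m:], u         # a-hat, b-hat, residual magnitudes
```

With unit weights this reduces to the plain LAD estimator; the weights $p_t$ are exactly the objective coefficients of the $u_t$ variables.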

The GLAD estimates can be obtained from the solution of the problem

$$ \left( a_1^*, \ldots, a_m^*, b_1^*, \ldots, b_n^* \right) = \arg\min_{\substack{(a_1,\ldots,a_m) \in \mathbb{R}^m \\ (b_1,\ldots,b_n) \in \mathbb{R}^n}} \sum_{t=1}^{T} \rho\left( \left| y_t - \sum_{j=1}^{m} a_j y_{t-j} - \sum_{j=1}^{n} b_j x_{tj} \right| \right), \qquad (4) $$

where $\rho(\cdot)$ is a convex upward, monotonically increasing, twice continuously differentiable function with $\rho(0) = 0$.
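A concrete loss satisfying these requirements is, for instance, $\rho(u) = \ln(1 + u)$; this particular choice is our illustrative assumption, not one made in the paper. The snippet checks numerically that the tangent lines of such a concave loss lie on or above its graph, the property exploited later in the proof of Theorem 3.

```python
import math

# an admissible loss: rho(0) = 0, increasing, convex upward (concave),
# with a finite derivative at zero, rho'(0) = 1
rho = lambda u: math.log1p(u)
drho = lambda u: 1.0 / (1.0 + u)

def tangent(u0, u):
    """First-order expansion of rho at u0; for a concave rho this line
    lies on or above the graph of rho (the majorant used in Theorem 3)."""
    return rho(u0) + drho(u0) * (u - u0)

u0 = 2.0
checks = [0.0, 0.5, 2.0, 5.0, 10.0]
assert all(tangent(u0, u) >= rho(u) - 1e-12 for u in checks)
```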

Theorem 1. All local minima of the GLAD estimation problem (4) for the coefficients of the autoregressive equation (1) belong to the set

$$ \Omega = \left\{ \left( a_1^{(k)}, \ldots, a_m^{(k)}, b_1^{(k)}, \ldots, b_n^{(k)} \right) : \; y_t = \sum_{j=1}^{m} a_j y_{t-j} + \sum_{j=1}^{n} b_j x_{tj}, \; t \in k, \right. $$
$$ \left. k = \left\{ k_1, k_2, \ldots, k_{m+n} : 1 \le k_1 < k_2 < \ldots < k_{m+n} \le T \right\} \right\}. $$

Proof. The set $\Omega$ contains the solutions of all possible consistent systems of $m + n$ linearly independent equations

$$ y_t = \sum_{j=1}^{m} a_j y_{t-j} + \sum_{j=1}^{n} b_j x_{tj}, \quad t \in k, $$

with $m + n$ unknowns $a_1, a_2, \ldots, a_m, b_1, b_2, \ldots, b_n$.

If a solution $(a_1, \ldots, a_m, b_1, \ldots, b_n) \notin \Omega$, then there exists an $\varepsilon$-neighbourhood of it in which the loss function is continuous and convex upward. Consequently, such a solution cannot be a local minimum. This implies the assertion of the theorem. $\Box$

Obviously, the number of such systems equals $C_T^{m+n}$. Thus, the solution of problem (4) can be reduced to choosing the best of the $C_T^{m+n}$ solutions of systems of linear algebraic equations.

In practice this approach is applicable only for small $m + n$. To compute GLAD estimates for higher-dimensional problems, the interrelation between WLAD and GLAD estimates has to be used.

Theorem 2.

(1) For each collection of weights $\{p_t\}_{t=1}^{T}$,

$$ \arg\min_{\substack{(a_1,\ldots,a_m) \in \mathbb{R}^m \\ (b_1,\ldots,b_n) \in \mathbb{R}^n}} \sum_{t=1}^{T} p_t \left| y_t - \sum_{j=1}^{m} a_j y_{t-j} - \sum_{j=1}^{n} b_j x_{tj} \right| \in \Omega; \qquad (5) $$

(2) for every $\left( a_1^{(k)}, \ldots, a_m^{(k)}, b_1^{(k)}, \ldots, b_n^{(k)} \right) \in \Omega$ there is a collection of weights $\{p_t\}_{t=1}^{T}$ such that

$$ \left( a_1^{(k)}, \ldots, a_m^{(k)}, b_1^{(k)}, \ldots, b_n^{(k)} \right) \in \arg\min_{\substack{(a_1,\ldots,a_m) \in \mathbb{R}^m \\ (b_1,\ldots,b_n) \in \mathbb{R}^n}} \sum_{t=1}^{T} p_t \left| y_t - \sum_{j=1}^{m} a_j y_{t-j} - \sum_{j=1}^{n} b_j x_{tj} \right|. \qquad (6) $$

Proof. The proof of the first part essentially repeats the proof of Theorem 1. The validity of the second part follows from the fact that the weights of the active constraints can be taken non-zero while the weights of the inactive constraints are set to zero. In this case the minimal value of the loss function equals zero, and it is achieved at the solution of the chosen system of equations. This implies the assertion of the theorem. $\Box$

Theorems 1 and 2 give a way to determine the weight coefficients for the linear programming problem (3) and thus allow problem (4) to be reduced to solving a sequence of linear programming problems of the form (3).

2. Algorithm for Computing GLAD Estimates

The direct solution of problem (4) is based on Theorem 1 and involves finding all node points and choosing the one that ensures the minimum of the objective function.

The brute-force algorithm requires solving $C_T^{m+n}$ systems of linear equations of order $m + n$. For large values of $n$ and $m$ this leads to significant computational complexity. An alternative approach is based on reducing the problem to a sequence of linear programming problems (3). Consider possible algorithms based on this approach.
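For tiny $m + n$ the brute-force search can be sketched directly: enumerate all $(m+n)$-element subsets of time indices, solve the corresponding linear subsystems, and keep the solution with the smallest loss. Function names, the loss choice, and the data layout below are illustrative assumptions.

```python
import itertools
import math
import numpy as np

def glad_brute_force(y, x, m, rho):
    """Brute-force GLAD: enumerate the node set of Theorem 1 by solving
    every (m+n)x(m+n) subsystem y_t = sum a_j y_{t-j} + sum b_j x_{tj}
    and keep the solution with the smallest loss. Viable only for tiny m+n."""
    T, n = x.shape
    k = m + n
    rows = np.arange(m, T)                 # rows with all m lags available
    Z = np.column_stack([y[rows - j] for j in range(1, m + 1)]
                        + [x[rows, j] for j in range(n)])
    r = y[rows]
    best, best_loss = None, math.inf
    for idx in itertools.combinations(range(len(r)), k):
        sub = list(idx)
        try:
            theta = np.linalg.solve(Z[sub], r[sub])
        except np.linalg.LinAlgError:
            continue                       # singular subsystem: not a node
        loss = float(sum(rho(abs(v)) for v in r - Z @ theta))
        if loss < best_loss:
            best, best_loss = theta, loss
    return best, best_loss
```

The number of subsystems grows as $C_T^{m+n}$, which is exactly why the iterative reweighting approach of the next section is needed for realistic orders.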

Algorithm GLAD estimator.

Input: number of measurements $T$; values $\{y_t\}_{t=1}^{T}$ of the endogenous variable; values $\{\{x_{tj}\}_{t=1}^{T}\}_{j=1}^{n}$ of the exogenous variables; loss function $\rho(\cdot)$.

Output: estimates $\left( a_1^*, \ldots, a_m^*, b_1^*, \ldots, b_n^* \right)$ of the coefficients of the autoregressive equation (1).

Step 1. For all $t = 1, 2, \ldots, T$ set $p_t := 1$; set $k := 0$; find

$$ \left( a_1^{(0)}, \ldots, a_m^{(0)}, b_1^{(0)}, \ldots, b_n^{(0)}, u_1^{(0)}, \ldots, u_T^{(0)} \right) $$

as the solution of the linear programming problem (3) with the current weights $\{p_t\}_{t=1}^{T}$.

Step 2. Set $k := k + 1$; for all $t = 1, 2, \ldots, T$ set $p_t := \rho'\left( u_t^{(k-1)} \right)$; find

$$ \left( a_1^{(k)}, \ldots, a_m^{(k)}, b_1^{(k)}, \ldots, b_n^{(k)}, u_1^{(k)}, \ldots, u_T^{(k)} \right) $$

as the solution of problem (3) with the updated weights.

Step 3. If $\left( a_1^{(k)}, \ldots, a_m^{(k)}, b_1^{(k)}, \ldots, b_n^{(k)} \right) \ne \left( a_1^{(k-1)}, \ldots, a_m^{(k-1)}, b_1^{(k-1)}, \ldots, b_n^{(k-1)} \right)$, then go to Step 2.

Step 4. Stop. The target values are $\left( a_1^{(k)}, \ldots, a_m^{(k)}, b_1^{(k)}, \ldots, b_n^{(k)} \right)$.
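The steps above amount to an iteratively reweighted linear program. A minimal sketch is given below; it assumes the Step 2 weight update $p_t := \rho'\left(u_t^{(k-1)}\right)$ (consistent with the proof of Theorem 3), and the names `glad` and `drho` are ours.

```python
import numpy as np
from scipy.optimize import linprog

def glad(y, x, m, drho, max_iter=50):
    """GLAD estimate via iterated WLAD (Steps 1-4 above, a sketch).
    drho: derivative of the loss rho, used to update the weights."""
    T, n = x.shape
    rows = np.arange(m, T)                 # rows with all m lags available
    Z = np.column_stack([y[rows - j] for j in range(1, m + 1)]
                        + [x[rows, j] for j in range(n)])
    r = y[rows]
    N, k = Z.shape
    A_ub = np.vstack([np.hstack([-Z, -np.eye(N)]),   # -u_t <= residual_t
                      np.hstack([ Z, -np.eye(N)])])  #  residual_t <= u_t
    b_ub = np.concatenate([-r, r])
    bounds = [(None, None)] * k + [(0, None)] * N
    p = np.ones(N)                                   # Step 1: unit weights
    theta_prev = None
    for _ in range(max_iter):
        c = np.concatenate([np.zeros(k), p])
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
        theta, u = res.x[:k], res.x[k:]
        if theta_prev is not None and np.allclose(theta, theta_prev):
            break                                    # Step 3: fixed point
        theta_prev = theta
        p = np.array([drho(ut) for ut in u])         # Step 2: reweight
    return theta[:m], theta[m:]
```

Note that only the objective coefficients change between iterations, exactly the property discussed at the end of the proof of Theorem 3.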

The justification of this algorithm is given by the following theorem.

Theorem 3. If the loss function $\rho(\cdot)$ is convex upward, monotonically increasing, continuously differentiable on the positive semi-axis, and satisfies the condition $\rho'(0) = M < \infty$, then the sequence

$$ \left( a_1^{(k)}, \ldots, a_m^{(k)}, b_1^{(k)}, \ldots, b_n^{(k)} \right), \quad k = 1, 2, \ldots, $$

constructed by the GLAD estimation algorithm converges to the global extremum of problem (4).

Proof. It follows from the requirements imposed on the function $\rho(\cdot)$ that at any point $u^{(k)}$ the approximation

$$ \psi_{u^{(k)}}(u) = \rho\left( u^{(k)} \right) - \rho'\left( u^{(k)} \right) u^{(k)} + \rho'\left( u^{(k)} \right) u \qquad (7) $$

is a majorant, i.e.

$$ \left( \forall u \ne u^{(k)} \right) \left( \rho(u) < \psi_{u^{(k)}}(u) \right), \qquad \rho\left( u^{(k)} \right) = \psi_{u^{(k)}}\left( u^{(k)} \right). $$

Therefore, in accordance with the algorithm,

$$ \sum_{t=1}^{T} \rho\left( \left| y_t - \sum_{j=1}^{m} a_j^{(k)} y_{t-j} - \sum_{j=1}^{n} b_j^{(k)} x_{tj} \right| \right) = \sum_{t=1}^{T} \psi_{u_t^{(k)}}\left( \left| y_t - \sum_{j=1}^{m} a_j^{(k)} y_{t-j} - \sum_{j=1}^{n} b_j^{(k)} x_{tj} \right| \right) $$
$$ \ge \min_{\substack{(a_1,\ldots,a_m) \in \mathbb{R}^m \\ (b_1,\ldots,b_n) \in \mathbb{R}^n}} \sum_{t=1}^{T} \psi_{u_t^{(k)}}\left( \left| y_t - \sum_{j=1}^{m} a_j y_{t-j} - \sum_{j=1}^{n} b_j x_{tj} \right| \right) = \sum_{t=1}^{T} \psi_{u_t^{(k)}}\left( \left| y_t - \sum_{j=1}^{m} a_j^{(k+1)} y_{t-j} - \sum_{j=1}^{n} b_j^{(k+1)} x_{tj} \right| \right) \qquad (8) $$
$$ \ge \sum_{t=1}^{T} \rho\left( \left| y_t - \sum_{j=1}^{m} a_j^{(k+1)} y_{t-j} - \sum_{j=1}^{n} b_j^{(k+1)} x_{tj} \right| \right). $$

The first equality holds because $\psi_{u_t^{(k)}}\left( u_t^{(k)} \right) = \rho\left( u_t^{(k)} \right)$, and the inequality following it is obvious. The second equality is the result of the choice of the weight coefficients $p_t = \rho'\left( u_t^{(k)} \right)$ in Step 2 together with equality (7): minimizing $\sum_t \psi_{u_t^{(k)}}$ differs from the weighted problem (3) only by a constant, so its solution is the $(k+1)$-th iterate. The last inequality is a consequence of the majorant property (7). Therefore

$$ \sum_{t=1}^{T} \rho\left( u_t^{(k)} \right) \ge \sum_{t=1}^{T} \psi_{u_t^{(k)}}\left( u_t^{(k+1)} \right) \ge \sum_{t=1}^{T} \rho\left( u_t^{(k+1)} \right), \qquad u_t^{(k)} = \left| y_t - \sum_{j=1}^{m} a_j^{(k)} y_{t-j} - \sum_{j=1}^{n} b_j^{(k)} x_{tj} \right|; $$

moreover, equality is attained only if $u_t^{(k+1)} = u_t^{(k)}$ for all $t = 1, 2, \ldots, T$. That is why the sequence

$$ \left\{ \sum_{t=1}^{T} \rho\left( \left| y_t - \sum_{j=1}^{m} a_j^{(k)} y_{t-j} - \sum_{j=1}^{n} b_j^{(k)} x_{tj} \right| \right) \right\}_{k=0,1,\ldots} $$

is monotonically decreasing and bounded below by zero; hence it has a unique limit point. The existence of a limit point of the sequence

$$ \left( a_1^{(k)}, \ldots, a_m^{(k)}, b_1^{(k)}, \ldots, b_n^{(k)} \right), \quad k = 1, 2, \ldots, $$

follows from the continuity and monotonicity of the function $\rho(\cdot)$. The limit point

$$ \left( a_1^*, \ldots, a_m^*, b_1^*, \ldots, b_n^* \right) $$

built by the algorithm is the global minimum because for any

$$ \left( a_1, \ldots, a_m, b_1, \ldots, b_n \right) \in \mathbb{R}^{m+n} $$

we have the following sequence of statements:

$$ \sum_{t=1}^{T} \rho\left( \left| y_t - \sum_{j=1}^{m} a_j^* y_{t-j} - \sum_{j=1}^{n} b_j^* x_{tj} \right| \right) = \sum_{t=1}^{T} \psi_{u_t^*}\left( \left| y_t - \sum_{j=1}^{m} a_j^* y_{t-j} - \sum_{j=1}^{n} b_j^* x_{tj} \right| \right) $$
$$ \le \sum_{t=1}^{T} \psi_{u_t^*}\left( \left| y_t - \sum_{j=1}^{m} a_j y_{t-j} - \sum_{j=1}^{n} b_j x_{tj} \right| \right), \qquad u_t^* = \left| y_t - \sum_{j=1}^{m} a_j^* y_{t-j} - \sum_{j=1}^{n} b_j^* x_{tj} \right|. $$

Here the equality holds since $\psi_{u}(u) = \rho(u)$, and the inequality holds because the limit point is a fixed point of the algorithm and therefore minimizes the weighted problem (3) with weights $p_t = \rho'\left( u_t^* \right)$. Together with Theorem 1, which locates all local minima in the finite set $\Omega$, this implies that the limit point delivers the global minimum of problem (4). $\Box$

The advantage of the proposed algorithm over the brute-force approach is a sufficiently high rate of convergence with efficient use of linear programming methods. Indeed, the linear programming problem at Step 2 of iteration $k$ differs from the corresponding problem at iteration $k - 1$ only in the coefficients of the objective function, which allows us to use the optimal basic solution of the previous iteration as the initial basic solution at the current iteration.

A feature of identifying high-order autoregressive equations is the high sensitivity of the algorithm to rounding errors. One may eliminate this problem by using error-free execution of the basic arithmetic operations over the field of rational numbers [7] and the application of parallelization [8].
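Error-free arithmetic over the rationals can be illustrated with Python's `fractions.Fraction`; the toy solver below is only a sketch of the idea, not the algorithm of [7].

```python
from fractions import Fraction

def solve_exact(A, b):
    """Gauss-Jordan elimination over the field of rational numbers:
    every intermediate value is an exact Fraction, so no rounding occurs."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(bv)] for row, bv in zip(A, b)]
    for i in range(n):
        piv = next(r for r in range(i, n) if M[r][i] != 0)  # any non-zero pivot
        M[i], M[piv] = M[piv], M[i]
        for r in range(n):
            if r != i and M[r][i] != 0:
                f = M[r][i] / M[i][i]
                M[r] = [a - f * c for a, c in zip(M[r], M[i])]
    return [M[i][n] / M[i][i] for i in range(n)]

x = solve_exact([[1, Fraction(1, 3)], [Fraction(1, 3), 1]], [1, 0])
assert x == [Fraction(9, 8), Fraction(-3, 8)]      # exact, not approximate
```

Since no pivot is ever rounded, ill-conditioning degrades only running time (fraction sizes grow), never accuracy, which is the point of [7].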

Conclusion

The relationship between the generalized and weighted least absolute deviations methods was established for problems of autoregressive analysis with exogenous variables. This relationship makes it possible to reduce the problem of determining the GLAD estimates to an iterative procedure with WLAD estimates, which are computed by solving the corresponding linear programming problem.

A sufficient condition imposed on the loss function is found that ensures the stability of the GLAD estimators of the coefficients of autoregressive models in the presence of outliers.

Acknowledgment. The work was carried out with the financial support of the Government of the Russian Federation, contract no. 02.A03.21.0011.

References

1. Mudrov V.I., Kushko V.L. Metody obrabotki izmerenii: Kvazipravdopodobnye otsenki [Methods for Processing Measurements: Quasi-Plausible Estimates]. Moscow, LEVAND, 2014. (in Russian)

2. Gurin L.S. [On the Consistency of Estimates of the Method of Least Squares]. Matematicheskoe obespechenie kosmicheskikh eksperimentov [Mathematical Support of Cosmic Experiments], Moscow, Nauka, 1978, pp. 69-81. (in Russian)

3. Tyrsin A.N. The Method of Choosing the Best Distribution Law of a Continuous Random Variable on the Basis of the Inverse Mapping. Bulletin of South Ural State University. Series: Mathematics. Mechanics. Physics, 2017, vol. 9, no. 1, pp. 31-38. (in Russian) DOI: 10.14529/mmph170104

4. Shestakov A.L., Keller A.V., Sviridyuk G.A. The Theory of Optimal Measurements. Journal of Computational and Engineering Mathematics, 2014, vol. 1, no. 1, pp. 3-16.

5. Huber P., Ronchetti E.M. Robust Statistics. New Jersey, Wiley, 2009. DOI: 10.1002/9780470434697

6. Pan J., Wang H., Qiwei Y. Weighted Least Absolute Deviations Estimation for ARMA Models with Infinite Variance. Econometric Theory, 2007, vol. 23, pp. 852-879. DOI: 10.1017/S0266466607070363

7. Panyukov A.V. Scalability of Algorithms for Arithmetic Operations in Radix Notation. Reliable Computing, 2015, vol. 19, no. 4, pp. 417-434.

8. Panyukov A.V., Gorbik V.V. Using Massively Parallel Computations for Absolutely Precise Solution of the Linear Programming Problems. Automation and Remote Control, 2012, vol. 73, no. 2, pp. 276-290. DOI: 10.1134/S0005117912020063

9. Panyukov A.V., Tyrsin A.N. Stable Parametric Identification of Vibratory Diagnostics Objects. Journal of Vibroengineering, 2008, vol. 10, no. 2, pp. 142-146.

10. Tyrsin A.N. Robust Construction of Regression Models Based on the Generalized Least Absolute Deviations Method. Journal of Mathematical Sciences, 2006, vol. 139, no. 3, pp. 6634-6642. DOI: 10.1007/s10958-006-0380-7

Received August 12, 2017


Anatoly Vasilyevich Panyukov, DSc (Physics and Mathematics), Professor, Department of Mathematical and Computer Modelling, South Ural State University (Chelyabinsk, Russian Federation), paniukovav@susu.ru.

Yasir Ali Mezaal, postgraduate student, Department of Mathematical and Computer Modelling, South Ural State University (Chelyabinsk, Russian Federation), yaser_ali_84@yahoo.com.
