Modeling and forecasting Ukraine's population by time series using the Matlab Econometrics Toolbox

Research article in mathematics. License: CC BY. Journal: Business Inform.

Keywords: TIME SERIES / NONSTATIONARITY / ARIMA MODELS / ECONOMETRICS TOOLBOX / MATLAB

Abstract of the research article in mathematics; authors: Kovalova Kateryna O., Misiura Ievgeniia Yu.

The article deals with modeling and forecasting the population of Ukraine by time series. It is shown that time series analysis is a complex, multicomponent econometric task which does not have a universal approach to its solution. This is due both to the diversity of methods of and approaches to time series analysis which were developed over time and to the specifics of time series data. For example, the authors of the article worked with a univariate nonstationary time series, therefore, the approaches and methods presented in the article are not recommended for time series with different properties. The article has an enormous practical value, since it discusses in detail issues of computer modeling of tasks of the kind. The carried out analysis of the literature has shown the relevance of the problems considered, among which particular attention should be paid to the choice of the ARIMA model, data visualization, and forecast accuracy.


UDC 519.246.85 JEL: C02

MODELING AND FORECASTING UKRAINE'S POPULATION BY TIME SERIES USING THE MATLAB ECONOMETRICS TOOLBOX

© 2019 KOVALOVA K. O., MISIURA I. Y.

Kovalova K. O., Misiura I. Y. Modeling and Forecasting Ukraine's Population by Time Series Using the Matlab Econometrics Toolbox


Keywords: time series, nonstationarity, ARIMA models, Econometrics Toolbox, MATLAB. DOI: https://doi.org/10.32983/2222-4459-2019-5-98-105 Fig.: 4. Tabl.: 1. Formulae: 3. Bibl.: 13.

Kovalova Kateryna O. - Candidate of Sciences (Engineering), Associate Professor of the Department of Mathematics and Mathematical Methods in Economics, Simon Kuznets Kharkiv National University of Economics (9a Nauky Ave., Kharkiv, 61166, Ukraine) E-mail: kateryna.kovalova@m.hneu.edu.ua ORCID: http://orcid.org/0000-0001-6790-6147

Misiura Ievgeniia Yu. - Candidate of Sciences (Engineering), Associate Professor, Associate Professor of the Department of Mathematics and Mathematical Methods in Economics, Simon Kuznets Kharkiv National University of Economics (9a Nauky Ave., Kharkiv, 61166, Ukraine) E-mail: misuraeu@gmail.com ORCID: http://orcid.org/0000-0002-5208-0853


BUSINESS INFORM No. 5, 2019

www.business-inform.net

Time series analysis is a complex, multi-step process. In general, it comprises the four main stages described below.

1. Choice of a time series analysis method (frequency-domain and time-domain methods [1]; parametric and non-parametric methods [2]; linear and nonlinear methods; univariate and multivariate ones).

2. Exploratory analysis of time series data (autocorrelation analysis to examine serial dependence [3]; spectral analysis to examine cyclic behavior which need not be related to seasonality [4]; separation into components representing trend, seasonality, slow and fast variation, and cyclical irregularity).

3. Curve fitting, which is the process of constructing a curve or mathematical function that has the best fit to a series of data points. At this stage a researcher may choose a model for the time series data.

Time series models appear in many forms and represent various random processes. When modeling variations in the level of a process, three main classes can be singled out: autoregressive (AR) models, integrated (I) models, and moving average (MA) models. These three classes depend linearly on previous data points. Combinations of these models give autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA) models. The autoregressive fractionally integrated moving average (ARFIMA) model generalizes the three above-mentioned classes. Nonlinear dependence of the level of a time series on previous data points is also of interest, partly because it can produce chaotic time series; nonlinear models that describe time-varying variance include ARCH, GARCH, TARCH, EGARCH, FIGARCH, CGARCH, etc. [5].

4. Forecasting, which is the ultimate goal of time series analysis [6].
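For reference, the model classes named in stage 3 can be written out explicitly. The notation below (coefficients \phi_i and \theta_j, white-noise term \varepsilon_t, lag operator L) is an assumption chosen to match the lag-polynomial form used later in the article; constant terms are omitted for brevity.

```latex
\begin{align*}
\text{AR}(p):\quad & y_t = \sum_{i=1}^{p} \phi_i\, y_{t-i} + \varepsilon_t, \\
\text{MA}(q):\quad & y_t = \varepsilon_t + \sum_{j=1}^{q} \theta_j\, \varepsilon_{t-j}, \\
\text{ARMA}(p,q):\quad & y_t = \sum_{i=1}^{p} \phi_i\, y_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j\, \varepsilon_{t-j}, \\
\text{I}(D):\quad & (1-L)^{D} y_t \ \text{is stationary, where } L\, y_t = y_{t-1}.
\end{align*}
```

In this notation an ARIMA(p, D, q) model is an ARMA(p, q) model for the series differenced D times.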

Another important question which should be considered concerns choosing a proper computer program for time series analysis and forecasting.

The article deals with some problems of the aforementioned stages of time series analysis. As an example, the authors propose to consider modeling and forecasting the population of Ukraine by time series. The population data are presented as a univariate nonstationary time series. The next section of the article gives a review of the literature on the issue under study.

Currently, Ukraine is experiencing demographic decline. According to the State Statistics Service, during a ten month period of 2018, the population of Ukraine decreased by 166.3 thousand people [7]. This indicator is 18 % higher than that for the same period of 2017. The situation may be related to the migration outflow of the population and its natural decline. In view of the critical demographic situation, the authors propose using time series analysis to describe the population dynamics, for relevant authorities to promptly respond to challenges in this sphere.

Many modern scientists study demographic issues. The problem of the decrease in the working population (mostly through migration) and different ways to solve it are covered in articles [8], [9], and [10]. Statistical approaches to the analysis and forecasting of demographic data are also considered in the literature. For example, [11] presents new APC (age, period, and cohort effects) models, methods, and empirical applications which are based on the fundamental works [1-6]. Article [12] presents time series approaches to understanding the demographics of users of online social networks. The article highlights the digital aspect of demographic issues: social networks occupy an important place in the modern world of digital technologies and apply time series as a modeling and forecasting tool. The models and methods of [12] are also based on works [1-6].

There are a great number of computer programs for time series analysis. They are considered in detail both in the early fundamental works [1-6] and in the modern ones published in the heyday of software [8-12]. At first glance, the market for time series software does not seem to be underdeveloped, but a more detailed analysis makes it clear that affordable, high-quality software for analyzing time series is definitely lacking. Professional software requires large financial expenditures. Professional statistical and mathematical software programs (SAS, Statistica, MATLAB) were used in articles [8-10], with MATLAB being employed even in the earlier works [1-4]. Among free software packages, GRETL, TISEAN, and R (used in [11], [12]) are worth mentioning. Thus, it can be concluded that the distinguished leaders among the software used in this field are MATLAB and R.

MATLAB Econometrics Toolbox includes univariate Bayesian linear regression, univariate ARIMAX/GARCH composite models with several GARCH variants, multivariate VARX models and cointegration analysis for time series modeling and analysis. Considering the above mentioned, the authors chose MATLAB as the best program to fit the purpose of the study.

The next part of the article presents a step-by-step building of a time series model of Ukraine's population using MATLAB Econometrics Toolbox.

The statistics on the population of Ukraine are available at http://www.ereport.ru. The data are stored in the "Population" array. The measurements are taken at one-year intervals, and the corresponding years are stored in the "years" array. The authors used these statistics to develop a time series model. The MATLAB code for plotting the initial data is given below, and the result of running the program segment is shown in Fig. 1.

clc
clear all
% Population of Ukraine, million people, 1992-2016
Population = [51.7; 51.9; 51.7; 51.3; 50.9; 50.4; 50.0; 49.5; 49.2; ...
    48.8; 48.4; 48.1; 47.7; 47.4; 46.7; 46.3; 46.0; 45.7; ...
    45.4; 45.1; 44.9; 44.6; 44.3; 44.4; 44.2];
years = (1992:1:2016)';
figure('Units', 'normalized', 'OuterPosition', [0 0 1 1]);
plot(years, Population)
% set the title and labels after plot, which would otherwise reset them
title('Population of Ukraine, million people');
xlabel('years')
ylabel('the Population')
grid on


Fig. 1. The population of Ukraine, million people

Now, let us create autocorrelation function (ACF) and partial autocorrelation function (PACF) plots to identify patterns in the above data. The idea is to identify the presence of AR and MA components in the series. The following MATLAB code produces the ACF and PACF plots. The result of running the program segment is shown in Fig. 2.

figure; autocorr(Population)
figure; parcorr(Population)

According to Fig. 2, b, some partial autocorrelation coefficients exceed 1 in absolute value, so at first glance the MATLAB parcorr function appears to give wrong results. The phenomenon can be explained: it is not a bug but a consequence of the algorithm used to compute the PACF. The PACF is computed by fitting AR models of successive orders by OLS and retaining the last coefficient of each regression.

Fig. 2. Visualization of the ACF function (a) and PACF function (b)

This is an approximation of the solution of the Yule-Walker equations, which should be adequate for reasonable sample sizes. The data here are short (only 25 observations) and do not satisfy the assumptions on which the Yule-Walker equations are based. The authors recommend using the following R code in such situations:

# default value of par1: 10*log10(n)
if (par1 == 'Default') {
  par1 <- 10*log10(length(x))
} else {
  par1 <- as.numeric(par1)
}
par2 <- as.numeric(par2)
par3 <- as.numeric(par3)
par4 <- as.numeric(par4)
par5 <- as.numeric(par5)
if (par6 == 'White Noise') par6 <- 'white' else par6 <- 'ma'
par7 <- as.numeric(par7)
if (par8 != '') par8 <- as.numeric(par8)
x <- na.omit(x)
ox <- x   # keep the untransformed series
if (par8 == '') {
  if (par2 == 0) {
    x <- log(x)                   # Box-Cox transform with lambda = 0
  } else {
    x <- (x ^ par2 - 1) / par2    # Box-Cox transform with lambda = par2
  }
} else {
  x <- log(x, base = par8)        # logarithm with a user-supplied base
}
# differencing of order par3 at lag 1, then of order par4 at lag par5
if (par3 > 0) x <- diff(x, lag = 1, difference = par3)
if (par4 > 0) x <- diff(x, lag = par5, difference = par4)

The result of the running of the program segment is shown in Fig. 3.

Fig. 3. Visualization of the ACF function (a) and PACF function (b) after using the authors' program

According to Fig. 3, a, the autocorrelation plot shows that the time series is not random but rather has a high degree of autocorrelation between adjacent and near-adjacent observations. Also, as seen from Fig. 3, b, the partial autocorrelation plot demonstrates clear statistical significance for lags 1 and 2. The next few lags are on the borderline of statistical significance.

We considered the ACF and PACF values up to lag 12, i.e., one-third of the data and 95 % confidence level. From these plots we can determine the type and order of the adequate model required to fit the series. Moreover, as the ACF values damp out rapidly for increasing lags, we can assume that the data series is stationary. The ACF and PACF plots indicate that an ARIMA model is appropriate.
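The Yule-Walker route mentioned earlier can also be sketched in base MATLAB, without the toolbox. This is an illustrative sketch, not the authors' program: it computes the sample ACF with the biased autocovariance estimator and then obtains the PACF by the Durbin-Levinson recursion, which solves the Yule-Walker equations order by order. The variable names (maxLag, rho, pacf, bound) are our own.

```matlab
% Population of Ukraine, million people, 1992-2016 (values from the article)
y = [51.7; 51.9; 51.7; 51.3; 50.9; 50.4; 50.0; 49.5; 49.2; ...
     48.8; 48.4; 48.1; 47.7; 47.4; 46.7; 46.3; 46.0; 45.7; ...
     45.4; 45.1; 44.9; 44.6; 44.3; 44.4; 44.2];
maxLag = 12;
N  = numel(y);
yc = y - mean(y);                       % centered series

% Biased sample autocovariances for lags 0..maxLag
acov = zeros(maxLag + 1, 1);
for k = 0:maxLag
    acov(k + 1) = sum(yc(1:N-k) .* yc(1+k:N)) / N;
end
rho = acov / acov(1);                   % sample ACF; rho(1) is lag 0

% Durbin-Levinson recursion: the last AR coefficient of order k is PACF(k)
pacf    = zeros(maxLag, 1);
phiPrev = rho(2);                       % AR(1) coefficient
pacf(1) = rho(2);
for k = 2:maxLag
    num   = rho(k + 1) - phiPrev' * rho(k:-1:2);
    den   = 1 - phiPrev' * rho(2:k);
    phikk = num / den;
    phiPrev = [phiPrev - phikk * flipud(phiPrev); phikk];
    pacf(k) = phikk;
end

% Approximate 95 % significance bound under a white-noise null
bound = 1.96 / sqrt(N);
disp([(1:maxLag)', rho(2:end), pacf]);
```

With this estimator every partial autocorrelation lies in [-1, 1] by construction, unlike OLS-based estimates on short series. Lags whose values fall outside ±bound are significant at roughly the 95 % level.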

The built-in arima(p, D, q) function creates model objects for a stationary or unit-root nonstationary linear time series model [13]. This includes moving average (MA), autoregressive (AR), mixed autoregressive and moving average (ARMA), integrated (ARIMA), multiplicative seasonal, and linear time series models which include a regression component (ARIMAX).

The input arguments of the built-in arima function are shown in Table 1.

Table 1

Name of parameter    Description
p                    Positive integer indicating the degree of the nonseasonal autoregressive polynomial
D                    Nonnegative integer indicating the degree of nonseasonal integration in the linear time series
q                    Positive integer indicating the degree of the nonseasonal moving average polynomial

The mathematical formulation of the ARIMA(p, D, q) model using lag polynomials is given below:

$$\phi(L)(1-L)^{D} y_t = \theta(L)\varepsilon_t, \quad \text{i.e.} \quad \Bigl(1 - \sum_{i=1}^{p}\phi_i L^i\Bigr)(1-L)^{D} y_t = \Bigl(1 + \sum_{j=1}^{q}\theta_j L^j\Bigr)\varepsilon_t. \tag{1}$$

Since the series used in this article are already transformed to log-returns, no further differencing is needed, and thus we may set D = 0. In this case the ARIMA(p, 0, q) model turns into the ARMA(p, q) model. ARIMA also has a more complicated variant which makes it possible to capture seasonality. Though it is available, we will not use it here (demographic data are nonseasonal). Mathematically, the ARMA(p, q) model can be represented as

$$\phi(L) y_t = \theta(L)\varepsilon_t, \quad \text{i.e.} \quad \Bigl(1 - \sum_{i=1}^{p}\phi_i L^i\Bigr) y_t = \Bigl(1 + \sum_{j=1}^{q}\theta_j L^j\Bigr)\varepsilon_t. \tag{2}$$

Since the MATLAB built-in function makes it possible to estimate ARMA models, from a technical point of view the model will be estimated further as ARIMA(p, 0, q). But since there is no need to difference the dependent variable, these will be ARMA(p, q) specifications. The following MATLAB program code identifies the best-fit ARIMA(p, 0, q) model:

clc
clear all
% Population of Ukraine, million people, 1992-2016
r = [51.7; 51.9; 51.7; 51.3; 50.9; 50.4; 50.0; 49.5; ...
     49.2; 48.8; 48.4; 48.1; 47.7; 47.4; 46.7; 46.3; ...
     46.0; 45.7; 45.4; 45.1; 44.9; 44.6; 44.3; 44.4; 44.2];
max_p = 5;
max_q = 1;
crit = zeros(max_p, max_q);   % criterion value for every (p, q) pair
c = 0;                        % smallest criterion value found so far
N = size(r, 1);
for i = 1:max_p
    for j = 1:max_q
        Mdl = arima(i, 0, j);
        [EstMdl, EstParamCov, logL, info] = estimate(Mdl, r);
        [aic, bic] = aicbic(logL, i + j, N);
        % keep the model with the smallest BIC
        if ((i == 1) && (j == 1)) || (bic < c)
            c = bic; p = i; q = j;
        end
        crit(i, j) = bic;     % <- change this line for AIC
    end
end
fprintf('\nEstimates of BIC criterion: \n');
crit
fprintf('\n');
% re-estimate the selected model and compute its residuals
Mdl = arima(p, 0, q);
[EstMdl, EstParamCov, logL, info] = estimate(Mdl, r);
[res, v] = infer(EstMdl, r);

A step-by-step run of the program code is presented below:

ARIMA(1,0,1) Model:
Conditional Probability Distribution: Gaussian

Parameter    Value         Standard Error    t Statistic
Constant     -0.247391     0.837429          -0.295417
AR{1}        1             0.0166123         60.1965
MA{1}        0.466445      0.268057          1.7401
Variance     0.034112      0.00917661        3.71727

ARIMA(2,0,1) Model:
Conditional Probability Distribution: Gaussian

Parameter    Value         Standard Error    t Statistic
Constant     0.656432      0.186673          3.51648
AR{1}        1.77445       0.0522129         33.9849
AR{2}        -0.789709     0.0537842         -14.6829
MA{1}        -1            0.200678          -4.98312
Variance     0.0130313     0.00593187        2.19682

ARIMA(3,0,1) Model:
Conditional Probability Distribution: Gaussian

Parameter    Value         Standard Error    t Statistic
Constant     0.531783      0.389661          1.36473
AR{1}        1.96595       0.29192           6.73454
AR{2}        -1.14885      0.5593            -2.05409
AR{3}        0.170423      0.276867          0.61554
MA{1}        -0.942271     0.35053           -2.68813
Variance     0.0158196     0.00859024        1.84157

ARIMA(4,0,1) Model:
Conditional Probability Distribution: Gaussian

Parameter    Value         Standard Error    t Statistic
Constant     2.19704       1.31898           1.66571
AR{1}        0.513897      0.233049          2.2051
AR{2}        0.963821      0.31734           3.03719
AR{3}        0.0220676     0.215029          0.102626
AR{4}        -0.548489     0.20729           -2.64599
MA{1}        0.69526       0.316827          2.19445
Variance     0.0162225     0.00491912        3.29785

ARIMA(5,0,1) Model:
Conditional Probability Distribution: Gaussian

Parameter    Value         Standard Error    t Statistic
Constant     2.31268       1.36453           1.69485
AR{1}        0.471412      0.401658          1.17366
AR{2}        0.970034      0.322469          3.00815
AR{3}        0.0838277     0.40442           0.207279
AR{4}        -0.52745      0.314787          -1.67558
AR{5}        -0.0489605    0.361783          -0.135331
MA{1}        0.711955      0.4246            1.67677
Variance     0.0161973     0.004998          3.24076

Estimates of BIC criterion:
crit =
   -7.0680
  -28.8448
  -24.1408
  -15.9924
  -12.8126

ARIMA(2,0,1) Model:
Conditional Probability Distribution: Gaussian

Parameter    Value         Standard Error    t Statistic
Constant     0.656432      0.186673          3.51648
AR{1}        1.77445       0.0522129         33.9849
AR{2}        -0.789709     0.0537842         -14.6829
MA{1}        -1            0.200678          -4.98312
Variance     0.0130313     0.00593187        2.19682
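The selection step can be reproduced by hand from the reported values. The sketch below, in base MATLAB, takes the BIC estimates printed above and picks the smallest entry. The helpers aicValue and bicValue are hypothetical names for the standard definitions AIC = -2 logL + 2k and BIC = -2 logL + k ln N; they are not part of the authors' program.

```matlab
% BIC values reported in the article for ARIMA(p,0,1), p = 1..5
crit = [-7.0680; -28.8448; -24.1408; -15.9924; -12.8126];

% The preferred model is the one with the smallest criterion value
[bestBic, bestP] = min(crit);
fprintf('Smallest BIC = %.4f at p = %d\n', bestBic, bestP);

% Standard definitions of the two criteria (hypothetical helper names):
% logL - maximized log-likelihood, k - number of estimated parameters,
% N - number of observations
aicValue = @(logL, k)    -2*logL + 2*k;
bicValue = @(logL, k, N) -2*logL + k*log(N);

% Per-parameter penalty of BIC relative to AIC for N = 25 observations
penaltyRatio = log(25) / 2;
```

Because log(25) exceeds 2, BIC penalizes each additional parameter more heavily than AIC for a series of this length, so it tends to favor smaller models.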

The choice of the best ARIMA model is based on the Akaike or Bayesian information criterion. The best-fit model selected is ARIMA(2,0,1). It means that the degree of the nonseasonal autoregressive polynomial is two, the degree of nonseasonal integration in the linear time series is zero, and the degree of the nonseasonal moving average polynomial is one. Mathematically, the resulting model can be presented as follows:

$$\phi(L) y_t = \theta(L)\varepsilon_t, \quad \text{i.e.} \quad (1 - \phi_1 L - \phi_2 L^2)\, y_t = (1 + \theta_1 L)\,\varepsilon_t. \tag{3}$$

We are going to apply the ARIMA model with well-chosen parameters to time series forecasting. This process includes two stages: estimating and forecasting. The following MATLAB code estimates the ARIMA model parameters using initial values and forecasts the population of Ukraine for the next 3 years (2017, 2018, and 2019) applying the above-mentioned model. Since the forecast includes the year of 2016, it allows us to compare the past data with the reproduced forecast results.

PopulationModel = arima(2,0,1)
PopulationFit = estimate(PopulationModel,Population)
% 4-step forecast with the observed series as presample data
[Y,YMSE] = forecast(PopulationFit,4,'Y0',Population)

The result of the program fragment is shown below.

PopulationModel =
    ARIMA(2,0,1) Model:
    Distribution: Name = 'Gaussian'
               P: 2
               D: 0
               Q: 1
        Constant: NaN
              AR: {NaN NaN} at Lags [1 2]
             SAR: {}
              MA: {NaN} at Lags [1]
             SMA: {}
        Variance: NaN

ARIMA(2,0,1) Model:
Conditional Probability Distribution: Gaussian

Parameter    Value         Standard Error    t Statistic
Constant     0.656432      0.186673          3.51648
AR{1}        1.77445       0.0522129         33.9849
AR{2}        -0.789709     0.0537842         -14.6829
MA{1}        -1            0.200678          -4.98312
Variance     0.0130313     0.00593187        2.19682

PopulationFit =
    ARIMA(2,0,1) Model:
    Distribution: Name = 'Gaussian'
               P: 2
               D: 0
               Q: 1
        Constant: 0.656432
              AR: {1.77445 -0.789709} at Lags [1 2]
             SAR: {}
              MA: {-1} at Lags [1]
             SMA: {}
        Variance: 0.0130313

Y =
   44.2045
   44.2106
   44.0516
   43.9528

YMSE =
    0.0403
    0.2139
    0.8230
    2.5207

The graph of the generated forecast is presented in Fig. 4.

The line plot shows the observed values compared to the rolling forecast estimates. In general, our forecasts align with the true value (the year of 2016) very well, showing a downward trend which started in the year of 2016 and finished in 2019.

Particular attention should be paid to quality of the forecast. The automatically calculated mean-square error values (YMSE array) are close to zero, which indicates a good fit of the selected model.
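One way to read the YMSE values is to convert them into approximate interval forecasts. The following base-MATLAB sketch assumes Gaussian forecast errors (the 1.96 multiplier rests on that assumption) and uses the Y and YMSE values printed above; the variable names se, lower, and upper are our own.

```matlab
% Point forecasts and forecast mean-square errors reported in the article
Y    = [44.2045; 44.2106; 44.0516; 43.9528];
YMSE = [0.0403; 0.2139; 0.8230; 2.5207];

% Approximate 95 % intervals: point forecast +/- 1.96 standard errors
se    = sqrt(YMSE);
lower = Y - 1.96 * se;
upper = Y + 1.96 * se;

% One row per forecast step: step, lower bound, forecast, upper bound
disp([(1:numel(Y))', lower, Y, upper]);
```

Because YMSE grows with the horizon, the intervals widen at each step, so the later forecasts are considerably less certain than the first one.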

Unfortunately, the forecast trend shows that there may be a shortage of skilled labor in Ukraine within 10 years.

Fig. 4. The forecast of time series data using the ARIMA model (horizontal axis: Years; vertical axis: the population; series: measured, forecasted)

CONCLUSIONS

In the course of the research, different methods of and approaches to time series analysis were considered. Unfortunately, a universal approach to problems in this field has not been found yet. The methods developed in the 1960s (and some at the beginning of the 19th century) are still popular, along with the MATLAB Econometrics Toolbox considered in the article. This is partly because forecasting, like any other task arising in work with data, is in many ways a creative task and certainly a research one. Despite the great number of formal quality metrics and methods for estimating parameters, each time series often requires selecting and trying something different; otherwise one can, for example, obtain correlation coefficients greater than 1 or choose the parameters of the ARIMA model thoughtlessly.

Finding a balance between quality and labor costs is also important. The MATLAB Econometrics Toolbox demonstrates outstanding results with proper tuning but may require more than an hour of programming and additional settings, while a simple linear regression can be built in 10 minutes with more or less comparable results.
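The linear-regression baseline mentioned above could look like the following base-MATLAB sketch. It is an illustration, not the authors' code: it fits a straight line to the article's data with polyfit, counting years from 1992 (variable t, our choice) to keep the fit well conditioned, and extrapolates the line to 2017-2019.

```matlab
% Population of Ukraine, million people, 1992-2016 (values from the article)
Population = [51.7; 51.9; 51.7; 51.3; 50.9; 50.4; 50.0; 49.5; 49.2; ...
              48.8; 48.4; 48.1; 47.7; 47.4; 46.7; 46.3; 46.0; 45.7; ...
              45.4; 45.1; 44.9; 44.6; 44.3; 44.4; 44.2];
years = (1992:2016)';
t = years - 1992;                      % years since 1992

% Least-squares line: Population ~ coef(1)*t + coef(2)
coef = polyfit(t, Population, 1);

% Extrapolate the fitted line to 2017-2019
futureYears = (2017:2019)';
linForecast = polyval(coef, futureYears - 1992);

% In-sample root-mean-square error of the linear fit
rmse = sqrt(mean((Population - polyval(coef, t)).^2));
fprintf('Slope: %.3f million per year, RMSE: %.3f\n', coef(1), rmse);
disp([futureYears, linForecast]);
```

Comparing linForecast and rmse with the ARIMA forecast and its YMSE gives a quick check of whether the extra modeling effort pays off for a given series.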


The time series of the population of Ukraine was modeled as a one-dimensional nonstationary time series using the ARIMA model. Thus, the authors recommend applying the approaches and methods used in the article, as well as the developed algorithms, only to time series of this kind.

LITERATURE

1. Cohen J., Cohen P., West S. G., Aiken L. S. Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Mahwah, NJ : Lawrence Erlbaum, 2003.

2. Broughton S. A., Bryan K. Discrete Fourier Analysis and Wavelets: Applications to Signal and Image Processing. New York : Wiley, 2008. 72 p.

3. Murphy K. Machine Learning: A Probabilistic Perspective. MIT, 2012. 16 p.

4. Everitt B. The Cambridge Dictionary of Statistics. Cambridge, UK; New York : Cambridge University Press, 1998.

5. Muhammad Imdad Ullah. Basic Statistics and Data Analysis. Retrieved 2 January 2014. URL: http://itfeature.com/

6. Mormann F., Andrzejak R. G., Elger Ch. E., Lehnertz K. Seizure prediction: the long and winding road. Brain. 2007. No. 130 (2). P. 314-333. DOI: 10.1093/brain/awl241. PMID: 17008335

7. Oleksenko R. I., Afanaseva L. V. Economic and legal mechanism for the implementation of a new model of modern cultural policy of Ukraine (in Russian) // Current problems of economics, management and marketing in modern conditions : proceedings of the International scientific and practical correspondence conference. Melitopol, 2019. P. 421-425.

8. Wessel T., Turner L. M., Nordvik V. Population dynamics and ethnic geographies in Oslo: the impact of migration and natural demographic change on ethnic composition and segregation. Journal of Housing and the Built Environment. 2018. Vol. 33. No. 4. P. 789-805.

9. Koons D. N. et al. A life history perspective on the demographic drivers of structured population dynamics in changing environments. Ecology letters. 2016. Vol. 19. No. 9. P. 1023-1031.

10. Koons D. N., Arnold T. W., Schaub M. Understanding the demographic drivers of realized population growth rates. Ecological Applications. 2017. Vol. 27. No. 7. P. 2102-2115.

11. Yang Y., Land K. C. Age-period-cohort analysis: New models, methods, and empirical applications. Chapman and Hall/CRC, 2016.

12. Culotta A., Kumar N. R., Cutler J. Predicting the Demographics of Twitter Users from Website Traffic Data // AAAI. 2015. P. 72-78.

13. Shapour Mohammadi, Hossein Abbasi-Nejad. A Matlab Code for Univariate Time Series Forecasting // Computer Programs 0505001, University Library of Munich, Germany, 2005. URL: https://ideas.repec.org/c/wpa/wuwppr/0505001.html
