
MOVING BLOCK BOOTSTRAP METHOD WITH BETTER ELEMENTS REPRESENTATION FOR UNIVARIATE TIME SERIES DATA

¹Kayode AYINDE, *²James DANIEL, ¹Akinola ADEPETUN and ¹Olusegun S. EWEMOOJE

¹Federal University of Technology, Akure, Nigeria
²National Bureau of Statistics, Abuja, Nigeria
*futathesis@gmail.com

Abstract

The bootstrap method was initially used to determine accuracy measures for sample estimates from independent and identically distributed (i.i.d.) data. To apply the bootstrap to time-dependent data, a blocking technique is introduced to preserve the serial correlation of the original time series. Resampling techniques for time-dependent data were first implemented using the Non-overlapping Block Bootstrap (NBB) method, but its dichotomous block arrangement restricts the number of blocks, so improvement became necessary. Although the Moving Block Bootstrap (MBB) method improves upon the NBB by providing many more blocks, it introduces an uneven representation of the time series elements, which ultimately affects its accuracy. In this paper, a new method called the Moving Block Bootstrap Method with better element Representation (MBBR) is developed to ensure that the time series elements within the blocks are better represented with a minimum number of elements. To compare MBB and MBBR, simulation studies were carried out on sets of time series data following each class of the Autoregressive Moving Average (ARMA) model with different parameters, sample sizes and standard deviations, using Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). Results show that, by improving the representation of the time series data in the blocking arrangement, the proposed method (MBBR) consistently outperforms the existing one (MBB) in accuracy and thus provides more efficient estimates of the dependent variable.

Keywords: Bootstrap method, Autoregressive Moving Average (ARMA), measures of accuracy, Moving Block Bootstrap (MBB), Moving Block Bootstrap Method with better element Representation (MBBR).

1. Introduction

Statistics, the bedrock of rational and scientific decision-making, often relies on random samples of observations to draw conclusions. One can therefore say that proper sampling is the backbone of Statistics, and bootstrapping is one of its dynamics. The bootstrap method uses the original sample, or some part of it, as an artificial population from which random samples are selected. [5] introduced the concept of the bootstrap, which has spread like bushfire through the statistical sciences over the past few decades. The resampling scheme provided by [5] is as follows. Suppose $X_1, X_2, \cdots, X_n$ is a random sample of observations from an independent and identically distributed (i.i.d.) population. A bootstrap sample $X_1^*, X_2^*, \cdots, X_n^*$ is selected at random with replacement from $X_1, X_2, \cdots, X_n$, each observation drawn with probability $P(X_i) = 1/n$, $i = 1, 2, \cdots, n$; that is, to generate a bootstrap sample, generate $n$ random integers from $\{1, 2, \cdots, n\}$ and take the corresponding observations. Suppose $\theta$ is the parameter of interest ($\theta$ could be a vector), and $\hat{\theta}$ is an estimator of $\theta$.
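As a quick illustration (ours, not from the paper), the following minimal R sketch applies this i.i.d. resampling scheme to approximate the standard error of the sample mean; the simulated sample x and the number of replicates B are illustrative choices:

set.seed(1)
x <- rnorm(30)                        # observed i.i.d. sample
B <- 1000                             # number of bootstrap replicates
theta_star <- replicate(B, mean(sample(x, replace = TRUE)))
sd(theta_star)                        # bootstrap estimate of the standard error of the mean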

The moving block bootstrap (MBB) was developed in [8] as an improvement on the Non-overlapping Block Bootstrap (NBB); it was designed to give room for more blocks than the NBB of [3]. However, while providing more blocks, some elements appear less frequently than others, especially the extreme values, and this uneven representation lowers its accuracy. The illustration of the MBB method in [8] shows the problem as follows: consider $X_t = x_1, x_2, \cdots, x_{10}$ as the original time series data. If a block length of $l = 3$ is used, so that $B_1 = (x_1, x_2, x_3)$, $B_2 = (x_2, x_3, x_4)$, $B_3 = (x_3, x_4, x_5)$, $B_4 = (x_4, x_5, x_6)$, $B_5 = (x_5, x_6, x_7)$, $B_6 = (x_6, x_7, x_8)$, $B_7 = (x_7, x_8, x_9)$ and $B_8 = (x_8, x_9, x_{10})$, then $x_1$ and $x_{10}$ each appear once and $x_2$ and $x_9$ each appear just twice, while every other observation appears three times in the MBB blocking scheme. This portrays the problem of uneven representation of the time series elements at the two edges of the original series, and such uneven representation can influence the accuracy measures of the MBB method. To solve this uneven representation without creating a new problem, an effort is made in this paper to design a synthetic blocking scheme within its block pot as an improvement on the MBB.
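The imbalance is easy to verify numerically. This short R sketch (ours) counts how often each element of the illustration above enters the b = n - l + 1 moving blocks:

n <- 10; l <- 3
starts <- 1:(n - l + 1)                              # block starting indices
blocks <- lapply(starts, function(s) s:(s + l - 1))  # indices covered by each block
tabulate(unlist(blocks), nbins = n)
# [1] 1 2 3 3 3 3 3 3 2 1   <- x1 and x10 appear once, x2 and x9 twice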

In [3], a data-based Markov chain is used to sample blocks in order to increase the chance that consecutive blocks match where they join. The circumstances under which this matched-block bootstrap reduces the bias of a bootstrap variance estimator are described; however, it only accelerates the rate of bias convergence when the data are generated by a Markov process, and it does not decrease the estimator's variance. The bootstrap method of [5] was extended by [8]: the series is split into $n - l + 1$ overlapping blocks of length $l$, with observations 1 to $l$ forming block 1, observations 2 to $l + 1$ forming block 2, and so on until all elements of the time series are exhausted. Then $n/l$ blocks are drawn at random with replacement from these $n - l + 1$ blocks, and aligning the drawn blocks in the order of selection yields the bootstrap observations. Although the bootstrapped observations are no longer stationary under this construction, this type of bootstrap approach still works with dependent data. [16] stressed that the issue can be solved by randomly varying the block length; this technique is called the stationary bootstrap. Under the condition that $p^{-1}$ is roughly equal to $l$, where $l$ is the block length and $p$ the parameter of the geometric distribution, the stationary bootstrap estimate of variance and the moving block estimate of variance are relatively close ([17]). The Markovian bootstrap and a stationary bootstrap approach that matches succeeding blocks based on standard deviation matching are other related variants of the moving block bootstrap ([6]).

The Circular Block Bootstrap (CBB), Moving Block Bootstrap (MBB) and Stationary Block Bootstrap (SBB) are asymptotically comparable in the sense of mean squared error (MSE), according to [10], who compared the asymptotic minimal values of the MSE of the block bootstrap methods. That study confirmed that, even with moderately sized samples, there are benefits to employing MBB and CBB rather than the stationary block bootstrap. [1] investigated how the optimal block bootstrap method might be particularly sensitive to the choice of block size, while [12] pointed out the stationarity difficulty of series resampled by the moving block bootstrap. The Tapered Block Bootstrap, a new variation of the block bootstrap covering approximately linear statistics, was proposed by [13] and represented an improvement over the original block bootstrap ([5]). Tapered block bootstraps are shown to have asymptotic validity and favorable bias properties for smooth functions of means and M-estimators. Instead of using the block bootstrap, tapering is typically applied to the random weights in the bootstrapped empirical distribution ([15]; [14]). A detailed discussion of optimally selecting window shapes and block sizes, along with some finite-sample simulations, was also presented by [14].

A new block bootstrap procedure for time series data, called the extended tapered block bootstrap, for estimating the variance and the sampling distribution of a large class of approximately linear statistics was proposed by [20]. The paper established the consistency of the distribution approximation under the smooth function model by obtaining asymptotic bias and variance expansions. The extended tapered block bootstrap has wider applicability than the tapered block bootstrap, while preserving its favorable bias and mean squared error properties. A small simulation study was carried out to compare the finite-sample performance of block-based bootstrapping methods. ARIMA methodology was introduced by [2]. Before then, statisticians analyzed time series data without taking into account how non-stationarity might affect their analyses. It was shown that non-stationary data could be made stationary by "differencing" the time series; in this way, one can separate a trend at a specific period from the growth or decline that would be expected anyway. They also stressed that the Partial Autocorrelation Function (PACF) $\phi_{kk}$ of an autoregressive process of order $p$ is non-zero for $k$ less than or equal to $p$ and zero for $k$ greater than $p$; in other words, the PACF of a $p$th order autoregressive process cuts off after lag $p$. The autoregressive model is typically expressed as:

$$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \phi_3 X_{t-3} + \cdots + \phi_p X_{t-p} + \varepsilon_t \tag{1}$$

The complexity of this model was relieved by [2], who established a cut-off where an autocorrelation coefficient deviates from its confidence range. The Box and Jenkins ARIMA approach is predicated on the notion that a time series with highly correlated successive values can be viewed as having been produced by a string of independent shocks.

Using dependent data, [9] developed a novel fast bootstrap theory in which smoothed moment indicators are resampled. The effectiveness of this method is demonstrated for parametric as well as semi-parametric estimation problems. The method's asymptotic refinements show that it is higher-order correct under reasonable assumptions on the time series, the estimating functions and the smoothing kernel. [9] demonstrated the use and benefits of the method for generalized method of moments estimation, generalized empirical likelihood estimation and M-estimation. In a Monte Carlo study of the autoregressive conditional duration model, the approach was put up against other current, frequently used first- and higher-order correct methods: the novel bootstrap generates higher-order accurate confidence intervals while being computationally lighter than its higher-order correct rivals. A real-data example on the dynamics of trading volumes of US stocks illustrates the empirical applicability of the methodology. [4] proposed a brand-new bootstrap method for time series data based on generative adversarial networks (GANs). They show that GANs can learn the dynamics of typical stationary time series processes and, trained on a single sample path, can be used to produce additional samples from the process. A vector drawn from a normal distribution with zero mean and identity variance-covariance matrix is used to create credible samples, and temporal convolutional neural networks work well as both generator and discriminator. Simulations comparing the proposed bootstrap with circular block bootstrapping when resampling an AR(1) process highlight the finite-sample features of GAN sampling; according to the study, resampling with the GAN can provide better empirical coverage than circular block bootstrapping. An empirical application to the Sharpe ratio was given at the end.

2. Methods

2.1. The Existing and Proposed Methods

The existing method considered in this research is the Moving Block Bootstrap (MBB), upon which the proposed method, called the Moving Block Bootstrap with better element Representation (MBBR), is built. Details of the existing and the proposed methods are discussed below.

The Existing Method

The MBB is an extension of the Non-overlapping Block Bootstrap (NBB) presented in [3]; it provides room for observations in the original series that the NBB cuts off from the last block when the sample size $n$ is not divisible by the block length $l$, and it invariably provides a greater number of blocks than the NBB method. Given a time series $\{X_t\} = \{x_1, x_2, \cdots, x_n\}$ which follows an AR, MA or ARMA process as presented in Equation 7, and the statistic of interest $\hat{\theta}_{MBB} = f(X_t) = f(x_1, x_2, \cdots, x_n)$, $B_i$ is defined as the block of $l$ consecutive observations starting from $x_i$, that is:

$$B_i = \{X_i, \ldots, X_{i+l-1}\}; \quad i = 1, 2, \ldots, b; \quad b = n - l + 1 \tag{2}$$

Let $b = n - l + 1$ and $n^* = k \times l$, where $k$ and $n^*$ are positive integers such that $n^*$ is the smallest multiple of $l \geq n$. A random sample of $k$ blocks, $\{B_1^*, B_2^*, \cdots, B_k^*\}$, is resampled independently with replacement from $\{B_1, \cdots, B_b\} \sim \mathrm{unif}(1, b)$, each with probability $b^{-1}$, where each $B_i^*$, $i = 1, 2, \cdots, k$, is a block of size $l$ ($n > l \geq 1$). If $l = 1$, the block bootstrap reduces to the independent and identically distributed bootstrap originally proposed by [5]. Figure 1 provides a schematic representation of Equation 2 as the blocking scheme of the moving block bootstrap.

Figure 1: Moving Block Bootstrap (schematic of the overlapping blocks $B_1, B_2, \ldots, B_b$ over $x_1, \ldots, x_n$)

The MBB is then formed by collapsing the elements of $B_i^*$, $i = 1, 2, \cdots, k$, into a single time series:

$$\{X_t^*\} = \{x_{(i-1)l+1}^*, \cdots, x_{n^*}^*\} = \{x_1^*, x_2^*, \cdots, x_{n^*}^*\} \tag{3}$$

where $n^* = k \times l$ is the length of the resampled series. The resampled statistic is then calculated as $\hat{\theta}_{MBB}^* = f_{n^*}(\{x_1^*, x_2^*, \cdots, x_{n^*}^*\})$. One keeps varying the block length to search for the minimum $\sqrt{MSE}$ (RMSE). For instance, a time series $X_t = x_1, x_2, \cdots, x_{10}$ is chunked into blocks of equal length $l = 3$ as shown in the array of Equation 4.

$$X_t = \begin{pmatrix} x_1 & x_2 & x_3 \\ x_2 & x_3 & x_4 \\ x_3 & x_4 & x_5 \\ x_4 & x_5 & x_6 \\ x_5 & x_6 & x_7 \\ x_6 & x_7 & x_8 \\ x_7 & x_8 & x_9 \\ x_8 & x_9 & x_{10} \end{pmatrix} \begin{matrix} B_1 \\ B_2 \\ B_3 \\ B_4 \\ B_5 \\ B_6 \\ B_7 \\ B_8 \end{matrix} \tag{4}$$
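A minimal R sketch of this MBB scheme (ours, not the authors' OBL implementation) forms the b = n - l + 1 overlapping blocks, draws k = n*/l of them with replacement, and collapses the draw into a single resampled series; in practice one would repeat this over candidate block lengths l and keep the length with minimum RMSE:

mbb_resample <- function(x, l) {
  n <- length(x)
  b <- n - l + 1                      # number of overlapping blocks (Equation 2)
  k <- ceiling(n / l)                 # n* = k * l, the smallest multiple of l >= n
  starts <- sample(1:b, k, replace = TRUE)
  unlist(lapply(starts, function(s) x[s:(s + l - 1)]))   # collapse, Equation 3
}
set.seed(123)
x_star <- mbb_resample(rnorm(10), l = 3)                 # resampled series of length n* = 12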

The Proposed Method

Consider a modification of an existing block resampling technique for dependent data, the MBB in [8]. The reason for the modification is to give every element of the parent time series a more equal presence in the blocking procedure. Take, for instance, a time series $X_t = x_1, x_2, \cdots, x_{10}$ chunked into blocks according to each method (MBB and MBBR). The Moving Block Bootstrap Method with better element Representation (MBBR) is formed to reduce the less-representative presence of the extreme members of the time series from $2l$ to just 2, as can be seen by comparing Equations 4 and 6. Reducing the number of under-represented elements helps to improve the model evaluation metrics (RMSE and MAE). The MBBR method is an extension of the MBB method of [8]. Given a time series $\{X_t\} = \{x_1, x_2, \cdots, x_n\}$ which follows an AR, MA or ARMA process as in Equation 7, and the statistic of interest $\hat{\theta}_{MBBR} = f(X_t) = f(x_1, x_2, \cdots, x_n)$, $B_i$ is defined as the block of $l$ consecutive observations starting from $x_i$, that is:

$$B_i = \{X_i, \ldots, X_{i+(l-1)}\}; \quad i = 1, 2, \ldots, b; \quad b = n - 2(l-1) \tag{5}$$

Let $b = n - 2(l - 1)$ and $n^* = k \times l$, where $k$ and $n^*$ are positive integers such that $n^*$ is the smallest multiple of $l \geq n$.

A random sample of $k$ blocks, $\{B_1^*, B_2^*, \cdots, B_k^*\}$, is resampled independently with replacement from $\{B_1, \cdots, B_b\} \sim \mathrm{unif}(1, b)$, each with probability $b^{-1}$, where each $B_i^*$, $i = 1, 2, \cdots, k$, is a block of size $l$ ($n > l \geq 1$). If $l = 1$, the block bootstrap reduces to the i.i.d. bootstrap originally proposed by [5]. Equation 5 defines the blocking scheme of the proposed method, and the array in Equation 6 demonstrates the chunking procedure for a typical time series of sample size $n = 10$ with block length $l = 3$. The proposed method can be executed with the R package in [7] using the R programming language [19], as demonstrated in Listing 1.

$$X_t = \begin{pmatrix} x_1 & x_2 & x_3 \\ x_2 & x_3 & x_4 \\ x_4 & x_5 & x_6 \\ x_5 & x_6 & x_7 \\ x_7 & x_8 & x_9 \\ x_8 & x_9 & x_{10} \end{pmatrix} \begin{matrix} B_1 \\ B_2 \\ B_3 \\ B_4 \\ B_5 \\ B_6 \end{matrix} \tag{6}$$
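The following R sketch (ours) reproduces the MBBR blocking of Equation 6 for the n = 10, l = 3 illustration by dropping every l-th block start; reading the paper's example this way gives b = n - 2(l - 1) = 6 blocks here, and the element counts confirm the improved balance:

n <- 10; l <- 3
starts <- setdiff(1:(n - l + 1), seq(l, n - l + 1, by = l))  # 1 2 4 5 7 8, as in Equation 6
blocks <- lapply(starts, function(s) s:(s + l - 1))
tabulate(unlist(blocks), nbins = n)
# [1] 1 2 2 2 2 2 2 2 2 1   <- only x1 and x10 remain under-represented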

Listing 1: OBL Package for Minimum RMSE Values

> install.packages("OBL")
> df <- OBL::blockboot(ts, 123, 4)
> df$RMSE[2]
# [1] 0.3398036
> df$RMSE[4]
# [1] 0.3303526

2.2. Data Generation for Simulation Study

2.2.1. Time Series Data Using ARIMA Model Methodology

Recall that:

$$X_t = x_1, x_2, \cdots, x_n \tag{7}$$

obtained from an AR(p), MA(q) or ARMA(p, q) process is given as follows:

$$X_t = \mu + \sum_{j=1}^{p} \phi_j X_{t-j} + \sum_{k=1}^{q} \theta_k \varepsilon_{t-k} + \varepsilon_t; \quad \varepsilon_t \sim N(0, \sigma_\varepsilon^2) \tag{8}$$

for

$$j = 1, 2, \ldots, p; \quad k = 1, 2, \ldots, q; \quad t = 1, 2, \cdots, n; \quad 0 \leq |\phi_j| \leq 1; \quad 0 \leq |\theta_k| \leq 1; \quad \phi_j \neq \theta_k; \quad \sum_{t=1}^{n} \varepsilon_t = 0; \quad \mathrm{Var}(\varepsilon_t) = \sigma_\varepsilon^2 > 0.$$
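Equation 8 can be instantiated directly with stats::arima.sim; the sketch below (ours, with illustrative parameters) draws one ARMA(2, 1) series with phi = (0.5, 0.2), theta = 0.4 and sigma_e = 1:

set.seed(1)
x <- stats::arima.sim(n = 25, model = list(ar = c(0.5, 0.2), ma = 0.4), sd = 1)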

The use of "R" package demonstrated in R code Listing 2 below shows commands written for "R" to grid-search for a "seed" that produces exactly or approximately an ARIMA(1, 0, 0) with f = 0.8 and sample size (n = 10) starting from seed 280000 to 290000 with an increase of one unit. From the grid-search result printed, seed 289805 is deemed appropriate for the example. Similar effort is put in for every time series data simulated for this study to ensure that the output of each simulated time series data depict its specified parameters.

Listing 2: Seed Searching Using "ARIMASS" Package in R


devtools::install_github("sta189332/searchar")
# @example
searchar::arsearch(a = 280000, z = 290000, n = 10, p = 1, d = 0, q = 0,
                   ar11 = 0.8, sd = 1, j1 = 4, arr1 = "0.80")
#         ar1   seed
# 7 0.8079816 282327
# 5 0.8062789 283176
# 6 0.8074425 284165
# 8 0.8081475 284461
# 4 0.8026127 287720
# 9 0.8084755 288160
# 3 0.8023778 289053
# 1 0.8000000 289805
# 2 0.8000368 289989

Having obtained the set of program seeds that simulate the specified parameters of the ARIMA time series described for each simulated data set, one can then use each seed to simulate the desired ARIMA series. In Listing 3 below, seed 289805 is used to simulate the ARIMA(1, 0, 0) series with sample size $n = 10$ and $\phi = 0.8$, and forecast::auto.arima is then used to check the empirical characteristics of the simulated series, to make sure that it does not merely come from an ARIMA(1, 0, 0) population with $\phi = 0.8$ but itself has the characteristics called for. The paragraphs below show how the time series data for the AR, MA and ARMA models are simulated with their respective parameters.

Listing 3: Illustration of ARIMA Time Series Data Simulation

set.seed(289805)
ar_1 <- stats::arima.sim(n = 10, model = list(ar = c(0.8), order = c(1, 0, 0)), sd = 1)
forecast::auto.arima(ar_1, ic = "aicc")
# Series: ar_1
# ARIMA(1,0,0) with zero mean
# Coefficients:
#          ar1
#       0.8000

Data Simulation for Autoregressive (AR) Model

Forty-eight (48) time series data sets were simulated to follow AR(1) models with parameter coefficients $\phi = (0.8, 0.9, 0.95)$, sample sizes $n = (10, 15, 20, 25)$ and standard deviations $sd = (1, 3, 5, 10)$. Stationarity conditions are confirmed to hold for the AR(1) series at the different levels of $\phi$ in Equation 9. The AR(1) models under consideration are spelled out in Equation 9 below with their different levels of autocorrelation ($\phi$), standard deviation ($\sigma_\varepsilon$) and sample size ($n$).

$$X_t = \phi X_{t-1} + \varepsilon_t; \quad \varepsilon_t \sim N(0, \sigma_\varepsilon^2) \tag{9}$$

for

$$\phi = 0.8, 0.9, 0.95; \quad \sigma_\varepsilon = 1, 3, 5, 10; \quad n = 10, 15, 20, 25; \quad |\phi| \leq 1.$$
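A hedged R sketch of this simulation design (ours): the 3 values of phi, 4 sample sizes and 4 standard deviations give the 3 x 4 x 4 = 48 AR(1) series; the single seed here is illustrative rather than one of the grid-searched seeds of Listing 2:

grid <- expand.grid(phi = c(0.8, 0.9, 0.95), n = c(10, 15, 20, 25), sd = c(1, 3, 5, 10))
set.seed(289805)
series <- lapply(seq_len(nrow(grid)), function(i)
  stats::arima.sim(n = grid$n[i], model = list(ar = grid$phi[i]), sd = grid$sd[i]))
length(series)   # 48 simulated AR(1) series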

Forty-eight (48) time series data sets were simulated to follow AR(2) models with parameter coefficients $(\phi_1, \phi_2) = ((0.4, 0.4), (0.45, 0.45), (0.35, 0.6))$, sample sizes $n = (10, 15, 20, 25)$ and standard deviations $sd = (1, 3, 5, 10)$. Stationarity conditions are confirmed to hold for the AR(2) series at the different levels of $(\phi_1, \phi_2) = (0.4, 0.4), (0.45, 0.45), (0.35, 0.6)$ in Equation 10.

$$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \varepsilon_t; \quad \varepsilon_t \sim N(0, \sigma_\varepsilon^2) \tag{10}$$

for

$$\phi_1 = 0.4, 0.45, 0.35; \quad \phi_2 = 0.4, 0.45, 0.6; \quad \sigma_\varepsilon = 1, 3, 5, 10; \quad n = 10, 15, 20, 25; \quad |\phi_1| \leq 1; \quad |\phi_2| \leq 1; \quad |\phi_1| + |\phi_2| \leq 1; \quad |\phi_1| - |\phi_2| \leq 1.$$

Data Simulation for Moving Average (MA) Model

Forty-eight (48) time series data sets were simulated to follow MA(1) models with parameter coefficients $\theta = (0.8, 0.9, 0.95)$, sample sizes $n = (10, 15, 20, 25)$ and standard deviations $sd = (1, 3, 5, 10)$. Stationarity conditions are confirmed to hold for the MA(1) series at the different levels of $\theta$ in Equation 11. The MA(1) models under consideration are spelled out in Equation 11 below with their different levels of autocorrelation and standard deviation.

$$X_t = \theta \varepsilon_{t-1} + \varepsilon_t; \quad \varepsilon_t \sim N(0, \sigma_\varepsilon^2) \tag{11}$$

for

$$\theta = 0.8, 0.9, 0.95; \quad \sigma_\varepsilon = 1, 3, 5, 10; \quad n = 10, 15, 20, 25; \quad |\theta| \leq 1.$$
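For example, a single MA(1) draw from Equation 11 with theta = 0.8, n = 10 and sd = 1 can be obtained as follows (our sketch; arima.sim parameterizes the MA part through its ma argument):

set.seed(1)
ma_1 <- stats::arima.sim(n = 10, model = list(ma = 0.8), sd = 1)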

Forty-eight (48) time series data sets were simulated to follow MA(2) models with parameter coefficients $(\theta_1, \theta_2) = ((0.4, 0.4), (0.45, 0.45), (0.35, 0.6))$, sample sizes $n = (10, 15, 20, 25)$ and standard deviations $sd = (1, 3, 5, 10)$. Stationarity conditions are confirmed to hold for the MA(2) series at the different levels of $(\theta_1, \theta_2) = (0.4, 0.4), (0.45, 0.45), (0.35, 0.6)$ in Equation 12.

$$X_t = \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \varepsilon_t; \quad \varepsilon_t \sim N(0, \sigma_\varepsilon^2) \tag{12}$$

for

$$\theta_1 = 0.4, 0.45, 0.35; \quad \theta_2 = 0.4, 0.45, 0.6; \quad \sigma_\varepsilon = 1, 3, 5, 10; \quad n = 10, 15, 20, 25; \quad |\theta_1| \leq 1; \quad |\theta_2| \leq 1; \quad |\theta_1| + |\theta_2| \leq 1; \quad |\theta_1| - |\theta_2| \leq 1.$$

Data Simulation for Autoregressive Moving Average (ARMA) Model

Forty-eight (48) time series data sets were simulated to follow ARMA(1, 1) models with parameter coefficients $(\phi, \theta) = ((0.5, 0.3), (0.5, 0.4), (0.35, 0.6))$, sample sizes $n = (10, 15, 20, 25)$ and standard deviations $sd = (1, 3, 5, 10)$. Stationarity conditions are confirmed to hold for the ARMA(1, 1) series at the different levels of $(\phi, \theta) = (0.5, 0.3), (0.5, 0.4), (0.35, 0.6)$ in Equation 13.

$$X_t = \phi X_{t-1} + \theta \varepsilon_{t-1} + \varepsilon_t; \quad \varepsilon_t \sim N(0, \sigma_\varepsilon^2) \tag{13}$$

for

$$\phi = 0.5, 0.5, 0.35; \quad \theta = 0.3, 0.4, 0.6; \quad \sigma_\varepsilon = 1, 3, 5, 10; \quad n = 10, 15, 20, 25; \quad |\phi| < 1; \quad |\theta| < 1; \quad \phi \neq \theta.$$
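Similarly, one ARMA(1, 1) draw from Equation 13 with phi = 0.5, theta = 0.3, n = 10 and sd = 1 (our sketch, combining the ar and ma arguments):

set.seed(1)
arma_11 <- stats::arima.sim(n = 10, model = list(ar = 0.5, ma = 0.3), sd = 1)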

2.3. Criteria for Model and Method Selection

To choose the better-performing of the two methods discussed above (MBB and MBBR), the Root Mean Squared Error (RMSE) is used to identify the best-performing model and hence the better method. Note that [18] deployed RMSE to choose the best-performing method for forecasting the carbon dioxide (CO2) emissions of Bahrain. The Mean Absolute Error (MAE) is another metric used in this paper for method evaluation: to evaluate the robustness of data-model comparisons, [11] concluded that RMSE is not enough and that MAE or other relevant measures are also needed. RMSE, the most common accuracy-fit-performance metric, is based on Euclidean distance and shows how far predictions fall from measured true values. To calculate RMSE, compute the residual (the difference between prediction and truth) for each data point, square it, average the squares over all data points, and take the square root of the mean; RMSE therefore relies on and requires true measurements at every predicted data point.

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(X_i - \hat{X}_i)^2} \tag{14}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}|X_i - \hat{X}_i| \tag{15}$$
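Equations 14 and 15 translate directly into R (our helper functions, not from the paper's code):

rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
mae  <- function(actual, predicted) mean(abs(actual - predicted))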

In Equations 14 and 15, $n$ is the number of data points, $X_i$ is the $i$-th measurement, and $\hat{X}_i$ is its corresponding prediction. MAE has the same units as RMSE and is never larger than RMSE, with equality only when all residuals have the same magnitude.

3. Results

Methods Comparison for AR(1) Models

Table 1 and Figure 2 show that the MBBR method has smaller RMSE and MAE values, indicating that the newly proposed method (MBBR) gives a better quality of prediction than the existing method (MBB). Table 1 and Figure 2 also show that the RMSE and MAE values increase as the standard deviation increases. Comparing the RMSE values in Table 1 with the MAE values in Figure 2 pairwise, the MAE values are smaller than the RMSE values. The proposed method (MBBR) uses a minimum number of elements yet represents them better and is more efficient than the existing method (MBB), as seen in Table 1 and Figure 2.

Methods Comparison for AR(2) Models

Table 2 and Figure 3 show that the MBBR method has smaller RMSE and MAE values, indicating that the newly proposed method (MBBR) gives a better quality of prediction than the existing method (MBB). Table 2 and Figure 3 also show that the RMSE and MAE values increase as the standard deviation increases. Comparing the RMSE values in Table 2 with the MAE values in Figure 3 pairwise, the MAE values are smaller than the RMSE values. The proposed method (MBBR) uses a minimum number of elements yet represents them better and is more efficient than the existing method (MBB), as seen in Table 2 and Figure 3.

Methods Comparison for MA(1) Models

Table 3 and Figure 4 show that the MBBR method has smaller RMSE and MAE values, indicating that the newly proposed method (MBBR) gives a better quality of prediction than the existing method (MBB). Table 3 and Figure 4 also show that the RMSE and MAE values increase as the standard deviation increases. Comparing the RMSE values in Table 3 with the MAE values in Figure 4 pairwise, the MAE values are smaller than the RMSE values. The proposed method (MBBR) uses a minimum number of elements yet represents them better and is more efficient than the existing method (MBB), as seen in Table 3 and Figure 4.

Methods Comparison for MA(2) Models

Table 4 and Figure 5 show that the MBBR method has smaller RMSE and MAE values, indicating that the newly proposed method (MBBR) gives a better quality of prediction than the existing method (MBB). Table 4 and Figure 5 also show that the RMSE and MAE values increase as the standard deviation increases. Comparing the RMSE values in Table 4 with the MAE values in Figure 5 pairwise, the MAE values are smaller than the RMSE values. The proposed method (MBBR) uses a minimum number of elements yet represents them better and is more efficient than the existing method (MBB), as seen in Table 4 and Figure 5.

Methods Comparison for ARMA(1, 1) Models

Table 5 and Figure 6 show that the MBBR method has smaller RMSE and MAE values, indicating that the newly proposed method (MBBR) gives a better quality of prediction than the existing method (MBB). Table 5 and Figure 6 also show that the RMSE and MAE values increase as the standard deviation increases. Comparing the RMSE values in Table 5 with the MAE values in Figure 6 pairwise, the MAE values are smaller than the RMSE values. The proposed method (MBBR) uses a minimum number of elements yet represents them better and is more efficient than the existing method (MBB), as seen in Table 5 and Figure 6.

Table 1: Minimum RMSE Criterion of MBB and MBBR Methods for AR(1)

                φ = 0.8                      φ = 0.9                      φ = 0.95
n           10     15     20     25      10     15     20     25      10     15     20     25
sd = 1
MBB       1.58   0.89   1.50   1.85    0.89   1.20   1.45   1.39    1.26   0.88   1.29   1.01
MBBR      1.58   0.80   1.43   1.85    0.83   1.01   1.40   1.34    1.12   0.17   1.21   0.87
sd = 3
MBB       4.75   2.67   4.65   5.55    2.67   3.60   4.72   4.18    3.79   3.21   4.01   3.03
MBBR      4.73   2.41   4.29   5.55    2.50   3.04   4.19   4.02    3.37   2.63   3.64   2.60
sd = 5
MBB       7.92   4.45   7.76   9.26    4.45   6.01   7.87   6.97    6.32   5.34   6.69   5.0
MBBR      7.89   4.02   7.15   9.26    4.17   5.07   6.98   6.70    5.62   4.39   6.07   4.34
sd = 10
MBB      15.83   8.90  15.52  18.52    8.90  12.01  15.74  13.93   12.65  10.68  13.38  10.11
MBBR     15.77   8.04  14.30  18.51    8.34  10.13  13.96  13.41   11.23   8.78  12.14   8.68

Table 2: Minimum RMSE Criterion of MBB and MBBR Methods for AR(2)

           φ1 = 0.4, φ2 = 0.4          φ1 = 0.45, φ2 = 0.45         φ1 = 0.35, φ2 = 0.6
n           10     15     20     25      10     15     20     25      10     15     20     25
sd = 1
MBB       0.81   1.37   1.11   1.28    1.15   1.08   1.47   1.26    1.22   1.09   1.39   1.29
MBBR      2.42   4.1    3.33   3.83    3.45   3.25   4.4    3.79    3.67   3.27   4.18   3.88
sd = 3
MBB       2.42   4.1    3.33   3.83    3.45   3.25   4.4    3.79    3.67   3.27   4.18   3.88
MBBR      2.28   4.05   3.28   3.32    3.3    3.16   4.3    3.54    3.66   3.15   3.70   3.39
sd = 5
MBB       4.03   6.83   5.55   6.38    5.74   5.42   7.34   6.32    6.12   5.45   6.97   6.46
MBBR      3.8    6.76   5.46   5.53    5.55   5.27   7.17   5.9     6.11   5.25   6.17   5.66
sd = 10
MBB      12.1   13.66  11.11  12.76   11.48  10.85  14.68  12.65   12.24  10.89  13.94  12.92
MBBR      8.06  13.51  10.93  11.05   11     10.55  14.34  11.81   12.21  10.5   12.33  11.31

Figure 2: MAE values of the MBB and MBBR methods for AR(1) models (panels: φ = 0.8, 0.9, 0.95; x-axis: sample size)

Figure 3: MAE values of the MBB and MBBR methods for AR(2) models (panels: φ1 = 0.4, φ2 = 0.4; φ1 = 0.45, φ2 = 0.45; φ1 = 0.35, φ2 = 0.6; x-axis: sample size)

Table 3: Minimum RMSE Criterion of MBB and MBBR Methods for MA(1)

                θ = 0.8                      θ = 0.9                      θ = 0.95
n           10     15     20     25      10     15     20     25      10     15     20     25
sd = 1
MBB       0.77   0.96   1.29   1.63    1.21   0.98   1.25   1.27    1.46   1.26   1.12   1.31
MBBR      0.72   0.94   1.29   1.57    1.13   0.92   1.24   1.16    1.42   1.23   1.09   1.31
sd = 3
MBB       2.30   2.88   3.88   4.89    3.63   2.93   3.76   3.8     4.38   3.77   3.37   3.93
MBBR      2.15   2.81   3.84   4.71    3.39   2.76   3.25   3.49    4.25   3.68   3.17   3.93
sd = 5
MBB       3.83   4.80   6.46   8.15    6.05   4.88   6.27   6.33    7.3    6.29   5.62   6.55
MBBR      3.59   4.68   6.40   7.85    5.64   4.59   5.42   5.81    7.17   6.13   5.28   6.54
sd = 10
MBB       7.66   9.61  12.92  16.31   12.1    9.76  12.53  12.66   14.6   12.58  11.23  13.1
MBBR      7.17   9.36  12.81  15.7    11.28   9.19  10.83  11.63   14.17  12.26  10.55  13.08

Table 4: Minimum RMSE Criterion of MBB and MBBR Methods for MA(2)

           θ1 = 0.4, θ2 = 0.4          θ1 = 0.45, θ2 = 0.45         θ1 = 0.35, θ2 = 0.6
n           10     15     20     25      10     15     20     25      10     15     20     25
sd = 1
MBB       0.86   1.35   1.22   1.27    1.74   1.29   1.49   1.3     1.23   1.28   1.17   1.23
MBBR      0.90   1.35   1.21   1.27    1.73   1.29   1.49   1.3     1.23   1.23   1.16   1.22
sd = 3
MBB       2.69   4.04   3.67   3.8     5.22   3.87   4.48   3.91    3.70   3.83   3.52   3.69
MBBR      2.59   4.04   3.62   3.8     5.19   3.87   4.48   3.9     3.70   3.69   3.49   3.65
sd = 5
MBB       4.49   6.74   6.12   6.34    8.69   6.45   7.47   6.52    6.17   6.38   5.87   6.15
MBBR      4.32   6.74   6.03   6.33    8.66   6.45   7.47   6.5     6.17   6.15   5.82   6.08
sd = 10
MBB      11.77  13.48  12.24  12.67   17.38  12.91  14.94  12.99   12.34  12.75  11.74  12.30
MBBR      8.97  13.48  12.07  12.65   17.31  12.91  14.94  13.01   12.33  12.31  11.65  12.16

Figure 4: MAE values of the MBB and MBBR methods for MA(1) models (panels: θ = 0.8, 0.9, 0.95; x-axis: sample size)

Figure 5: MAE values of the MBB and MBBR methods for MA(2) models (panels: θ1 = 0.4, θ2 = 0.4; θ1 = 0.45, θ2 = 0.45; θ1 = 0.35, θ2 = 0.6; x-axis: sample size)

Table 5: Minimum RMSE Criterion of MBB and MBBR Methods for ARMA(1,1)

           φ = 0.5, θ = 0.3            φ = 0.5, θ = 0.4             φ = 0.35, θ = 0.6
n           10     15     20     25      10     15     20     25      10     15     20     25
sd = 1
MBB       1.23   1.09   1.5    1.35    1.29   1.62   1.95   1.07    1.27   1.19   1.43   1.34
MBBR      1.17   1.01   1.49   1.21    1.2    1.62   1.93   1.04    1.11   1.12   1.41   1.31
sd = 3
MBB       3.68   3.27   4.5    4.05    3.87   4.87   5.84   3.2     3.81   3.57   4.29   4.03
MBBR      3.5    3.03   4.48   3.62    3.61   4.85   5.78   3.12    3.34   3.37   4.24   3.93
sd = 5
MBB       6.14   5.45   7.5    6.75    6.45   8.11   9.73   5.34    6.34   5.96   7.15   6.72
MBBR      5.83   5.06   7.47   6.03    6.01   8.08   9.63   5.2     5.57   5.62   7.07   6.56
sd = 10
MBB      12.28  10.91  15     13.49   12.9   16.22  19.45  10.67   12.69  11.91  14.3   13.44
MBBR     12.08  10.11  14.95  12.06   12.02  16.17  19.27  10.4    11.14  11.23  14.15  13.11

4. Discussion

Some interesting points emerge from this study. The study shows that the newly proposed method (MBBR) gives a better quality of prediction than the existing method (MBB). It also shows that neither the sample size nor the model parameter(s) have a significant impact on the accuracy measures, while the varying level of standard deviation has a direct and positive impact on the values of the accuracy measures. In general, MAE values are smaller than their corresponding RMSE values. The proposed method uses a minimum number of elements yet represents them better than the existing method. Although this research focuses mainly on comparing the Moving Block Bootstrap method with the newly proposed Moving Block Bootstrap with better element Representation, future research will focus on comparing the proposed method with other existing block bootstrap methods.

References

[1] Berkowitz, J. and Kilian, L. (2000). Recent developments in bootstrapping time series. Econometric Reviews, 19(1):1-48.

[2] Box, G. E., Jenkins, G. M., Reinsel, G. C., and Ljung, G. M. (2015). Time Series Analysis: Forecasting and Control. John Wiley & Sons.

[3] Carlstein, E. (1986). The use of subseries values for estimating the variance of a general statistic from a stationary sequence. The Annals of Statistics, pages 1171-1179.

[4] Dahl, C. M. and Sorensen, E. N. (2022). Time series (re)sampling using generative adversarial networks. Neural Networks.

[5] Efron, B. (1979). Bootstrap methods: Another look at the jackknife. The Annals of Statistics, 7(1):1-26.

[6] Horowitz, J. L. (2003). Bootstrap methods for Markov processes. Econometrica, 71(4):1049-1082.

[7] James, D. and Kayode, A. (2022). OBL: Optimum Block Length. R package version 0.2.1.

[8] Kunsch, H. R. (1989). The jackknife and the bootstrap for general stationary observations. The Annals of Statistics, pages 1217-1241.

[9] La Vecchia, D., Moor, A., and Scaillet, O. (2022). A higher-order correct fast moving-average bootstrap for dependent data. Journal of Econometrics.

[10] Lahiri, S. N. (1999). Theoretical comparisons of block bootstrap methods. The Annals of Statistics, pages 386-404.

[11] Liemohn, M. W., Shane, A. D., Azari, A. R., Petersen, A. K., Swiger, B. M., and Mukhopadhyay, A. (2021). RMSE is not enough: Guidelines to robust data-model comparisons for magnetospheric physics. Journal of Atmospheric and Solar-Terrestrial Physics, 218:105624.

[12] Liu, R. Y. and Singh, K. (1992). Moving blocks jackknife and bootstrap capture weak dependence. Exploring the limits of bootstrap, 225.

[13] Paparoditis, E. and Politis, D. (2002a). The tapered block bootstrap for general statistics from stationary sequences. The Econometrics Journal, 5(1):131-148.

[14] Paparoditis, E. and Politis, D. (2002b). The tapered block bootstrap for general statistics from stationary sequences. The Econometrics Journal, 5(1):131-148.

[15] Paparoditis, E. and Politis, D. N. (2001). Tapered block bootstrap. Biometrika, 88(4):1105-1119.

[16] Politis, D. and Romano, J. (1992). A circular block-resampling procedure for stationary data. In LePage, R. and Billard, L. (eds), Exploring the Limits of Bootstrap, pages 263-270. Wiley, New York.

[17] Politis, D. N. and Romano, J. P. (1994). Large sample confidence regions based on subsamples under minimal assumptions. The Annals of Statistics, pages 2031-2050.

[18] Qader, M. R., Khan, S., Kamal, M., Usman, M., and Haseeb, M. (2022). Forecasting carbon emissions due to electricity power generation in Bahrain. Environmental Science and Pollution Research, 29(12):17346-17357.

[19] R Core Team (2022). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.

[20] Shao, X. (2010). Extended tapered block bootstrap. Statistica Sinica, pages 807-821.
