GROUP METHOD OF DATA HANDLING IN STATISTICAL FORECASTING
© Skakalina E.V.*
Poltava National Technical Yuri Kondratyuk University, Ukraine, Poltava
This work presents a methodology for short-term forecasting of the main economic indicators of a business entity's operating activity (cost of sales of goods and services, gross profit, net profit, and selling, general and administrative expenses) using modifications of the classical group method of data handling (GMDH) algorithm, and compares the effectiveness of the combinatorial and neural network GMDH algorithms.
Keywords: group method of data handling, forecasting, neural network algorithm, information technology, economic indicators.
Methods of statistical forecasting have now been developed that predict almost any time series with high accuracy. However, they rely on mathematical tools that can be applied only when sufficient statistical data are available.
The group method of data handling (GMDH) is a family of inductive algorithms for the mathematical modeling of multiparametric data. The method is based on the recursive selection of models, from which more complex models are then constructed [1]. The accuracy of the model increases at each successive step of the recursion as the model becomes more complex. Inductive GMDH algorithms make it possible to find interdependencies in the data automatically, to select the optimal structure of a model or network, and to improve the accuracy of existing algorithms.
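For orientation, the "more complex models" are usually written in the GMDH literature as truncations of the Kolmogorov-Gabor polynomial, built up from two-input quadratic partial descriptions. The notation below ($x_i$ for inputs, $a$ for coefficients, $m$ for the number of inputs) is the standard one and is an assumption here; this paper does not state the model class it uses.

```latex
% Kolmogorov-Gabor polynomial: the general model that GMDH approximates step by step
y = a_0 + \sum_{i=1}^{m} a_i x_i
        + \sum_{i=1}^{m} \sum_{j=i}^{m} a_{ij} x_i x_j
        + \sum_{i=1}^{m} \sum_{j=i}^{m} \sum_{k=j}^{m} a_{ijk} x_i x_j x_k + \dots

% Typical partial description for one pair of inputs (x_i, x_j) at a single selection step
\hat{y} = a_0 + a_1 x_i + a_2 x_j + a_3 x_i x_j + a_4 x_i^2 + a_5 x_j^2
```

Each recursion step combines the best partial descriptions of the previous step, so the polynomial degree of the overall model grows with the number of steps.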
This self-organizing approach to modeling differs fundamentally from the commonly used deductive methods. It rests on inductive principles: the solution is found by searching for the optimum of an external criterion. By sorting through different candidate solutions, inductive modeling methods try to minimize the influence of the author's assumptions on the modeling results. The computer itself finds the structure of the model and the laws governing the object. The approach can be used in building artificial intelligence systems that act as an advisor in dispute resolution and decision making.
GMDH comprises several algorithms for solving different problems: parametric, clustering, analogues complexing, rebinarization and probabilistic algorithms. This self-organizing approach is based on sorting through models of gradually increasing complexity and selecting the best solution by the minimum of an external criterion. Not only polynomials but also nonlinear and probabilistic functions or clusterings are used as basic models.
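The sorting-and-selection idea can be illustrated with a minimal sketch of a combinatorial GMDH search. It assumes linear candidate models, a fixed split of the data into a training and a checking sample, and the mean squared error on the checking sample as the external criterion. The function names and the synthetic data are hypothetical and are not taken from the study.

```python
from itertools import combinations

def fit_least_squares(rows, targets):
    """Solve the normal equations (X^T X) w = X^T y by Gauss-Jordan elimination."""
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
           + [sum(r[i] * t for r, t in zip(rows, targets))]
           for i in range(n)]
    for col in range(n):
        # Partial pivoting for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def predict(weights, x, subset):
    """Bias term plus the weighted inputs selected by `subset`."""
    return weights[0] + sum(w * x[i] for w, i in zip(weights[1:], subset))

def combinatorial_gmdh(X_train, y_train, X_check, y_check):
    """Fit every non-empty subset of inputs on the training sample and keep
    the model with the lowest error on the checking sample."""
    best = None
    n_inputs = len(X_train[0])
    for size in range(1, n_inputs + 1):  # models of gradually increasing complexity
        for subset in combinations(range(n_inputs), size):
            rows = [[1.0] + [x[i] for i in subset] for x in X_train]
            weights = fit_least_squares(rows, y_train)
            # External criterion: mean squared error on data not used for fitting
            error = sum((predict(weights, x, subset) - y) ** 2
                        for x, y in zip(X_check, y_check)) / len(y_check)
            if best is None or error < best[0]:
                best = (error, subset, weights)
    return best

# Synthetic illustration: the output depends only on the first input.
X_train = [[1, 5, 2], [2, 3, 7], [3, 8, 1], [4, 1, 6], [5, 9, 4], [6, 2, 8]]
y_train = [2 + 3 * x[0] for x in X_train]
X_check = [[7, 4, 3], [8, 6, 5], [9, 7, 9]]
y_check = [2 + 3 * x[0] for x in X_check]

error, subset, weights = combinatorial_gmdh(X_train, y_train, X_check, y_check)
print("selected inputs:", subset, "checking error:", error)
```

Because the criterion is computed on data that took no part in estimating the coefficients, candidates that omit the relevant input are rejected here. With noisy data the same criterion also tends to penalize models that are too complex.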
GMDH offers the following capabilities for solving practical problems [2]:
* Associate Professor, Department of Computer and Information Technologies and Systems, Candidate of Technical Sciences.
1. The optimal complexity of the model structure is found, adequate to the level of noise in the data sample. (For real-world problems with noisy or short data, simplified prediction models are more accurate.)
2. The number of layers, the number of neurons in the hidden layers, the model structure and other optimal parameters of the neural network are determined automatically.
3. The most accurate or unbiased model is guaranteed to be found: the method does not miss the best solution while sorting through all variants (in a given class of functions).
4. Any nonlinear functions or features that may influence the output variable can be used as input arguments.
5. The method automatically finds interpretable relationships in the data and selects effective input variables.
6. The GMDH search algorithms are simple to program.
7. GMDH networks can be used to improve the accuracy of other modeling algorithms.
8. The method uses information directly from the data sample and minimizes the influence of the author's a priori assumptions on the modeling results.
9. The approach makes it possible to find an unbiased physical model of the object (a law or a clustering) that is the same for all future samples.
GMDH-based neural network modeling has been implemented in several software packages.
GEvom is a program that is free to use for academic purposes (Fig. 1). It is compatible only with Windows.
Figure 1. GEvom program screen
GEvom features:
- Analysis of all experimental data.
- Creation of models and their application to forecasting.
- Four different methods for designing GMDH-type neural networks.
- Derivation of polynomial mathematical functions.
- Use of genetic algorithms to design the optimal network structure.
- Application of the SVD method (singular value decomposition) to eliminate singularities in the data.
GMDH Shell is a commercial software tool for data mining and forecasting based on GMDH.
With GMDH Shell you can explore the data, build a regression model, and apply a previously obtained model for prediction (Fig. 2).
Figure 2. GMDH Shell program screen
The GMDH model is constructed in two stages:
- Applying the group method of data handling to the available data to predict the output parameter.
- Predicting the input parameters using the autoregressive integrated moving average (ARIMA) method [3].
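The two stages can be sketched as follows. This is only an illustration under stated assumptions: a first-order autoregression fitted by least squares stands in for a full ARIMA model, `model` is a hypothetical placeholder for a model already obtained by GMDH, and the input series are made-up numbers rather than data from the study.

```python
def fit_ar1(series):
    """Estimate x[t] = c + phi * x[t-1] by ordinary least squares."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    cov = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    var = sum((p - mean_prev) ** 2 for p in prev)
    phi = cov / var
    return mean_curr - phi * mean_prev, phi

def forecast_ar1(series, steps):
    """Extrapolate the series `steps` periods ahead with the fitted AR(1)."""
    c, phi = fit_ar1(series)
    forecasts, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        forecasts.append(last)
    return forecasts

def two_stage_forecast(input_series, model, steps):
    """Forecast every input series, then feed the forecasts to the output model."""
    future_inputs = [forecast_ar1(s, steps) for s in input_series]
    return [model(*values) for values in zip(*future_inputs)]

# Illustrative inputs only: one geometric and one linear series.
inputs = [
    [100.0, 110.0, 121.0, 133.1],
    [10.0, 12.0, 14.0, 16.0],
]

def model(a, b):
    """Hypothetical stand-in for a model previously selected by GMDH."""
    return 5.0 + 0.5 * a + 2.0 * b

output_forecast = two_stage_forecast(inputs, model, steps=3)
print(output_forecast)
```

The horizon of three steps mirrors the three-year horizon used in the study. In practice the AR(1) stand-in would be replaced by an ARIMA model whose orders are chosen for each input series.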
All modifications of GMDH demonstrate high prediction accuracy for autoregressive and distributed lag models. The best accuracy is achieved not by crisp GMDH but by GMDH with fuzzy inputs. A further advantage of the fuzzy variants over crisp GMDH is that they do not use the least squares method and are not sensitive to poor conditioning of the matrix of input factors or to autocorrelation of the random disturbances, which is especially important for autoregressive models.
The selling, general and administrative expenses of a business entity can also be predicted using the group method of data handling. The forecasting horizon is 3 years, and the algorithm is the GMDH neural network. The following results were obtained:
Figure 3. Forecast of cost of goods sold
Figure 4. Forecast of gross profit
Figure 5. Forecast of net profit
So, we can conclude that by 31.12.2015:
1. Sales of goods and services will increase by 27.8 %;
2. The cost of production will increase by 13.9 %;
3. Gross profit will increase by 16.2 %;
4. Net profit will increase by 10.2 %.
However, the cost will grow by 16.5 % in the following year.
The studies have shown that the group method of data handling, which makes it possible to obtain a factor model from an insufficient amount of data and gives better forecast accuracy, is the most suitable method for autoregressive and distributed lag models.
A modification of the fuzzy GMDH algorithms is also proposed: the dual simplex method is used instead of the ordinary simplex method, which made it possible to reduce the computations in fuzzy GMDH and obtain improved results.
References:
1. Ivakhnenko A.G., Ivakhnenko G.A. The Review of Problems Solvable by Algorithms of the Group Method of Data Handling // Pattern Recognition and Image Analysis: Advances in Mathematical Theory and Applications. - 1995. - Vol. 5, No. 4. - P. 527-535.
2. Ivakhnenko A.G., Kovalishyn V.V., Tetko I.V., Luik A.I. Self-Organization of Neural Networks with Active Neurons for Bioactivity of Chemical Compounds Forecasting by Analogues Complexing GMDH Algorithm. Poster for the 9th International Conference on Neural Networks (ICANN 99). - Edinburgh, 1999. - 13 p. Available at: http://www.gmdh.net/articles/papers/nn_anal.pdf (accessed 05 February 2015).
3. Box G., Jenkins G. Time Series Analysis: Forecasting and Control. - San Francisco: Holden-Day, 1976.