CC BY

REGIONAL DEVELOPMENT

SOS KHACHIKYAN

PhD in Economics, Associate Professor, Dean of the Department of Informatics and Statistics of Armenian State University of Economics

https://orcid.org/0000-0002-9269-5588

ARMEN GHAZARYAN

PhD in Economics, Associate Professor, Acting Head of the Chair of Economic Informatics and Information Systems of Armenian State University of Economics

https://orcid.org/0000-0001-6083-5489

ANDRANIK SARGSYAN

PhD Student of the Chair of Economic Informatics and Information Systems of Armenian State University of Economics

https://orcid.org/0000-0001-8018-7941

EVALUATION OF INVESTMENT ACTIVITIES OF VAYOTS DZOR AND SYUNIK MARZES OF THE RA USING MACHINE LEARNING METHODS

At present, the main driving force for the normal operation and development of different branches and sectors of the economy of the Republic of Armenia and its marzes is the attraction of investments, and consequently the development of the attractiveness and competitiveness of the investment environment. Effective management of the investment process ensures sustainable economic growth and development. The investment potential of any country or region is influenced by a number of favorable or unfavorable factors, from which the most significant indicators of economic development of the observed marzes have been singled out. Based on them, the investment potential of Vayots Dzor and Syunik marzes of the RA was assessed using machine learning methods, comparing it with the same potential of other marzes of the RA.

Based on the selected indicators, a clustering of the RA marzes was carried out, and 4 main classes were separated. Based on the clustering results, appropriate labels were given to the RA marzes, after which classification methods were constructed to predict the selected labels.

At the end, the ROC (Receiver Operating Characteristic) curves calculated for each class were constructed to assess the quality of the classifier. The quantitative interpretation of ROC shows that the classifier has a very high quality.

Keywords: Investment potential, investment attractiveness, machine learning, clustering, decision tree

JEL: R58, C52, C53

DOI: 10.52174/1829-0280_2021_6_78

Introduction. For the development and growth of the economy of any state, it is necessary to create favorable conditions for investment programs. Such conditions help to attract additional resources and to use existing opportunities more efficiently. This is why creating a favorable investment climate is now one of the main goals of any country's economic policy.

The result of a policy of effective investment programs is the sustainable socio-economic development and economic growth of the country, due to which the living standard of the population increases. Therefore, it is very important for each country to increase the attractiveness of the investment environment, and to identify and analyze issues at different levels of policy. This study is dedicated to these questions.

The purpose of the given research is to evaluate the investment potential of Vayots Dzor and Syunik marzes of the Republic of Armenia based on the economic indicators and using machine learning methods in comparison with other marzes.

Literature review. The investment environment is quite dynamic and is constantly changing. In this regard, the investment environment as an economic category is not clearly interpreted by economists.

Thus, according to S. Tsakunov, the investment environment is the sum of social, economic, legal, political and cultural preconditions that predetermine the attractiveness and expediency of investments in a given economic system (country, region, economy, organization)1.

A. Folom'ev and V. Revazov gave almost the same description. They describe the investment environment as a system of social, economic, organizational, legal, political and cultural preconditions that determine the investment attractiveness and expediency of a given region2.

The main difference between these two definitions is that the first refers to the economic system in general, while in the second the authors focus on a specific region (in this case the Russian Federation, because the subject of their study was finding ways to improve the investment climate in Russia). However, it should be noted that the two definitions are the same in terms of content.

1 Tsakunov, S.V. (2009). Theory and practice of investment marketing and management [Electronic resource]. Access mode: http://invm2009.narod.ru (In Russian).

2 Folom'ev, A., Revazov, V. (2000). The investment climate of Russian regions and ways to improve it // Problems of economic transition. V. 43. № 3, pp. 41-55.

According to another view, the investment climate is the economic, political and financial conditions that determine the flow of domestic and foreign investment to a given country. At the same time, the favorable investment environment is characterized by policies, clear legislation, low taxes and prerogatives.

According to M. Melkumyan3, the investment environment is a set of economic, political, legal, social, domestic, psychological and other such kind of factors, which determines the degree of risk of investments, the possibility of their effective use and concentrates the interests of the investor in such business area and in such specific period of time, where the investor is required to strengthen its skills and efforts.

It should be noted that in the above definitions, special emphasis is placed on the environmental factors, which interact with each other to create an investment environment as an independent system.

Thus, for the complete interpretation of the economic content of the investment environment, it is necessary to look at it from the macroeconomic point of view.

A. Mozgev defines the investment climate as the ability of socio-ecological and economic system of the country (region) to accept investments, which include the opportunities of the country issues, and the conditions of the investor's activity4.

In their paper, the Russian authors M. L. Krichevsky and Yu. A. Martynova evaluated the investment activity of regions using machine learning methods, and showed the procedure for classifying the research objects and determining the class to which each belongs. Specifically, the authors present the results of applying machine learning methods suitable for assessing the investment activity of various regions of Russia. The problem was reduced to determining the class to which a particular region belongs5.

Research Methodology. Machine learning has been widely used in solving various economic problems in recent years. Methods of machine learning can be classified into two main groups: supervised and unsupervised. Supervised machine learning methods are used when labeled data are available, and unsupervised learning methods are used to identify relationships in unlabeled data and to group the data accordingly.

We used K-means and agglomerative clustering algorithms to perform clustering of regions.

3 Melkumyan, M. (2014). Organization of entrepreneurial activity, textbook, Yerevan, Zangak-97. (In Armenian).

4 Investment climate and its components [Electronic resource]: https://studopedia.ru/7_183452_investitsionniy-klimat-i-ego-sostavlyayushchie.html. Access mode: https://studopedia.ru/

5 Krichevsky, M.L., Martynova, Yu.A. (2019). Using machine learning methods to evaluate investment activity in various regions of Russia // Issues of innovative economy. Vol. 9. No. 4, pp. 1557-1572. (In Russian).

K-means clustering aims to divide data into K clusters so that the data points of the same cluster are similar and the data points of different clusters are further apart. The similarity of two points is determined by the distance between them. There are various methods for measuring the similarity of points. One of them is the Euclidean distance, which we used in the research:

$$d(a, b) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}$$

where a and b are arbitrary n-dimensional vectors in Euclidean space. The advantages of the K-means algorithm are its simple implementation, fast operation and interpretability. One of its disadvantages is that the number of clusters has to be selected in advance. In order to select the optimal number of clusters, we used the elbow method: we ran the clustering for a range of pre-selected K values, calculated in each case the within-cluster sum of squared deviations, plotted these sums against K, and determined the optimal number of clusters from the plot.
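The two quantities used above, the Euclidean distance and the within-cluster sum of squared deviations, can be sketched in a few lines. This is only an illustration on toy points; the function names `euclidean_distance` and `within_cluster_ss` are hypothetical and do not come from the article.

```python
import numpy as np


def euclidean_distance(a, b):
    """Euclidean distance between two n-dimensional vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))


def within_cluster_ss(points, labels):
    """Sum of squared deviations of each point from its cluster centroid."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for cluster in np.unique(labels):
        members = points[labels == cluster]
        centroid = members.mean(axis=0)
        total += float(np.sum((members - centroid) ** 2))
    return total


# Toy example: two well-separated pairs of points on a line.
toy_points = [[0, 0], [2, 0], [10, 0], [12, 0]]
toy_labels = [0, 0, 1, 1]

print(euclidean_distance([0, 0], [3, 4]))
print(within_cluster_ss(toy_points, toy_labels))
```

The elbow method repeats the second computation for each candidate K and looks for the point where adding further clusters no longer reduces the sum noticeably.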

The agglomerative clustering method was used to construct a dendrogram and visualize the resulting clusters. Then, the clusters obtained were treated as labels for each region, and different classification algorithms were trained to predict the assigned labels as well as possible from the net investment flow variable, which was not used during clustering.

Suppose we want to create a system that classifies a certain set of points into groups. The problem is that we cannot specify the criteria for that classification in advance. Compared with supervised methods, the difference is that the target variable Y is absent. Most algorithms aim to find a cluster or group structure in the data, which must then be interpreted by the researcher. An advantage is that this may lead to a less biased analysis, because unsupervised methods often rely on non-parametric approaches. The unsupervised learning algorithm must therefore attempt to structure the data set on its own, in the way that seems best. The resulting group labels can then be used as attribute or response variables in a supervised machine learning model. We present the two most commonly used unsupervised learning methods: K-means and hierarchical clustering.

Cluster analysis is one of the methods of multidimensional research of socioeconomic phenomena. It most clearly reflects the multidimensional nature of the process of classifying objects. The main goal of cluster analysis is to divide a set of objects characterized by a set of attributes into groups (clusters) that are, in a sense, homogeneous. This means that the problem of classifying data and identifying the appropriate structure in them is being solved. In other words, it involves separating compact groups of objects that are distant from each other, or dividing the set according to regions of accumulation or density. The need to develop and use cluster analysis methods is conditioned, first of all, by the fact that they make it possible to reveal the internal connections between the units of a set and to build scientifically grounded classifications. The construction of classifications is especially relevant for poorly researched phenomena, when it is necessary to determine the existence of connections within the set and to try to structure them. The usual form of representation of baseline data in cluster analysis problems is the "(n) object - (m) attribute" matrix. In this case, by object we mean the research subjects to be classified, and the attribute is the specific property or characteristic of the object.

In solving the classification problem, the LDA, KNN, LinearSVC and Decision Tree Classifier approaches were considered and the results were compared.

The accuracy metric was used to compare the constructed classifiers; it is defined as the ratio of the number of correctly classified instances to the total number of instances6.

In order to evaluate the quality of the resulting model we used the ROC curve (Receiver Operating Characteristic curve)7.

Analysis. Both objective and subjective conditions and preconditions are needed to attract investments. Objective preconditions include economic-geographical and other factors characterizing the state of the region, which may be of interest to investors. These are investment resources, the totality of which makes up the investment potential of the region. Subjective factors relate to the proper functioning of the authorities in terms of identifying that potential.

The investment potential of the region can be influenced by the favorable geographical location, transport accessibility, availability of sufficient and highly efficient natural resources (raw materials, fuel, energy, water, forest, etc.), the state of the environment, the level of development of productive and social infrastructure, the cost of labor, the availability of qualified personnel, the scientific, technical, design, educational base, the standard of living of the population, the capacity of the consumer market, the business environment, the level of business activity, the level of local taxes, state policy towards business and other factors8. Investors carefully study the investment potential of a given area or region before making an investment, and government agencies should also inform investors about it.

Let us now present to what extent each of these factors is favorable or unfavorable for investing in Syunik and Vayots Dzor marzes.

The marz of Syunik of the Republic of Armenia occupies an important position of strategic-geopolitical significance, has rich natural resources, great production potential, and is one of the largest administrative and economic marzes of the republic9. The average salary in the non-state sector of Syunik is about 2.4 times higher than the salary in the budgetary sector.

The average monthly nominal salary is the highest in the RA Syunik marz due to the mining industry. With these and a number of other indicators, Syunik occupies a leading position in Armenia, but these indicators are absolutely dependent on the mining industry.

Thus, Syunik's economy needs to be diversified, because dependence on any single branch of the economy causes risks and problems. Although there are no official regional analyses for the marz, it is obvious that the high indicators of the marz are provided by the areas where the mining industry is developed.

6 Chen, P., Ye, J., Chen, G., Zhao, J., & Heng, P. A. (2020). Robustness of accuracy metric and its inspirations in learning with noisy labels. arXiv preprint arXiv:2012.04193.

7 Bradley, A.P. (1997). The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern recognition, 30(7), 1145-1159.

8 Hovsepyan, V., Egiazaryan, M. (2014). National economy YSU textbook, 224 p. (In Armenian).

9 http://syunik.mtad.am/files/docs/29670.pdf

Vayots Dzor marz is located in the south-eastern part of the Republic of Armenia. Among the marzes of the Republic of Armenia, it occupies a middle position in terms of territory and is the smallest marz in terms of population. There are large reserves of copper, tuff, marble, limestone, clay, sand, basalt, granite, felsite, quartz sand and mineral water in the marz. The economy of Vayots Dzor is one of the weak links of the RA economy. Agriculture takes the leading place, and industry is mainly represented by the processing of agricultural products. The industrial complex is complemented by cheese production, mineral water production and winemaking.

There are also several small hydropower plants. The expansion of the infrastructure of the resort economy plays a big role in the development of the economy of the marz.

In addition to the mentioned factors, the general economic situation also has a significant impact on the investment environment of the marz. From the indicators characterizing the economy of the marz as important factors of investment attractiveness, we have singled out the following indicators.

• X1 - Gross industrial output per capita, AMD.
• X2 - Gross agricultural output per capita, AMD.
• X3 - The volume of services per capita, AMD.
• X4 - The volume of construction works per capita, AMD.
• X5 - Average monthly nominal salary, AMD.
• X6 - Unemployment level, percent.
• Y - Net flows of investment, million AMD.

Table 1

General indicators characterizing the economic situation of the RA marzes in 201910

| Marz | Y, million AMD | X1, AMD | X2, AMD | X3, AMD | X4, AMD | X5, AMD | X6, % |
|---|---|---|---|---|---|---|---|
| Yerevan | 153802.2 | 727207.406 | 9696.186 | 1534585 | 221575.6 | 201527 | 22.7 |
| Aragatsotn | 285.3 | 365377.049 | 638944.4 | 116902.8 | 106344.7 | 113538 | 8.8 |
| Ararat | -385.3 | 1156278.59 | 497954.4 | 134561.1 | 98777.32 | 168027 | 13.4 |
| Armavir | -4082 | 410869.812 | 679931.8 | 121962.9 | 140689.4 | 135081 | 11.6 |
| Gegharkunik | 0 | 306618.86 | 491666.7 | 70176.75 | 99537.72 | 114479 | 8.6 |
| Lori | 159.7 | 406822.761 | 319496.3 | 110596.5 | 66555.04 | 124293 | 19.9 |
| Kotayk | 405.2 | 875666.866 | 280597 | 482369.8 | 106546.9 | 137388 | 20.7 |
| Shirak | 2.6 | 298828.061 | 374004.7 | 139780.9 | 121129.8 | 110876 | 18 |
| Syunik | 7089.5 | 2527970.9 | 429247 | 185533.6 | 141770.1 | 266832 | 15 |
| Vayots Dzor | 0 | 570344.615 | 432820.5 | 225239 | 138785.6 | 126740 | 22.4 |
| Tavush | 100.7 | 257295.277 | 304722.8 | 167852.2 | 139314.2 | 118446 | 26.9 |
| Sum | | 706000.236 | 287865 | 672890.9 | 152171 | | |

10 https://www.armstat.am/am/?nid=82&id=2324

Based on the mentioned indicators and using the machine learning methods, let us evaluate the investment potential of Vayots Dzor and Syunik marzes of the Republic of Armenia, comparing it with the other regions of the Republic of Armenia.

As mentioned in the methodology section, an unsupervised machine learning model was used to solve the presented problem. The problem was solved using the K-means algorithm, which aims to divide the marzes (provinces) into groups. For data analysis, working files were created in the Jupyter notebook environment, and the Python libraries Pandas11, Scikit-learn12, Scipy13 and Matplotlib14 were imported there. These libraries are intended for data analysis, machine learning and visualization. The imported CSV data file looks like this:

| Marz | Netinvestment | Manufacturing | Agriculture | Service | Construction | Salary | Unemployment |
|---|---|---|---|---|---|---|---|
| Yerevan | 153802.2 | 727207.406 | 9696.186 | 1534585.00 | 221575.60 | 201527 | 22.7 |
| Aragatsotn | 285.3 | 365377.049 | 638944.400 | 116902.80 | 106344.70 | 113538 | 8.8 |
| Ararat | -385.3 | 1156278.590 | 497954.400 | 134561.10 | 98777.32 | 168027 | 13.4 |
| Armavir | -4082.0 | 410869.812 | 679931.800 | 121962.90 | 140689.40 | 135081 | 11.6 |
| Gegharkunik | 0.0 | 306618.860 | 491666.700 | 70176.75 | 99537.72 | 114479 | 8.6 |
| Lori | 159.7 | 406822.761 | 319496.300 | 110596.50 | 66555.04 | 124293 | 19.9 |
| Kotayk | 405.2 | 875666.866 | 280597.000 | 482369.80 | 106546.90 | 137388 | 20.7 |
| Shirak | 2.6 | 298828.061 | 374004.700 | 139780.90 | 121129.80 | 110876 | 18.0 |
| Syunik | 7089.5 | 2527970.900 | 429247.000 | 185533.60 | 141770.10 | 266832 | 15.0 |
| Vayotz Dzor | 0.0 | 570344.615 | 432820.500 | 225239.00 | 138785.60 | 126740 | 22.4 |
| Tavush | 100.7 | 257295.277 | 304722.800 | 167852.20 | 139314.20 | 118446 | 26.9 |

Figure 1. General indicators characterizing the economic situation of the RA marzes (provinces) in the Jupyter notebook environment.
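The data preparation described above can be sketched as follows. This is a minimal illustration rather than the authors' notebook: the values are transcribed from Figure 1 (with the Yerevan manufacturing value taken from Table 1), the variable names are hypothetical, and standardizing with `StandardScaler` is an assumption. That assumption appears consistent with the split thresholds shown in Figure 5, but the article does not state it.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

columns = ["NetInvestment", "Manufacturing", "Agriculture", "Service",
           "Construction", "Salary", "Unemployment"]

# Rows transcribed from Figure 1 (2019 indicators per marz).
rows = {
    "Yerevan":     [153802.2, 727207.406, 9696.186, 1534585.00, 221575.60, 201527, 22.7],
    "Aragatsotn":  [285.3, 365377.049, 638944.400, 116902.80, 106344.70, 113538, 8.8],
    "Ararat":      [-385.3, 1156278.590, 497954.400, 134561.10, 98777.32, 168027, 13.4],
    "Armavir":     [-4082.0, 410869.812, 679931.800, 121962.90, 140689.40, 135081, 11.6],
    "Gegharkunik": [0.0, 306618.860, 491666.700, 70176.75, 99537.72, 114479, 8.6],
    "Lori":        [159.7, 406822.761, 319496.300, 110596.50, 66555.04, 124293, 19.9],
    "Kotayk":      [405.2, 875666.866, 280597.000, 482369.80, 106546.90, 137388, 20.7],
    "Shirak":      [2.6, 298828.061, 374004.700, 139780.90, 121129.80, 110876, 18.0],
    "Syunik":      [7089.5, 2527970.900, 429247.000, 185533.60, 141770.10, 266832, 15.0],
    "Vayotz Dzor": [0.0, 570344.615, 432820.500, 225239.00, 138785.60, 126740, 22.4],
    "Tavush":      [100.7, 257295.277, 304722.800, 167852.20, 139314.20, 118446, 26.9],
}
df = pd.DataFrame.from_dict(rows, orient="index", columns=columns)
df.index.name = "Marz"

# Assumption: every column is rescaled to zero mean and unit variance,
# so that indicators measured in different units are comparable.
scaled = pd.DataFrame(
    StandardScaler().fit_transform(df),
    index=df.index,
    columns=df.columns,
)

# Clustering features exclude NetInvestment, which is kept for classification.
X = scaled.drop(columns="NetInvestment")
print(X.round(2))
```

Without some form of rescaling, the indicators expressed in AMD would dominate the Euclidean distance and the unemployment percentage would have almost no influence on the clusters.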

For the analysis, appropriate features must be constructed from these indicators. To classify the marzes of the Republic of Armenia, we selected the K-means clustering algorithm, which requires the number of clusters to be fixed in advance. To determine the optimal number of clusters, we used the elbow method, the results of which are presented in Figure 2. In the figure, the y-axis represents the within-cluster sum of squared deviations, and the x-axis the number of clusters. According to the elbow method, the number of clusters is considered optimal at the point beyond which the curve becomes almost parallel to the x-axis. In our case, the optimal number of clusters is 4.

11 https://pandas.pydata.org/

12 https://scikit-learn.org/stable/

13 https://scipy.org/

14 https://matplotlib.org/

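[Line plot: within-cluster sum of squared deviations (vertical axis) against the number of clusters (horizontal axis)]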

Figure 2. Determination of the optimal number of clusters by the elbow method
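A minimal sketch of the elbow computation is given below. It assumes standardized features and a range of K from 1 to 8; the article specifies neither, and `n_init` and `random_state` are chosen here only to make the run repeatable. The feature values are transcribed from Figure 1.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Columns: Manufacturing, Agriculture, Service, Construction, Salary, Unemployment.
# Row order: Yerevan, Aragatsotn, Ararat, Armavir, Gegharkunik, Lori,
# Kotayk, Shirak, Syunik, Vayotz Dzor, Tavush.
X = np.array([
    [727207.406, 9696.186, 1534585.00, 221575.60, 201527, 22.7],
    [365377.049, 638944.400, 116902.80, 106344.70, 113538, 8.8],
    [1156278.590, 497954.400, 134561.10, 98777.32, 168027, 13.4],
    [410869.812, 679931.800, 121962.90, 140689.40, 135081, 11.6],
    [306618.860, 491666.700, 70176.75, 99537.72, 114479, 8.6],
    [406822.761, 319496.300, 110596.50, 66555.04, 124293, 19.9],
    [875666.866, 280597.000, 482369.80, 106546.90, 137388, 20.7],
    [298828.061, 374004.700, 139780.90, 121129.80, 110876, 18.0],
    [2527970.900, 429247.000, 185533.60, 141770.10, 266832, 15.0],
    [570344.615, 432820.500, 225239.00, 138785.60, 126740, 22.4],
    [257295.277, 304722.800, 167852.20, 139314.20, 118446, 26.9],
])
X_scaled = StandardScaler().fit_transform(X)  # assumption: standardized inputs

# Fit K-means for each candidate K and record the within-cluster sum of squares.
inertias = []
for k in range(1, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    inertias.append(model.inertia_)

for k, value in enumerate(inertias, start=1):
    print(f"K={k}: within-cluster sum of squares = {value:.2f}")
```

Plotting `inertias` against K gives a curve of the kind shown in Figure 2; the elbow is read off visually, so the choice of K remains a judgment call.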

In order to perform clustering with the K-means method, we used the following code:

from sklearn.cluster import KMeans

clus_model1 = KMeans(n_clusters=4)
clus_model1.fit(X)

As a result of the code operation, the regions were clustered on the basis of general indicators characterizing the economic situation of the RA marzes, which are included in the X variable.

Table 2 presents the 4 clusters obtained with the K-means method. As we can see, Lori, Kotayk, Shirak, Vayots Dzor and Tavush marzes are included in one cluster, Aragatsotn, Ararat, Armavir and Gegharkunik are included in another cluster, and Syunik and Yerevan each form their own cluster.

from collections import defaultdict

# Group marz names by cluster index (clusters_edit holds one cluster index per row of df)
result = defaultdict(list)
for i in range(len(clusters_edit)):
    cluster_idx = clusters_edit[i]
    marz = df.index[i]
    result[cluster_idx].append(marz)

for cluster_id in range(len(result)):
    print(f"cluster {cluster_id + 1}:", result[cluster_id])

cluster 1: ['Lori', 'Kotayk', 'Shirak', 'Vayotz Dzor', 'Tavush']

cluster 2: ['Syunik']

cluster 3: ['Yerevan']

cluster 4: ['Aragatsotn', 'Ararat', 'Armavir', 'Gegharkunik']

Table 2

Clustering results implemented by K-means method

| Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 |
|---|---|---|---|
| Lori | Syunik | Yerevan | Ararat |
| Kotayk | | | Aragatsotn |
| Shirak | | | Armavir |
| Vayots Dzor | | | Gegharkunik |
| Tavush | | | |

Figure 3 shows the resulting clusters on a two-dimensional plane whose axes are the first two principal components obtained by PCA (Principal Component Analysis). PCA represents points from a high-dimensional space through a set of components chosen so that the points projected onto the first axis have the greatest possible variance, the points projected onto the second axis have the second greatest possible variance, and so on. The calculated axes must be orthogonal to each other15.

[Scatter plot: marzes plotted against the first principal component (PC1, horizontal axis) and the second principal component (vertical axis), each point labeled with the marz name]

Figure 3. Visualization of clusters of the marzes with 2 main axes of PCA transformation
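The projection behind Figure 3 can be sketched as below. The values are transcribed from Figure 1; standardizing before PCA is an assumption, and the signs of the resulting coordinates may be flipped relative to the published figure, since principal axes are only defined up to sign.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

names = ["Yerevan", "Aragatsotn", "Ararat", "Armavir", "Gegharkunik", "Lori",
         "Kotayk", "Shirak", "Syunik", "Vayotz Dzor", "Tavush"]
# Columns: Manufacturing, Agriculture, Service, Construction, Salary, Unemployment.
X = np.array([
    [727207.406, 9696.186, 1534585.00, 221575.60, 201527, 22.7],
    [365377.049, 638944.400, 116902.80, 106344.70, 113538, 8.8],
    [1156278.590, 497954.400, 134561.10, 98777.32, 168027, 13.4],
    [410869.812, 679931.800, 121962.90, 140689.40, 135081, 11.6],
    [306618.860, 491666.700, 70176.75, 99537.72, 114479, 8.6],
    [406822.761, 319496.300, 110596.50, 66555.04, 124293, 19.9],
    [875666.866, 280597.000, 482369.80, 106546.90, 137388, 20.7],
    [298828.061, 374004.700, 139780.90, 121129.80, 110876, 18.0],
    [2527970.900, 429247.000, 185533.60, 141770.10, 266832, 15.0],
    [570344.615, 432820.500, 225239.00, 138785.60, 126740, 22.4],
    [257295.277, 304722.800, 167852.20, 139314.20, 118446, 26.9],
])
X_scaled = StandardScaler().fit_transform(X)  # assumption: standardized inputs

# Keep the two orthogonal directions that capture the most variance.
pca = PCA(n_components=2)
coords = pca.fit_transform(X_scaled)

for name, (pc1, pc2) in zip(names, coords):
    print(f"{name:12s} PC1={pc1:6.2f} PC2={pc2:6.2f}")
print("explained variance ratio:", pca.explained_variance_ratio_.round(3))
```

The explained variance ratio indicates how much of the total variation the two-dimensional picture retains, which is worth checking before reading cluster separation off the plot.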

Figure 4 shows the dendrogram obtained as a result of agglomerative clustering, in which the most similar marzes within the selected clusters are clearly visible.

15 Abdi, H., & Williams, L. J. (2010). Principal component analysis. Wiley interdisciplinary reviews: computational statistics, 2(4), 433-459.

[Dendrogram: leaves from top to bottom are Syunik, Yerevan, Gegharkunik, Aragatsotn, Armavir, Ararat, Kotayk, Lori, Vayotz Dzor, Shirak, Tavush; the horizontal axis shows the merge distance from 0 to 7]

Figure 4. Dendrogram obtained as a result of agglomerative clustering
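A dendrogram of this kind can be computed with SciPy as sketched below. The linkage criterion is not named in the article, so Ward linkage on standardized features is an assumption here, and the leaf order it yields need not match Figure 4. The values are transcribed from Figure 1.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

names = ["Yerevan", "Aragatsotn", "Ararat", "Armavir", "Gegharkunik", "Lori",
         "Kotayk", "Shirak", "Syunik", "Vayotz Dzor", "Tavush"]
# Columns: Manufacturing, Agriculture, Service, Construction, Salary, Unemployment.
X = np.array([
    [727207.406, 9696.186, 1534585.00, 221575.60, 201527, 22.7],
    [365377.049, 638944.400, 116902.80, 106344.70, 113538, 8.8],
    [1156278.590, 497954.400, 134561.10, 98777.32, 168027, 13.4],
    [410869.812, 679931.800, 121962.90, 140689.40, 135081, 11.6],
    [306618.860, 491666.700, 70176.75, 99537.72, 114479, 8.6],
    [406822.761, 319496.300, 110596.50, 66555.04, 124293, 19.9],
    [875666.866, 280597.000, 482369.80, 106546.90, 137388, 20.7],
    [298828.061, 374004.700, 139780.90, 121129.80, 110876, 18.0],
    [2527970.900, 429247.000, 185533.60, 141770.10, 266832, 15.0],
    [570344.615, 432820.500, 225239.00, 138785.60, 126740, 22.4],
    [257295.277, 304722.800, 167852.20, 139314.20, 118446, 26.9],
])
# Assumption: zero mean and unit variance per column before measuring distances.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# Agglomerative clustering: repeatedly merge the two closest groups.
Z = linkage(X_scaled, method="ward")

# Cut the tree into at most 4 groups, mirroring the K chosen for K-means.
cluster_ids = fcluster(Z, t=4, criterion="maxclust")

# no_plot=True returns the dendrogram layout without drawing it.
tree = dendrogram(Z, labels=names, no_plot=True)
print("leaf order:", tree["ivl"])
for name, cid in zip(names, cluster_ids):
    print(f"{name:12s} cluster {cid}")
```

Dropping `no_plot=True` draws the tree with Matplotlib; comparing `cluster_ids` with the K-means assignment is a quick consistency check between the two methods.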

Vayots Dzor marz is included in the first cluster; its economic indicators are relatively close to those of Lori, Kotayk, Shirak and Tavush marzes. There are no net investment flows in Vayots Dzor marz; the volume of industrial output per capita, the volume of construction and the average monthly nominal wage are quite low, and the unemployment level is higher than the republican average. It holds a high position among the marzes of the Republic of Armenia only by the volume of agricultural products per capita.

As a result of multidimensional classification, Syunik was included in a separate cluster, taking into account some peculiarities of the marz.

In 2019 the average monthly nominal salary in the republic was 182,673 AMD; only Syunik marz (266,832 AMD) and Yerevan (201,527 AMD) exceeded this average. Net investment flows in Syunik are also the highest after Yerevan.

Thus, Syunik marz is a leader in terms of industrial output and nominal wages, and by the volume of agricultural output per capita it is almost equal to Vayots Dzor marz. The volume of services per capita is higher in Vayots Dzor, owing to the high level of tourism services in the marz. The city of Yerevan is included in a separate cluster because, unlike Syunik marz, it ranks high in almost all indicators of economic development, with the exception of agricultural output per capita.

The fourth cluster includes the marzes mostly of agricultural orientation. Based on the clustering results, appropriate labels were given to the RA marzes, after which classification methods were constructed to predict the selected labels. LDA (Linear Discriminant Analysis)16, KNN (K Nearest Neighbors)17, Linear SVC (Support Vector Classifier)18 and Decision Tree19 methods were selected for comparative analysis, after which they were trained. The variable "net investment flows" was used as the independent variable to train the classifiers. After model training, the accuracy of all models was calculated and the results obtained were compared (Table 3). As we can see, the best result was obtained with the Decision Tree classifier, for which the accuracy is 0.91.

16 Izenman, A. J. (2013). Linear discriminant analysis. In Modern multivariate statistical techniques (pp. 237-280). Springer, New York, NY.

Table 3

The accuracy obtained for each classification model

| Method | Accuracy |
|---|---|
| LDA | 0.73 |
| KNN | 0.63 |
| LinearSVC | 0.54 |
| Decision Tree | 0.91 |
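The comparison in Table 3 can be approximated with the sketch below. It assumes default scikit-learn hyperparameters, a standardized net investment variable, and accuracy measured on the same 11 marzes used for training; none of this is stated in the article, so the LDA, KNN and LinearSVC figures may differ from Table 3. Labels follow the cluster numbering of Table 2.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Row order: Yerevan, Aragatsotn, Ararat, Armavir, Gegharkunik, Lori,
# Kotayk, Shirak, Syunik, Vayotz Dzor, Tavush.
net_investment = np.array([153802.2, 285.3, -385.3, -4082.0, 0.0, 159.7,
                           405.2, 2.6, 7089.5, 0.0, 100.7]).reshape(-1, 1)
# Cluster labels from Table 2 (1 = Lori group, 2 = Syunik, 3 = Yerevan,
# 4 = agricultural group).
labels = np.array([3, 4, 4, 4, 4, 1, 1, 1, 2, 1, 1])

X = StandardScaler().fit_transform(net_investment)  # assumption

models = {
    "LDA": LinearDiscriminantAnalysis(),
    "KNN": KNeighborsClassifier(),
    "LinearSVC": LinearSVC(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

# Fit each classifier and score it on the training data itself.
scores = {}
for name, model in models.items():
    model.fit(X, labels)
    scores[name] = accuracy_score(labels, model.predict(X))
    print(f"{name:14s} accuracy = {scores[name]:.2f}")
```

Under these assumptions the decision tree reaches 10/11, about 0.91, because Gegharkunik and Vayots Dzor share the same net investment value (0.0) but carry different labels, so no classifier using this single variable can separate them.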

Since the best result was obtained with the Decision tree classifier, the analysis was later performed based on the values predicted by that classifier.

Figure 5 shows the constructed decision tree, which clearly presents how the classifier determines the class for a given marz.

NetInvestment <= -0.239 (gini = 0.645, samples = 11, value = [5, 4, 1, 1])
  True: NetInvestment <= -0.328 (gini = 0.494, samples = 9, value = [5, 4, 0, 0])
    True: leaf (gini = 0.0, samples = 2, value = [0, 2, 0, 0])
    False: NetInvestment <= -0.324 (gini = 0.408, samples = 7, value = [5, 2, 0, 0])
      True: leaf (gini = 0.5, samples = 2, value = [1, 1, 0, 0])
      False: NetInvestment <= -0.319 (gini = 0.32, samples = 5, value = [4, 1, 0, 0])
        True: leaf (gini = 0.0, samples = 3, value = [3, 0, 0, 0])
        False: NetInvestment <= -0.316 (gini = 0.5, samples = 2, value = [1, 1, 0, 0])
          True: leaf (gini = 0.0, samples = 1, value = [0, 1, 0, 0])
          False: leaf (gini = 0.0, samples = 1, value = [1, 0, 0, 0])
  False: NetInvestment <= 1.497 (gini = 0.5, samples = 2, value = [0, 0, 1, 1])
    True: leaf (gini = 0.0, samples = 1, value = [0, 0, 1, 0])
    False: leaf (gini = 0.0, samples = 1, value = [0, 0, 0, 1])

Figure 5. Visualization of the constructed decision tree
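A text rendering of such a tree can be produced with scikit-learn's `export_text`, as in the sketch below. It assumes a default `DecisionTreeClassifier` trained on the standardized net investment variable with the labels of Table 2; the article does not give the hyperparameters, and the class numbering here may differ from the one in Figure 5.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Row order: Yerevan, Aragatsotn, Ararat, Armavir, Gegharkunik, Lori,
# Kotayk, Shirak, Syunik, Vayotz Dzor, Tavush.
net_investment = np.array([153802.2, 285.3, -385.3, -4082.0, 0.0, 159.7,
                           405.2, 2.6, 7089.5, 0.0, 100.7])
labels = np.array([3, 4, 4, 4, 4, 1, 1, 1, 2, 1, 1])  # clusters from Table 2

# Assumption: the feature is standardized (zero mean, unit variance).
z = (net_investment - net_investment.mean()) / net_investment.std()
X = z.reshape(-1, 1)

clf = DecisionTreeClassifier(random_state=0).fit(X, labels)

# Print the learned decision rules as indented text.
rules = export_text(clf, feature_names=["NetInvestment"])
print(rules)
```

Each printed rule is a threshold on a single variable, which is what makes the tree easy to read but also means very close net investment values are separated by extremely narrow intervals.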

17 Peterson, L. E. (2009). K-nearest neighbor. Scholarpedia, 4(2), 1883.

18 Suthaharan, S. (2016). Support vector machine. In Machine learning models and algorithms for big data classification (pp. 207-235). Springer, Boston, MA.

19 Myles, A. J., Feudale, R. N., Liu, Y., Woody, N. A., & Brown, S. D. (2004). An introduction to decision tree modeling. Journal of Chemometrics: A Journal of the Chemometrics Society, 18(6), 275-285.

Using the predicted and original labels, the confusion (error) matrix was constructed (Figure 6). According to the matrix, one of the marzes belonging to the 4th cluster (Gegharkunik marz) was mistakenly labeled as belonging to the first cluster, and all the other marzes were classified correctly.

| Actual label \ Predicted label | first | second | third | fourth |
|---|---|---|---|---|
| first | 5 | 0 | 0 | 0 |
| second | 0 | 1 | 0 | 0 |
| third | 0 | 0 | 1 | 0 |
| fourth | 1 | 0 | 0 | 3 |

Figure 6. Matrix of classification errors
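The error matrix and the accuracy of 0.91 can be checked with a short sketch. It assumes a default decision tree fitted to the raw net investment values and evaluated on the same 11 marzes (a tree's splits do not depend on rescaling a single feature); the labels follow Table 2.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.tree import DecisionTreeClassifier

# Row order: Yerevan, Aragatsotn, Ararat, Armavir, Gegharkunik, Lori,
# Kotayk, Shirak, Syunik, Vayotz Dzor, Tavush.
X = np.array([153802.2, 285.3, -385.3, -4082.0, 0.0, 159.7,
              405.2, 2.6, 7089.5, 0.0, 100.7]).reshape(-1, 1)
y_true = np.array([3, 4, 4, 4, 4, 1, 1, 1, 2, 1, 1])  # clusters from Table 2

clf = DecisionTreeClassifier(random_state=0).fit(X, y_true)
y_pred = clf.predict(X)

# Rows are actual clusters 1..4, columns are predicted clusters 1..4.
cm = confusion_matrix(y_true, y_pred, labels=[1, 2, 3, 4])
accuracy = accuracy_score(y_true, y_pred)

print(cm)
print(f"accuracy = {accuracy:.2f}")
```

The single off-diagonal entry arises because Gegharkunik and Vayots Dzor both have a net investment of 0.0, so they fall into the same leaf and receive the same predicted label; accuracy is therefore 10/11.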

After all this, ROC (Receiver Operating Characteristic) curves were constructed separately for each class (Figure 7). A ROC curve is a graph for evaluating the quality of a classifier: it plots the proportion of correctly classified positive objects (true positive rate) against the proportion of objects incorrectly classified as positive (false positive rate). The quantitative interpretation of ROC is given by the AUC (Area Under Curve) indicator, the area bounded by the ROC curve and the false positive rate axis. The higher the AUC, the better the classifier. In our case, according to the ROC curves and the calculated AUC values, the classifier has a very high quality.

[ROC curves, one per class; horizontal axis: False Positive Rate]

Figure 7. ROC curve for each class
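Per-class ROC curves of this kind can be computed in a one-vs-rest fashion, as sketched below. The sketch assumes a default decision tree trained on the raw net investment values and scored on the same 11 marzes, with class probabilities taken from `predict_proba`; the article does not describe how its curves were obtained, so this is only one plausible reading.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.tree import DecisionTreeClassifier

# Row order: Yerevan, Aragatsotn, Ararat, Armavir, Gegharkunik, Lori,
# Kotayk, Shirak, Syunik, Vayotz Dzor, Tavush.
X = np.array([153802.2, 285.3, -385.3, -4082.0, 0.0, 159.7,
              405.2, 2.6, 7089.5, 0.0, 100.7]).reshape(-1, 1)
y_true = np.array([3, 4, 4, 4, 4, 1, 1, 1, 2, 1, 1])  # clusters from Table 2

clf = DecisionTreeClassifier(random_state=0).fit(X, y_true)
proba = clf.predict_proba(X)  # one column per class, ordered as clf.classes_

# One-vs-rest: treat each class in turn as positive and the rest as negative.
aucs = {}
for col, cls in enumerate(clf.classes_):
    is_positive = (y_true == cls).astype(int)
    fpr, tpr, _ = roc_curve(is_positive, proba[:, col])
    aucs[int(cls)] = auc(fpr, tpr)
    print(f"class {cls}: AUC = {aucs[int(cls)]:.3f}")
```

Because the scores are computed on the training data of a fully grown tree, AUC values close to 1 are expected almost by construction; a held-out or cross-validated estimate would be more informative, although with only 11 observations it would be very noisy.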

Conclusion. Summing up the results of the analysis, we can state that Syunik marz is more attractive in terms of investment potential due to its industry, construction volumes and high nominal salary, while Vayots Dzor marz is attractive due to its volume of services and its agriculture. There are no net investment flows in Vayots Dzor marz; the volume of industrial output per capita, the volume of construction and the average monthly nominal wage are quite low, and the unemployment rate is higher than the republican average. Only in terms of agricultural output per capita are the two marzes almost equal. The volume of services per capita is higher in Vayots Dzor, given the high level of tourism services in the marz.

The city of Yerevan is included in a separate cluster, as it, unlike the Syunik region, is in a high position in almost all indicators of economic development, except for agricultural products per capita. The fourth cluster includes the regions which are largely engaged in agricultural activities.

This analysis can be a basis for the integrated development of these two marzes; they can complement each other to some extent and, by cooperating, ensure higher efficiency of economic activity. Given the current tense situation in the marzes, improving the investment climate is of strategic importance.

References

1. Melkumyan, M., (2014). Organization of entrepreneurial activity, textbook, Yerevan, Zangak-97. (In Armenian).

2. Hovsepyan, V., Egiazaryan, M. (2014). National economy YSU textbook, 224 p. (In Armenian).

3. Abdi, H., & Williams, L. J. (2010). Principal component analysis. Wiley interdisciplinary reviews: computational statistics, 2(4), 433-459.

4. Chen, P., Ye, J., Chen, G., Zhao, J., & Heng, P. A. (2020). Robustness of accuracy metric and its inspirations in learning with noisy labels. arXiv preprint arXiv:2012.04193.

5. Bradley, A. P. (1997). The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern recognition, 30(7), 1145-1159.

6. Izenman, A. J. (2013). Linear discriminant analysis. In Modern multivariate statistical techniques (pp. 237-280). Springer, New York, NY.

7. Peterson, L. E. (2009). K-nearest neighbor. Scholarpedia, 4(2), 1883.

8. Suthaharan, S. (2016). Support vector machine. In Machine learning models and algorithms for big data classification (pp. 207-235). Springer, Boston, MA.

9. Myles, A. J., Feudale, R. N., Liu, Y., Woody, N. A., & Brown, S. D. (2004). An introduction to decision tree modeling. Journal of Chemometrics: A Journal of the Chemometrics Society, 18(6), 275-285.

10. Investment climate and its components [Electronic resource]: https://studopedia.ru/7_183452_investitsionniy-klimat-i-ego-sostavlyayushchie.html

11. Folom'ev, A., Revazov, V. (2000). The investment climate of Russian regions and ways to improve it // Problems of economic transition. Vol. 43. No. 3, pp. 41-55.

12. Krichevsky, M.L., Martynova, Yu.A. (2019). Using machine learning methods to evaluate investment activity in various regions of Russia // Issues of innovative economy. Vol. 9. No. 4, pp. 1557-1572 (In Russian).

13. Tsakunov, S.V. (2009). Theory and practice of investment marketing and management [Electronic resource]. Access mode: http://invm2009.narod.ru (In Russian).

14. https://www.armstat.am/am/?nid=82&id=2324

15. https://pandas.pydata.org/

16. https://scikit-learn.org/stable/

17. https://scipy.org/

18. https://matplotlib.org/


JEL: R58, C52, C53 DOI: 10.52174/1829-0280_2021_6_78

SOS KHACHIKYAN

Dean of the Faculty of Informatics and Statistics, Armenian State University of Economics, PhD in Economics, Associate Professor

ARMEN GHAZARYAN

Acting Head of the Chair of Economic Informatics and Information Systems, Armenian State University of Economics, PhD in Economics, Associate Professor

ANDRANIK SARGSYAN

Postgraduate student at the Chair of Economic Informatics and Information Systems, Armenian State University of Economics

Evaluation of investment activity in the Vayots Dzor and Syunik marzes of the RA using machine learning methods.- At present, the main driving force for the operation and development of the various sectors of the economy of the RA and its marzes is the attraction of investments and, consequently, an increase in the attractiveness and competitiveness of the investment environment. Effective management of the process of attracting investment programs ensures sustainable economic growth and development.

The investment potential of any country or marz is influenced by a number of favorable or unfavorable factors, from which the most significant indicators of economic development of the marzes in question have been singled out. On their basis, the investment potential of the Vayots Dzor and Syunik marzes of the RA was assessed using machine learning methods and compared with that of the other marzes of the RA.

Based on the selected indicators, the marzes of the RA were clustered and 4 main classes were identified. Based on the clustering results, labels were assigned to the marzes of the RA, after which classification methods were developed to predict the selected labels.

ROC (Receiver Operating Characteristic) curves calculated for each class were also constructed to assess the quality of the classifier. The quantitative interpretation of ROC shows that the classifier is of very high quality.

Keywords: investment potential, investment attractiveness, machine learning, clustering, decision tree JEL: R58, C52, C53 DOI: 10.52174/1829-0280_2021_6_78
