
DEVELOPMENT OF A MODULE FOR EVALUATING THE ACTIVITY OF THE MAKHALLA CHAIRPERSONS BASED ON THE EXPERTS' ASSESSMENT WITH THE ALGORITHMS

Ibragimov Muhiddin Fakhraddin ugli¹, Matyakubov Islombek Ikrom ugli², Raximov Asadbek Dilshod ugli³, Musayev Muhammadjon Xursandbek ugli⁴

¹ Assistant of the department of "Software engineering", Urgench branch of Tashkent University of Information Technologies named after Muhammad al-Khwarizmi

²,³,⁴ Students of the direction of "Software engineering", Urgench branch of Tashkent University of Information Technologies named after Muhammad al-Khwarizmi

https://doi.org/10.5281/zenodo.7915928

Abstract. This article deals with the application of machine learning algorithms in the development of a module for evaluating the activity of mahallas based on their employment data. For the classification problem of machine learning, the construction of models based on the Linear Regression, Polynomial Regression and Neural Network algorithms is described. These algorithms are used to solve the problem of automatically assessing mahallas. Experimental work was carried out on the KNIME Analytics Platform. The results obtained with these algorithms are compared and conclusions are presented.

Keywords: expert, model, evaluation, linear regression learner, polynomial regression learner, RProp MLP Learner, X-Partitioner, X-Aggregator, error, object.

Introduction

Mahalla data contain a great deal of information that characterizes the living conditions and the material and spiritual well-being of the population. The standard of living of the population means the degree to which people are provided with the necessary material and spiritual benefits and to which their consumption and needs are satisfied.[1]

The standard of living of the population can be regarded as the level at which its material, spiritual and social needs are met. One of the important indicators in this regard is the employment of the population of the mahalla. Incomplete employment data are common in cross-sectional data collection, but discarding incomplete records during pre-processing results in a loss of informative data.[2,3] During model building, this loss leads to larger errors when the adequacy of the model is verified.

Because of the large number of mahallas, experts face a number of difficulties in assessing the activity of the mahalla chairperson, especially on the basis of the existing employment indicators. Moving the evaluation process into the system by building a model that evaluates mahallas according to the employment indicator will also allow mahalla chairpersons to compare their performance with that of other mahallas. Classification methods and machine learning algorithms based on the available data help to solve this problem.[4,5]

Problem statement. When solving a classification problem, methods such as feed-forward neural networks, logistic regression, naive Bayes, support vector machines, random forests and nearest neighbors give good results. Many classification algorithms have been developed on the basis of these methods, and each of them works well on different types of datasets. It is therefore important to determine which classification method is effective for the data set we have chosen. The efficiency indicator should take into account not only the classification error rate but also the time taken to execute the algorithm. Accordingly, the main issue in this article is to choose a training method that minimizes both the classification error and the time spent on training for the given data set.

To assess the reliability of a model, it is advisable to use a more complicated but more reliable testing method. One such method is K-Fold Cross-Validation. Its main feature is that all objects participate in both the training process and the testing process (see Figure 1).

Figure 1. Splitting the training sample into subsets using the K-Fold Cross-Validation testing method.

In the K-Fold Cross-Validation testing method, the training sample is divided into k subsets. The model is then trained and tested k times. In the i-th run, the i-th subset is used only for testing and the remaining subsets are used for training. The error is calculated for each test, and the average error is determined by the following formula.

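$$E = \frac{1}{K}\sum_{i=1}^{K} E_i$$

where $E_i$ is the error obtained in the i-th test and $K$ is the number of subsets.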
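The averaging over K tests can be sketched in a few lines of Python. This is only an illustration, not the KNIME implementation: the function names `k_fold_indices` and `cross_validate` are hypothetical, the folds are taken consecutively without shuffling, and the mean squared error is assumed as the per-fold error because the article does not name the metric. The toy model simply predicts the mean of the training targets.

```python
import numpy as np

def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k nearly equal consecutive folds."""
    return np.array_split(np.arange(n_samples), k)

def cross_validate(X, y, k, fit, predict):
    """Train and test k times; return (average error, list of fold errors)."""
    folds = k_fold_indices(len(X), k)
    errors = []
    for i in range(k):
        test_idx = folds[i]  # the i-th fold is used only for testing
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train_idx], y[train_idx])
        pred = predict(model, X[test_idx])
        # Assumed error measure: mean squared error on the test fold
        errors.append(float(np.mean((pred - y[test_idx]) ** 2)))
    return float(np.mean(errors)), errors

# Toy example: the "model" is just the mean of the training targets
X_demo = np.arange(6, dtype=float).reshape(-1, 1)
y_demo = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
avg_error, fold_errors = cross_validate(
    X_demo, y_demo, k=3,
    fit=lambda X, y: y.mean(),
    predict=lambda model, X: np.full(len(X), model),
)
print(avg_error, fold_errors)
```

Every object ends up in a test fold exactly once and in a training set k-1 times, which is the property the method relies on.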

A data set is built from the data on employment of residents of mahallas in the Khorezm region (Table 1), with each mahalla treated as an object.[10,11]

Table 1. Demographic indicators and employment indicators in the areas of the region and expert assessment

| Makhalla | Population | Number of families | Number of pensioners | Number of people under the age of 18 | Number of people engaged in child care | Number of permanently employed | People engaged in business | Number of unemployed | Expert assessment |
|---|---|---|---|---|---|---|---|---|---|
| Uslar | 2371 | 649 | 194 | 726 | 182 | 486 | 149 | 189 | 86 |
| Yuksalish | 3314 | 667 | 314 | 981 | 248 | 617 | 164 | 208 | 71 |
| Namuna | 2857 | 797 | 356 | 1208 | 245 | 64 | 129 | 138 | 75 |
| Gulzor | 4012 | 915 | 283 | 1374 | 150 | 423 | 917 | 270 | 72 |
| Bogzor | 3723 | 745 | 389 | 1344 | 205 | 0 | 144 | 243 | 76 |
| Uzbekistan | 4861 | 1203 | 124 | 1754 | 386 | 811 | 202 | 215 | 68 |
| Navbakhor | 6347 | 1945 | 389 | 2443 | 432 | 281 | 328 | 234 | 65 |

The regression problem, which is a part of the classification problem, is solved for the generated training sample.

Regression is one of the methods of intelligent data analysis and is a set of statistical processes for evaluating the relationship between variables related to an object or process. Regression analysis is widely used mainly for prediction and forecasting, and its use now overlaps with the field of machine learning [8]. In linear regression, the relationship between the independent variable and the dependent variable is represented by a straight line. This line is called the regression line and is given by the linear equation y = a * x + b. If there is more than one independent variable, we consider a multiple linear regression model of the following form:

$$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$$

• y is the response, that is, the result predicted by the model;

• b0 is the intercept, that is, the value of y when all x are 0;

• b1 is the coefficient of the first feature x1;

• bn is the coefficient of the n-th feature xn;

• x1, x2, ..., xn are the independent variables of the model.

Basically, the equation describes the relationship between a continuous dependent variable (y) and two or more independent variables (x1, x2, x3, ...). In polynomial regression, the relationship between the independent variables and the dependent variable is represented by a polynomial rather than by a straight line.
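Both models can be illustrated with an ordinary least-squares fit in NumPy. The sketch below is not the KNIME learner: the helper names `fit_linear`, `predict_linear` and `add_squares` are hypothetical, the data are synthetic, and the polynomial case is limited to added squared terms as one simple way of making the model nonlinear in x.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bn*xn; returns [b0, b1, ..., bn]."""
    A = np.column_stack([np.ones(len(X)), X])  # leading column of ones gives the intercept b0
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs

def predict_linear(coeffs, X):
    """Evaluate b0 + b1*x1 + ... + bn*xn for every row of X."""
    return coeffs[0] + X @ coeffs[1:]

def add_squares(X):
    """Append the squared features so the same linear fit yields a degree-2 polynomial."""
    return np.column_stack([X, X ** 2])

# Synthetic multiple linear example: y = 1 + 2*x1 - 3*x2
X_lin = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [2.0, 1.0], [0.5, 2.0]])
y_lin = 1.0 + 2.0 * X_lin[:, 0] - 3.0 * X_lin[:, 1]
b_lin = fit_linear(X_lin, y_lin)

# Synthetic polynomial example: y = 0.5 + x^2
X_quad = np.array([[0.0], [1.0], [2.0], [3.0]])
y_quad = 0.5 + X_quad[:, 0] ** 2
b_quad = fit_linear(add_squares(X_quad), y_quad)

print(b_lin, b_quad)
```

Treating polynomial regression as linear regression on expanded features keeps a single fitting routine for both cases.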

File Reader: Data is loaded through this component. The loaded data look as follows.

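| Row ID | Col0 | Col1 | Col2 | Col3 | Col4 | Col5 | Col6 | Col7 | Col8 | Col9 |
|---|---|---|---|---|---|---|---|---|---|---|
| Row0 | 2371 | 649 | 194 | 726 | 182 | 486 | 149 | 312 | 189 | 86 |
| Row1 | 3314 | 667 | 314 | 981 | 248 | 617 | 164 | 421 | 208 | 71 |
| Row2 | 2857 | 797 | 356 | 1208 | 245 | 64 | 129 | 83 | 138 | 75 |
| Row3 | 4012 | 915 | 283 | 1374 | 150 | 423 | 917 | 489 | 270 | 72 |
| Row4 | 3723 | 745 | 389 | 1344 | 205 | 0 | 144 | 147 | 243 | 76 |
| Row5 | 4661 | 1203 | 124 | 1754 | 386 | 811 | 202 | 316 | 215 | 68 |
| Row6 | 6347 | 1945 | 389 | 2443 | 432 | 281 | 328 | 285 | 234 | 65 |
| Row7 | 4114 | 1039 | 397 | 1402 | 29 | 786 | 365 | 342 | 136 | 73 |
| Row8 | 3386 | 935 | 283 | 1010 | 115 | 481 | 414 | 280 | 217 | 86 |
| Row9 | 3305 | 925 | 334 | 926 | 23 | 781 | 298 | 345 | 145 | 86 |
| Row10 | 3468 | 874 | 523 | 1285 | 29 | 102 | 124 | 275 | 255 | 87 |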

Figure 3. Loading the data set formed on the basis of the mahalla data.

Normalizer: Through this component, the data are normalized: min-max normalization maps every feature to the [0..1] range. This step is necessary to give all features the same weight.
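Min-max normalization itself is a one-line formula, (x - min) / (max - min), applied per column. The sketch below is a plain NumPy illustration rather than the KNIME node; the function name `min_max_normalize` is hypothetical, the sample uses the population and family counts of the first three mahallas in Table 1, and mapping a constant column to 0 is an assumption made here to avoid division by zero.

```python
import numpy as np

def min_max_normalize(X):
    """Rescale every column of X to the [0, 1] range."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    span = X.max(axis=0) - col_min
    span[span == 0] = 1.0  # assumption: a constant column is mapped to 0
    return (X - col_min) / span

# Population and number of families for Uslar, Yuksalish and Namuna (Table 1)
sample = np.array([[2371, 649],
                   [3314, 667],
                   [2857, 797]])
normalized = min_max_normalize(sample)
print(normalized)
```

After this step the population column, which is measured in thousands, no longer dominates columns measured in hundreds.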

Figure 4. View of the data set after normalization.

X-Partitioner and X-Aggregator: these components form the training and testing samples for K-Fold Cross-Validation; they divide the training sample into k parts and organize the training process k times.

$$Y_1 = -0.5823 x_1 - 0.347 x_2 - 0.2316 x_3 - 0.4682 x_4 - 0.4759 x_5 \;[\text{sign illegible}]\; 0.3101 x_6 + 0.00048 x_7 - 0.0909 x_8 + 0.1683 x_1^2 - 0.3748 x_2^2 + 0.198 x_3^2 + 0.2268 x_4^2 + 0.287 x_5^2 + 0.2266 x_6^2 - 0.161 x_7^2 + 0.059 x_8^2$$

R-Squared: 0.7163 Adjusted R-Squared: 0.6964

Linear Regression Learner: Through this component a regression model is built, and its adequacy is checked by passing the test sample to the Regression Predictor. After testing, the Line Chart component is used to visualize how far the predictions deviate from the actual values; the resulting model and error are as follows.[12]

$$y = -0.2315 x_1 - 0.0889 x_2 - 0.2799 x_3 - 0.3407 x_4 - 0.2359 x_5 - 0.0879 x_6 - 0.173 x_7 - 0.0731 x_8$$

R-Squared: 0.6799 Adjusted R-Squared: 0.669
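For readers who want to see how the two quality measures reported here are defined, the following sketch computes them with the standard textbook formulas. The function names are hypothetical and the numbers are synthetic; the article does not state the sample size, so its own values are not reproduced.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return float(1.0 - ss_res / ss_tot)

def adjusted_r_squared(r2, n_samples, n_features):
    """Adjusted R^2 penalizes R^2 for the number of predictors used."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)

# Synthetic example with four observations and one predictor
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(y_true, y_pred)
adj_r2 = adjusted_r_squared(r2, n_samples=4, n_features=1)
print(r2, adj_r2)
```

Because the adjustment grows with the number of predictors, the adjusted value is the fairer one when a model with more terms is compared with a model with fewer terms.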

| Row ID | Variable | Coeff. | Std. Err. | t-value | P>\|t\| |
|---|---|---|---|---|---|
| Row10 | Intercept | 1.047 | 0.027 | 38.233 | 0 |
| Row3 | Col2 | -0.28 | 0.06 | -4.645 | 0 |
| Row5 | Col4 | -0.236 | 0.053 | -4.464 | 0 |
| Row4 | Col3 | -0.341 | 0.095 | -3.579 | 0 |
| Row7 | Col6 | -0.173 | 0.067 | -2.573 | 0.011 |
| Row1 | Col0 | -0.232 | 0.141 | -1.636 | 0.103 |
| Row6 | Col5 | -0.088 | 0.056 | -1.58 | 0.115 |
| Row8 | Col7 | -0.073 | 0.083 | -0.879 | 0.38 |
| Row2 | Col1 | -0.089 | 0.112 | -0.792 | 0.429 |
| Row9 | Col8 | -0.058 | 0.077 | -0.754 | 0.452 |

Figure 5. The error obtained using the Linear Regression Learner method.

Polynomial Regression Learner: with this component we build a regression model and obtain the following nonlinear result.

R-Squared: 0.7163 Adjusted R-Squared: 0.6964

Figure 6. The model and the error obtained using the Polynomial Regression Learner method.

RProp MLP Learner: with this component we build the regression model and obtain the following graph.


Figure 7. Initial and final values obtained with the RProp MLP Learner method.

Table 2. Experts' assessment

| № | Linear Regression Learner | Polynomial Regression Learner | RProp MLP Learner |
|---|---|---|---|
| 1 | 0.024 | 0.021 | 0.009 |
| 2 | 0.019 | 0.014 | 0.008 |
| 3 | 0.023 | 0.016 | 0.024 |
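One simple way to compare the three columns of Table 2 is to average the errors of each method. The sketch below does only that arithmetic; giving the three rows equal weight is an assumption, since the article does not say what each row represents.

```python
# Error values copied from Table 2, one list per method
errors = {
    "Linear Regression Learner": [0.024, 0.019, 0.023],
    "Polynomial Regression Learner": [0.021, 0.014, 0.016],
    "RProp MLP Learner": [0.009, 0.008, 0.024],
}

# Assumption: the three rows are weighted equally
mean_errors = {name: sum(values) / len(values) for name, values in errors.items()}
best_method = min(mean_errors, key=mean_errors.get)

for name, value in mean_errors.items():
    print(f"{name}: {value:.4f}")
print("Lowest mean error:", best_method)
```

On these numbers the neural network has the lowest mean error, although in the third row its error is the largest of the three methods.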

Conclusion. Machine learning models for evaluating the activity of the mahalla chairperson on the basis of mahalla data were built using linear regression, polynomial regression and a multi-layer neural network. The model built with the neural network method was found to be more effective than the models built with the linear and polynomial regression methods.

REFERENCES

1. Development of algorithms and software products for personality recognition based on speech signal processing S Ismoilov, O Masharipov, M Ibragimov, AIP Conference Proceedings, 2022

2. Ibragimov Mukhiddin Fakhraddin ugli EURASIAN JOURNAL OF MATHEMATICAL THEORY AND COMPUTER SCIENCES Volume 2 Issue 14, December 2022 https://doi.org/10.5281/zenodo.7485196

3. Development and application of intelligent decision-making algorithms in executive authorities. I.M. Fahraddin o'g'li. Komputer texnologiyalari, 2022

4. Modeling using polynomial regression algorithms of machine learning on mahalla data O.K.Xujaev M.F.Ibragimov

5. THE IMPORTANCE OF MONITORING IN THE MANAGEMENT OF SOCIO-ECONOMIC PROCESSES IN SELF-GOVERNMENT BODIES M.F.Ibragimov, O.K.Xujaev

6. On the issue of assessing the competence of training future bachelors of "Software Engineering" under weakly formalized conditions. F. Yusupov, O. Kazakov, M. Ibragimov.

7. Tomas Loster. Determining the optimal number of clusters in cluster analysis. 10th International Days of Statistics and Economics. Conference proceedings. September 8-10, 2016; Prague, Czech Republic. pp. 1078-1090.

8. X. Rahimboev, M. Ismoilov. "Creating a model for parametric assessment of the state of a control object and its components". Acta Turin Polytech. Univ. Tashkent, vol. 10, no. 2, pp. 19-33, 2020. https://uzjournals.edu.uz/actattpu/vol10/iss2/11.

9. Raximboyev X.J. "Development of an algorithm for decision support in self-government bodies using machine learning". Scientific and Technical Journal, FerPI, 2020, V.24, №6. pp. 23-30. https://uzjournals.edu.uz/ferpi/vol24/iss6/4

10. Dubina, I. N. Fundamentals of mathematical modeling of socio-economic processes: textbook and practical course for bachelor's and master's students / I. N. Dubina. - Moscow: Yurayt Publishing, 2019. - 349 p.

11. A. A. Barseghyan, M. S. Kupriyanov, I. I. Xolod, M. D. Tess and S. I. Elizarov. Analysis of data and processes: textbook. - 3rd ed., revised and expanded. - St. Petersburg: BHV-Peterburg, 2009. - 512 p.: ill. + CD-ROM.

12. Prokopenko, N. Yu. Decision support systems: textbook / N. Yu. Prokopenko; Nizhny Novgorod State University of Architecture and Civil Engineering. - Nizhny Novgorod: Nizhny Novgorod State University of Architecture and Civil Engineering, 2017.

13. Kornikov V.V., Seregin I.A., Xovanov N.V. Bayesian model for processing non-numerical, inexact and incomplete information about weight coefficients // http://inftech.webservis.ru/it/conference/scm/2000/session3/kornikov.htm

14. https://www.knime.com/knime-analytics-platform

15. Sahami M. Learning limited dependence Bayesian classifiers // Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining. - Portland, Ore, USA, 1996. - P. 335-338

16. Friedman N. Learning belief networks in the presence of missing values and hidden variables // Proceedings of the 14th International Conference on Machine Learning. - 1997. - P. 125-133.
