

NEUROINFORMATICS AND INTELLIGENT SYSTEMS

UDC 004.048

Babichev S.1, Lytvynenko V.2, Taif M. A.3

1Ph.D., Associate Professor, Associate Professor of the Department of Informatics, Jan Evangelista Purkyne University in Usti nad Labem, Czech Republic
2Dr.Sc., Professor, Head of the Department of Informatics and Computer Sciences, Kherson National Technical University, Kherson, Ukraine
3Postgraduate Student of the Department of Informatics and Computer Sciences, Kherson National Technical University, Kherson, Ukraine

ESTIMATION OF THE INDUCTIVE MODEL OF OBJECTS CLUSTERING STABILITY BASED ON THE K-MEANS ALGORITHM FOR DIFFERENT LEVELS OF DATA NOISE

© Babichev S., Lytvynenko V., Taif M. A., 2016. DOI 10.15588/1607-3274-2016-4-7

The inductive model of objective clustering of objects based on the k-means clustering algorithm is presented in the paper. An algorithm for division of the initial data into two equal-power subsets is proposed and practically implemented. The difference between the mass centers of the corresponding clusters in the different clusterings is proposed for use as an external balance criterion. Approbation of the proposed model was carried out using the "Compound" and "Aggregation" data from the database of the School of Computing of the University of Eastern Finland. Research on estimating the stability of the model to a noise component using the "Seeds" data is presented in the paper. The k-means, c-means, inductive k-means and agglomerative hierarchical algorithms were used to compare the results of the experiment. Ways of further improving the proposed model in order to increase the objectivity of clustering of the investigated data were defined from the results of the simulation.

Keywords: inductive modeling, clustering, k-means algorithm, external balance criterion.

NOMENCLATURE

GMDH is the Group Method of Data Handling;

n is the number of observed objects;
m is the number of attributes that characterize the objects;
k is the number of clusters;
x_ij is the value of the feature in column j of row i;
x'_ij is the normalized value of the feature in column j of row i;
med_j is the median of column j;
q is the number of clusters in the clusterings Q and R, respectively.

INTRODUCTION

Nowadays, great attention is devoted to the clustering of complex objects under various levels of data noise. First of all, this is connected with increasing requirements for the accuracy of detection and identification systems operating under various conditions of information acquisition. A lot of clustering algorithms exist nowadays. Each of them has its advantages and disadvantages and is focused on a specific type of data. A high degree of subjectivity is one of the key disadvantages of existing algorithms, i.e. high-quality clustering on one dataset does not guarantee the same results on another similar dataset. Clustering objectivity can be improved by using inductive methods of complex systems modelling based on the Group Method of Data Handling [1-3], where the data processing is carried out on two equal-power subsets and the final decision concerning the nature of the partition of the objects into clusters is made on the basis of the complex use of external relevance criteria and internal criteria of clustering quality estimation. Thereby, the development of hybrid models and methods of objects clustering based on inductive modeling methods for complex systems is an actual problem both fundamentally and practically.

1 PROBLEM STATEMENT

The initial dataset of objects is a matrix A = {x_ij}, i = 1, ..., n, j = 1, ..., m. The aim of the clustering is a partition of the objects into non-empty, pairwise non-intersecting clusters in accordance with the criterion of remoteness between object and cluster, taking into account the properties of the objects:

$$K = \{K_s\}, \; s = 1, \dots, k; \quad K_1 \cup K_2 \cup \dots \cup K_k = A; \quad K_i \cap K_j = \emptyset, \; i \neq j; \; i, j = 1, \dots, k.$$

Three fundamental principles, taken from different scientific fields, are the basis of the methodology of complex systems inductive modeling [1-6]:

- the principle of heuristic self-organization, i.e. enumeration of a set of models and selection of the best model on the basis of an external balance criterion;

- the principle of external addition, i.e. the necessity of using additional information for objective verification of the models;

- the principle of inconclusive decisions, i.e. generation of a certain set of intermediate results in order to select the best variant.

The implementation of these principles within the objective clustering inductive model assumes the following steps:

- normalization of the features of the investigated objects, i.e. their reduction to an identical range with the same median of the objects' attributes;

- division of the initial data set into two equal-power subsets;

- definition of an external criterion, or a group of relevance criteria, to choose the optimal clustering for the two equal-power subsets;

- choice or development of the basic clustering algorithm used as a component of the inductive model of objective clustering of objects.

2 REVIEW OF THE LITERATURE

The basic conceptions of creating an inductive method of objects clustering on the basis of the Group Method of Data Handling are described in the papers [2-4]. Further development of this theory is reflected in [5, 6]. The conception of objective cluster analysis is presented in [4] and has been further developed in [7-9]. The authors define the basic principles of creating an objective clustering inductive model, show the ways and perspectives of its implementation, and define the advantages of the clustering inductive model in comparison with traditional data clustering methods. Theoretical developments for the implementation of biclustering methods in systems of inductive modeling of complex processes are presented in [10]. However, it should be noted that, in spite of the successful results achieved in this area, an objective clustering model based on the analysis of clustering systems has no practical realization at the present time.

The unsolved parts of the general problem are the absence of effective algorithms for division of the initial data set into two equal-power subsets and of an integrated criterial approach for evaluating the clustering efficiency during their sequential enumeration.

The aim of the paper is the development of an inductive model of objective clustering of objects based on the k-means clustering algorithm and the evaluation of the stability of the algorithm's operation using noisy data with different noise levels.

3 MATERIALS AND METHODS

According to the above concept of complex systems inductive modeling, the first step of data processing is data normalization. Data normalization was carried out for all columns by formula (1):

$$x'_{ij} = \frac{x_{ij} - med_j}{\max_i \left| x_{ij} - med_j \right|}. \quad (1)$$

The choice of this normalization method is determined by the fact that, as a result, the features in all columns have the same median, with the variation range bounded by -1 and 1; herewith, the amount of data in each column that falls into the interquartile range (50%) differs insignificantly.
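As an illustration, a minimal sketch of how normalization (1) could be implemented is given below (in Python with NumPy; the authors' own simulation was performed in R). The function name and the random test data are illustrative only.

```python
import numpy as np

def normalize_median(X):
    """Normalize each column to the range [-1, 1] around its median, as in formula (1)."""
    X = np.asarray(X, dtype=float)
    med = np.median(X, axis=0)                 # med_j, the median of column j
    spread = np.max(np.abs(X - med), axis=0)   # max_i |x_ij - med_j|
    spread[spread == 0] = 1.0                  # guard against constant columns
    return (X - med) / spread

# illustrative usage on random data of the same shape as the "Seeds" matrix (210 x 7)
X_norm = normalize_median(np.random.rand(210, 7))
print(X_norm.min(axis=0), X_norm.max(axis=0))  # every column lies within [-1, 1]
```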

The algorithm for division of the original set of objects Q into two equal-power non-intersecting subsets Q^A and Q^B consists of the following steps [4, 9]:

1. Calculation of the n(n - 1)/2 pairwise distances between the objects in the original sample of data;

2. Allocation of the pair of objects X*, X_p, the distance between which is minimal:

$$d(X^*, X_p) = \min_{i,j} d(X_i, X_j);$$

3. Distribution of the object X* to subset Q^A and of the object X_p to subset Q^B;

4. Repetition of steps 2 and 3 for the remaining objects. If the number of objects is odd, the last object is allocated to both subsets.
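The following sketch illustrates one possible implementation of steps 1-4 (Python with NumPy/SciPy). The function name and the use of the Euclidean metric are assumptions made for illustration; the paper does not fix the distance measure for this step.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def split_into_two_subsets(X):
    """Divide the objects into two equal-power subsets Q^A and Q^B by repeatedly
    taking the closest remaining pair and sending one object to each subset (steps 1-4)."""
    X = np.asarray(X, dtype=float)
    D = squareform(pdist(X))          # the n*(n-1)/2 pairwise distances, as a square matrix
    np.fill_diagonal(D, np.inf)
    remaining = set(range(len(X)))
    idx_a, idx_b = [], []
    while len(remaining) >= 2:
        rem = sorted(remaining)
        sub = D[np.ix_(rem, rem)]
        i, j = np.unravel_index(np.argmin(sub), sub.shape)
        idx_a.append(rem[i])          # the closest pair among the remaining objects
        idx_b.append(rem[j])
        remaining -= {rem[i], rem[j]}
    if remaining:                     # odd number of objects: the last one goes to both subsets
        last = remaining.pop()
        idx_a.append(last)
        idx_b.append(last)
    return X[idx_a], X[idx_b]
```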

The approach outlined in [7] was taken as the basis for calculating the external balance criterion. The optimality criterion of regulated clustering was determined as the minimum of the sum of squared deviations between the mass centers of the corresponding clusters in the different clusterings (2):

$$CQ(Q, R) = \sum_{k=1}^{q} \left( C_k(Q) - C_k(R) \right)^2 \to \min. \quad (2)$$

The mass center of cluster k in clustering Q was determined as the average of the attribute vectors of the objects in this cluster (3):

$$C_k(Q) = \frac{1}{n} \sum_{i=1}^{n} x_{ij}, \quad j = 1, \dots, m, \quad (3)$$

where the summation is taken over the n objects belonging to cluster k.

The absolute value of this criterion can be calculated for the m-dimensional feature space as follows (4):

$$CQ(Q, R) = \sum_{j=1}^{m} \sum_{k=1}^{q} \left( C_k(Q) - C_k(R) \right)^2 \to \min. \quad (4)$$

In the case of criterion normalization the formula (4) takes the form (5):

$$CQN(Q, R) = \sum_{j=1}^{m} \frac{\sum_{k=1}^{q} \left( C_k(Q) - C_k(R) \right)^2}{\sum_{k=1}^{q} \left( C_k(Q) + C_k(R) \right)^2} \to \min. \quad (5)$$

To create equal conditions for the subsets Q^A and Q^B when using the k-means clustering algorithm, the same values are assigned to the centers of the corresponding clusters of the different clusterings at the initialization phase. The initial value of criterion (4) is zero in this case. The experiment has shown that on the subsequent iterations the criterion value increases at the first step, and then it varies monotonically to reach saturation, which corresponds to a sustainable clustering for the two equal-power subsets. The relative change of criterion (4) on two successive iterations vanishes in this case. Thereby, the external balance criterion can be represented as follows (6, 7):

$$CQN(Q, R) = \sum_{j=1}^{m} \frac{\sum_{k=1}^{q} \left( C_k(Q) - C_k(R) \right)^2}{\sum_{k=1}^{q} \left( C_k(Q) + C_k(R) \right)^2} \to opt, \quad (6)$$

$$\frac{CQN_{i+1}(Q, R) - CQN_i(Q, R)}{CQN_i(Q, R)} \to 0. \quad (7)$$
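A compact sketch of criteria (6) and (7) could look as follows (Python with NumPy). It assumes that the cluster centers of the two clusterings are stored as q x m arrays and that corresponding clusters share the same index, which follows from the identical initialization described above; the function names and the tolerance eps are illustrative.

```python
import numpy as np

def balance_criterion(centers_q, centers_r):
    """Normalized external balance criterion CQN, formula (6): for every feature j,
    the sum over clusters of squared center differences is divided by the sum of
    squared center sums, and the ratios are summed over the m features."""
    cq = np.asarray(centers_q, dtype=float)   # shape (q clusters, m features)
    cr = np.asarray(centers_r, dtype=float)
    num = np.sum((cq - cr) ** 2, axis=0)      # per-feature sums over the clusters
    den = np.sum((cq + cr) ** 2, axis=0)
    return float(np.sum(num / den))

def criterion_saturated(cqn_prev, cqn_curr, eps=1e-4):
    """Stopping rule, formula (7): the relative change of CQN on two successive
    iterations vanishes (eps is an assumed tolerance)."""
    if cqn_prev == 0.0:
        return False                          # first iterations, the criterion is still growing
    return abs(cqn_curr - cqn_prev) / cqn_prev < eps
```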

The scheme of the inductive cluster analysis model based on the k-means algorithm is shown in Fig. 1. The implementation of this algorithm assumes the following steps:

Step 1. Formation of the initial set Q of objects. Data preprocessing (filtration and normalization). Presentation of the data as an n x m matrix;

Step 2. Division of the set Q into two equal-power subsets in accordance with the algorithm described above. These subsets Q^A and Q^B can be formally represented as follows:

$$Q^A = \{ x_i \}, \; i = 1, \dots, n^A; \quad Q^B = \{ x_j \}, \; j = 1, \dots, n^B; \quad n^A = n^B, \; n^A + n^B = n.$$


Figure 1 - Scheme of the inductive cluster analysis model based on the k-means clustering algorithm

Step 3. Setup of the clustering procedure using the k-means algorithm. Choice of the number of clusters and setting of the initial cluster centers;

Step 4. Sequential calculation of the Euclidean distances from the objects to the cluster centers for the two clusterings. Distribution of the objects into clusters in accordance with the condition $d(X_i, C_k) \to \min$;

Step 5. Calculation of the new cluster centers by formula (3);

Step 6. Calculation of the external balance criteria by formulas (6) and (7);

Step 7. Fixation of the obtained clustering when conditions (6) and (7) are satisfied. Otherwise, if the current number of iterations is less than the maximum, go to Step 4.
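Putting Steps 3-7 together, an end-to-end sketch of the inductive k-means loop might look as follows (Python with NumPy/SciPy; the criterion of formula (6) is re-implemented inline for self-containment). The function names, the random choice of the shared initial centers and the stopping tolerance eps are assumptions made for illustration, not the authors' reference implementation.

```python
import numpy as np
from scipy.spatial.distance import cdist

def kmeans_step(X, centers):
    """One k-means iteration (Steps 4-5): assign objects to the nearest
    center by Euclidean distance, then recompute the cluster centers."""
    labels = cdist(X, centers).argmin(axis=1)
    new_centers = np.vstack([
        X[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
        for k in range(len(centers))
    ])
    return labels, new_centers

def cqn(cq, cr):
    """Normalized external balance criterion, formula (6); cq, cr are q x m center arrays."""
    return float(np.sum(np.sum((cq - cr) ** 2, axis=0) / np.sum((cq + cr) ** 2, axis=0)))

def inductive_kmeans(Xa, Xb, k, max_iter=100, eps=1e-4, seed=0):
    """Cluster the two equal-power subsets in parallel from identical initial
    centers (Step 3) and stop when the relative change of CQN vanishes (Steps 6-7)."""
    rng = np.random.default_rng(seed)
    centers = Xa[rng.choice(len(Xa), size=k, replace=False)]
    ca, cb = centers.copy(), centers.copy()          # shared initialization, CQN = 0 at start
    prev = None
    for _ in range(max_iter):
        la, ca = kmeans_step(Xa, ca)
        lb, cb = kmeans_step(Xb, cb)
        curr = cqn(ca, cb)
        if prev not in (None, 0.0) and abs(curr - prev) / prev < eps:
            break                                    # condition (7) is satisfied
        prev = curr
    return la, lb, ca, cb, curr
```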

4 EXPERIMENTS

Approbation of the proposed model was carried out using the "Aggregation" and "Compound" data from the database of the School of Computing of the University of Eastern Finland [11]. Estimation of the stability of the algorithm to the noise component was carried out using the "Seeds" data [12], representing measurements of kernels of three kinds of wheat. Each kernel is characterized by seven attributes, and each group includes 70 observations. Thus, the initial data matrix had the size A = {210 x 7}. Data normalization was carried out by formula (1). Then "white noise", the amplitude of which varied from 2.5% to 50% of the maximum of the data scatter, was added to the data. Evaluation of the quality of the algorithm's operation was performed by counting the number of correctly grouped objects. In order to compare the results, this problem was also solved using the classical k-means algorithm, the fuzzy c-means algorithm and the agglomerative hierarchical algorithm. The simulation was performed in the R software environment.
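A possible sketch of the noise experiment is given below (Python with NumPy). Interpreting the noise "amplitude" as a standard deviation proportional to the per-feature scatter, and matching clusters to true classes by majority vote, are assumptions, since the paper does not specify these details.

```python
import numpy as np

def add_white_noise(X, level, rng=None):
    """Add zero-mean Gaussian noise whose standard deviation is `level`
    (e.g. 0.025 ... 0.5) of the per-feature scatter (max - min) of the data."""
    if rng is None:
        rng = np.random.default_rng(0)
    scatter = X.max(axis=0) - X.min(axis=0)
    return X + rng.normal(0.0, 1.0, size=X.shape) * level * scatter

def incorrectly_grouped(labels, true_classes):
    """Count objects that fall outside the majority true class of their cluster;
    true_classes is assumed to be an array of integer class labels (0, 1, 2, ...)."""
    errors = 0
    for c in np.unique(labels):
        members = true_classes[labels == c]
        errors += len(members) - np.bincount(members).max()
    return int(errors)
```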

5 RESULTS

The results of the operation of the algorithm for division of the initial data set into two equal-power subsets are shown in Fig. 2.

Fig. 3 shows the charts of the external balance criteria calculated by formulas (6) and (7) as a function of the number of iterations for the investigated data; herewith, 4 clusters were assigned for the "Compound" data and 7 for the "Aggregation" data. The results of the division of the studied objects into clusters are presented in Fig. 4. Fig. 5 shows the boxplots of the unnormalized and normalized data. The charts of incorrectly distributed objects depending on the level of the noise component for the different clustering algorithms are shown in Fig. 6.

[Figure 2: scatter plots of the "Compound" and "Aggregation" data and of the subsets Compound_1, Compound_2, Aggregation_1 and Aggregation_2 obtained after the division.]

Figure 2 - Results of the operation of the algorithm for division of the initial data set into two equal-power subsets

[Figure 3: A) plot of the CQ criterion; B) plot of the dCQ/CQ criterion versus the number of iterations.]

Figure 3 - Charts of the external balance criteria for different numbers of iterations ("Compound" data, "Aggregation" data)

Figure 4 - Results of objective clustering inductive model operation

[Figure 5: boxplots of features V1-V7 before and after normalization.]

Figure 5 - Boxplots of unnormalized and normalized "Seeds" data

[Figure 6: number of incorrectly distributed objects versus the level of noise (%), for the agglomerative, k-means, c-means and inductive k-means algorithms.]

Figure 6 - Charts of incorrectly distributed objects depending on the level of the noise component for the different clustering algorithms

6 DISCUSSION

The analysis of Fig. 2 allows us to conclude that the algorithm operates with high efficiency. The obtained subsets have a similar structure, with a lower density of the objects' distribution in the feature space. Fig. 3 shows that the relative change of the external balance criterion reaches zero at 4 and 11 iterations for the "Aggregation" and "Compound" data, respectively. Therefore, the corresponding clusterings are optimal at these levels in terms of the applied criteria. As can be seen from Fig. 4, the objects of the "Compound" data were divided into clusters adequately. The low percentage of incorrect assignments can be explained by the nature of the distribution; however, the surface separating the clusters is rather distinct. The same conclusion can be drawn from the analysis of the "Aggregation" data. In this case some intersection of clusters can be observed; however, the algorithm has divided the objects into clusters by fitting a surface in the feature space. The analysis of Fig. 6 allows us to conclude that the inductive clustering model based on the k-means algorithm gives better results of dividing the objects into clusters compared to the classical k-means algorithm and the fuzzy c-means algorithm. The inductive clustering algorithm is more stable than the k-means and fuzzy c-means algorithms for noise levels up to 10%. With a further increase of the noise level up to 40%, the number of incorrectly distributed objects fluctuates in both directions. A further increase of the noise level leads to a monotonic increase of the number of falsely identified objects. However, it should be noted that in this case the agglomerative hierarchical clustering algorithm showed high efficiency of operation and the best stability to noise. This can be explained by the nature of this algorithm: the profiles of objects, or the centers of clusters, are compared using the Euclidean distance during the clustering, and the presence of noise has no significant effect on the result of the profile comparison when the objects' profiles in different clusters differ significantly. Moreover, an advantage of this algorithm is its independence from the initial choice of cluster centers, because the number of clusters in the initial state equals the number of objects studied. In this case the choice of the optimal clustering is the main problem, because the analysis of the dendrogram does not allow one to draw a conclusion about the clustering quality at the chosen level. Therefore, the creation of a hybrid inductive model of objective clustering based on the agglomerative hierarchical clustering algorithm is reasonable.

CONCLUSION

The hybrid model of objects clustering based on the methods of complex systems inductive modeling and the k-means clustering algorithm is presented in the article. The methodology of inductive modeling for choosing the optimal clustering during the model operation through the implementation of an objective criterial approach has been further developed. The "Compound" and "Aggregation" data from the database of the School of Computing of the University of Eastern Finland and the "Seeds" data, representing measurements of kernels of three kinds of wheat, were used as the experimental data. The algorithm for division of the initial data set into two equal-power subsets, which are then used in the data clustering inductive model, has been further developed and practically implemented. The implementation of the proposed model was carried out in the R software environment. The simulation results showed the high efficiency of the proposed model. The algorithm distributed the objects into the corresponding clusters adequately. No intersection of clusters was observed in the case of the optimal distribution. The simulation of the model operation using the "Seeds" data with different noise levels was performed to estimate the model's stability to different levels of data noise. The level of noise was changed from 2.5% to 50% of the maximum data variation. Clustering using the proposed inductive clustering model, the classical k-means algorithm, the fuzzy c-means algorithm and the agglomerative hierarchical clustering algorithm was carried out to compare the results of the experiment. The results of the simulation have shown better quality and stability of the inductive k-means algorithm compared to the classical k-means and fuzzy c-means algorithms. However, the agglomerative hierarchical clustering algorithm has shown the best results in terms of clustering quality and stability to noise. Therefore, the creation of a hybrid inductive model of objective clustering based on the agglomerative hierarchical clustering algorithm is the subject of the authors' further research.

ACKNOWLEDGEMENTS

The work was carried out within the framework of the state-budget scientific research theme of Kherson National Technical University "Synthesis of Hybrid Evolutionary Algorithms and Methods for Modeling of Gene Regulatory Networks" (State Registration Number: 0116U002840).

REFERENCES

1. Ivahnenko O. G. Metod grupovogo urahuvannya argumentiv - konkurent metodu stohastichnoyi aproksimatsiyi, Avtomatika, 1968, No. 3, pp. 58-72.
2. Ivahnenko A. G. Induktivnyj metod samoorganizacii modelej slozhnyh sistem. Kiev, Naukova dumka, 1982, 296 p.
3. Ivahnenko A. G. Objektivnaja klasterizacija na osnove teorii samoorganizacii modelej, Avtomatika, 1987, No. 5, pp. 6-15.
4. Madala H. R., Ivakhnenko A. G. Inductive Learning Algorithms for Complex Systems Modeling. CRC Press, 1994, 365 p.
5. Stepashko V. S. Teoreticheskie aspekty MGUA kak metoda induktivnogo modelirovanija, Upravljajushhie sistemy i mashiny (USiM), 2003, No. 2, pp. 31-38.
6. Stepashko V. S. Elementi teoriyi induktivnogo modelyuvannya, in: Stan ta perspektivi rozvitku informatiki v Ukrayini: monografiya. Kiev, Naukova dumka, 2010, 1008 p., pp. 471-486.
7. Osipenko V. V. Dva pidhodi do rozv'yazannya zadachi klasterizatsiyi u shirokomu sensi z pozitsiy induktivnogo modelyuvannya, Energetika i Avtomatika, 2014, No. 1, pp. 83-97.
8. Osypenko V. V., Reshetjuk V. M. The Methodology of Inductive System Analysis as a Tool of Engineering Researches Analytical Planning, Ann. Warsaw Univ. Life Sci. - SGGW, 2011, No. 58, pp. 67-71.
9. Sarycheva L. V. Objektivnyj klasternyj analiz dannyh na osnove metoda gruppovogo ucheta argumentov, Problemy upravlenija i avtomatiki, 2008, No. 2, pp. 86-104.
10. Babichev S., Osypenko V., Taif M. A., Lytvynenko V. The Using of Biclustering Techniques in Inductive Modeling Systems of Biological Processes, Inductive modeling of complex systems, 2015, No. 7, pp. 5-14.
11. https://cs.joensuu.fi/sipu/datasets/
12. http://archive.ics.uci.edu/ml/datasets/seeds

Article was submitted 15.08.2016.

After revision 09.06.2016.


