A method for the optimization of fuzzy classification knowledge bases using improving transformations in the form of solutions of fuzzy relational equations is proposed. Improving transformations allow formalizing the process of generating and selecting the variants of a fuzzy knowledge base by the "accuracy - complexity" criteria, which simplifies the tuning process.
Keywords: optimization of fuzzy knowledge bases, min-max clustering, solution of fuzzy relational equations
UDC 681.5.015:007
DOI: 10.15587/1729-4061.2017.110261
OPTIMIZATION OF FUZZY CLASSIFICATION KNOWLEDGE BASES USING IMPROVING TRANSFORMATIONS
H. Rakytyanska
PhD, Associate Professor Department of software design Vinnytsia National Technical University Khmelnytske shose str., 95, Vinnytsia, Ukraine, 21021 E-mail: h [email protected]
1. Introduction
Fuzzy classification knowledge base design is carried out according to the criteria of accuracy, complexity, and interpretability. The design criteria are provided by gradual transformations of the initial model. In the theory of
defect-free design of human-machine systems [1, 2], formalization of such transformations is achieved by the use of improving transformations.
Then improving transformations correspond to the addition (removal) of output classes, input terms, and rules. Improving transformations allow formalization of the process
of generating the fuzzy knowledge base variants with the simultaneous establishment of control variables that affect the accuracy and complexity of the model. At the same time, the system of fuzzy logic equations follows from the description of the tuning process in the language of V. M. Glush-kov algorithmic algebras [3].
In practice, fuzzy classification knowledge bases are tuned by generating candidate rules with a subsequent selection of rules. Improving transformations are called formalized for a given granularity [4], when the required number of input terms can be obtained already at the stage of generating the candidate rules. In this case, only the rule weights are sufficient for selection. However, such an approach provides lower inference accuracy. In general, the knowledge base optimization problem is a problem of granular min-max clustering [5], where the trade-off between the design criteria is achieved by the level of granularity of the interval rules (hyperboxes).
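To make the notion of an interval rule as a hyperbox concrete, below is a minimal sketch (not taken from the paper) of hyperbox membership in the spirit of min-max clustering; the two-input rule, its bounds, and the linear decay with slope gamma outside the box are illustrative assumptions.

```python
# Minimal sketch: an interval rule treated as a hyperbox, as in granular
# min-max clustering. The decay slope gamma outside the box is an illustrative
# assumption, not the membership function used in the paper.
from dataclasses import dataclass
from typing import List

@dataclass
class Hyperbox:
    lower: List[float]   # lower bounds of the interval rule, one per input
    upper: List[float]   # upper bounds of the interval rule
    label: str           # output class assigned to the rule

    def membership(self, x: List[float], gamma: float = 0.05) -> float:
        # full membership inside every projection, linear decay outside;
        # the rule fires to the degree of its least satisfied projection
        degrees = []
        for xi, lo, hi in zip(x, self.lower, self.upper):
            if lo <= xi <= hi:
                degrees.append(1.0)
            else:
                dist = lo - xi if xi < lo else xi - hi
                degrees.append(max(0.0, 1.0 - gamma * dist))
        return min(degrees)

# hypothetical two-input rule: x1 in [0, 25], x2 in [30, 40] -> class d1
box = Hyperbox(lower=[0.0, 30.0], upper=[25.0, 40.0], label="d1")
print(box.membership([10.0, 35.0]))  # 1.0: the point lies inside the hyperbox
print(box.membership([10.0, 60.0]))  # 0.0: far outside along x2
```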
2. Literature review and problem statement
Candidate rules for the min-max clustering problem [5] are generated by means of multi-objective genetic algorithms [6, 7] or min-max neural networks [8, 9]. Dimensions of hyperboxes are selected until the required inference accuracy is achieved [6-9]. The preliminary partition is determined by the boundaries of triangular membership functions of the candidate terms. However, the initial model is redundant: the methods of candidate rule generation guarantee neither the optimum number of rules nor the optimum granularity of the input variables.
The number of terms and rules is determined during the multi-objective evolutionary selection [10-12]. The purpose of selection is to obtain a simplified and interpretable knowledge base. Selection is implemented by choosing the best configuration of terms and rules of the zero option. The fitness function is constructed on the basis of granularity measures, which allow estimating the redundancy among fuzzy rules [13]. Linguistic interpretation of the solutions obtained requires the adjustment of preference relations between the candidate terms [14, 15]. The criteria for the selection of terms are the level of detail and the maximum size of the hyperbox [16]. The selection process becomes more complicated as the number of criteria grows, in particular, when the rule length is taken into account [12]. In addition to the weights of rules, such selection requires indicators of the presence or absence of each term in each rule.
The considered approach is characterized by high computational complexity caused by the growing number of rules. Both at the stage of candidate rule generation and at the selection stage, the control variables are the membership function parameters in each rule. The proposed method consists in replacing fragments of the tuning algorithm with other fragments that provide the required accuracy at a lower cost. Improving transformations of fuzzy classification knowledge bases are based on solving fuzzy relational equations [17, 18]. The following improving transformations are proposed: transition to a composite model for selecting the number of output classes and rules [19-21]; transition to a relational model for selecting the number of input terms [22, 23].
The composite transformation replaces the fragment of candidate rule generation. The problem of min-max clustering is solved by generating composite rules in the form of interval solutions of the trend system of equations [19-21]. The number of rules in a class is determined by the number of solutions, and the granularity is determined by the intervals of values of the input variables in the rules. The set of minimum solutions provides the minimum rule length [17, 18]. Simplification of the candidate rule generation process is achieved through the sequential generation of solutions of the trend equation system.
The relational transformation replaces the selection fragment. Since the number of rules is already known, the selection of terms consists in the maximum approximation to the partition by interval solutions of the trend equation system. Linguistic interpretation of the resulting solutions is performed by the relational partition of the space of input variables [22, 23]. The level of detail and the density of coverage are determined by the "input terms - output classes" relational matrix, and the dimensions of hyperboxes are adjusted using triangular membership functions. Simplification of the selection process is achieved through a concise presentation of rules in the form of a relational matrix.
3. The aim and objectives of the study
The aim of the work is to develop a method for optimization of fuzzy classification knowledge bases using improving transformations. The method should provide the construction of accurate, compact and interpretable knowledge bases. At the same time, consistent use of composite and relational models shall provide design process simplification.
To achieve this aim, the following objectives were accomplished:
- to develop logic-algorithmic models of improving transformations;
- to develop a genetic algorithm for the fuzzy knowledge base optimization.
4. Models and method of fuzzy classification knowledge base optimization
4. 1. Logic-algorithmic models of improving transformations
Algorithms for tuning the fuzzy classification knowledge base for the object y = f(x_1, ..., x_n) are described below in the language of V. M. Glushkov algorithmic algebras [3].
The linear operator structure corresponds to the algorithm of tuning by means of an expert (without tuning to the experimental data):
- for the zero option
$$A_0 = L_1^0\, L_2^0, \qquad (1)$$
- for the composite model
$$A_1 = L_1^R\, L_2^R\, L_1^C\, L_2^C, \qquad (2)$$
- for the relational model
$$A_2 = L_1^r\, L_2^r. \qquad (3)$$
Here L_1 and L_2 are the operators of tuning the structure and parameters of expert fuzzy rules: L_1^0 and L_2^0 for the zero option; L_1^R, L_2^R and L_1^C, L_2^C for the trend and composite models; L_1^r and L_2^r for the relational model. A_0, A_1, A_2 are the equivalent operators of the linear structure for the zero option, composite and relational models, respectively.
To increase the inference accuracy, the operator of improving by tuning to experimental data is used.
An alternative operator structure corresponds to the partial case of the "accuracy control - tuning" algorithm for the specified output classes:
- for the zero option
$$B_0 = A_0 \underset{\alpha_0}{\left( Y \vee D_1^0 \right)}, \qquad (4)$$

- for the composite model

$$B_1 = A_1 \underset{\alpha_C(\omega_R)}{\left( Y \vee D_1^R D_1^C \right)}, \qquad (5)$$

- for the relational model

$$B_2 = A_2 \underset{\omega_r}{\left( Y \vee D_1^r \right)}. \qquad (6)$$
Here D_1 is the tuning operator in the case of specified output classes: D_1^0 for the zero option; D_1^R and D_1^C for the trend and composite models; D_1^r for the relational model. α and ω are the conditions checked during the control: α, ω = 1 (0) if the rules or relations satisfy (do not satisfy) the inference accuracy requirements; α_0 for the zero option, ω_R and α_C for the trend and composite models, ω_r for the relational model. Y is the identity operator corresponding to the fixation of the tuning results. B_0, B_1, B_2 are the equivalent operators of the alternative structure for the zero option, composite and relational models, respectively.
The iterative operator structure corresponds to the general case of the "control-tuning" algorithm for unknown output classes:
- for the zero option

$$C_0 = A_0 \underset{\alpha_0}{\left\{ D_2^0 \right\}}, \qquad (7)$$

- for the composite model

$$C_1 = A_1 \underset{\alpha_C(\omega_R)}{\left\{ D_2^R D_2^C \right\}}, \qquad (8)$$

- for the relational model

$$C_2 = A_2 \underset{\omega_r}{\left\{ D_2^r \right\}}. \qquad (9)$$
Here D_2 is the tuning operator in the case of unknown output classes: D_2^0 for the zero option; D_2^R and D_2^C for the trend and composite models; D_2^r for the relational model. C_0, C_1, C_2 are the equivalent operators of the iterative structure for the zero option, composite and relational models, respectively.
It is assumed that the tuning operator D_2 is executed until the conditions α and ω become true. The truth of the conditions α and ω is determined by the correctness of knowledge base construction. The condition, an indicator of the correctness of knowledge base construction, is described by a series-parallel logical structure:
- for the zero option [18]

$$\alpha_0 = \bigvee_{p=\overline{1,z_j^0}} \left[ \bigwedge_{i=\overline{1,n}} \alpha_i^{jp}\,(x_i = a_i^{jp}) \right] \rightarrow y = d_j, \quad j = \overline{1, m_0}, \qquad (10)$$

- for the composite model [19, 20]

$$\omega_R = \bigwedge_{i=\overline{1,n}} \left[ \bigvee_{l=\overline{1,k_i}} \omega_{il}^{\,j}\,(x_i = c_{il}) \right] \rightarrow y = E_j, \quad j = \overline{1, M},$$

$$\alpha_C(\omega_R) = \bigvee_{p=\overline{1,z_j}} \left[ \bigwedge_{i=\overline{1,n}} \alpha_i^{jp}\,(x_i = a_{il}^{\,s}) \right] \rightarrow y = d_j, \quad j = \overline{1, m}, \qquad (11)$$

- for the relational model [23]

$$\omega_r = \bigwedge_{i=\overline{1,n}} \left[ \bigvee_{l=\overline{1,k_i}} \bigvee_{s=\overline{1,g_{il}}} \omega_{il}^{\,s}\,(x_i = a_{il}^{\,s}) \right] \rightarrow y = d_j, \quad j = \overline{1, m}. \qquad (12)$$

Here d_j, E_j and d_j are the output classes of the zero option, trend and composite models, respectively; m_0, M and m are the numbers of output classes; α_i and ω_i are the micro conditions of correctness of the terms of the variable x_i in rules and relations: α_i^{jp} for the term a_i^{jp} in the rule jp of the class d_j of the zero option; ω_{il}^{j} for the initial term c_{il} in the relation c_{il} × E_j of the trend model; α_i^{jp} for the linguistic modification a_{il}^{s} of the term c_{il} in the rule jp of the class d_j of the composite model; ω_{il}^{s} for the term a_{il}^{s} in the relation a_{il}^{s} × d_j of the relational model; z_j^0 and z_j are the numbers of rules in a class of the zero option and of the composite model; k_i and g_{il} are the numbers of primary terms and linguistic modifiers used to evaluate the variable x_i.
The parameters of improving operators are the number of input terms, output classes, rules, and parameters of membership functions in logical conditions (10)-(12).
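For readers less familiar with the algebra of algorithms, the sketch below illustrates, with placeholder operator names, how the linear, alternative and iterative structures (1)-(9) can be read as sequencing, condition-controlled branching and repetition of tuning operators; it is an explanatory analogy, not the paper's implementation.

```python
# Illustrative sketch of the three operator structures of Glushkov's algorithmic
# algebra used above: linear (sequence), alternative (condition-controlled branch)
# and iterative (repeat until the correctness condition holds). Operator and
# condition names are placeholders, not the paper's implementation.
from typing import Callable

KB = dict                      # a knowledge-base variant, represented loosely
Op = Callable[[KB], KB]        # a tuning operator
Cond = Callable[[KB], bool]    # a correctness condition (alpha / omega)

def linear(*ops: Op) -> Op:
    """A = L1 L2 ...: apply tuning operators in sequence."""
    def run(kb: KB) -> KB:
        for op in ops:
            kb = op(kb)
        return kb
    return run

def alternative(cond: Cond, tune: Op) -> Op:
    """B = A (Y v D1) under cond: keep the result (identity Y) if the
    accuracy condition holds, otherwise apply the tuning operator D1."""
    return lambda kb: kb if cond(kb) else tune(kb)

def iterative(cond: Cond, tune: Op, max_iter: int = 100) -> Op:
    """C = A {D2} under cond: repeat D2 until the condition becomes true."""
    def run(kb: KB) -> KB:
        for _ in range(max_iter):
            if cond(kb):
                break
            kb = tune(kb)
        return kb
    return run
```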
The stages of generation and selection of the zero option rules are described by the same operator and logical structures. Then, as a result of improving transformations, we obtain variants of the fuzzy knowledge base:
- without tuning to experimental data
$$A_0 = A_1 A_2,$$
- with tuning to data for the specified output classes
$$B_0 = B_1 B_2,$$
- with tuning to data for unknown output classes
$$C_0 = C_1 C_2.$$
In this case, the condition of correctness of knowledge base construction takes the form:
$$\alpha_0 = \omega_R\, \alpha_C\, \omega_r.$$
The rules of improving transformations (2)-(12) allow representing the generation of the knowledge base variants as inference in a formal grammar [1]:
$$G = \langle V_t, V_n, S_0, P \rangle,$$

where V_t is the set of operator and logical functional units (terminals), {L_1^0, L_2^0, L_1^R, L_2^R, L_1^C, L_2^C, L_1^r, L_2^r, Y} and {α_i^{jp}}; V_n is the set of operator and logical functional structures (nonterminals), {A_0, B_0, C_0} and {α_0}; S_0 is the zero option of a fuzzy knowledge base; P is the set of improving transformations: {D_1^0, D_2^0, D_1^R, D_2^R, D_1^C, D_2^C, D_1^r, D_2^r} and {ω_{il}^{j}, α_i^{jp}, ω_{il}^{s}} for operators and logical conditions; {A_1, A_2, B_1, B_2, C_1, C_2} and {ω_R, α_C, ω_r} for operator and logical structures.
Fuzzy logic equations follow directly from the algorithmic descriptions (10)-(12).
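As a toy illustration of treating the generation of knowledge base variants as inference in the grammar G, the sketch below expands the nonterminal A_0 into a string of terminal operator units; the production set is deliberately abbreviated and the symbol names are only mnemonic.

```python
# Toy sketch: generating a knowledge-base variant as a derivation in the grammar
# G = <Vt, Vn, S0, P>. Only a few productions are encoded, for illustration.
productions = {
    "A0": ["A1", "A2"],                  # without tuning to experimental data
    "A1": ["L1R", "L2R", "L1C", "L2C"],  # composite model: trend + composite tuning
    "A2": ["L1r", "L2r"],                # relational model tuning
}
terminals = {"L1R", "L2R", "L1C", "L2C", "L1r", "L2r"}

def derive(symbol):
    """Expand a nonterminal into a sequence of terminal operator units."""
    if symbol in terminals:
        return [symbol]
    expansion = []
    for s in productions[symbol]:
        expansion.extend(derive(s))
    return expansion

print(derive("A0"))  # ['L1R', 'L2R', 'L1C', 'L2C', 'L1r', 'L2r']
```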
4. 2. Problems of fuzzy classification knowledge base optimization
We denote the set of input terms of the trend, composite and relational models as:
$\{c_{11}, \ldots, c_{1k_1}, \ldots, c_{n1}, \ldots, c_{nk_n}\} = \{T_1, \ldots, T_N\}$,

$\{a_1, \ldots, a_q\}$

and

$\{a_{11}, \ldots, a_{1g_1}, \ldots, a_{N1}, \ldots, a_{Ng_N}\} = \{t_1, \ldots, t_K\}$,

where N, q and K are the numbers of input terms, $N = k_1 + \ldots + k_n$, $K = g_1 + \ldots + g_N$. Let

$\underline{Y}_E = (\underline{y}^{E_1}, \ldots, \underline{y}^{E_M})$, $\overline{Y}_E = (\overline{y}^{E_1}, \ldots, \overline{y}^{E_M})$

and

$\underline{Y}_d = (\underline{y}^{d_1}, \ldots, \underline{y}^{d_m})$, $\overline{Y}_d = (\overline{y}^{d_1}, \ldots, \overline{y}^{d_m})$

be the vectors of the lower (upper) boundaries of the output classes E_j and d_j;

$\underline{B}_T = (\underline{\beta}^{T_1}, \ldots, \underline{\beta}^{T_N})$, $\overline{B}_T = (\overline{\beta}^{T_1}, \ldots, \overline{\beta}^{T_N})$,

$\underline{B}_a = (\underline{\beta}^{a_1}, \ldots, \underline{\beta}^{a_q})$, $\overline{B}_a = (\overline{\beta}^{a_1}, \ldots, \overline{\beta}^{a_q})$

and

$\underline{B}_t = (\underline{\beta}^{t_1}, \ldots, \underline{\beta}^{t_K})$, $\overline{B}_t = (\overline{\beta}^{t_1}, \ldots, \overline{\beta}^{t_K})$

be the vectors of the lower (upper) boundaries of triangular membership functions of fuzzy input terms of the trend, composite and relational models;

$R \subseteq c_{il} \times E_j = [r_{lj}]$ and $P \subseteq a_{il}^{s} \times d_j = [p_{kj}]$ be the "input terms - output classes" fuzzy relational matrices before and after linguistic modification.
Then the solution vectors are:
$Y_1 = (M, N, m, q, \underline{Y}_E, \overline{Y}_E, \underline{Y}_d, \overline{Y}_d, R, \underline{B}_T, \overline{B}_T, \underline{B}_a, \overline{B}_a)$

and

$Y_2 = (K, m, \underline{Y}_d, \overline{Y}_d, P, \underline{B}_t, \overline{B}_t)$

for the composite and relational transformations, respectively.

We will estimate the fuzzy model quality by the mean square inference error ε(Y) and the complexity by the number of rules Z(Y) [20, 23]. For each improving transformation, the knowledge base optimization problem is formulated in a direct and a dual statement.

Direct statement. Find the vector Y for which Z(Y) → min and ε(Y) ≤ ε̄, where ε̄ is the maximum permissible inference error.

Dual statement. Find the vector Y for which ε(Y) → min and Z(Y) ≤ Z̄, where Z̄ is the maximum permissible number of rules.
4. 3. Genetic algorithm
For the composite transformation, the chromosome contains the elements of the vector Y_1, and for the relational transformation, the elements of the vector Y_2. The number of classes, rules, and terms is determined using arrangement vectors [23]. The elements of arrangement vectors are 1 (0), which means the addition (removal) of classes, rules, and terms.
For the composite transformation, the initial population is generated for all output classes, and for the relational transformation, for all input terms. This ensures the maximum approximation to the partition by interval solutions of the trend equation system. The elements of the partition matrix then determine the degree of coverage of the projections of hyperboxes $[\underline{\beta}, \overline{\beta}]$ by the intervals $[\underline{\beta}^{t_k}, \overline{\beta}^{t_k}]$.
Let us define the fitness function for the solution Y. For the direct problem, we choose the function 1/Z(Y), inverse to the target one. For the dual problem, we choose the function 1 − δ(Y), where δ(Y) = ε(Y)/(ȳ − y̲) is the relative mean square error.
The constraints of the optimization problems are taken into account by penalizing the chromosomes, so the fitness function combines the target and penalty functions. According to [2], when a chromosome leaves the admissible area, it is penalized by reducing the value of its target function. The amount of the penalty depends on the distance to the zone of permissible solutions and reflects the quality of the current population.
Thus, the fitness function for the solution Y has the form [2]:
- for the direct problem

$$F'(Y) = \begin{cases} \dfrac{1}{Z(Y)}, & \text{if } Y \text{ is a permissible solution}, \\ \dfrac{1}{Z(Y)}\left(1 - Q'(Y)\right), & \text{otherwise}; \end{cases}$$

- for the dual problem

$$F''(Y) = \begin{cases} 1 - \delta(Y), & \text{if } Y \text{ is a permissible solution}, \\ \left(1 - \delta(Y)\right)\left(1 - Q''(Y)\right), & \text{otherwise}, \end{cases}$$

where Q'(Y) ∈ [0, 1] and Q''(Y) ∈ [0, 1] are penalty functions that take into account the violation of the constraints of the direct and dual optimization problems by the chromosome Y.

The penalty function is defined as the relation [2]:
$$Q'(Y) = \frac{\Delta\varepsilon(Y)}{\Delta\varepsilon_{\max}} \quad \text{or} \quad Q''(Y) = \frac{\Delta Z(Y)}{\Delta Z_{\max}}, \qquad (13)$$

where Δε(Y) = ε(Y) − ε̄ or ΔZ(Y) = Z(Y) − Z̄ is the deviation from the permissible value for the solution Y; Δε_max or ΔZ_max is the maximum deviation in the current population.
Penalties (13) vary for the same violations of constraints in different populations of chromosomes. An adaptation of penalties (13) to the population quality allows distancing
high-quality solutions from low-quality ones during the selection [2]. Following [2], it is expedient to use tournament selection, which does not require tuning the penalty coefficients to take into account constraints.
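A compact sketch of how the fitness functions with the adaptive penalties (13) and tournament selection could be wired together is given below; the chromosome encoding, the error estimator ε(Y) and the complexity estimator Z(Y) are left as placeholders, so this is an illustration of the penalty logic rather than the full algorithm.

```python
# Sketch of the fitness evaluation with adaptive penalties (13) and tournament
# selection. The chromosome encoding and the error/complexity estimators are
# placeholders; only the penalty logic follows the formulas in the text.
import random

def fitness_direct(Z, eps, eps_max_allowed, d_eps_max):
    """Direct problem: minimise Z(Y) subject to eps(Y) <= eps_max_allowed."""
    base = 1.0 / Z
    if eps <= eps_max_allowed:
        return base
    Q = (eps - eps_max_allowed) / d_eps_max if d_eps_max > 0 else 1.0   # penalty (13)
    return base * (1.0 - Q)

def fitness_dual(delta, Z, Z_max_allowed, d_Z_max):
    """Dual problem: minimise the relative error delta(Y) subject to Z(Y) <= Z_max_allowed."""
    base = 1.0 - delta
    if Z <= Z_max_allowed:
        return base
    Q = (Z - Z_max_allowed) / d_Z_max if d_Z_max > 0 else 1.0           # penalty (13)
    return base * (1.0 - Q)

def tournament(population, fitness, k=2):
    """Tournament selection: the fitter of k randomly drawn chromosomes wins."""
    return max(random.sample(population, k), key=fitness)
```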
5. Example: time series forecasting
The problem of forecasting the number of comments after the publication of a post in a social network is considered [24]. Experimental data were obtained from [25].
Input parameters: x1-x3 are the numbers of comments for the last, the penultimate and the first day of observation, respectively, x1-x3 ∈ [0, 250]; x4 is the observation window, x4 ∈ [1, 72] hours; x5 is the forecasting time-frame (H), x5 ∈ [1, 24] hours. The output parameter y is the number of comments in the next H hours, y ∈ [0, 250].
The indicator of the model accuracy is P_top10, the probability of a correct forecast of the top 10 posts that will receive the largest number of comments. The task was to find a variant of the knowledge base that provides Z → min and P_top10 ≥ 0.7 in the direct statement; P_top10 → max and Z ≤ 30 in the dual statement.
As a result of the relational transformation, the number of primary input and output terms for the direct and dual problems was M = 3, k_1 - k_5 = 3.
The system of fuzzy relational equations for rule generation has the form:
$$\begin{aligned}
\mu^{E_1} ={}& [(\mu^{c_{11}} \wedge 0.90) \vee (\mu^{c_{12}} \wedge 0.41) \vee (\mu^{c_{13}} \wedge 0.22)] \wedge \\
& [(\mu^{c_{21}} \wedge 0.99) \vee (\mu^{c_{22}} \wedge 0.89) \vee (\mu^{c_{23}} \wedge 0.62)] \wedge \\
& [(\mu^{c_{31}} \wedge 0.87) \vee (\mu^{c_{32}} \wedge 0.64) \vee (\mu^{c_{33}} \wedge 0.26)] \wedge \\
& [(\mu^{c_{41}} \wedge 0.45) \vee (\mu^{c_{42}} \wedge 0.59) \vee (\mu^{c_{43}} \wedge 0.93)] \wedge \\
& [(\mu^{c_{51}} \wedge 0.95) \vee (\mu^{c_{52}} \wedge 0.46) \vee (\mu^{c_{53}} \wedge 0.30)];
\end{aligned}$$

$$\begin{aligned}
\mu^{E_2} ={}& [(\mu^{c_{11}} \wedge 0.35) \vee (\mu^{c_{12}} \wedge 0.83) \vee (\mu^{c_{13}} \wedge 0.56)] \wedge \\
& [(\mu^{c_{21}} \wedge 0.17) \vee (\mu^{c_{22}} \wedge 0.61) \vee (\mu^{c_{23}} \wedge 0.84)] \wedge \\
& [(\mu^{c_{31}} \wedge 0.23) \vee (\mu^{c_{32}} \wedge 0.89) \vee (\mu^{c_{33}} \wedge 0.47)] \wedge \\
& [(\mu^{c_{41}} \wedge 0.68) \vee (\mu^{c_{42}} \wedge 0.82) \vee (\mu^{c_{43}} \wedge 0.32)] \wedge \\
& [(\mu^{c_{51}} \wedge 0.62) \vee (\mu^{c_{52}} \wedge 0.90) \vee (\mu^{c_{53}} \wedge 0.68)];
\end{aligned}$$

$$\begin{aligned}
\mu^{E_3} ={}& [(\mu^{c_{11}} \wedge 0.11) \vee (\mu^{c_{12}} \wedge 0.50) \vee (\mu^{c_{13}} \wedge 0.74)] \wedge \\
& [(\mu^{c_{21}} \wedge 0.08) \vee (\mu^{c_{22}} \wedge 0.75) \vee (\mu^{c_{23}} \wedge 0.38)] \wedge \\
& [(\mu^{c_{31}} \wedge 0.12) \vee (\mu^{c_{32}} \wedge 0.50) \vee (\mu^{c_{33}} \wedge 0.75)] \wedge \\
& [(\mu^{c_{41}} \wedge 0.92) \vee (\mu^{c_{42}} \wedge 0.46) \vee (\mu^{c_{43}} \wedge 0.12)] \wedge \\
& [(\mu^{c_{51}} \wedge 0.20) \vee (\mu^{c_{52}} \wedge 0.77) \vee (\mu^{c_{53}} \wedge 0.89)]. \qquad (14)
\end{aligned}$$
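For illustration, the sketch below evaluates the structure of system (14) in the forward direction for a single observation (min over the inputs of the max of min(membership, weight) over each input's terms); the weights are copied from (14), while the primary-term memberships are toy values. Note that the method itself uses (14) in the inverse direction, generating rules as interval solutions, which is not reproduced here.

```python
# Sketch: forward evaluation of the trend classes E1-E3 from system (14) for one
# observation. R[j][i][l] is the weight of the term c_il in the equation for E_j,
# copied from (14); mu[i][l] are toy memberships of the primary terms c_i1..c_i3.
R = [
    # E1: weights for inputs x1..x5 over terms (c_i1, c_i2, c_i3)
    [[0.90, 0.41, 0.22], [0.99, 0.89, 0.62], [0.87, 0.64, 0.26], [0.45, 0.59, 0.93], [0.95, 0.46, 0.30]],
    # E2
    [[0.35, 0.83, 0.56], [0.17, 0.61, 0.84], [0.23, 0.89, 0.47], [0.68, 0.82, 0.32], [0.62, 0.90, 0.68]],
    # E3
    [[0.11, 0.50, 0.74], [0.08, 0.75, 0.38], [0.12, 0.50, 0.75], [0.92, 0.46, 0.12], [0.20, 0.77, 0.89]],
]

mu = [[0.7, 0.3, 0.0],   # memberships of x1 in c11, c12, c13 (toy values)
      [0.9, 0.1, 0.0],
      [0.6, 0.4, 0.0],
      [0.0, 0.2, 0.8],
      [0.8, 0.2, 0.0]]

def trend_memberships(mu, R):
    # mu_Ej = AND over inputs i of (OR over terms l of (mu_cil AND weight))
    return [min(max(min(mu[i][l], R[j][i][l]) for l in range(3)) for i in range(5))
            for j in range(3)]

print(trend_memberships(mu, R))   # memberships of the observation in E1, E2, E3
```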
Using the genetic algorithm, a set of solutions for β-parameters of candidate rules was obtained (Table 1).
Table 1
A set of values of β-parameters in solutions of the trend system of equations
No. IF THEN
X1 X2 X3 x4 X5 y
11 [0, 25] [30, 40] [0, 4] d1, [0, 50]
12 [0, 25] [25, 112] [36, 50] [0, 4]
13 [0, 25] [0, 125] [0, 25] [55, 72] [0, 4]
21 52 [30, 45] [0, 5] d2, [35, 100]
22 52 [52, 105] [36, 55] [0, 5]
23 [0, 52] [79, 105] [100, 140] [61, 72] [0, 8]
24 [0, 52] 52 [86, 100] [61, 72] [0, 8]
31 [105, 140] [20, 29] [0, 16] d3, [75, 110]
32 [73, 105] [89, 175] [29, 43] [6, 24]
33 [105, 163] 200 [29, 43] [6, 24]
34 [73, 105] 89 [105, 163] [46, 72] [6, 24]
35 [105, 140] [55, 89] [160, 197] [46, 72] [6, 24]
41 [125, 180] [10, 30] [4, 18] d4, [90, 150]
42 [87, 125] [168, 250] [39, 50] [7, 24]
43 [125, 180] [120, 140] [39, 50] [7, 24]
44 [125, 180] [168, 210] [90, 125] [61, 72] [7, 24]
45 [87, 125] [120, 168] [160, 197] [61, 72] [7, 24]
51 [160, 197] [12, 30] [7, 18] d5, [130, 190]
52 [112, 160] [174, 210] [32, 50] [12, 24]
53 [160, 197] 210 [112, 160] [46, 72] [12, 24]
54 [160, 185] [140, 174] [160, 197] [46, 72] [12, 24]
61 203 [0, 29] [11, 16] d6, [175, 220]
62 203 165 [29, 43] [20, 24]
63 [184, 203] [165, 203] [202, 250] [40, 60] [20, 24]
64 [184, 203] 203 [171, 202] [40, 60] [20, 24]
71 [217, 250] [0, 27] [12, 18] d7, [190, 250]
72 [217, 250] [118, 153] [20, 42] [22, 24]
73 [217, 250] [160, 207] [31, 50] [22, 24]
74 [168, 250] [118, 153] [200, 250] [40, 46] [22, 24]
75 [168, 250] [160, 207] [167, 196] [40, 46] [22, 24]
Note: * - output classes (rules) for the dual problem
The set of interval rules (Table 1) corresponds to the set of solutions of the system of equations (14), where the minimum solutions determine the rule length:
5 (dR) = {0.86 U [0.86,1] -|ac", |ae3>, i043, i051; [0.86,1] U [0,0.86] -i021, i022; 0.23 -i"23; [0,0.23] -i"32, i"33, i"52, i"53; [0,0.11] -i^2, ^c,3;[0,1] -i"41, };
5 (d2) ={0.61U [0.61,1]-i", |c43, |cs'; [0.61,1] U [0,0.61] -|c21, [0.61,1] U [0.35,0.61] U [0,0.61] -i2, i"32; [0.35,1] -|c42; [0.35,0.50] U [0,0.50] ; 0.35 U [0,0.35] [0,0.20] -i52, i"53; [0,0.11] , |C13; [0,1] -|e"};
5 (d3) = {0.57 U [0.57,1] -i"23, |°32; [0,0.57] U [0.57,1] -10'3, | e4.,|°42, |"52, |"53; [0,0.57] U [0.41,0.57] U [0.57,1] -|0'2; [0,0.50] -|°33;[0,0.46] -|°5';0.41 U [0,0.41] [0,0.38] -|°22; [0,1] -|°2', |°3', |0'3};
5(d4) = {0.83 U [0.83, 1] - |°'2, |"23 , , , |"5r ; [0,0.64] -|°31;[0,0.59] -|0'3; 0.50 -|°22, ; [0,0.50] -iM"33; [0,0.46] ; [0,0.41] -i0"; [0,1] -i2, |!3};
5 (d5) = {0.68 U [0.68,1] -i2, i23, i32; [0.68,1] U [0,0.68] -i", i42, i52, i53; [0,0.50] -i3 ,i33; 0.46 U [0,0.46] -i22; [0,0.41]-i", i"; [0,1] -i021, i31, i43};
5 (d6) = {0.54 U [0.54,1] -i13, i22,i33, i41; [0,0.54] U [0.54,1] -i52, i53; [0,0.54] -i23, i42; [0,0.47] -i 032,i051; [0,0.22] -i11, i12; [0,1] -i021, i31, i43};
5 (d7) = {0.74 U [0.74,1] -i13,1022,i033, i41; [0.74,1] U [0,0.74] -i52, i53; [0,0.68] -i042; [0,0.61] -i023; [0,0.47] -i32, i051; [0,0.22] -i11, i12; [0,1] -i21, i31, i043}.
As a result of the composite transformation, the number of output classes and rules was m=5, Z=21 for the direct problem; m=7, Z=30 for the dual problem.
As a result of the relational transformation, the number of output classes and input terms was m = 5, g_1 - g_3 = 5, g_4 = 6, g_5 = 5 for the direct problem; m = 7, g_1 - g_3 = 7, g_4 = 6, g_5 = 5 for the dual problem.
The fuzzy partition matrix for the direct and dual problems is presented in Table 2. Composite terms were obtained through linguistic modification: little (L) - very little (vL);
average (A) - lower (higher) than average (lA, hA); big (B) -very big (vB).
The relation matrix (Table 2) is a concise form of presenting the rules, whose length can be variable. This means that for the incomplete composite "inputs - output" rules (Table 1), the corresponding relational "input - output" entries are absent in Table 2. Therefore, for each element of the partition matrix, the number of the composite rule within the output class is specified. This allows reproducing the dimensions of the hyperboxes from the partition matrix.
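As a hedged illustration of this bookkeeping, the sketch below recovers the antecedent of one composite rule from a toy partition matrix in which each nonzero cell also stores the indices of the composite rules it covers (cf. the fused indices in Table 2); the data and the representation are invented, only the idea follows the text.

```python
# Sketch: recovering the antecedent of one composite rule from a partition
# matrix whose nonzero cells also store the covered composite-rule indices.
# The data below are toy values, not taken from Table 2.
partition = {
    # (input, term): {class: set of composite-rule indices covered by this cell}
    ("x1", "vL"): {"d1": {1, 2, 3}},
    ("x1", "L"):  {"d1": {1, 2}},
    ("x2", "A"):  {"d1": {3}},
    ("x2", "vL"): {"d1": {2, 3}},
}

def antecedent(rule_idx, cls, partition):
    """Collect, for each input, the terms whose cell covers the given rule."""
    terms = {}
    for (inp, term), classes in partition.items():
        if rule_idx in classes.get(cls, set()):
            terms.setdefault(inp, []).append(term)
    return terms

print(antecedent(3, "d1", partition))   # {'x1': ['vL'], 'x2': ['A', 'vL']}
```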
Table 2
Fuzzy "input terms — output classes" partition matrix
IF THEN y
vL L lA* A hA B* vB
vL 0.961-3 0.921-4 0 0 0 0 0
L 0 0.981,2 0.712,4 0.342,5 0 0 0
lA* 0 0 0.621-5 0.912,5 0 0 0
x1 A 0 0 0.761,3,5 0.641-5 0.832 0 0
hA 0 0 0.603 0.851,3,4 0.791-4 0.511,2 0.624,5
B* 0 0 0 0 0.681,3 0.841-4 0.914,5
vB 0 0 0 0 0 0 0.951-5
vL 0.682,3 0 0 0 0 0 0
L 0.942,3 0.832-4 0.855 0 0 0 0
X2 lA* 0.852,3 0.862,3 0.692,4,5 0 0 0 0
A 0.543 0.382,3 0.942 0.873,5 0.644 0 0.562,4
hA 0 0 0.672 0.692,4,5 0.612,4 0.732,3 0.642-5
B* 0 0 0.893 0.902,4 0.762,3 0.793,4 0.873,5
vB 0 0 0 0.812,4 0.502,3 0 0.393,5
vL 0.573 0 0 0 0 0 0
L 0 0.504 0 0 0 0 0
lA* 0 0.623,4 0.654 0.824 0 0 0
X3 A 0 0.913 0.864 0.914 0.753 0 0
hA 0 0 0.6I4,5 0.865 0.723,4 0.694 0.545
B* 0 0 0.835 0.885 0.804 0.843,4 0.904,5
vB 0 0 0 0 0.434 0.753 0.934
L 0 0 0 0.501 0.591 0.901 0.971
lA 0 0 0.841-3 0.811 0.751,2 0.911,2 0.961,2
x4 A 0.681 0.741 0.892,3 0.632,3 0.902 0.822 0.782,3
hA 0.652 0.862 0.802-5 0.722,3 0.742-4 0.682-4 0.703-5
B 0.983 0.803,4 0.974,5 0.804,5 0.933,4 0.743,4 0
vB 1.003 0.953,4 0.954,5 0.944,5 0.983,4 0 0
L 0.901-3 0.921-4 0.461-5 0.481 0 0 0
lA 0 0.843,4 0.831-5 0.691-5 0.751 0 0
X5 A 0 0 0.871-5 0.721-5 0.771-4 0.801 0.601
hA 0 0 0.601-5 0.651-5 0.631-4 0.591-4 0.651
B 0 0 0.922-5 0.942-5 0.882-4 0.922-4 0.912-4
Note: * - output classes and input terms for the dual problem
Parameters of membership functions of the trend and composite models are shown in Table 3. The membership functions of fuzzy terms are presented in Fig. 1.
Linguistic interpretation of b-parameters for the direct and dual problems is presented in Table 4.
Table 3
Parameters of membership functions of the trend and composite models
Input Parameter Trend model Composite model
L A B vL L lA A hA B vB
x1-x3 β̲ 0 55 160 0 40 80 92 137 177 186
x1-x3 β̄ 114 197 250 49 98 115 152 195 206 250
x4 β̲ 1 20 46 - 1 16 28 37 52 59
x4 β̄ 30 50 72 - 22 34 44 56 65 72
x5 β̲ 1 7 16 - 1 4 9 14 20 -
x5 β̄ 10 18 23 - 6 12 17 22 24 -
Fig. 1. Membership functions of fuzzy terms of the variables: a - x1-x3; b - x4; c - x5
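A small sketch of evaluating term memberships from the boundaries in Table 3 is given below; it assumes a symmetric triangle with full membership at the midpoint of the support [β̲, β̄], which is only an approximation of the shapes shown in Fig. 1.

```python
# Sketch: membership in a triangular fuzzy term given its lower/upper support
# boundaries from Table 3. The symmetric-triangle assumption (peak at the
# midpoint of [beta_low, beta_up]) is illustrative; the exact shapes are in Fig. 1.
def tri_membership(x, beta_low, beta_up):
    peak = (beta_low + beta_up) / 2.0
    if x <= beta_low or x >= beta_up:
        return 0.0
    if x <= peak:
        return (x - beta_low) / (peak - beta_low)
    return (beta_up - x) / (beta_up - peak)

# Trend terms of x1-x3 (Table 3): L = [0, 114], A = [55, 197], B = [160, 250]
trend_terms = {"L": (0.0, 114.0), "A": (55.0, 197.0), "B": (160.0, 250.0)}
x = 90.0
print({name: round(tri_membership(x, lo, hi), 2) for name, (lo, hi) in trend_terms.items()})
```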
Table 4
Linguistic interpretation of solutions of the direct/dual problems
No. IF THEN
X1 X2 X3 X4 X5 y
11 vL A L vL
12 vL vL-L/vL-lA hA L
13 vL vL-A vL B-vB L
21 L A L L
22 L L-A/L-lA hA L
23 vL-L L-A/L-lA A/lA-A B-vB L-lA
24 vL-L L L/L-lA B-vB L-lA
31 lA-A lA L-hA lA*
32 L-lA lA-hA lA-hA L-B
33 lA-hA B lA-hA L-B
34 L-lA lA lA-hA hA-vB L-B
35 lA-A L-lA hA-B hA-vB L-B
41 A-hA L-lA L-hA A
42 L-A/lA-A hA-vB A-hA lA-B
43 A-hA A A-hA lA-B
44 A-hA hA-vB/hA-B A/lA-A B-vB lA-B
45 L-A/lA-A A-hA hA/hA-B B-vB lA-B
51 hA/hA-B L-lA lA-hA hA
52 A-hA hA-vB/hA-B lA-hA A-B
53 hA/hA-B vB/B A-hA hA-vB A-B
54 hA A-hA hA-vB/hA-B hA-vB A-B
61 B L-lA A-hA B*
62 B hA lA-hA hA-B
63 hA-B hA-B B-vB hA-B hA-B
64 hA-B B hA-B hA-B hA-B
71 vB L-lA A-hA vB
72 vB A-hA lA-A B
73 vB hA-vB/hA-B A-hA B
74 hA-vB/B-vB A-hA vB/B-vB hA B
75 hA-vB/B-vB hA-vB/hA-B hA/hA-B hA B
Notes: * - output classes (rules) for the dual problem
The linguistic model (Table 4) ensures the forecasting correctness at the level of Ptop10=0.71 for the direct problem; Ptop10=0.83 for the dual problem.
6. Discussion of the results of effectiveness estimation of improving transformations
The papers [19-23] proposed methods for simplifying the process of tuning a fuzzy knowledge base for the specified (unknown) output classes and input terms.
The given method uses these results as improving transformations that replace fragments of the granular min-max clustering algorithm. The principal difference of this method is the establishment of control variables in improving transformations, which allows formalizing the generation of fuzzy knowledge base variants. The effectiveness estimates of improving transformations are given below.
Candidate rule generation requires solving an optimization problem with 2nZ_0 + 2m_0 variables for the boundaries of interval rules and the boundaries of output classes [4, 5, 7-11]. Application of the composite transformation reduces the number of tuning parameters by solving Z optimization problems for 2m boundaries of output classes [19-21]; 2N variables for the boundaries of β-parameters of the rules are subject to tuning. At the same time, the rule length is determined by the minimum solutions. Rule generator tuning is an optimization problem with 2N + 2M + NM variables for the relational matrix and the boundaries of triangular membership functions.
Selection, that is, finding the best configuration of terms and rules of the zero option, requires solving an optimization problem with 2K_0 + 2m_0 + nZ_0 variables. The boundaries of triangular membership functions, an indicator of the presence of each term (with the possibility of tuning the rule length), as well as the degree of relevance of rules are subject to tuning [4, 5, 7-11, 15, 16]. In the case of Z solutions of the trend system of equations, selection is reduced to maintaining the level of detail and the density of coverage. Application of the relational transformation reduces the number of tuning parameters to 2K + 2m + Km for the partition matrices and the boundaries of triangular membership functions [23].
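As a rough, hedged check of these counts on the example of Section 5 (direct problem), the sketch below evaluates the expressions given above; the zero-option dimensions Z_0, K_0, m_0 are not reported in the paper, so the baseline figures are not computed.

```python
# Back-of-the-envelope count of tuning variables for the example (direct problem),
# using the expressions given in the text. Zero-option dimensions (Z0, K0, m0)
# are not reported in the paper, so the baseline counts are not computed here.
n, M, m = 5, 3, 5            # inputs, trend classes, composite/relational classes
k_i = [3, 3, 3, 3, 3]        # primary terms per input  -> N
g_i = [5, 5, 5, 6, 5]        # modified terms per input -> K (direct problem)
N, K = sum(k_i), sum(g_i)

composite_vars = 2 * N + 2 * M + N * M     # rule generator: relation R + MF boundaries
relational_vars = 2 * K + 2 * m + K * m    # relational transformation: matrix P + MF boundaries
print(N, K)                                # 15 26
print(composite_vars, relational_vars)     # 81 192
```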
For time series forecasting problems, the model tuning time from the moment new experimental data appear shall not exceed the forecasting time-frame. The minimum value of this parameter for the problem of forecasting the number of comments in a social network is 1 hour. The tuning time according to the methods [4, 5, 7-11, 15, 16] is 72 min, which exceeds the forecasting time-frame. The tuning time for the proposed method is 21 min (Intel Core 2 Duo P7350 2.0 GHz processor).
A restriction of the proposed method is the use of improving transformations for fuzzy knowledge bases of classification type.
7. Conclusions
1. Logic-algorithmic models of improving transformations for the fuzzy classification knowledge base are proposed. Such transformations are transition to a composite or relational fuzzy model. The composite model represents interval solutions of the trend system of fuzzy logic equations. The number of rules in a class is determined by the number of solutions, and the granularity is determined by intervals of values of input variables in rules. The set of minimum solutions provides the minimum rule length. The relational model represents a linguistic interpretation of the resulting solutions. The level of detail and the density of coverage are determined by the "input terms - output classes" relational matrix, and the dimensions of hyperboxes are tuned using triangular membership functions. Composite transformation provides the choice of the number of output classes and rules, the relational - the choice of the number of input terms.
2. The method of fuzzy classification knowledge base optimization using improving transformations is proposed. Improving transformations allow formalizing the generation of fuzzy knowledge base variants with the simultaneous establishment of control variables that affect the accuracy and complexity of the model. This solves the problem of redundancy of terms and rules in min-max clustering problems. The choice of control variables (the number of classes, terms, and rules) is carried out using the genetic algorithm. At the same time, consistent use of composite and relational improving transformations provides tuning process simplification.
Acknowledgments
The paper is prepared in the framework of the research work 46-D-388 "Identification of hidden dependencies in online social networks based on methods of fuzzy logic and computer linguistics".
References
1. Rotshtein, A. Design of faultless man-machine technologies [Text] / A. Rotshtein, P. Kuznetsov. - Kyiv: Tehnika, 1992. - 180 p.
2. Rotshtein, A. Modeling and optimization of multivariable algorithmic processes reliability [Text] / A. Rotshtein, S. Shtovba, A. Kozachko. - Vinnytsia: UNIVERSUM, 2007. - 215 p.
3. Glushkov, V. Algebra. Languages. Programming [Text] / V. Glushkov, G. Tceitlin, E. Ucshenko. - Kyiv: Naukova Dumka, 1989. - 376 p.
4. Ishibuchi, H. Analysis of interpretability-accuracy tradeoff of fuzzy systems by multiobjective fuzzy genetics-based machine learning [Text] / H. Ishibuchi, Y. Nojima // International Journal of Approximate Reasoning. - 2007. - Vol. 44, Issue 1. - P. 4-31. doi: 10.1016/j.ijar.2006.01.004
5. Bargiela, A. Optimised Information Abstraction in Granular Min/Max Clustering. Vol. 13 [Text] / A. Bargiela, W. Pedrycz // Smart Innovation, Systems and Technologies. - 2013. - P. 31-48. doi: 10.1007/978-3-642-28699-5_3
6. Fazzolari, M. A Review of the Application of Multiobjective Evolutionary Fuzzy Systems: Current Status and Further Directions [Text] / M. Fazzolari, R. Alcala, Y. Nojima, H. Ishibuchi, F. Herrera // IEEE Transactions on Fuzzy Systems. - 2013. - Vol. 21, Issue 1. - P. 45-65. doi: 10.1109/tfuzz.2012.2201338
7. Fazzolari, M. A multi-objective evolutionary method for learning granularities based on fuzzy discretization to improve the accuracy-complexity trade-off of fuzzy rule-based classification systems: D-MOFARC algorithm [Text] / M. Fazzolari, R. Alcalá, F. Herrera // Applied Soft Computing. - 2014. - Vol. 24. - P. 470-481. doi: 10.1016/j.asoc.2014.07.019
8. Seera, M. A modified fuzzy min-max neural network for data clustering and its application to power quality monitoring [Text] / M. Seera, C. P. Lim, C. K. Loo, H. Singh // Applied Soft Computing. - 2015. - Vol. 28. - P. 19-29. doi: 10.1016/j.asoc.2014.09.050
9. Reyes-Galaviz, O. F. Granular fuzzy modeling with evolving hyperboxes in multi-dimensional space of numerical data [Text] / O. F. Reyes-Galaviz, W. Pedrycz // Neurocomputing. - 2015. - Vol. 168. - P. 240-253. doi: 10.1016/j.neucom.2015.05.102
10. Rudziński, F. A multi-objective genetic optimization of interpretability-oriented fuzzy rule-based classifiers [Text] / F. Rudziński // Applied Soft Computing. - 2016. - Vol. 38. - P. 118-133. doi: 10.1016/j.asoc.2015.09.038
11. Rey, M. I. Multi-objective based Fuzzy Rule Based Systems (FRBSs) for trade-off improvement in accuracy and interpretability: A rule relevance point of view [Text] / M. I. Rey, M. Galende, M. J. Fuente, G.I. Sainz-Palmero // Knowledge-Based Systems. -2017. - Vol. 127. - P. 67-84. doi: 10.1016/j.knosys.2016.12.028
12. Cruz-Reyes, L. Incorporation of implicit decision-maker preferences in multi-objective evolutionary optimization using a multi-criteria classification method [Text] / L. Cruz-Reyes, E. Fernandez, P. Sanchez, C. A. C. Coello, C. Gomez // Applied Soft Computing. - 2017. - Vol. 50. - P. 48-57. doi: 10.1016/j.asoc.2016.10.037
13. Ahmed, M. M. Knowledge base to fuzzy information granule: A review from the interpretability-accuracy perspective [Text] / M. M. Ahmed, N. A. M. Isa // Applied Soft Computing. - 2017. - Vol. 54. - P. 121-140. doi: 10.1016/j.asoc.2016.12.055
14. Bandaru, S. Data mining methods for knowledge discovery in multi-objective optimization: Part A - Survey [Text] / S. Bandaru, A. H. C. Ng, K. Deb // Expert Systems with Applications. - 2017. - Vol. 70. - P. 139-159. doi: 10.1016/j.eswa.2016.10.015
15. Zhang, S. Adaptive consensus model with multiplicative linguistic preferences based on fuzzy information granulation [Text] / S. Zhang, J. Zhu, X. Liu, Y. Chen, Zhenzhen Ma // Applied Soft Computing. - 2017. - Vol. 60. - P. 30-47. doi: 10.1016/j.asoc. 2017.06.028
16. Hu, X. From fuzzy rule-based models to their granular generalizations [Text] / X. Hu, W. Pedrycz, X. Wang // Knowledge-Based Systems. - 2017. - Vol. 124. - P. 133-143. doi: 10.1016/j.knosys.2017.03.007
17. Rotshtein, A. Fuzzy logic and the least squares method in diagnosis problem solving [Text] / A. Rotshtein, H. Rakytyanska; R. D. Sarma (Ed.). - Genetic diagnoses. - New York: Nova Science Publishers, 2011. - P. 53-97.
18. Rotshtein, A. P. Fuzzy evidence in identification, forecasting and diagnosis [Text] / A. P. Rotshtein, H. B. Rakytyanska. - Heidelberg: Springer, 2012. - 314 p. doi: 10.1007/978-3-642-25786-5
19. Rotshtein, A. Expert rules refinement by solving fuzzy relational equations [Text] / A. Rotshtein, H. Rakytyanska // 2013 6th International Conference on Human System Interactions (HSI). - 2013. doi: 10.1109/hsi.2013.6577833
20. Rotshtein, A. P. Optimal Design of Rule-Based Systems by Solving Fuzzy Relational Equations. Vol. 559 [Text] / A. P. Rotshtein, H. B. Rakytyanska // Studies in Computational Intelligence. - 2014. - P. 167-178. doi: 10.1007/978-3-319-06883-1_14
21. Rakytyanska, H. B. Fuzzy classification knowledge base construction based on trend rules and inverse inference [Text] / H. B. Rakytyanska // Eastern-European Journal of Enterprise Technologies. - 2015. - Vol. 1, Issue 3 (73). - P. 25-32. doi: 10.15587/1729-4061.2015.36934
22. Rotshtein, A. P. Fuzzy Genetic Object Identification: Multiple Inputs/Multiple Outputs Case. Vol. 99 [Text] / A. P. Rotshtein, H. B. Rakytyanska // Advances in Intelligent and Soft Computing. - 2012. - P. 375-394. doi: 10.1007/978-3-642-23172-8_25
23. Rakytyanska, H. Optimization of knowledge bases on the basis of fuzzy relations by the criteria "accuracy - complexity" [Text] / H. Rakytyanska // Eastern-European Journal of Enterprise Technologies. - 2017. - Vol. 2, Issue 4 (86). - P. 24-31. doi: 10.15587/1729-4061.2017.95870
24. Singh, K. Facebook comment volume prediction [Text] / K. Singh // International Journal of Simulation: Systems, Science and Technologies. - 2015. - Vol. 16, Issue 5. - P. 16.1-16.9.
25. Lichman, M. UCI Machine Learning Repository [Electronic resource] / M. Lichman. - Irvine, CA: University of California, School of Information and Computer Science, 2013. - Available at: https://archive.ics.uci.edu/ml/citation_policy.html