Optimization of knowledge bases on the basis of fuzzy relations by the criteria "accuracy - complexity" (scientific article, computer and information sciences)

Keywords: optimization of fuzzy knowledge bases, min-max clustering, fuzzy relational models

Abstract of the scientific article (computer and information sciences), author: H. Rakytyanska

A method for the optimization of classification fuzzy knowledge bases by the criteria "accuracy - complexity" is proposed, which makes it possible to simplify the tuning process through a transition to a relational model. The problem of knowledge base optimization is reduced to a min-max clustering problem. The essence of the method is the selection of such partition matrices "inputs - output" that provide the required or extreme levels of inference accuracy and number of rules.


Optimization of knowledge bases on the basis of fuzzy relations by the criteria "accuracy - complexity"

The method of optimization of fuzzy classification knowledge bases by the criteria "inference accuracy - complexity" was proposed. A relational fuzzy model, which corresponds to the fuzzy classification knowledge base, was developed. The matrix of fuzzy relations in the form of one-dimensional projections "input terms - output classes" is a simplified representation of the system of classification rules. The problem of the optimization of a knowledge base is reduced to the problem of min-max clustering and comes down to selecting such partition matrices "inputs - output" that provide the required or extreme levels of inference accuracy and the number of rules. In relational models, the question of the optimal choice of the number of output terms remains open. The selection of output classes, input terms and rules is reduced to a problem of discrete optimization of the algorithm reliability indicators, which is solved by the gradient method. The number and location of hyperboxes are determined by the relation matrix, and the sizes of hyperboxes are defined as a result of tuning of the triangular membership functions. The selection of the number of input and output terms in the partition matrices may be performed both under the offline mode and by adaptive adding/removing of terms. Known methods of min-max clustering apply heuristic procedures for the selection of the number of rules (classes). The proposed method generates variants of fuzzy knowledge bases in accordance with formalized procedures of reliability analysis and synthesis of algorithmic processes. This resolves a general problem of the min-max clustering methods related to the minimization of the number of input terms without losing inference accuracy. A transition to the relational fuzzy model simplifies the process of knowledge base tuning both for assigned and unknown output classes.



UDC 681.5.015:007053.81+004.91

DOI: 10.15587/1729-4061.2017.95870

OPTIMIZATION OF KNOWLEDGE BASES ON THE BASIS OF FUZZY RELATIONS BY THE CRITERIA "ACCURACY - COMPLEXITY"

H. Rakytyanska

PhD, Associate Professor, Department of Software Design, Vinnytsia National Technical University, Khmelnytske shose str., 95, Vinnytsia, Ukraine, 21021. E-mail: h rakit@ukr.net

1. Introduction

The tuning of expert fuzzy knowledge bases involves maximum approximation to experimental data for a given level of complexity, or maximum simplification without losing inference accuracy [1]. The number of output terms, or classes of output [2], determines the quality of a fuzzy classification knowledge base. The optimization of such a knowledge base implies: a search for the minimum inference error under a limitation on the complexity of the model (the number of input terms, output classes, and rules); a search for the minimum number of rules (classes) at the assigned level of accuracy. A transition to the relational model makes it possible to simplify the design process by presenting the rules in the form of a matrix of fuzzy relations "input terms - output classes" [1]. In this case, a multi-dimensional matrix of relations R(X) is presented in the form of projections R1(x1), ..., Rn(xn) [3]. The number of input and output terms is set in advance, and the tuning of the model implies selection of the elements of the matrix of relations [4, 5]. However, relational models leave open the problem of the optimal choice of the number of output classes. At the same time, the problem of the optimization of a fuzzy knowledge base is a task of fuzzy clustering [6]: it requires a partition of the space of input variables into such a number of classes that provides the required or extreme levels of inference accuracy and the number of rules.

2. Literature review and problem statement

Methods of relational clustering, which conduct the partition of objects by similarity measures, are limited by the assigned number of classes [6, 7]. If the number of classes is unknown, the methods of min-max clustering are

used, which imply the generation of easily understandable rules-hyperboxes [8]. Hyperboxes learn through extension/compression using support vector machines (SVM) [9, 10]. Balancing between the inference accuracy and the number of rules (classes) is achieved by combining/partitioning hyperboxes. To restore nonlinear boundaries between classes and avoid excessive coverage density, the learning mode in min-max neural networks must reduce the number of hyperboxes without compromising the recognizing capacity [11, 12]. There remains a problem in the adaptation of the maximum size of the hyperbox, which determines how many rules can be generated. Class overlapping and classification errors render this parameter very important: if its value is small, unnecessary hyperboxes (classes) are formed [13].

A general problem of the min-max clustering methods is the selection of the number of output classes and the minimization of the number of input terms without compromising the inference accuracy. The method for the optimization of output classes of a fuzzy knowledge base was proposed in papers [14, 15]. In contrast to the heuristic procedures of rules (classes) selection [8-13], the generation of fuzzy knowledge bases is reduced to the problem of discrete optimization of indicators of algorithm reliability [14, 15]. For the selection of output classes, the gradient method was used. The number of classes is defined under the offline mode [14]. Clarification of class boundaries is carried out by adaptive adding/removing of classes in arrangement vectors [15]. For the current output classes, interval rules are generated by solving the problem of inverse logical inference [2]. This solves the problem of control and adaptation of the hyperbox size [16]. The structure of the model is determined by parameters of interval rules that are connected to the coordinates of the maximum of a membership function.


This paper proposes a method for the optimization of output classes and input terms of a fuzzy knowledge base. If the number of terms is set in advance, the problem of min-max clustering may be solved by relational partition of the space of input variables [1]. The number and location of hyperboxes is determined by the matrix of relations [17] and the sizes of hyperboxes are determined as a result of adjusting the triangular membership functions [1]. Then the optimization of a relational fuzzy knowledge base lies in the selection of such partition matrices "inputs - output", which provide the required or extreme levels of inference accuracy and the number of rules. Following [14, 15], the selection of number of input and output terms in the partition matrices may be performed both under the offline mode and by adaptive adding/removing of terms.

3. The aim and tasks of the study

The aim of present work is to develop an approach to the optimization designing of relational fuzzy knowledge bases by the criteria "inference accuracy - complexity". This approach should simplify the process of the knowledge bases tuning based on fuzzy relations for both the assigned and the unknown output classes.

To achieve the set goal, the following tasks were to be solved:

- development of a relational fuzzy model that matches a fuzzy classification knowledge base;

- development of a method for the optimization of knowledge base on the basis of fuzzy relations under offline and online modes.

4. Models and methods for the optimization of knowledge bases on fuzzy relations

4. 1. Fuzzy relational model

Consider an object of the form y = f(x1, ..., xn) with n inputs X = (x1, ..., xn) and output y, for which the relation "inputs - output" may be represented in the form of a system of fuzzy classification IF-THEN rules [2]:

U[p = 1, ..., zj] { AND[i = 1, ..., n] (xi = ai^jp) } -> y = dj, j = 1, ..., m, (1)

where ai^jp is the fuzzy term for the evaluation of variable xi in the rule with number jp, j = 1, ..., m, p = 1, ..., zj; dj is the fuzzy term for the evaluation of variable y; zj is the number of rules in class dj; m is the number of terms of the output variable.

Let {ci1, ..., ciki} be the set of input terms for the evaluation of variable xi, i = 1, ..., n.

We designate

{C1, ..., CN} = {c11, ..., c1k1, ..., cn1, ..., cnkn},

where N = k1 + ... + kn.

Then a system of one-dimensional matrices of fuzzy relations corresponds to the fuzzy knowledge base (1):

Ri ⊆ cil × dj = [rilj, i = 1, ..., n, l = 1, ..., ki, j = 1, ..., m],

which is equivalent to a multi-dimensional matrix:

R ⊆ CI × dj = [rIJ, I = 1, ..., N, J = 1, ..., m].

Given matrices Ri, i = 1, ..., n, the dependence "inputs - output" is described using the extended compositional rule of inference [1]:

μd(y) = μA1(x1) ∘ R1 ∩ ... ∩ μAn(xn) ∘ Rn, (2)

where μAi(xi) = (μci1, ..., μciki) and μd(y) = (μd1, ..., μdm) are the vectors of membership degrees of variables xi and y to terms cil, i = 1, ..., n, and dj, j = 1, ..., m, respectively.

From relation (2) follows the system of fuzzy logical equations, which connects the membership functions of fuzzy input and output terms:

μdj(y) = min[i = 1, ..., n] { max[l = 1, ..., ki] [ min( μcil(xi), rilj ) ] }, j = 1, ..., m. (3)

Relation (3) defines a fuzzy model of an object as follows:

y = f(X, N, m, ΨR), (4)

where ΨR = (R, Bc, B̄c, Hc, Bd, B̄d, Hd) is the vector of parameters of fuzzy relations, which includes:

Bc = (βc1, ..., βcN), B̄c = (β̄c1, ..., β̄cN), Hc = (hc1, ..., hcN),

Bd = (βd1, ..., βdm), B̄d = (β̄d1, ..., β̄dm), Hd = (hd1, ..., hdm)

- the vectors of lower and upper bounds, as well as the vectors of coordinates of the maximum, of the triangular membership functions of fuzzy terms CI and dj; f is the operator of connection "inputs - output", which corresponds to formula (3).
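As an illustration, the inference defined by formulas (2)-(4) can be sketched in Python. The min-max composition follows formula (3), and the triangular membership functions follow the parameterization (lower bound, coordinate of the maximum, upper bound); the toy term bounds and relation values below are made-up assumptions, not values from this paper:

```python
import numpy as np

def tri_mf(x, lo, peak, hi):
    """Triangular membership function with lower bound lo,
    coordinate of the maximum peak, and upper bound hi."""
    if x <= lo or x >= hi:
        return 0.0
    if x <= peak:
        return (x - lo) / (peak - lo)
    return (hi - x) / (hi - peak)

def relational_inference(x, terms, relations):
    """Extended compositional rule of inference, formula (3):
    mu_dj(y) = min_i max_l min(mu_cil(x_i), r_ilj)."""
    m = relations[0].shape[1]          # number of output classes
    mu_d = np.ones(m)
    for xi, term_params, Ri in zip(x, terms, relations):
        mu_c = np.array([tri_mf(xi, *p) for p in term_params])
        # max-min composition mu_c o Ri, then intersection over inputs
        comp = np.max(np.minimum(mu_c[:, None], Ri), axis=0)
        mu_d = np.minimum(mu_d, comp)
    return mu_d

# toy example: two inputs with two terms each, two output classes
terms = [[(0, 0, 3), (1, 4, 4)], [(0, 0, 2), (1, 3, 3)]]
relations = [np.array([[0.9, 0.2], [0.1, 0.8]]),
             np.array([[0.7, 0.3], [0.2, 0.9]])]
print(relational_inference([0.5, 0.4], terms, relations))  # [0.7 0.2]
```

The output vector contains the membership degrees of y to the output classes, to be interpreted against the output term membership functions.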

4. 2. Problems on the optimization of knowledge base based on fuzzy relations

For a fuzzy knowledge base (1), the interrelation between the root mean square error and the number of rules depends on the number and bounds of the output classes. Then the problem of the optimization of a fuzzy knowledge base (1) is reduced to the problem of min-max clustering and lies in selecting such a partition matrix R that provides the required or extreme levels of inference accuracy and the number of rules.

Let the training sample be assigned as P pairs of experimental data:

(Xs, ys), s = 1, ..., P,

where Xs = (x1s, ..., xns) and ys are the values of the input and output variables in the experiment number s.

Optimization of the number of input terms and output classes is carried out under the offline mode. In this case, the preliminary boundaries of dj classes are assigned by an expert.

We shall evaluate the complexity of the fuzzy model (4) by the number of rules Z(N, m, R), which are associated with the relation matrix R. We shall assess the quality of the fuzzy model (4) by the root mean square error:

E = sqrt( (1/P) Σ[s = 1, ..., P] [f(Xs, N, m, R) − ys]² ).

Then the problem of selecting the optimal number of input terms and output classes may be formulated in the direct and dual statement.

Direct statement. Find such a number of input terms N, output classes m and fuzzy partition matrix R that provide the minimum number of rules at a permissible inference error: Z(N, m, R) → min and E(N, m, R) ≤ Ē, where Ē is the maximum permissible root mean square error.

Dual statement. Find such a number of input terms N, output classes m and fuzzy partition matrix R that provide the minimum inference error for the assigned number of rules: E(N, m, R) → min and Z(N, m, R) ≤ Z̄, where Z̄ is the maximum permissible number of rules.

Optimization of boundaries of output classes is performed under the online mode. In this case, clarification of the partition method is made by adaptive adding/removing of terms.

We shall introduce a limitation on the volume of the relation matrix in the following way: ki ≤ k̄i, m ≤ m̄, where k̄i and m̄ are the maximum numbers of input terms and output classes.

Assume:

U = (u1, ..., uN̄), V = (v1, ..., vm̄), where N̄ = k̄1 + ... + k̄n,

are the vectors of arrangement of input terms and output classes, where uI = 1 (0) or vJ = 1 (0) corresponds to the addition (removal) of term CI or dJ, respectively.

We shall evaluate the complexity of the fuzzy model (4) by the number of rules Z(U, V, R), which are associated with the relation matrix R. We will assess the quality of the fuzzy model (4) by the root mean square error:

E = sqrt( (1/P) Σ[s = 1, ..., P] [f(Xs, U, V, R) − ys]² ).

Then the problem on the selection of optimum boundaries of output classes may be formulated in direct and dual statement.

Direct statement. Find vectors of arrangement of input terms U, output classes V and fuzzy partition matrix R, for which, under the limitation on the knowledge base volume, Z(U, V, R) → min and E(U, V, R) ≤ Ē.

Dual statement. Find vectors of arrangement of input terms U, output classes V and fuzzy partition matrix R, for which, under the limitation on the knowledge base volume, E(U, V, R) → min and Z(U, V, R) ≤ Z̄.

4. 3. Method for the optimization of relational fuzzy knowledge base

To select the values of controlling variables, the gradient method is used, which was proposed in [14] for the solution of problems on discrete optimization of fuzzy knowledge base. This method implies a coordinate-wise rise along the surface of objective function in the direction of gradient. Algorithms for solving the optimization problems have a unified structure, consisting of two iteration sections [14]. In the first of them, the first permissible solution by successive adding of terms with the highest gradients is determined; in the second, an improvement of the found solution by decreasing the complexity of the model is accomplished. For the current output classes, fuzzy relations are tuned by the methods proposed in [2].

4. 3. 1. Algorithms of the optimization under offline mode

Gradients

γxi(ki), i = 1, ..., n, and γy(m)

will be defined as the ratio of the accuracy increment ΔE(ki + 1, ΨR) or ΔE(m + 1, ΨR) to the increment in the number of rules ΔZ(ki + 1, ΨR) or ΔZ(m + 1, ΨR) at increasing the number of input or output terms in the partition matrices:

γxi(ki) = ΔE(ki, ΨR) / ΔZ(ki, ΨR) = [E(ki, ΨR) − E(ki + 1, ΨR)] / [Z(ki + 1, ΨR) − Z(ki, ΨR)],

γy(m) = ΔE(m, ΨR) / ΔZ(m, ΨR) = [E(m, ΨR) − E(m + 1, ΨR)] / [Z(m + 1, ΨR) − Z(m, ΨR)].

We designate the solution vector, obtained at the t-th step of the optimization algorithm, as:

Ψ(t) = (ki(t), m(t), ΨR(t)).

The algorithm for solving the problem in direct statement is performed in the following sequence:

1. Set the zero-option of a fuzzy model:

t := 0, Ψ(0) = (ki(0), m(0), ΨR(0)).

If E(Ψ(0)) ≤ Ē, proceed to step 4.

2. If E(Ψ(t)) > Ē, proceed to step 3, otherwise - to step 4.

3. For the models

Ψ' = (ki(t) + 1, m(t), ΨR') and Ψ'' = (ki(t), m(t) + 1, ΨR'')

identify gradients γxi and γy relative to solution Ψ(t). Find the coordinate for which γ = max{γxi, γy}, t := t + 1. For vector Ψ(t), assign:

ki(t) := ki(t−1) + 1, Ψ(t) := Ψ', if γ = γxi;

m(t) := m(t−1) + 1, Ψ(t) := Ψ'', if γ = γy.

Proceed to step 2.

4. Decrease the complexity of model Ψ(t) by decreasing the number of input or output terms while maintaining permissible inference accuracy. Check the conditions for models Ψ' = (ki(t) − 1, m(t), ΨR') and Ψ'' = (ki(t), m(t) − 1, ΨR''):

E(Ψ') ≤ Ē; (5)

E(Ψ'') ≤ Ē. (6)

If conditions (5) and (6) are not fulfilled for any coordinate, consider vector Ψ(t) the result of solving the problem, otherwise proceed to step 5.

5. For the coordinates that satisfy conditions (5) and (6), find the magnitude ΔZ by which the number of rules will decrease. Find the coordinate for which:

Δ = max{ΔZ(ki(t) − 1, m(t)), ΔZ(ki(t), m(t) − 1)},

t := t + 1. For vector Ψ(t), assign:

ki(t) := ki(t−1) − 1, Ψ(t) := Ψ', if Δ = ΔZ(ki);

m(t) := m(t−1) − 1, Ψ(t) := Ψ'', if Δ = ΔZ(m).

Proceed to step 4.
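The two iteration sections of the direct offline algorithm can be sketched as a greedy coordinate-wise procedure. For brevity, a single aggregate input-term count k stands in for the per-input counts ki, and `evaluate(k, m) -> (E, Z)`, which stands in for tuning the relation matrix for a given partition, is a hypothetical stub:

```python
def optimize_direct(evaluate, k0, m0, E_max, k_max, m_max):
    """Phase 1: add the coordinate with the largest gradient (error drop
    per added rule) until E <= E_max; phase 2: prune while E stays permissible."""
    k, m = k0, m0
    E, Z = evaluate(k, m)
    while E > E_max:
        options = [(k + 1, m)] if k < k_max else []
        if m < m_max:
            options.append((k, m + 1))
        if not options:
            break
        # gradient = error decrease / rule increase, as in the paper's gamma
        def gradient(km):
            E2, Z2 = evaluate(*km)
            return (E - E2) / max(Z2 - Z, 1)
        k, m = max(options, key=gradient)
        E, Z = evaluate(k, m)
    improved = True
    while improved:                      # phase 2: reduce complexity
        improved = False
        for km in ((k - 1, m), (k, m - 1)):
            E2, Z2 = evaluate(*km)
            if E2 <= E_max and Z2 < Z:
                (k, m), E, Z = km, E2, Z2
                improved = True
                break
    return k, m, E, Z

# mock evaluator: error falls and rule count grows with the partition size
mock = lambda k, m: (2.0 / (k + m), k * m)
print(optimize_direct(mock, 2, 2, E_max=0.4, k_max=9, m_max=7))  # (3, 2, 0.4, 6)
```

The dual statement is handled symmetrically by swapping the roles of E and Z in the stopping conditions.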

The algorithm of solving the problem in the dual statement is performed in the following sequence.

1. Set the zero-option of a fuzzy model: t := 0, Ψ(0) = (ki(0), m(0), ΨR(0)).

If Z(Ψ(0)) > Z̄, proceed to step 4.

2. If Z(Ψ(t)) < Z̄, proceed to step 3, otherwise - to step 4.

3. The essence of this step coincides with step 3 of the algorithm for solving the problem in direct statement. Proceed to step 2.

4. Decrease the complexity of model Ψ(t) for the inclusion in the area of permissible solutions by reducing the number of input or output terms. Check the conditions for models

Ψ' = (ki(t) − 1, m(t), ΨR') and Ψ'' = (ki(t), m(t) − 1, ΨR''):

Z(Ψ') ≤ Z̄; (7)

Z(Ψ'') ≤ Z̄. (8)

If at least one of the conditions (7) or (8) is fulfilled, then, among the permissible solutions, select the model that provides a lower inference error, otherwise proceed to step 5.

5. For the coordinates that do not satisfy limitations (7) and (8), find the increment in the inference error ΔE. Find the coordinate for which

Δ = min{ΔE(ki(t) − 1, m(t)), ΔE(ki(t), m(t) − 1)},

t := t + 1. For vector Ψ(t), assign:

ki(t) := ki(t−1) − 1, Ψ(t) := Ψ', if Δ = ΔE(ki);

m(t) := m(t−1) − 1, Ψ(t) := Ψ'', if Δ = ΔE(m).

Proceed to step 4.

4. 3. 2. Algorithms of the optimization under the online mode

Gradients

γx(uI), I = 1, ..., N, and γy(vJ), J = 1, ..., m,

will be defined as the ratio of the accuracy increment ΔE(uI = 1, ΨR) or ΔE(vJ = 1, ΨR) to the increment in the number of rules ΔZ(uI = 1, ΨR) or ΔZ(vJ = 1, ΨR) as a result of adding the input or output term CI or dJ:

γx(uI) = ΔE(uI, ΨR) / ΔZ(uI, ΨR) = [E(uI = 0, ΨR) − E(uI = 1, ΨR)] / [Z(uI = 1, ΨR) − Z(uI = 0, ΨR)],

γy(vJ) = ΔE(vJ, ΨR) / ΔZ(vJ, ΨR) = [E(vJ = 0, ΨR) − E(vJ = 1, ΨR)] / [Z(vJ = 1, ΨR) − Z(vJ = 0, ΨR)].

We designate the solution vector, obtained at the t-th step of the optimization algorithm, as Ψ(t) = (U(t), V(t), ΨR(t)). The algorithm for solving the problem in the direct statement is performed in the following sequence.

1. Set the zero-option of a fuzzy model:

t := 0, Ψ(0) = (U(0), V(0), ΨR(0)).

If E(Ψ(0)) ≤ Ē, proceed to step 4.

2. If E(Ψ(t)) > Ē, proceed to step 3, otherwise - to step 4.

3. For the models where uI(t) = 0 and vJ(t) = 0, add an input or output term as follows:

ΨI' = (uI(t) + 1, V(t), ΨRI') or ΨJ'' = (U(t), vJ(t) + 1, ΨRJ'').

Determine gradients γx(uI) and γy(vJ) relative to solution Ψ(t). Find the term for which γ = max{γxL, γyM}, where:

γxL(uL(t)) = max over I {γx(uI)},

γyM(vM(t)) = max over J {γy(vJ)}.

t := t + 1. For vector Ψ(t), assign:

uL(t) = 1, Ψ(t) := ΨL', if γ = γxL; vM(t) = 1, Ψ(t) := ΨM'', if γ = γyM.

Proceed to step 2.

4. Improve model Ψ(t) by attaining the required level of inference accuracy with fewer terms. For the models for which uI(t) = 1 and vJ(t) = 1, decrease the complexity by reducing the number of terms in the following way:

ΨI' = (uI(t) − 1, V(t), ΨRI'); ΨJ'' = (U(t), vJ(t) − 1, ΨRJ'').

For the inputs and outputs, find such sets of terms Qx(t) and Qy(t), for which the conditions are fulfilled:

E(ΨI') ≤ Ē; (9)

E(ΨJ'') ≤ Ē. (10)

If Qx(t) and Qy(t) are empty sets, consider vector Ψ(t) the result of solving the problem, otherwise proceed to step 5.

5. For terms CI ∈ Qx(t) and dJ ∈ Qy(t), which satisfy conditions (9) and (10), find the magnitude ΔZ by which the number of rules decreases. Find the term for which

ΔZ = max{ΔZL, ΔZM}, where

ΔZL(uL(t)) = max{ΔZ(uI(t) − 1)}; ΔZM(vM(t)) = max{ΔZ(vJ(t) − 1)}.

t := t + 1. For vector Ψ(t), assign: uL(t) = 0, Ψ(t) := ΨL', if ΔZ = ΔZL; vM(t) = 0, Ψ(t) := ΨM'', if ΔZ = ΔZM. Proceed to step 4.

The algorithm of solving the problem in the dual statement is performed in the following sequence.

1. Set the zero-option of a fuzzy model:

t := 0, Ψ(0) = (U(0), V(0), ΨR(0)).

If Z(Ψ(0)) > Z̄, proceed to step 4.

2. If Z(Ψ(t)) < Z̄, proceed to step 3, otherwise - to step 4.

3. The essence of this step coincides with step 3 of the algorithm for solving the problem in the direct statement. Proceed to step 2.

4. Decrease the complexity of model Ψ(t) for the inclusion in the area of permissible solutions. For the models in which uI(t) = 1 and vJ(t) = 1, decrease the number of terms in the following way:

ΨI' = (uI(t) − 1, V(t), ΨRI'); ΨJ'' = (U(t), vJ(t) − 1, ΨRJ'').

For the inputs and outputs, find such sets of terms Qx(t) and Qy(t), for which the conditions are satisfied:

Z(ΨI') > Z̄; (11)

Z(ΨJ'') > Z̄. (12)

If at least one of conditions (11) or (12) is not met, then choose among the permissible solutions the model that provides a lower inference error, otherwise proceed to step 5.

5. For terms CI ∈ Qx(t) and dJ ∈ Qy(t), which satisfy conditions (11) and (12), find the magnitude ΔE by which the inference error increases. Find the term for which

ΔE = min{ΔEL, ΔEM}, where

ΔEL(uL(t)) = min{ΔE(uI(t) − 1)}; ΔEM(vM(t)) = min{ΔE(vJ(t) − 1)}.

t := t + 1. For vector Ψ(t), assign:

uL(t) = 0, Ψ(t) := ΨL', if ΔE = ΔEL;

vM(t) = 0, Ψ(t) := ΨM'', if ΔE = ΔEM.

Proceed to step 4.

5. Results of computer experiment

For the model-standard [15, 16], the number of terms is limited as follows:

k̄1 = 9, k̄2 = 7, m̄ = 7.

The task implied the transformation of the expert zero-option of a knowledge base into the variant which provides: Z → min and E ≤ 0.5 in the direct statement; E → min and Z ≤ 30 in the dual statement.

Results of the solution of the optimization problems are listed in Table 1, where each iteration represents the results of designing model Ψ(t) for the current number of terms ki(t) and m(t) with further arrangement of terms in vectors U(t) and V(t).

The first acceptable solution of the direct problem is obtained at step 4 by successive adding of terms with the highest gradients:

- term c24 at step 2, since:

γy(v3) = (0.6712 − 0.6104) / (16 − 12) = 0.0152,

γx1(u12, u18) = (0.6712 − 0.6380) / (14 − 12) = 0.0166,

γx2(u24) = (0.6712 − 0.5968) / (15 − 12) = 0.0248;

- terms c12 and c18 at step 3, since:

γy(v3) = (0.5968 − 0.5575) / (19 − 15) = 0.0098,

γx1(u12, u18) = (0.5968 − 0.5310) / (17 − 15) = 0.0329,

γx2(u23) = (0.5968 − 0.5632) / (18 − 15) = 0.0112;

- class d3 at step 4, since:

γy(v3) = (0.5310 − 0.4873) / (21 − 17) = 0.0109,

γx1(u13, u17) = (0.5310 − 0.5250) / (21 − 17) = 0.0015,

γx2(u23) = (0.5310 − 0.5101) / (20 − 17) = 0.0070.

Table 1

Calculation of optimization problems

t | k1 | k2 | m | u11...u19 | u21...u27 | v1...v7 | Z | E

1 | 5 | 4 | 5 | 100111001 | 1100110 | 1101011 | 12 | 0.6712
2 | 5 | 5 | 5 | 100111001 | 1101110 | 1101011 | 15 | 0.5968
3 | 7 | 5 | 5 | 110111011 | 1101110 | 1101011 | 17 | 0.5310
4 | 7 | 5 | 6 | 110111011 | 1101110 | 1111011 | 21 | 0.4873
5 | 5 | 5 | 6 | 100111001 | 1101110 | 1111011 | 19 | 0.5575
6 | 7 | 6 | 6 | 110111011 | 1111110 | 1111011 | 22 | 0.4625
7 | 7 | 6 | 7 | 110111011 | 1111110 | 1111111 | 24 | 0.4318
8 | 9 | 6 | 7 | 111111111 | 1111110 | 1111111 | 28 | 0.3514
9 | 9 | 7 | 7 | 111111111 | 1111111 | 1111111 | 31 | 0.3007
10 | 7 | 7 | 7 | 110111011 | 1111111 | 1111111 | 27 | 0.3819

Model Ψ(4) remains the solution of the direct problem. Decreasing the complexity leads to model Ψ(5) leaving the region of permissible solutions. A further increase in the number of terms in model Ψ(6) decreases the inference error by ΔE = 0.0248 while increasing the number of rules by ΔZ = 1.
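As a sanity check, the gradients quoted in the step-2 trace can be recomputed directly from the E and Z columns of Table 1 (all values below are taken from the paper):

```python
# gamma = (E_current - E_candidate) / (Z_candidate - Z_current), Table 1 data
gamma_y_v3   = (0.6712 - 0.6104) / (16 - 12)  # candidate: add class d3
gamma_x1_u12 = (0.6712 - 0.6380) / (14 - 12)  # candidate: add terms c12, c18
gamma_x2_u24 = (0.6712 - 0.5968) / (15 - 12)  # candidate: add term c24
print(round(gamma_y_v3, 4), round(gamma_x1_u12, 4), round(gamma_x2_u24, 4))
# -> 0.0152 0.0166 0.0248
```

Term c24 has the largest gradient, so it is the one added at step 2, in agreement with the trace.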

The solution of the dual problem was continued by adding terms with the highest gradients:

- term c23 at step 6, since:

γy(v5) = (0.4873 − 0.4509) / (23 − 21) = 0.0182,

γx1(u13, u17) = (0.4873 − 0.4716) / (23 − 21) = 0.0079,

γx2(u23) = (0.4873 − 0.4625) / (22 − 21) = 0.0248;

- class d5 at step 7, since:

γy(v5) = (0.4625 − 0.4318) / (24 − 22) = 0.0153,

γx1(u13, u17) = (0.4625 − 0.4437) / (26 − 22) = 0.0047,

γx2(u27) = (0.4625 − 0.4560) / (25 − 22) = 0.0022;

- terms c13 and c17 at step 8, since:

γx1(u13, u17) = (0.4318 − 0.3514) / (28 − 24) = 0.0201,

γx2(u27) = (0.4318 − 0.3819) / (27 − 24) = 0.0166.

Model Ψ(8) remains the solution of the dual problem. A further increase in the number of terms leads to model Ψ(9) leaving the region of permissible solutions, and decreasing the complexity of model Ψ(10) increases the inference error by ΔE = 0.0305 at decreasing the number of rules by ΔZ = 1.

Matrices of fuzzy relations in the solutions of the direct and dual problems (Tables 2, 3) are associated with the fuzzy rule bases presented in Tables 4, 5. Results of structural and parametric tuning of models Ψ(4) and Ψ(8) are shown in Fig. 1, 2.

Table 2

Matrix of fuzzy relations for a direct problem

IF | THEN y: d1 | d2 | d3 | d4 | d6 | d7

x1:
c11 | 0 | 0.65 | 0.72 | 0.92 | 0 | 0
c12 | 0.74 | 0 | 0.65 | 0 | 0 | 0
c14 | 0.50 | 0.84 | 0 | 0.69 | 0.57 | 0
c15 | 0.72 | 0.96 | 0 | 0 | 0.92 | 1.00
c16 | 0.53 | 0.82 | 0 | 0.70 | 0.57 | 0
c18 | 0.77 | 0 | 0.63 | 0 | 0 | 0
c19 | 0 | 0.67 | 0.71 | 0.95 | 0 | 0

x2:
c21 | 0 | 0.82 | 0.71 | 0.91 | 0 | 0
c22 | 0.74 | 0 | 0.73 | 0 | 0 | 0
c24 | 0.75 | 0.63 | 0 | 0.63 | 0.94 | 0
c25 | 0.64 | 0.52 | 0 | 0 | 0.87 | 1.00
c26 | 0.50 | 0.61 | 0 | 0.61 | 0.92 | 0

Table 3

Matrix of fuzzy relations for a dual problem

IF | THEN y: d1 | d2 | d3 | d4 | d5 | d6 | d7

x1:
c11 | 0 | 0.76 | 0.82 | 0.95 | 0 | 0 | 0
c12 | 0.82 | 0 | 0.65 | 0 | 0 | 0 | 0
c13 | 0.51 | 0.70 | 0 | 0.70 | 0.70 | 0 | 0
c14 | 0.52 | 0.64 | 0 | 0 | 0.83 | 0.92 | 0
c15 | 0.64 | 0.64 | 0 | 0 | 0.71 | 0.86 | 1.00
c16 | 0.50 | 0.63 | 0 | 0 | 0.85 | 0.93 | 0
c17 | 0.52 | 0.70 | 0 | 0.70 | 0.70 | 0 | 0
c18 | 0.81 | 0 | 0.64 | 0 | 0 | 0 | 0
c19 | 0 | 0.75 | 0.83 | 0.94 | 0 | 0 | 0

x2:
c21 | 0 | 0.82 | 0.71 | 0.93 | 0 | 0 | 0
c22 | 0.78 | 0 | 0.72 | 0 | 0 | 0 | 0
c23 | 0.77 | 0.80 | 0 | 0.70 | 0.86 | 0 | 0
c24 | 0.73 | 0.64 | 0 | 0 | 0.83 | 0.76 | 0
c25 | 0.61 | 0.52 | 0 | 0 | 0.77 | 0.89 | 1.00
c26 | 0.50 | 0.80 | 0 | 0.70 | 0.70 | 0.70 | 0

Table 4

IF-THEN rules for a direct problem

Rule IF THEN

x1 X2 y

1, 2 3 [0.61, 2.03] or [3.90, 5.45] [1.75, 4.12] [0.60, 4.00] [0.60, 1.97] d1, [-0.25, 0.39]

4, 5 6 [0, 0.86] or [5.17, 6.00] [1.75, 4.12] [1.68, 4.00] [0, 0.87] d2, [0.32, 0.80]

7, 8 9, 10 [0, 0.86] or [5.17, 6.00] [0.61, 2.03] or [3.90, 5.45] [0.60, 1.97] [0, 0.87] d3, [0.73, 1.25]

11, 12 13, 14 15, 16 [0, 0.86] or [5.17, 6.00] [1.75, 2.72] or [3.30, 4.12] [1.75, 2.72] or [3.30, 4.12] [0, 0.87] [1.68, 2.70] [3.32, 4.00] d4, [1.10, 1.97]

17, 18 19 20 [1.75, 2.72] or [3.30, 4.12] [2.54, 3.45] [2.54, 3.45] [2.56, 3.50] [1.68, 2.70] [3.32, 4.00] d6, [1.85, 2.59]

21 [2.54, 3.45] [2.56, 3.50] d7, [2.41, 3.20]

Table 5

IF-THEN rules for a dual problem

Rule IF THEN

X1 X2 y


1, 2 3 [0.52, 1.46] or [4.68, 5.54] [1.32, 4.82] [0.56, 4.00] [0.56, 1.42] d1, [-0.18, 0.34]

4, 5 6 [0, 0.64] or [5.42, 6.00] [1.32, 4.82] [1.30, 4.00] [0, 0.68] d2, [0.26, 0.71]

7, 8 9, 10 [0, 0.64] or [5.42, 6.00] [0.52, 1.46] or [4.68, 5.54] [0.56, 1.42] [0, 0.68] d3, [0.64, 1.00]

11, 12 13, 14 15, 16 [0, 0.64] or [5.42, 6.00] [1.32, 2.24] or [3.85, 4.82] [1.32, 2.24] or [3.85, 4.82] [0, 0.68] [1.30, 2.18] [3.37, 4.00] d4, [0.92, 1.63]

17, 18 19, 20 21, 22 23 [1.32, 2.24] or [3.85, 4.82] [2.12, 2.76] or [3.34, 3.97] [2.12, 2.76] or [3.34, 3.97] [2.12, 3.97] [2.06, 3.45] [2.06, 2.72] [3.37, 4.00] [1.30, 2.18] d5, [1.50, 2.25]

24, 25 26 27 [2.12, 2.76] or [3.34, 3.97] [2.64, 3.48] [2.64, 3.48] [2.61, 3.45] [2.06, 2.72] [3.37, 4.00] d6, [2.11, 2.78]

28 [2.64, 3.48] [2.61, 3.45] d7, [2.65, 3.37]

For the solutions of the direct and dual problems, the compromise "inference accuracy - complexity" is achieved by adding/removing output class d5 and input terms c13, c17 and c23.

Fig. 1. Results of the structural tuning for solving: a - direct problem; b - dual problem (membership functions of classes d1-d7)

Fig. 2. Results of parametric tuning for solving: a - direct problem; b - dual problem

6. Discussion of results of assessing the complexity of tuning algorithms for a fuzzy classification knowledge base

The proposed method, as well as methods [14, 15], represents the formalization of improving transformations for an expert fuzzy knowledge base. At the same time, controlling variables are set, which are the number of input terms, output classes and rules. Improving transformations make it possible

to formalize the process of generation of fuzzy knowledge base variants with a subsequent selection by the criteria of accuracy and costs or by the complexity of the tuning process.

Assume the number of rules (classes) is limited, and the number of input terms is unknown. Then the number of tuning parameters for the fuzzy classification knowledge base is 2nZ+2m for two-parameter membership functions [2] or upper and lower boundaries of interval rules [8-13]. Assume that in addition to the number of output classes and rules, the number of input terms is also limited. Then relations matrices are implanted into the antecedents of fuzzy rules, and the number of tuning parameters of a relational fuzzy knowledge base is ZNm + 2N + 2m [4, 5].

If the number of rules (classes) is subjected to minimization, we limit the number of terms of input NT and output MT whose linguistic modification provides the required inference accuracy [14-16]. The number of tuning parameters of the rules generator based on the matrix of fuzzy relations is NTMT + 2NT + 2MT. An inverse inference for m output terms requires the solution of Z optimization problems with 2n variables for the upper and lower boundaries of the intervals [16].

Compared with [2, 4, 5, 8-13, 14-16], the proposed method allows us to decrease the number of tuning parameters to Nm + 2N + 2m for partition matrices and the upper and lower boundaries of triangular membership functions. The shortcoming of the method is the necessity of obtaining linguistic IF-THEN rules, which are associated with a fuzzy partition matrix.
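Treating the counting formulas above at face value, the totals can be compared for the dual solution Ψ(8) from Table 1 (n = 2 inputs, N = k1 + k2 = 9 + 6 = 15 input terms, m = 7 classes, Z = 28 rules); this is only an arithmetic illustration of the formulas, not a benchmark:

```python
n, N, m, Z = 2, 15, 7, 28
rule_based = 2 * n * Z + 2 * m          # two-parameter MFs of rule antecedents
relational = Z * N * m + 2 * N + 2 * m  # relations implanted into antecedents
proposed   = N * m + 2 * N + 2 * m      # partition matrix + triangular MF bounds
print(rule_based, relational, proposed)
# -> 126 2984 149
```

The dominant saving of the proposed scheme is against the ZNm term of relational antecedents; the comparison with the 2nZ rule-based count depends on the concrete n, N and Z.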

7. Conclusions

1. The models and methods were developed for the optimization design of fuzzy classification knowledge bases by the criteria "inference accuracy - complexity". A fuzzy relational model, which corresponds to a fuzzy classification knowledge base, was proposed. The problem of the optimization of a fuzzy knowledge base is reduced to the problem of min-max clustering and comes down to selecting such partition matrices "inputs - output" that provide the required or extreme levels of accuracy and the number of rules.

2. The selection of output classes and input terms is reduced to the problem on discrete optimization of the algorithm reliability indicators, for the solution of which we employed the gradient method. This resolves a general problem in the methods of min-max clustering related to the selection of the number of output classes and minimization of the number of input terms without losing inference accuracy. The number and location of hyperboxes are determined by the relation matrix "input terms - output classes", and the sizes of hyperboxes are defined as a result of tuning of the triangular membership functions. Selection of the number of input and output terms in partition matrices may be performed both under the offline mode and by adaptive adding/ removing of terms. A transition to the relational fuzzy model provides the simplification of the process of knowledge bases tuning both for the assigned and unknown output classes.

References

1. Yager, R. Essentials of fuzzy modeling and control [Text] / R. Yager, D. Filev. - New York: John Wiley & Sons, 1994. - 408 p.

2. Rotshtein, A. P. Fuzzy Evidence in Identification, Forecasting and Diagnosis [Text] / A. P. Rotshtein, H. B. Rakytyanska. - Heidelberg: Springer, 2012. - 314 p. doi: 10.1007/978-3-642-25786-5

3. Mandal, S. SISO fuzzy relational inference systems based on fuzzy implications are universal approximators [Text] / S. Mandal, B. Jayaram // Fuzzy Sets and Systems. - 2015. - Vol. 277. - P. 1-21. doi: 10.1016/j.fss.2014.10.003

4. Scherer, R. Relational Modular Fuzzy Systems [Text] / R. Scherer // Studies in Fuzziness and Soft Computing. - Springer Berlin Heidelberg, 2012. - P. 39-50. doi: 10.1007/978-3-642-30604-4_4

5. Gonzalez, A. An Efficient Inductive Genetic Learning Algorithm for Fuzzy Relational Rules [Text] / A. Gonzalez, R. Perez, Y. Caises, E. Leyva // International Journal of Computational Intelligence Systems. - 2012. - Vol. 5, Issue 2. - P. 212-230. doi: 10.1080/18756891.2012.685265

6. Graves, D. Clustering with proximity knowledge and relational knowledge [Text] / D. Graves, J. Noppen, W. Pedrycz // Pattern Recognition. - 2012. - Vol. 45, Issue 7. - P. 2633-2644. doi: 10.1016/j.patcog.2011.12.019

7. De Carvalho, F. de A. T. Relational partitioning fuzzy clustering algorithms based on multiple dissimilarity matrices [Text] / F. de A. T. de Carvalho, Y. Lechevallier, F. M. de Melo // Fuzzy Sets and Systems. - 2013. - Vol. 215. - P. 1-28. doi: 10.1016/j.fss.2012.09.011

8. Bargiela, A. Optimised Information Abstraction in Granular Min/Max Clustering [Text] / A. Bargiela, W. Pedrycz // Smart Innovation, Systems and Technologies. - 2013. - P. 31-48. doi: 10.1007/978-3-642-28699-5_3

9. Gaspar, P. Parameter Influence in Genetic Algorithm Optimization of Support Vector Machines [Text] / P. Gaspar, J. Carbonell, J. L. Oliveira // Advances in Intelligent and Soft Computing. - 2012. - P. 43-51. doi: 10.1007/978-3-642-28839-5_5

10. Wu, Z. A fuzzy support vector machine algorithm for classification based on a novel PIM fuzzy clustering method [Text] / Z. Wu, H. Zhang, J. Liu // Neurocomputing. - 2014. - Vol. 125. - P. 119-124. doi: 10.1016/j.neucom.2012.07.049

11. Mohammed, M. F. Improving the Fuzzy Min-Max neural network with a K-nearest hyperbox expansion rule for pattern classification [Text] / M. F. Mohammed, C. P. Lim // Applied Soft Computing. - 2017. - Vol. 52. - P. 135-145. doi: 10.1016/j.asoc.2016.12.001

12. Seera, M. A modified fuzzy min-max neural network for data clustering and its application to power quality monitoring [Text] / M. Seera, C. P. Lim, C. K. Loo, H. Singh // Applied Soft Computing. - 2015. - Vol. 28. - P. 19-29. doi: 10.1016/j.asoc.2014.09.050

13. Reyes-Galaviz, O. F. Granular fuzzy modeling with evolving hyperboxes in multi-dimensional space of numerical data [Text] / O. F. Reyes-Galaviz, W. Pedrycz // Neurocomputing. - 2015. - Vol. 168. - P. 240-253. doi: 10.1016/j.neucom.2015.05.102

14. Rotshtein, A. P. Optimal design of rule-based systems by solving fuzzy relational equations [Text] / A. P. Rotshtein, H. B. Rakytyanska // Studies in Computational Intelligence. - 2014. - P. 167-178. doi: 10.1007/978-3-319-06883-1_14

15. Rakytyanska, H. Optimization of composite fuzzy knowledge bases on rules and relations [Text] / H. Rakytyanska // Inf. Technol. Comput. Eng. - 2015. - Issue 1. - P. 17-26.

16. Rakytyanska, H. Fuzzy classification knowledge base construction based on trend rules and inverse inference [Text] / H. Rakytyanska // Eastern-European Journal of Enterprise Technologies. - 2015. - Vol. 1, Issue 3 (73). - P. 25-32. doi: 10.15587/1729-4061.2015.36934

17. Rotshtein, A. P. Fuzzy Genetic Object Identification: Multiple Inputs/Multiple Outputs Case [Text] / A. P. Rotshtein, H. B. Rakytyanska // Advances in Intelligent and Soft Computing. - 2012. - P. 375-394. doi: 10.1007/978-3-642-23172-8_25
