Scientific article: 'The asymptotic probabilistic genetic algorithm' (specialty: Mathematics)

CC BY
Keywords
PROBABILISTIC GENETIC ALGORITHM / MUTATION / SELECTION

Abstract of the scientific article in mathematics; authors: Galushin P. V., Semenkin E. S.

This paper proposes a modification of the probabilistic genetic algorithm that uses genetic operators affecting not particular solutions but the probability distribution of the solution vector's components. The paper also compares the reliability and efficiency of the base algorithm and the proposed modification on a set of test optimization problems and on the bank loan portfolio problem.


[Closing equations of the preceding article: the formulas are garbled in this copy and cannot be reconstructed. The only legible statement is that A, B and a third symbol (lost in extraction, with square equal to 1) are arbitrary constants.]


© Gomonova O. V., Senashov S. I., 2009

P. V. Galushin, E. S. Semenkin
Siberian State Aerospace University named after academician M. F. Reshetnev, Russia, Krasnoyarsk

THE ASYMPTOTIC PROBABILISTIC GENETIC ALGORITHM*


Keywords: probabilistic genetic algorithm, mutation, selection.

The probabilistic genetic algorithm (PGA) is an attempt to create an algorithm with a scheme similar to that of the traditional genetic algorithm (GA), preserving the basic properties of the genetic operators, but defined in terms of pseudo-Boolean optimization theory [1].

The probabilistic genetic algorithm explicitly (as opposed to the traditional GA) computes the components of the probability vector and has no crossover operator (it is replaced by a random solution generation operator), but it retains the genetic operators of mutation and selection.

The purpose of this study is to develop a modification of the probabilistic genetic algorithm whose mutation and selection operators affect not particular individuals but the distribution of gene values as a whole, and to compare the efficiency and reliability of the base algorithm and the modification.

Asymptotic mutation. PGA uses the standard GA mutation operator, which inverts each gene with a given probability (as a rule, this probability is very low). Since genes mutate independently, we can study one particular gene; all the following formulas hold for every gene in the chromosome. Let p denote the probability that the gene is equal to 1 before mutation. We will determine the probability that the same gene is equal to 1 after mutation (p' denotes this probability). The mutation probability is $p_m$.

The gene can be equal to 1 after mutation in two cases: it was equal to 1 before mutation and has not mutated, or it was equal to 0 before mutation and has mutated. If x denotes the gene value before mutation and y the value after mutation, the following equality holds:

$$P\{y = 1\} = P\{x = 1\}(1 - p_m) + P\{x = 0\}\,p_m = p(1 - p_m) + (1 - p)p_m = p_m + p(1 - 2p_m).$$

Using the notation introduced above for the gene probabilities before and after mutation, we can write:

$$p' = p_m + p(1 - 2p_m).$$

* This work was financially supported by the State programs "The Development of Scientific Potential of Higher Education Institutions" (Project 2.1.1/2710) and "The Scientific and Educational Staff of Innovative Russia" (Project NK-136P/3).

This equation can be used to implement a mutation operator that affects not the genes of individual solutions but the distribution of genes as a whole.
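As a minimal sketch, the transformation p' = p_m + p(1 - 2 p_m) can be applied to a whole probability vector in a few lines of Python. The function name `asymptotic_mutation` and the example values are illustrative assumptions, not taken from the paper.

```python
def asymptotic_mutation(probabilities, p_m):
    """Apply p' = p_m + p * (1 - 2 * p_m) to every component of the vector."""
    return [p_m + p * (1.0 - 2.0 * p_m) for p in probabilities]


# Hypothetical probability vector (probability that each gene equals 1).
probs = [0.0, 0.25, 0.5, 1.0]
mutated = asymptotic_mutation(probs, p_m=0.1)
print(mutated)  # the extremes 0 and 1 move to p_m and 1 - p_m; 0.5 stays put
```

No individual solution is touched here: the cost is one multiplication and one addition per gene.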

This scheme differs from the classical one in that mutation (in the traditional GA sense) is absent; instead, after the gene distribution of the intermediate population has been estimated, each component of the probability vector is transformed by the formula above. Such a transformation can be called "the mutation operator affecting the distribution" or "the asymptotic mutation operator".

The traditional implementation of mutation in the PGA can be seen as an estimation of the gene distribution by the Monte-Carlo technique: in the end we obtain a stochastic approximation of the result given by the formula. The term "asymptotic mutation" reflects the fact that this procedure is the limiting case of the traditional mutation operator as the population size tends to infinity.
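The limiting behaviour can be checked numerically. The sketch below simulates traditional (flip-based) mutation of a single gene for several population sizes and compares the resulting frequency of ones with the formula; the values p = 0.8, p_m = 0.05, the seed and the helper name `mutated_frequency` are assumptions chosen for illustration.

```python
import random


def mutated_frequency(p, p_m, population_size, rng):
    """Estimate P{gene = 1 after mutation} by mutating sampled genes one by one."""
    ones = 0
    for _ in range(population_size):
        gene = 1 if rng.random() < p else 0   # gene value before mutation
        if rng.random() < p_m:                # traditional mutation: flip with probability p_m
            gene = 1 - gene
        ones += gene
    return ones / population_size


rng = random.Random(2009)                     # fixed seed so the run is repeatable
p, p_m = 0.8, 0.05
exact = p_m + p * (1 - 2 * p_m)               # value given by the asymptotic formula
estimates = {n: mutated_frequency(p, p_m, n, rng) for n in (100, 10_000, 200_000)}
for n, estimate in estimates.items():
    print(f"N = {n:>7}: Monte-Carlo {estimate:.4f}, formula {exact:.4f}")
```

The Monte-Carlo estimates scatter around the formula value, and the scatter shrinks as N grows.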

Let us examine some properties of the proposed procedure. The transformation is linear and maps the interval [0; 1] onto the interval [$p_m$; $1 - p_m$]. The linearity is obvious, and the boundaries are obtained by substituting $p = 0$ and $p = 1$. Since a linear function is also monotonic, all values from [0; 1] are mapped into [$p_m$; $1 - p_m$]. Thus, as long as $p_m$ exceeds 0, asymptotic mutation does not let the probability p reach the values 0 and 1; this excludes premature convergence.

It is easy to see that $p = 0.5$ is the fixed point of this mapping: if the gene values had equal probabilities before mutation, this property is preserved after mutation. Thus, the proposed transformation moves the distribution toward the uniform one.
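The fixed point and the pull toward p = 0.5 can be made explicit with a short calculation. This is a worked derivation from the mutation formula; the iteration index t is notation introduced here, and selection is ignored.

```latex
p' - \tfrac{1}{2}
  = p_m + p\,(1 - 2p_m) - \tfrac{1}{2}
  = (1 - 2p_m)\left(p - \tfrac{1}{2}\right),
\qquad\text{hence}\qquad
p^{(t)} - \tfrac{1}{2} = (1 - 2p_m)^{t}\left(p^{(0)} - \tfrac{1}{2}\right).
```

For 0 < p_m < 1 the factor |1 - 2 p_m| is smaller than 1, so each application of the transformation shrinks the distance between p and 0.5 by the same ratio.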

Another feature of the proposed procedure is that its running time does not depend on the population size, because it performs only simple linear transformations of the components of the gene probability vector. In the traditional PGA, at every iteration a random real number from the interval [0; 1] is generated for each gene of every solution; if this number falls into the interval [0; $p_m$] (where $p_m$ is the mutation probability), the gene is inverted (flipped). The complexity of the traditional mutation procedure (for one iteration) is $O(N \cdot M)$, where N is the population size and M is the number of genes. Most of the genes will not be flipped, since the mutation probability is usually small.

Let us compute the probability that a solution stays unchanged after mutation, and denote this probability by Q. The solution does not change if all of its genes remain unchanged. Genes mutate independently, and each stays unchanged with probability $q_m = 1 - p_m$. By the rule for the joint probability of independent events, $Q = (1 - p_m)^M$. The mutation probability is often set to $p_m = 1/M$; in this case $Q \approx e^{-1}$. This approximation is accurate enough even for M = 10. This means that more than one third of all solutions do not change during the mutation procedure; in other words, in more than one third of cases all computations of the mutation operator acting on solutions are performed only to find out that no action is needed.
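The quality of the approximation Q ≈ 1/e is easy to inspect numerically. The following sketch evaluates Q = (1 - 1/M)^M for a few chromosome lengths; the chosen values of M and the function name are illustrative.

```python
import math


def unchanged_probability(num_genes):
    """Q = (1 - p_m)^M for the common setting p_m = 1 / M."""
    p_m = 1.0 / num_genes
    return (1.0 - p_m) ** num_genes


for m in (10, 50, 100, 1000):
    q = unchanged_probability(m)
    print(f"M = {m:>4}: Q = {q:.4f}, 1/e = {math.exp(-1):.4f}")
```

Already for M = 10 the value of Q lies above one third and within about 0.02 of 1/e.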

The proposed mutation implementation has no such drawback. Its algorithmic complexity is $O(M)$, meaning that the running time of this procedure does not depend on the population size. In practice, the time consumed by the mutation procedure is small compared to that of procedures whose complexity depends on the population size (in the proposed algorithm, selection is such a procedure).

The proposed mutation implementation does not require the generation of random numbers (which can be an expensive operation). The probabilities of gene values are computed independently; therefore the proposed procedure can be implemented on parallel or vector hardware.

This approach contains no conditional logic (branching statements) and is therefore better suited to modern processors with instruction pipelines [2].

Note that although the proposed PGA modification has no traditional mutation procedure, the parameter of that procedure, the mutation probability, is retained, so the user still has to specify it. This can be seen as both a disadvantage and an advantage. On the one hand, it is more convenient for the user not to have to specify and tune parameters. On the other hand, the proposed modification makes no assumption about the mutation probability and can therefore be used with any method of setting it, including self-adjustment (tuning the mutation probability during the optimization process).

The proposed distribution transformation can be seen not only as an implementation of mutation but also as an additional step of distribution estimation whose purpose is to avoid premature convergence. The relation between the distributions before and after mutation is analogous to the relation between the classical and Bayesian statistical estimates of a probability based on sample frequencies. This relation is expressed by the following formula:

$$p_B = \frac{np + C}{n + 2C},$$

where $p_B$ is the Bayesian estimate of the probability, p is the classical estimate, n is the total number of experiments (sample size), and C is a parameter (usually equal to 1). After simple transformations we find that the Bayesian estimate is equivalent to the "mutated" classical estimate if the mutation probability is equal to $C/(n + 2C)$.
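The stated equivalence can be verified exactly with rational arithmetic. The sketch below checks, for every possible sample frequency with n = 20 and C = 1 (values chosen only for illustration), that the Bayesian estimate coincides with the mutation formula evaluated at p_m = C/(n + 2C).

```python
from fractions import Fraction


def bayesian_estimate(p, n, c=1):
    """p_B = (n * p + C) / (n + 2 * C)."""
    return (n * p + c) / (n + 2 * c)


def asymptotic_mutation(p, p_m):
    """p' = p_m + p * (1 - 2 * p_m)."""
    return p_m + p * (1 - 2 * p_m)


n, c = 20, 1
p_m = Fraction(c, n + 2 * c)                  # mutation probability C / (n + 2C)
frequencies = [Fraction(successes, n) for successes in range(n + 1)]
matches = [bayesian_estimate(p, n, c) == asymptotic_mutation(p, p_m) for p in frequencies]
print(all(matches))
```

Because `Fraction` arithmetic is exact, the comparison is an identity check rather than a floating-point approximation.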

Asymptotic selection. Let us now consider the selection procedure. In the PGA (as in the traditional GA), selection generates an intermediate population: individuals with better fitness have a higher probability of being selected into it. After the intermediate population has been generated, mutation is performed and the probabilities of the gene values are estimated. This procedure can be seen as a Monte-Carlo estimation (as was the case for the mutation procedure in the previous section). Since the selection probabilities are known exactly, we can simply compute the resulting distribution.

Let the population contain individuals $x^1, ..., x^n$, and let their probabilities of being selected (in one trial) be $g_1, ..., g_n$. The expected probability that the i-th gene will be equal to 1 is

$$p_i = \sum_{k=1}^{n} x_i^k g_k.$$

Using this formula, the distribution of genes in the intermediate population can be calculated without an explicit selection procedure. This approach can be called "asymptotic", since the distribution it yields is the limit of the distributions generated by the traditional approach as the population size tends to infinity.
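A small sketch of this computation, assuming proportional selection with strictly positive fitness values to be maximized (both the selection rule and the three-individual population are illustrative assumptions):

```python
def proportional_probabilities(fitness):
    """Selection probabilities g_k proportional to (positive) fitness values."""
    total = sum(fitness)
    return [f / total for f in fitness]


def asymptotic_selection(population, selection_probs):
    """p_i = sum over k of x_i^k * g_k, computed without sampling anyone."""
    num_genes = len(population[0])
    return [
        sum(individual[i] * g for individual, g in zip(population, selection_probs))
        for i in range(num_genes)
    ]


population = [[1, 0, 1], [0, 0, 1], [1, 1, 0]]   # hypothetical binary individuals
fitness = [3.0, 1.0, 4.0]                        # hypothetical fitness values
g = proportional_probabilities(fitness)
gene_probs = asymptotic_selection(population, g)
print(g, gene_probs)
```

Each gene probability is simply the total selection probability of the individuals that carry a 1 in that position.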

Proportional and ranking selection calculate selection probabilities explicitly; therefore the asymptotic approach can be applied to these methods directly. Tournament selection does not use explicit selection probabilities, so the asymptotic approach cannot be applied to it without modification. However, tournament selection is often more efficient and reliable than other selection methods, and therefore extending the asymptotic approach to tournament selection is an important problem.

It can be shown that tournament selection is a kind of ranking selection with implicit selection probabilities. Let us consider the selection procedure of this method: tournament groups are generated randomly (without regard to the individuals' fitness), and the winner of a tournament is the individual with the best fitness in its group. Finding the individual with the maximal fitness value does not depend on the values themselves, so we can consider only their ranks.

To build an asymptotic selection method equivalent to tournament selection, it is necessary to find the dependence between selection probabilities and fitness ranks.

Let the tournament group size be denoted by S. Let us assume (for simplicity) that the population does not contain individuals with equal fitness values. Our objective is to find the probability $g_k$ of selecting the individual with the k-th rank. Tournament groups are formed at random, with a uniform distribution. The tournament winner is the individual with the highest rank, so the winner's rank has the same distribution as the maximum of S uniformly distributed random values [3]:

$$g_k = \frac{k^S - (k - 1)^S}{n^S}.$$
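The rank probabilities can be tabulated and compared with a simulated tournament. The sketch assumes that rank n denotes the best individual and that group members are drawn with replacement, which is the reading under which the maximum of S uniform values has distribution (k/n)^S; the population size, group size and seed are illustrative.

```python
import random


def tournament_rank_probabilities(n, s):
    """g_k = (k^S - (k - 1)^S) / n^S for ranks k = 1..n (rank n is the best)."""
    return [(k ** s - (k - 1) ** s) / n ** s for k in range(1, n + 1)]


def simulate_winner_frequencies(n, s, trials, rng):
    """Empirical frequencies of the winner's rank; members drawn with replacement."""
    wins = [0] * n
    for _ in range(trials):
        winner = max(rng.randint(1, n) for _ in range(s))
        wins[winner - 1] += 1
    return [w / trials for w in wins]


g = tournament_rank_probabilities(n=4, s=2)
freq = simulate_winner_frequencies(n=4, s=2, trials=100_000, rng=random.Random(1))
print(g)                                  # [0.0625, 0.1875, 0.3125, 0.4375]
print([round(f, 3) for f in freq])
```

The probabilities sum to 1 and grow with the rank, so better individuals are favoured without any fitness value entering the formula.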

Let us now assume that the population does contain individuals with equal fitness values. Let the population contain K different fitness values, the k-th of which appears $n_k$ times. Clearly, the following equality holds:

$$\sum_{k=1}^{K} n_k = n.$$

In this case it is simple to find the expression for the cumulative probabilities $G_k$, defined by the following formulas:

$$G_1 = g_1, \quad G_k = G_{k-1} + g_k, \quad k = 2, ..., K.$$

Since solutions are chosen for tournament groups without regard to their fitness, all possible tournament groups are equally probable, and the k-th cumulative probability is equal to the number of tournament groups that contain only solutions whose fitness is less than or equal to the k-th fitness value, divided by the total number of tournament groups:

$$G_k = \left( \frac{n_1 + ... + n_k}{n} \right)^S.$$
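One way to turn this counting argument into code is sketched below. It assumes that groups are ordered S-tuples drawn with replacement, so that G_k = ((n_1 + ... + n_k)/n)^S, which agrees with the rank formula when all fitness values are distinct; the function name and the example counts are hypothetical.

```python
from itertools import accumulate


def tournament_probabilities_with_ties(counts, s):
    """counts[k] is the number of individuals sharing the k-th smallest fitness value."""
    n = sum(counts)
    cumulative = [(c / n) ** s for c in accumulate(counts)]        # G_k
    value_probs = [cumulative[0]] + [
        cumulative[k] - cumulative[k - 1] for k in range(1, len(cumulative))
    ]                                                               # g_k = G_k - G_{k-1}
    return cumulative, value_probs


# Four individuals: two share the worst fitness value, the other two are distinct.
G, g = tournament_probabilities_with_ties(counts=[2, 1, 1], s=2)
print(G, g)

# With no ties the probabilities reduce to (k^S - (k - 1)^S) / n^S.
_, g_no_ties = tournament_probabilities_with_ties(counts=[1, 1, 1, 1], s=2)
print(g_no_ties)
```

Here g_k is the probability that the winner has the k-th fitness value; dividing it by n_k would give the probability for one particular individual with that value.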

Asymptotic selection does not generate an intermediate population; therefore this approach consumes less memory (when the number of genes is large, as is usual, it needs about half of the memory required by the traditional PGA).

Using the proposed selection and mutation techniques we get the following modification of the PGA procedure:

1. create an initial population with a uniform gene distribution and evaluate it;

2. if the termination condition is met, stop;

3. compute the gene distribution using asymptotic selection;

4. transform the gene distribution using asymptotic mutation;

5. generate a new population from the computed distribution and evaluate it;

6. return to step 2.
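The steps above can be sketched as a short program. This is not the authors' implementation: it assumes asymptotic proportional selection, a fixed iteration budget as the termination condition, p_m = 1/M, and a toy objective (the number of ones in the solution); all function names are hypothetical.

```python
import random


def asymptotic_mutation(probs, p_m):
    return [p_m + p * (1.0 - 2.0 * p_m) for p in probs]


def asymptotic_proportional_selection(population, fitness):
    """Gene distribution of the (never materialized) intermediate population."""
    total = sum(fitness)
    if total > 0:
        weights = [f / total for f in fitness]
    else:                                   # degenerate case: all fitness values are zero
        weights = [1.0 / len(population)] * len(population)
    num_genes = len(population[0])
    return [sum(x[i] * w for x, w in zip(population, weights)) for i in range(num_genes)]


def sample_population(probs, size, rng):
    return [[1 if rng.random() < p else 0 for p in probs] for _ in range(size)]


def pga_ms(objective, num_genes, pop_size=100, iterations=50, seed=0):
    rng = random.Random(seed)
    p_m = 1.0 / num_genes
    probs = [0.5] * num_genes                                    # step 1: uniform distribution
    population = sample_population(probs, pop_size, rng)
    fitness = [objective(x) for x in population]
    best = max(zip(fitness, population))
    for _ in range(iterations):                                  # step 2: iteration budget
        probs = asymptotic_proportional_selection(population, fitness)   # step 3
        probs = asymptotic_mutation(probs, p_m)                  # step 4
        population = sample_population(probs, pop_size, rng)     # step 5
        fitness = [objective(x) for x in population]
        best = max(best, max(zip(fitness, population)))
    return best                                                  # (best value, best solution)


best_value, best_solution = pga_ms(objective=sum, num_genes=20)
print(best_value, best_solution)
```

Note that no intermediate population is ever stored: only the probability vector is passed between selection, mutation and sampling.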

Algorithm comparison using test problems. The quality characteristics of stochastic optimization methods are reliability (the number of experiments in which the algorithm found the global optimum divided by the total number of experiments) and the average number of objective function evaluations required to reach the optimum (averaged over successful experiments). The primary characteristic is reliability; if two algorithms have equal reliability, the algorithm that performs fewer objective evaluations is better.

The number of objective evaluations is a secondary characteristic, since this criterion can be misleading if the reliability of the optimization algorithm is low: in such a case the algorithm finds the global optimum only if the initial population is extremely favorable (many solutions belong to the attraction region of the global optimum), and then it converges very quickly. Furthermore, low reliability means that the average is calculated over a small sample, so its variance is high.
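Both characteristics are straightforward to compute from a log of independent runs. The sketch below assumes each run is recorded as a pair (optimum found, number of objective evaluations); the record format and the four sample records are made up for illustration.

```python
def summarize_runs(runs):
    """runs: list of (found_optimum, evaluations) pairs, one per independent experiment."""
    successful = [evals for found, evals in runs if found]
    reliability = len(successful) / len(runs)
    # The average cost is taken over successful runs only, as in the definition above.
    average_cost = sum(successful) / len(successful) if successful else None
    return reliability, average_cost


runs = [(True, 1800), (False, 5000), (True, 2400), (False, 5000)]   # made-up records
reliability, average_cost = summarize_runs(runs)
print(reliability, average_cost)
```

With few successful runs the average cost rests on a handful of values, which is why it is treated as the secondary characteristic.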

PGA is a stochastic optimization algorithm; its quality cannot be determined from one experiment, so it is necessary to perform many experiments and average the results.

We used the following settings: the population size is 100, the maximum number of iterations is 50, the number of experiments used to estimate the characteristics is 1,000, the mutation is weak, and the coding method is Gray code. We used the same test problems as in paper [1].

To determine whether the difference between the two methods is statistically significant, the nonparametric Wilcoxon-Mann-Whitney test [4] (with sample sizes of 5) was used. The results of the experiments are summarized in Table 1.
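For samples of size 5 the Wilcoxon-Mann-Whitney test can be carried out exactly by enumerating all 252 ways of splitting the ten pooled observations. The sketch below does this for two hypothetical samples of reliability estimates; how the paper formed its samples of size 5 is not stated, so the data and function names are assumptions.

```python
from itertools import combinations


def mann_whitney_u(a, b):
    """Number of pairs (x from a, y from b) with x > y; ties count as one half."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


def exact_two_sided_p(a, b):
    """Exact permutation p-value: share of splits at least as extreme as the observed one."""
    pooled = list(a) + list(b)
    center = len(a) * len(b) / 2
    observed = abs(mann_whitney_u(a, b) - center)
    extreme = total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        first = [pooled[i] for i in idx]
        second = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mann_whitney_u(first, second) - center) >= observed - 1e-12:
            extreme += 1
    return extreme / total


base = [0.44, 0.46, 0.45, 0.47, 0.43]          # hypothetical reliability estimates
modified = [0.52, 0.55, 0.51, 0.53, 0.54]      # hypothetical reliability estimates
u = mann_whitney_u(base, modified)
p_value = exact_two_sided_p(base, modified)
print(u, p_value)
```

When the two samples do not overlap at all, only 2 of the 252 splits are as extreme as the observed one, which gives the smallest p-value attainable with samples of this size.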

The Result column contains the number of test problems on which the difference between the algorithms is statistically significant in favor of the base algorithm and of the modification, respectively; the two values are separated by a slash (the total number of problems is 16).

The results show that in all cases where the difference between the algorithms is statistically significant, the modification performs better than the base algorithm. The reliability of the proposed modification is not worse than that of the base algorithm (and in some cases surpasses it), but in most experiments the modification performs more evaluations of the objective function. Since PGA is a global optimization algorithm, such a trade-off can be acceptable.

The observed increase in reliability and computational cost can be explained by the fact that selection probabilities for solutions with small fitness values are also small. Therefore, in traditional selection such solutions are rarely selected into the intermediate population and rarely affect the gene distribution. With asymptotic selection, all solutions contribute to the distribution (although the contribution of "bad" solutions is low). Clearly, taking "bad" solutions into account decreases the speed of local convergence and the probability of converging to a local optimum; in turn, this increases the probability of global convergence.

Algorithm comparison on the bank loan portfolio problem. Let us compare the standard PGA and the proposed modifications on the bank loan portfolio problem [5]. This is a constrained pseudo-Boolean optimization problem (constrained optimization of a function with a Boolean domain and real values). The dimension of the search space is 50. Constraints were handled by the dynamic penalty method [6]. The population size was 1,000, the number of iterations was 100, and the number of experiments used for averaging was 100.

For each selection method we test the equality of the expected profitability of the best loan portfolios found by the base algorithm and by the modifications. To test the statistical significance of the difference we used Student's two-sample test [4]. Since the sample sizes are fairly large, we can use the asymptotic critical value; for the confidence level 0.95 it is approximately equal to 1.97.

For the bank loan portfolio problem we performed a full comparison of all four possible variants of the PGA: the base probabilistic genetic algorithm (PGA); PGA-M, the probabilistic genetic algorithm with asymptotic mutation (and traditional selection); PGA-S, the probabilistic genetic algorithm with asymptotic selection (and traditional mutation); and PGA-MS, the probabilistic genetic algorithm with both asymptotic mutation and asymptotic selection.

The results of the experiments are given in Table 2, which contains the averages of the best objective values found by the optimization algorithms, the standard deviations of these values, and the time consumption (in seconds).

Student's test for the equality of means shows that the efficiency of the algorithms with asymptotic mutation or asymptotic selection alone (PGA-M and PGA-S) does not differ statistically significantly from that of the base algorithm (at the confidence level 0.95). Furthermore, the difference between these two algorithms is also not statistically significant. Whether the PGA with both asymptotic mutation and selection (PGA-MS) differs significantly from the other algorithms depends on the selection method: with proportional and ranking selection the efficiency of PGA-MS is significantly higher than that of the other three algorithms, whereas with tournament selection there are no statistically significant differences.
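The comparison of PGA-MS with the base PGA can be reproduced approximately from the summary statistics in Table 2. The sketch assumes 100 runs per algorithm (the number of averaging experiments stated above) and uses the rounded table values, so the statistics are only indicative; the function name is hypothetical.

```python
import math


def two_sample_t(mean_a, sd_a, mean_b, sd_b, n):
    """Student's t statistic for two samples of equal size n (pooled variance)."""
    return (mean_b - mean_a) / math.sqrt((sd_a ** 2 + sd_b ** 2) / n)


# (average, standard deviation) for PGA and PGA-MS, copied from Table 2.
table2 = {
    "Proportional": ((199_581, 62.74), (199_605, 59.98)),
    "Ranking": ((199_631, 38.35), (199_646, 38.39)),
    "Tournament": ((199_634, 48.91), (199_639, 46.09)),
}
critical = 1.97
t_values = {
    method: two_sample_t(*base, *modified, n=100)
    for method, (base, modified) in table2.items()
}
for method, t in t_values.items():
    verdict = "significant" if abs(t) > critical else "not significant"
    print(f"{method:<12} t = {t:.2f} ({verdict})")
```

Under these assumptions the statistic exceeds 1.97 for proportional and ranking selection and stays below it for tournament selection, in line with the conclusions stated in the text.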

Let us now consider the time consumption of the PGA variants. In all cases the base algorithm consumes more time than its asymptotic modifications. Furthermore, PGA-MS is faster than both PGA-M and PGA-S when proportional or ranking selection is used. Only in the case of tournament selection is PGA-M the fastest. This can be explained by the fact that the asymptotic variant of tournament selection performs a relatively expensive operation, sorting the population, which traditional tournament selection does not (ranking without sorting is one of the most important features of tournament selection). Nevertheless, the use of asymptotic mutation and/or selection does not slow the algorithm down in comparison with the base PGA.

We can conclude that the asymptotic variants of the probabilistic genetic algorithm meet their design goal, i.e. they provide statistically equivalent counterparts of the probabilistic genetic algorithm that consume fewer computational resources.

Bibliography

1. Semenkin, E. S. Probabilistic evolutionary algorithms of complex systems optimization / E. S. Semenkin, E. A. Sopov // Proc. of Int. Conf. "Intelligent systems" (AIS'05) and "Intelligent CAD" (CAD-2005): in 3 vol. Vol. 1. M.: Fizmatlit, 2005. (in Russian)

Table 1

Test results

Selection method   Base algorithm          Modification            Result
                   Reliability   Costs     Reliability   Costs
Proportional       0.44          2,100     0.52          2,100     0/7
Ranking            0.46          2,200     0.62          2,300     0/7
Tournament         0.48          2,100     0.65          2,300     0/8

Table 2

Bank portfolio problem

Selection method   Algorithm   Average    Standard deviation   Time (s)
Proportional       PGA         199,581    62.74                1.32
                   PGA-M       199,579    57.41                1.28
                   PGA-S       199,585    75.34                1.09
                   PGA-MS      199,605    59.98                0.94
Ranking            PGA         199,631    38.35                1.71
                   PGA-M       199,631    37.84                1.59
                   PGA-S       199,633    40.45                1.48
                   PGA-MS      199,646    38.39                1.27
Tournament         PGA         199,634    48.91                1.37
                   PGA-M       199,634    48.79                1.11
                   PGA-S       199,635    38.43                1.34
                   PGA-MS      199,639    46.09                1.24

2. Compilers: Principles, Techniques, and Tools / A. V. Aho [et al.]. 2nd ed. N. Y.: Addison-Wesley, 2007.

3. Knuth, D. The Art of Computer Programming: in 2 vol. Vol. 2. Seminumerical Algorithms / D. Knuth. 3rd ed. Reading, Mass.: Addison-Wesley, 1997.

4. Applied statistics: Backgrounds of modeling and initial data processing / S. A. Aivazian [et al.]. M.: Finansy i Statistika, 1983. 471 p. (in Russian)

5. Purticov, V. A. Optimization of bank credit portfolio management: PhD thesis / V. A. Purticov. Krasnoyarsk, 2001. 148 p. (in Russian)

6. Michalewicz, Z. Evolutionary Algorithms for Constrained Parameter Optimization Problems / Z. Michalewicz, M. Schoenauer // Evolutionary Computation. 1996. № 4(1). P. 1-32.

© Galushin P. V., Semenkin E. S., 2009

M. A. Gorbunov, A. V. Medvedev, P. N. Pobedash, E. S. Semenkin
Siberian State Aerospace University named after academician M. F. Reshetnev, Russia, Krasnoyarsk

THE MODELING OF THE WORLD SOCIO-ECONOMIC STRATEGY AS AN OPTIMAL CONTROL PROBLEM*

This article describes an approach to modeling the strategy of global socio-economic development on the basis of an economic-mathematical optimal control model that considers the interaction of the basic economic agents of the world socio-economic system (WSES): the industrial, consumer and financial sectors, as well as the operating center (the world government). The task of optimizing global socio-economic development is formulated; the main principles of the analysis, the restrictions and the target criteria are analyzed.

Keywords: global economical crisis, sustainable development, mathematical models of optimal control.

Interest in the problems of human survival and the balanced development of the world socio-economic system has been aroused by the world socio-economic crisis. It is clear that such development requires the coordination of interests between the business, consumer and financial sectors. It also requires the participation of a united control center (the world government). In this context, the elaboration of a mathematical model of the global economy that takes into account the balance of interests of these sectors remains relevant. Some mathematical models describing global development were elaborated in the 1970s by scientists from the Club of Rome ([1] etc.). At the heart of these models is a system of ordinary first-order differential equations. The analysis of such models showed that crisis phenomena in world development are real. These phenomena include the greenhouse effect, overpopulation, the depletion of natural resources, etc. The need to fight them is confirmed by the ratification of the Kyoto Protocol, which is aimed at reducing greenhouse gas emissions. It is important to note that these models do not solve the problem of optimal control of global development and require a large number of numerical experiments, which do not always lead to optimal or quasi-optimal development scenarios. Interest in global development problems has recently been renewed. This is connected with the series of world financial crises of recent years, caused by the imperfection of the world financial system, which is oriented toward the dollar as the only world currency, and by the resulting geopolitical domination of one country. We note works [2-4] as representative modern publications on this issue. The approach proposed here to the problem of managing global socio-economic development is based on solving a multicriteria, multistage linear optimal control problem.

It should be noted that, to manage global socio-economic development, the operating center of the WSES needs to accomplish several complicated and interconnected tasks: 1) social-industrial (maintaining high production volumes backed by solvent demand, employment, and high standards of living); 2) financial-industrial (first of all, eliminating the imbalance between the financial system and the production sector); 3) ecological (preserving a suitable living environment).

Let us consider the main elements of the proposed approach. We formulate the following task, which will be called the main task of global socio-economic development. We consider a given number of branches of the world's production sector: food, clothes, housing, articles of prime necessity, etc. It is required to determine the amounts of the main production funds and the production volumes of these branches at given moments of time such that the total net present value of cash flows for the industrial, social and financial sectors of the world economy is the greatest over a given planning horizon. In our opinion, the formulated task can be considered as a global investment project (IP) of optimal WSES development management, in view of the provisions stated below. Let us suppose that in the global development model (GDM) the aforementioned economic agents are simultaneously decision makers (DM), interested in a balanced development

* This work is supported by State Programs "Development of the Scientific Potential of Highest Education Institutions" (project 2.1.1/2710) and "Scientific and Teaching Staff of Innovative Russia" (project NK-136P/3).
