

УДК 004.93

HYBRID LEARNING ALGORITHM FOR NEURAL NETWORKS

Jamil Ahmad


This paper presents a hybrid algorithm for the learning mechanism of the neural network model which adjusts the weights twice in any single iteration. The new algorithm is a combination of a machine learning algorithm proposed by Gabor in the 1960s and the LMS algorithm. The hybrid algorithm uses two different equations, based on mean square errors, to optimize the weights. The algorithm showed better performance than the Least Mean Square (LMS) and Back Propagation (BP) learning algorithms on a pattern recognition problem.

1 INTRODUCTION

In recent years, ANN models have made great leaps in solving complex problems such as prediction, classification, speech analysis, image analysis, and pattern recognition [1-5]. A number of sophisticated learning models have been developed to solve a variety of problems. In spite of the remarkable achievements of ANN models in some application areas, there is still room for improvement. Most of these models suffer from slow convergence and from the difficulty of defining their structure. This paper presents a hybrid approach that combines two algorithms into a single learning model. The hybrid algorithm is mainly derived from the Gabor theory of communication and machine learning [6-7], modified by merging it with an LMS-based learning algorithm. Further information about the algorithm can be found in [8]. The algorithm is compared with standard BP [9] and LMS [10] on a pattern recognition problem. Various parameters, such as the initialization of weights, the learning rate, and the learning curve, are also investigated with the help of an experimental study.

2 THE LEARNING PARADIGM

The general structure of the proposed hybrid learning algorithm is shown in Figure 1, which illustrates the working mechanism of the algorithm. Note that the algorithm adjusts the weights in two stages. Both stages are carried out in every training run, i.e., the adjustment of the weights takes place twice in a single iteration. In the first stage, the algorithm calculates three errors associated with each weight and uses them to modify that weight (only one weight). Subsequently, the algorithm uses the mean square error to adjust all the weights, which constitutes the second stage of the model. The flowchart of the proposed system is shown in Figure 2, and a minimal code sketch of the loop follows it.

Figure 1 - General structure of the proposed Hybrid learning algorithm

Figure 2 - Working flow of the proposed Hybrid learning algorithm: initialise the variables; on each training run, increment the training counter, increment the weight counter i = i + 1, take the input data set (see Table 3.1), and calculate the computed output and the errors; 1st stage: adjust one weight (if i exceeds the total number of weights, reset i to 1); 2nd stage: adjust all weights using the MSE; check the number of training runs and repeat until it is reached, then stop and store the results (optimised weights and computed output).
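The article specifies the flow of the algorithm but not its two update equations in closed form, so the following is only a minimal sketch of the loop in Figure 2, assuming a single linear neuron: a plain error-correction step stands in for the single-weight first stage, and an LMS (Widrow-Hoff) step over all weights implements the MSE-based second stage. The function name `hybrid_train`, its arguments, and the data layout are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def hybrid_train(X, d, w0, lr=0.00005, runs=200):
    """Two-stage hybrid training of a single linear neuron (sketch).

    X  : (n_patterns, n_weights) array of input patterns
    d  : (n_patterns,) array of target outputs
    w0 : initial weight vector
    """
    w = np.asarray(w0, dtype=float).copy()
    i = 0                                  # weight counter for the 1st stage
    mse_curve = []                         # one MSE value per training run
    for _ in range(runs):
        # 1st stage: adjust only weight i from the current errors, then
        # advance the counter, wrapping past the last weight (Figure 2).
        e = d - X @ w
        w[i] += lr * (X[:, i] @ e)
        i = (i + 1) % len(w)
        # 2nd stage: adjust all weights from the mean square error, i.e.
        # an LMS (Widrow-Hoff) step over the whole pattern set.
        e = d - X @ w
        w += lr * (X.T @ e)
        mse_curve.append(np.mean(e ** 2))
    return w, mse_curve
```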

3 EXPERIMENTAL STUDY

The above two-stage model is analyzed through an experimental study using a computer simulation package developed for this purpose. During the experiments a number of parameters, such as the initial weight settings and the learning constant, are investigated thoroughly. The following paragraphs explain these parameters in more detail and present the simulation results.

3.1 The Initial Weight Values of the Algorithm

Most network models use random initial values for the weights to calculate their first approximate computed output. It is clear from the current literature that no specific rule exists for choosing the initial weight values, so in most cases these values are fixed after a number of experiments carried out with different random initial weights. A number of experiments are carried out to find suitable initial weight boundaries for the proposed hybrid learning algorithm. Figures 3(a) and 3(b) show the learning curves for initial weight values below 1.0 and above 1.0, respectively. These graphs indicate that initial weight values of less than 1.0 give better performance.

Figure 3 - Learning curves of the proposed Hybrid learning algorithm for initial weight values selected randomly (a) below 1.0 and (b) above 1.0
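As a quick illustration of this comparison, the sketch below reruns the `hybrid_train` sketch from Section 2 from two random starting points, one drawn below 1.0 in magnitude and one above it. The training data is a random placeholder, not the article's pattern set.

```python
import numpy as np
# hybrid_train is the sketch given after Figure 2.

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, (8, 4))       # placeholder input patterns
d = rng.uniform(0.0, 1.0, 8)            # placeholder target outputs

w_small = rng.uniform(-1.0, 1.0, 4)     # initial weights < 1.0, as in Fig. 3(a)
w_large = rng.uniform(1.0, 3.0, 4)      # initial weights > 1.0, as in Fig. 3(b)
for label, w0 in (("< 1.0", w_small), ("> 1.0", w_large)):
    w, curve = hybrid_train(X, d, w0)
    print(f"initial weights {label}: final MSE = {curve[-1]:.6f}")
```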

3.2 Learning Rate of the Algorithm

The learning rate, which controls the stability and speed of convergence, is discussed intensively in the current literature. Generally, a small value is preferable, i.e., between 0.1 and 1.0 (in some cases even 2.0 is acceptable) [11]. High learning rates cause divergence from the required target output, as in the case of the LMS learning algorithm, especially when the learning rate is greater than 2.

The learning rate of the hybrid learning algorithm should be very small, i.e., between 0.00005 and 0.0001. Several experiments are carried out to set the learning rate for this algorithm, and the results are shown graphically in Figures 4(a) to 4(f) for 0.0001, 0.00009, 0.00005, 0.00001, 0.000009, and 0.02, respectively. It is found that the proposed hybrid learning algorithm has a narrow range of workable values for the learning rate; furthermore, learning rates of 0.02 or more create problems during convergence. A minimal sweep over these values is sketched after Figure 4.

Figure 4 - Effect of the learning constant on the training performance of the proposed Hybrid learning algorithm: panels (a)-(f) show the learning curves (MSE over 200 training runs) for lr = 0.0001, 0.00009, 0.00005, 0.00001, 0.000009, and 0.02, respectively
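The sweep behind Figure 4 can be reproduced in miniature with the same `hybrid_train` sketch; only the learning rate changes between runs, and again the data is a placeholder rather than the article's.

```python
import numpy as np
# hybrid_train is the sketch given after Figure 2.

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, (8, 4))       # placeholder input patterns
d = rng.uniform(0.0, 1.0, 8)            # placeholder target outputs
w0 = rng.uniform(-1.0, 1.0, 4)          # initial weights < 1.0 (Section 3.1)

for lr in (0.0001, 0.00009, 0.00005, 0.00001, 0.000009, 0.02):
    w, curve = hybrid_train(X, d, w0, lr=lr)
    print(f"lr = {lr}: final MSE = {curve[-1]:.6f}")
```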

4 AN EXAMPLE OF PATTERN RECOGNITION

The proposed model is analyzed with the help of a pattern recognition example. The input and target output are shown in Figure 5. The learning curves for the proposed hybrid, LMS and BP learning algorithms are discussed in the following section.

Figure 5 - Input and target patterns


4.1 The Learning Curves

The criterion for the adjustment of the weights has been a subject of the neural network literature for some time, but in practice there is only one established choice, the mean-square criterion, as used by Widrow-Hoff, Gabor, Kolmogorov and Wiener in their models. The proposed model also uses the MSE for its weight optimization. The speed, performance and adaptation of a neural network learning algorithm can be observed from the reduction of the associated MSE. The MSEs for the example patterns shown above are presented for the hybrid, LMS and BP learning algorithms in Figures 6(a), 6(b) and 6(c), respectively. These graphs indicate better performance for the hybrid learning algorithm than for the other two: the comparison shows that the hybrid algorithm is more powerful and its optimization is faster than that of the LMS and BP algorithms, because it reaches the minimum error level more quickly, as shown in Figure 6(a).
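Written out, the criterion is the standard mean square error; the notation below (targets $d_k$, computed outputs $y_k$, number of patterns $N$) is assumed, since the article does not define the symbols explicitly:

$$\mathrm{MSE} = \frac{1}{N}\sum_{k=1}^{N}\left(d_k - y_k\right)^{2}.$$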

Figure 6 - Learning curves of (a) the proposed Hybrid, (b) LMS and (c) BP learning algorithms

5 CONCLUSION

This paper presented a novel hybrid learning algorithm for neural networks. All the important and necessary parameters of the algorithm were tested to show the model's capability for the implementation of complex problems. The proposed model was also applied to a pattern recognition problem and compared with the standard BP and LMS learning algorithms. The new hybrid learning algorithm showed better training performance, lower error and better recognition rates. The overall shorter training time and smooth error reduction were found to be prominent features of the newly developed learning algorithm. Implementation of the proposed model for complex pattern recognition tasks such as handwritten digits and speech is among the major tasks for future study.



REFERENCES

1. S. Knerr, L. Personnaz, and G. Dreyfus, Handwritten Digit Recognition by Neural Networks with Single-Layer Training, IEEE Trans. on Neural Networks, vol. 3, pp. 962-968, Nov. 1992.

2. A. K. Jain, J. Mao, and K. M. Mohiuddin, Artificial Neural Networks: A Tutorial, IEEE Computer, pp. 31-44, March 1996.

3. W. A. Schmidt and J. Davis, Pattern Recognition Properties of Various Feature Spaces for Higher Order Neural Networks, IEEE Trans. on PAMI, vol. 15, no. 8, pp. 795-801, Aug. 1993.

4. R. P. Lippmann, An Introduction to Computing with Neural Nets, IEEE ASSP Mag., pp. 4-22, Apr. 1987.

5. A. Rajavelu, M. T. Musavi, and M. V. Shirvaikar, A Neural Network Approach to Character Recognition, Neural Networks, vol. 2, pp. 387-393, 1989.

6. D. Gabor, Communication Theory and Cybernetics, IRE Trans. Circuit Theory, vol. CT-1, no. 4, pp. 19-31, 1954.

7. D. Gabor, W. P. Wilby, and R. Woodcock, A Universal Non-linear Filter, Predictor and Simulator which Optimizes Itself by a Learning Process, Proc. IEE, vol. 108, Part B, no. 40, pp. 422-433, Jul. 1961.

8. J. Ahmad, Novel Neural Network Strategies Based on the Gabor-Kolmogorov Learning Algorithm for Pattern Recognition and Prediction, University of London, 1995.

9. D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Learning Internal Representations by Error Propagation, in Parallel Distributed Processing (D. E. Rumelhart and J. L. McClelland, Eds.), Cambridge, MA: MIT Press, 1986, vol. 1, chapter 8.

10. B. Widrow and M. E. Hoff, Adaptive Switching Circuits, IRE WESCON Conv. Record, Part 4, pp. 96-104, 1960.

11. B. Widrow and M. A. Lehr, 30 Years of Adaptive Neural Networks: Perceptron, Madaline, and Backpropagation, Proc. IEEE, vol. 78, no. 9, pp. 1415-1442, Sep. 1990.

УДК 004.93

THE INTELLIGENT CIRCUIT THAT IS OPERATED BY TRANSMISSION OF IMPULSE AS SYMBOL OF ACTIVITY

Karasawa S.



If activities on sensors are represented by one subset of impulses and actions of actuators are represented by another subset of impulses, then the working memory is formed through the activity by concurrently connecting the points where impulses exist. The intelligent behavior described in a flow chart can be achieved by means of a circuit that transfers an impulse as an activity.

1 INTRODUCTION

Neural engineering is an emerging discipline [1]. The greater part of artificial intelligence (AI) is researched by means of computational models whose algorithms are adapted to the digital computer, and we appreciate the fruits of that software. The computer is omnipotent if the algorithm is perfect. Although computing paradigms such as artificial neural networks [2] and connectionist models [3] take the nature of the brain into account, such programs find it difficult to form a working memory automatically through experience.

Software meets difficulties in the field of recognition when the objects are things and affairs in the real world. An activity is nonlinear and not continuous, and the impulsive activity in the brain is not computational. The author, as a researcher in the field of semiconductor electronics, proposes a circuit in which an impulse is transferred as an activity is transferred. The artificial intelligence in which the existence of an impulse means the occurrence of an action can be termed hardware artificial intelligence.

Since digital data consist of signals in a motionless state, a change of state is represented by means of two states: one obtained before the change and the other obtained after it. There are two kinds of changes of a digital state: the positive impulse, which causes an excitatory state, and the negative impulse, which inhibits an active state. A positive impulse, the minimum unit of excitatory action, is able to connect lines at its transference, and the transmission of the positive impulse creates the working memory that makes it possible to replay the same activity [4].
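A minimal sketch of this representation, assuming nothing beyond the paragraph above: an impulse is the difference between two consecutive digital states, +1 (positive, excitatory) for a 0 -> 1 change and -1 (negative, inhibitory) for a 1 -> 0 change.

```python
# Impulses recovered as differences between consecutive digital states:
# +1 marks a positive (excitatory) impulse, -1 a negative (inhibitory)
# one, and 0 means no change of state.
states = [0, 0, 1, 1, 0, 1]
impulses = [after - before for before, after in zip(states, states[1:])]
print(impulses)   # -> [0, 1, 0, -1, 1]
```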

The action of making connections makes it possible to materialize the working memory. Although the working memory is difficult to define within the traditional concepts of software, we can define it as a function of a neuron, because the connections of a neuron are formed by means of activities, and the neuron decodes impulsive activities and outputs an activity. Intelligence is a kind of activity; it is not restricted to the processing of information.

If we employ the segmentation of working memory in order to recognize things and affairs in the real world, the processing of recognition becomes easy. If, instead, we divide a pattern into many pieces in order to compute, the processing of recognition becomes a jigsaw puzzle.

On the other hand, the intermittent operation of impulses makes it possible to carry out time-sharing of the operations. A circulating impulse in a loop is able to keep the activity alive, and the paralleling of activities by means of transmit-
