
NEURAL NETS

S.A. Volkovskiy
Scientific advisor: S.A. Ermolaeva

The article considers some of the most interesting properties of modern neural networks, their structure and characteristics. It also gives a brief history of the emergence of modern artificial intelligence methods that model the activity of the human brain.

Introduction

At the beginning of the computer era (the middle of the twentieth century) several different types of calculating machines were proposed. Most of them never found a field of application and became history, while others are still in use. The most popular and simple computer structure is the so-called von Neumann architecture, which is used today in almost every PC in the world. However, there is one more structure besides the von Neumann machine, one that has received a strong boost especially in recent years: neural network technology, which simulates the activity of the human brain. The table below gives some points of comparison [1].

Characteristic        | Von Neumann machine                 | Human brain
Processing unit       | complex                             | simple
                      | high-speed                          | low-speed
                      | one or several                      | a very large number
Memory                | separate from the processing unit   | integrated into the processing unit
                      | localized                           | distributed
                      | non-content (address-based) addressing | content addressing
Calculations          | centralized                         | distributed
                      | sequential                          | parallel
                      | using stored programs               | using self-training
Reliability           | highly vulnerable                   | highly robust
Sphere of functioning | defined                             | not defined
                      | limited                             | without limits

A neural network is, in essence, an attempt to simulate the brain. Neural network theory revolves around the idea that certain key properties of biological neurons can be extracted and applied to simulations, thus creating a simulated (and very much simplified) brain. The first important thing to understand, then, is that the components of an artificial neural network are an attempt to recreate the computing potential of the brain. The second important thing to understand, however, is that no one has ever claimed to simulate anything as complex as an actual brain. Whereas the human brain is estimated to have on the order of ten to a hundred billion neurons, a typical artificial neural network (ANN) is not likely to have more than 1,000 artificial neurons [2].

Now it is obvious that the only way to make machines think is to use control systems with neural mechanisms of functioning, that is, neural networks. That is why we should try to understand the principles of their operation now.

The Brief History of Invention

The study of the human brain dates back thousands of years, but it is only with the dawn of modern electronics that man has begun to try to emulate the human brain and its thinking processes. The modern era of neural network research is credited to the work done by the neurophysiologist Warren McCulloch and the young mathematical prodigy Walter Pitts in 1943. McCulloch had spent twenty years of his life thinking about the "event" in the nervous system that allowed us to think, feel, and so on. It was only when the two joined forces that they wrote a paper on how neurons might work, and they designed and built a primitive artificial neural network using simple electric circuits. They are credited with the McCulloch-Pitts theory of formal neural networks.

The next major development in neural network technology arrived in 1949 with the book "The Organization of Behavior" written by Donald Hebb. The book supported and further reinforced McCulloch and Pitts's theory about neurons and how they work. A major point brought forward in the book described how neural pathways are strengthened each time they are used. As we shall see, this is true of neural networks as well, specifically in training a network.

During the 1950s traditional computing took off, and as it did, it left research into neural networks in the shade. However, certain individuals continued the research. In 1954 Marvin Minsky wrote a doctoral thesis, "Theory of Neural-Analog Reinforcement Systems and its Application to the Brain-Model Problem", which was concerned with research into neural networks. He also published a scientific paper entitled "Steps Toward Artificial Intelligence", which was one of the first papers to discuss AI in detail; it also contained a large section on what is nowadays known as neural networks. In 1956 the Dartmouth Summer Research Project on Artificial Intelligence began researching AI, in what were the primitive beginnings of neural network research.

Years later, John von Neumann thought of imitating simple neuron functions by using telegraph relays or vacuum tubes; this work led to the von Neumann machine. About fifteen years after the publication of McCulloch and Pitts's pioneering paper, a new approach to neural network research was introduced. In 1958 Frank Rosenblatt, a neurobiologist at Cornell University, began working on the perceptron, the first "practical" artificial neural network. It was built using the somewhat primitive hardware of that time. The perceptron was based on research done on a fly's eye: the processing which tells a fly to flee when danger is near is done in the eye. One major downfall of the perceptron was its limited capabilities, which were proven in Marvin Minsky and Seymour Papert's 1969 book "Perceptrons".

Between 1959 and 1960, Bernard Widrow and Marcian Hoff of Stanford University in the USA developed the ADALINE (ADAptive LINear Elements) and MADALINE (Multiple ADAptive LINear Elements) models. These were the first neural networks that could be applied to real problems; the ADALINE model was used as a filter to remove echoes from telephone lines. The capabilities of these models were again shown to be limited by Minsky and Papert (1969).

The period between 1969 and 1981 brought much public attention to neural networks. The capabilities of artificial neural networks were completely blown out of proportion by writers and producers of books and movies. People believed that such neural networks could do anything, resulting in disappointment when it became clear that this was not so. Asimov's stories about robots highlighted humanity's fears of robot domination as well as the moral and social implications of machines doing mankind's work, and best-selling novels such as "2001: A Space Odyssey" created fictional, sinister computers. These factors contributed to large-scale criticism of AI and neural networks, and funding for research projects came to a near halt [3].

An important development that came forward in the 1970s was that of self-organizing maps (SOMs). In 1982 John Hopfield of Caltech presented a paper to the scientific community in which he argued that the approach to AI should not be purely to imitate the human brain, but rather to use its concepts to build machines that could solve dynamic problems. He showed what such networks were capable of and how they would work. It was his articulate, likeable character and his vast knowledge of mathematical analysis that convinced scientists and researchers at the National Academy of Sciences to renew interest in the research of AI and neural networks. His ideas gave birth to a new class of neural networks that over time became known as the Hopfield model.

At about the same time, at a conference on neural networks held in Japan, Japan announced that it had again begun exploring the possibilities of neural networks. The United States, fearing it would be left behind in terms of research and technology, almost immediately began funding AI and neural network projects.

1986 saw the first annual Neural Networks for Computing conference, which drew more than 1,800 delegates. In the same year Rumelhart, Hinton and Williams reported on the development of the back-propagation algorithm; their paper discussed how back-propagation learning had emerged as the most popular learning procedure for the training of multi-layer perceptrons. With the dawn of the 1990s and the technological era, many advances in the research and development of artificial neural networks have been occurring all over the world. Nature itself is living proof that neural networks do in actual fact work. The challenge today lies in finding ways to implement the principles of neural network technology electronically. Electronics companies are working on three types of neuro-chips, namely digital, analog, and optical. With the prospect that these chips may be used in neural network designs, the future of neural network technology looks very promising [3].

The Structure of Neural Nets

Biologically speaking, neural networks are constructed in a three-dimensional way from minute components, namely neurons, which are capable of practically unlimited interconnections. Artificial neural networks are combinations of artificial neurons grouped into so-called "layers", and these layers are also interconnected. It can therefore be concluded that all neural networks have a similar topology (structure). Neurons are usually connected in three layers.

Layer 1: The first layer is the input layer. This layer consists of neurons that receive information (inputs) from the external environment.

Layer 2: The second layer is hidden from view (not directly visible from the external world) and is referred to as the hidden layer. There is no limit to the number of hidden layers that a network can have, and trial and error still remains one of the best ways to find out how many are needed. However, experimentation has shown that one hidden layer is usually sufficient.

Layer 3: The third layer is the output layer that communicates the result of the weighted, summed output to the external environment or to the user.

Neurons in a network communicate with one another. They do this either with neurons in the same layer (intra-layer connections) or with neurons in different layers (inter-layer connections). Some artificial neurons have few connections, but it is important to note that neurons of one layer are always connected to neurons of another layer.

A neural network can be seen either as a hierarchical system or as a resonance system. In a hierarchical system, neurons of a "lower" level can only communicate with neurons on a "higher" level; the neurons on the higher level are not allowed to pass their outputs back to lower-level neurons. In a resonance structure the neurons are allowed to communicate with both higher and lower levels of neurons [2].
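As an illustration only, a minimal sketch of such a layered, hierarchical topology might look as follows; the layer sizes, the use of NumPy, and the random initialization are assumptions chosen for the example, not details given in the article.

```python
# A minimal sketch of the three-layer topology described above: an input layer,
# one hidden layer and an output layer, with every neuron of one layer connected
# by a weight to every neuron of the next layer (hierarchical, "upward-only"
# communication). Layer sizes and initialization are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

n_input, n_hidden, n_output = 4, 5, 2          # neurons per layer (assumed sizes)

# One weight matrix per pair of adjacent layers (fully connected).
w_input_hidden = rng.normal(scale=0.1, size=(n_input, n_hidden))
w_hidden_output = rng.normal(scale=0.1, size=(n_hidden, n_output))

print("input -> hidden connections:", w_input_hidden.shape)    # (4, 5)
print("hidden -> output connections:", w_hidden_output.shape)  # (5, 2)
```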

Feed Forward Networks

Basic Architecture

Feed-forward networks usually consist of three to four layers in which the neurons are logically arranged. The first and last layers are the input and output layers respectively, and there are usually one or more hidden layers in between. Research indicates that a minimum of three layers (one hidden layer) is required to solve complex problems. As we have already seen, the term feed-forward means that information is only allowed to "travel" in one direction: the output of one layer becomes the input of the next layer, and so on. In order for this to occur, each layer is fully connected to the next layer (each neuron is connected by a weight to every neuron in the next layer). A multi-layer feed-forward network is also often called a multi-layer perceptron.

It should be understood that the input layer takes part in the calculation rather than just receiving the inputs: the raw data is processed, and activation functions are applied in all layers. This process continues until the data reaches the output neurons, where it is classified into a certain category.
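A hedged sketch of this forward pass is given below; the sigmoid transfer function, the layer sizes, and the input values are assumptions made purely for illustration.

```python
# A sketch of the feed-forward pass: the output of each layer becomes the input
# of the next layer, with a transfer (activation) function applied at each step.
# The sigmoid function, layer sizes and input values are illustrative assumptions.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
w1 = rng.normal(scale=0.1, size=(3, 4))   # input layer (3 neurons) -> hidden layer (4)
w2 = rng.normal(scale=0.1, size=(4, 2))   # hidden layer (4 neurons) -> output layer (2)

x = np.array([0.2, 0.7, -0.1])            # raw data presented to the input layer
hidden_out = sigmoid(x @ w1)              # output of the hidden layer ...
network_out = sigmoid(hidden_out @ w2)    # ... becomes the input of the output layer
print(network_out)                        # values communicated to the external world
```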

Figure: a three-layer feed-forward network.

Training Feed Forward Networks

Once the number of neurons in each layer and the number of layers have been decided on, the network's weights must be adjusted to minimize the delta error. A training algorithm is used for this purpose.

In order to train the neural network, sets of known input-output data points must be assembled. In other words, the neural network is trained by example, much like a small child learns to speak.

The most common and widely used algorithm for multi-layer feed-forward neural networks is the back-propagation algorithm. It is based on the Delta Rule, which basically states that if the difference (delta) between the desired output and the network's actual output is to be minimized, the weights must be continually modified. The delta error at the output layer is scaled by the transfer function's derivative; once this error has been determined, it can be propagated backwards and used to change the connection weights so that the desired output may be achieved. This is why such feed-forward networks are also often called back-propagation, feed-forward networks.

The training process starts by initializing all connection weights to small non-zero values. A subset of training samples is then presented to the network. Each exemplar is fed into the network separately, the obtained outputs are compared to the desired outputs, and the size of the error is measured. The input connection weights are adjusted in such a way that the error will be minimized. This process is repeated over many passes through the training set (epochs) until satisfactory results are obtained. Training can stop when the error obtained is less than a certain threshold or limit; a threshold of 0.001 mean squared error is usually considered good. The mean squared error is computed by "summing the squared differences between what a predicted variable should be and what it actually is, then dividing by the number of components that went into the sum." This definition suffices and shall not be discussed in further detail.
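A rough sketch of this training loop is shown below. The XOR training set, sigmoid transfer function, learning rate and network size are all assumptions made for illustration, not details taken from the article; the loop does, however, follow the steps just described: small non-zero starting weights, repeated presentation of exemplars, back-propagation of the delta error, and stopping at a 0.001 mean squared error threshold.

```python
# A minimal back-propagation sketch of the training process described above.
# Data, transfer function, learning rate and layer sizes are assumptions.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])   # training inputs (assumed)
Y = np.array([[0.], [1.], [1.], [0.]])                   # desired outputs (assumed)

w1 = rng.normal(scale=0.5, size=(2, 4))    # small non-zero starting weights
w2 = rng.normal(scale=0.5, size=(4, 1))
learning_rate = 0.5                        # assumed value

for epoch in range(50000):
    # forward pass through the two weight layers
    hidden = sigmoid(X @ w1)
    output = sigmoid(hidden @ w2)

    # mean squared error: squared differences summed, divided by their number
    delta = Y - output
    mse = np.mean(delta ** 2)
    if mse < 0.001:                        # stop at the 0.001 threshold
        break

    # propagate the output-layer error back and adjust the connection weights
    grad_output = delta * output * (1.0 - output)
    grad_hidden = (grad_output @ w2.T) * hidden * (1.0 - hidden)
    w2 += learning_rate * hidden.T @ grad_output
    w1 += learning_rate * X.T @ grad_hidden

print(f"training stopped after {epoch} epochs, mse = {mse:.4f}")
```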

It is important to note that "over-training" a network can be detrimental to its results. If a network is over-trained, it may have trouble generalizing to examples it is unfamiliar with (examples outside the training set). For example, if a network is over-trained on a training set consisting of sound samples of words like "would", "should" and "could", the network may be unable to recognize these words when they are spoken by a person with a different tone of voice or accent. If this problem does occur, the programmer can simply include such "unknowns" in the training set or adjust the mean squared error threshold so that training stops earlier.

Examples that the network is unfamiliar with form what is known as the validation set, which tests the network's capabilities before it is implemented for use.
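A short sketch of such a validation check follows; the held-out examples and the placeholder "trained" weights are assumptions introduced only to show the idea.

```python
# A sketch of using a validation set: examples the network has never seen are
# fed forward through the (already trained) network, and the mean squared error
# on them indicates how well the network generalizes before it is put to use.
# The weights and validation data here are placeholder assumptions.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def feed_forward(X, w1, w2):
    return sigmoid(sigmoid(X @ w1) @ w2)

rng = np.random.default_rng(2)
w1 = rng.normal(scale=0.5, size=(2, 4))       # stand-ins for trained weights
w2 = rng.normal(scale=0.5, size=(4, 1))

X_val = np.array([[0.1, 0.9], [0.9, 0.8]])    # unfamiliar (held-out) examples
Y_val = np.array([[1.0], [0.0]])              # their desired outputs

val_mse = np.mean((Y_val - feed_forward(X_val, w1, w2)) ** 2)
print("validation mean squared error:", val_mse)
```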

An interesting application of the back-propagation algorithm is its use by astronomers to predict the annual number of sunspots. The network learns to predict future sunspot counts from historical data collected over the past three centuries; this data is presented to the network, on the basis of which it can make accurate predictions.

The multi-layer feed-forward, back-propagation network is known throughout industry, academia, and research as a "universal function approximator", and in principle a very wide class of learnable functions can be taught to such a network.

Back-propagation was the first practical method, and still remains the primary method, for training multi-layer feed-forward networks. Its presentation in Rumelhart, Hinton and Williams' paper of 1986 was chiefly responsible for the renewed interest in artificial neural networks (for more information, refer to the "History" section). The original back-propagation algorithm has since been modified considerably, but the variants are all still based on the same basic principles.

Feed-forward networks trained using the back-propagation algorithm have many useful and diverse areas of application, which include the following:

• Neural networks that learn to pronounce English text.

• Neural networks that are used in speech recognition.

• Neural networks which are used in optical character recognition.

• Neural networks that are capable of steering an autonomous vehicle.

• Neural networks used for the diagnosis of cardiovascular and heart illness.

• Neural networks used in radar detection and classification.

• Neural networks used for modelling the control of eye movements.

History

The feed-forward, back-propagation network architecture was developed in the early 1970s. The initial idea is credited to various individuals, namely D.B. Parker, P.J. Werbos, G.E. Hinton, R.J. Williams, and D.E. Rumelhart. Their ideas were brought forward and combined at many international conferences and seminars. Their ideas and designs enthused the neural network industry, and soon the back-propagation, feed-forward network was introduced. Today it is the most common and widely used network and has many useful and interesting applications. Many new kinds of networks have been developed using the principles of back-propagation, feed-forward networks.

The Spheres of Application

Neural networks have many real-world applications in the fields of medicine, commerce, and the military.

Artificial neural networks in medicine are used chiefly to identify diseases like cardiovascular illness. This is done through the use of predictive networks and auto-associative memory networks, which are being used alongside conventional medical procedures. This type of network is particularly beneficial when doctors are unable to identify diseases or for patients making use of so-called "online doctors."

Artificial neural networks in finance and commerce are primarily used for predicting "booms" or "crashes" of the stock market. Because financial institutions deal with heavily statistical data, neural networks can easily be adapted to such an environment. Other uses of artificial neural networks in finance include credit evaluation, application form verification, airline seat allocation, credit card signature and character evaluation, and so on.

Artificial neural networks in the military are used for missile target evaluation, interpretation of radar and sonar signals, and other such tasks.

Conclusion

Artificial neural networks have advanced in leaps and bounds since their conception in 1943 and their first implementation to tackle real-world problems in 1958. The latest developments in neural network research are providing society with new and improved methods of tackling complex problems and tedious tasks. It can be concluded that artificial neural networks and artificial intelligence are becoming increasingly popular fields of research and development and represent a relatively new and truly fascinating paradigm of computing. The future of artificial neural networks and artificial intelligence looks very promising.

Literature

1. L.N. Yasnitskiy. Introduction to AI. Academa, 2005. 26 p.

2. http://ei.cs.vt.edu/~history/Perceptrons.Estebon.html

3. http://library.thinkquest.org/C007395/tqweb
