PRACTICAL USAGE OF ARTIFICIAL NEURAL NETWORKS FOR THE TASKS OF CLASSIFICATION AND IMAGE COMPRESSION
В.Д. Стремоухов
Scientific supervisor - Т.В. Примакина
The subject of this article is the practical application of neural networks to the tasks of classification and software image compression. The article covers the basic aspects of how neural networks operate, various kinds of network topologies and training methods, and the specifics of constructing neural networks for the practical tasks considered.
1. Introduction
In this work I am going to introduce some examples of the practical usage of neural networks. The term "neural network" can nowadays be heard just about everywhere, yet it can rarely be explained clearly by a non-specialist. M. Negnevitsky, in his book "Artificial Intelligence. A Guide to Intelligent Systems", defines a neural network as "a model of reasoning based on the human brain". However, this work does not aim to describe the principles on which our nervous system works, so I will only explain them in a few sentences.
1.1. How a neural network models the brain
A neural network consists of a number of nodes, which are called neurons. Neurons are connected by weighted links, corresponding to biological dendrites and axons.
Fig. 1. A biological neural network and the architecture of a typical artificial neural network (input layer, middle layer, output layer)
Each link is characterized by its weight, and the weights of all links form the memory of the neural network. Each neuron receives a number of input signals through its input connections (biological dendrites), and its output signal is transmitted through the outgoing connection (biological axon).
But before our neural network starts working, we must train it. The goal of the learning process is to choose the right connection weights.
1.2. Goals of this work
In this work I will explain methods of applying neural networks to problems of economic analysis (the clustering task) and image compression. The problems of choosing the right network topology, learning algorithm and data coding method in each case will be discussed.
2. Building the NN
2.1. Practical tasks
As mentioned earlier, I am going to acquaint you with methods of using neural networks for two tasks: economic analysis (the clustering task) and image compression. Solving a classification problem requires assigning the available static patterns (parameters of the market situation, medical examination data or information about a client) to certain classes, while the task of image compression, I think, needs no explanation.
Why have I chosen them? Because before the appearance of ANNs the classification task could not be solved by a machine, so in this case the ANN simply "filled the gap", replacing the real human brain.
The situation with image compression is different. This task has, of course, always been solved by computers, but image file sizes keep growing, and traditional algorithms have already done everything they could (modern image file formats even take into account how our eye "decodes" the observed image, in order to know where a decline in quality will not be noticeable), so we need brand-new algorithms. One possible trend here is the use of ANNs for image coding.
2.2. Representation and coding of the input data
For image compression, representing the input data space poses relatively few problems. The input image is split into blocks (vectors) of 8x8, 4x4 or 16x16 pixels. Sometimes these blocks are gathered into groups of blocks; usually this is tightly connected with the chosen topology. Each pixel value is normalized into the segment [0, 1]. Normalized pixel values are used because neural networks operate more efficiently when both their inputs and outputs are limited to the range [0, 1]. The right choice of the normalization function is a long conversation, connected with biological optics (attention is usually paid to the characteristics of the human eye).
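To make the coding step concrete, here is a minimal sketch (my own illustration, not taken from the article) of splitting a grayscale image into 8x8 blocks and linearly normalising the pixel values into [0, 1]; the function name and the random test image are assumptions.

```python
import numpy as np

def image_to_blocks(image, block=8):
    """Split a 2-D uint8 image into flattened block vectors scaled to [0, 1]."""
    h, w = image.shape
    h, w = h - h % block, w - w % block           # drop edge pixels that do not fit
    blocks = (image[:h, :w]
              .reshape(h // block, block, w // block, block)
              .swapaxes(1, 2)
              .reshape(-1, block * block))
    return blocks.astype(np.float32) / 255.0      # simple linear normalisation

# Example: a random 32x32 "image" becomes sixteen 64-dimensional input vectors.
vectors = image_to_blocks(np.random.randint(0, 256, (32, 32), dtype=np.uint8))
print(vectors.shape)    # (16, 64)
```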
When constructing an ANN-based classifier (for use in economics or medicine), the task of input data representation is perhaps one of the most complicated. First of all, it is necessary to determine the complexity level of the system. In real tasks the situation is very common when the number of patterns is limited, and this complicates evaluating the task's complexity. Three main levels of complexity can be specified. The first (and simplest) is when the classes can be separated by straight lines (or hyperplanes, if the input space has more than two dimensions); this is so-called linear separability. In the second case the classes cannot be separated by lines (planes), but they can be separated by more complex divisions: nonlinear separability. In the third case the classes intersect and we can only speak of probabilistic separability.
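As a hedged illustration of the first complexity level (my own example, not from the article): a single perceptron converges only on linearly separable data, so it can serve as a quick probe of linear separability. The AND problem is separable, XOR is not.

```python
import numpy as np

def perceptron_separable(X, y, epochs=100, lr=0.1):
    """Return True if a separating hyperplane is found within the given epochs."""
    Xb = np.hstack([X, np.ones((len(X), 1))])     # append a bias input
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        errors = 0
        for xi, target in zip(Xb, y):
            pred = 1 if xi @ w > 0 else 0
            w += lr * (target - pred) * xi        # classic perceptron update
            errors += int(pred != target)
        if errors == 0:
            return True
    return False

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
print(perceptron_separable(X, np.array([0, 0, 0, 1])))   # AND -> True
print(perceptron_separable(X, np.array([0, 1, 1, 0])))   # XOR -> False
```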
In an ideal model, after preliminary processing we would obtain a linearly separable problem, because then the construction of the classifier is significantly simpler. Unfortunately, with real problems we have a limited number of patterns to construct the classifier from, and we cannot perform preliminary data processing that would result in linear separability of the patterns.
When using neural networks as classifiers, feed-forward networks are a universal tool for function approximation, and therefore they can be used for solving classification tasks.
For the purpose of constructing a classifier it is necessary to determine which parameters influence the decision to assign a pattern to one class or another. Two problems can arise. First, if the number of parameters is small, a situation can develop when the same set of initial data corresponds to examples in different classes. It will then be impossible to train the neural network, and the system will not work correctly (it is impossible to find the minimum corresponding to such an initial data set); the initial data must not be contradictory. To solve this problem it is necessary to increase the dimensionality of the attribute space (the number of components of the input vector corresponding to a pattern). But after increasing the dimensionality of the attribute space we can face a situation when the number of training patterns is no longer enough for training the system, and instead of generalizing it will simply memorize the training samples and will not function correctly. Thus, when determining the attributes, we have to find a compromise in their number.
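A small sketch of the contradiction check described above (my own helper; the data rows and labels are made up): identical input vectors assigned to different classes make it impossible to drive the training error to its minimum.

```python
def contradictory_patterns(X, y):
    """Return input vectors that appear with more than one class label."""
    seen, conflicts = {}, []
    for row, label in zip(map(tuple, X), y):
        if row in seen and seen[row] != label:
            conflicts.append(row)
        seen.setdefault(row, label)
    return conflicts

X = [[1.0, 0.2], [0.5, 0.9], [1.0, 0.2]]
y = [0, 1, 1]                            # the first and third patterns clash
print(contradictory_patterns(X, y))      # [(1.0, 0.2)]
```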
We have already touched on the problem of input value normalization. Various functions can be used, from a simple linear transformation into the required range to multivariate parameter analysis and non-linear normalization, depending on the cross-impact of the parameters.
2.3. Choosing the size and topology
Correct selection of the network size is very important. It is often impossible to construct a small and well-performing model. The main problem when building a classifier is that the training set is usually very limited, so a larger model will sometimes simply memorize patterns from the training samples instead of performing an approximation, which certainly results in incorrect functioning of the classifier. There are two main approaches to network construction: constructive and destructive. With the first approach a network of minimal size is created first and then gradually enlarged to achieve the required accuracy; on every step it is trained again. There is also the so-called cascade-correlation method, in which after the end of an epoch the network architecture is corrected to minimize the error.
With the destructive approach, an oversized network is taken first, and then the nodes and connections that have little influence on the decision are removed. It is useful to remember the following rule: the number of patterns in the training set must be higher than the number of weights being adjusted. Otherwise, instead of generalizing, the network will simply memorize the data and lose its ability to classify; the result will be undefined for patterns that were not included in the training set.
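A back-of-the-envelope check of this rule (my own helper, with hypothetical layer sizes): count the adjustable weights of a fully connected feed-forward network and compare that number with the size of the training set.

```python
def adjustable_weights(layer_sizes):
    """Number of weights (including biases) in a fully connected feed-forward net."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Example: a 64-16-64 network has (64+1)*16 + (16+1)*64 = 2128 adjustable weights,
# so its training set should contain comfortably more than ~2128 patterns.
print(adjustable_weights([64, 16, 64]))   # 2128
```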
In the case of image compression, the size and topology are chosen according to the size of the initial image and its input blocks, and to the level of compression we want to achieve.
2.3.1. Back-Propagation Neural Network
Fig. 2. A three-layer back-propagation neural network
It is the simplest type of neural network, named after the learning algorithm usually used in it. Typically it is a simple three-layer network, but sometimes a more complicated architecture is used. For example, in image compression a Hierarchical Back-Propagation Neural Network can be used, which is essentially a network of sub-networks.
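A minimal sketch of the narrow-channel idea mentioned later in the text (my own illustration; the layer sizes, learning rate and random stand-in data are assumptions): a three-layer network with 64 inputs, 16 hidden neurons and 64 outputs is trained by plain back-propagation to reproduce its input, so the 16 hidden activations serve as a compressed code for each 8x8 block.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

n_in, n_hid = 64, 16
W1 = rng.normal(0, 0.1, (n_in, n_hid))
b1 = np.zeros(n_hid)
W2 = rng.normal(0, 0.1, (n_hid, n_in))
b2 = np.zeros(n_in)

X = rng.random((256, n_in))              # stand-in for normalised image blocks
lr = 0.5
for epoch in range(200):                 # plain batch back-propagation
    H = sigmoid(X @ W1 + b1)             # hidden code (the compressed data)
    Y = sigmoid(H @ W2 + b2)             # reconstructed block
    err = Y - X
    dY = err * Y * (1 - Y)               # sigmoid derivative at the output layer
    dH = (dY @ W2.T) * H * (1 - H)       # error propagated back to the hidden layer
    W2 -= lr * H.T @ dY / len(X)
    b2 -= lr * dY.mean(axis=0)
    W1 -= lr * X.T @ dH / len(X)
    b1 -= lr * dH.mean(axis=0)

print("mean squared reconstruction error:", float((err ** 2).mean()))
```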
2.3.2. Hebbian-Based Neural Network
A Hebbian-based neural network is a network in which learning proceeds according to Hebb's Law. Actually, Hebbian learning can be applied to different types of network topologies, but most often it is used in rather big networks with more than one hidden layer.
Fig. 3. Hebbian learning in a neural network
There are other types of ANN topology, but the main idea you should take away is that the network topology is usually chosen according to the selected learning algorithm, so let us now talk about training methods.
2.4. Methods of ANN training
2.4.1. Back-Propagation
To derive the back-propagation learning law, let us consider the three-layer network. The indices i,j,k refer to neurons in the input, hidden, output layers respectively.
Input signals x_1, x_2, ..., x_n are propagated through the network from left to right, and error signals e_1, e_2, ..., e_l from right to left. The symbol w_ij denotes the weight of the connection between neuron i in the input layer and neuron j in the hidden layer, and w_jk the weight between neuron j in the hidden layer and neuron k in the output layer.
To propagate error signals, we start at the output layer and work backward to the hidden layer. The error signal at the output of neuron k at iteration p is defined by
e_k(p) = y_d,k(p) - y_k(p), where y_d,k(p) is the desired output of neuron k at iteration p.
Neuron k, which is located in the output layer, is supplied with a desired output of its own. Hence, we may use a straightforward procedure to update the weight w_jk. The rule for updating weights at the output layer is

w_jk(p+1) = w_jk(p) + Δw_jk(p),

where Δw_jk(p) is the weight correction.
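The article does not spell out the weight correction itself; in the standard formulation (as given by Negnevitsky) it is Δw_jk(p) = α · y_j(p) · δ_k(p), where α is the learning rate and δ_k(p) = y_k(p) · (1 - y_k(p)) · e_k(p) is the error gradient of a sigmoid output neuron. The toy numbers below are purely illustrative.

```python
# One update of an output-layer weight w_jk, with made-up values.
alpha = 0.1                           # learning rate
y_j, y_k, yd_k = 0.52, 0.80, 1.0      # hidden output, actual output, desired output
e_k = yd_k - y_k                      # error signal at output neuron k
delta_k = y_k * (1 - y_k) * e_k       # error gradient for a sigmoid output
dw_jk = alpha * y_j * delta_k         # weight correction
w_jk = 0.3
w_jk = w_jk + dw_jk                   # w_jk(p+1) = w_jk(p) + dw_jk
print(round(w_jk, 4))                 # 0.3017
```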
This method is perhaps the simplest to implement, but for complicated tasks it is often not the best choice. However, an improvement of this method, called adaptive back-propagation, is used in image compression.
2.4.2. Adaptive back-propagation
An adaptive back-propagation neural network is designed to make neural network compression adaptive to the content of the input image. The basic idea is to classify the input image blocks into a few sub-sets with different features according to a complexity measurement. A fine-tuned neural network then compresses each sub-set.
Training of such a neural network can be organised in three ways: (a) parallel training; (b) serial training; and (c) activity-based training.
The parallel training scheme applies the complete training set simultaneously to all neural networks and uses the S/N (signal-to-noise) ratio to roughly classify the image blocks into the same number of sub-sets as there are neural networks. After this initial coarse classification is completed, each neural network is further trained on its corresponding refined sub-set of training blocks.
Serial training involves an adaptive search process that builds up the necessary number of neural networks to accommodate the different patterns embedded in the training images. Starting with a neural network with a pre-defined minimum number of hidden neurones, h_min, the network is roughly trained on all the image blocks. The S/N ratio is used again to classify all the blocks into two classes depending on whether their S/N ratio exceeds a preset threshold. For the blocks with higher S/N ratios, training continues on the next neural network, with the number of hidden neurones increased and the corresponding threshold readjusted for further classification. This process is repeated until the whole training set is classified into a maximum number of sub-sets corresponding to the same number of neural networks.
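Both schemes rely on an S/N-based coarse classification of the blocks. The following is a hedged sketch of that step only (my own simplification; the reconstructions are simulated with random noise and the 20 dB threshold is an arbitrary assumption).

```python
import numpy as np

def snr_db(block, reconstruction):
    """Signal-to-noise ratio of a reconstructed block, in decibels."""
    noise = np.mean((block - reconstruction) ** 2)
    return 10 * np.log10(np.mean(block ** 2) / noise) if noise > 0 else np.inf

rng = np.random.default_rng(1)
blocks = rng.random((100, 64))
reconstructions = blocks + rng.normal(0, 0.05, blocks.shape)   # stand-in network output

threshold = 20.0                                               # preset threshold, dB
easy = [b for b, r in zip(blocks, reconstructions) if snr_db(b, r) >= threshold]
hard = [b for b, r in zip(blocks, reconstructions) if snr_db(b, r) < threshold]
print(len(easy), len(hard))   # blocks kept by the current network vs. passed on
```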
In the activity-based training scheme, two extra parameters (the activity A(P_i) and four preferential directions) are defined to classify the training set rather than using the neural networks themselves. Hence the back-propagation training of each neural network can be completed in one phase using its appropriate sub-set.
The so-called activity of the i-th block is defined as

A(P_i) = Σ_{j,k} A_p(P_i(j,k)),   where   A_p(P_i(j,k)) = Σ_{r=-1..+1} Σ_{s=-1..+1} (P_i(j,k) - P_i(j+r, k+s))^2,

and A_p(P_i(j,k)) is the activity of each pixel, which takes into account its eight neighbouring pixels as r and s vary from -1 to +1.
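A direct transcription of this activity measure into code (my own sketch; border pixels are skipped so that all eight neighbours exist, which the formula leaves unspecified).

```python
import numpy as np

def block_activity(P):
    """Sum, over interior pixels, of squared differences to the eight neighbours."""
    total = 0.0
    for j in range(1, P.shape[0] - 1):
        for k in range(1, P.shape[1] - 1):
            for r in (-1, 0, 1):
                for s in (-1, 0, 1):
                    total += (P[j, k] - P[j + r, k + s]) ** 2
    return total

flat    = np.full((8, 8), 0.5)               # uniform block: activity 0
stripes = np.tile([0.0, 1.0], (8, 4))        # alternating columns: high activity
print(block_activity(flat), block_activity(stripes))
```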
Prior to training, all image blocks are classified into four classes according to their activity values, identified as very low, low, high and very high activity. Hence four neural networks with an increasing number of hidden neurones are designed to compress the four different sub-sets of input images after the training phase is completed.
On top of the activity parameter, a further feature-extraction technique is applied by considering the four main directions present in image detail, i.e. the horizontal, vertical and two diagonal directions. These preferential direction features can be evaluated by calculating the mean squared differences among neighbouring pixels along the four directions.
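A hedged sketch of these directional features (my own formulation of "mean squared differences among neighbouring pixels" along each direction; the stripe test block is made up).

```python
import numpy as np

def directional_features(P):
    """Mean squared pixel differences along the four main directions of a block."""
    return {
        "horizontal": float(np.mean((P[:, 1:] - P[:, :-1]) ** 2)),
        "vertical":   float(np.mean((P[1:, :] - P[:-1, :]) ** 2)),
        "diagonal_1": float(np.mean((P[1:, 1:] - P[:-1, :-1]) ** 2)),
        "diagonal_2": float(np.mean((P[1:, :-1] - P[:-1, 1:]) ** 2)),
    }

stripes = np.tile([0.0, 1.0], (8, 4))        # vertical stripes
print(directional_features(stripes))         # horizontal differences dominate the vertical ones
```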
For the image patterns classified as high activity, four further neural networks corresponding to the above directions are added to refine their structures and tune their learning processes to the preferential orientations of the input. Hence the overall neural network system is designed to have six neural networks, two of which correspond to the low-activity and medium-activity sub-sets, while the other four correspond to the high-activity blocks and the four direction classifications.
2.4.3. Hebbian learning
While the back-propagation-based narrow-channel neural network aims at achieving compression upper-bounded by the K-L transform, a number of Hebbian learning rules have been developed to address the issue of how the principal components can be extracted directly from input image blocks to achieve image data compression. The general neural network structure consists of one input layer and one output layer. The Hebbian learning rule comes from Hebb's postulates:
1. If the two neurons on either side of a connection are activated synchronously, then the weight of the connection is increased.
2. If the two neurons on either side of a connection are activated asynchronously, then the weight of the connection is decreased.
Hence, for the output values expressed as [h] = [w]^T [x], the learning rule can be described as:

W_i(t+1) = (W_i(t) + α h_i(t) X(t)) / ||W_i(t) + α h_i(t) X(t)||,

where W_i(t+1) = {w_i1, w_i2, ..., w_iN} is the i-th new coupling weight vector in the next cycle (t+1); 1 <= i <= M, and M is the number of output neurons; α is the learning rate; h_i(t) is the i-th output value; X(t) is the input vector corresponding to each individual image block; and ||·|| is the Euclidean norm used to normalise the updated weights and make the learning stable.
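A minimal sketch of this normalised Hebbian update for a single output neuron (my own implementation; the data set, learning rate and sizes are made up). Repeated application tends to pull the weight vector towards the first principal component of the inputs, which is exactly the property exploited for compression.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8)) @ np.diag([3, 1, 1, 1, 1, 1, 1, 1])   # one dominant direction
alpha = 0.01                                  # learning rate

w = rng.normal(size=8)
w /= np.linalg.norm(w)
for x in X:                                   # one output neuron: h = w^T x
    h = w @ x
    w = w + alpha * h * x                     # Hebbian increment
    w /= np.linalg.norm(w)                    # normalisation keeps the learning stable

print(np.round(w, 2))                         # close to +/- the first coordinate axis
```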
From the basic learning rule, a number of variations have been developed in the existing research.
2.4.4. Genetic algorithms
Fig. 4. The GA
This type of learning algorithm is based on C. Darwin's evolutionary laws. Once the topology of the network is established, each link weight is coded into a bit string of fixed length, called a gene. The set of genes in a fixed order is called a chromosome. When the size of the chromosome is determined, an initial population of chromosomes is created randomly, and these chromosomes then take part in the GA cycle. Fig. 4 shows the principal scheme of the GA cycle offered by M. Negnevitsky.
GAs are often used when the initial data set is very limited (a typical situation in economic analysis).
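To make the cycle concrete, here is a compact sketch of a GA searching for network weights (my own illustration, loosely following the selection, crossover and mutation loop; real encodings use fixed-length bit strings per weight, while real values are used here for brevity, and the target vector is purely hypothetical).

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.array([0.3, -0.7, 0.5])                 # stand-in "ideal" weight vector

def fitness(chromosome):                            # higher is better
    return -np.sum((chromosome - target) ** 2)

population = rng.normal(size=(20, 3))               # random initial chromosomes
for generation in range(100):
    scores = np.array([fitness(c) for c in population])
    parents = population[np.argsort(scores)][-10:]  # keep the fittest half
    children = []
    for _ in range(10):
        a, b = parents[rng.integers(10, size=2)]
        cut = rng.integers(1, 3)
        child = np.concatenate([a[:cut], b[cut:]])  # one-point crossover
        child += rng.normal(0, 0.05, 3) * (rng.random(3) < 0.2)   # sparse mutation
        children.append(child)
    population = np.vstack([parents, children])

best = max(population, key=fitness)
print(np.round(best, 2))                            # should approach `target`
```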
3. Conclusion
With the appearance of neural networks, new types of tasks can be solved by computers. But to make the ANN work properly, we should use a proper structure and learning algorithm.
First, we should work with the data. In the image compression task this means we should choose the block size; in the case of building a classifier it means we should decompose the whole data set into two sets, a training set and a testing set (decomposition into three sets is also possible: training, testing and confirmation sets). The complexity level should also be established.
Then, according to the initial data analysis, we should choose the right topology and training method. There are many training methods, but for each typical task only a few of them are widely used. For example, essentially only two methods are used when solving the classification task: back-propagation learning and genetic algorithms. Attention also has to be paid to the choice of activation function.
Literature
1. Rutkovskaya D. Neural Networks, Genetic Algorithms and Fuzzy Systems. Warszawa: LODZ, 1999. 382 p.
2. Negnevitsky M. Artificial Intelligence: A Guide to Intelligent Systems. Harlow: PEL, 2002. 395 p.
3. Image Compression with Neural Networks // Digital Imaging & Data Compression. http://www.comp.glam.ac.uk/digimaging/research-activities.htm
4. Practical Usage of Neural Networks for the Tasks of Classification (Clustering). http://www.basegroup.ru/neural/practice.en.htm