
Section "Mathematical Methods of Modeling, Control and Data Analysis"

UDC 519.6

EFFECTIVE LEARNING OF DEEP NEURAL NETWORKS FOR PATTERN RECOGNITION TASKS

N. V. Shaposhnikova, Y. S. Ganzha, M. V. Gordienko
Scientific supervisor - L. V. Lipinskiy

Reshetnev Siberian State University of Science and Technology
31, Krasnoyarskii rabochii prospekt, Krasnoyarsk, 660037, Russian Federation
E-mail: shapninel@yandex.ru, yanavaio@yandex.ru

This article discusses modern approaches to building deep neural networks based on the idea of tensorization, which speeds up the training and operation of networks by orders of magnitude and thus makes it possible to use them on mobile devices in offline mode.

Keywords: artificial intelligence, deep learning, tensor network, artificial neural network, low rank approximation.


Introduction. Today, artificial neural networks (ANNs) and deep learning [1] have become practically indispensable in applications such as machine vision, machine translation, speech-to-text conversion, text categorization, and video processing. In the coming years, ANN methods will begin to be actively applied in autonomous driving systems for cars and aircraft, in autonomous robotic systems in manufacturing, in automated biomedical systems, and in other robotic applications.

However, for neural network methods to be used in practice and to accelerate the development of the country's scientific, technical, and technological complex, new algorithmic ideas and developments must be brought to the stage of practical application, i.e., technological models for applying the new methods must be developed.

Tensor networks. In spite of a number of classical theorems proving the approximation capacity of neural network structures, current advances in the ANN field are in most cases associated with the heuristic construction of a network architecture applicable only to the particular problem under consideration. There is still no complete understanding of the internal laws governing the functioning of a network, of the necessity or redundancy of particular layers, or of methods for the optimal choice of hyperparameters.

The lack of comprehensive scientific answers to the above questions significantly limits the qualitative development of the artificial neural network method and necessitates both the modification of existing algorithms and the development of new approaches. Significant progress on this problem has been made in recent years by establishing connections between deep artificial neural networks and tensor networks [2], which made it possible to use methods of low-rank tensor approximation, in particular the TT decomposition [3], for substantial compression and faster training of deep artificial neural networks [4, 5].

These methods are based on the idea of low-rank tensor approximation and on the construction of the corresponding compact (low-rank) representation (decomposition) of a multidimensional data array of the form:

X \in \mathbb{R}^{N_1 \times N_2 \times \cdots \times N_d},    (1)

where d (d = 1, 2, 3, ...) is the dimension of the array.

Within the tensor-train (TT) decomposition [3], the multidimensional array X is represented, by repeatedly computing the singular value decomposition of its unfolding matrices, as:

X(i_1, i_2, \ldots, i_d) = \sum_{\alpha_1=1}^{r_1} \sum_{\alpha_2=1}^{r_2} \cdots \sum_{\alpha_{d-1}=1}^{r_{d-1}} G_1(1, i_1, \alpha_1)\, G_2(\alpha_1, i_2, \alpha_2) \cdots G_d(\alpha_{d-1}, i_d, 1),    (2)

where r_1, r_2, \ldots, r_{d-1} are the ranks of the decomposition, and the three-dimensional arrays G_1, G_2, \ldots, G_d are the so-called cores (kernels) of the decomposition.

As can be seen from the above formula, the total number of parameters in such a decomposition does not exceed d \cdot n \cdot \max_k (r_k)^2, where n = \max_k N_k; i.e., for bounded decomposition ranks it depends only linearly on the dimension and on the number of elements along each mode.
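As an illustration of formula (2) and of this parameter count, the following is a minimal NumPy sketch of the TT-SVD procedure (repeated SVDs of the unfolding matrices). The function names, the uniform rank bound max_rank, and the test array are ours and are given for illustration only; this is not the authors' implementation.

```python
import numpy as np

def tt_svd(x, max_rank):
    """TT-SVD sketch: represent a d-dimensional array by cores G_1, ..., G_d
    as in (2), truncating each unfolding SVD to rank at most max_rank."""
    shape = x.shape
    cores, r_prev = [], 1
    c = x.reshape(shape[0], -1)                      # first unfolding matrix
    for k in range(len(shape) - 1):
        c = c.reshape(r_prev * shape[k], -1)
        u, s, vt = np.linalg.svd(c, full_matrices=False)
        r = min(max_rank, len(s))                    # bound the rank r_k
        cores.append(u[:, :r].reshape(r_prev, shape[k], r))
        c = s[:r, None] * vt[:r, :]                  # remainder carried to the next step
        r_prev = r
    cores.append(c.reshape(r_prev, shape[-1], 1))
    return cores

def tt_to_full(cores):
    """Contract the cores back into the full array (for verification only)."""
    full = cores[0]
    for g in cores[1:]:
        full = np.tensordot(full, g, axes=([-1], [0]))
    return full.reshape([g.shape[1] for g in cores])

# Small check: the number of stored values is bounded by d * max(N_k) * max_rank**2.
x = np.random.rand(4, 5, 6, 7)
cores = tt_svd(x, max_rank=3)
print(sum(g.size for g in cores), "stored values instead of", x.size)
print("relative error of the rank-3 approximation:",
      np.linalg.norm(x - tt_to_full(cores)) / np.linalg.norm(x))
```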

The TT decomposition can also be applied to matrices and vectors if they are first formally represented as multidimensional arrays by the tensorization procedure (an artificial increase of the dimensionality of a low-dimensional array).

For the resulting objects, the memory consumption and the computational complexity become logarithmic in the number of elements, which in some cases makes it possible to reduce the consumption of memory and computing resources by a factor of a thousand.
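A short continuation of the sketch above illustrates tensorization and the resulting savings: a vector of 2^16 samples of a smooth signal is reshaped into a 16-dimensional 2 x 2 x ... x 2 array and then compressed. The concrete signal, the rank bound, and the reported numbers are an assumed example, not results from the paper.

```python
import numpy as np  # tt_svd and tt_to_full are taken from the sketch above

# Tensorization: a length-2**16 vector is viewed as a 16-dimensional 2 x ... x 2 array.
t = np.linspace(0.0, 1.0, 2 ** 16)
v = np.sin(20 * np.pi * t)                 # smooth signal, so low TT ranks are expected
x = v.reshape([2] * 16)                    # artificially increased dimensionality

cores = tt_svd(x, max_rank=4)
approx = tt_to_full(cores).reshape(-1)

print("dense values stored:", v.size)                       # 65536
print("TT values stored:   ", sum(g.size for g in cores))   # far fewer than the dense vector
print("relative error:     ", np.linalg.norm(v - approx) / np.linalg.norm(v))  # small here
```

For arrays without such low-rank structure the compression is weaker; the thousandfold savings mentioned above refer to favourable cases.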

The tensor-train decomposition makes it possible to compactly represent the fully connected and convolutional layers of artificial neural networks, which in turn speeds up network operation by an order of magnitude.
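To show how a weight matrix stored in TT format can compactly represent a fully connected layer, below is a sketch of a matrix-by-vector product performed directly on the TT cores, in the spirit of [5]. The core shapes, mode sizes, and random stand-in weights are assumptions made for illustration, not the architecture used by the authors.

```python
import numpy as np

def tt_matvec(cores, x):
    """Apply a TT-format matrix to a vector without forming the full weight matrix.
    Each core has shape (r_{k-1}, m_k, n_k, r_k); the layer maps prod(n_k) -> prod(m_k)."""
    in_modes = [g.shape[2] for g in cores]
    z = x.reshape([1] + in_modes)                 # leading rank axis of size 1
    for g in cores:
        # contract the rank axis and the current input mode n_k
        z = np.tensordot(g, z, axes=([0, 2], [0, 1]))
        # reorder (m_k, r_k, inputs left..., outputs so far...) ->
        #         (r_k, inputs left..., outputs so far..., m_k)
        z = np.moveaxis(z, [1, 0], [0, -1])
    return z.reshape(-1)                          # final shape is (1, m_1, ..., m_d)

# A 1024 -> 1024 fully connected layer stored as a TT matrix with modes 4 x 4 x 4 x 4 x 4
# and ranks bounded by 8; random cores stand in for trained weights.
rng = np.random.default_rng(0)
ranks = [1, 8, 8, 8, 8, 1]
cores = [rng.standard_normal((ranks[k], 4, 4, ranks[k + 1])) for k in range(5)]

x = rng.standard_normal(4 ** 5)                   # input vector of length 1024
y = tt_matvec(cores, x)                           # output vector of length 1024

print("dense weights:", 4 ** 5 * 4 ** 5, "TT weights:", sum(g.size for g in cores))
```

Here the layer stores a few thousand values instead of about a million, which is the kind of compression that makes on-device inference feasible.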

Conclusion. The idea of low-rank approximation of multidimensional arrays described in this paper is a multidimensional analogue of the classical matrix decompositions (the singular value decomposition, the skeleton decomposition, etc.) and makes it possible to overcome the so-called curse of dimensionality.

The tensor-train decomposition based on this idea is actively used in a wide range of practical applications. Its use for accelerating deep neural network structures, shown in our work, significantly reduces the requirements for computing resources, which makes it possible to employ such networks in autonomous unmanned devices, including aircraft.

References

1. LeCun Y., Bengio Y., Hinton G. Deep learning. Nature, 2015, vol. 521, no. 7553, pp. 436–444.

2. Cichocki A. et al. Tensor networks for dimensionality reduction and large-scale optimization: Part 1. Low-rank tensor decompositions. Foundations and Trends in Machine Learning, 2016, vol. 9, no. 4–5, pp. 249–429.

3. Oseledets I. V. Tensor-train decomposition. SIAM Journal on Scientific Computing, 2011, vol. 33, no. 5, pp. 2295–2317.

4. Lebedev V. et al. Speeding-up convolutional neural networks using fine-tuned CP-decomposition. arXiv preprint arXiv:1412.6553, 2014.

5. Novikov A., Podoprikhin D., Osokin A., Vetrov D. P. Tensorizing neural networks. Advances in Neural Information Processing Systems, 2015, pp. 442–450.

© Shaposhnikova N. V., Ganzha Y. S., Gordienko M. V., 2020
