
Actual Problems of Aviation and Cosmonautics - 2021. Vol. 2

УДК 519.711.3

CANONICAL DECOMPOSITION FOR TENSORIZATION OF NEURAL NETWORKS

Y. S. Ganzha*, N. V. Shaposhnikova
Scientific advisers: A. M. Popov, V. A. Okhorzin

Reshetnev Siberian State University of Science and Technology
31, Krasnoyarskii rabochii prospekt, Krasnoyarsk, 660037, Russian Federation
*E-mail: yanavaio@yandex.ru, shapninel@yandex.ru

The article describes the practical construction of a low-rank approximation for the kernel of a convolutional ANN, together with an efficient implementation of its convolution with a vector and of differentiation with respect to its parameters.

Keywords: neural networks, canonical decomposition, tensorization, tensor.


Introduction. Considering the kernel of a convolutional ANN as a four-dimensional array (tensor), we can construct a low-rank approximation for it with an efficient implementation of its convolution with a vector (forward signal propagation through the network when forming a prediction) and of differentiation with respect to its parameters (backward signal propagation through the network during training). We consider the practical feasibility of this ANN tensorization approach on a model numerical example: the problem of automatic recognition of handwritten digits using the classical MNIST dataset [1], a labeled set of handwritten digit images (60,000 training images and 10,000 test images).

A well-suited ANN architecture for working with graphical images is a sequence of convolutional layers. Since the canonical decomposition provides the highest compression ratio, we first use this low-rank tensor format to compactly represent the kernels of the ANN convolutional layers. For the software implementation, we use the popular PyTorch machine learning framework in the Python programming language and the Google Colab cloud service, which allows training powerful deep neural network architectures on modern computing equipment. We use the ready-made implementation of the classical canonical decomposition from the TensorLy Python library [2].
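To make the canonical format concrete, the following NumPy sketch assembles a 4-D kernel from four factor matrices and compares parameter counts; the sizes here match the layers used later in the article, and in the implementation itself the factors are obtained with TensorLy's parafac routine or trained directly.

```python
import numpy as np

rng = np.random.default_rng(0)
c_out, c_in, k, rank = 10, 10, 3, 3

# Canonical (CP) format: the kernel K[o, i, h, w] is a sum over `rank`
# terms of outer products of four factor vectors, one matrix per mode.
A = rng.standard_normal((c_out, rank))
B = rng.standard_normal((c_in, rank))
C = rng.standard_normal((k, rank))
D = rng.standard_normal((k, rank))
kernel = np.einsum('or,ir,hr,wr->oihw', A, B, C, D)

dense_params = c_out * c_in * k * k        # 900 numbers in the dense kernel
cp_params = rank * (c_out + c_in + k + k)  # 78 numbers in the CP factors
print(kernel.shape, dense_params, cp_params)  # (10, 10, 3, 3) 900 78
```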

As the base model for the handwritten digit recognition problem, we consider a standard architecture consisting of two sequential convolutional layers, one internal fully connected layer, and an output layer with 10 neurons, each of which gives the probability that the corresponding digit is present in the image.
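A minimal PyTorch sketch of such a base network is given below; the max-pooling steps and the hidden-layer width of 64 are our assumptions, since the article fixes only the two 3x3 convolutional layers with 10 filters and the 10-neuron output.

```python
import torch
import torch.nn as nn

class NetBase(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(10, 10, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2)           # assumed: halves each spatial dim
        self.fc1 = nn.Linear(10 * 7 * 7, 64)  # assumed hidden width
        self.fc2 = nn.Linear(64, 10)          # one output per digit class

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))  # 28x28 -> 14x14
        x = self.pool(torch.relu(self.conv2(x)))  # 14x14 -> 7x7
        x = torch.relu(self.fc1(x.flatten(1)))
        return self.fc2(x)  # class logits for digits 0-9

logits = NetBase()(torch.randn(1, 1, 28, 28))
assert logits.shape == (1, 10)
```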

Section "Mathematical Methods of Modeling, Control and Data Analysis"

We implement the tensor layer as an independent class "LayTens", which can replace the corresponding library class of the standard convolutional layer, "Conv2d", in the PyTorch framework.

Within the described approach, we treat the kernel of the convolutional layer as a tensor represented in the low-rank canonical format, where the decomposition rank (which in effect determines the compression ratio) is specified by the user as a layer parameter. When a signal passes through this layer (the "forward" method), we transform the input vector into a 4-dimensional tensor and then perform the corresponding convolutions with the four factor matrices of the canonical decomposition.
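The factorized forward pass can be checked directly: convolution with a dense kernel assembled from the CP factors coincides with a chain of four cheap convolutions (1x1 over input channels, two one-dimensional per-rank convolutions over height and width, 1x1 to the output channels). A sketch with hypothetical dimensions:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
rank, c_in, c_out, k = 3, 4, 6, 3

# CP factors of the kernel: K[o, i, h, w] = sum_r A[o,r] B[i,r] C[h,r] D[w,r]
A = torch.randn(c_out, rank)
B = torch.randn(c_in, rank)
C = torch.randn(k, rank)
D = torch.randn(k, rank)
K = torch.einsum('or,ir,hr,wr->oihw', A, B, C, D)

x = torch.randn(1, c_in, 8, 8)
y_full = F.conv2d(x, K, padding=1)  # ordinary convolution with the dense kernel

# The same result as four cheap convolutions with the factor matrices
t = F.conv2d(x, B.t().reshape(rank, c_in, 1, 1))                            # 1x1: input channels -> rank
t = F.conv2d(t, C.t().reshape(rank, 1, k, 1), padding=(1, 0), groups=rank)  # per-rank, over height
t = F.conv2d(t, D.t().reshape(rank, 1, 1, k), padding=(0, 1), groups=rank)  # per-rank, over width
y_cp = F.conv2d(t, A.reshape(c_out, rank, 1, 1))                            # 1x1: rank -> output channels
assert torch.allclose(y_full, y_cp, atol=1e-4)
```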

We create a "NetTens" class corresponding to the tensorized convolutional network. Its architecture is the same as that of the base "NetBase", except that the conventional convolutional layers are replaced by their tensorized versions.
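For illustration, a minimal version of such a drop-in layer might look as follows; the class name follows the article, while the constructor interface and the parameter initialization scale are our assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LayTens(nn.Module):
    """Minimal sketch of a CP-format convolutional layer (Conv2d replacement)."""
    def __init__(self, c_in, c_out, k, rank, padding=0):
        super().__init__()
        self.rank, self.k, self.padding = rank, k, padding
        # The four factor matrices of the canonical decomposition are the
        # trainable parameters; autograd differentiates through them directly.
        self.A = nn.Parameter(torch.randn(c_out, rank) * 0.1)
        self.B = nn.Parameter(torch.randn(c_in, rank) * 0.1)
        self.C = nn.Parameter(torch.randn(k, rank) * 0.1)
        self.D = nn.Parameter(torch.randn(k, rank) * 0.1)

    def forward(self, x):
        r, k, p = self.rank, self.k, self.padding
        t = F.conv2d(x, self.B.t().reshape(r, -1, 1, 1))
        t = F.conv2d(t, self.C.t().reshape(r, 1, k, 1), padding=(p, 0), groups=r)
        t = F.conv2d(t, self.D.t().reshape(r, 1, 1, k), padding=(0, p), groups=r)
        return F.conv2d(t, self.A.reshape(-1, r, 1, 1))

layer = LayTens(c_in=1, c_out=10, k=3, rank=3, padding=1)
y = layer(torch.randn(2, 1, 28, 28))
assert y.shape == (2, 10, 28, 28)
```

Swapping nn.Conv2d(1, 10, 3, padding=1) for LayTens(1, 10, 3, rank=3, padding=1) in the base network then yields the tensorized variant.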

In Fig. 1 and Fig. 2, we present the results of training the base and the tensorized network (with canonical tensor rank equal to 3), respectively, over five training epochs.

Fig. 1 - Calculation result for the base CNN (loss function and prediction accuracy vs. training epoch, for the training and test sets)

Fig. 2 - Calculation result for the tensorized CNN using the canonical decomposition (tensor rank equal to 3; loss function and prediction accuracy vs. training epoch, for the training and test sets)

For the convolutional layers, we use 10 filters with a 3x3 window; the learning rate is set to 0.0001. In Table 1, we present the dependence of the result on the selected rank of the canonical decomposition. The accuracy in the table is the percentage of correctly recognized images in the test dataset, and the compression ratio is defined as the ratio of the number of parameters of the base convolutional layer to the number of parameters of the corresponding tensorized layer.


Table 1

Results of calculations using the canonical decomposition, depending on the canonical tensor rank

Architecture | Tensor rank | Accuracy, % | Compression ratio
Convolutional, base | - | 98.34 | 1
Convolutional, tensorized | 3 | 97.12 | 7.7
Convolutional, tensorized | 6 | 97.95 | 3.8
Convolutional, tensorized | 9 | 97.67 | 2.6

As can be seen, as the rank of the canonical decomposition decreases, the compression ratio increases significantly, while the accuracy of the network predictions decreases only slightly: with a rank of 3, we obtain compression by more than a factor of 7 at the cost of about 1 percentage point of accuracy.
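The compression ratios in Table 1 are consistent with a direct parameter count, assuming the first convolutional layer maps the single-channel MNIST image to 10 channels and the second maps 10 channels to 10, both with 3x3 windows:

```python
# A dense conv layer stores c_out * c_in * k * k weights; its CP version
# stores rank * (c_out + c_in + k + k) weights in the four factor matrices.
def conv_params(c_in, c_out, k):
    return c_out * c_in * k * k

def cp_params(c_in, c_out, k, rank):
    return rank * (c_out + c_in + k + k)

layers = [(1, 10, 3), (10, 10, 3)]           # (c_in, c_out, window) per layer
base = sum(conv_params(*l) for l in layers)  # 90 + 900 = 990
ratios = {r: round(base / sum(cp_params(*l, r) for l in layers), 1)
          for r in (3, 6, 9)}
print(ratios)  # {3: 7.7, 6: 3.8, 9: 2.6}
```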

References

1. LeCun Y. et al. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998, vol. 86, no. 11, pp. 2278-2324.

2. Kossaifi J., Panagakis Y., Anandkumar A., Pantic M. TensorLy: Tensor Learning in Python. Journal of Machine Learning Research (JMLR), 2019, pp. 576-591.

© Ganzha Y. S., Shaposhnikova N. V., 2021
