
Russian Journal of Nonlinear Dynamics, 2023, vol. 19, no. 2, pp. 281-293. Full-texts are available at http://nd.ics.org.ru DOI: 10.20537/nd230502

NONLINEAR ENGINEERING AND ROBOTICS

MSC 2010: 60H40, 62M45

Noise Impact on a Recurrent Neural Network with a Linear Activation Function

V. M. Moskvitin, N. I. Semenova

In recent years, more and more researchers in the field of artificial neural networks have been interested in creating hardware implementations where neurons and the connections between them are realized physically. Such networks solve the problem of scaling and increase the speed of obtaining and processing information, but they can be affected by internal noise.

In this paper we analyze an echo state neural network (ESN) in the presence of uncorrelated additive and multiplicative white Gaussian noise. Here we consider the case where artificial neurons have a linear activation function with different slope coefficients. We consider the influence of the input signal, memory and connection matrices on the accumulation of noise. We have found that the general form of the variance and the signal-to-noise ratio of the ESN output signal is similar to that of a single neuron. The noise is accumulated less in an ESN with a diagonal reservoir connection matrix with a large "blurring" coefficient. This is especially true of uncorrelated multiplicative noise.

Keywords: artificial neural networks, recurrent neural network, echo state network, noise, dispersion, statistic, white Gaussian noise

Received February 28, 2023
Accepted April 26, 2023

This work was supported by the Russian Science Foundation (Project no. 21-72-00002).

Viktor M. Moskvitin
vmmoskvitin@gmail.com
Nadezhda I. Semenova
semenovani@sgu.ru
Saratov State University
ul. Astrakhanskaya 1, Saratov, 410012 Russia

1. Introduction

Over the past few years, artificial neural networks (ANNs) have been applied to solving many problems [1]. Such tasks include image recognition [2, 3], image classification, improvement of sound recordings, speech recognition [4], prediction of climatic phenomena [5] and many others.

The basic principle of ANN construction is signal propagation between neurons using connections with some coefficients. In this case, the greatest efficiency and speed can be achieved by parallelizing calculations on high-performance computing clusters. However, in this case the bottleneck is the speed of memory access and data processing. The maximum performance of calculations can be achieved only if the ANN is completely hardware-implemented. In this case, the problem of memory access and mathematical operations over a large amount of data disappears, since each neuron corresponds to an analog nonlinear component and each connection to a physical connection channel.

In recent years, there has been an exponential increase in work with hardware implementations of ANNs. Currently, the most effective ANNs are based on lasers [6], memristors [7], and spin-torque oscillators [8]. Connection between neurons in optical ANN implementations is based on the principles of holography [9], diffraction [10, 11], integrated networks of Mach-Zehnder modulators [12], wavelength division multiplexing [13], and 3D printed optical interconnects [14-16]. Recently, the so-called photonic neural networks have gained popularity [17, 18].

The physical implementation of ANN fundamentally changes the features of noise influence. In the case of digital computer implementation of ANN, noise can enter the system exclusively with the input signal, whereas in analog ANN there are many internal sources of noise with different properties. The purpose of this paper is to study the peculiarities of internal noise propagation in recurrent ANN, to reveal ways to suppress such noises and to analyze the stability of networks to some types of noises.

In our previous studies we focused on the effects of additive and multiplicative, correlated and uncorrelated noise on deep neural networks [19, 20]. Several models of varying complexity were considered. General features depending on the nonlinear activation function and the depth of the ANN were shown for simplified symmetric ANNs with global uniform connectivity between layers. All the findings and results were then validated for three trained deep ANNs used for number recognition, classification of clothing images, and chaotic realization predictions. Using the analytical methods described in Ref. [20], several noise reduction strategies were proposed in our next study [21].

In this work, we make the problem of studying noise more complicated by considering time dependence. In contrast to previous studies in which deep neural networks were considered, here we focus on a recurrent neural network, specifically the echo state neural network (ESN). This network consists of three main parts: 1 — the input layer receiving the input signal and transmitting it to the next layer; 2 — a single layer called the reservoir, whose state depends both on the input signal at the current moment and on the states of the reservoir at previous times; 3 — the output layer producing the final output signal. Such networks are often used to work with signals that depend strongly on time, for example, prediction of chaotic temporal realizations, speech recognition, etc.

2. The system under study

2.1. Types of noise

In this paper we consider only white Gaussian noise with zero mean and some constant dispersion D. The discreteness of the processes under consideration allows us to speak about the finiteness of the dispersion for white noise sources. The noise values are different for each neuron at each time, so the noise is uncorrelated both in time and across the network. Mathematically, it is introduced into each artificial neuron according to the noise operator $\hat{N}$ as

$y_i(t) = \hat{N}x_i(t) = x_i(t)\cdot\bigl(1 + \xi_M(t,\,i)\bigr) + \xi_A(t,\,i),$   (2.1)

where $x_i$ and $y_i$ are the noise-free and noisy outputs of the i-th artificial neuron, respectively, and $\xi$ are sources of white Gaussian noise with zero mean. The indices "A" and "M" indicate the type of noise, namely, additive (A) and multiplicative (M) noise with noise dispersions $D_A$ and $D_M$. As can be seen from (2.1), the additive noise is added to the noise-free output, while the multiplicative noise multiplies it. The term $(1 + \ldots)$ is needed to keep the useful signal. The notation of the noise operator $\hat{N}$ will be used further to indicate which outputs of neurons become noisy.

The noise dispersions will be fixed throughout the paper as $D_A = D_M = 10^{-2}$. This order of values corresponds to what we have previously obtained in an RNN realized in an optical experiment [6, 19].
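As an illustration, Eq. (2.1) can be sketched numerically as follows (a minimal NumPy example; the helper name apply_noise is chosen here only for illustration and does not appear in the original model):

```python
import numpy as np

rng = np.random.default_rng(0)
D_A = D_M = 1e-2  # noise dispersions used throughout the paper

def apply_noise(x, D_A=D_A, D_M=D_M, rng=rng):
    """Noise operator of Eq. (2.1): uncorrelated multiplicative and additive
    white Gaussian noise with zero mean and variances D_M and D_A."""
    xi_M = rng.normal(0.0, np.sqrt(D_M), size=np.shape(x))
    xi_A = rng.normal(0.0, np.sqrt(D_A), size=np.shape(x))
    return x * (1.0 + xi_M) + xi_A
```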

2.2. Recurrent neural network

There are many different types of neural networks. Their topology and the type of neurons strongly depend on the signal type and on the features of the problems to be solved. If the network is trained to work with signals changing in time, such as speech recognition, prediction of chaotic phenomena, etc., then the neural network must have the property of memory. Here recurrent neural networks (RNNs) come into play. In RNNs, some of the neurons have memory of their previous states. In this paper we consider an echo state network (ESN), schematically shown in Fig. 1, as an example of an RNN. This network contains input and output neurons (orange) and a hidden layer with multiple neurons called the reservoir (gray). The connectivity and weights of neurons inside the reservoir Wres are usually fixed and randomly assigned. The output connection matrix Wout is varied during the training process to make the network produce correct responses to certain input signals.

Fig. 1. Schematic representation of the recurrent neural network

In this paper we are mainly interested in the impact of noise on the reservoir part of the network, and therefore, only gray neurons have a noise influence.

In accordance with the notation in Fig. 1, the input signal xin passes through the input neuron and comes to reservoir neurons coupled via the input matrix Win of size (1 x N), where N is the number of neurons inside the reservoir. Throughout this paper it is fixed as N = 100. The reservoir neurons have the connection to their previous states via a connection matrix Wres

of size (N x N). Thus, the equation of reservoir neurons is

$\mathbf{x}_t^{res} = f\bigl(\beta\, x_t^{in}\cdot W^{in} + \gamma\, \mathbf{y}_{t-1}^{res}\cdot W^{res}\bigr), \qquad \mathbf{y}_t^{res} = \hat{N}\mathbf{x}_t^{res},$   (2.2)

where f(·) is the activation function of reservoir neurons. The type of activation function often depends on the current task. In this paper we are mainly focused on the linear activation function f(x) = ax, since linear and partly linear functions are often used in RNNs. A nonlinear activation function can lead to completely different dynamics and noise accumulation, and it will therefore be the subject of another study.

The index t corresponds to the current time instant, while (t − 1) in the term $\mathbf{y}_{t-1}^{res}$ indicates that the outputs of the reservoir neurons are taken from the previous time instant. The bold font indicates row vectors of size (1 × N).

The parameters β and γ control the impact of the input signal (β) and of the memory (γ). In order to keep the same range of the output signals, the condition β + γ = 1 is imposed on them.

The output of the network comes from the output neuron connected with the reservoir via a connection matrix Wout of size (N x 1):

$x_t^{out} = \mathbf{y}_t^{res} \cdot W^{out}.$   (2.3)

In order to see the pure impact and statistics of noise, the output connection matrix Wout is fixed and uniform with elements equal to 1/N. The input connection matrix Win is responsible for sending the input signal to the reservoir; that is why its values are set to unity.
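For illustration, the reservoir update (2.2) and the readout (2.3) under the assumptions of this section (linear activation f(x) = ax, unity Win, uniform Wout, and the uniform Wres considered later in Sect. 4) can be sketched as follows; the function and variable names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
N, a = 100, 1.0          # reservoir size and slope of the linear activation
beta, gamma = 0.5, 0.5   # input and memory weights, beta + gamma = 1
D_A = D_M = 1e-2         # noise dispersions

W_in = np.ones(N)                  # input matrix (1 x N), set to unity
W_res = np.full((N, N), 1.0 / N)   # uniform reservoir matrix (see Sect. 4)
W_out = np.full(N, 1.0 / N)        # fixed uniform output matrix

def esn_run(x_in):
    """Run the ESN over an input sequence; only the reservoir neurons
    are subjected to the noise operator (2.1)."""
    y_res = np.zeros(N)            # reservoir outputs at the previous step
    x_out = np.empty(len(x_in))
    for t, x in enumerate(x_in):
        x_res = a * (beta * x * W_in + gamma * y_res @ W_res)  # Eq. (2.2), f(x) = a*x
        y_res = (x_res * (1.0 + rng.normal(0.0, np.sqrt(D_M), N))
                 + rng.normal(0.0, np.sqrt(D_A), N))            # noisy reservoir outputs
        x_out[t] = y_res @ W_out                                # readout, Eq. (2.3)
    return x_out
```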

2.3. Noise level assessment

In this paper we consider two statistical characteristics to estimate the noise level.

• Dispersion, or variance, showing the scale of the distribution of data around a central point or value. If there is some output signal Y containing K random values with mean value $\mu[Y] = \frac{1}{K}\sum_{k=1}^{K} Y_k$, then its dispersion can be calculated as follows:

$\sigma^2[Y] = \frac{1}{K-1}\sum_{k=1}^{K}\bigl(Y_k - \mu[Y]\bigr)^2.$   (2.4)

Therefore, in order to get the statistical characteristics of the ESN's output signal, one needs to repeat each input signal $x_t^{in}$ K times to get the corresponding statistics, namely, the corresponding mean value and dispersion. Then the same can be repeated for another input signal $x_{t+1}^{in}$. Further we will plot this as a graph of dispersion depending on the mean output value (Fig. 2a), where each colored dependence consists of T = 200 points calculated for each $x_t^{in}$, where t = 1, 2, ..., T.

• Signal-to-noise ratio (SNR), which is the measure comparing the level of a desired signal to the level of background noise or variance. In our previous papers [19, 20] we calculated characteristics similar to the SNR, working with only positive input and output values. Now we consider both positive and negative values and therefore use a more common form of the SNR calculated as

$\mathrm{SNR}[Y] = \frac{\mu^2[Y]}{\sigma^2[Y]}$   (2.5)

(see Ref. [22]). Examples of the SNR calculation depending on the mean output value are given in Fig. 2b.
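Numerically, both characteristics can be estimated from the K repeated noisy outputs obtained for one and the same input value, e.g., with a helper like the following sketch (the function name is illustrative):

```python
import numpy as np

def output_statistics(outputs):
    """outputs: array of K noisy outputs obtained for one and the same input.
    Returns the mean, the dispersion (2.4) and the SNR (2.5)."""
    mu = float(np.mean(outputs))
    sigma2 = float(np.var(outputs, ddof=1))  # 1/(K-1) normalization, Eq. (2.4)
    return mu, sigma2, mu**2 / sigma2
```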

3. One noisy neuron

As a first step, let us consider the impact noise has on one isolated neuron with a linear activation function. The neuron receives the input signal $x^{in}$ and produces the noise-free signal $x^{out} = f(x^{in})$. Then, due to internal noise, the output signal becomes noisy according to $y^{out} = \hat{N}x^{out}$. In this section we consider only one neuron without time feedback. This neuron has no memory and, therefore, it is not important at this stage how the input signal changes in time. For this reason, we will use a random input signal, and in order to match future tasks for the network, this input signal will consist of T = 200 points with a value range from -1 to 1.


Fig. 2. Dispersion σ² (panel a) and SNR (panel b) calculated for the output of one noisy neuron with only additive noise (green points), only multiplicative noise (orange) and mixed noise (blue) depending on the corresponding mean output value. The dispersion of the noise sources is $D_A = D_M = 10^{-2}$. The neuron has a linear activation function with a = 1

In order to see the noise impact, we will use the following two characteristics. The first one is the dispersion showing the variance of the output signal from its mean. It is calculated as follows. Each input signal $x^{in}$ is repeated K = 1000 times to get the statistics of K output values. Then the resulting sequence of noisy output values is averaged to get the corresponding mean value $\mu[y^{out}]$ and dispersion $\sigma^2[y^{out}]$. Figure 2a shows the dispersion of the output signal of one neuron with additive (green), multiplicative (orange) and mixed (blue) noise depending on the mean output signal.

The second characteristic is the signal-to-noise ratio (SNR) showing the relation between the mean output value and its variance or dispersion as $\mathrm{SNR}[y^{out}] = \mu^2[y^{out}]/\sigma^2[y^{out}]$ (see [22]). The SNR for different noise types is given in Fig. 2b.

As can be seen from Fig. 2a, the additive noise (green points) leads to a constant level of dispersion which does not depend on the input and output signal. The corresponding dispersion can be found as the variance of a random signal:

$\sigma^2[y^{out}] = \mathrm{Var}[y^{out}] = \mathrm{Var}\bigl[f(x^{in}) + \xi_A\bigr] = \mathrm{Var}[\xi_A] = D_A.$   (3.1)

In the case of multiplicative noise, the variance becomes completely different (orange). There is a quadratic relationship between the mean output value and its variance. In terms of the expectation value and variance, the variance of the output signal of one neuron with multiplicative noise can be found as follows:

$\mathrm{Var}[y^{out}] = \mathrm{Var}\bigl[f(x^{in}) \cdot (1 + \xi_M)\bigr] = \bigl(\mathrm{E}[f(x^{in})]\bigr)^2 \cdot \mathrm{Var}[\xi_M] = D_M \cdot \bigl(\mathrm{E}[y^{out}]\bigr)^2,$   (3.2)

where E[·] is the expectation value. The expectation value of the output signal is $\mathrm{E}[y^{out}] = \mathrm{E}[x^{out}] = \mathrm{E}[f(x^{in})]$.

In the case of a linear activation function the dispersion strongly depends on the parameter a:

$\sigma^2[y^{out}] = \mathrm{Var}[y^{out}] = \mathrm{Var}\bigl[f(x^{in}) \cdot (1 + \xi_M)\bigr] = D_M\,(a\,x^{in})^2.$   (3.3)

Mixed noise (blue points in Fig. 2) combines features of both additive and multiplicative noise. Thus, the dispersion (panel a) is the sum of the additive and multiplicative variances. For this reason, the SNR in the case of mixed noise with $D_A = D_M$ is reduced by a factor of two (panel b).

The impact of the a-parameter on SNR and dispersion is shown in Fig. 3. It confirms Eqs. (3.1) and (3.3). The input signal and a do not change the dispersion in the case of additive noise (Fig. 3a) and SNR in the case of multiplicative noise (Fig. 3e). The parameter a has no impact on SNR with additive noise (Fig. 3d), while both a and the input signal change the dispersion in the case of multiplicative noise (Fig. 3b).
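Equations (3.1) and (3.3) are easy to verify with a short Monte Carlo check for a single neuron (a sketch; the particular values of a and the input are chosen only for illustration, with K = 1000 repetitions as in the text):

```python
import numpy as np

rng = np.random.default_rng(2)
D_A = D_M = 1e-2
a, x_in, K = 1.0, 0.5, 1000

x_out = a * x_in                                          # noise-free output
y_add = x_out + rng.normal(0.0, np.sqrt(D_A), K)          # additive noise only
y_mul = x_out * (1.0 + rng.normal(0.0, np.sqrt(D_M), K))  # multiplicative noise only

print(np.var(y_add, ddof=1), D_A)                    # both close to 1e-2, Eq. (3.1)
print(np.var(y_mul, ddof=1), D_M * (a * x_in) ** 2)  # both close to 2.5e-3, Eq. (3.3)
```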

4. ESN with a uniform connection matrix

In this section, we will focus on the interplay of the input signal and memory. In order to see pure noise accumulation without the impact of connectivity matrices, we consider a uniform connection matrix in the reservoir: $W^{res}_{ij} = \frac{1}{N}$.

As a first step, we set γ = 0, so that the reservoir has no memory and the state of the reservoir depends only on the input signal. According to Sect. 2.2, if γ = 0, then β = 1.

If the property of memory is turned off, then how the input signal depends on time is irrelevant. Therefore, we use the same random input signal from —1 to 1 as in the previous section.

Figure 4a shows the dispersions calculated from the output signal of the ESN with γ = 0 for additive (green) and multiplicative (orange) noise. The general view of these dependences is the same as that for one noisy neuron. The difference is the range of these values. Comparing Figs. 2a and 4a, the dispersion of the ESN's output is reduced by a factor of 100. Thus, the final output signal becomes less noisy.

This can be explained as follows. We introduce the noise only to the neurons of the reservoir. In the case of only additive noise their dispersion can be calculated according to Eq. (3.1). If γ = 0, then the output signal can be calculated as $x_t^{out} = \frac{1}{N}\sum_{j=1}^{N} y_{t,j}^{res}$. Then the output dispersion and variance for additive noise is

$\mathrm{Var}[x_t^{out}] = \left(\frac{1}{N}\right)^{2} \sum_{j=1}^{N} \mathrm{Var}\bigl[x_{t,j}^{res} + \xi_A(t,\,j)\bigr] = \left(\frac{1}{N}\right)^{2} \sum_{j=1}^{N} \mathrm{Var}\bigl[\xi_A(t,\,j)\bigr] = \frac{1}{N}\mathrm{Var}[\xi_A] = \frac{D_A}{N}.$

Comparing this equation with (3.1), the variance is reduced by a factor of N = 100. Therefore, the dispersion level is reduced from $10^{-2}$ for one neuron to $10^{-4}$ in the ESN.
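This 1/N reduction can also be checked directly: with γ = 0 the output is the average of N independently noisy reservoir outputs. A minimal sketch (the constant noise-free reservoir value is an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(3)
N, D_A, K = 100, 1e-2, 1000
x_res = 0.3 * np.ones(N)   # noise-free reservoir outputs (gamma = 0, constant input)

# K realizations of the uniform readout over N independently noisy neurons
y_out = np.array([np.mean(x_res + rng.normal(0.0, np.sqrt(D_A), N)) for _ in range(K)])
print(np.var(y_out, ddof=1), D_A / N)   # both close to 1e-4
```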


Fig. 3. Dispersion (panels a-c) and SNR (panels d-f) calculated for the output of one noisy neuron with only additive noise (a, d), only multiplicative noise (b, e) and mixed noise (c, f) depending on the input value $x^{in}$ and the parameter a. The dispersion of the noise sources is $D_A = D_M = 10^{-2}$

Now we turn on the property of memory by making γ = β = 0.5. Then the sequence of the input signal becomes important. To keep the same range of signal and include the property of growing and decreasing input signal, we will use the sine function as an input (Fig. 4e). Figure 4b shows the dispersions for this case. From that moment on we will plot these characteristics depending on time t to emphasize the peculiarities of the input signal. Comparing the scales of dispersion in panels a and b, one can see that dispersion grows with increasing γ, as now the noise is accumulated in the reservoir.

Another thing that should be pointed out is the view of these dependences. A comparison of panels a and c of Fig. 4 may give a wrong impression that now the general form of dependences has changed a lot. However, it is not so. If we change the dependence on time in panels c and d to the dependence on the mean output value as before, then the general view will be exactly the same as that in panels a and b.

Fig. 4. Dispersion (a, b) calculated for the output of the ESN with γ = 0 (left panels) and γ = 0.5 (right panels). The input signal (c) was used when γ = 0. Parameters: a = 1, β = 1 − γ, $D_A = D_M = 10^{-2}$

Another thing that should be pointed out is the view of these dependences. A comparison of panels a and c of Fig. 4 may give a wrong impression that now the general form of dependences has changed a lot. However, it is not so. If we change the dependence on time in panels c and d to the dependence on the mean output value as before, then the general view will be exactly the same as that in panels a and b.

If γ grows further up to 0.9, the maximal dispersion level increases to $\sim 5 \cdot 10^{-4}$. The general view of the dispersion dependences remains the same as that shown in Fig. 4b. In the case of additive noise the dispersion dependence is almost constant, while for multiplicative noise this dependence looks like a doubled sine function and covers a range from zero to some maximal level depending on γ. In order to reveal the impact of the parameter γ, Fig. 5 shows how the mean dispersion level for additive noise and the dispersion range for multiplicative noise change depending on this parameter.

A uniform reservoir connection matrix Wres may look like a rather degenerate case. However, this matrix is sometimes set randomly and does not change during the training process.


Fig. 5. Mean dispersion level (dashed line) for additive noise and dispersion range for multiplicative noise depending on the parameter γ. Other parameters: a = 1, β = 1 − γ, $D_A = D_M = 10^{-2}$

According to our previous studies [20], in terms of network noise accumulation, a similar uniform connectivity can be interpreted as a matrix with random values and a mean value of 1/N. Thus, the conclusion "the smallest level of dispersion in this case can be obtained with weak memory of the reservoir and then grows exponentially with the parameter γ responsible for the memory property" also holds for a uniform random Wres.

5. ESN with a diagonal-like connection matrix

As mentioned in the previous section, the reservoir connection matrix Wres can be set without change during the training process. Usually, it is set to be uniform or diagonal-like [23]. In this section we consider the latter type of network. Figure 6a,d shows the diagonal matrices which we will use in the reservoir. Both networks are set with some blurring coefficient ζ. We set this "blurring" effect using the Gaussian function. For example, in Fig. 6a this coefficient is set to ζ = 2, meaning that the main diagonal and two terms to the left and right of the main diagonal are set according to the Gaussian function, while the rest are set to zero. In order to keep the same range of values, we need to make the sum of the elements in each row and column of the matrix Wres equal to one, as before. Therefore, the nonzero elements of this matrix are set as follows:

$W^{res}_{i,\,k} = \frac{e^{-((k-i)/\zeta)^2}}{\sum_{j=-\zeta}^{\zeta} e^{-(j/\zeta)^2}}, \qquad k \in [i - \zeta,\; i + \zeta].$   (5.1)

In this section we consider two diagonal-like matrices Wres with two "blurring" coefficients ζ = 2 (Fig. 6a) and ζ = 20 (Fig. 6d).
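A possible construction of such a diagonal-like matrix with "blurring" coefficient ζ is sketched below. It assumes the Gaussian profile (5.1) with row-wise normalization; the treatment of boundary rows (simple renormalization) is an illustrative convention, since it is not specified in the text:

```python
import numpy as np

def diagonal_like(N, zeta):
    """Reservoir matrix with a Gaussian-blurred main diagonal, cf. Eq. (5.1):
    only offsets |k| <= zeta around the diagonal are nonzero; each row is
    normalized to sum to one (boundary rows are simply renormalized)."""
    W = np.zeros((N, N))
    offsets = np.arange(-zeta, zeta + 1)
    weights = np.exp(-(offsets / zeta) ** 2)
    for i in range(N):
        for k, w in zip(offsets, weights):
            j = i + k
            if 0 <= j < N:
                W[i, j] = w
    return W / W.sum(axis=1, keepdims=True)

W_res_small = diagonal_like(100, 2)    # strongly diagonal, zeta = 2
W_res_large = diagonal_like(100, 20)   # strongly "blurred", zeta = 20
```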

Fig. 6. Dispersion of the output signal of the ESN with diagonal matrices Wres. Panels a and d show the connection matrices with ζ = 2 and ζ = 20, respectively. Panel b shows the dispersion for additive (green) and multiplicative (orange) noise in the case of ζ = 2. Panel c shows how this dispersion changes depending on γ. Panels e and f show the same, but for ζ = 20. Other fixed parameters: a = 1, γ = 0.8, β = 1 − γ, $D_A = D_M = 10^{-2}$

Figure 6b,e shows the dispersion of the output signal for the reservoir connection matrices with ζ = 2 and ζ = 20 given in the corresponding left panels and the parameter γ = 0.8. There is a clear difference between the dispersion in the case of diagonal matrices and the dispersion obtained for uniform matrices (Fig. 4b). The form of the σ²-dependences for additive and multiplicative noise remains almost the same, but there is a clear quantitative difference. In the case of a uniform connection matrix and a diagonal matrix with small ζ (Fig. 6b) the maximal value of the dispersion for multiplicative noise coincided with its mean value for additive noise. These dependences move away from each other faster for the diagonal matrix (Fig. 6c) than for the uniform matrix (Fig. 5).

These dependences drift apart faster with a larger "blurring" coefficient, ζ = 20 (see Fig. 6e,f).

Comparing the impact of additive noise in Figs. 5 and 6c, one can see that the dispersions for additive and multiplicative noise depending on γ are very similar for the diagonal matrix with a small "blurring" coefficient (Fig. 6c) and for the uniform connection matrix (Fig. 5). Moreover, the noise is accumulated less for a large "blurring" coefficient ζ (Fig. 6f). In the case of large γ the final dispersion level becomes lower with growing ζ. This difference is clearer for multiplicative noise. Comparing the dispersion ranges in panels c and f of Fig. 6, one can see that the multiplicative dispersion level for large ζ = 20 is much lower than for small ζ = 2.

Conclusion and discussion

In this paper we have studied the impact of uncorrelated additive and multiplicative noise on an echo state neural network. The noise was added only to neurons inside the reservoir. These neurons had a linear activation function. To analyze the output level of noise, we mainly used a dispersion and a signal-to-noise ratio derived from it.

Here we consider the case where artificial neurons have a linear activation function with different slope coefficients. We have started by studying only one isolated artificial neuron. It is shown that the parameter a controlling the slope of the activation function has no impact on the accumulation of additive noise. At the same time, the dispersion for multiplicative noise has a quadratic dependence on a and the input signal. The level of dispersion can be predicted analytically by Eqs. (3.1)-(3.3). In our next studies we plan to consider other types of activation functions such as sigmoid functions and piecewise linear activation functions.

The input signal has no impact on the statistics of additive noise either. At the same time, the dispersion level of multiplicative noise has a quadratic dependence on the input signal. Thus, if the hardware network has a multiplicative internal noise, the noise level can be decreased by working with values less than unity. Otherwise the noise level will significantly increase.

At the same time, it is not always possible to control the signal amplitude in a trained network. For this reason, we have also considered the influence of the main types of connection matrices on the accumulation of noise. ESNs are usually set with a random uniform reservoir connection matrix or a diagonal-like matrix, which are not changed during the training process. Therefore, in this paper we have considered both types of connection matrices and studied the impact of memory on the accumulation of noise. The noise is accumulated less in an ESN with a diagonal reservoir connection matrix Wres with a large "blurring" coefficient. This is especially true of uncorrelated multiplicative noise. The accumulation of noise in a uniform Wres is almost the same as in a diagonal Wres with a small "blurring" coefficient.

Another interesting result is that the shape of dispersion and SNR for only one neuron and for ESN with weak memory are similar. The shape of dispersion and SNR dependences and their qualitative level can be predicted analytically by Eqs. (3.1)-(3.3).

Conflict of interest

The authors declare that they have no conflicts of interest.

References

[1] LeCun, Y., Bengio, Y., and Hinton, G., Deep Learning, Nature, 2015, vol. 521, no. 7553, pp. 436-444.

[2] Maturana, D. and Scherer, S., VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition, in IEEE/RSJ Internat. Conf. on Intelligent Robots and Systems (IROS, Hamburg, Germany, 2015), pp. 922-928.

[3] Krizhevsky, A., Sutskever, I., and Hinton, G.E., ImageNet Classification with Deep Convolutional Neural Networks, Commun. ACM, 2017, vol. 60, no. 6, pp. 84-90.

[4] Graves, A., Mohamed, A., and Hinton, G., Speech Recognition with Deep Recurrent Neural Networks, in IEEE Internat. Conf. on Acoustics, Speech and Signal Processing (Vancouver, British Columbia, Canada, May 2013), pp. 6645-6649.


[5] Kar, S. and Moura, J.M.F., Distributed Consensus Algorithms in Sensor Networks with Imperfect Communication: Link Failures and Channel Noise, IEEE Trans. Signal Process., 2009, vol. 57, no. 1, pp. 355-369.

[6] Brunner, D., Soriano, M. C., Mirasso, C. R., and Fischer, I., Parallel Photonic Information Processing at Gigabyte per Second Data Rates Using Transient States, Nat. Commun., 2013, vol. 4, no. 1, Art. 1364, 6 pp.

[7] Tuma, T., Pantazi, A., Le Gallo, M., Sebastian, A., and Eleftheriou, E., Stochastic Phase-Change Neurons, Nat. Nanotechnol., 2016, vol. 11, no. 8, pp. 693-699.

[8] Torrejon, J., Riou, M., Araujo, F. A., Tsunegi, S., Khalsa, G., Querlioz, D., Bortolotti, P., Cros, V., Yakushiji, K., Fukushima, A., Kubota, H., Yuasa, S., Stiles, M.D., and Grollier, J., Neuromorphic Computing with Nanoscale Spintronic Oscillators, Nature, 2017, vol. 547, no. 7664, pp. 428-431.

[9] Psaltis, D., Brady, D., Gu, X.-G., and Lin, S., Holography in Artificial Neural Networks, Nature, 1990, vol. 343, no. 6256, pp. 325-330.

[10] Bueno, J., Maktoobi, S., Froehly, L., Fischer, I., Jacquot, M., Larger, L., and Brunner, D., Reinforcement Learning in a Large Scale Photonic Recurrent Neural Network, Optica, 2018, vol. 5, no. 6, pp. 756-760.

[11] Lin, X., Rivenson, Y., Yardimci, N.T., Veli, M., Luo, Y., Jarrahi, M., and Ozcan, A., All-Optical Machine Learning Using Diffractive Deep Neural Networks, Science, 2018, vol. 361, no. 6406, pp. 1004-1008.

[12] Shen, Y., Harris, N.C., Skirlo, S., Prabhu, M., Baehr-Jones, T., Hochberg, M., Sun, X., Zhao, S., Larochelle, H., Englund, D., and Soljacic, M., Deep Learning with Coherent Nanophotonic Circuits, Nature Photon., 2017, vol. 11, no. 7, pp. 441-446.

[13] Tait, A.N., De Lima, T.F., Zhou, E., Wu, A.X., Nahmias, M. A., Shastri, B. J., and Prucnal, P.R., Neuromorphic Photonic Networks Using Silicon Photonic Weight Banks, Sci. Rep., 2017, vol. 7, no. 1, Art. 7430, 10 pp.

[14] Moughames, J., Porte, X., Thiel, M., Ulliac, G., Larger, L., Jacquot, M., Kadic, M., and Brunner, D., Three-Dimensional Waveguide Interconnects for Scalable Integration of Photonic Neural Networks, Optica, 2020, vol. 7, no. 6, pp. 640-646.

[15] Dinc, N.U., Psaltis, D., and Brunner, D., Optical Neural Networks: The 3D Connection, Photoniques, 2020, no. 104, pp. 34-38.

[16] Moughames, J., Porte, X., Larger, L., Jacquot, M., Kadic, M., and Brunner, D., 3D Printed Multimode-Splitters for Photonic Interconnects, Opt. Mater. Express, 2020, vol. 10, no. 11, pp. 2952-2961.

[17] Mourgias-Alexandris, G., Moralis-Pegios, M., Tsakyridis, A., Simos, S., Dabos, G., Totovic, A., Passalis, N., Kirtas, M., Rutirawut, T., Gardes, F. Y., Tefas, A., and Pleros, N., Noise-Resilient and High-Speed Deep Learning with Coherent Silicon Photonics, Nat. Commun., 2022, vol. 13, no. 1, Art. 5572, 7 pp.

[18] Wang, T., Ma, S.-Y., Wright, L.G., Onodera, T., Richard, B.C., and McMahon, P.L., An Optical Neural Network Using Less Than 1 Photon per Multiplication, Nat. Commun., 2022, vol. 13, no. 1, Art. 123, 7 pp.

[19] Semenova, N., Porte, X., Andreoli, L., Jacquot, M., Larger, L., and Brunner, D., Fundamental Aspects of Noise in Analog-Hardware Neural Networks, Chaos, 2019, vol. 29, no. 10, 103128, 11 pp.

[20] Semenova, N., Larger, L., and Brunner, D., Understanding and Mitigating Noise in Trained Deep Neural Networks, Neural Netw., 2022, vol. 146, pp. 151-160.

[21] Semenova, N. and Brunner, D., Noise-Mitigation Strategies in Physical Feedforward Neural Networks, Chaos, 2022, vol. 32, no. 6, 061106, 10 pp.

[22] Johnson, D.H., Signal-to-Noise Ratio, Scholarpedia, 2006, vol. 1, no. 12, Art. 2088.

[23] Lukosevicius, M. and Jaeger, H., Reservoir Computing Approaches to Recurrent Neural Network Training, Comput. Sci. Rev., 2009, vol. 3, no. 3, pp. 127-149.
