
Matematicheskie Zametki SVFU (Mathematical Notes of NEFU). January-March 2021. Vol. 28, No. 1

UDC 004.85+519.63

NUMERICAL METHODS FOR IDENTIFYING THE DIFFUSION COEFFICIENT IN A NONLINEAR ELLIPTIC EQUATION J. Huang, A. V. Grigorev, and D. Kh. Ivanov

Abstract. Two different approaches for solving a nonlinear coefficient inverse problem are investigated in this paper. As a classical approach, we use the finite element method to discretize the direct and inverse problems and solve the inverse problem by the conjugate gradient method. Meanwhile, we also apply a neural network approach to recover the coefficient of the inverse problem, which maps measurements at some fixed points to the unknown coefficient. According to the results of applying the two approaches, our methods are shown to solve the nonlinear coefficient inverse problem efficiently, even with perturbed data.

DOI: 10.25587/SVFU.2021.81.41.007
Keywords: inverse problem, neural network, nonlinear elliptic equation, optimization, finite element method.

1. Introduction

Inverse problems are widely encountered in many important fields of science and engineering, such as signal processing, medical imaging, computer vision, electromagnetics, fluid flow, etc. [1, 2], and parameter identification may be viewed as the largest class of inverse problems arising from real applications; see [3].

A parameter identification problem usually denotes the problem of recovering the unknown coefficients in a partial differential equation from some measurements of the solution or of a perturbed solution. A simple example of such a problem is the following Poisson-type equation

$$-\nabla \cdot \big(D(x)\nabla u(x)\big) = f, \quad x \in \Omega, \qquad (1.1)$$

where $\Omega$ is an open and bounded convex polyhedral domain in $\mathbb{R}^d$ (d = 2, 3). The system (1.1) should be complemented by some boundary conditions. Here we shall

The work of J. Huang was supported by the National Natural Science Foundation of China (Grant No. 11901497), in part by the Natural Science Foundation of Hunan Province (Grant No. 2019JJ50607), and in part by the China Postdoctoral Science Foundation Funded Project (Grant No. BX20180266). The work of A. Grigorev was supported by RFBR (Grant 21-51-54001). The work of D. Ivanov was supported by the Mega-Grant of the Russian Federation Government 14.Y26.31.0013. The work of Y. Huang was supported by the National Natural Science Foundation of China (Grant No. 11971410) and in part by the Project of Scientific Research Fund of Hunan Provincial Science and Technology Department (2018WK4006).

© 2021 J. Huang, A. V. Grigorev, and D. Kh. Ivanov

consider the following homogeneous Dirichlet boundary conditions

$$u = 0, \quad x \in \partial\Omega. \qquad (1.2)$$

These kinds of elliptic problems have been studied in great detail by many researchers. Depending on the concrete application, the solution u(x) may represent different physical quantities such as the temperature, the hydraulic head, or the electric potential at a point x; then D(x) stands for the thermal conductivity, the aquifer transmissivity, or the electric conductivity, respectively, and f is the corresponding source.

The direct problem is usually well-posed and consists in finding u by solving the partial differential equation, given the coefficient D and the source term f. By contrast, in practice the boundary data, the diffusion coefficient, or the source term may not be given, and it is usually easier to measure the solution u at some points than to measure the coefficient D(x) directly. This leads to inverse problems, which are used to estimate the unknown coefficient D on $\Omega$ from measured data of the solution.

In many applications, inverse problems are nonlinear and ill-posed, which makes them difficult to solve numerically. There are many existing methods to deal with this inverse problem. The most difficult issue in analyzing and solving the above inverse problem of recovering the parameter D(x) lies in its strong instability with respect to errors in the measured data, i.e., small errors in the data may cause tremendous changes in the unknown parameter D(x).

On the one hand, in order to achieve a reasonable and practically acceptable numerical reconstruction of the parameter D, one may have to resort to some regularization technique to transform the unstable ill-posed parameter identification process into a stable mathematical process. Tikhonov regularization is one of the most frequently used and robust regularization techniques [4, 5]; it often leads to a stabilized nonlinear minimization problem, e.g., least-squares solutions, instead of solving the ill-posed problem directly, and then employs some suitable method for its solution. But the optimal solutions of the inverse problems are still hard to obtain because they depend on the regularization conditions. To overcome the drawbacks of the regularization methods, the conjugate gradient method is chosen as a practical and powerful iterative technique for solving ill-posed problems. Moreover, the conjugate gradient method is also known as an iterative regularization method, where the number of iterations is taken as the regularization parameter. It has been successfully applied to various inverse problems [6-8].

On the other hand, in recent years deep learning has made great progress in many areas, such as image restoration, computed tomography, and natural language processing. Deep neural networks contain multiple hidden layers, which can learn very complicated relationships between the input data and the output data. Nowadays, neural network approaches are used as alternative and successful algorithms for solving inverse problems (see, for example, [9-12]).

Due to the difficulties caused by the nonlinearity and the ill-posedness, further research is needed to provide more effective and feasible numerical methods for coefficient inverse problems. The authors of [13] proposed an iterative deep neural network for solving ill-posed inverse problems and compared it with classical regularization methods on tomographic examples. Beilina et al. in [14] presented a globally convergent numerical scheme for a multidimensional coefficient inverse problem of a hyperbolic partial differential equation. In [15], error estimates were presented for identifying the scalar diffusion coefficient of an elliptic problem with homogeneous Dirichlet boundary conditions. The authors of [16] developed iterative neural network approaches for coefficient and evolutionary inverse problems, respectively. The papers [17-21] established the convergence of finite element methods with Tikhonov regularization for identifying the diffusion coefficients in elliptic problems. A method combining deep neural networks with Tikhonov regularization for solving inverse problems was introduced in [22].

In this paper, we investigate the inverse problem of identifying the coefficient of a nonlinear diffusion equation. We introduce two numerical approaches for solving the coefficient inverse problem. In the classical approach, we use the finite element method to discretize the nonlinear diffusion equation, deal with the resulting discrete nonlinear system by Newton's method, and apply the conjugate gradient method to construct an efficient solver of the discrete ill-posed system. Meanwhile, we also solve the coefficient inverse problem by a neural network approach, which learns the correlation between the observed data and the unknown coefficient, and we demonstrate the efficiency of our methods by numerical examples.

The article is organized as follows. The coefficient inverse problem for the nonlinear diffusion model is stated in Section 2. We reformulate the coefficient inverse problem as a minimization problem and then solve it by an iterative regularization method in Section 3. The neural network method for solving the inverse problem is constructed in Section 4. Some numerical experiments using our methods are carried out in Section 5 to verify the efficiency of our methods. Finally, the conclusions and further ideas are presented in Section 6.

We end this section with an introduction of some notation which will be used in the subsequent sections. For any m > 0 and p > 0, $W^{m,p}(\Omega)$ stands for the standard Sobolev space of order m, and we write it as $H^m(\Omega)$ for p = 2 and as $L^2(\Omega)$ for m = 0 and p = 2. The inner product in $L^2(\Omega)$ will be denoted by $(\cdot,\cdot)_{L^2(\Omega)}$, or simply by $(\cdot,\cdot)$ if no confusion is caused.

2. Problem Statement

Consider the following nonlinear elliptic equation in a bounded domain $\Omega \subset \mathbb{R}^d$ (d = 2, 3):

$$-\nabla \cdot (k(u)\nabla u) = f \quad \text{in } \Omega, \qquad (2.1)$$

with the homogeneous Dirichlet boundary condition

$$u = 0 \quad \text{on } \partial\Omega, \qquad (2.2)$$

where $k(u)$ is a nonlinear diffusion coefficient,

$$k(u) = a(x) + u^2, \qquad (2.3)$$

in which $a(x) \in K$ is a spatial parameter,

$$K = \{a \in H^1(\Omega) : 0 < a_1 \le a(x) \le a_2 < \infty \ \text{a.e. in } \Omega\},$$

and $a_1$, $a_2$ are two a priori bounds of the physical parameter a.

The direct problem is stated as follows: Given the source term f and parameter a, find a function u(x) such that it is the solution to (2.1)-(2.3).

Suppose that the parameter a, which is involved in the nonlinear diffusion coefficient (2.3), is unknown and needs to be determined from measurements of the solution of the direct problem only at some interior points of $\Omega$. The inverse problem that we are concerned with in this paper is: recover the unknown coefficient parameter a from the observed data

$$u(x_i) = z_i, \quad x_i \in \Omega, \quad i = 1, 2, \dots, M. \qquad (2.4)$$

We introduce the parameter-to-solution mapping

$$a \mapsto u(a),$$

in which u(a) is the solution to the direct problem (2.1)-(2.3) for a given specific a. It is worth pointing out that even for a linear direct problem, the coefficient inverse problem and the parameter-to-solution mapping are usually nonlinear and not invertible.

We regard problem (2.1)-(2.3) as the direct problem when a(x) is fixed, and supplement it with the measured data (2.4). It is obvious that when a(x) is given, the system (2.1)-(2.4) is overdetermined, i.e., there may not be any function u(x) satisfying all the equations for arbitrarily given f and $z_i$, whereas, in fact, the Lax-Milgram lemma ensures the existence of a unique solution u(x) of the direct problem. Therefore, for an unknown positive parameter a, we have to provide the additional information (2.4) to guarantee a unique solution u(a, x) of the inverse problem. This leads to a coefficient inverse problem for a second-order nonlinear elliptic equation, which is an ill-posed problem.

In this paper, we investigate two approaches for dealing with the corresponding inverse problem. One is the classical method, i.e., the optimization approach, which obtains the approximation by solving a minimization problem with a gradient method. The other is the neural network approach, which is designed to obtain a good prediction of a based on a constructed mapping between the training data (observations) z and the parameter. Generally, the data z are derived from the solution of the direct problem for some given parameter a, or are obtained by means of a finite set of measurements and then interpolated in some suitable manner.

Here we are only concerned with the two-dimensional case; more general results extend trivially to the three-dimensional model.

3. The Optimization Approach

In this section, we use the standard notation of the Sobolev spaces and the associated norms.

Define the function space

$$V = H_0^1(\Omega) = \{u \in H^1(\Omega) : u = 0 \ \text{on } \partial\Omega\}. \qquad (3.1)$$

We multiply both sides of equation (2.1) by a test function $v \in V$ and integrate over the whole domain $\Omega$. The variational formulation of (2.1)-(2.4) is as follows: find $u \in V$ such that

$$\int_\Omega (a + u^2)\,\nabla u \cdot \nabla v \, dx = \int_\Omega f v \, dx \quad \forall v \in V. \qquad (3.2)$$

For the sake of brevity and readability, we rewrite the weak formulation in the following general form:

$$F(u, a) := \big((a + u^2)\nabla u, \nabla v\big) - (f, v) = 0 \quad \forall v \in V. \qquad (3.3)$$

Given a problem such as (3.3), the direct problem consists in finding the solution u with a given parameter a and a known f. The inverse problem of parameter identification, by contrast, focuses on recovering the parameter a from a measurement of u. More descriptively, the problem to be investigated is: given the observed data, find the optimal value of a such that the corresponding solution u(a) of problem (3.3) matches the observed data as closely as possible.

The identification process is carried out in such a way that the solution u matches its measured data z optimally in the energy norm.

We employ a minimization problem to quantify how well the solution u(a, x) matches this requirement. Thus, the inverse problem can be reformulated as the following constrained minimization problem: find $a^* \in K$ such that

$$a^* = \arg\min_{a \in K} J(u, a), \qquad (3.4)$$

where the objective function is

$$J(u, a) = \frac{1}{2}\big(u(a, x) - z,\, u(a, x) - z\big) = \frac{1}{2}\,\|u(a, x) - z\|^2, \qquad (3.5)$$

which quantifies the error between the solution u(a,x) and the measured data z.

The first-order optimality condition for problem (3.4) is

$$\big(J'(u, a^*), \eta\big) = 0 \quad \text{for all } \eta \in K, \qquad (3.6)$$

where $J'(u, a^*)$ is usually called the gradient of J(u, a); it is defined through the Fréchet derivative of J(u, a) at $a^*$.

It is well known that the key task is to compute the derivatives of the objective functional J(u, a). Provided that the total derivative of the functional is known (it can be obtained by the adjoint method), any gradient-based optimization algorithm can be used to solve the minimization problem.

We compute the total derivative of J(u, a) by

$$\frac{dJ}{da} = \frac{\partial J}{\partial a} + \lambda^{*}\,\frac{\partial F}{\partial a}, \qquad (3.7)$$

where $\lambda$ is the solution to the following adjoint equation (3.8) of the direct problem (3.3) and the functional J(u, a):

$$\Big(\frac{\partial F}{\partial u}\Big)^{*}\lambda = -\frac{\partial J}{\partial u}. \qquad (3.8)$$

Here the asterisk denotes the Hermitian transpose (adjoint) of the corresponding operator.
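For completeness, we sketch the standard adjoint argument behind (3.7) and (3.8), written under the sign convention chosen above. Differentiating the state equation $F(u(a), a) = 0$ with respect to a gives

$$\frac{\partial F}{\partial u}\,\frac{\partial u}{\partial a} + \frac{\partial F}{\partial a} = 0 \quad\Longrightarrow\quad \frac{\partial u}{\partial a} = -\Big(\frac{\partial F}{\partial u}\Big)^{-1}\frac{\partial F}{\partial a},$$

and therefore, by the chain rule,

$$\frac{dJ}{da} = \frac{\partial J}{\partial a} + \frac{\partial J}{\partial u}\,\frac{\partial u}{\partial a} = \frac{\partial J}{\partial a} - \frac{\partial J}{\partial u}\Big(\frac{\partial F}{\partial u}\Big)^{-1}\frac{\partial F}{\partial a} = \frac{\partial J}{\partial a} + \lambda^{*}\,\frac{\partial F}{\partial a},$$

where $\lambda$ is defined by the adjoint equation (3.8). Thus one linear adjoint solve per gradient evaluation replaces the costly computation of the sensitivity $\partial u/\partial a$.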

Let $\Omega$ be a polygonal domain which can be completely triangulated by triangles. Let $\{\mathcal{T}_h\}_h$ denote a family of shape-regular and quasi-uniform triangulations of $\Omega$ with mesh size $h := \max_{T \in \mathcal{T}_h} h_T$, such that

$$\bar{\Omega} = \bigcup_{T \in \mathcal{T}_h} T \quad \text{for each } h.$$

We discretize the variable u by the continuous piecewise linear finite element space

$$V_h = \{q \in H^1(\Omega) : q|_T \in P_1(T) \ \ \forall T \in \mathcal{T}_h\}, \qquad (3.9)$$

where $P_1(T)$ is the space of polynomials of degree $\le 1$ on an element $T \in \mathcal{T}_h$. We approximate the constraint set K by the discrete set

$$K_h = \{p_h \in V_h : a_1 \le p_h(x_i) \le a_2 \ \text{for all } x_i \in N_h\}, \qquad (3.10)$$

where $N_h = \{x_i\}_{i=1}^{N_h}$ is the set of nodal points of the triangulation $\mathcal{T}_h$.

We first discretize the minimization problem (3.4) by piecewise linear finite elements:

$$\min_{a_h \in K_h} J_h(u_h, a_h) = \frac{1}{2}\sum_{i=1}^{M}\big(u_h(a_h, x_i) - z_i\big)^2, \qquad (3.11)$$

where $u_h(a_h, x_i) \in V_h$ is obtained by solving the following nonlinear discrete system with Newton's method:

$$\int_\Omega (a_h + u_h^2)\,\nabla u_h \cdot \nabla v_h \, dx = \int_\Omega f v_h \, dx \quad \forall v_h \in V_h. \qquad (3.12)$$

In the implementation of the finite element method, we use the FEniCS computing platform [23], version 2018.1, to solve the discrete system, combined with the dolfin-adjoint package, which operates on the Python interface of FEniCS, to automatically derive and solve the discrete adjoint and tangent linear models associated with the direct problem. See Section 5 for details.
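To make this workflow concrete, the following minimal sketch (not the authors' exact script) shows how the nonlinear direct problem (3.12) and a gradient-based minimization of the misfit can be set up with FEniCS and dolfin-adjoint. For brevity the misfit is taken over the whole domain instead of the pointwise sum (3.11), and the synthetic data, the initial guess, and the solver options are illustrative assumptions.

```python
# Minimal FEniCS/dolfin-adjoint sketch of the optimization approach (assumptions noted above).
from fenics import *
from fenics_adjoint import *

mesh = UnitSquareMesh(32, 32)
V = FunctionSpace(mesh, "CG", 1)
f = Expression("exp(-sqrt(pow(x[0]-0.5, 2) + pow(x[1]-0.5, 2)))", degree=2)
bc = DirichletBC(V, 0.0, "on_boundary")

def solve_direct(a):
    """Solve the nonlinear direct problem (3.12) by Newton's method."""
    u = Function(V)
    v = TestFunction(V)
    F = (a + u**2) * inner(grad(u), grad(v)) * dx - f * v * dx
    solve(F == 0, u, bc)
    return u

a_true = Constant(0.3)                 # coefficient used to generate synthetic data
z = solve_direct(a_true)

a = Constant(0.5)                      # initial guess a0
u = solve_direct(a)
J = assemble(0.5 * (u - z)**2 * dx)    # simplified domain-wide misfit instead of (3.11)

# dolfin-adjoint records the solves and derives the adjoint problem (3.8) automatically.
rf = ReducedFunctional(J, Control(a))
a_opt = minimize(rf, method="CG", tol=1e-9)
print("recovered a:", float(a_opt))
```

In the variable-parameter case of Section 5, the single Constant control would be replaced by several controls (or a spatially varying Function), and a bound-constrained method such as L-BFGS-B would be used instead of CG.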

4. Neural Network Approach

In this section, we investigate the neural network approach to solving the coefficient inverse problem. We construct a convolutional neural network (CNN) based on the Keras library.

CNN is a well-known class of artificial neural networks, which has led to very good performance on a variety of problems, such as visual classification and recognition, medical image analysis, natural language processing, etc. [24-27]. Keras is a high-level API, written in Python and capable of running on top of TensorFlow, for developing and training deep learning models. Moreover, TensorFlow supports effective execution of computations on CPUs and GPUs.

In this approach, we implement the deep learning neural network model in Python using Keras with the TensorFlow backend. The experiments were run on a laptop with an Intel Core i7-8750H (6 cores, 2.20 GHz) and an Nvidia GTX 1050Ti, with 8 GB of RAM.

A simple CNN is a sequence of layers, and every layer transforms one volume of activations to another through a differentiable function. Each neuron applies a nonlinear function, and its output is passed to the next layer. Neurons of the previous layer are connected to neurons of the current layer, and each connection carries a weight parameter. Therefore, the task of neural network training is to properly adjust these weights. Convolutional neural networks reduce the size of the input data to simplify the task through a convolution operation, i.e., a multiplication by a matrix (the convolution kernel) followed by summation of the results.

In this case, we apply the simplest variant of a CNN in TensorFlow, the sequential model; this model is a particular case of convolutional neural networks in which the multiplication is by 1. Such a neural network with more than one layer of neurons is called a deep neural network. In real-life problems we often encounter the situation where measurements are carried out at a small number of points evenly distributed throughout the region. Such measurements are carried out in time, at various intervals or randomly, in order to obtain samples sufficient to construct a mathematical model. In order to process such data, it is necessary to build a mathematical model that matches the measured data with acceptable accuracy on a large input data array. In this case we generate the input data based on the solutions of a large number of direct problems and record the values at selected points. This lets us build the dependence of the values at these points on the value of the coefficient a.

For the direct problem (2.1)-(2.3), we first need to generate a randomly in the admissible set K, where the value of $a_1$ is chosen to keep $k(u)$ separated from 0. After solving the direct problem, we obtain the values of the solution at some specific points $x_i \in \Omega$, i = 1, 2, ..., M. Thus, for each a, we can find and store the values of the solution at these points. Our deep learning process is developed and trained on these patterns.
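A hypothetical data-generation loop of this kind is sketched below. It reuses the solve_direct routine from the sketch in Section 3, and the sample count, the file name train.csv, and the measurement points (those of Example 1 in Section 5) are assumptions made for illustration only.

```python
# Hypothetical training-data generation: sample a, solve the direct problem,
# and record the solution values at nine points together with a (one CSV row each).
import csv
import random
from fenics import Constant, Point

points = [(x, y) for y in (0.2, 0.5, 0.8) for x in (0.2, 0.5, 0.8)]

with open("train.csv", "w", newline="") as out:
    writer = csv.writer(out)
    for _ in range(8000):                      # size of the largest data set used below
        a = random.uniform(0.001, 1.0)         # admissible range [a1, a2]
        u = solve_direct(Constant(a))          # direct solve from the Section 3 sketch
        writer.writerow([u(Point(x, y)) for x, y in points] + [a])
```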

We now use a CNN to achieve our goal [28]. We need to stack layers to construct the sequential model; the simplest sequential model is a stack of layers, which can be described by tf.keras.Sequential.

The machine learning process initializes the weights, which helps to efficiently de-correlate layer activities in the layers of perceptrons. For our model we choose a sequential model with only 1 layer and 9 neurons in the layer, and let the batch size be 50. The batch size defines the number of samples that will be propagated through the network; in other words, it is the number of training examples used in one iteration. It is especially important if the whole dataset does not fit in the machine's memory. Our results show that working with small mini-batch sizes yields better and faster neural network models than the others, because we update the weights after each propagation.
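A minimal Keras sketch of such a sequential model is given below (an illustration, not the authors' script): the output layer, the activation, the optimizer, the loss, and the epoch count are assumptions, since the paper specifies only a single layer of 9 neurons and a batch size of 50; train.csv refers to the hypothetical data file from the sketch in the previous paragraphs.

```python
# Minimal Keras sketch: map 9 measured point values to the coefficient a.
import numpy as np
import tensorflow as tf

data = np.loadtxt("train.csv", delimiter=",")      # hypothetical file: 9 values + a per row
x_train, y_train = data[:, :9], data[:, 9]

model = tf.keras.Sequential([
    tf.keras.layers.Dense(9, activation="relu", input_shape=(9,)),  # one layer, 9 neurons
    tf.keras.layers.Dense(1),                                       # predicted coefficient a
])
model.compile(optimizer="adam", loss="mse")
model.fit(x_train, y_train, batch_size=50, epochs=400, verbose=0)

a_pred = model.predict(x_train[:10])               # fast online predictions (about 10 at a time)
```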

5. Numerical Experiments

In this section, some numerical results are presented to illustrate the efficiency of the two approaches for the coefficient inverse problem (2.1)-(2.3). We choose $\Omega \subset \mathbb{R}^2$ as the unit square $(0, 1)^2$ and the admissible set K with $a_1 = 0.001$ and $a_2 = 1$. We use a uniform triangular mesh (32 x 32) over $\Omega$.

In order to show the merits of the two approaches, we begin with the constant parameters test example, and then consider the variable parameters case.

Example 1: Constant Parameters. We first consider the problem

$$-\nabla \cdot \big((a + u^2)\nabla u\big) = e^{-\sqrt{(x - 0.5)^2 + (y - 0.5)^2}} \ \ \text{in } \Omega, \qquad u = 0 \ \ \text{on } \partial\Omega, \qquad (5.1)$$

with the measured data taken at the following points:

$$x_1 = (0.2, 0.2), \quad x_2 = (0.5, 0.2), \quad x_3 = (0.8, 0.2), \quad x_4 = (0.2, 0.5), \quad x_5 = (0.5, 0.5),$$
$$x_6 = (0.8, 0.5), \quad x_7 = (0.2, 0.8), \quad x_8 = (0.5, 0.8), \quad x_9 = (0.8, 0.8);$$

here the measured data are selected without any artificial noise.

5.1.1. The Optimization Approach. In the implementation, the conjugate gradient method is used to solve the minimization problem for recovering the unknown coefficient a, and Newton's method is used for the nonlinear discrete system. The iteration starts with the initial guess $a_0 = 0.5$ and terminates when the relative error between two successive iterates drops below $10^{-3}$.

As Table 1 and Fig. 1 show, the unknown coefficient is recovered very well. All cases reach good accuracy with few iterations and low computational cost. However, in the 5th case the computational time is significantly larger than in the others, because more iterations are needed in Newton's method to solve the nonlinear problem.

5.1.2. Neural Network Approach. The key step of the CNN is the creation of the training data. The simplest way is to use random processes to produce the data. We randomly choose the coefficient a and then solve the direct problem; we then evaluate the solution at the selected 9 points and construct a mapping between these values and the chosen coefficient a. All the data are collected as rows of a CSV file. We build four data sets of training data with sizes of 1000, 2000, 4000, and 8000 rows, respectively, and we start from the 1000-row data set, which corresponds to a realistic amount of observation data. Note that the machine learning process requires a big dataset. In order to get good approximations, we also have to take into account the influence of some special situations,

Table 1. Results for Example 1. #iter is the number of iterations and time (sec) is the CPU time of the computation

No.  a_exact   a_approx  absolute error  J_h        #iter  time (sec)
1    0.306529  0.306528  1.132e-07       3.714e-15  3      0.715
2    0.745282  0.745281  1.109e-07       1.322e-14  5      0.869
3    0.457313  0.457313  3.150e-10       7.008e-21  4      0.722
4    0.182086  0.182085  7.998e-13       1.539e-23  4      0.885
5    0.260822  0.260822  8.176e-10       3.194e-19  5      2.423
6    0.681761  0.681761  9.825e-09       1.469e-18  5      0.881
7    0.924753  0.924752  1.381e-06       8.727e-15  6      1.031
8    0.381546  0.381546  4.133e-08       2.341e-16  4      0.786
9    0.056102  0.056102  4.770e-11       6.734e-21  3      1.379
10   0.855829  0.855829  4.546e-08       1.284e-17  6      1.027

Fig. 1. Approximation (blue lines) at each iteration; red dashed lines represent the exact value of the coefficient for each case.

which actually lead to large errors and are hard to deal with by the machine learning method. Therefore, we build a test data set with only 10 rows for the extreme values, shown in Table 2; the values of concern were chosen around 0.001 and 1.

Tables 3 and 4 demonstrate that our neural network converges with respect to the training data size and the epoch parameter. As Figs. 2 and 3 show, the errors are updated and the curve goes from underfitting to optimal and then to overfitting through the entire process of training the neural network. We have to be careful with the epoch parameter; there is no magic rule of thumb for choosing the number of epochs, and too high values of the epoch parameter usually lead to overtraining. In machine learning, overfitting occurs when a statistical model describes random error or noise instead of the underlying relationship, namely, the model does well on training data and fails to perform well on test data. It is worth noting that the neural network then tries to learn complicated non-existent relationships between the inputs and outputs and accumulates errors of the computational model (in our case, errors from solving the direct problem).

Table 2. Test data set

x1     x2     x3     x4     x5     x6     x7     x8     x9     a

0.035 0.052 0.035 0.052 0.081 0.052 0.035 0.052 0.035 0.717

0.037 0.055 0.037 0.055 0.086 0.055 0.037 0.055 0.037 0.676

0.126 0.184 0.126 0.184 0.269 0.184 0.126 0.184 0.126 0.192

0.032 0.048 0.032 0.048 0.075 0.048 0.032 0.048 0.032 0.778

0.379 0.445 0.379 0.445 0.527 0.445 0.379 0.445 0.379 0.018

0.174 0.245 0.174 0.245 0.340 0.245 0.174 0.245 0.174 0.133

0.047 0.071 0.047 0.071 0.110 0.071 0.047 0.071 0.047 0.527

0.025 0.038 0.025 0.038 0.059 0.038 0.025 0.038 0.025 0.982

0.354 0.424 0.355 0.424 0.509 0.424 0.355 0.424 0.355 0.028

0.025 0.037 0.025 0.037 0.058 0.037 0.025 0.037 0.025 0.999

Table 3. Convergence in Lmax (mean value)

N\ epoch 10 20 50 100 200 400

1000 0.6988 0.4735 0.1867 0.0962 0.0723 0.0188

2000 0.4040 0.2746 0.2869 0.1348 0.1282 0.0128

4000 0.2740 0.1969 0.0429 0.0245 0.1345 0.0253

8000 0.2025 0.0571 0.0267 0.0648 0.0226 0.1295

Table 4. Convergence in L2 (mean value)

N \ epoch 10 20 50 100 200 400

1000 0.4216 0.2424 0.0875 0.0428 0.0446 0.0106

2000 0.1966 0.1285 0.1348 0.0708 0.0703 0.0059

4000 0.1288 0.0907 0.0182 0.0138 0.0707 0.0099

8000 0.0909 0.0276 0.0132 0.0372 0.0095 0.0683

The major advantage of the neural network approach is the prediction time compared with the classical approach: it provides an extremely fast method that produces about 10 predictions in 0.02 sec. However, we have to carry out an offline training stage for the neural network, which takes hours depending on the input data size. Meanwhile, it is still worthwhile to develop proper and effective training algorithms to accelerate the training part of CNNs. If there are no strong restrictions on the accuracy, the NN approach can be applied for obtaining predictions quickly.

Fig. 2. Errors depending on epoch number.

Fig. 3. Errors depending on training data size.

Example 2: Variable Parameters. In this experiment, a is chosen as a variable parameter depending on the spatial variables, i.e., a = a(x). Assume for the sake of the argument that we take the parameter in a linear dependency form, namely, let $a(x) = a_1 + a_2 x + a_3 y$, written for brevity as $a = (a_1, a_2, a_3)$, where each parameter is bounded as before, $a_i \in K$, i = 1, 2, 3. In the implementation of Example 2, the noisy data z are generated by adding a random perturbation

$$z_i = u(a, x_i)(1 + \delta a_i), \quad i = 1, 2, \dots, M,$$

where $a_i$ are uniformly distributed random numbers in $[-1, 1]$ and $\delta$ is the corresponding noise level.
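As an illustration (not the authors' code), such a perturbation can be generated as follows; the function name, the clean sample values, and the use of NumPy's random generator are assumptions.

```python
# Hypothetical noise generation: z_i = u(a, x_i) * (1 + delta * r_i), r_i uniform in [-1, 1].
import numpy as np

def add_noise(z_clean, delta, rng=None):
    rng = rng or np.random.default_rng()
    r = rng.uniform(-1.0, 1.0, size=np.shape(z_clean))
    return np.asarray(z_clean) * (1.0 + delta * r)

z_noisy = add_noise([0.035, 0.052, 0.081], delta=0.05)   # 5% noise level
```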

5.2.1. The Optimization Approach. We choose different initial values of the parameters a and compare the computational costs for the different initial guesses, which are chosen as follows: (1) (0.1, 0.9, 0.9); (2) (0.9, 0.1, 0.9); (3) (0.5, 0.5, 0.5). We refine the mesh up to a 100 x 100 grid.

We present the results for various parameters with different initial guesses in Table 5. In this case, we employ the L-BFGS-B algorithm to solve the optimization problem with several variables, since it is well suited to handle simple bounds on the control variables. It can be seen from Table 5 that more iterations and more evaluation time are needed in the multi-parameter case than in the constant parameter case: it takes approximately 0.45 seconds per iteration. The coefficient a is well approximated even with noisy input data; the largest relative error is 1.9% for the noise level 1% and 7.7% for 5%.

5.2.2. Neural Network Approach. The neural network approach can be easily extended to the multiple parameters case. In the previous construction, we

Table 5. Results for the multi-parameter a

initial guess  a1        a2        a3        #iter  time (sec)
a = (0.5, 0.5, 0.5), δ = 1%
1              0.493320  0.504706  0.509233  13     5.132
2              0.504059  0.498588  0.494881  15     6.245
a = (0.5, 0.5, 0.5), δ = 5%
1              0.470931  0.501862  0.538382  13     5.219
2              0.484641  0.534363  0.496536  15     6.351
a = (0.3, 0.8, 0.6), δ = 1%
1              0.300964  0.792119  0.602069  15     6.686
2              0.299458  0.807414  0.592130  13     8.502
3              0.298877  0.805383  0.600089  16     6.762

used the simplest case with 1 layer and 9 neurons inside the layer. Now we consider the neural network for the 3-parameter case; therefore, we modify the neural network to have 3 layers with a (9, 12, 12)-neuron scheme.

This scheme can be explained as follows: when we fix two of the parameters $a_i$, for example $a_2$ and $a_3$, the problem becomes computationally close to the case of recovering one parameter; and when we unfix a parameter $a_i$, we need to construct an extra layer. The first layer has 9 neurons for the 9 selected inner points; in this sense, the first layer corresponds to the input data.

We should increase the number of neurons in each layer because we must take into account the correlation of the parameters between each other; not only the number of parameters but also their superposition increases the computational complexity. Optimizing the neural network architecture is a separate and complex issue to study; at the moment we focus on performance criteria and on meeting the requirements on the permissible errors. We train this neural network with a training data set of size 8000, 800 epochs, and batch size 50.
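For illustration, a hedged sketch of such a 3-layer (9, 12, 12)-neuron model with a 3-component output is given below; as before, the activations, the optimizer, and the loss are assumptions not specified in the paper.

```python
# Hypothetical 3-layer model for recovering (a1, a2, a3) from the 9 measured values.
import tensorflow as tf

model3 = tf.keras.Sequential([
    tf.keras.layers.Dense(9, activation="relu", input_shape=(9,)),
    tf.keras.layers.Dense(12, activation="relu"),
    tf.keras.layers.Dense(12, activation="relu"),
    tf.keras.layers.Dense(3),        # predicted (a1, a2, a3)
])
model3.compile(optimizer="adam", loss="mse")
# model3.fit(x_train, y_train, batch_size=50, epochs=800, verbose=0)
```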

As seen from Tables 6 and 7 (for 10 predictions), the results match the requirements well for unperturbed data. Moreover, Tables 8 and 9 show the results with noisy data: the coefficient a is still well reconstructed, although the noisy data cause a certain loss of accuracy of the predictions. The numbers of layers and neurons in the neural network grow as the computational complexity increases.

6. Conclusion

In this paper, we constructed two different approaches for solving the nonlinear coefficient inverse problem. In the optimization approach, we reformulated the inverse problem as a minimization problem; we presented the finite element method for the direct problem and used the conjugate gradient method for the inverse problem. In the neural network approach, we recovered the coefficient by constructing CNNs, which rely on an offline training stage. The first approach

Table 6. Comparison of a1, a2, a3 with predictions â1, â2, â3 without noise

             a1 / â1          a2 / â2          a3 / â3
Test case 1  0.8711 / 0.8695  0.3539 / 0.3283  0.6962 / 0.7161
Test case 2  0.3725 / 0.3679  0.0619 / 0.0688  0.2692 / 0.2720
Test case 3  0.2530 / 0.2500  0.7667 / 0.7332  0.1873 / 0.1983

Table 7. Errors for parameters a1, a2, a3 without noise

                     ε1      ε2      ε3
maximum error L_max  0.0345  0.0335  0.0362
mean error L_2       0.0097  0.0136  0.0187

Table 8. Comparison of a1, a2, a3 with predictions â1, â2, â3 with δ = 5%

             a1 / â1          a2 / â2          a3 / â3
Test case 1  0.7283 / 0.7339  0.6073 / 0.5749  0.3079 / 0.3264
Test case 2  0.3387 / 0.3280  0.7458 / 0.7560  0.2699 / 0.2724
Test case 3  0.1549 / 0.1547  0.4205 / 0.3940  0.5660 / 0.5568

Table 9. Errors for parameters a1, a2, a3 with δ = 5%

                     ε1      ε2      ε3
maximum error L_max  0.0474  0.0957  0.0736
mean error L_2       0.0207  0.0396  0.0287

provides a more accurate solution but takes more time than the NN approach; by contrast, the neural network provides faster predictions but with a loss in accuracy. We tested both the constant parameter case and the variable parameter case. The results obtained from our tests indicate that the two approaches are very efficient for the numerical solution of this nonlinear coefficient inverse problem, even with some noisy data.

In the future, we shall extend our results in two directions: first, we would like to extend our research to the N-parameter case (N > 3); second, we intend to apply our methods to the case of discontinuous coefficients.

REFERENCES

1. V. Isakov, Inverse Problems for Partial Differential Equations, Springer, New York (2006).

2. A. A. Samarskii and P. N. Vabishchevich, Numerical Methods for Solving Inverse Problems of Mathematical Physics, de Gruyter (2008) (Inverse Ill-Posed Probl. Ser.; vol. 52).

3. L. Beilina and M. V. Klibanov, Approximate Global Convergence and Adaptivity for Coefficient Inverse Problems, Springer, New York (2012).

4. M. Benning and M. Burger, "Modern regularization methods for inverse problems," Acta Numerica, 27, 1-111 (2018).


5. H. W. Engl, M. Hanke, and A. Neubauer, Regularization of Inverse Problems, Kluwer Acad. Publ., Dordrecht; Boston; London (1996).

6. M. Hanke, Conjugate Gradient Type Methods for Ill-posed Problems, CRC Press (2019).

7. C. H. Huang and C. W. Chen, "A boundary element-based inverse problem in estimating transient boundary conditions with conjugate gradient method," Int. J. Numer. Methods Eng., 42, No. 5, 943-965 (1998).

8. B. Jin, "Conjugate gradient method for the Robin inverse problem associated with the Laplace equation," Int. J. Numer. Methods Eng., 71, No. 4, 433-453 (2007).

9. I. Elshafiey, L. Udpa, and S. Udpa, "Solution of inverse problems in electromagnetics using Hopfield neural networks," IEEE Trans. Magnetics, 31, No. 1, 852-861 (1995).

10. Y. Fan and L. Ying, "Solving inverse wave scattering with deep learning," arXiv:1911.13202 (2019).

11. S. Hoole, "Artificial neural networks in the solution of inverse electromagnetic field problems," IEEE Trans. Magnetics, 29, No. 2, 1931-1934 (1993).

12. J. Schwab, S. Antholzer, and M. Haltmeier, "Deep null space learning for inverse problems: convergence analysis and rates," Inverse Probl., 35, No. 2, 025008 (2019).

13. J. Adler and O. Oktem, "Solving ill-posed inverse problems using iterative deep neural networks," Inverse Probl., 33, No. 12, 124007 (2017).

14. L. Beilina and M. V. Klibanov, "A globally convergent numerical method for a coefficient inverse problem," SIAM J. Sci. Comput., 31, No. 1, 478-509 (2008).

15. A. Bonito, A. Cohen, R. DeVore, G. Petrova, and G. Welper, "Diffusion coefficients estimation for elliptic partial differential equations," SIAM J. Math. Anal., 49, No. 2, 1570-1592 (2017).

16. V. I. Gorbachenko, T. V. Lazovskaya, D. A. Tarkhov, A. N. Vasilyev, and M. V. Zhukov, "Neural network technique in some inverse problems of mathematical physics," in: Int. Symp. Neural Networks, pp. 310-316, Springer, Cham (2016).

17. D. N. Hao and T. N. T. Quyen, "Finite element methods for coefficient identification in an elliptic equation," Appl. Anal., 93, No. 7, 1533-1566 (2014).

18. I. Knowles, "Parameter identification for elliptic problems," J. Comput. Appl. Math., 131, No. 1-2, 175-194 (2001).

19. T. N. T. Quyen, "Finite element analysis for identifying the reaction coefficient in PDE from boundary observations," Appl. Numer. Math., 145, 297-314 (2019).

20. L. Wang and J. Zou, "Error estimates of finite element methods for parameter identifications in elliptic and parabolic systems," Discrete Contin. Dyn. Syst., B, 14, No. 4, 1641 (2010).

21. J. Zou, "Numerical methods for elliptic inverse problems," Int. J. Computer Math., 70, No. 2, 211-232 (1998).

22. H. Li, J. Schwab, S. Antholzer, and M. Haltmeier, "NETT: Solving inverse problems with deep neural networks," Inverse Probl., (2020).

23. A. Logg, K. A. Mardal, and G. Wells, Automated Solution of Differential Equations by the Finite Element Method: The FEniCS Book, Springer-Verl., Berlin; Heidelberg (2012) (Lect. Notes Comput. Sci. Eng.; vol. 84).

24. S. S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice Hall PTR, New York (1994).

25. S. S. Haykin, Neural Networks and Learning Machines, Prentice Hall, New York (2009).

26. K. H. Jin, M. T. McCann, E. Froustey, and M. Unser, "Deep convolutional neural network for inverse problems in imaging," IEEE Trans. Image Process., 26, No. 9, 4509-4522 (2017).

27. A. Lucas, M. Iliadis, R. Molina, and A. K. Katsaggelos, "Using deep neural networks for inverse problems in imaging: beyond analytical methods," IEEE Signal Process. Mag., 35, No. 1, 20-36 (2018).

28. V. M. Krasnopolsky and H. Schiller, "Some neural network applications in environmental sciences. Part I: forward and inverse problems in geophysical remote measurements," Neural Networks, 16, No. 3-4, 321-334 (2003).

Received April 4, 2019 Revised December 11, 2020 Accepted February 26, 2021

Jian Huang
School of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, China;
Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan, 411105, China;
Key Laboratory of Intelligent Computing Information Processing of Ministry of Education, Xiangtan, 411105, China
huangjian213@xtu.edu.cn, huangyq@xtu.edu.cn

Aleksandr V. Grigorev
M. K. Ammosov North-Eastern Federal University, Institute of Mathematics and Informatics, 48 Kulakovsky Street, Yakutsk 677000, Russia
re5itsme@gmail.com

Dulus Kh. Ivanov
M. K. Ammosov North-Eastern Federal University, Institute of Mathematics and Informatics, 48 Kulakovsky Street, Yakutsk 677000, Russia;
Yakutsk Branch of the Regional Scientific and Educational Mathematical Center "Far Eastern Center of Mathematical Research", 48 Kulakovsky Street, Yakutsk 677000, Russia
i,am.djoos@gmail.com
