Research paper on the topic 'Applied aspects of modern non-blind image deconvolution methods'

Applied aspects of modern non-blind image deconvolution methods. Research paper in the field of Computer and Information Sciences

CC BY
Keywords
non-blind image deconvolution / image deblurring / state-of-the-art methods / method robustness / non-blind deconvolution benchmarking

Abstract of the research paper in computer and information sciences. Authors: Olga Borisovna Chaganova, Anton Sergeevich Grigoryev, Dmitry Petrovich Nikolaev, Ilya Petrovich Nikolaev

The focus of this paper is the study of modern non-blind image deconvolution methods and their application to practical tasks. The aim of the study is to determine the current state-of-the-art in non-blind image deconvolution and to identify the limitations of current approaches, with a focus on practical application details. The paper proposes approaches to examine the influence of various effects on the quality of restoration, the robustness of models to errors in blur kernel estimation, and the violation of the commonly assumed uniform blur model. We developed a benchmark for validating non-blind deconvolution methods, which includes datasets of ground truth images and blur kernels, as well as a test scheme for assessing restoration quality and error robustness. Our experimental results show that those neural network models lacking any preoptimization, such as quantization or knowledge distillation, fall short of classical methods in several key properties, such as inference speed or the ability to handle different types of blur. Nevertheless, neural network models have made notable progress in their robustness to noise and distortions. Based on the results of the study, we provided recommendations for more effective use of modern non-blind image deconvolution methods. We also developed suggestions for improving the robustness, versatility and performance quality of the models by incorporating additional practices into the training pipeline.


Text of the research paper "Applied aspects of modern non-blind image deconvolution methods"

Applied aspects of modern non-blind image deconvolution methods

O.B. Chaganova ¹,³, A.S. Grigoryev ¹,², D.P. Nikolaev ¹,⁴, I.P. Nikolaev ¹

¹ Institute for Information Transmission Problems, RAS, 127051, Russia, Moscow, Bolshoy Karetny per. 19;

² Evocargo LLC, 129085, Moscow, Russia, Godovikova st., 9, b. 4;

³ Moscow Institute of Physics and Technology (National Research University), 141701, Russia, Dolgoprudny, Institutskiy per. 9;

⁴ LLC "Smart Engines Service", 117312, Russia, Moscow, prospect 60-letiya Oktyabrya 9


Keywords: non-blind image deconvolution, image deblurring, state-of-the-art methods, method robustness, non-blind deconvolution benchmarking.

Citation: Chaganova OB, Grigoryev AS, Nikolaev DP, Nikolaev IP. Applied aspects of modern non-blind image deconvolution methods. Computer Optics 2024; 48(4): 562-572. DOI: 10.18287/2412-6179-CO-1409.

Introduction

The paper is dedicated to the investigation of modern non-blind image deconvolution (NBID) methods. The task of NBID is to estimate the unknown sharp image from its blurred representation and a known point spread function (PSF). NBID methods are used in various fields such as astronomy [1], microscopy [2] and medical diagnostics [3]. In addition, non-blind deconvolution methods can be a part of a two-stage blind deconvolution scheme that follows the estimation of the blur kernel [4].

Academic research tends to focus on the development of new methods that demonstrate superior restoration quality. The majority of articles prioritize quality metrics [5, 6, 7] as their primary evaluation criteria, which also determine the state-of-the-art. However, little attention is paid to the implementation of such methods in real-world computer vision systems and their proper use considering practical limitations. Furthermore, there is currently no consistent approach to model validation. Many studies evaluate their methods on only a small set of blur kernels and original images [5]. Such an approach falls short of providing a comprehensive assessment of the method's performance. Other papers do not contain a detailed description of the validation process [6], making it difficult or impossible to compare results between papers.

Our work aims to assess the progress made in the field of NBID and to identify the strengths and weaknesses of modern methods. To achieve this, we have developed a validation scheme for NBID methods that evaluates both the restoration quality and the robustness of the methods. In addition, we have constructed a more comprehensive and representative benchmark consisting of datasets of original images and blur kernels. The developed benchmark complements commonly used datasets for restoration quality evaluation (e.g. [8, 9]) and allows testing under various conditions, including the presence of noise, different levels of image quantization and types of blur kernels. The source code is available at https://github.com/OlgaChaganova/non-blind-deconvolution-benchmark.

The main contributions of the work are as follows:

1. A new benchmark for validating NBID models is proposed, which includes a methodology for testing methods that assesses restoration quality and error robustness;

2. Additional practices to improve NBID methods are suggested. Such practices do not require changes to the model architecture and can be incorporated into the generation of training datasets. By adopting them, it is possible to improve the robustness, versatility and accuracy of the models.

Related work

In this section we will provide an overview of current state-of-the-art non-blind image deconvolution methods. Current method quality assessment approaches and existing benchmarks will also be considered. Our criteria for current state-of-the-art models are as follows:

1. the article is no more than 2 or 3 years old;

2. the authors claim metrics growth and visual improvement over current state-of-the-art methods;

3. the weights of the trained models are publicly available so that we can use them for testing.

These requirements are met by the models described in the next section. A detailed description of the models' architecture and training pipeline can be found in the original articles.

Overview of the state-of-the-art methods

NBID methods have evolved from analytical methods [16 - 22] to deep learning models. The simplest approach has been to use neural network models that map the blurred image and the blur kernel directly onto the ground truth image [23, 24]. However, the combination of analytical methods and machine learning has significantly improved the restoration quality. Many methods approximate classical methods [25, 26]. The use of the deep unfolding paradigm is also common [5, 27]. Among all the modern methods, we have selected three models that can be considered as state-of-the-art:

1. USRNet (CVPR 2020) [5]. The model was originally developed to solve the single image super-resolution problem, but it can also be used to solve the NBID problem. The model architecture includes three modules. The recovery module contains no trainable parameters as it has an analytical form using the Fourier transform. The noise reduction module is a ResUNet network that takes as input an estimate of the reconstructed image and a numerical value of the noise level. The hyperparameter estimation module is a three-layer fully connected network that takes as input a noise level value and a dimensionality reduction factor. USRNet is trained end-to-end with an L1 loss. The model is trained to recover both motion and Gaussian blurred images.

2. DWDN (NeurIPS 2020) [6]. The main idea is to combine a classical Wiener filter with a trainable neural network model. The DWDN model consists of three parts. A convolutional network is used as a feature extractor. A Wiener filter is then applied to the extracted features. Due to the trainability and non-linearity of the neural network, these features contain more useful information than the pixel intensities of the original image. Finally, an autoencoder reconstructs the image at different scales using the image pyramid. The loss function is computed as a weighted sum of L1 losses for images reconstructed at different scales. The model has been trained to recover motion blurred images.

3. KerUnc (CVPR 2020) [7]. The model architecture is designed to increase the robustness of the model to noise and errors. The authors have modified the optimization problem of image reconstruction by including an error component and regularization operators. There are three main modules in the network. The recovery module has an analytical form based on the Fourier transform. The error term evaluation module is a dual-path UNet model. The noise component estimation module is a combination of a convolutional network and a set of high-pass wavelet transform filters. The loss function is a weighted sum of the MSEs between the reconstructed image and the ground truth image over all the iterations. The robustness of the model to errors and noise is also ensured by the construction of the dataset. Both the correct kernels and their distorted versions were used during training. The model was trained to recover images blurred with motion blur kernels.

Overview of benchmarks

There are three main benchmarks that are used to test the quality of NBID models:

1) Levin et al. [8]. This is one of the most commonly used datasets for benchmarking NBID methods. It contains 8 motion blur kernels and 4 original images, resulting in 32 test pairs. One of the advantages of this dataset is that it was assembled by the authors using their own equipment. The blurred images were taken directly from a camera instead of being modelled. The blur kernels were estimated with a high degree of accuracy.

2) Sun et al. [9]. The dataset contains 8 motion blur kernels and 80 ground truth images, from which the blurred images are then generated. The kernels in this dataset are estimates of the kernels from the Levin dataset, so they are similar to those kernels but not exactly the same. The original images contain fine and subtle details that clearly show the restoration quality of the deblurring algorithm.

3) Lai et al. [28]. The dataset contains 25 ground truth images from 4 categories and 4 generated blur kernels.

All three benchmarks have one drawback in common: they have a small volume and contain only one type of blur kernel, which is the motion blur. We could not find any benchmarks containing Gaussian blur kernels, and we could only find one dataset with eye blur kernels, called SCA-2023, which was recently presented in [29]. SCA-2023 is a dataset for benchmarking image precompensation methods, but it can also be used for testing non-blind image deconvolution methods. It contains three subsets of PSFs (each consisting of 256 PSFs) and 735 ground truth images divided into 6 categories: texts, icons, animals, faces, natural and urban landscapes. This dataset is much larger and more diverse than the NBID benchmarks discussed earlier.

Methodology

In this section, we describe the basics concerning blurred image modelling, the design of experiments, the structure of the developed benchmark for testing NBID methods, and the test scheme.

Blur image formation model

The uniform blur model is given by the following equation

$$g(x, y) = k(x, y) * f(x, y) + n(x, y), \tag{1}$$

where g(x, y) is the known blurred image, f(x, y) is the latent (ground truth) image, k(x, y) is the blur kernel, n(x, y) is additive white Gaussian noise (AWGN), and the operator * denotes the spatial convolution.

It is assumed that f(x, y) and g(x, y) are linRGB images, since the convolution takes place in the linear space. In practice, however, we are more likely to work with processed images than with RAW images. It is therefore necessary to consider the effect of the camera response function (CRF), which involves several stages of image processing such as white balancing, demosaicing, sRGB conversion, and others [10]. With CRF and quantization of pixel values, the blur model is as follows:

$$g(x, y) = Q_q\big(\Phi(k(x, y) * f(x, y) + n(x, y))\big), \tag{2}$$

where $\Phi(x)$ is the camera response function and $Q_q(x)$ is the quantization function of the form $Q_q(x) = q \cdot \mathrm{round}(x / q)$, with $q = 1 / (2^k - 1)$ for a k-bit image (here k denotes the bit depth, not the blur kernel). Since $\Phi$ is a non-linear function, the blurred image g(x, y) must be linearized using the inverse transform $\Phi^{-1}$ before applying the NBID method. The exact form of the camera response function is not standardized. In practice, it is either determined by special methods [11] or approximated by some model. For example, the gamma curve [12, 13] is common.
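Expression (2) can be illustrated with a short simulation. The following is a minimal sketch, not the benchmark's implementation: it assumes a gamma curve with exponent 2.2 as the CRF, circular (FFT-based) convolution at the image borders, and pixel values in [0, 1]. All function names are hypothetical.

```python
import numpy as np


def circular_convolve(image, kernel):
    """Convolve `image` with `kernel` via the FFT (circular borders; an assumption)."""
    padded = np.zeros_like(image, dtype=np.float64)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    # Move the kernel centre to the origin so the output is not shifted.
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))


def gamma_crf(x, gamma=2.2):
    """Gamma-curve stand-in for the camera response function Phi."""
    return np.clip(x, 0.0, 1.0) ** (1.0 / gamma)


def inverse_gamma_crf(x, gamma=2.2):
    """Inverse transform Phi^{-1}, used to linearize an observed image."""
    return np.clip(x, 0.0, 1.0) ** gamma


def quantize(x, bits):
    """Q_q(x) = q * round(x / q) with q = 1 / (2**bits - 1)."""
    q = 1.0 / (2 ** bits - 1)
    return q * np.round(x / q)


def degrade(f, k, sigma=0.01, bits=8, gamma=2.2, seed=0):
    """Simulate g = Q_q(Phi(k * f + n)) for a linear image f with values in [0, 1]."""
    rng = np.random.default_rng(seed)
    blurred = circular_convolve(f, k)                        # k * f
    noisy = blurred + rng.normal(0.0, sigma, size=f.shape)   # + n (AWGN)
    return quantize(gamma_crf(noisy, gamma), bits)           # Q_q(Phi(.))
```

Before an NBID method is applied, the observation would be passed through `inverse_gamma_crf`, mirroring the linearization step described above.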

Blur types

We consider three types of blur: motion blur, gaussian blur, and eye blur. Typical examples of these types of kernels are shown in Fig. 1.

Fig. 1. Examples of motion blur, Gaussian blur, and eye blur kernels (panels: motion blur, Gaussian blur, small eye blur, medium eye blur, large eye blur)

1) Motion blur

The motion blur can be caused by the movement of objects in the scene or the scene itself; an unstable camera position during recording; a long exposure time when taking a picture. The generation algorithm proposed in [4] can be used to simulate motion blur kernels. The blur kernel contour is a spline of several randomly selected points, whose pixel values are then sampled from a normal distribution and normalized.

2) Gaussian blur

The Gaussian blur is a common form of blur observed in astronomical imaging, underwater photography, and fluorescence microscopy. It can also be used as an approximation for more complex blurs. In general terms, the kernel of Gaussian blur can be described by the following formula:

$$k(x, y) = \frac{1}{2\pi\sigma_x\sigma_y} \exp\left(-\frac{x^2}{2\sigma_x^2} - \frac{y^2}{2\sigma_y^2}\right), \tag{3}$$

where $\sigma_x$ and $\sigma_y$ are the standard deviations along the x and y axes, respectively.
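As an illustration, expression (3) can be sampled on a discrete grid. This is a minimal sketch under two assumptions that are not stated in the text: the kernel is centred on an odd-sized grid, and it is renormalized to unit sum after truncation so that blurring preserves the mean brightness. The function name is hypothetical.

```python
import numpy as np


def gaussian_kernel(size, sigma_x, sigma_y):
    """Sample expression (3) on a size x size grid centred at the origin."""
    half = (size - 1) / 2.0
    y, x = np.mgrid[0:size, 0:size] - half  # row (y) and column (x) coordinates
    k = np.exp(-x ** 2 / (2 * sigma_x ** 2) - y ** 2 / (2 * sigma_y ** 2))
    k /= 2 * np.pi * sigma_x * sigma_y
    # Truncation to a finite grid loses some mass, so renormalize (assumption).
    return k / k.sum()


kernel = gaussian_kernel(41, sigma_x=2.0, sigma_y=4.0)
```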

3) Eye blur

This type of blur is caused by abnormalities in the human visual system that result in the eye's inability to focus the light beam correctly on the retina. A brief description of eye blur modeling can be found in [29].

Benchmark description

In order to create a representative benchmark, it is necessary to provide a variety of images and blur kernels. The following datasets are used in our proposed benchmark:

1. Blur kernel datasets:

1.1. Motion blur (46 kernels): Levin (8 kernels), Sun (8 kernels), synthetic kernels (30 kernels). The synthetic kernels were generated using the algorithm from [4] with the following parameters: the kernel size is 41×41, the number of spline points is uniformly sampled from the set {3, 4, 5, 6}, and the spline size is uniformly sampled from the set {11, 16, 21, 26, 31}.

1.2. Gaussian blur: synthetic kernels (30 kernels). The synthetic kernels were generated according to expression (3) with an additional rotation of the kernel by an angle A. The generation parameters are as follows: $\sigma_x$ and $\sigma_y$ are chosen from the range [2, 10], and A is selected from the range [-180, 180] degrees with a step of 10 degrees.

1.3. Eye blur: SCA-2023 (30 small blur kernels, 30 medium blur kernels, 30 large blur kernels).

2. Ground truth image datasets:

2.1 Sun: 80 images;

2.2 SCA-2023: 539 images in 6 categories.

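The pairs (ground truth image, blur kernel) were created so that each kernel was paired with one image from each image category: the Sun dataset, treated as a single category, and the six SCA-2023 categories. For 7 ground truth image categories and 166 kernels, a total of 1162 pairs were created.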
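The pair count follows directly from the dataset sizes listed above. The short check below only restates that arithmetic; the dictionary keys are illustrative labels, not identifiers from the benchmark code.

```python
# Number of blur kernels per source, as listed in the benchmark description.
kernels = {
    "motion: Levin": 8,
    "motion: Sun": 8,
    "motion: synthetic": 30,
    "gaussian: synthetic": 30,
    "eye: small": 30,
    "eye: medium": 30,
    "eye: large": 30,
}
# The Sun dataset counts as one category next to the six SCA-2023 categories.
categories = ["Sun", "texts", "icons", "animals", "faces", "nature", "city"]

n_kernels = sum(kernels.values())
n_pairs = n_kernels * len(categories)  # one image per category for each kernel
print(n_kernels, n_pairs)  # 166 1162
```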

The entire image pre-processing procedure is shown in Fig. 2. Image cropping is necessary to unify the estimation process for all models, and grayscale conversion is used to eliminate the influence of color in the calculation of metrics. For those models that take a three-channel image as input, the grayscale image is copied to the remaining two channels. To ensure the correctness of the blur simulation, the original images were converted from sRGB to linRGB before the convolution. The images in the datasets are in two formats: JPEG (uint 8 bit pixel values) and PNG (float 32 bit pixel values). We normalize JPEG images by dividing their pixel values by the maximum pixel value (255) to be able to use them in the tests. We simulated noisy images by adding white Gaussian noise with a sigma equal to 1 % of the maximum pixel brightness to the convolved images.

Test scheme

The testing process is shown schematically in Fig. 3. The blurring takes place in the linear space, resulting in linRGB blurred and ground truth images. The testing process is further divided into two branches: NBID algorithms are applied to linRGB images and to sRGB images at different quantization levels (float 32 bit, uint 16 bit, uint 8 bit).

Fig. 2. Image pre-processing and blur modelling procedure (steps: crop to 256×256, convert to gray, sRGB → linRGB conversion, convolution with the PSF; outputs: linRGB blurred image and linRGB ground truth image)

[Fig. 3 diagram: sRGB color image → pre-processing (the convolution is modelled in linear space) → linRGB blurred and ground truth images → quantization (float32 → uint16 → uint8) in the linRGB and sRGB branches → deconvolution method → restoration quality assessment]

Models and metrics

We chose the USRNet [5], DWDN [6], and KerUnc [7] neural networks for testing. A better understanding of the advantages and limitations of modern methods can be gained by comparing them with a well-known classical method, which in our case is a Wiener filter with regularisation [15].

The trained model weights are taken from official sources. The Wiener filter implementation is taken from the skimage library. Since all the selected neural network models can take into account the presence and the intensity of noise, the Wiener filter hyperparameter balance was pre-optimized on a part of the test dataset. The optimal values of the hyperparameter are 1e-8 in the absence of noise and 5e-3 in the presence of noise.

The standard metrics PSNR [30] and SSIM [31] are used as quality metrics. A higher metric indicates better restoration. However, these metrics have their own drawbacks [32, 33]. For this reason, it is proposed to use a combination of measuring the metrics on a large number of images to obtain the average restoration quality of a method, and a visual analysis of the recovered images to evaluate this quality from a human point of view.

Fig. 3. Schematic diagram of the testing process for NBID methods
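The role of the regularization weight can be sketched with a toy experiment. The code below is not the skimage implementation used in the paper: it is a simplified Wiener-type inverse filter with a constant regularization term, written in plain NumPy, and the test image, the box PSF and the circular boundary handling are assumptions made for illustration. Because the regularizer differs from the one in skimage, the values 1e-8 and 5e-3 are reused here only to show the trend, not to reproduce the reported optimum.

```python
import numpy as np


def psf_to_otf(psf, shape):
    """Zero-pad the PSF to `shape` and centre it at the origin before the FFT."""
    padded = np.zeros(shape)
    kh, kw = psf.shape
    padded[:kh, :kw] = psf
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(padded)


def wiener_deconvolve(g, psf, balance):
    """Simplified Wiener-type filter: conj(H) G / (|H|^2 + balance)."""
    H = psf_to_otf(psf, g.shape)
    F = np.conj(H) * np.fft.fft2(g) / (np.abs(H) ** 2 + balance)
    return np.real(np.fft.ifft2(F))


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images with the given value range."""
    mse = np.mean((reference - estimate) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(data_range ** 2 / mse)


# Smooth synthetic ground truth image and a 5x5 box PSF (both assumptions).
y, x = np.mgrid[0:64, 0:64]
f = 0.5 + 0.25 * np.sin(2 * np.pi * 3 * x / 64) * np.cos(2 * np.pi * 2 * y / 64)
psf = np.ones((5, 5)) / 25.0

g_clean = np.real(np.fft.ifft2(np.fft.fft2(f) * psf_to_otf(psf, f.shape)))
g_noisy = g_clean + np.random.default_rng(0).normal(0.0, 0.01, size=f.shape)

psnr_clean = psnr(f, wiener_deconvolve(g_clean, psf, balance=1e-8))
psnr_noisy_weak_reg = psnr(f, wiener_deconvolve(g_noisy, psf, balance=1e-8))
psnr_noisy_strong_reg = psnr(f, wiener_deconvolve(g_noisy, psf, balance=5e-3))
print(psnr_clean, psnr_noisy_weak_reg, psnr_noisy_strong_reg)
```

In this toy setting a very small regularization weight works well without noise but amplifies noise strongly, which is why the weight has to be tuned separately for the noisy case.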

Restoration quality evaluation

This section describes how different parameters affect the quality of the restoration. In most experiments, the methods were applied to linRGB images. However, we will also consider whether the neural network models should be applied to linRGB or sRGB images, and for these experiments both metrics for linRGB and sRGB images are given.

The influence of the blur type and noise on the restoration quality

According to experimental results (Fig. 4), the main advantage of modern neural network models is their improved robustness to noise. In the absence of noise, the Wiener filter is the best method, but in the more realistic noisy scenario, its restoration quality drops dramatically. Its average PSNR drops by a factor of 2-4, while for neural networks this metric drops by less than 25 %. A visualization of the model performance is shown in Fig. 5.

Fig. 4. Boxplots of the restoration quality metrics versus the blur types (panels: without noise; with Gaussian noise, σ is 1 % of the maximum pixel brightness). The red lines represent the median values of the metrics

However, the neural network models do have one important feature. All three neural network models perform well on the motion blur, but DWDN and KerUnc perform significantly worse on the Gaussian blur and the eye blur. The reason for this is that the training datasets of the models only consisted of motion blur kernels. Therefore, the generalizability of these models only extends to unfamiliar kernels within their known blur type, but not to other blur types. If there is a need to handle different types of blur without a loss of quality, the USRNet is the most suitable model.

The influence of pixel quantization on the restoration quality

Image quantization reduces the number of bits used to represent the color of each pixel in an image. The dependence of the average restoration quality metrics on the quantization level (float 32 bit, uint 16 bit, uint 8 bit) is shown in Fig. 6. The given metric values are averages over all blur types.

For the neural network models, the quantization did not result in a noticeable drop in the metric values. However, for the Wiener filter, the conversion to 8-bit pixel values results in a significant loss of quality, mainly due to quantization noise. This effect can be partially mitigated by increasing the weight of the regularization term that suppresses noise, but the reconstructed image still contains visible artifacts.
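The magnitude of the quantization noise can be estimated numerically. The sketch below assumes pixel values uniformly distributed in [0, 1] and uses the standard result that the rounding error of a uniform quantizer with step q has a standard deviation of about q/√12. For 8 bits this gives roughly 1.1e-3, about a tenth of the AWGN sigma used in the benchmark, which may be enough to disturb a filter tuned for noise-free input. The function names are hypothetical.

```python
import numpy as np


def quantize(x, bits):
    """Q_q(x) = q * round(x / q) with q = 1 / (2**bits - 1)."""
    q = 1.0 / (2 ** bits - 1)
    return q * np.round(x / q)


def quantization_rms(bits, n_samples=1_000_000, seed=0):
    """Empirical RMS rounding error for values drawn uniformly from [0, 1]."""
    x = np.random.default_rng(seed).random(n_samples)
    return float(np.sqrt(np.mean((quantize(x, bits) - x) ** 2)))


for bits in (16, 8):
    q = 1.0 / (2 ** bits - 1)
    # Empirical RMS error next to the theoretical value q / sqrt(12).
    print(bits, quantization_rms(bits), q / np.sqrt(12))
```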

The influence of the image category on the restoration quality

The dependency of the average restoration quality metrics on the image category is shown in Fig. 7. The metric values given are the average values for all the blur types. The "Sun" category denotes the Sun dataset [9] which contains mainly landscape images. The horizontal line represents the average metric value across all the categories.

The most challenging images for the neural network models were those with text and icons. A possible explanation is that there were few or no images in these categories in the models' training dataset. The Sun dataset, which contains images with fine details and a complex structure, is also quite challenging. The methods perform better on real-world images, i.e. images of nature, faces, cities, and animals.

Fig. 5. Visualization of the performance of the models with different types of blur kernels (columns: blurred, noised blurred, original image, PSF, and the restorations by wiener, kerunc, usrnet and dwdn). The numbers represent the metric values (SSIM/PSNR)

Fig. 6. Dependency of the restoration quality on the quantization level (panels: without noise; with Gaussian noise, σ is 1 % of the maximum pixel brightness; horizontal axis: quantization level, i.e. float 32 bit, uint 16 bit, uint 8 bit)

Fig. 7. The restoration quality metrics versus the image category (panels: without noise; with Gaussian noise, σ is 1 % of the maximum pixel brightness; horizontal axis: image category, i.e. Sun, texts, nature, icons, faces, city, animals)

The influence of gamma-correction on the restoration quality

For all the neural network models tested, the blurred images in the training dataset were obtained by convolving the blur kernel with the ground truth sRGB image. This blurring process is not physically correct as the images were not linearized beforehand. Since the neural networks were trained on sRGB images, but NBID methods are supposed to be used on linearized images, we compared the quality of recovery when the methods were applied directly to sRGB images and to linRGB images. The results are shown in Fig. 8. The best restoration quality is achieved when neural networks are applied to pre-linearized images. Additional non-linear processing between the image convolution in the linear space and the neural network application degrades the quality.
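The difference between the two blurring orders can be made concrete on a one-dimensional edge. This is an illustrative sketch only: it uses the standard piecewise sRGB transfer functions (IEC 61966-2-1) rather than any particular camera response, and the edge levels and the box kernel are arbitrary choices.

```python
import numpy as np


def srgb_to_linear(c):
    """Standard sRGB decoding (IEC 61966-2-1) for values in [0, 1]."""
    c = np.asarray(c, dtype=np.float64)
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)


def linear_to_srgb(c):
    """Standard sRGB encoding (IEC 61966-2-1) for values in [0, 1]."""
    c = np.asarray(c, dtype=np.float64)
    return np.where(c <= 0.0031308, 12.92 * c, 1.055 * c ** (1 / 2.4) - 0.055)


# A dark-to-bright edge expressed in linear light, blurred by a 9-tap box kernel.
linear_edge = np.concatenate([np.full(32, 0.05), np.full(32, 0.9)])
kernel = np.ones(9) / 9.0

# Physically motivated order: blur in linear space, then encode to sRGB.
physically_correct = linear_to_srgb(np.convolve(linear_edge, kernel, mode="same"))
# Shortcut: blur the already encoded sRGB signal.
shortcut = np.convolve(linear_to_srgb(linear_edge), kernel, mode="same")

# Ignore the borders, where zero padding of np.convolve distorts both signals.
max_gap = np.max(np.abs(physically_correct - shortcut)[8:-8])
print(max_gap)
```

The two signals agree in flat regions and differ around the edge, which is where a model trained on the shortcut sees data that does not match physically blurred images.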

Robustness evaluation

This section describes experiments investigating the robustness of the methods to errors in blur kernel estimation and to non-uniform blur.

Fig. 8. Influence of the linRGB-to-sRGB conversion on the restoration quality (panels: without noise; with Gaussian noise, σ is 1 % of the maximum pixel brightness; series: linrgb_float, srgb_float; methods: dwdn, kerunc, usrnet, wiener)

Robustness of methods to blur kernel distortions

Often the PSF is not known exactly but is estimated, and like any estimate it may contain errors. Therefore, the robustness of NBID methods to blur kernel distortions is a very important property in practice. To test the methods' robustness, we set up the following experiment:

1. Obtain a blurred image g by convolving the ground truth image f with the blur kernel k: $g = k * f$;

2. Recover the blurred image by feeding the true kernel into the deconvolution model: $\hat{f} = \mathrm{model}(g, k)$;

3. Distort the blur kernel using the algorithm from [7] to obtain $k_{distorted}$;

4. Recover the blurred image by feeding the distorted kernel into the deconvolution model: $\hat{f}_{distorted} = \mathrm{model}(g, k_{distorted})$;

5. Compare the results obtained. We introduce a quality degradation metric DM, defined as $DM = \mathrm{PSNR}(f, \hat{f}_{distorted}) / \mathrm{PSNR}(f, \hat{f})$, to assess the quality degradation.
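The five steps above can be sketched as follows. Everything except the DM formula is an assumption made for illustration: `toy_model` is a constant-regularized Wiener-type filter standing in for model(g, k), the test image is synthetic, and the distorted kernel is simply a box PSF with a wrongly estimated size, which is a placeholder and not the distortion algorithm from [7].

```python
import numpy as np


def psf_to_otf(psf, shape):
    """Zero-pad the PSF to `shape` and centre it at the origin before the FFT."""
    padded = np.zeros(shape)
    kh, kw = psf.shape
    padded[:kh, :kw] = psf
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(padded)


def toy_model(g, k, balance=1e-3):
    """Stand-in for model(g, k): conj(H) G / (|H|^2 + balance)."""
    H = psf_to_otf(k, g.shape)
    F = np.conj(H) * np.fft.fft2(g) / (np.abs(H) ** 2 + balance)
    return np.real(np.fft.ifft2(F))


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)


def degradation_metric(model, f, g, k, k_distorted):
    """DM = PSNR(f, f_hat_distorted) / PSNR(f, f_hat)."""
    f_hat = model(g, k)                        # step 2: true kernel
    f_hat_distorted = model(g, k_distorted)    # step 4: distorted kernel
    return psnr(f, f_hat_distorted) / psnr(f, f_hat)  # step 5


# Step 1: blur a synthetic ground truth image with a 5x5 box kernel.
y, x = np.mgrid[0:64, 0:64]
f = 0.5 + 0.25 * np.sin(2 * np.pi * 3 * x / 64) * np.cos(2 * np.pi * 2 * y / 64)
k = np.ones((5, 5)) / 25.0
g = np.real(np.fft.ifft2(np.fft.fft2(f) * psf_to_otf(k, f.shape)))

# Step 3 (placeholder distortion): the kernel size is over-estimated as 7x7.
k_distorted = np.ones((7, 7)) / 49.0

dm = degradation_metric(toy_model, f, g, k, k_distorted)
print(f"DM = {dm:.2f}")
```

A value of DM close to 1 means that the distorted kernel barely changes the restoration quality, while smaller values indicate a stronger degradation.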

The test was performed on the same pairs that had been used for the restoration quality test. A visualization of the results is shown in Fig. 9. For those types of blur that were not present in the training dataset, the degradation metric for the neural network models is close to 1: the models reconstruct them equally badly with the correct kernel and with the distorted one. The most informative results are on the types of blur that were present in the training dataset.

The Wiener filter (DM ∈ [0.20, 0.43] for all blur types) and USRNet (DM = 0.46 for motion blur, DM = 0.79 for Gaussian blur) are the least robust. The DWDN model handles errors better (DM = 0.70 for motion blur), although it was only trained on the correct kernels. The KerUnc model (DM = 0.79 for motion blur) handles blur kernel errors best of all, with robustness to errors built into the architecture and training data.

Fig. 9. Visual comparison of robustness to error in the blur kernel (restorations by wiener, kerunc, usrnet and dwdn with the distorted PSF). The numbers in the figure represent the values of the metrics (SSIM / PSNR)

Robustness of methods to non-uniform blur

In the field of NBID, most papers assume that the blurring model is uniform and is described by expression (1). This blurring model is also the basis of all the tested methods. In practice, however, this model is violated, e.g. when the camera moves not only along the image plane, but also in the depth direction of the scene [34]. An experiment that evaluates the robustness of NBID methods to the non-uniform blur model is as follows:

1. Get a uniformly blurred image $g_{uniform}$ by convolving the ground truth image f with the blur kernel k: $g_{uniform} = k * f$;

2. Get a non-uniformly blurred image $g_{non\text{-}uniform}$. We introduce a grid of non-overlapping patches in the original image. Within one patch (i, j), the image is convolved with the initial blur kernel rotated by a certain angle: $k_{i,j} = \mathrm{rotate}(k, \mathrm{angle} = (i + j) \cdot A)$, $i, j \in [0, N]$, where A is the difference between the rotation angles of adjacent patches and $N^2$ is the total number of patches. An example of a grid for A = 8°, N = 8 is shown in Fig. 10.

3. Compare the metrics obtained by deblurring images with uniform blur, $\hat{f}_{uniform} = \mathrm{model}(g_{uniform}, k)$, and non-uniform blur, $\hat{f}_{non\text{-}uniform} = \mathrm{model}(g_{non\text{-}uniform}, k)$.
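Step 2 of this experiment can be sketched as below. This is a hedged illustration rather than the benchmark code: it assumes that patch indices run over 0, ..., N-1 (so that there are N² patches), that the image size is divisible by N, that the kernel is rotated with bilinear interpolation and renormalized, and that each patch is cut from a full-image convolution to avoid seams at patch borders. It relies on `scipy.ndimage`, and the parameter `delta_deg` plays the role of A.

```python
import numpy as np
from scipy import ndimage


def rotated_kernel(k, angle_deg):
    """Rotate the kernel about its centre, clip negatives and renormalize."""
    r = ndimage.rotate(k, angle_deg, reshape=False, order=1)
    r = np.clip(r, 0.0, None)
    return r / r.sum()


def non_uniform_blur(f, k, delta_deg, n):
    """Blur patch (i, j) of an n x n grid with k rotated by (i + j) * delta_deg."""
    h, w = f.shape
    ph, pw = h // n, w // n  # patch size; assumes h and w are divisible by n
    g = np.zeros_like(f, dtype=np.float64)
    for i in range(n):
        for j in range(n):
            k_ij = rotated_kernel(k, (i + j) * delta_deg)
            blurred = ndimage.convolve(f, k_ij, mode="reflect")
            rows = slice(i * ph, (i + 1) * ph)
            cols = slice(j * pw, (j + 1) * pw)
            g[rows, cols] = blurred[rows, cols]
    return g


# Toy example: random image and a horizontal line kernel (both assumptions).
f = np.random.default_rng(0).random((32, 32))
k = np.zeros((7, 7))
k[3, :] = 1.0 / 7.0

g_uniform = ndimage.convolve(f, k, mode="reflect")
g_non_uniform = non_uniform_blur(f, k, delta_deg=15.0, n=4)
```

With `delta_deg=0` the function reduces to the uniform blur of step 1, which gives a simple sanity check for the patch logic.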

Fig. 10. Comparison of uniformly and non-uniformly blurred images (panels: blur kernel rotation grid, uniformly blurred, non-uniformly blurred)

The visualization of the results is shown in Fig. 11. The second row shows the images reconstructed from uniform blur, and the third row shows the images reconstructed from non-uniform blur. In all cases, the parameters of the kernel distortion are the same: A = 3°, N = 4. KerUnc and DWDN handle non-uniform blur better than USRNet and the Wiener filter, which produce visible artifacts.

Computational efficiency

Computational efficiency is an important factor in the selection of a suitable NBID algorithm. The inference time of the models is shown in Table 1. The hardware used for testing is an Intel Core i9-11900K CPU and an NVIDIA GeForce RTX 3090 GPU. The blurred image has a size of 256×256. The evaluated average inference time includes the time for necessary pre- and post-processing of the image. We tested the models in their original configurations, so we did not convert them to other formats such as TensorRT or OpenVINO. We also did not perform any pre-optimization such as quantization or knowledge distillation, which can reduce the inference time of neural networks but may require additional training of the models.

Only the Wiener filter can be used in a real-time processing mode on a CPU, while the neural network models require a GPU. It is worth noting that the speed of USRNet and KerUnc does not depend on the size of the blur kernel, unlike that of the Wiener filter and the DWDN model. It should also be noted that KerUnc and the Wiener filter take single-channel images as input, while USRNet and DWDN take three-channel images. Therefore, when processing RGB color images, the processing time of the Wiener filter and KerUnc increases because they have to be run once per channel.
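To make the per-channel cost and the timing protocol concrete, the following is a minimal sketch of a frequency-domain Wiener deconvolution applied channel by channel, with a helper that averages the runtime over several runs. It is a textbook form with a constant noise-to-signal ratio `nsr`, which is an assumption; the exact Wiener implementation and measurement code used in the experiments are not given here, and all function names are hypothetical.

```python
import time

import numpy as np


def psf_to_otf(k, shape):
    """Zero-pad kernel k to `shape`, centre it at the origin, return its FFT."""
    kh, kw = k.shape
    padded = np.zeros(shape, dtype=float)
    padded[:kh, :kw] = k
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(padded)


def wiener_deconvolve(g, k, nsr=1e-3):
    """Single-channel Wiener deconvolution with a constant noise-to-signal ratio."""
    otf = psf_to_otf(k, g.shape)
    restored = np.conj(otf) * np.fft.fft2(g) / (np.abs(otf) ** 2 + nsr)
    return np.real(np.fft.ifft2(restored))


def wiener_rgb(g_rgb, k, nsr=1e-3):
    """Run the single-channel filter once per channel of an H x W x C image."""
    channels = [wiener_deconvolve(g_rgb[..., c], k, nsr)
                for c in range(g_rgb.shape[-1])]
    return np.stack(channels, axis=-1)


def mean_runtime(fn, *args, runs=30):
    """Average wall-clock time of fn(*args) in seconds over `runs` calls."""
    fn(*args)  # warm-up call, excluded from the measurement
    start = time.perf_counter()
    for _ in range(runs):
        fn(*args)
    return (time.perf_counter() - start) / runs
```

Calling `mean_runtime(wiener_rgb, g_rgb, k)` and `mean_runtime(wiener_deconvolve, g_rgb[..., 0], k)` on the same input gives a direct estimate of the overhead of per-channel processing on a given machine.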

Fig. 11. Visual comparison of robustness to non-uniform blur (panels: uniformly blurred, non-uniformly blurred, original image). Labels under the restored images, in extraction order: wiener (1.00 / 89.22), kerunc (0.92 / 28.19), dwdn (0.92 / 25.04); wiener (1.00 / 85.45), kerunc (0.99 / 34.89), usrnet (1.00 / 47.59), dwdn (0.99 / 33.58)

Table 1. Average processing time per image (30 runs)

| Model | CPU, 41×41 blur kernel | CPU, 256×256 blur kernel | GPU, 41×41 blur kernel | GPU, 256×256 blur kernel |
|---|---|---|---|---|
| Wiener | 3.1 ms | 56.3 ms | - | - |
| USRNet | 2.16 s | 2.19 s | 77.5 ms | 77.5 ms |
| KerUnc | 1.79 s | 1.75 s | 118 ms | 94.5 ms |
| DWDN | 3.07 s | 15.1 s | 54 ms | 124 ms |

Computer Optics, 2024, Vol. 48, No. 4. DOI: 10.18287/2412-6179-CO-1409

Practical recommendations

As demonstrated in the previous sections, many aspects that matter for practical applications of NBID, such as processing speed and the ability to handle different types of blur, still leave room for improvement. Moreover, unlike the field of noisy image filtering, where algorithmic methods are still being actively developed [35], current work on NBID focuses mostly on neural network models. Based on the experiments reported above, we propose several recommendations that do not require changes to the model architecture and can potentially improve the quality of neural-network NBID:

1. Model the blurring process correctly by pre-linearizing the image before convolution.

2. Include different types of blur kernels in the training set, which may extend the limits of the model's applicability. The USRNet model serves as an illustrative example: it was trained on motion blur and Gaussian blur kernels, but also shows satisfactory quality on eye blur kernels. Training the model on Gaussian blur is likely what enables it to handle eye blur effectively, given the similar shapes of the two kernel types. Increasing the size of the training kernels can also help the models to better handle stronger blur.

3. Use more diverse and larger datasets. The experiments showed that neural network models trained on datasets containing mainly real-world images perform worse, on average, on images containing text or icons.

4. Include a kernel distortion procedure in the training pipeline, as implemented for the KerUnc model, which can help to improve the robustness. Some blurred images can also be simulated with a nonuniform blur model.

5. Develop more computationally efficient models that do not require GPUs for inference. This can be done by using modern model compression techniques such as quantization [36], which can significantly speed up the model inference without sacrificing the quality.
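As an illustration of the first recommendation, the sketch below linearizes an image before convolution and re-applies the nonlinearity afterwards. It assumes that the image is sRGB-encoded with values in [0, 1] and uses the standard sRGB transfer function as a stand-in for the camera response; a real camera response function may differ. The function names are hypothetical, and the blur uses circular (FFT) boundary handling for brevity.

```python
import numpy as np


def srgb_to_linear(c):
    """Inverse sRGB transfer function (assumes values in [0, 1])."""
    c = np.asarray(c, dtype=float)
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)


def linear_to_srgb(lin):
    """Forward sRGB transfer function; input is clipped to [0, 1] first."""
    lin = np.clip(np.asarray(lin, dtype=float), 0.0, 1.0)
    return np.where(lin <= 0.0031308,
                    12.92 * lin,
                    1.055 * lin ** (1.0 / 2.4) - 0.055)


def circular_blur(img, k):
    """Convolve a single-channel image with kernel k via FFT (circular borders)."""
    kh, kw = k.shape
    padded = np.zeros(img.shape, dtype=float)
    padded[:kh, :kw] = k
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(padded)))


def blur_linearized(img_srgb, k):
    """Simulate blur in linear light: decode, convolve, then encode again."""
    return linear_to_srgb(circular_blur(srgb_to_linear(img_srgb), k))
```

On a high-contrast edge, `blur_linearized` produces brighter transition pixels than `circular_blur` applied directly to the encoded image, which is the kind of mismatch between training data and real blur that the recommendation aims to avoid.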

Conclusion

The study has shown that at the current stage of development of the NBID field, there is no single model that combines high-speed image processing, high-quality restoration, the ability to handle different types of blur, and the robustness to errors. The choice of the model depends on specific task conditions, as different models have their own advantages and disadvantages:

1. Computational efficiency: real-time inference with modern neural network models often requires hardware accelerators (Table 1), which are not always available in practice. In scenarios where only CPUs are available, the Wiener filter remains the only option among the considered models that can provide the required performance. However, if real-time speed is not critical, neural network models can be used, including USRNet and KerUnc, whose speed is independent of the size of the blur kernel.

2. Restoration quality and generality: in noise-free scenarios, the Wiener filter has excellent restoration quality and can handle various types of blur. In practice, however, noise is unavoidable, and the filter performs poorly compared to the neural network models, which produce more natural-looking images without visible artifacts. The KerUnc and DWDN models deblur motion-blurred images well but perform worse on the other types of blur. The USRNet model is more versatile and handles Gaussian blur and eye blur kernels much better.

3. Robustness: the KerUnc and DWDN models were the most robust to errors in blur kernel estimation as well as to non-uniform blur. Although restoration quality drops in these cases, the drop is less severe than with the Wiener filter and USRNet, which do not account for possible distortions in their image formation and deconvolution models.

Therefore, further developments in the area of non-blind image deconvolution are still relevant. Progress in this area is primarily associated with increased robustness to noise and various distortions.

Acknowledgement

This work was supported by the Russian Science Foundation (Project No. 20-61-47089).

References

[1] Wang H, Sreejith S, Lin Y, Ramachandra N, Slosar A, Yoo S. Neural network based point spread function deconvolution for astronomical applications. arXiv Preprint. 2022. Source: <https://arxiv.org/abs/2210.01666>. DOI: 10.48550/arXiv.2210.01666.

[2] Strohl F, Kaminski C. A joint Richardson-Lucy deconvolution algorithm for the reconstruction of multifocal structured illumination microscopy data. Methods Appl Fluoresc 2015; 3(1): 014002. DOI: 10.1088/2050-6120/3/1/014002.

[3] Agarwal S, Singh OP, Nagaria D. Deblurring of MRI image using blind and non-blind deconvolution methods. Biomed Pharmacol J 2017; 9(10): 1409-1413. DOI: 10.13005/bpj/1246.

[4] Chakrabarti A. A neural approach to blind motion deblurring. In Book: Leibe B, Matas J, Sebe N, Welling M, eds. Computer Vision - ECCV 2016. Cham: Springer International Publishing AG; 2016: 221-235. DOI: 10.1007/978-3-319-46487-9_14.

[5] Zhang K, Van Gool L, Timofte R. Deep unfolding network for image super-resolution. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition 2020: 3214-3223. DOI: 10.1109/CVPR42600.2020.00328.

[6] Dong J, Roth S, Schiele B. Deep Wiener deconvolution: Wiener meets deep learning for image deblurring. 34th Conf on Neural Information Processing Systems (NeurIPS 2020) 2020. DOI: 10.48550/arXiv.2103.09962.

[7] Nan Y, Ji H. Deep learning for handling kernel/model uncertainty in image deconvolution. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition 2020: 2385-2394. DOI: 10.1109/CVPR42600.2020.00246.

[8] Levin A, Weiss Y, Durand F, Freeman WT. Understanding and evaluating blind deconvolution algorithms. IEEE Conf on Computer Vision and Pattern Recognition 2009: 1964-1971. DOI: 10.1109/CVPR.2009.5206815.

[9] Sun L, Cho S, Wang J, Hays J. Edge-based blur kernel estimation using patch priors. IEEE Int Conf on Computational Photography (ICCP) 2013: 1-8. DOI: 10.1109/ICCPhot.2013.6528301.

[10] Bhandari A, Kadambi A, Raskar R. Computational Imaging. MIT Press; 2022. ISBN: 9780262046473.

[11] Tai Y, Chen X, Kim S, et al. Nonlinear camera response functions and image deblurring: Theoretical analysis and practice. IEEE Trans Pattern Anal Mach Intell 2013; 35(10): 2498-2512. DOI: 10.1109/TPAMI.2013.40.

[12] Fergus R, Singh B, Hertzmann A, et al. Removing camera shake from a single photograph. ACM Trans Graph 2006; 25(3): 787-794. DOI: 10.1145/1141911.1141956.

[13] Anger J, Facciolo G, Delbracio M. Modeling realistic degradations in non-blind deconvolution. 2018 25th IEEE Int Conf on Image Processing (ICIP) 2018: 978-982. DOI: 10.1109/ICIP.2018.8451115.

[14] Whyte O, Sivic J, Zisserman A. Deblurring shaken and partially saturated images. Int J Comput Vis 2014; 110(2): 185-201. DOI: 10.1007/s11263-014-0727-3.

[15] Murli A, D'Amore L, De Simone V. The Wiener filter and regularization methods for image restoration problems. Proc 10th Int Conf on Image Analysis and Processing 1999: 394-399. DOI: 10.1109/ICIAP.1999.797627.

[16] Lee J, Ho Y. High-quality non-blind image deconvolution with adaptive regularization. J Vis Commun Image Represent 2011; 22(7): 653-663. DOI: 10.1016/j.jvcir.2011.07.010.

[17] Bioucas-Dias JM, Figueiredo MAT, Oliveira JP. Total variation-based image deconvolution: A majorization-minimization approach. 2006 IEEE Int Conf on Acoustics Speech and Signal Processing Proceedings 2006; 2: II-II. DOI: 10.1109/ICASSP.2006.1660479.

[18] Lucy LB. An iterative technique for the rectification of observed distributions. Astron J 1974; 79: 745-754. DOI: 10.1086/111605.

[19] Richardson WH. Bayesian-based iterative method of image restoration. J Opt Soc Am 1972; 62(1): 55-59. DOI: 10.1364/JOSA.62.000055.

[20] Huang H, Ma S. Gradient-based image deconvolution. J Electron Imaging 2013; 22(1): 013006. DOI: 10.1117/1.JEI.22.1.013006.

[21] Karnaukhov VN, Mozerov MG. Restoration of multispectral images by the gradient reconstruction method and estimation of the blur parameters on the basis of the multipurpose matching model. J Commun Technol Electron 2016; 61(12): 1426-1431. DOI: 10.1134/S106422691612010X.

[22] Cascarano P, Sebastiani A, Comes MC, Franchini G, Porta F. Combining weighted total variation and deep image prior for natural and medical image restoration via ADMM. 2021 21st Int Conf on Computational Science and its Applications (ICCSA) 2021: 39-46. DOI: 10.1109/ICCSA54496.2021.00016.

[23] Schuler C, Burger H, Harmeling S, Scholkopf B. A machine learning approach for non-blind image deconvolution. IEEE Computer Society Conf on Computer Vision and Pattern Recognition (CVPR) 2013: 1067-1074. DOI: 10.1109/CVPR.2013.142.

[24] Xu L, Ren J, Liu C, Jia J. Deep convolutional neural network for image deconvolution. Adv Neural Inf Process Syst 2014; 27: 1790-1798.

[25] Gong D, Zhang Z, Shi Q, Hengel A, Shen C, Zhang Y. Learning deep gradient descent optimization for image deconvolution. IEEE Trans Neural Netw Learn Syst 2020; 31(12): 5468-5482. DOI: 10.1109/TNNLS.2020.2968289.

[26] Agarwal C, Khobahi S, Bose A, et al. DEEP-URL: A model-aware approach to blind deconvolution based on deep unfolded Richardson-Lucy network. 2020 IEEE International Conference on Image Processing (ICIP) 2020: 3299-3303. DOI: 10.1109/ICIP40778.2020.9190825.

[27] Mou C, Wang Q, Zhang J. Deep generalized unfolding networks for image restoration. arXiv Preprint. 2022. Source: <https://arxiv.org/abs/2204.13348>. DOI: 10.48550/arXiv.2204.13348.

[28] Lai W-S, Huang J-B, Hu Z, et al. A comparative study for single image blind deblurring. 29th IEEE Conf on Computer Vision and Pattern Recognition (CVPR 2016) 2016: 1701-1709. DOI: 10.1109/CVPR.2016.188.

[29] Alkzir NB, Nikolaev IP, Nikolaev DP. SCA-2023: a two-part dataset for benchmarking the methods of image precompensation for users with refractive errors. 37th ECMS Int Conf on Modelling and Simulation 2023: 298-305. DOI: 10.7148/2023-0298.

[30] Hore A, Ziou D. Image quality metrics: PSNR vs. SSIM. 2010 20th Int Conf on Pattern Recognition 2010: 2366-2369. DOI: 10.1109/ICPR.2010.579.

[31] Wang Z, Bovik AC, Sheikh HR, Simoncelli EP. Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process 2004; 13(4): 600-612. DOI: 10.1109/tip.2003.819861.

[32] Wang Z, Bovik AC. Mean squared error: Love it or leave it? A new look at Signal Fidelity Measures. IEEE Signal Process Mag 2009; 26(1): 98-117. DOI: 10.1109/MSP.2008.930649.

[33] Kotevski Z, Mitrevski P. Experimental comparison of PSNR and SSIM metrics for video quality estimation. In Book: Davcev D, Gomez JM, eds. ICT Innovations 2009. Berlin, Heidelberg: Springer-Verlag; 2010: 357-366. DOI: 10.1007/978-3-642-10781-8_37.

[34] Kober VI, Karnaukhov VN. Restoration of multispectral images distorted by spatially nonuniform camera motion. J Commun Technol Electron 2015; 60(12): 1366-1371. DOI: 10.1134/S1064226915120153.

[35] Andriyanov NA, Dementiev VE, Vasiliev KK. Developing a filtering algorithm for doubly stochastic images based on models with multiple roots of characteristic equations. Pattern Recognit Image Anal 2019; 29(1): 10-20. DOI: 10.1134/S1054661819010048.

[36] Sher A, Trusov A, Limonova E, Nikolaev D, Arlazarov VV. Neuron-by-neuron quantization for efficient low-bit QNN training. Mathematics 2023; 11(9): 2112. DOI: 10.3390/math11092112.

Authors' information

Olga Borisovna Chaganova (b. 1998) graduated from Orenburg State University in 2021, majoring in Applied Mathematics. She is currently an M.Sc. student in Applied Mathematics and Physics at Moscow Institute of Physics and Technology. Her research interests include deep learning and computer vision. E-mail: [email protected]

Anton Sergeevich Grigoryev (b. 1989) graduated from Moscow Institute of Physics and Technology (MIPT) in 2012, majoring in Applied Mathematics and Informatics. He currently works as a researcher at the Institute for Information Transmission Problems (IITP RAS) and is also the Director of Technology of the AI software development company Visillect Service. His research interests are image processing and enhancement methods, autonomous robotics and software architecture. E-mail: [email protected]

Dmitry Petrovich Nikolaev (b. 1978), Ph.D. in Physics and Mathematics, graduated from Lomonosov Moscow State University (MSU) in 2000. He is the head of the vision systems laboratory at the Institute for Information Transmission Problems (IITP RAS) and the Director of Technology of Smart Engines Service LLC. His research interests are machine vision, algorithms for fast image processing, and pattern recognition. E-mail: [email protected]

Ilya Petrovich Nikolaev (b. 1972), Ph.D. in Physics and Mathematics, graduated from Lomonosov Moscow State University (MSU) in 1994, majoring in Physics. After years of working in the field of adaptive optics, he joined the vision systems laboratory at the Institute for Information Transmission Problems (IITP RAS) in 2020. His research interests are image processing and visual perception. E-mail: i.p. [email protected]

GRNTI: 28.23.37. Received August 10, 2023. The final version - December 10, 2023.
