
UDC 004

Abdrakhim D.

SDU University (Almaty, Kazakhstan)

COMPREHENSIVE REVIEW OF IMAGE DENOISING METHODS: CLASSICAL ALGORITHMS AND DEEP LEARNING APPROACHES

Abstract: In computer vision and image processing, image denoising is an essential task. Several techniques have been put forth to recover clean images from noisy ones. These techniques fall under the categories of spatial domain and frequency domain procedures, as well as local and non-local techniques. Low-rank denoising and sparse coding have recently gained popularity. This study provides an extensive review of current image denoising methods and the latest advancements in denoising algorithms, and offers recommendations for further investigation and future improvements. By examining the technical and performance differences among various denoising methods, this research seeks to deepen the understanding and enhance the implementation of image denoising in computer vision and image processing.

Keywords: image denoising, noise cancellation, image processing, denoising methods.

1. Introduction.

Despite considerable efforts to tackle challenges related to lighting, camera angles, and facial expressions, facial images can still be corrupted by noise during acquisition, quantization, compression, and other processing stages. This noise often results in a notable decrease in the accuracy of most recognition techniques. While facial recognition technology is rapidly advancing and being applied in areas such as law enforcement, video surveillance, access control, disaster preparedness, and security, its performance can be significantly compromised in uncontrolled environments.

Noise can cause problems in image transmission and reception. The aim of noise reduction is to suppress unnecessary noise while maintaining the signal's essential components. The majority of datasets collected with image sensors contain some noise. Natural disruptions, malfunctioning equipment, and issues with data collection can all affect the data. As a result, noise reduction is a critical first step in image analysis. Noise reduction techniques must be applied to prevent distortion in digital images: noisy digital images can exhibit artifacts, spurious edges, obscured lines, blurred objects, and disturbed background scenes caused by the unwanted information. To address these issues, it is crucial to examine the noise patterns first. Numerous noise models have been investigated in the literature.

Several techniques have been developed to mitigate the impact of noise in images prior to the identification stage. One approach transforms the visual signals into a representation that facilitates separating the desired information from the noise. A different approach estimates image statistics directly in the image domain. Both approaches can produce very good results; however, the image may lose some information during noise reduction, which can hinder subsequent recognition.

The advancement of deep artificial neural networks has enabled the possibility of end-to-end denoising. The objective of this study is to present a comprehensive overview and analysis of different approaches and denoising algorithms used in facial recognition software. Each method has its own advantages, limitations, and suitability depending on the type of noise and specific requirements of the facial recognition software.

2. TYPES OF NOISE.

2.1. Gaussian noise.

Gaussian noise is a common type of noise that can be observed in digital images. It is a type of electronic noise where the values added to the image pixels follow a normal (Gaussian) probability distribution, meaning the noise values have a bell-shaped curve distribution centered around zero. The noise values are randomly distributed and independent of the pixel values in the original image. The noise is additive, so the noise values are simply added to the original pixel values, and the noise has an equal distribution across all color channels for color images. The standard deviation of the Gaussian distribution determines the strength, or intensity, of the noise. Gaussian noise is often introduced during image acquisition due to factors like sensor imperfections, thermal fluctuations, and electronic interference, and it can also occur during image transmission or storage due to channel noise.

This type of noise causes a grainy or speckled appearance in the image, can reduce image contrast, and degrade fine details, with the impact generally being more noticeable in smooth, low-contrast regions of the image. Gaussian noise can be reduced using various filtering techniques, such as Gaussian blur, median filtering, or more advanced denoising algorithms, with the choice of filtering method depending on the specific application and the trade-off between noise reduction and preserving image details.

The probability density function of Gaussian noise is

$$P(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

where $P(x)$ is the Gaussian noise distribution of the image and $\mu$ and $\sigma$ stand for the mean and standard deviation, respectively.
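As a brief illustration (a minimal NumPy sketch; the helper name, noise level, and 8-bit test image are illustrative assumptions, not taken from the article), additive Gaussian noise can be simulated by drawing zero-mean normal samples and adding them to the pixel values, with the standard deviation sigma controlling the noise strength:

```python
import numpy as np

def add_gaussian_noise(image, sigma=25.0, mean=0.0, seed=None):
    """Add zero-mean additive Gaussian noise with standard deviation sigma."""
    rng = np.random.default_rng(seed)
    noisy = image.astype(np.float64) + rng.normal(mean, sigma, size=image.shape)
    # Clip back to the valid 8-bit intensity range.
    return np.clip(noisy, 0, 255).astype(np.uint8)

# Illustrative example: a flat gray test image corrupted by noise of increasing strength.
clean = np.full((64, 64), 128, dtype=np.uint8)
for s in (10, 25, 50):
    noisy = add_gaussian_noise(clean, sigma=s, seed=0)
    print(f"sigma={s}: observed std = {noisy.astype(float).std():.1f}")
```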

2.2. Salt and Pepper Noise.

"Salt and pepper noise" is a common type of noise often found in images. It appears as random white and black pixels scattered throughout the image. This type of noise can be effectively mitigated by applying techniques such as a counter harmonic mean filter, morphological filter, or median filter. Salt and pepper noise in images can occur when abrupt and incorrect transitions happen, leading to the presence of scattered bright and dark pixels.

Various denoising techniques can improve image quality by removing salt-and-pepper noise. Typical approaches include median filtering and adaptive filters. Adaptive filters detect corrupted pixels and substitute them with more precise approximations, which in some variants are derived from neural network models. These techniques reduce salt-and-pepper noise while preserving the image's structure and features.

2.3. Poisson Noise.

Poisson noise, commonly referred to as shot noise, is a form of random noise that arises when counting photons in low-light imaging conditions. It is due to the statistical fluctuation in the number of photons detected. The term "Poisson" is credited to Simeon Denis Poisson, a French mathematician who investigated the statistical characteristics of random events. Imaging methods such as X-ray imaging, astronomical imaging, and fluorescence microscopy frequently exhibit Poisson noise.

It is critical to understand the distinctions between Gaussian and Poisson noise. Poisson noise is signal-dependent: its standard deviation is determined by the square root of the average signal value. Gaussian noise, on the other hand, is typically characterized as additive and exhibits a constant standard deviation:

$$\sigma_{\text{Poisson}} = \sqrt{\mu_{\text{signal}}}, \qquad \sigma_{\text{Gaussian}} = \text{const} \qquad (3)$$

A random variable $X$ that obeys a Poisson distribution takes on only nonnegative integer values. The probability that $X = k$ is

$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \ldots$$

where $\lambda$ is a positive parameter.
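To illustrate the signal-dependent character of shot noise (a hedged NumPy sketch; the peak scaling, helper name, and gradient test image are assumptions used only for demonstration), pixel intensities can be interpreted as expected photon counts and resampled from a Poisson distribution:

```python
import numpy as np

def add_poisson_noise(image, peak=30.0, seed=None):
    """Simulate photon-counting (shot) noise: variance grows with the signal level."""
    rng = np.random.default_rng(seed)
    scaled = image.astype(np.float64) / 255.0 * peak   # expected photon counts
    counts = rng.poisson(scaled)                       # Poisson-distributed counts
    return np.clip(counts / peak * 255.0, 0, 255).astype(np.uint8)

# Darker regions (fewer photons) end up relatively noisier than bright ones.
gradient = np.tile(np.linspace(10, 250, 256), (64, 1)).astype(np.uint8)
noisy = add_poisson_noise(gradient, peak=20, seed=0)
```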

2.4. Speckle Noise.

Minuscule reflections within internal organs can give images a rough, grainy appearance, a phenomenon known as speckle. This makes it harder for observers to discern subtle details during diagnosis. Random interference among the coherent returns produces this type of noise, which follows a gamma distribution and is found in many coherent imaging systems, including SAR and ultrasound images. Automated image processing is therefore required to remove this noise while maintaining edge details and image quality, and various denoising techniques have been studied to address this problem.

$$g(n, m) = f(n, m)\,u(n, m) + \xi(n, m)$$

where $g(n, m)$ is the observed image corrupted by speckle noise, $f(n, m)$ is the original image, $u(n, m)$ is a multiplicative component representing local illumination or reflectivity variations, and $\xi(n, m)$ is an additive component representing random noise.

3. Classification of image denoising techniques.

Image denoising methods can be broadly categorized into spatial and transform domain filtering. Spatial domain techniques aim to remove noise by considering the relationship between pixels and image patches in the original image, determining the gray value of each pixel based on this information.

3.1. Spatial domain filtering.

In image processing, spatial domain methods for denoising aim to remove noise by calculating the gray value of each pixel based on the correlation between pixels within image patches in the original image. Spatial domain denoising techniques can be categorized into linear and nonlinear filters. Linear filters, such as mean filtering, are commonly used for reducing Gaussian noise [5]. However, they can lead to excessive smoothing and compromise texture preservation in images. To tackle this issue, Wiener filtering was introduced [4], but it has the drawback of blurring sharp edges.

Nonlinear filters, such as weighted median filtering and median filtering, offer effective noise reduction without introducing artifacts. Among them, bilateral filtering is a commonly utilized nonlinear filter that preserves edges while smoothing the image. It replaces the intensity value of each pixel with a weighted average of neighboring pixel intensities. However, bilateral filtering may suffer from computational inefficiency, especially when larger kernel sizes are used, which can impact its practical applicability.

Spatial filters, which are commonly used for denoising, operate on groups of pixels through low-pass filtering. These filters take advantage of the fact that noise is predominantly present in higher frequencies. However, a drawback of spatial filters is that they can cause image blurring and the loss of well-defined edges. By directly manipulating the pixel values in the spatial domain, these filters aim to reduce noise while preserving visual details. Some frequently employed spatial filters for denoising include mean, median, Gaussian, and adaptive filters [5]. These filters are applied to each pixel or a small neighborhood of pixels, allowing for noise reduction while maintaining important visual information.

A. Median Filter.

The median filter operates by utilizing a sliding window approach. A kernel, typically of size 3x3, 5x5, or 7x7, is applied to cover the entire image. The central pixel within the window is then replaced with the median value [5], which is computed from the pixel values within the window. The median filter is advantageous because it is less affected by the presence of outliers in the neighborhood, making it more robust than averaging. Additionally, the median filter avoids introducing spurious pixels at region boundaries since the median value must be one of the existing pixel values. As a result, the median filter preserves sharp edges better than the averaging filter [7].
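A brief sketch of the sliding-window median described above (assuming NumPy and SciPy are available; the noise density, test image, and 3x3 window are illustrative choices rather than values from the article):

```python
import numpy as np
from scipy.ndimage import median_filter

def add_salt_and_pepper(image, density=0.05, seed=None):
    """Corrupt a fraction `density` of pixels with extreme black/white values."""
    rng = np.random.default_rng(seed)
    noisy = image.copy()
    mask = rng.random(image.shape)
    noisy[mask < density / 2] = 0          # pepper
    noisy[mask > 1 - density / 2] = 255    # salt
    return noisy

clean = np.full((128, 128), 120, dtype=np.uint8)
noisy = add_salt_and_pepper(clean, density=0.1, seed=0)

# 3x3 sliding window: each pixel is replaced by the median of its neighborhood.
denoised = median_filter(noisy, size=3)
print("corrupted pixels remaining:", int(np.count_nonzero(denoised != clean)))
```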

B. Gaussian filter.

The Gaussian filter is a useful instrument commonly employed in design and photography. It can be likened to an overlay of a translucent material, like parchment, that imparts a softening effect to the image, as described by photographer Kenton Waltz. The Gaussian filter utilizes the Gaussian distribution, also known as the normal distribution, which is a mathematical function used to generate the blur effect in the image.

The Gaussian blur is a valuable tool for designers and photographers, finding applications in various scenarios. One notable application is reducing noise or graininess in low-light photography. By applying a mathematical function to each pixel, the Gaussian blur smooths the image and imparts a softer, more uniform appearance. This helps to minimize the visible noise and create a visually pleasing result.

The Gaussian filter is a prevalent image processing technique that diminishes noise and smooths images. It functions by applying a mathematical function to each pixel within the image. This process involves convolving the image with a kernel, or matrix, of a particular size to achieve the desired level of smoothness; the degree of smoothing is directly related to the size of the kernel used. The kernel values follow the Gaussian function, dropping outward from the center, with the central element carrying the largest weight. This means that pixels near the center have a greater impact on the result than those farther out. The new value for each pixel is computed by multiplying the neighboring pixel values by the corresponding kernel values and summing the products.
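The kernel construction described above can be sketched as follows (a minimal example assuming NumPy and SciPy; the 5x5 size, sigma value, and random test image are arbitrary illustrative choices):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size=5, sigma=1.0):
    """Build a normalized 2-D Gaussian kernel; weights decay away from the center."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return kernel / kernel.sum()   # normalize so overall brightness is preserved

kernel = gaussian_kernel(size=5, sigma=1.0)
image = np.random.default_rng(0).integers(0, 256, (64, 64)).astype(np.float64)
# Each output pixel becomes a Gaussian-weighted average of its neighborhood.
smoothed = convolve2d(image, kernel, mode="same", boundary="symm")
```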

C. Adaptive Filter.

An image processing technique called adaptive image denoising examines the characteristics of each pixel's immediate surroundings before applying specific filters to minimize noise. By taking the neighborhood's local attributes into account when filtering, the adaptive filter can more effectively reduce the negative effects of noise on the image. For this reason, adaptive filters excel at eliminating different kinds of noise, particularly when the characteristics of the noise vary across an image.

D. Variational denoising methods.

Denoising algorithms commonly employed today minimize an energy function (E) to compute the denoised image (x). These algorithms rely on image priors, where a noisy image (y) is used to derive the energy function (E). By mapping low values of E to noise-free images, the energy function is minimized to obtain the denoised image (x).

The variational denoising methods described in Equation (4) are derived from the maximum a posteriori (MAP) probability estimate. From a Bayesian standpoint, the MAP estimate of x can be understood as finding the most probable denoised image given the observed noisy image and the prior information:

$$\hat{x} = \arg\max_{x} \log P(x \mid y) = \arg\min_{x} \left\{ \frac{1}{2\sigma^2}\,\|y - x\|_2^2 + \lambda R(x) \right\} \qquad (4)$$

where the first term measures fidelity to the noisy observation $y$ and $R(x)$ encodes the image prior.

E. Total variation regularization.

Total Variation (TV) Regularization is a powerful technique used in image denoising and other image processing tasks. It is a Partial Differential Equation (PDE)-based method that exploits the sparsity and piecewise smoothness of natural images.

The main idea behind TV regularization is to minimize the total variation of the denoised image while preserving important features such as edges and textures [10]. The method promotes a smooth image by reducing the overall variation, which efficiently suppresses noise while preserving significant edges.

Mathematically, the total variation of an image $x$ is defined as [8]:

$$R_{\mathrm{TV}}(x) = \|\nabla x\|_1 \qquad (5)$$

where $\nabla x$ is the gradient of $x$.

TV regularization is a robust image denoising technique known for several key properties and advantages. It effectively preserves sharp edges and crucial image features by minimizing the total variation, i.e. the L1 norm of the gradient, of the image. This approach helps maintain edge integrity while avoiding blurring. TV regularization operates under the assumption that the denoised image should be piecewise smooth, featuring sharp transitions only at edges, a characteristic commonly found in natural images. This makes it particularly suited for images that require clear delineation of boundaries without smoothing out necessary details. Additionally, TV regularization is versatile, capable of handling various types of noise, including Gaussian, salt-and-pepper, and speckle noise, thanks to its L1-based formulation. The technique's practical application is supported by efficient numerical algorithms such as the split Bregman method and the primal-dual algorithm. These algorithms make TV minimization computationally feasible for large-scale image processing tasks, enhancing its usefulness in real-world applications where computational efficiency is paramount [8,9].

Optimization techniques like convex optimization and gradient descent are then used to minimize this overall cost function. This approach allows the denoising or reconstruction process to reduce noise while preserving the essential edges and features of the image. Total variation regularization is widely used for tasks like inpainting, deblurring, compressed sensing, and denoising because it effectively reduces noise and artifacts while retaining important details. However, it's crucial to find the right balance, as too much regularization can lead to over-smoothing, causing small details to be lost.
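As a usage sketch (assuming scikit-image is installed; its Chambolle solver is one of several algorithms for TV minimization, and the noise level and weight value here are only illustrative):

```python
import numpy as np
from skimage import data, img_as_float
from skimage.restoration import denoise_tv_chambolle

image = img_as_float(data.camera())
rng = np.random.default_rng(0)
noisy = np.clip(image + rng.normal(0, 0.1, image.shape), 0, 1)

# `weight` is the regularization strength: larger values smooth more aggressively
# (stronger denoising) at the risk of washing out fine detail.
denoised = denoise_tv_chambolle(noisy, weight=0.1)
```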

F. Non-local regularization.

Non-local regularization represents a departure from the conventional, localized filtering approaches commonly employed in image denoising tasks. This technique exploits the inherent non-local self-similarity that is prevalent in natural images. Rather than relying solely on the immediate spatial neighborhood of a pixel, non-local methods leverage the repetitive patterns and similar structures that exist across the entirety of the image, even at distant spatial locations, to effectively denoise the image.

The core principle underlying non-local regularization is the utilization of information from the entire image, rather than restricting the denoising process to the local pixel vicinity. By considering the global image context, non-local techniques are able to better preserve fine details and textures, which can often be compromised by traditional local filtering approaches [13-14].

Furthermore, non-local regularization methods have demonstrated robustness to various noise types, including Gaussian, Poisson, and mixed noise. A key advantage of these techniques is their ability to adapt to the local image structure, making them more effective in handling complex and heterogeneous image content.

Some prominent non-local regularization algorithms employed in image denoising include non-local means (NLM) filtering, sparse coding and dictionary learning, and block-matching and 3D filtering (BM3D). These methods have been widely adopted in numerous image processing applications, such as denoising, super-resolution, inpainting, and compressed sensing, due to their effectiveness in preserving important image features while effectively mitigating the presence of noise [13-15].

Non-local regularization can be conceptualized mathematically as an optimization problem involving the minimization of a cost function. Typically, this cost function consists of two components: a measure of how well the reconstructed image corresponds to the observed data, and a component that promotes coherence and similarity across image patches. The aim is to find the right balance between these two factors in order to obtain the final reconstructed image. The fundamental concept behind non-local regularization is to construct a pixel-wise estimate of the image: each pixel is calculated as a weighted average of pixels within regions that closely match the region centered around the estimated pixel. In other words, instead of considering only local neighborhoods, non-local regularization incorporates information from similar regions across the entire image to generate a more accurate estimate at each pixel location. This approach helps capture global patterns and dependencies, leading to improved denoising and preservation of image details.

$\mathrm{NLM}(x_i)$ denotes the NLM-filtered value of a specific pixel $x_i$ in an image [11]. Let $P(x_i)$ and $P(x_j)$ represent the patches centered at $x_i$ and $x_j$, respectively. The weight of $x_j$ relative to $x_i$, denoted $w(x_i, x_j)$, is calculated as follows:

$$\mathrm{NLM}(x_i) = \sum_{j} w(x_i, x_j)\, x_j, \qquad w(x_i, x_j) = \frac{1}{C_i}\exp\!\left(-\frac{\|P(x_i) - P(x_j)\|_2^2}{h^2}\right)$$

Here, the weight decays with the squared Euclidean distance between the patches, normalized by the smoothing parameter $h$ and scaled by the normalizing constant $C_i$.

Regularization techniques built on this estimation of pixel similarities, which forms the initial phase of NLM [11,12], are what formalize the non-local self-similarity (NSS) prior [10].

Non-local regularization has shown promise in various image processing applications, including deblurring, inpainting, super-resolution, and denoising. In situations with intricate structures, textures, and patterns, where local approaches might find it difficult to capture overall coherence, it performs exceptionally well. Still, due to its reliance on comparing image patches across the entire image, non-local regularization can be computationally demanding. To address this challenge, researchers have developed efficient approximations and algorithms that enable the practical implementation of non-local regularization in real-world settings. These advancements have made it possible to harness the benefits of non-local regularization while managing computational complexity.
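A minimal usage sketch of NLM filtering (assuming scikit-image; the patch size, search distance, and the 1.15-times-sigma choice of the smoothing parameter h are illustrative assumptions rather than values prescribed by the article):

```python
import numpy as np
from skimage import data, img_as_float
from skimage.restoration import denoise_nl_means, estimate_sigma

image = img_as_float(data.camera())
noisy = np.clip(image + np.random.default_rng(0).normal(0, 0.08, image.shape), 0, 1)

sigma_est = float(np.mean(estimate_sigma(noisy)))
# Patches (patch_size) are compared within a limited search window (patch_distance)
# rather than across the whole image, which keeps the computation tractable.
denoised = denoise_nl_means(noisy, h=1.15 * sigma_est,
                            patch_size=5, patch_distance=6, fast_mode=True)
```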

G. Sparse representation.

Sparse representation is a powerful image processing technique that has gained significant attention in the field of image denoising and other applications. The core idea behind sparse representation is to represent the image as a linear combination of a small number of basis functions, or atoms, from an overcomplete dictionary.

The first step in sparse representation [19] is to learn an overcomplete dictionary of image patches or atoms that can effectively represent the image content. This dictionary is typically learned from a set of training image patches using optimization techniques, such as the K-SVD algorithm. Given the learned dictionary, the image is then represented as a linear combination of a small number of dictionary atoms. This sparse coding process is formulated as an optimization problem, where the goal is to find the sparsest representation of the image while preserving important features.

The sparse representation of the noisy image is then used to denoise the image. This is typically done by assuming that the clean image can be well-approximated by a sparse linear combination of the dictionary atoms, while the noise component is not sparse in the dictionary. Sparse representation is a versatile technique that can be adapted to handle various types of noise and image content, and the learned dictionary can be tailored to specific application domains.

Despite the complexity of the optimization problems involved, efficient algorithms have been developed for sparse representation, making it computationally feasible for practical image processing tasks. Sparse representation has been successfully applied to a wide range of image processing applications, including denoising, super-resolution, inpainting, and compressed sensing. It has demonstrated superior performance compared to traditional image processing techniques, particularly in preserving important image features and edges while effectively removing noise. The combination of sparse representation with other image processing techniques, such as non-local methods and total variation regularization, has led to even more advanced and effective image denoising algorithms [19-21].

Sparse representations can be obtained in a variety of ways. Basis pursuit is a popular approach that frames the problem as an optimization task of finding the sparsest representation within predetermined bounds. This can be accomplished using techniques such as basis pursuit denoising (BPDN) and L1 regularization, also referred to as the Lasso.

Dictionary learning is a further method that creates sparse representations by learning a dictionary, or set of basis elements, from the data itself. Techniques such as sparse coding and K-SVD [17,18] are commonly applied to dictionary learning. Sparse representation has many advantages: by lowering the number of dimensions, it can facilitate the understanding of data. One representative piece of work is the non-locally centralized sparse representation (NCSR) model:

$$\hat{\alpha}_y = \arg\min_{\alpha} \left\{ \|y - D\alpha\|_2^2 + \lambda \sum_{i} \|\alpha_i - \beta_i\|_1 \right\}$$

where $D$ is the learned dictionary and $\beta_i$ is a non-local estimate of the sparse code $\alpha_i$.

Sparse representation also makes the data more resilient to noise and variability, allows signals to be transmitted and stored more efficiently, and enhances signal processing and analysis. However, selecting the best sparse representation can be challenging: choosing the right basis functions or dictionaries is crucial for obtaining good results.
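The patch-based dictionary-learning pipeline described above can be sketched roughly as follows (assuming a recent scikit-learn; the 7x7 patches, 100 atoms, two-atom OMP coding, and the random array standing in for a real image are illustrative assumptions, not the article's settings):

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import (extract_patches_2d,
                                              reconstruct_from_patches_2d)

rng = np.random.default_rng(0)
clean = rng.random((128, 128))                      # stand-in for a grayscale image in [0, 1]
noisy = clean + rng.normal(0, 0.1, clean.shape)

# 1. Extract small patches and remove their per-patch mean (DC component).
patch_size = (7, 7)
patches = extract_patches_2d(noisy, patch_size, max_patches=2000, random_state=0)
train = patches.reshape(len(patches), -1)
train -= train.mean(axis=1, keepdims=True)

# 2. Learn an overcomplete dictionary from the noisy patches.
dico = MiniBatchDictionaryLearning(n_components=100, alpha=1.0,
                                   max_iter=200, random_state=0)
dico.fit(train)

# 3. Sparse-code every patch with a few atoms (OMP) and rebuild the image.
all_patches = extract_patches_2d(noisy, patch_size)
flat = all_patches.reshape(len(all_patches), -1)
mu = flat.mean(axis=1, keepdims=True)
dico.set_params(transform_algorithm="omp", transform_n_nonzero_coefs=2)
codes = dico.transform(flat - mu)
recon = (codes @ dico.components_) + mu
denoised = reconstruct_from_patches_2d(recon.reshape(all_patches.shape), noisy.shape)
```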

H. Low-rank minimization.

Several fields, including signal processing, computer vision, and machine learning, use a technique known as low-rank minimization.

This method aids in reconstructing or estimating a low-rank matrix from corrupted or incomplete data. The objective is to determine the low-rank matrix that best approximates the observed data under specific criteria. Many data matrices encountered in practice can be accurately approximated by a matrix of low rank [22-25], which suggests that the fundamental structure of the data can be explained by a small number of underlying components. Exploiting this low-rank characteristic, low-rank minimization recovers or approximates the original matrix from noisy or incomplete observations.

Low-rank minimization is a challenging problem, characterized as non-convex and NP-hard. Methods based on Nuclear Norm Minimization (NNM) instead strive to identify the lowest-rank approximation, denoted X, of an observed matrix Y. Assuming Y represents a noisy patch matrix, the low-rank matrix X can be derived from Y using the NNM technique [26,27] as:

$$\hat{X} = \arg\min_{X} \|Y - X\|_F^2 + \lambda \|X\|_*$$

where $\|X\|_*$ denotes the nuclear norm of $X$ (the sum of its singular values) and $\lambda$ is a regularization parameter.

The fundamental concept behind low-rank minimization involves using the available data to address an optimization problem that favors a low-rank matrix. This problem generally comprises two components: a data fidelity term and a regularization term. The data fidelity term assesses the accuracy with which the estimated matrix matches the actual data, while the regularization term promotes solutions with a low rank.

Various methods and algorithms are available for low-rank minimization, tailored to the specific situation and its constraints.
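One such algorithm can be sketched with singular value thresholding, the proximal operator of the nuclear norm (NumPy only; the threshold value and the synthetic rank-5 matrix are illustrative assumptions):

```python
import numpy as np

def svt_denoise(Y, tau):
    """Soft-threshold the singular values of Y, yielding a low-rank estimate X."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)        # shrink small singular values to zero
    return (U * s_shrunk) @ Vt

# A genuinely low-rank matrix (e.g., stacked similar patches) plus Gaussian noise.
rng = np.random.default_rng(0)
low_rank = rng.random((64, 5)) @ rng.random((5, 64))   # rank-5 ground truth
Y = low_rank + rng.normal(0, 0.1, low_rank.shape)
X_hat = svt_denoise(Y, tau=1.0)
print("estimated rank:", np.linalg.matrix_rank(X_hat, tol=1e-6))
```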

3.2. Transform domain filtering.

Transform domain filtering is an important image processing technique that leverages the power of signal transformations to effectively denoise and enhance images. The core idea behind transform domain filtering is to represent the image in a transformed domain [27-30], where the noise and image features exhibit distinct characteristics, and then apply targeted filtering operations in this transformed domain.

3.2.1 Data adaptive transform.

Data-adaptive transform is a powerful image processing technique that goes beyond the traditional fixed, predefined transform domains, such as Fourier or wavelet transforms. The key idea behind data-adaptive transform is to learn a transform that is tailored to the specific characteristics of the input image or dataset, allowing for more effective and efficient representation and processing of the image content.

Unlike fixed transforms, which are designed to have desirable mathematical properties but may not be optimally suited for a particular image or application, data-adaptive transforms are learned directly from the data. This learning process involves finding a transform that can sparsely represent the image content, effectively separating the signal from the noise or other undesirable components.

The advantages of data-adaptive transform in image processing tasks, such as denoising, include improved sparse representation, where the image can be represented using fewer non-zero coefficients, leading to more effective sparse coding and enhanced feature preservation. It also enables better noise separation, as the data-adaptive transform can better separate the image content from the noise, allowing for more targeted and effective filtering in the transform domain. Additionally, the adaptability to image characteristics allows the learned transform to adapt to the specific properties of the input image, such as its texture, structure, and local features, leading to superior performance compared to fixed transform-based methods. Furthermore, data-adaptive transforms can often be implemented in a more computationally efficient manner, as the transform itself is optimized for the input data, reducing the overall computational burden [31-32].

Some prominent examples of data-adaptive transform techniques in image processing include Principal Component Analysis (PCA)-based transforms, where PCA is used to learn a data-adaptive orthogonal transform that maximizes the variance of the input image data, as well as sparse coding and dictionary learning, where the transform is learned as part of an optimization process that jointly learns a sparse representation of the image and an overcomplete dictionary of image atoms. Additionally, Convolutional Neural Networks (CNNs) have demonstrated their ability to learn data-adaptive transforms that are tailored to specific image processing tasks, such as denoising and super-resolution.

Data-adaptive transform techniques have demonstrated their effectiveness in a variety of image processing applications, including denoising, compression, super-resolution, and feature extraction. By leveraging the inherent structure and characteristics of the input data, these methods can outperform traditional fixed transform-based approaches, making them an important and actively researched topic in the field of image processing.

A. Wavelet transform.

The wavelet transform is a sophisticated signal processing tool that is extensively used in diverse domains such as image processing, signal analysis, and data compression. In contrast to the Fourier transform, which breaks down a signal into its frequency components, the wavelet transform offers a time-frequency representation, enabling analysis of both temporal and spectral aspects of the signal [29].

A key attribute of the wavelet transform is its ability to express a signal with a series of basis functions known as wavelets, which are localized in time and frequency. This characteristic allows the wavelet transform to effectively capture both broad and detailed features of a signal, making it ideal for analyzing non-stationary and transient signals.

Within the realm of image processing, the wavelet transform is widely utilized for tasks like image denoising, compression, and feature extraction. It works by decomposing an input image into wavelet subbands, which represent different scales and orientations of the image content [27-30]. This decomposition generally involves applying a sequence of low-pass and high-pass filters, followed by downsampling. Each wavelet subband is then processed separately, often with specialized techniques adapted to the specific properties of that subband. For instance, in image denoising, high-frequency subbands, which typically contain more noise, may be reduced while preserving the low-frequency subbands that hold critical image details. After processing, these subbands are reassembled using an inverse wavelet transform to produce the final image.
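A compact sketch of the subband-thresholding idea just described (assuming the PyWavelets package; the db4 wavelet, decomposition level, universal threshold, and synthetic piecewise-constant test image are common but illustrative choices):

```python
import numpy as np
import pywt

def wavelet_denoise(image, wavelet="db4", level=3, sigma=0.1):
    """Decompose, soft-threshold the detail subbands, and reconstruct."""
    coeffs = pywt.wavedec2(image, wavelet, level=level)
    # Universal threshold; the approximation (low-frequency) subband is kept intact.
    thr = sigma * np.sqrt(2 * np.log(image.size))
    new_coeffs = [coeffs[0]] + [
        tuple(pywt.threshold(d, thr, mode="soft") for d in detail)
        for detail in coeffs[1:]
    ]
    return pywt.waverec2(new_coeffs, wavelet)

rng = np.random.default_rng(0)
clean = np.kron(rng.random((16, 16)), np.ones((8, 8)))   # piecewise-constant test image
noisy = clean + rng.normal(0, 0.1, clean.shape)
denoised = wavelet_denoise(noisy, sigma=0.1)
```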

The flexibility and wide-ranging utility of the wavelet transform have established it as a favored choice in many image processing applications. Advantages of using wavelet-based methods include multiresolution analysis, effective separation of noise, efficient compression capabilities, and robust feature extraction. The wavelet transform continues to be a focal point of extensive research and development, leading to the creation of various wavelet families, filter design techniques, and sophisticated wavelet-based algorithms. Its comprehensive applicability and efficiency have solidified its role as an essential tool in image processing, with uses extending from image enhancement and compression to object detection and recognition.

3.2.2. Non-data adaptive transform.


Non-data-adaptive transforms, often referred to as fixed transforms, constitute a class of signal processing techniques that utilize pre-established, unchanging basis functions to represent and manipulate data. These techniques are engineered to exhibit advantageous mathematical characteristics and are extensively applied across diverse sectors such as image processing, signal analysis, and data compression.

Unlike data-adaptive transforms that determine the transformation basis from the incoming data, non-data-adaptive transforms operate using a predefined set of basis functions that do not adjust according to the particular traits of the input data. Prominent examples of such transforms include the Fourier transform, the Discrete Cosine Transform (DCT), and the Discrete Wavelet Transform (DWT).

The Fourier transform [33], for instance, models a signal using a series of complex exponential functions, inherently global and fitting for the analysis of stationary, periodic signals. Conversely, the DCT, a real-valued transform that employs cosine basis functions, is predominantly used in image and video compression standards like JPEG and MPEG due to its energy compaction capabilities. The DWT, while also non-data-adaptive, provides a multi-scale signal representation that captures both global and local features, albeit with preset wavelet basis functions that do not modify according to the input data [34].

The primary benefits of non-data-adaptive transforms lie in their mathematical simplicity, computational efficiency, and broad implementation and standardization. These transforms are mathematically straightforward, simplifying both their analysis and application, and they boast fast algorithms like the Fast Fourier Transform (FFT) and DCT that are highly optimized for speed. Additionally, non-data-adaptive transforms such as the DCT in JPEG and the DWT in JPEG2000 are extensively standardized and implemented in various signal processing applications and standards. A significant drawback of non-data-adaptive transforms is their potential suboptimal performance in applications like image denoising or sparse coding, where data-adaptive methods might excel due to their ability to tailor to specific data characteristics.

Despite these limitations, non-data-adaptive transforms remain indispensable in signal and image processing. They offer a robust and computationally efficient framework suitable for numerous applications. The decision between employing data-adaptive versus non-data-adaptive transforms often hinges on specific application requirements, available computational resources, and the necessity for adaptability to the characteristics of the input data.

Fourier transform.

The Fourier transform [33] is a mathematical method utilized to examine the frequency components of a signal or image. Specifically, in the realm of image denoising, the Fourier transform facilitates the conversion of an image from the spatial domain to the frequency domain. This transformation allows for the analysis and manipulation of frequency components, which is essential for effective denoising processes.

This allows the high-frequency components, which often correspond to noise, to be more easily identified and manipulated. By selectively modifying or removing these high-frequency noisy components, the Fourier-based denoising method can produce an image with significantly reduced noise levels.

To achieve optimal results with this method, it is crucial to meticulously adjust the filtering parameters and possess a thorough understanding of the Fourier Transform, as well as how different signal components are represented in the frequency domain [34].
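A hedged sketch of such frequency-domain filtering with NumPy (the radial cutoff, helper name, and synthetic test image are illustrative assumptions; real pipelines tune the filter to the noise spectrum rather than using a hard circular mask):

```python
import numpy as np

def fourier_lowpass_denoise(image, cutoff=0.15):
    """Zero out frequency components beyond a radial cutoff (fraction of the band)."""
    F = np.fft.fftshift(np.fft.fft2(image))
    rows, cols = image.shape
    y, x = np.ogrid[:rows, :cols]
    radius = np.sqrt((y - rows / 2) ** 2 + (x - cols / 2) ** 2)
    mask = radius <= cutoff * min(rows, cols) / 2   # keep only low frequencies
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

rng = np.random.default_rng(0)
clean = np.kron(rng.random((8, 8)), np.ones((16, 16)))
noisy = clean + rng.normal(0, 0.2, clean.shape)
denoised = fourier_lowpass_denoise(noisy, cutoff=0.2)
```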

This procedure allows the image to be reconstructed with less noise while maintaining structural integrity and crucial information. The wavelet transform, discussed above, remains a strong alternative for handling images with varying noise characteristics.

Transform domain filtering techniques are highly effective at removing noise from image features while preserving structural integrity. They excel in denoising images that have complex textures and spatially varying noise.

Block-matching 3D filtering.

Block-Matching and 3D Filtering (BM3D) is a well-known and widely used denoising technique in the field of image processing. The method was introduced by Dabov et al. [37] and has gained significant attention for its effectiveness in removing noise from images. BM3D was rapidly adopted after its introduction in 2007 because of its excellent overall denoising performance, and understanding its principles is essential for researchers and practitioners working on image denoising.

The basic principle of BM3D is to suppress noise by exploiting the recurring structures and patterns found in natural images. The two most crucial phases of the algorithm are collaborative filtering and aggregation. In the collaborative filtering stage, block matching is applied to find similar patches: for each reference patch, BM3D searches the surrounding neighborhood for patches with similar content and noise characteristics and stacks them into 3-D groups.

After the formation of 3D groups, the aggregation process occurs. Collaborative filtering is performed for each group to compute a weighted average that accounts for the similarities and reliabilities of the patches. This aggregation approach significantly reduces noise while preserving the image's fundamental structures and characteristics.

The Block-Matching and 3D Filtering (BM3D) method has demonstrated exceptional denoising performance, especially when dealing with additive white Gaussian noise (AWGN) in images. It may be expanded to handle color photos and video sequences and is flexible enough to handle a variety of denoising scenarios.

While BM3D has proven to be effective, deep learning-based denoising algorithms have advanced significantly and deserve acknowledgement.

BM3D has undergone several modifications to enhance its denoising performance, as seen in references [35-36]. For instance, Maggioni et al. introduced the block-matching and 4D filtering (BM4D) technique, an extension of BM3D tailored for volumetric data. This method uses voxel cubes grouped in a 4-D stacking manner. A 4-D transform applied to these groups simultaneously leverages both nonlocal and local correlations among the voxels. The resulting exceptionally sparse spectrum of the group facilitates effective coefficient shrinking, thereby allowing highly efficient separation of signal and noise.
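For reference, a typical call to a publicly available implementation of the original BM3D filter might look as follows (this assumes the third-party bm3d Python package, which is not part of the article; the noise level passed as sigma_psd must be estimated or known, and the random array is only a stand-in for a real image):

```python
import numpy as np
import bm3d  # third-party reference implementation (pip install bm3d)

rng = np.random.default_rng(0)
clean = rng.random((128, 128))                 # stand-in for a grayscale image in [0, 1]
sigma = 0.1
noisy = clean + rng.normal(0, sigma, clean.shape)

# The filter needs an estimate of the noise standard deviation (sigma_psd).
denoised = bm3d.bm3d(noisy, sigma_psd=sigma)
```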

3.3. CNN-based denoising methods.

CNN-based denoising techniques have gained considerable attention in image processing and computer vision due to their effectiveness in removing noise from images or signals. Convolutional neural networks (CNNs) have achieved impressive results in this area, showing remarkable performance across various computer vision tasks.

One of the primary advantages of CNN-based denoising techniques is their ability to analyze complex and nonlinear relationships between noisy and clean images [38]. Traditional denoising methods are often limited because they depend on manually crafted features and assumptions about the noise distribution, which can affect their ability to handle different noise types [40]. CNNs have the capability to automatically learn feature representations directly from the data, enabling them to adaptively recognize noise patterns and generate high-quality denoised images.

CNN-based denoising methods employ deep, multi-level structures that effectively combine low-level and high-level image features. The deeper architecture yields hierarchical representations: the initial layers focus on local details, while the deeper layers capture more global features [39]. This hierarchical approach is crucial for effective denoising since it helps distinguish noise from real image content.

In recent years, CNN-based image denoising techniques have rapidly advanced, demonstrating excellent performance that significantly outperforms traditional methods like BM3D. The adaptive nature of CNNs allows them to automatically learn effective features directly from data, better modeling complex noise patterns and producing high-quality denoised images. While BM3D remains an industry standard, the progress in CNN-based denoising has made these deep learning approaches the preferred choice in many modern image processing applications.

3.3.1. MLP models.

Multilayer Perceptron (MLP) models are a type of artificial neural network architecture that belongs to the class of feedforward neural networks. MLP models are widely used in various machine learning and deep learning applications due to their ability to approximate complex functions and learn from data. They have been successfully applied to a variety of problems, including image classification, natural language processing, prediction and forecasting, and function approximation. Their ability to learn complex patterns and relationships in the data makes them a popular choice in many machine learning and deep learning applications.

3.3.2. Deep learning-based denoising methods.

Deep learning-based denoising techniques use sophisticated neural networks, like Convolutional Neural Networks (CNNs), to eliminate noise from images. These approaches have achieved remarkable progress in image denoising, showing excellent results in a range of conditions. By harnessing deep learning, these algorithms can significantly reduce noise in photos, resulting in clearer and cleaner visuals.

DnCNN is a prominent deep learning-based method for image denoising presented in 2017 by Zhang et al. [41]. Known for its remarkable denoising capabilities, DnCNN established a standard for subsequent CNN-driven denoising techniques. The central idea of DnCNN is the use of a residual learning framework: the network focuses on learning the residual noise instead of trying to model the complete clean image directly. This greatly enhances the network's capacity to recognize intricate patterns and also streamlines the training process.

In general, deep learning-based denoising models leverage this residual learning concept, enabling them to efficiently distinguish between noise and useful features in images:

$$\min_{\Theta} \; \mathrm{loss}(\hat{x}, x), \qquad \text{s.t.} \quad \hat{x} = F(y, \sigma; \Theta)$$

where $\mathrm{loss}(\cdot)$ represents the loss function and $F(\cdot)$ represents a CNN with parameter set $\Theta$; $\mathrm{loss}(\cdot)$ measures the distance between the ground truth $x$ and the denoised image $\hat{x}$.

The DnCNN model has proven to have exceptional noise reduction capabilities for a range of image types and noise levels. It has shown to be especially successful in reducing additive Gaussian noise and has been adjusted to deal with more complicated, real-world noise scenarios. DnCNN has become a standard paradigm for other CNN-based denoising techniques because of its ease of use, effectiveness, and strong denoising performance.
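A simplified PyTorch sketch of the residual-learning idea behind DnCNN (the depth, feature width, and the synthetic training step are illustrative assumptions; this is not the authors' reference implementation):

```python
import torch
import torch.nn as nn

class DnCNN(nn.Module):
    """Simplified DnCNN-style network: it predicts the residual (noise), which is
    subtracted from the input to obtain the denoised image."""
    def __init__(self, channels=1, features=64, depth=17):
        super().__init__()
        layers = [nn.Conv2d(channels, features, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(features, features, 3, padding=1, bias=False),
                       nn.BatchNorm2d(features),
                       nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(features, channels, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, noisy):
        return noisy - self.body(noisy)   # residual learning: output = input - predicted noise

# One illustrative training step on synthetic data (MSE against the clean target).
model = DnCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
clean = torch.rand(4, 1, 40, 40)
noisy = clean + 0.1 * torch.randn_like(clean)
optimizer.zero_grad()
loss = nn.functional.mse_loss(model(noisy), clean)
loss.backward()
optimizer.step()
```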

4. Conclusion.

In summary, this literature review has provided a comprehensive understanding of the various methodologies employed to eliminate noise from digital images. We have discussed three primary categories of image denoising techniques: spatial domain methods, transform domain methods, and deep learning-based methods.

Spatial domain techniques, such as bilateral filtering, non-local means, and median filtering, directly manipulate the image's pixel values. These methods compare pixels, whether neighboring or distant, to estimate noise-free values. While these techniques are computationally efficient, they sometimes struggle to preserve fine details and textures.

On the other hand, transform domain methods utilize the properties of transformed representations like sparse representation and wavelet-based denoising. These methods operate by distinguishing between signal and noise components within altered domains, which enhances noise reduction and detail retention. However, they can sometimes introduce artifacts and are dependent on the choice of transform.

Deep learning-based methods, particularly those employing convolutional neural networks (CNNs) and advanced architectures such as residual networks and U-Net, have garnered significant attention for their exceptional denoising capabilities. These methods adapt effectively to various noise levels and image complexities, offering outstanding performance even in challenging conditions with low signal-to-noise ratios and complex noise patterns, by learning denoising functions from extensive training datasets.

It is also important to note the exploration of hybrid approaches that combine multiple denoising methods. These might integrate spatial and transform domain techniques or use deep learning-based methods as a post-processing step. By leveraging the strengths of various approaches, hybrid methods can achieve more comprehensive and effective noise reduction.

When choosing a denoising technique, it is essential to consider factors such as computational efficiency, noise characteristics, image content, and the desired balance between noise reduction and feature retention, depending on the specific application.

All things considered, continuing research and development in spatial domain methods, transform domain techniques, and deep learning-based approaches is shaping the field of image denoising. By laying a strong foundation for future developments and enhancements in image denoising techniques, this review aims to deepen understanding and help raise the quality of denoised images.

REFERENCES:

1. Motwani MC, Gadiya MC, Motwani RC, Harris FC Jr (2019) Survey of image denoising techniques. In: Abstracts of GSPX. Santa Clara Convention Center, Santa Clara, pp 27-30;

2. Linwei Fan, Fan Zhang, Hui Fan and Caiming Zhang (2019) Brief review of image denoising techniques. In: Visual Computing for Industry, Biomedicine, and Art;

3. Sheeraz Ahmed Solangi, Qunsheng Cao, Shumaila Solangi, Tanzeela Solangi & Zaheer Ahmed Dayo (2017) Image denoising methods: literature review. In: International Journal of Recent Research and Applied Studies (IJRRAS);

4. Wiener N (1949) Extrapolation, interpolation, and smoothing of stationary time series: with engineering applications. MIT Press, Cambridge;

5. Gonzalez RC, Woods RE (2006) Digital image processing, 3rd edn. Prentice-Hall, Inc, Upper Saddle River;

6. Al-Ameen Z, Al Ameen S, Sulong G (2015) Latest methods of image enhancement and restoration for computed tomography: a concise review. Appl Med Inf 36(1):1-12;

7. Pitas I, Venetsanopoulos AN (1990) Nonlinear digital filters: principles and applications. Kluwer, Boston. https://doi.org/10.1007/978-1-4757-6017-0;

8. Rudin, Leonid I., Stanley Osher, and Emad Fatemi. "Nonlinear total variation based noise removal algorithms." Physica D: Nonlinear Phenomena 60.1-4 (1992): 259-268;

9. Chan, Tony F., and Luminita A. Vese. "An active contour model without edges." IEEE Transactions on Image Processing 10.2 (2001): 266-277;

10. Gilboa G, Osher S (2009) Nonlocal operators with applications to image processing. SIAM J Multiscale Model Simul 7(3):1005-1028. https://doi.org/10.1137/070698592;

11. Buades A, Coll B, Morel JM (2005) A non-local algorithm for image denoising. In: Abstracts of 2005 IEEE computer society conference on computer vision and pattern recognition. IEEE, San Diego, pp 60-65. https://doi.org/10.1109/CVPR.2005.38;

12. Fan LW, Li XM, Fan H, Feng YL, Zhang CM (2018) Adaptive texture-preserving denoising method using gradient histogram and nonlocal self-similarity priors. IEEE Trans Circuits Syst Video Technol. https://doi.org/10.1109/TCSVT.2018.2878794;

13. Buades, Antoni, Bartomeu Coll, and Jean-Michel Morel. "A non-local algorithm for image denoising." IEEE Computer Vision and Pattern Recognition (CVPR) 2005;

14. Dabov, Kostadin, et al. "Image denoising by sparse 3D transform-domain collaborative filtering." IEEE Transactions on Image Processing 16.8 (2007): 2080-2095;

15. Elad, Michael, and Michal Aharon. "Image denoising via sparse and redundant representations over learned dictionaries." IEEE Transactions on Image Processing 15.12 (2006): 3736-3745;

16. Zhang KB, Gao XB, Tao DC, Li XL (2012) Multi-scale dictionary for single image super-resolution. In: Abstracts of 2012 IEEE conference on computer vision and pattern recognition. IEEE, Providence, pp 1114-1121. https://doi.org/10.1109/CVPR.2012.6247791;

17. Aharon M, Elad M, Bruckstein A (2006) K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans Signal Process 54(11):4311-4322. https://doi.org/10.1109/TSP.2006.881199;

18. Elad M, Aharon M (2006) Image denoising via sparse and redundant representations over learned dictionaries. IEEE Trans Image Process 15(12):3736-3745. https://doi.org/10.1109/TIP.2006.881969;

19. Olshausen, Bruno A., and David J. Field. "Sparse coding with an overcomplete basis set: A strategy employed by V1?" Vision research 37.23 (1997): 3311-3325;

20. Tropp, Joel A., and Anna C. Gilbert. "Signal recovery from random measurements via orthogonal matching pursuit." IEEE Transactions on Information Theory 53.12 (2007): 4655-4666;

21. Donoho, David L. "Compressed sensing." IEEE Transactions on Information Theory 52.4 (2006): 1289-1306;

22. Ji H, Liu CQ, Shen ZW, Xu YH (2010) Robust video denoising using low rank matrix completion. In: Abstracts of 2010 IEEE computer vision and pattern recognition. IEEE, San Francisco, pp 1791-1798. https://doi.org/10.1109/CVPR.2010.5539849;

23. Ji H, Huang SB, Shen ZW, Xu YH (2011) Robust video restoration by joint sparse and low rank matrix approximation. SIAM J Imaging Sci 4(4): 1122-1142. https://doi.org/10.1137/100817206;

24. Liu XY, Ma J, Zhang XM, Hu (2014) Image denoising of low-rank matrix recovery via joint frobenius norm. J Image Graph 19(4):502-511;

25. Yuan Z, Lin XB, Wang XN (2013) The LSE model to denoise mixed noise in images. J Signal Process 29(10):1329-1335;

26. Liu GC, Lin ZC, Yan SC, Sun J, Yu Y, Ma Y (2013) Robust recovery of subspace structures by low-rank representation. IEEE Trans Pattern Anal Mach Intell 35(1):171-184. https://doi.org/10.1109/TPAMI.2012.88;

27. Cai JF, Candès EJ, Shen ZW (2010) A singular value thresholding algorithm for matrix completion. SIAM J Optim 20(4):1956-1982. https://doi.org/10.1137/080738970;

28. Hou JH (2007) Research on image denoising approach based on wavelet and its statistical characteristics. Dissertation, Huazhong University of Science and Technology;

29. Jiao LC, Hou B, Wang S, Liu F (2008) Image multiscale geometric analysis: theory and applications. Xidian University press, Xi'an;

30. Zhang L, Bao P, Wu XL (2005) Multiscale LMMSE-based image denoising with optimal wavelet selection. IEEE Trans Circuits Syst Video Technol 15(4):469-481. https://doi.org/10.1109/TCSVT.2005.844456;

31. Jung A (2001) An introduction to a new data analysis tool: independent component analysis. In: Proceedings of workshop GK "Nonlinearity". IEEE, Regensburg, pp 127-132;

32. Hyvarinen A, Oja E, Hoyer P, Hurri J (1998) Image feature extraction by sparse coding and independent component analysis. In: Abstracts of the 14th international conference on pattern recognition. IEEE, Brisbane, pp 1268-1273. https://doi.org/10.1109/ICPR.1998.711932;

33. Hamza AB, Luque-Escamilla PL, Martinez-Aroza J, Roman-Roldan R (1999) Removing noise and preserving details with relaxed median filters. J Math Imaging Vis 11(2):161-177. https://doi.org/10.1023/A:1008395514426;

34. Mallat SG (1989) A theory for multiresolution signal decomposition: the wavelet representation. IEEE Trans Pattern Anal Mach Intell 11(7):674-693. https://doi.org/10.1109/34.192463;

35. Dabov K, Foi A, Katkovnik V, Egiazarian K (2009) BM3D image denoising with shape-adaptive principal component analysis. In: Abstracts of signal processing with adaptive sparse structured representations. Inria, Saint Malo;

36. Maggioni M, Katkovnik V, Egiazarian K, Foi A (2013) Nonlocal transform-domain filter for volumetric data denoising and reconstruction. IEEE Trans Image Process 22(1):119-133. https://doi.org/10.1109/TIP.2012.2210725;

37. Dabov K, Foi A, Katkovnik V, Egiazarian K (2007) Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans Image Process 16(8):2080-2095. https://doi.org/10.1109/TIP.2007.901238;

38. Schmidt U, Roth S (2014) Shrinkage fields for effective image restoration. In: Abstracts of 2014 IEEE conference on computer vision and pattern recognition. IEEE, Columbus, pp 2774-2781. https://doi.org/10.1109/CVPR.2014.349;

39. Kim J, Lee JK, Lee KM (2016) Accurate image super-resolution using very deep convolutional networks. In: Abstracts of 2016 IEEE conference on computer vision and pattern recognition. IEEE, Las Vegas, pp 1646-1654. https://doi.org/10.1109/CVPR.2016.182;

40. Nah S, Kim TH, Lee KM (2017) Deep multi-scale convolutional neural network for dynamic scene deblurring. In: Abstracts of 2017 IEEE conference on computer vision and pattern recognition. IEEE, Honolulu, pp 257-265. https://doi.org/10.1109/CVPR.2017.35;

41. Zhang K, Zuo WM, Chen YJ, Meng DY, Zhang L (2017) Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising. IEEE Trans Image Process 26(7):3142-3155. https://doi.org/10.1109/TIP.2017.2662206
