
Современные инновации, системы и технологии // Modern Innovations, Systems and Technologies

2022; 2(4) eISSN: 2782-2818 https://www.oajmist.com

УДК: 004.048 EDN: VRHCBP

DOI: https://doi.org/10.47813/2782-2818-2022-2-4-0215-0232


© Xia Yu, Lin Bo, Chen Xin, 2022


Low light combining multiscale deep learning networks and image enhancement algorithm

Xia Yu1, Lin Bo2, Chen Xin3

1 Sofia University, Bulgaria; 2 South China Normal University, China; 3 Shenzhen University, China

Abstract. Aiming at the lack of reference images for low-light enhancement tasks, as well as the color distortion, texture loss, blurred details, and difficulty of obtaining ground-truth images in existing algorithms, this paper proposes a multi-scale weighted-feature low-light image enhancement algorithm based on Retinex theory and an attention mechanism. The algorithm performs multi-scale feature extraction on low-light images with a feature extraction module based on the U-Net architecture, generating a high-dimensional multi-scale feature map; an attention mechanism module then highlights the feature information at different scales that benefits the enhanced image, yielding a weighted high-dimensional feature map; finally, a reflection estimation module built on Retinex theory generates the enhanced image from this feature map. An end-to-end network architecture is designed, and a set of self-regularized loss functions constrains the network model, removing the dependence on reference images and realizing unsupervised learning. Experimental results show that the proposed algorithm preserves image details and textures while enhancing contrast and clarity, produces good visual effects, effectively enhances low-light images, and greatly improves visual quality. Compared with other enhancement algorithms, the objective indicators PSNR and SSIM are improved.

Keywords: deep learning, Retinex theory, low-light image enhancement, image inpainting.

For citation: Yu, X., Bo, L., & Xin, C. (2022). Low light combining multiscale deep learning networks and image enhancement algorithm. Modern Innovations, Systems and Technologies, 2(4), 0215-0232. https://doi.org/10.47813/2782-2818-2022-2-4-0215-0232

INTRODUCTION

As image-capturing devices become more widespread, images are acquired in ever more scenes, but it is difficult to obtain images with good visual quality where light is insufficient, for example outdoors at dusk or in dimly lit rooms; images captured in such conditions are called low-light images. Among the many factors that degrade image quality, low light is common and unavoidable. Images captured under low-light conditions suffer quality degradation, including low visibility, color deviation, and dense noise, which hinders both the extraction of useful information and follow-up machine vision tasks such as segmentation, detection, and tracking. Enhancing low-light images therefore not only improves their visual effect and reveals more detailed information, but also better serves subsequent machine vision tasks.

Traditional low-light image enhancement methods directly adjust the global illumination characteristics of the image. Histogram equalization, for example, stretches the dynamic range of the low-light image to make under-exposed areas visible and effectively improves contrast, but the accompanying merging of gray levels loses some image detail and amplifies noise buried in local areas. In response to these problems, later refinements add constraints, such as preserving the average intensity of the image or ensuring noise robustness, so that the pixel histogram is equalized adaptively for the given low-light image. Constraint-based histogram equalization improves local adaptability during enhancement; however, most variants still cannot adjust the visual properties of local regions, leaving some areas over-exposed or under-exposed.
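For concreteness, the two traditional operations mentioned above can be reproduced with OpenCV. This is an illustrative sketch of the baseline only, not part of the proposed algorithm; the input file name is hypothetical.

```python
# Sketch of the traditional baseline: global histogram equalization
# versus a locally adaptive variant (CLAHE).
import cv2

img = cv2.imread("low_light.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical path

# Global equalization stretches the dynamic range, but merges gray
# levels and amplifies noise hidden in dark regions.
global_eq = cv2.equalizeHist(img)

# CLAHE equalizes per tile with a clip limit, improving local
# adaptability at the cost of extra parameters to tune.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
local_eq = clahe.apply(img)
```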

Current low-light enhancement algorithms fall roughly into three categories: methods based on the Retinex model, methods based on deep learning [1-3], and methods combining traditional models with deep learning. The Retinex model is a model of human visual perception which assumes that a low-light image decomposes into the product of a reflection component and an illumination component. Early studies simply removed the illumination component of the low-light image and treated the remainder as the result, but the outputs of such methods are usually unnatural and prone to over-enhancement [4-5]. Current research tends to optimize the illumination component, or to estimate the reflection and illumination components jointly while adding a weighting model; with prior information and regularization, such methods suppress noise and preserve high-frequency detail, giving better performance. One weighted variational model operates on low-light images in the logarithmic domain, defining different prior constraints at different layers and estimating the illumination and reflection components simultaneously. Another approach estimates the luminance component by max-filtering local areas of the three RGB channels of the low-light image, then iteratively enhances image detail using structural priors. However, these methods generally rely on assumptions about the real-world environment, their ability to represent prior information is limited, and they cannot achieve satisfactory results when tested in different environments.

With the rise of deep learning in recent years, data-driven low-light image enhancement algorithms have shown strong performance and flexibility, especially in complex scenes. The deep auto-encoder LLNet (Low Light Net) performs contrast enhancement and denoising of low-light images; the multi-branch enhancement network MBLLEN-Net extracts rich multi-level features from low-light images [6-8] and handles noise and artifacts in bright areas well. Since it is difficult to obtain a well-exposed counterpart for a low-light image captured in a real scene, these methods all train on synthetic datasets, and the unrealistic synthetic data leads to artifacts in the enhanced images. Considering the limited generalization that paired training data provides and the difficulty of capturing low-light and normal-light image pairs of the same visual scene, methods that relax the dependence on paired data have appeared one after another. The generative adversarial network EnlightenGAN [9-11] removes the dependence on paired datasets in the low-light enhancement field through a self-regularized perceptual loss function and a local discriminator. Zero-DCE constructs a pixel-wise fitted curve, uses a convolutional neural network to learn its key parameters, and sidesteps the paired-data problem through a series of zero-reference loss functions [9], obtaining images with good visual quality; however, its results suffer from chromatic aberration and low contrast.

Another idea is to let the theoretical principles of traditional models guide the structural design of the network. The low-light enhancement network RetinexNet consists of an illumination estimation module and a reflection estimation module: the input image is decomposed and the illumination component is enhanced. Its authors also constructed a synthetic dataset, LOL (Low-Light dataset), with varying exposure times. This method effectively raises image brightness, but insufficient constraints on the intermediate variables produce artifacts and local distortion in the enhanced image. The effective low-light image enhancement network KinD uses two convolutional neural networks, with a simulated external environment acting as a constraint on the model to guide learning [13-15].

In view of the advantages and disadvantages of existing algorithms, this paper proposes an end-to-end low-light enhancement algorithm with multi-scale weighted features that combines Retinex theory and an attention mechanism. Unlike most previous methods, the algorithm does not use a decomposition network to obtain the reflection and illumination components of the image [11]; instead, a neural network directly learns the mapping from the low-light image to the illumination component, and the reflection component, derived from the input image according to Retinex theory, is output by the model as the enhanced image. The illumination component of a natural image has a relatively simple form and usually carries known prior information, so the network model generalizes well and adapts to different illumination conditions. A feature extraction module based on the U-Net architecture extracts multi-scale feature information from the low-light image, and a channel attention mechanism, introduced here into the low-light enhancement task, focuses the extracted multi-scale features into feature maps with strong texture representation ability, highlighting the features that benefit the subsequent computation of the reflection component. To train the model accurately without reference images, a set of self-regularized loss functions is designed to constrain the neural network and guide training. The algorithm fully extracts feature information at different scales in the low-light image and reduces the local information loss and color distortion of the enhanced image.

MULTI-SCALE IMAGE ENHANCEMENT DEEP LEARNING MODEL

The multi-scale low-light enhancement network designed in this paper, combining Retinex theory and an attention mechanism, consists of three modules: a feature extraction module, an attention mechanism module, and a reflection estimation module. First, the low-light image S is fed into the feature extraction module to obtain a multi-scale high-dimensional feature map. The attention mechanism module then focuses the fused features, suppressing those of over-exposed and low-quality areas [16-18]. Finally, the reflection estimation module generates the inverse L of the illumination component; multiplying the original image S by L yields the reflection component R, which is taken as the enhancement result, a well-exposed image, as shown in formula (5).

Feature extraction module

First, features are extracted from the low-light image. The feature extraction module adopts the classic U-Net network architecture; its structure is shown in Figure 1.

It consists of an encoder-decoder structure with convolution, pooling, upsampling, and concatenation operations: 18 convolutional layers, 4 downsampling steps, and 4 upsampling steps. Each convolutional layer uses a 3×3 kernel with stride 1 and padding 1, followed by a ReLU activation. Each downsampling step is a 2×2 max-pooling operation that halves the size of the feature map; each upsampling step is a deconvolution with stride 2 that doubles it. In addition, two cascaded convolutional layers precede each downsampling and upsampling step [19-22].

Figure 1. Feature extraction network structure.

Through skip connections, the network symmetrically concatenates each downsampled feature map with the feature map of the same resolution on the upsampling path, increasing the amount of information available at each upsampling step. The defining traits of this architecture are its U shape and its skip connections, which extract high-level and low-level image features from networks of different depths, so that texture and edge information is better preserved. Shallow low-level image features (such as contrast and detail sharpness) and deep high-level abstract features (such as color distribution and average brightness) are stacked by feature concatenation, achieving multi-scale extraction of image features; the module finally outputs a 32-channel high-dimensional multi-scale feature map.
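A minimal PyTorch sketch of such a feature extractor is given below. The layer counts (18 convolutions, 4 max-pool downsamplings, 4 deconvolution upsamplings, 32-channel output) follow the description above; the intermediate channel widths are our assumption, since the paper does not list them.

```python
import torch
import torch.nn as nn

def double_conv(c_in, c_out):
    # Two cascaded 3x3 conv layers (stride 1, padding 1) with ReLU,
    # as used before each down-/up-sampling step.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, stride=1, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, stride=1, padding=1), nn.ReLU(inplace=True),
    )

class FeatureExtractor(nn.Module):
    def __init__(self, widths=(32, 64, 128, 256, 512)):
        super().__init__()
        self.encoders = nn.ModuleList()
        c = 3
        for w in widths:                      # 5 encoder blocks = 10 convs
            self.encoders.append(double_conv(c, w))
            c = w
        self.pool = nn.MaxPool2d(2)           # 2x2 max pooling halves H and W
        self.ups, self.decoders = nn.ModuleList(), nn.ModuleList()
        for w in reversed(widths[:-1]):       # 4 decoder blocks = 8 convs
            self.ups.append(nn.ConvTranspose2d(c, w, 2, stride=2))
            self.decoders.append(double_conv(2 * w, w))  # concat doubles channels
            c = w
        # final width is widths[0] = 32 -> 32-channel multi-scale feature map

    def forward(self, x):
        skips = []
        for enc in self.encoders[:-1]:
            x = enc(x)
            skips.append(x)                   # saved for the skip connection
            x = self.pool(x)
        x = self.encoders[-1](x)              # bottleneck
        for up, dec, skip in zip(self.ups, self.decoders, reversed(skips)):
            x = torch.cat([up(x), skip], dim=1)
            x = dec(x)
        return x                              # (B, 32, H, W) feature map
```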

Attention mechanism module

Low-light images contain many texture detail features of the target scene. To highlight favorable features of interest and suppress uninteresting ones, the attention mechanism module guides the network to refine redundant features and to pay more attention to the feature channels that benefit the generation of the reflection component.

The attention mechanism module performs a global average pooling operation $F_{sq}$ on the feature map generated by the feature extraction module, converting the spatial features $x_c$ of each of the 32 channels into channel descriptors $z_c$, as shown in formula (1):

$z_c = F_{sq}(x_c) = \frac{1}{H \times W}\sum_{i=1}^{H}\sum_{j=1}^{W} x_c(i, j)$ (1)

Here $x_c$ is the input feature of the c-th channel, $z_c$ is the feature descriptor of the c-th channel, and H and W are the height and width of the channel feature map. Two fully connected operations $F_{ex}$ are then applied to the descriptor $z_c$ to improve the generalization ability of the model: a ReLU activation provides nonlinearity between the two fully connected layers, and a Sigmoid activation finally outputs the weight of each dimension. This operation lets the network learn the relationship between channels and generate a weight for each channel, as shown in formula (2):

$s_c = F_{ex}(z_c, w) = \sigma(g(z_c, w)) = \sigma(w_2\,\delta(w_1 z_c))$ (2)

In the formula, $w_1$ and $w_2$ are the weights of the two fully connected layers, $\sigma$ and $\delta$ denote the Sigmoid and ReLU activation functions, and $s_c$ is the output weight of the c-th channel. Finally, the channel weight $s_c$ rescales ($F_{scale}$) the feature map $x_c$ output by the feature extraction module to give the weighted feature map $\tilde{x}_c$, as shown in formula (3). The attention mechanism module strengthens the network's ability to discriminate the features of each channel, so that the model can highlight the channel features that benefit the enhancement results.

$\tilde{x}_c = F_{scale}(x_c, s_c) = x_c \cdot s_c$ (3)
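A minimal PyTorch sketch of this channel attention (squeeze-and-excitation style) module, implementing formulas (1)-(3): global average pooling, two fully connected layers with ReLU between and Sigmoid after, then channel-wise rescaling. The reduction ratio r is an assumption; the paper does not state it.

```python
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels=32, r=4):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)          # F_sq, formula (1)
        self.excite = nn.Sequential(                    # F_ex, formula (2)
            nn.Linear(channels, channels // r), nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels), nn.Sigmoid(),
        )

    def forward(self, x):                               # x: (B, C, H, W)
        b, c, _, _ = x.shape
        z = self.squeeze(x).view(b, c)                  # channel descriptors z_c
        s = self.excite(z).view(b, c, 1, 1)             # channel weights s_c
        return x * s                                    # F_scale, formula (3)
```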

Reflection estimation module

This module builds a reflection estimation network based on a variant of the Retinex model: the weighted feature maps are passed through the network shown in Figure 2 to generate the reflection component of the low-light image, which is taken as the enhanced, well-exposed image.

Retinex theory aims to decompose an image into reflection and illumination components, as shown in formula (4):

$S = R \cdot I$ (4)

Here S is the original image; I is the illumination component, which reflects the illumination intensity; and R is the reflection component, which reflects the inherent properties of the object itself and is not affected by external factors. We treat the reflection component as a well-exposed image for the purpose of enhancing low-light images.

Figure 2. Reflection estimation network structure.

Consider the illumination component I as an intermediate variable for calculating the reflection component R, as shown in formula (5):

$R = S \cdot L$ (5)

Here L is the inverse of the illumination component I, $L = I^{-1}$. The advantage of this Retinex-based enhancement model is that it abandons the image reconstruction step used in most Retinex-based methods, avoiding the information loss incurred during reconstruction.

The network in the reflection estimation module consists of six symmetrically cascaded convolutional layers and one output layer; the 1st, 2nd, and 3rd convolutional layers are cascaded with the 7th, 6th, and 5th layers, respectively. Each convolutional layer applies 32 convolutions with 3×3 kernels and stride 1, followed by a ReLU activation; in the last layer the activation is replaced by Tanh. The network takes the 32-channel weighted feature map as input, and the output layer produces the 3-channel inverse L of the illumination component. The reflection component R is then obtained by multiplying the input image by the inverse illumination component pixel by pixel, and serves as the well-exposed enhanced image finally output by the model.
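A minimal sketch of this module and of the end-to-end composition of the three modules follows, reusing the FeatureExtractor and ChannelAttention sketches above (the names are ours). Fusing the cascaded skip connections by concatenation is an assumption; the paper only states that the layers are cascaded.

```python
import torch
import torch.nn as nn

def conv3x3(c_in, c_out=32):
    # One convolutional layer: 32 kernels of size 3x3, stride 1, ReLU.
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, stride=1, padding=1),
                         nn.ReLU(inplace=True))

class ReflectionEstimator(nn.Module):
    def __init__(self):
        super().__init__()
        self.c1, self.c2, self.c3, self.c4 = (conv3x3(32), conv3x3(32),
                                              conv3x3(32), conv3x3(32))
        self.c5, self.c6 = conv3x3(64), conv3x3(64)  # concatenated skip inputs
        self.out = nn.Sequential(nn.Conv2d(64, 3, 3, padding=1), nn.Tanh())

    def forward(self, feats, s):
        # feats: weighted 32-channel feature map; s: original low-light image S
        x1 = self.c1(feats)
        x2 = self.c2(x1)
        x3 = self.c3(x2)
        x4 = self.c4(x3)
        x5 = self.c5(torch.cat([x4, x3], dim=1))  # skip: layer 3 -> layer 5
        x6 = self.c6(torch.cat([x5, x2], dim=1))  # skip: layer 2 -> layer 6
        L = self.out(torch.cat([x6, x1], dim=1))  # skip: layer 1 -> output layer
        return s * L                              # R = S . L, formula (5)

class LowLightEnhancer(nn.Module):
    # End-to-end composition of the three modules described in this section.
    def __init__(self):
        super().__init__()
        self.features = FeatureExtractor()       # multi-scale U-Net features
        self.attention = ChannelAttention(32)    # channel-weighted features
        self.reflect = ReflectionEstimator()     # predicts L, returns R = S * L

    def forward(self, s):
        return self.reflect(self.attention(self.features(s)), s)
```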

EXPERIMENTAL RESULTS AND ANALYSIS

Dataset and training configuration

In this paper, part I of the publicly available SICE dataset provided by Cai J. et al. is selected as the training set; it contains 360 sets of images with different exposures, 2002 images in total. The set includes both under-exposed and over-exposed images, and adding over-exposed images to the training set benefits the low-light enhancement task. All training images were resized to 512×512 before training.

The training hardware platform is an Intel(R) Core(TM) i9-11900KF 3.5 GHz CPU with an NVIDIA GeForce RTX 3070 Ti GPU; the operating system is Ubuntu 22.04, the programming language is Python 3.8, and the deep learning framework is PyTorch 1.12. The Adam optimizer parameters are β1 = 0.9, β2 = 0.999, ε = 10⁻⁸, and the learning rate is 0.0001. The number of epochs is set to 500 [23-27], the model is evaluated every 50 epochs, and the best-performing model is kept as the final model. To verify the effectiveness of the proposed algorithm, the LIME dataset and part II of the SICE dataset are selected as the test set. Part II of SICE contains 229 multi-exposure sequences with corresponding normal-lighting images, covering indoor, outdoor, and other complex lighting scenes representative of most real environments. To compare the performance of the proposed algorithm with others, two traditional algorithms (MSRCP and LIME) and four deep learning-based algorithms (MBLLEN-Net, RetinexNet, KinD, and Zero-DCE) are compared with it from both subjective and objective aspects. The enhanced images of all comparison algorithms are generated with publicly available code, and the deep learning models are those provided by the original authors.
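The reported optimizer settings map directly onto PyTorch. A schematic training loop under these settings might look as follows, reusing the LowLightEnhancer composition sketched above; the data pipeline is a placeholder, not the authors' code.

```python
import torch

model = LowLightEnhancer()                      # composition sketched earlier
optimizer = torch.optim.Adam(model.parameters(),
                             lr=1e-4, betas=(0.9, 0.999), eps=1e-8)

for epoch in range(500):
    # ... iterate over 512x512 SICE part-I images and minimize the
    #     self-regularized losses described in the ablation section ...
    if (epoch + 1) % 50 == 0:
        pass  # evaluate; keep the best checkpoint as the final model
```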

Subjective evaluation

First, the enhancement results are compared and analyzed subjectively in terms of brightness and color; indoor and outdoor scenes under low-light conditions are selected from the test set as test images.

Figure 3 shows the enhancement comparison for the indoor scene: Figure 3(a) is the input image, Figures 3(b)-3(g) are the results of the comparison algorithms selected in this paper, and Figure 3(h) is the result of the proposed algorithm. Among the traditional algorithms, the entire image in Figure 3(b) shows obvious color distortion, and some areas are over-exposed. The color-card parts of the red-framed areas in Figures 3(d) and 3(g) show obvious local color deviation; Figure 3(d) is also overly smooth as a whole, with poor texture. Figure 3(e) has obvious artifacts and color deviation and poor visual quality, and Figure 3(f) enhances the dark area in the green-framed region poorly. The proposed algorithm, Figure 3(h), effectively avoids color distortion in local areas while maintaining contrast and clarity, and effectively enhances dark areas while retaining rich texture details.

Figure 4 shows the enhancement comparison for the outdoor scene: Figure 4(a) is the input image, Figures 4(b)-4(g) are the results of the comparison algorithms selected in this paper, and Figure 4(h) is the result of the proposed algorithm. Among the traditional algorithms, Figure 4(b) is still over-exposed in the outdoor scene and loses some detail, such as the clouds in the green frame; Figure 4(c) has a better visual effect, with clearly improved detail and color information, but its overall brightness is low. Among the deep learning-based algorithms, Figure 4(d) is overly smooth, missing detail, with local dark areas [28-31] such as the red-framed region; Figure 4(e) shows severe color deviation and many artifacts; Figures 4(f) and 4(g) are visually pleasant, but their colors are not natural enough. The proposed algorithm improves on these shortcomings to a certain extent: the whole image is rich in texture and detail, and local areas are also effectively enhanced.

Figure 3. Comparison of indoor scene results.

Figure 4. Comparison of outdoor scene results.

Objective evaluation

This paper selects Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), Natural Image Quality Evaluator (NIQE), and Visual Information Fidelity (VIF) as objective indicators for quantitative analysis.

The peak signal-to-noise ratio (PSNR) is the ratio between the peak power of the enhanced image and the noise, and measures the distortion between the enhanced image and the normal-illumination image. Structural similarity (SSIM) measures the similarity of two images: the larger the value, the more similar the spatial structure of the enhanced image and the normal-illumination image. The natural image quality evaluator (NIQE) is built from a set of quality-aware statistical features based on natural scene statistics in the spatial domain. Visual information fidelity (VIF) is a quality index based on the fidelity of human visual information: it computes the information distortion between the enhanced image and the normal-illumination image through a visual model, with larger values indicating better perceived image quality.
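The two full-reference metrics can be computed as follows with scikit-image (version 0.19 or later, for the channel_axis argument); this is illustrative, not the authors' evaluation script.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(enhanced: np.ndarray, reference: np.ndarray):
    # Both images are uint8 RGB arrays of the same shape.
    psnr = peak_signal_noise_ratio(reference, enhanced, data_range=255)
    ssim = structural_similarity(reference, enhanced,
                                 channel_axis=-1, data_range=255)
    return psnr, ssim
```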

Table 1 lists the objective index values of the proposed and comparison algorithms on part II of the SICE dataset. The proposed algorithm achieves higher objective index values than the comparison algorithms, which demonstrates its effectiveness.

Table 1. Average objective index values of the proposed and comparison algorithms on the test sets.

Method       PSNR      SSIM     NIQE     VIF
MSRCP        12.6995   0.4493   5.3842   0.4525
LIME         14.0784   0.5274   4.9674   0.4821
MBLLEN       15.8227   0.5670   4.8713   0.3804
RetinexNet   14.6839   0.4752   5.5244   0.3482
KinD         15.7432   0.6762   4.8869   0.4381
Zero-DCE     16.1157   0.5933   4.4762   0.4226
Ours         16.6783   0.6946   4.3814   0.4796

The computational complexity of the model

Complexity is an important indicator of algorithm performance. This section discusses the time complexity of the proposed and comparison algorithms. Let m denote the number of image rows, n the number of image columns, and N the number of images in the source sequence; the complexity of the traditional algorithms is then O(Nmn). For the deep learning algorithms, the number of model parameters is compared [32-36]: the fewer the parameters, the lighter the model and the lower its computational cost. As Table 2 shows, the parameter count of the proposed model is relatively large. This is because the feature extraction layer uses U-Net-based multi-scale feature extraction and the attention mechanism module filters features, which improves model performance at the cost of more parameters and higher computational complexity.
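Parameter counts of the kind compared in Table 2 can be obtained for any PyTorch model with a one-line helper:

```python
def count_parameters_m(model) -> float:
    # Trainable parameters in millions, as reported in Table 2.
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6
```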

Table 2. The computational complexity of this algorithm and the comparison algorithm.

Method       Complexity   Parameters (M)
MSRCP        O(Nmn)       /
LIME         O(Nmn)       /
MBLLEN       /            5.95
RetinexNet   /            14.32
KinD         /            29.6
Zero-DCE     /            0.32
Ours         /            16.90

Ablation experiment

To verify the effectiveness of the proposed network framework for low-light enhancement tasks, ablation analysis is carried out from two aspects: 1) verifying the effectiveness of each loss function; 2) verifying the effectiveness of the attention mechanism module. This section conducts qualitative analysis of both. To study the contribution of each loss function to the network model, the network is retrained with one of the three loss functions removed, and the impact of each loss function on the enhancement results is analyzed qualitatively.

Figure 5 shows the visual comparison; the baseline is the enhanced image generated by the model with all three loss functions retained.

(a) input; (b) baseline; (c) w/o Lexp; (d) w/o Lspa; (e) w/o Ltv

Figure 5. Qualitative visual comparison for each loss function.

When the exposure control loss Lexp is removed, the brightness gain of the whole image is small and the image remains dark. When the spatial consistency loss Lspa is removed, the contrast of the whole image drops slightly, showing that this loss maintains color consistency of image regions before and after enhancement. When the illumination smoothing loss Ltv is removed, obvious artifacts appear, the scene becomes unnatural, and color distortion emerges. Table 3 shows that discarding any loss function degrades the performance of the proposed model, lowering the objective indicators to varying degrees, which proves the effectiveness of each loss function.
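Hedged sketches of the three losses are given below. They follow the zero-reference formulations popularized by Zero-DCE, which match the behavior described above; the patch sizes and the target exposure level are assumptions, as the paper does not state its exact formulas in this section.

```python
import torch
import torch.nn.functional as F

def exposure_loss(enhanced, well_exposed_level=0.6, patch=16):
    # L_exp: pull the mean intensity of local patches toward a
    # well-exposed gray level.
    mean = F.avg_pool2d(enhanced.mean(dim=1, keepdim=True), patch)
    return ((mean - well_exposed_level) ** 2).mean()

def spatial_consistency_loss(enhanced, original, patch=4):
    # L_spa: preserve the contrast between neighboring regions
    # before and after enhancement.
    e = F.avg_pool2d(enhanced.mean(dim=1, keepdim=True), patch)
    o = F.avg_pool2d(original.mean(dim=1, keepdim=True), patch)
    def grads(x):
        return x[..., 1:, :] - x[..., :-1, :], x[..., :, 1:] - x[..., :, :-1]
    (ev, eh), (ov, oh) = grads(e), grads(o)
    return ((ev - ov) ** 2).mean() + ((eh - oh) ** 2).mean()

def illumination_smoothness_loss(L):
    # L_tv: total-variation prior keeping the illumination map piecewise
    # smooth, which suppresses artifacts and color distortion.
    dv = (L[..., 1:, :] - L[..., :-1, :]).abs().mean()
    dh = (L[..., :, 1:] - L[..., :, :-1]).abs().mean()
    return dv + dh
```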

Table 3. Average objective index values in the loss-function ablation.

Method      PSNR      SSIM     NIQE     VIF
baseline    16.6783   0.6946   4.3814   0.4796
w/o Lexp    10.2441   0.2941   6.0241   0.3574
w/o Lspa    15.7642   0.6172   4.5479   0.4655
w/o Ltv     12.9608   0.5839   4.8607   0.3901

The attention module introduced into the network framework focuses on detail contrast and overall contrast at different scales, making the whole image clearer and more natural, with more pronounced light-dark contrast. To verify the effectiveness of the attention mechanism module, the attention module is removed [36-38] while all other settings remain unchanged, and its influence on the enhancement results is analyzed.

Figure 6 shows the visual comparison; the baseline is the enhanced image generated by the model with the attention module retained.

(a) input; (b) baseline; (c) w/o attention

Figure 6. Visual comparison of the attention module.

Compared with the result produced with the attention module removed, the enhanced image with the attention module avoids over-exposed details and color distortion, maintains good contrast, and its overall light-dark contrast is stronger and more natural.

The objective index values in Table 4 likewise show that removing the attention module degrades model performance to a certain extent, confirming the effectiveness of the attention module for the proposed model.

Table 4. Average objective index values with and without the attention module.

Index    baseline   w/o attention
PSNR     16.6783    16.4501
SSIM     0.6946     0.6114
NIQE     4.3814     4.4765
VIF      0.4796     0.4592

CONCLUSION

This paper proposes an end-to-end multi-scale weighted-feature low-light enhancement algorithm that combines Retinex theory and an attention mechanism. A feature extraction module and an attention module extract multi-scale feature information and highlight the image features that benefit the enhanced image. Retinex theory is integrated into the reflection estimation module, whose variant form avoids the information loss of reconstructing the enhanced image and improves the quality of the result. In addition, to train the model without reference images, a set of self-regularized loss functions covering spatial structure, texture information, and related properties regularizes the model and improves enhancement. Comparison and ablation experiments both demonstrate the superiority of the proposed algorithm for low-light enhancement and the effectiveness of each module, which greatly improves the visual effect of the image.

REFERENCES

[1] Lv Z., Li Y., Feng H., Lv H. Deep learning for security in digital twins of cooperative intelligent transportation systems. IEEE Transactions on Intelligent Transportation Systems. 2021; 23(9): 16666-16675.

[2] Lv Z., Chen D., Feng H., Zhu H., Lv H. Digital twins in unmanned aerial vehicles for rapid medical resource delivery in epidemics. IEEE Transactions on Intelligent Transportation Systems. 2021. DOI: 10.1109/TITS.2021.3113787.

[3] Zhang X., Zhou X., Lin M. et al. Shufflenet: An extremely efficient convolutional neural network for mobile devices. Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 6848-6856.

[4] Ma N., Zhang X., Zheng H. T. et al. Shufflenet v2: Practical guidelines for efficient CNN architecture design. Proceedings of the European conference on computer vision (ECCV). 2018: 116-131.

[5] Petrosian O., Shi L., Li Y., Gao H. Moving information horizon approach for dynamic game models. Mathematics. 2019; 7(12): 1239.

[6] Yin L. The time-consistency of optimality principles in multistage cooperative games with spanning tree. 2017.

[7] Weilong H., Weijun H., Yuqi Y., Hui S., Yanyou W., Yuehang S., Xiaobin L. Improved left-and right-hand tracker using computer vision. Student research. 2022; 3: 21.

[8] Zhao C., Blekanov I. Two Towers Collaborative Filtering Algorithm for Movie Recommendation. Management processes and sustainability. 2021; 8(1): 397-401.

[9] Yuan C., Liu X., Zhang, Z. The Current Status and progress of Adversarial Examples Attacks. Proceedings of 2021 International Conference on Communications, Information System and Computer Engineering (CISCE); 2021, May; IEEE; 2021: 707-711.

[10] Liu X., Xie X., Hu W., Zhou H. The application and influencing factors of computer vision: focus on human face recognition in medical field. Science, education, innovations: topical issues and modern aspects. 2022: 32-37.

[11] Shen G., He K., Jin J., Chen B., Hu W., Liu X. Capturing and analyzing financial public opinion using nlp and deep forest. Scientific research of students and pupils. 2022: 66-71.

[12] Chen B., Song Y., Cheng L., He, W., Hu W., Liu X., Chen J. A review of research on machine learning in stock price forecasting. Science and modern education: topical issues, achievements and innovations. 2022: 56-62.

[13] Liu Z., Feng R., Chen H., Wu S., Gao Y., Gao Y., Wang X. Temporal Feature Alignment and Mutual Information Maximization for Video-Based Human Pose Estimation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 11006-11016.

[14] Liu X., Liu W., Yi S., Li J. Research on Software Development Automation Based on Microservice Architecture. Proceedings of the 2020 International Conference on Aviation Safety and Information Technology 2020, October; 2020: 670-677.

[15] He K., Song Y., Shen G., He W., Liu W. Based on deep reinforcement learning and combined with trends stock price prediction model. Topical issues of modern scientific research. 2022: 156-166.

[16] Petrosyan, L., Pankratova, Y. Two Level Cooperation in Dynamic Network Games with Partner Sets. Proceedings of International Conference on Mathematical Optimization Theory and Operations Research. Springer, Cham; 2022: 250-263.

[17] Wu J., Lee P. P., Li Q., Pan L., Zhang J. CellPAD: Detecting performance anomalies in cellular networks via regression analysis. Proceedings of 2018 IFIP Networking Conference (IFIP Networking) and Workshops. 2018 May; IEEE; 2018: 1-9.

[18] Lv Z., Li Y., Feng H., Lv H. Deep learning for security in digital twins of cooperative intelligent transportation systems. IEEE Transactions on Intelligent Transportation Systems. 2021.

[19] Ou S., Gao Y., Zhang Z., Shi C. Polyp-YOLOv5-Tiny: A Lightweight Model for RealTime Polyp Detection. Proceedings of 2021 IEEE 2nd International Conference on Information Technology, Big Data and Artificial Intelligence (ICIBA); December 2021; IEEE; 2021; 2: 1106-1111.

[20] Huang T., Zhou C., Zhang R. X., Wu C., Sun L. Learning Tailored Adaptive Bitrate Algorithms to Heterogeneous Network Conditions: A Domain-Specific Priors and Meta-Reinforcement Learning Approach. IEEE Journal on Selected Areas in Communications. 2022; 40(8): 2485-2503.

[21] Sun Q., Zhao C., Li Y., Petrosian O. Management processes and sustainability. 2022; 9(1): 357-362.

[22] Xiaomin L., Yuehang S., Borun C., Xiaobin L., Weijun, H. A novel deep learning based multi-feature fusion method for drowsy driving detection. Industry and agriculture. 2022: 3449.

[23] Yin, L. The dynamic Nash bargaining solution for 2-stage cost sharing game. Contributions to Game Theory and Management. 2020; 13(0): 296-303.

[24] Zhouyi X., Weijun H., Yanrong H. Intelligent acquisition method of herbaceous flowers image based on theme crawler, deep learning and game theory. Kronos. 2022; 7(4(66)): 44-52.

[25] Yin L. Dynamic stability of optimality principles in cooperative multistage games with spanning tree. 2021.

[26] Xie Z., Hu W., Fan Y., Wang, Y. Research on multi-target recognition of flowers in landscape garden based on ghost net and game theory. Development of science, technologies, education in the XXI century: topical issues, achievements and innovation. 2022: 46-56.

[27] Yin L. Dynamic Shapley Value for Two-Stage Cost Sharing Game. Proceedings of International Conference Dedicated to the Memory of Professor Vladimir Zubov. 2020 October; Springer, Cham; 2020: 457-464.

[28] Hu W., Zheng T., Chen B., Jin J., Song Y. Research on product recommendation system based on deep learning. Basic and applied scientific research: current issues, achievements and innovations. 2022:116-124.

[29] Hu W., Liu X., Xie Z. Ore image segmentation application based on deep learning and game theory. World science: problems and innovations. 2022: 71-76.

[30] He W., Hu W., Wu Y., Sun L., Liu X., Chen B. Development history and research status of convolutional neural networks. Student scientific forum. 2022: 28-36.

[31] Yin L. The dynamic Shapley Value in the game with spanning tree. Proceedings of International Conference Stability and Oscillations of Nonlinear Control Systems (Pyatnitskiy's Conference). 2016 June; IEEE; 2016: 1-4.

[32] Yin L. Dynamic Shapley Value for 2-stage cost sharing game with perishable products. Proceedings of 29th Chinese Control and Decision Conference (CCDC). 2017 May; IEEE; 2017: 3770-3774.

[33] Yin L. Dynamic Shapley value in the game with spanning forest. Proceedings of 2017 Constructive Nonsmooth Analysis and Related Topics (dedicated to the memory of V. F. Demyanov) (CNSA). 2017 May; IEEE; 2017: 1-4.

[35] Petrosian O., Nastych M., Li Y. The Looking Forward Approach in a Differential Game Model of the Oil Market with Non-transferable Utility. Proceedings Frontiers of Dynamic Games; Birkhauser, Cham; 2020: 215-244.

[36] Xie Z., Hu W., Zhu J., Li B., Wu Y., He W., Liu X. Left- and right-hand tracker based on convolutional neural network. Topical issues of modern science and education: Proceedings of the XXIV International Scientific and Practical Conference. 2022, November 10; Penza: ICNS "Science and Education"; 2022: 61-67.

[37] Cheng M., Li Y. New characteristic function for two stage games with spanning tree. Contributions to Game Theory and Management. 2021; 14: 59-71.

[38] Li Y., Petrosyan O. L., Zou J. Dynamic shapley value in the game with perishable goods. Contributions to Game Theory and Management. 2021; 14(0): 273-289.

INFORMATION ABOUT THE AUTHORS


Xia Yu, Sofia University, Sofia, Bulgaria
e-mail: 1519245949@qq.com

Lin Bo, South China Normal University, China

e-mail: bridgemr643@gmail.com

Chen Xin, Shenzhen University, Shenzhen, China

e-mail: 877448627@qq.com


The article was submitted 13.11.2022; approved after reviewing 26.11.2022; accepted for publication 28.11.2022.
