VESTNIK TOGU. 2024. No. 3 (74)
System analysis, control and information processing, statistics
УДК 004.942
DOI https://doi.org/10.38161/1996-3440-2024-3-35-46
Qin Hongwu, Wang Xinyu, Xu Fei, Chen Qiming, Chye En Un, V. V. Voronin
PULMONARY NODULE DETECTION ALGORITHM BASED ON LIGHTWEIGHT IMPROVED YOLOv8 NETWORK MODEL
Qin Hongwu - PhD, Professor, College of Electronic and Information Engineering, Changchun University, Changchun, email: [email protected] (China); Wang Xinyu - Master of engineering; College of Electronic and Information Engineering, Changchun University, Changchun, email: [email protected] (China); Xu Fei - Master of engineering; College of Electronic and Information Engineering, Changchun University, Changchun, email: [email protected] (China); Chen Qiming - Master of engineering; College of Electronic and Information Engineering, Changchun University, Changchun, email: [email protected] (China); Chye En Un - Doctor of Technical Sciences, Professor of the Higher School of Cybernetics and Digital Technologies, Pacific National University, e-mail: [email protected]; Voronin V. V. - Doctor of Technical Sciences, Professor of the Higher School of Cybernetics and Digital Technologies, Pacific National University (Russia), e-mail: [email protected].
Pulmonary nodules are one of the early manifestations of lung cancer, so their early screening and diagnosis are of great significance. To address the low detection accuracy of lung nodules in CT images, this paper proposes a detection algorithm based on a lightweight improved YOLOv8 network model. The Block module of the FasterNet lightweight network replaces the Bottleneck module in the C2f module, and the EMA attention mechanism is introduced into the Block module. The experimental results show that the mAP value of the improved algorithm is 0.705 on the LUNA16 data set. Compared with the original YOLOv8, the YOLOv8-FE algorithm improves the mAP value by 10.2 % and reduces the number of parameters by 23.2 %, and it achieves the highest mAP among the compared YOLOv8, YOLOv5 and YOLOv5-FE algorithms.
Keywords: detection algorithm, YOLOv8, lung nodules, EMA attention, FasterNet
© Qin Hongwu, Wang Xinyu, Xu Fei, Chen Qiming, Chye En Un, Voronin V.V., 2024
Introduction
Lung cancer is the most common cancer in China and causes more than 630,000 deaths. Lung cancer develops from pulmonary nodules, so screening for pulmonary nodules has become a key examination for identifying lung cancer [1]. In recent years, with the rise of big data and deep convolutional neural networks, pulmonary nodule detection based on deep learning has obtained better feature representations and achieved a series of breakthroughs. For example, Zhao et al. [2] designed an adaptive 3D CNN structure that further reduces false alarms, but it places high demands on hardware and is computationally complex. The Faster R-CNN algorithm designed by Su et al. [3] has high accuracy, but its data set is small and lacks specificity, and its sensitivity to small pulmonary nodules is low. Goel et al. [4] designed a hybrid of an improved YOLOv3 and a BBO/EE optimizer, which has high accuracy but does not consider lightweighting to broaden the application scenarios. Methods based on deep learning have made progress, but the performance of pulmonary nodule detection algorithms still needs to be improved. A pulmonary nodule detection model with low hardware requirements, high accuracy and fast speed is currently needed. Therefore, this article proposes the YOLOv8-FE pulmonary nodule detection algorithm and demonstrates the feasibility and effectiveness of the proposed model through experiments.
Theory of the improved YOLOv8 algorithm
YOLOv8, released in 2023, is a state-of-the-art model that provides high detection accuracy and speed. It consists of three main parts: Backbone, Neck and Head. The Backbone performs feature extraction and consists of C2f and SPPF modules; performing convolution on every channel of all feature maps may cause a large amount of redundant computation and reduce network efficiency. The Neck performs feature fusion, using a feature pyramid to upsample the features output by the Backbone at different stages. To make the network more efficient and reduce the number of model parameters, the C2f module can be made lightweight, and an attention mechanism can be added to compensate for the loss of accuracy.
The Head is the final prediction part. It has three detection heads that work on feature maps of different sizes to detect and output target objects of different sizes. The overall structure is shown in Fig. 1.
Fig. 1. YOLOv8 structure diagram
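FasterNet, proposed by J. Chen et al., increases speed without sacrificing accuracy [5]. Its design starts from the relationship between latency and FLOPs shown in Equation (1), where FLOPS (floating-point operations per second) measures the effective computational speed. FasterNet is built on partial convolution (PConv), which applies a regular convolution to only c_p of the input channels and leaves the remaining channels untouched. For an h × w feature map and a k × k kernel, the FLOPs of PConv are only those given in formula (2). In addition, PConv has a smaller memory access, as shown in formula (3). The goal of FasterNet is to keep the architecture as simple as possible and generally hardware-friendly [6]. The specific architecture is shown in Fig. 2.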
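$$\text{Latency} = \frac{\text{FLOPs}}{\text{FLOPS}}, \tag{1}$$
$$h \times w \times k^2 \times c_p^2, \tag{2}$$
$$h \times w \times 2c_p + k^2 \times c_p^2 \approx h \times w \times 2c_p. \tag{3}$$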
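As a rough numerical illustration of formulas (2) and (3), the following Python sketch compares PConv with a regular convolution. The function names, the feature-map size and the ratio c_p = c/4 are assumptions chosen for illustration only; the cost of the regular convolution is obtained by substituting the full channel count c for c_p.

```python
def conv_flops(h, w, k, channels):
    """FLOPs of a k x k convolution over `channels` input and output channels (form of formula (2))."""
    return h * w * k * k * channels * channels


def conv_memory_access(h, w, k, channels):
    """Memory access: feature maps read and written plus kernel weights (left side of formula (3))."""
    return h * w * 2 * channels + k * k * channels * channels


# Illustrative values (assumptions, not taken from the paper).
h, w, k, c = 56, 56, 3, 64
c_p = c // 4  # PConv convolves only c_p of the c channels

flops_ratio = conv_flops(h, w, k, c_p) / conv_flops(h, w, k, c)
memory_ratio = conv_memory_access(h, w, k, c_p) / conv_memory_access(h, w, k, c)

print(f"PConv / regular FLOPs:         {flops_ratio:.4f}")
print(f"PConv / regular memory access: {memory_ratio:.4f}")
```

Under these assumptions the FLOPs ratio is (c_p / c)^2 = 1/16, and the memory access ratio is roughly one quarter, because the feature-map term h × w × 2c_p dominates.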
Fig. 2. FasterNet structure diagram
The C2f module uses more skip connections and additional split operations. Without high-performance hardware, the model computes slowly and cannot provide real-time assistance, which reduces its practicality [7]. Therefore, the Bottleneck module in the C2f module of YOLOv8 is replaced with the Block module of FasterNet, which makes the model lightweight while ensuring that its detection accuracy does not drop significantly. The resulting architecture is shown in Fig. 3.
Fig. 3. YOLOv8-F structure diagram
Lightweighting reduces detection accuracy. To balance speed and accuracy, the Efficient Multi-Scale Attention (EMA) mechanism is added to YOLOv8-F to form the YOLOv8-FE model [8]. Compared with attention mechanisms such as CBAM, NAM, SA, ECA and CA, the EMA attention mechanism proposed by D. Ouyang et al. not only achieves better results but also requires fewer parameters. Its structure is shown in Fig. 4 [9]. In it, the E step updates the attention map and the M step updates the set of bases; X represents the observed data, Z the latent variables and θ the model parameters, as shown in Equation (4).
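$$X = \{x_1, x_2, \ldots, x_N\}, \quad \ln p(X, Z \mid \theta),$$
$$Q(\theta, \theta^{old}) = \sum_{Z} p(Z \mid X, \theta^{old}) \ln p(X, Z \mid \theta),$$
$$\theta^{new} = \arg\max_{\theta} Q(\theta, \theta^{old}). \tag{4}$$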
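The alternation of E and M steps in Equation (4) can be illustrated on a toy problem. The sketch below is a generic expectation-maximization loop on scalar data, not the attention module itself: it assumes a mixture of two unit-variance Gaussians with equal weights, so that θ consists only of the two means. The function name and the data are hypothetical.

```python
import math


def em_two_means(data, mu_init, n_iter=50):
    """EM for a two-component, unit-variance, equal-weight 1-D Gaussian mixture."""
    mu = list(mu_init)
    for _ in range(n_iter):
        # E step: posterior p(Z | X, theta_old) of each component for each point.
        resp = []
        for x in data:
            weights = [math.exp(-0.5 * (x - m) ** 2) for m in mu]
            total = sum(weights)
            resp.append([wk / total for wk in weights])
        # M step: the means that maximize Q(theta, theta_old) are the
        # responsibility-weighted averages of the data.
        for k in range(len(mu)):
            mass = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / mass
    return mu


# Hypothetical observations clustered around -2 and +2.
data = [-2.1, -1.9, -2.0, 2.0, 2.1, 1.9]
means = em_two_means(data, mu_init=[-1.0, 1.0])
print([round(m, 2) for m in means])
```

Each pass first evaluates the posterior under the old parameters and then replaces the parameters with the maximizer of Q, which is the pattern Equation (4) describes.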
Fig. 4. EMA structure diagram
EMA models cross-channel feature interactions along the channel direction to achieve richer feature aggregation, which makes it more powerful [10]. EMA is added to the YOLOv8-F model to form YOLOv8-FE. The specific structure is shown in Fig. 5 below [11].
Fig. 5. YOLOv8-FE
Experimental results and analysis
Data set. The LUNA16 data set is a subset of LIDC-IDRI, the largest public pulmonary nodule data set. It is obtained from LIDC-IDRI by excluding CT images with a slice thickness greater than 3 mm and pulmonary nodules smaller than 3 mm. A total of 1186 images from LUNA16 are used, of which 80 % form the training set and 20 % the test set, that is, 948 and 238 images respectively. Lung parenchyma segmentation is performed on the data set: the average pixel values near the lungs are computed, washed-out images are renormalized, and K-means is used to separate foreground from background. After lung parenchyma segmentation the image background is cleaner, which greatly helps detection [12]. The comparison is shown in Fig. 6.
Fig. 6. Comparison before and after segmentation
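The K-means step of the segmentation described above can be sketched as follows. This is a minimal illustration, assuming that pixel intensities have already been normalized and that the darker cluster corresponds to the lung region; the function name and the sample intensities are hypothetical, and the actual preprocessing pipeline may differ.

```python
def kmeans_threshold(pixels, n_iter=20):
    """Two-cluster K-means on scalar intensities; returns the midpoint of the two centroids."""
    low, high = min(pixels), max(pixels)  # initial centroids
    for _ in range(n_iter):
        low_group = [p for p in pixels if abs(p - low) <= abs(p - high)]
        high_group = [p for p in pixels if abs(p - low) > abs(p - high)]
        if not low_group or not high_group:
            break  # all pixels fell into one cluster
        new_low = sum(low_group) / len(low_group)
        new_high = sum(high_group) / len(high_group)
        if (new_low, new_high) == (low, high):
            break  # centroids stopped moving
        low, high = new_low, new_high
    return (low + high) / 2


# Hypothetical normalized intensities: dark lung/air pixels and brighter tissue.
pixels = [-1.2, -1.0, -0.9, -1.1, 0.8, 1.0, 1.1, 0.9]
threshold = kmeans_threshold(pixels)
dark_mask = [p < threshold for p in pixels]  # True where the pixel is in the dark cluster
print(threshold, dark_mask)
```

On a real CT slice the same threshold would be applied to every pixel, typically followed by morphological clean-up of the resulting mask.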
The experiments were run on Windows 10 with Python 3.9, an Intel(R) Core(TM) i9-10980XE CPU @ 3.00 GHz and an NVIDIA RTX A6000 for training, and the model was trained for a total of 300 epochs. Several evaluation indicators commonly used in the field of object detection are selected and introduced below. Detection accuracy is measured by Precision, Recall, F1-Score and mAP, and detection speed is measured by Parameters, FPS and GFLOPs. mAP was selected as the most representative indicator of accuracy, and Parameters as the most representative indicator of speed.
The calculation formulas for Precision and Recall are given in (5) and (6). Positive and Negative denote the predicted class, while True and False denote whether the prediction is correct. TP is the number of samples correctly identified as positive, FP the number incorrectly identified as positive, and FN the number incorrectly identified as negative.
$$\text{Precision} = \frac{TP}{TP + FP}, \tag{5}$$
$$\text{Recall} = \frac{TP}{TP + FN}. \tag{6}$$
mAP is the mean of the average precision (AP) over all n categories, where AP_i is the average precision of category i. Formula (7) is as follows.
$$\text{mAP} = \frac{\sum_{i=1}^{n} AP_i}{n}. \tag{7}$$
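A possible way to evaluate formula (7) is sketched below. The paper does not state how each AP_i is computed, so the sketch assumes the common all-point interpolation of the precision-recall curve; the function names and the sample curve are hypothetical.

```python
def average_precision(recalls, precisions):
    """Area under the precision-recall curve (all-point interpolation).

    `recalls` must be sorted in ascending order and paired with `precisions`.
    """
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Make precision monotonically non-increasing from right to left.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the rectangles between consecutive recall values.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


def mean_average_precision(ap_values):
    """Formula (7): the mean of the per-category AP values."""
    return sum(ap_values) / len(ap_values)


# Hypothetical two-point precision-recall curve for a single category.
ap = average_precision([0.5, 1.0], [1.0, 0.5])
print(ap, mean_average_precision([ap]))
```

When only one category is annotated, n = 1 and mAP coincides with the AP of that category.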
F1-Score is the harmonic mean of Precision and Recall, which comprehensively reflects the accuracy of the model. Formula (8) is as follows.
$$F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}. \tag{8}$$
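Formulas (5), (6) and (8) translate directly into code. In the sketch below the function names and the detection counts are hypothetical; the last lines apply formula (8) to the Precision and Recall values that Table 1 reports for YOLOv8 and for net 3 (YOLOv8-FE).

```python
def precision(tp, fp):
    """Formula (5): share of predicted positives that are correct."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Formula (6): share of actual positives that are found."""
    return tp / (tp + fn)


def f1_score(p, r):
    """Formula (8): harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Hypothetical counts: 70 correct detections, 30 false alarms, 30 missed nodules.
p, r = precision(70, 30), recall(70, 30)
print(p, r, round(f1_score(p, r), 3))

# Precision and Recall from Table 1.
print(round(f1_score(0.691, 0.658), 3))  # YOLOv8
print(round(f1_score(0.709, 0.689), 3))  # net 3 (YOLOv8-FE)
```

The last two values come out as 0.674 and 0.699, which agrees with the F1-Score column of Table 1.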
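Parameter counts the learnable weights of the model, mainly those of the convolutional and fully connected layers. For a convolutional layer with C_in input channels, C_out output channels and a K × K kernel, where the term 1 accounts for the bias, formula (9) is as follows.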
$$\text{Parameter} = (C_{in} \times K^2 + 1) \times C_{out}. \tag{9}$$
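As a small worked example of formula (9), the sketch below counts the parameters of individual convolutional layers and sums them. The function name and the layer sizes are hypothetical and are not the layers of YOLOv8.

```python
def conv_parameters(c_in, c_out, k):
    """Formula (9): K*K*C_in weights plus one bias for each of the C_out filters."""
    return (c_in * k * k + 1) * c_out


# Hypothetical stack of convolutional layers given as (C_in, C_out, K).
layers = [(3, 16, 3), (16, 32, 3), (32, 64, 1)]
per_layer = [conv_parameters(c_in, c_out, k) for c_in, c_out, k in layers]
print(per_layer, sum(per_layer))
```

For the first layer this gives (3 × 9 + 1) × 16 = 448 parameters; the total of a model is the sum over all of its layers.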
FPS (Frames Per Second) is the number of frames processed per second. Formula (10) is as follows.
$$\text{FPS} = \frac{1}{\text{Processing time per frame}}. \tag{10}$$
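Formula (10) can be applied either to a known per-frame time or to a measured one. The sketch below is illustrative only: the function names are hypothetical, and the dummy workload stands in for a real detector.

```python
import time


def fps_from_frame_time(seconds_per_frame):
    """Formula (10): FPS is the reciprocal of the processing time per frame."""
    return 1.0 / seconds_per_frame


def measure_fps(process_frame, frames):
    """Average FPS of `process_frame` over a sequence of frames."""
    frames = list(frames)
    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
    elapsed = time.perf_counter() - start
    return fps_from_frame_time(elapsed / len(frames))


# A processing time of 3.2 ms per frame corresponds to 312.5 FPS.
print(round(fps_from_frame_time(0.0032), 1))

# Dummy workload in place of a detector (assumption for illustration).
measured = measure_fps(lambda frame: sum(range(10_000)), range(20))
print(f"{measured:.1f} FPS")
```

By the same relation, an FPS of 333.3 corresponds to about 3 ms per frame and an FPS of 303 to about 3.3 ms per frame.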
GFLOPs stands for giga floating-point operations, that is, 1 billion floating-point operations, and measures the computational cost of the model. For a convolutional layer it is computed as shown in (11).
$$\text{FLOPs} = 2 \times C_{out} \times H_{out} \times W_{out} \times C_{in} \times k^2,$$
$$1\ \text{GFLOPs} = 10^9\ \text{FLOPs}. \tag{11}$$
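To make formula (11) concrete, the sketch below evaluates it for a single convolutional layer. The function names and the layer dimensions are hypothetical and are not taken from the YOLOv8 configuration.

```python
def conv_layer_flops(c_in, c_out, k, h_out, w_out):
    """Formula (11): two operations (multiply and add) per weight per output position."""
    return 2 * c_out * h_out * w_out * c_in * k * k


def to_gflops(flops):
    """Convert an operation count to GFLOPs (1 GFLOPs = 10^9 FLOPs)."""
    return flops / 1e9


# Hypothetical layer: 64 -> 128 channels, 3 x 3 kernel, 80 x 80 output map.
flops = conv_layer_flops(c_in=64, c_out=128, k=3, h_out=80, w_out=80)
print(flops, round(to_gflops(flops), 3))
```

This layer alone costs 943,718,400 FLOPs, or about 0.944 GFLOPs; the figure for a whole model is the sum over its layers.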
Ablation experiment. To verify the contribution of each optimization scheme designed in the YOLOv8-FE network, the network was decomposed and each variant was tested on the same data set and in the same experimental environment. net 1: the FasterNet module replaces the C2f module in both the backbone and the neck of YOLOv8, which gives the YOLOv8-F network. net 2: the FasterNet-EMA module replaces the C2f module in the backbone of YOLOv8 only, and the neck is left unchanged. net 3: the FasterNet-EMA module replaces the C2f module in both the backbone and the neck of YOLOv8, which gives the final YOLOv8-FE network.
The ablation results for detection accuracy are shown in Table 1 below. Compared with YOLOv8, net 1 has lower mAP and F1-Score, mainly because of the FasterNet Block module; net 2 and net 3 have higher mAP and F1-Score, mainly because of the added EMA. The mAP of net 3 shows the largest increase, 10.2 %, which demonstrates the effectiveness of the improved model YOLOv8-FE (net 3).
Table 1
Precision ablation comparison
name Precision Recall mAP F1-Score
YOLOv8 0.691 0.658 0.64 0.674
net 1 0.683 0.562 0.589 0.617
net 2 0.688 0.664 0.673 0.676
net 3 0.709 0.689 0.705 0.699
A visual comparison of various solutions on Precision, Recall, and mAP is shown in Fig. 7. It can be seen more clearly and intuitively that the net3 curve is better than other solutions.
Fig. 7. Visual data comparison
The ablation results for detection speed are shown in Table 2 below. Compared with YOLOv8, the Parameters of net 1 are reduced by 23.5 % and those of net 3 by 23.2 %. Although net 3 reduces Parameters slightly less than net 1, the gap is small. Taking detection accuracy into account as well, net 3 is selected as the final model.
Table 2
Comparison of speed ablation
name Parameters FPS GFLOPs
YOLOv8 3005843 333.3 8.1
net 1 2300643 333.3 6.3
net 2 2305011 312.5 6.4
net 3 2309155 303 6.5
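The percentages quoted for the ablation study can be reproduced from the tabulated values. The following sketch uses the Parameters from Table 2 and the mAP values from Table 1; the helper name is hypothetical.

```python
def relative_change(new, old):
    """Relative change of `new` with respect to `old`, in percent."""
    return (new - old) / old * 100.0


# Parameters from Table 2, mAP from Table 1.
params = {"YOLOv8": 3005843, "net 1": 2300643, "net 3": 2309155}
m_ap = {"YOLOv8": 0.64, "net 3": 0.705}

net1_params = relative_change(params["net 1"], params["YOLOv8"])
net3_params = relative_change(params["net 3"], params["YOLOv8"])
net3_map = relative_change(m_ap["net 3"], m_ap["YOLOv8"])

print(f"net 1 Parameters: {net1_params:.1f} %")
print(f"net 3 Parameters: {net3_params:.1f} %")
print(f"net 3 mAP:        {net3_map:+.1f} %")
```

The three printed values are -23.5 %, -23.2 % and +10.2 %, all measured relative to the original YOLOv8.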
Different types of networks are compared to verify that the YOLOv8-FE proposed in this article has better detection accuracy. The algorithms selected are YOLOv5, YOLOv5-FE and YOLOv8.
Among them, YOLOv5-FE adds the same FasterNet-EMA module to the C3 module and serves as a comparison model.
The model comparison results for detection accuracy are shown in Table 3 below. The mAP value of YOLOv5 is 0.688, which is higher than that of YOLOv8 (0.64). After the improvement, the mAP value of YOLOv5-FE drops to 0.613, whereas the improved YOLOv8-FE reaches a mAP value of 0.705. Although the original YOLOv5 model has a better mAP than the original YOLOv8 model, the improved YOLOv8-FE is better than YOLOv5-FE, so YOLOv8 was selected as the baseline model for improvement.
Table 3
name Precision Recall mAP F1-Score
YOLOv5 0.75 0.642 0.688 0.692
YOLOv8 0.691 0.658 0.64 0.674
YOLOv5-FE 0.577 0.62 0.613 0.6
YOLOv8-FE 0.709 0.689 0.705 0.699
The visual comparison of the different types of networks on Precision, Recall and mAP is shown in Fig. 8. It can be clearly seen that the YOLOv8-FE curve lies above the other curves.
Fig. 8. Visualization results
The model comparison results for detection speed are shown in Table 4 below. The Parameters of YOLOv5 are 2503139 and those of the improved YOLOv5-FE are 2204691, the smallest among the compared models, which gives it a certain advantage. However, the Parameters of YOLOv8-FE are only 4.7 % higher than those of YOLOv5-FE, while its mAP is 15.0 % higher (0.705 versus 0.613). After comprehensive consideration, YOLOv8-FE is still chosen.
Table 4
Speed comparison
name Parameters FPS GFLOPs
YOLOv5 2503139 333.3 7.1
YOLOv8 3005843 333.3 8.1
YOLOv5-FE 2204691 303 6.5
YOLOv8-FE 2309155 303 6.5
Conclusion
The lightweight pulmonary nodule detection algorithm proposed in this article uses YOLOv8 as the baseline network and introduces the FasterNet lightweight network to form the YOLOv8-F network. This reduces the hardware resource consumption and the number of parameters of the neural network model, and addresses the problem that high-performance networks cannot be embedded in devices with low computing power. In addition, EMA attention is added to the network module to reduce the loss of feature information caused by lightweighting and to ease the extraction of features of lung nodules that are blurry, irregular in edge and inconsistent in size, which addresses the low overall detection accuracy of the model. The mAP of the improved algorithm reached 70.5 %, and the Parameters were reduced to 2309155. The algorithm is thus improved in both accuracy and speed.
Acknowledgments
This work was supported by the Project of Jilin Provincial Development and Reform Commission (2023C042-4), the Innovation and Entrepreneurship Talent Funding Project of Jilin Province (2023RY17) and the project of Jilin Provincial Education Department (SJZD23-01).
References
1. Cancer statistics, 2022 / R. L. Siegel, K. D. Miller, H. E. Fuchs, et al. // CA: A Cancer Journal for Clinicians. 2022. 72(1). pp. 7-33.
2. An attentive and adaptive 3D CNN for automatic pulmonary nodule detection in CT image / Zhao, Dandan, et al. // Expert Systems with Applications. 2023. 211.
3. Su Y, Chen X. Lung nodule detection based on Faster R-CNN framework // Computer Methods and Programs in Biomedicine. 2021. 200.
4. Goel L, Mishra S. A hybrid of modified YOLOv3 with BBO/EE optimizer for lung cancer detection // Multimedia Tools and Applications. 2023. pp. 1-33.
5. You only look once: Unified, real-time object detection / J. Redmon, S. Divvala, R. Girshick, et al. // Proceedings of the IEEE conference on computer vision and pattern recognition. 2016. pp. 779-788.
6. Run, don't walk: Chasing higher FLOPS for faster neural networks / J. Chen, S. Kao, H. He, et al. // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.
7. UAV-YOLOv8: a small-object-detection model based on improved YOLOv8 for UAV aerial photography scenarios / G. Wang, Y. Chen, P. An, et al. // Sensors. 2023. 23(16).
8. Liu K. STBi-YOLO: A real-time object detection method for lung nodule recognition // IEEE Access. 2022. 10.
9. Automatic detection of pulmonary nodules on CT images with YOLOv3: development and evaluation using simulated and patient data / C. Liu, S. Hu, C. Wang, et al. // Quantitative Imaging in Medicine and Surgery. 2020. 10(10).
10. Efficient multi-scale attention module with cross-spatial learning / D. Ouyang, S. He, G. Zhang, et al. // ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE. 2023.
11. Real-time pulmonary nodule detection algorithm combining attention and multi-path fusion / Zhao Kui, Qiu Huiqi, Li Xu, et al. // Computer Applications. 2024. 44(03). pp. 945-952.
12. Qin Yuanyuan, Zhang Hong. Pulmonary nodule detection algorithm based on attention feature pyramid network // Computer application. 2023. 43(07).
Title: Pulmonary nodule detection algorithm based on an improved YOLOv8 network model
Authors:
Qin Hongwu - Changchun University (China)
Wang Xinyu - Changchun University (China)
Xu Fei - Changchun University (China)
Chen Qiming - Changchun University (China)
Chye En Un - Pacific National University (Russia)
Voronin V. V. - Pacific National University (Russia)
Abstract: The appearance of pulmonary nodules is one of the early manifestations of lung cancer, so early screening and diagnosis of pulmonary nodules are of great importance. To address the problem of the accuracy of pulmonary nodule detection in CT images, this paper proposes a detection algorithm based on an improved YOLOv8 network model. The Block module of the FasterNet network replaces the Bottleneck module in the C2f module, and the EMA mechanism is introduced into the Block module. The experimental results show that the mAP value of the improved algorithm is 0.705 on the LUNA16 data set. Compared with the YOLOv8, YOLOv5 and YOLOv5-FE algorithms, the YOLOv8-FE algorithm improves the mAP value by 10.2 % in accuracy and by 23.2 % in detection speed, which gives certain advantages.
Keywords: detection algorithm, YOLOv8, pulmonary nodules, EMA attention, FasterNet