
Journal of Northeastern University (Natural Science), 2026, Vol. 47, Issue 1: 11-19. DOI: 10.12068/j.issn.1005-3026.2026.20259020
Received: 2025-06-06
Online: 2026-01-15
Published: 2026-03-17
Corresponding author: Chao YAO
Chao YAO¹, Zi-xuan GAO², Jun-ru CHEN³, Yi-peng LU⁴
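Abstract:
Medical image processing pipelines that rely on standalone codec components cannot jointly optimize data compression and machine vision tasks. To address this, this paper builds an end-to-end machine vision task-driven medical image compression network (MVMICNet) that unifies data compression and medical image analysis in a single model. To preserve machine vision task performance after compression, a task-aware improved rate-accuracy loss function is designed; it introduces task-related loss terms so that the trade-off among bitrate, reconstructed-image distortion, and machine vision task accuracy is balanced dynamically during optimization. In addition, MVMICNet is trained in stages, with optimization tailored to the characteristics of each machine vision task, so that the model captures the features that matter most for diagnosis, improves compression efficiency and task performance together, and is more robust in complex medical application scenarios. Finally, the framework is validated on semantic segmentation and object detection tasks.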
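The task-aware rate-accuracy loss described in the abstract combines three quantities: bitrate, reconstruction distortion, and a task loss. The abstract does not give the exact formula, so the following is only a minimal sketch of a generic weighted sum. The function name, the weights `w_distortion` and `w_task`, and the numbers in the usage lines are all hypothetical, and which weight (if any) corresponds to the paper's λ2 is not stated here.

```python
def rate_accuracy_loss(bpp, distortion, task_loss, w_distortion, w_task):
    """Generic weighted rate-distortion-task objective (illustrative only).

    bpp:        estimated bitrate of the latent code, in bits per pixel
    distortion: reconstruction error, e.g. mean squared error
    task_loss:  loss of the downstream vision task on the decoded image
    The weights are hypothetical trade-off coefficients, not the paper's values.
    """
    return bpp + w_distortion * distortion + w_task * task_loss


# Raising w_task makes the same task error cost more, which pushes the
# optimizer toward reconstructions that preserve task accuracy.
low = rate_accuracy_loss(0.10, 0.002, 0.5, w_distortion=100.0, w_task=0.1)
high = rate_accuracy_loss(0.10, 0.002, 0.5, w_distortion=100.0, w_task=1.0)
print(low, high)
```

In a real training loop the three inputs would be differentiable tensors produced by the codec and the task network; plain floats are used here only to keep the sketch self-contained.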
Chao YAO, Zi-xuan GAO, Jun-ru CHEN, Yi-peng LU. Joint Optimization Approach for Medical Image Compression and Vision Tasks[J]. Journal of Northeastern University (Natural Science), 2026, 47(1): 11-19.
| Algorithm | Bpp | PSNR/dB | MS-SSIM | mIoU (Stage 1) | mIoU (Stage 2) |
|---|---|---|---|---|---|
| BPG | 0.090 | 30.02 | 0.908 | 0.4987 | |
| BPG | 0.100 | 31.40 | 0.929 | 0.5591 | |
| BPG | 0.114 | 33.01 | 0.944 | 0.6183 | |
| BPG | 0.132 | 34.53 | 0.956 | 0.6683 | |
| MBT2018-Mean | 0.080 | 32.55 | 0.907 | 0.6135 | |
| MBT2018-Mean | 0.094 | 33.84 | 0.938 | 0.6745 | |
| MBT2018-Mean | 0.111 | 35.15 | 0.952 | 0.7220 | |
| MBT2018-Mean | 0.130 | 35.98 | 0.969 | 0.7548 | |
| Cheng2020-Anchor | 0.068 | 36.45 | 0.934 | 0.6971 | |
| Cheng2020-Anchor | 0.089 | 37.98 | 0.952 | 0.7435 | |
| Cheng2020-Anchor | 0.112 | 39.04 | 0.972 | 0.7898 | |
| Cheng2020-Anchor | 0.137 | 39.82 | 0.978 | 0.8192 | |
| MVMICNet | 0.065 | 40.38 | 0.982 | 0.7804 | 0.8226 |
| MVMICNet | 0.083 | 41.25 | 0.984 | 0.8252 | 0.8317 |
| MVMICNet | 0.105 | 42.08 | 0.987 | 0.8326 | 0.8490 |
| MVMICNet | 0.131 | 42.83 | 0.989 | 0.8337 | 0.8645 |

Note: BPG, MBT2018-Mean, and Cheng2020-Anchor each report a single mIoU value, listed in the Stage 1 column; only MVMICNet reports both stages.
Table 1 Comparison of semantic segmentation accuracy on the CVC-ColonDB dataset
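Table 1 reports bitrate (Bpp), PSNR, and mIoU. As a reminder of how such columns are commonly computed, here is a small pure-Python sketch based on their standard definitions. The function names and the flat-list inputs are illustrative choices, not the authors' evaluation code, and the paper may differ in details (for example, which classes are averaged for mIoU).

```python
import math


def bits_per_pixel(num_bytes, height, width):
    """Compressed size in bits divided by the number of pixels."""
    return 8 * num_bytes / (height * width)


def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB for two equal-length pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in either label list."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy 4-pixel example: class 0 has IoU 1/2, class 1 has IoU 2/3.
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

MS-SSIM is omitted because it needs a multi-scale filtering pipeline that does not fit in a short sketch.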
Fig. 3 Comparison of rate-accuracy curves for different algorithms on the CVC-ColonDB and ChestX-Det datasets: (a) CVC-ColonDB; (b) ChestX-Det; (c) ChestX-Det; (d) ChestX-Det.
Fig. 4 Visual comparison of semantic segmentation results for different algorithms on the CVC-ColonDB dataset: (a) original image; (b) segmentation result for the original image; (c) segmentation result for the BPG-compressed image; (d) segmentation result for the Cheng2020-Anchor-compressed image; (e) segmentation result for the MVMICNet Stage 1 compressed image; (f) MVMICNet Stage 2 segmentation result.
| Metric | MVMICNet | MVMICNet | MVMICNet | MVMICNet | BPG (q=34) | BPG (q=31) | BPG (q=28) | BPG (q=25) |
|---|---|---|---|---|---|---|---|---|
| Bpp | 0.047 | 0.060 | 0.074 | 0.089 | 0.046 | 0.060 | 0.079 | 0.091 |
| PSNR/dB | 41.14 | 41.93 | 42.57 | 43.10 | 35.98 | 37.50 | 38.76 | 39.52 |
| MS-SSIM | 0.9847 | 0.9877 | 0.9898 | 0.9913 | 0.9594 | 0.9695 | 0.9767 | 0.9834 |
| mAP (IoU=0.50:0.95), Stage 1 | 0.0797 | 0.0899 | 0.1010 | 0.1154 | 0.0300 | 0.0491 | 0.0300 | 0.0934 |
| mAP (IoU=0.50:0.95), Stage 2 | 0.0838 | 0.0949 | 0.1056 | 0.1201 | | | | |
| mAP (IoU=0.50), Stage 1 | 0.1649 | 0.1901 | 0.2184 | 0.2410 | 0.0640 | 0.1069 | 0.0640 | 0.2033 |
| mAP (IoU=0.50), Stage 2 | 0.1759 | 0.2036 | 0.2304 | 0.2536 | | | | |
| mAP (IoU=0.75), Stage 1 | 0.0693 | 0.0820 | 0.0920 | 0.1006 | 0.0235 | 0.0363 | 0.0235 | 0.0847 |
| mAP (IoU=0.75), Stage 2 | 0.0719 | 0.0847 | 0.0936 | 0.1031 | | | | |

Note: BPG has a single mAP value per setting, listed in the Stage 1 rows; only MVMICNet reports both stages.
Table 2 Comparison of object detection accuracy on the ChestX-Det dataset (part 1)
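The mAP rows are split by IoU threshold. The notation IoU=0.50:0.95 matches the COCO evaluation convention, in which average precision is averaged over ten thresholds from 0.50 to 0.95 in steps of 0.05; that the paper follows this convention is an assumption. The sketch below shows the box IoU behind each threshold test and lists those thresholds. The `(x1, y1, x2, y2)` box format and the function names are illustrative.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Ten thresholds 0.50, 0.55, ..., 0.95 (COCO-style, assumed here).
COCO_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, gt_box):
    """Thresholds at which a prediction counts as a hit for this ground truth."""
    iou = box_iou(pred_box, gt_box)
    return [t for t in COCO_THRESHOLDS if iou >= t]


# A prediction covering 72% of the ground-truth box (IoU = 0.72) is a hit
# at the looser thresholds but a miss at IoU=0.75 and above.
print(matched_thresholds((0, 0, 100, 72), (0, 0, 100, 100)))
```

This is why the IoU=0.75 rows are consistently lower than the IoU=0.50 rows: the stricter threshold rejects detections that are only roughly localized.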
| Metric | MBT2018-Mean (q=5) | MBT2018-Mean (q=4) | MBT2018-Mean (q=3) | MBT2018-Mean (q=2) | Cheng2020-Anchor (q=5) | Cheng2020-Anchor (q=4) | Cheng2020-Anchor (q=3) | Cheng2020-Anchor (q=2) |
|---|---|---|---|---|---|---|---|---|
| Bpp | 0.041 | 0.054 | 0.072 | 0.097 | 0.043 | 0.061 | 0.079 | 0.093 |
| PSNR/dB | 37.36 | 38.60 | 39.98 | 40.81 | 39.41 | 40.63 | 41.63 | 42.36 |
| MS-SSIM | 0.9691 | 0.9740 | 0.9788 | 0.9845 | 0.9796 | 0.9832 | 0.9854 | 0.9866 |
| mAP (IoU=0.50:0.95) | 0.0484 | 0.0699 | 0.0864 | 0.0967 | 0.0649 | 0.0840 | 0.0967 | 0.1062 |
| mAP (IoU=0.50) | 0.1024 | 0.1427 | 0.1831 | 0.2141 | 0.1282 | 0.1734 | 0.2082 | 0.2237 |
| mAP (IoU=0.75) | 0.0361 | 0.0542 | 0.0723 | 0.0871 | 0.0554 | 0.0747 | 0.0876 | 0.0924 |
Table 3 Comparison of object detection accuracy on the ChestX-Det dataset (part 2)
| λ2 | Bpp | PSNR/dB | mIoU/% | Accuracy/% |
|---|---|---|---|---|
| 0.1 | 0.25230 | 24.250 | 33.040 | 45.5 |
| 0.001 | 0.24560 | 31.880 | 45.780 | 63.0 |
| 0.0001 | 0.24320 | 35.080 | 58.030 | 79.9 |
| 0.00001 | 0.24310 | 34.570 | 52.710 | 72.6 |
Table 4 Impact of parameter λ2 on semantic segmentation performance
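Table 4 is a small hyperparameter sweep, and it can be read programmatically. The sketch below copies the four rows from the table and picks the λ2 with the highest mIoU; the selection rule (maximize mIoU) and the variable names are illustrative assumptions, not a procedure stated in the source.

```python
# Rows copied from Table 4:
# (lambda_2, bpp, psnr_db, miou_percent, accuracy_percent)
TABLE_4 = [
    (0.1, 0.25230, 24.250, 33.040, 45.5),
    (0.001, 0.24560, 31.880, 45.780, 63.0),
    (0.0001, 0.24320, 35.080, 58.030, 79.9),
    (0.00001, 0.24310, 34.570, 52.710, 72.6),
]

# Choose the setting with the highest mIoU (column index 3).
best = max(TABLE_4, key=lambda row: row[3])
print("best lambda_2 by mIoU:", best[0])

# Check whether the same row also wins on PSNR and accuracy.
same_on_psnr = best == max(TABLE_4, key=lambda row: row[2])
same_on_accuracy = best == max(TABLE_4, key=lambda row: row[4])
print(same_on_psnr, same_on_accuracy)
```

Among the four values listed, λ2 = 0.0001 gives the highest PSNR, mIoU, and accuracy, while Bpp changes only slightly across the sweep (0.24310 to 0.25230).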