Journal of Northeastern University (Natural Science) ›› 2024, Vol. 45 ›› Issue (12): 1696-1705. DOI: 10.12068/j.issn.1005-3026.2024.12.004

• Information & Control •

Transformer-based Multi-scale Underwater Image Enhancement Network

Ai-ping YANG, Si-jie FANG, Ming-fu SHAO, Teng-fei ZHANG

  1. School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China.
  • Received: 2023-06-06 Online: 2024-12-10 Published: 2025-03-18
  • Contact: Ai-ping YANG
  • About the author: Ai-ping YANG (1977-), female, born in Liaocheng, Shandong; associate professor and doctoral supervisor at Tianjin University.
  • Supported by:
    National Natural Science Foundation of China (62071323)

Abstract:

CNN (convolutional neural network)-based underwater image enhancement methods tend to neglect global features, which leads to color distortion and reduced contrast in the restored images and degrades overall visual perception. To address this, a Transformer-based multi-scale underwater image enhancement network (MTransNet) is proposed. To compensate for the missing global features, a position encoding module is designed that incorporates underwater image priors, and a Swin Transformer module suited to underwater scenes is constructed; its self-attention mechanism extracts global image features and improves global perception. To counter the blurring of local detail, a CNN module is designed to capture local features such as textures and edges, improving detail perception. A transfer fusion module then transfers the global attention of the Swin Transformer onto the convolutional features, so that global and local features are fused and exploited efficiently. Experimental results show that the proposed method reaches a PSNR of up to 23.47 dB on subsets of EUVP, effectively enhancing global visual perception and significantly improving image visual quality.

Key words: underwater image enhancement, deep learning, Transformer, convolutional neural network, transfer fusion
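
The abstract quotes a PSNR figure without defining the metric. For reference, PSNR is conventionally computed as 10·log10(MAX² / MSE), where MSE is the mean squared error between the reference and the enhanced image and MAX is the peak pixel value. The following is a minimal sketch of that standard definition, assuming 8-bit intensities (peak 255) and images flattened into plain sequences. The function name `psnr` and the sample pixel values are illustrative and do not come from the paper, whose exact evaluation protocol (color space, averaging over images) is not stated here.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are given as flat sequences of pixel intensities.
    max_value is the peak intensity (255 is assumed for 8-bit images).
    """
    if len(reference) != len(enhanced):
        raise ValueError("images must have the same number of pixels")
    if len(reference) == 0:
        raise ValueError("images must not be empty")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Illustrative values only: a 2x2 grayscale patch, flattened.
reference = [52, 55, 61, 59]
enhanced = [54, 55, 60, 62]
print(f"{psnr(reference, enhanced):.2f} dB")
```

Higher values indicate that the enhanced image is closer to the reference. Because the scale is logarithmic, a gain of about 3 dB corresponds to roughly halving the mean squared error.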

CLC number: