Existing CNN (convolutional neural network)-based underwater image enhancement methods neglect global visual perception, leading to color distortion and contrast degradation.
A Transformer-based multi-scale underwater image enhancement network (MTransNet) is proposed. To address the lack of global visual perception, a position encoding module is designed based on underwater image priors, a Swin Transformer module suited to underwater scenes is constructed, and a self-attention mechanism is built to improve global perception performance. To address the detail blurring present in current methods, a CNN module is developed to capture local features such as textures and edges, improving local perception performance. A transfer fusion module is built to transfer the global attention of the Swin Transformer to the local convolutional features, achieving full fusion and utilization of global and local features. The PSNR on subsets of EUVP reaches 23.47 dB, demonstrating that the method significantly enhances global visual perception and improves image visual quality.
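The interplay of the three branches described above can be sketched in plain NumPy: a self-attention pass supplies global context, a small 1-D convolution stands in for the local texture/edge branch, and a convex combination stands in for the transfer fusion step. All function names, the identity Q/K/V projections, and the fixed fusion weight `alpha` are illustrative assumptions, not the paper's actual modules.

```python
import numpy as np

def self_attention(x):
    # x: (n_tokens, d). Single-head scaled dot-product attention (global branch).
    # Identity Q/K/V projections are used for brevity (an assumption).
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)            # rows sum to 1
    return w @ x

def local_conv(x, kernel):
    # 1-D convolution over the token axis, applied per channel (local branch).
    pad = len(kernel) // 2
    xp = np.pad(x, ((pad, pad), (0, 0)), mode="edge")
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        out[i] = (kernel[:, None] * xp[i:i + len(kernel)]).sum(axis=0)
    return out

def fuse(global_feat, local_feat, alpha=0.5):
    # Hypothetical fusion: blend global attention output with local features
    # via a convex combination (stand-in for the transfer fusion module).
    return alpha * global_feat + (1 - alpha) * local_feat

rng = np.random.default_rng(0)
tokens = rng.standard_normal((8, 4))                  # 8 "patch" tokens, 4 channels
g = self_attention(tokens)                            # global perception
l = local_conv(tokens, np.array([0.25, 0.5, 0.25]))   # local texture/edge cues
fused = fuse(g, l)
print(fused.shape)
```

In the actual network the fusion is learned rather than a fixed blend, and the global branch operates on windowed multi-scale features; this sketch only illustrates how the two feature streams are combined.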