Journal of Northeastern University (Natural Science), 2025, Vol. 46, Issue (1): 1-8. DOI: 10.12068/j.issn.1005-3026.2025.20230204
• Information & Control •
Ying-cai WAN, Li-jin FANG, Qian-kun ZHAO
Received: 2023-07-17
Online: 2025-01-15
Published: 2025-03-25
Ying-cai WAN, Li-jin FANG, Qian-kun ZHAO. Segmentation Method for Glass-like Object Based on Cross-Modal Fusion[J]. Journal of Northeastern University(Natural Science), 2025, 46(1): 1-8.
Method | Backbone | GDD | | | | Trans10k | | | |
---|---|---|---|---|---|---|---|---|---|
ICNet | ResNet-50 | 69.59 | 0.747 | 0.164 | 16.10 | 74.94 | 0.784 | 0.110 | 10.92 |
DeepLabv3+ | ResNet-50 | 69.95 | 0.767 | 0.147 | 15.49 | 51.52 | 0.602 | 0.229 | 23.80 |
MINet-R | ResNet-50 | 82.03 | 0.847 | 0.092 | 8.55 | 85.88 | 0.881 | 0.060 | 6.03 |
ITSD | ResNet-50 | 83.72 | 0.862 | 0.087 | 7.77 | 85.44 | 0.871 | 0.063 | 6.26 |
MirrorNet | ResNeXt-101 | 85.07 | 0.866 | 0.083 | 7.67 | 88.30 | 0.907 | 0.047 | 4.95 |
TransLab | ResNet-50 | 81.64 | 0.849 | 0.097 | 9.70 | 87.10 | 0.897 | 0.051 | 5.44 |
GDNet | ResNeXt-101 | 87.63 | 0.898 | 0.063 | 5.62 | 88.68 | 0.907 | 0.046 | 4.72 |
GSD | ResNeXt-101 | 88.07 | 0.932 | 0.059 | 5.71 | 89.16 | 0.937 | 0.043 | 4.50 |
PGSNet | ResNeXt-101 | 87.81 | 0.901 | 0.062 | 5.56 | 89.79 | 0.917 | 0.042 | 4.39 |
EBLNet | ResNeXt-101 | 88.16 | 0.939 | 0.059 | 5.58 | 90.28 | 0.947 | 0.048 | 4.14 |
Proposed method | Swin-s | 89.61 | 0.942 | 0.060 | 5.02 | 92.32 | 0.949 | 0.035 | 2.98 |
Table 1 Quantitative comparison with other methods on the GDD and Trans10k datasets.
Method | | | | |
---|---|---|---|---|
ICNet | 57.25 | 0.710 | 0.124 | 18.75 |
DeepLabv3+ | 78.81 | 0.872 | 0.054 | 8.95 |
MirrorNet | 78.95 | 0.857 | 0.065 | 6.39 |
EBLNet | 80.33 | 0.883 | 0.049 | 8.63 |
Proposed method | 86.26 | 0.909 | 0.045 | 8.03 |
Table 2 Quantitative comparison with other methods
Method | | | | |
---|---|---|---|---|
F3Net | 65.15 | 0.707 | 0.069 | 14.25 |
MirrorNet | 68.37 | 0.723 | 0.062 | 8.66 |
PMD | 72.27 | 0.775 | 0.054 | 10.71 |
PDNet | 77.77 | 0.825 | 0.042 | 7.77 |
Proposed method | 85.15 | 0.922 | 0.037 | 6.13 |
Table 3 Quantitative comparison with other methods on the RGBD-Mirror dataset
Method | | | | |
---|---|---|---|---|
PDNet | 77.77 | 0.825 | 0.042 | 7.77 |
PDNet (network-estimated depth) | 78.58 | 0.849 | 0.041 | 7.01 |
Proposed method (camera-captured depth) | 84.15 | 0.908 | 0.042 | 6.50 |
Proposed method (network-estimated depth) | 85.15 | 0.922 | 0.037 | 6.13 |
Table 5 Experimental results of different types of
Stage 1 | Stage 2 | Stage 3 | Stage 4 | mIoU |
---|---|---|---|---|
| 87.71 | |||
| 87.31 | |||
| 87.29 | |||
| 87.89 | |||
| | 88.69 | ||
| | | 89.48 | |
| | | | 89.61 |
Table 7 Results of mean intersection-over-union ratio at different fusion stages
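
For readers unfamiliar with the metric named in the Table 7 caption, the following is a minimal sketch of how a mean intersection-over-union score can be computed for binary glass masks. It is not the authors' evaluation code: the function names `binary_iou` and `mean_iou` are hypothetical, masks are assumed to be equally sized nested lists of 0/1, and the averaging is assumed to be per image, which may differ from the protocol used in the paper.

```python
def binary_iou(pred, target):
    """IoU between two binary masks given as equally sized nested lists of 0/1."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0   # pixel is glass in both masks
            union += 1 if (p or t) else 0    # pixel is glass in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def mean_iou(preds, targets):
    """Average the per-image IoU over a set of masks, as a percentage (assumed protocol)."""
    scores = [binary_iou(p, t) for p, t in zip(preds, targets)]
    return 100.0 * sum(scores) / len(scores)
```

With this convention a higher value is better, which matches the direction of the scores listed in Table 7.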
1 | Zhao H S, Qi X J, Shen X Y, et al. ICNet for real-time semantic segmentation on high-resolution images[C]//Proceedings of the European Conference on Computer Vision (ECCV 2018). Munich: Springer International Publishing, 2018: 418-434. |
2 | Wang D Q, Zhang T, Süsstrunk S. NEMTO: neural environment matting for novel view and relighting synthesis of transparent objects[C]//2023 IEEE/CVF International Conference on Computer Vision (ICCV). Paris: IEEE, 2023: 317-327. |
3 | Wang Lu, Wang Shuai, Zhang Guo-feng, et al. Pedestrian detection based on semantic segmentation attention and visible region prediction[J]. Journal of Northeastern University (Natural Science), 2021, 42(9): 1261-1267. (in Chinese) |
4 | Zhang Zhi-min, Qiao Jian-zhong, Lin Shu-kuan, et al. A view reconstruction method based on deep network[J]. Journal of Northeastern University (Natural Science), 2020, 41(8): 1065-1069. (in Chinese) |
5 | Wang Z Y, Li Y C, Cheng X N, et al. Key points trajectory and multi-level depth distinction based refinement for video mirror and glass segmentation[J]. Multimedia Tools and Applications, 2024, 83(39): 86513-86535. |
6 | Yang X, Mei H Y, Xu K, et al. Where is my mirror?[C]//2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul: IEEE, 2019: 8808-8817. |
7 | Lin J Y, He Z B, Lau R W H. Rich context aggregation with reflection prior for glass surface detection[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville: IEEE, 2021: 13410-13419. |
8 | Lin J Y, Wang G D, Lau R W H. Progressive mirror detection[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle: IEEE, 2020: 3694-3702. |
9 | Mei H Y, Yang X, Wang Y, et al. Don’t hit me! Glass detection in real-world scenes[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle: IEEE, 2020: 3684-3693. |
10 | He H, Li X T, Cheng G L, et al. Enhanced boundary learning for glass-like object segmentation[C]//2021 IEEE/CVF International Conference on Computer Vision (ICCV). Montreal: IEEE, 2021: 15839-15848. |
11 | Mei H Y, Dong B, Dong W, et al. Depth-aware mirror segmentation[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville: IEEE, 2021: 3043-3052. |
12 | Chang Q L, Liao H H, Meng X F, et al. PanoGlassNet: glass detection with panoramic RGB and intensity images[J]. IEEE Transactions on Instrumentation and Measurement, 2024, 73: 5019015. |
13 | Liu Z, Lin Y T, Cao Y, et al. Swin transformer: hierarchical vision transformer using shifted windows[C]//2021 IEEE/CVF International Conference on Computer Vision (ICCV). Montreal: IEEE, 2021: 9992-10002. |
14 | Yin W, Zhang J M, Wang O, et al. Learning to recover 3D scene shape from a single image[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville: IEEE, 2021: 204-213. |
15 | Taud H, Mas J F. Multilayer perceptron (MLP)[M]//Cámacho O M T, Paegelow M, Mas J F, et al. Geomatic Approaches for Modeling Land Change Scenarios. Cham: Springer, 2018: 451-455. |
16 | Zhao H S, Shi J P, Qi X J, et al. Pyramid scene parsing network[C]//2017 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu: IEEE, 2017: 6230-6239. |
17 | Deng J J, Pan Y W, Yao T, et al. MINet: meta-learning instance identifiers for video object detection[J]. IEEE Transactions on Image Processing, 2021, 30: 6879-6891. |
18 | Zhou H J, Xie X H, Lai J H, et al. Interactive two-stream decoder for accurate and fast saliency detection[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle: IEEE, 2020: 9138-9147. |
19 | Xie E Z, Wang W J, Wang W H, et al. Segmenting transparent objects in the wild[C]//Computer Vision and Pattern Recognition. Cham: Springer International Publishing, 2020: 696-711. |
20 | Wei J, Wang S H, Huang Q M. F3Net: fusion, feedback and focus for salient object detection[C]//Proceedings of the AAAI Conference on Artificial Intelligence. New York: IEEE, 2020: 12321-12328. |