Adaptive Graph Convolutional 3D Point Cloud Recognition Algorithm Based on Attention Mechanism

doi:10.12068/j.issn.1005-3026.2024.06.004

Abstract:

To better capture the local geometric structure of 3D point clouds, an adaptive graph convolutional 3D point cloud recognition algorithm based on an attention mechanism is proposed. To address the drawback that fixed convolutional kernels ignore feature information, the algorithm first dynamically learns adaptive convolutional kernels from graph structural features. Second, to improve the model's ability to represent local geometric structures, a vector attention mechanism adaptively adjusts the weight distribution of the convolutional kernels. Third, a graph is constructed from the position features of the point cloud, and the adaptive kernels are used to convolve the newly constructed graph structural features. Finally, new point cloud features are obtained through pooling. Experimental results show that, compared with previous point cloud convolution algorithms, the proposed algorithm still extracts local geometric structure well and achieves high classification accuracy when the number of sampled points is small. On the ModelNet40, ScanObjectNN, and ShapeNetPart datasets, the proposed algorithm also shows certain advantages over current point cloud classification and segmentation methods.

Key words: 3D point clouds, attention mechanism, self-adaptation, graph convolution, dynamic learning
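The four steps named in the abstract (kernel generation from graph features, vector attention, convolution over a position graph, pooling) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the edge feature `[x_i, x_j - x_i]`, the single linear maps, the ReLU, the softmax over neighbours, max pooling, and every name (`knn_indices`, `edge_features`, `a_gconv_sketch`, `w_kernel`, `w_attn`) are assumptions chosen only to make the data flow concrete.

```python
import numpy as np


def knn_indices(xyz, k):
    """Return the indices of the k nearest neighbours of each point (self excluded)."""
    d2 = ((xyz[:, None, :] - xyz[None, :, :]) ** 2).sum(axis=-1)  # (N, N) squared distances
    np.fill_diagonal(d2, np.inf)                                  # a point is not its own neighbour
    return np.argsort(d2, axis=1)[:, :k]                          # (N, k)


def edge_features(x, idx):
    """Per-edge features [x_i, x_j - x_i] for each centre i and neighbour j (assumed form)."""
    centre = np.repeat(x[:, None, :], idx.shape[1], axis=1)       # (N, k, C)
    return np.concatenate([centre, x[idx] - centre], axis=-1)     # (N, k, 2C)


def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)                       # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def a_gconv_sketch(xyz, feats, k, w_kernel, w_attn):
    """Hypothetical adaptive graph convolution with vector attention (illustration only)."""
    idx = knn_indices(xyz, k)
    h = edge_features(feats, idx)            # graph structural features, (N, k, 2F)
    g = edge_features(xyz, idx)              # graph built from positions, (N, k, 6)
    n, _, d = g.shape
    m = w_attn.shape[1]                      # number of output channels
    # Step 1: generate one kernel per edge and per output channel from the feature graph.
    kernel = np.maximum(h @ w_kernel, 0.0).reshape(n, k, m, d)
    # Step 2: vector attention, i.e. one weight per channel, normalised over the k neighbours.
    attn = softmax(h @ w_attn, axis=1)       # (N, k, M)
    kernel = kernel * attn[..., None]
    # Step 3: convolve the position graph features with the reweighted adaptive kernels.
    response = np.einsum("nkmd,nkd->nkm", kernel, g)
    # Step 4: pool over the neighbourhood to obtain the new per-point features.
    return response.max(axis=1)              # (N, M)


rng = np.random.default_rng(0)
xyz = rng.normal(size=(32, 3))               # toy point cloud with 32 points
out_channels = 16
w_kernel = rng.normal(scale=0.1, size=(6, out_channels * 6))
w_attn = rng.normal(scale=0.1, size=(6, out_channels))
new_feats = a_gconv_sketch(xyz, xyz, k=8, w_kernel=w_kernel, w_attn=w_attn)
print(new_feats.shape)  # (32, 16)
```

In the first layer the input features are the coordinates themselves, which is why `feats` and `xyz` are the same array here. In a trained network the two weight matrices would be learned, and deeper layers would pass higher-dimensional features as `feats`.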

CLC number: TP 391.4

Yuan MA, Li-huang SHE, Jia-wei LI, Xi-rong BAO. Adaptive Graph Convolutional 3D Point Cloud Recognition Algorithm Based on Attention Mechanism[J]. Journal of Northeastern University (Natural Science), 2024, 45(6): 786-792.

Figures/Tables (11)

Fig.1 Transformation block

Fig.2 Diagram of A-GConv module processing near the target point

Fig.3 Classification network

Fig.4 Part segmentation network

Table 1 Classification results on the ModelNet40 dataset

Method	Input	Number of points	mAcc/%	OA/%
VoxNet	voxel	—	83.0	85.9
PointNet	xyz	1 024	86.0	89.2
PointNet++	xyz	1 024	—	90.7
PointNet	xyz	5 120	—	91.9
PointCNN	xyz	1 024	88.1	92.5
KPConv	xyz	7 168	—	92.9
DGCNN	xyz	1 024	—	92.9
Point Trans	xyz	1 024	—	92.8
PCT	xyz	1 024	—	93.2
PointNext	xyz	1 024	—	93.2
Ours	xyz	1 024	90.7	93.5
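The two columns in Table 1 follow the usual ModelNet40 convention: OA (overall accuracy) is the fraction of all test shapes classified correctly, while mAcc (mean class accuracy) averages the per-class accuracies so that rare classes count as much as frequent ones. The sketch below shows the difference on a toy example; the function name and the labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def overall_and_mean_class_accuracy(y_true, y_pred):
    """Return (OA, mAcc) for two equally long label sequences."""
    correct = defaultdict(int)   # correct predictions per ground-truth class
    total = defaultdict(int)     # samples per ground-truth class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    oa = sum(correct.values()) / len(y_true)
    macc = sum(correct[c] / total[c] for c in total) / len(total)
    return oa, macc


# Toy labels: class 0 has four samples (all correct), class 1 has two (one correct).
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 1, 0]
oa, macc = overall_and_mean_class_accuracy(y_true, y_pred)
print(round(oa, 3), round(macc, 3))  # 0.833 0.75
```

Because the classes are imbalanced, the single error in the small class lowers mAcc more than OA, which is why both values are usually reported together.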

Table 2 Classification results on the ScanObjectNN dataset (%)

Method	mAcc	OA
PointNet	63.4	68.2
PointNet++	75.4	77.9
PointCNN	75.1	78.5
DGCNN	73.6	78.1
PRANet[19]	79.1	82.1
Ours	79.3	82.7

Table 3 Segmentation results on the ShapeNetPart dataset (%)

Method	mcIoU	mIoU
PointNet	80.4	83.7
PointNet++	81.9	85.1
PointCNN	84.6	86.1
DGCNN	82.3	85.2
PRANet	85.1	86.4
Ours	84.8	86.7
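Table 3 reports two averages of the same per-shape score. Under the common ShapeNetPart convention, which is assumed here rather than stated in the source, each shape gets the mean IoU over the parts of its category (a part absent from both prediction and ground truth counts as IoU 1), mIoU averages that score over all shapes, and mcIoU first averages within each category and then across categories. The function names and toy data below are hypothetical.

```python
def shape_iou(pred, gt, part_ids):
    """Mean IoU over the parts of one shape; an absent part counts as IoU 1."""
    ious = []
    for part in part_ids:
        inter = sum(1 for p, g in zip(pred, gt) if p == part and g == part)
        union = sum(1 for p, g in zip(pred, gt) if p == part or g == part)
        ious.append(1.0 if union == 0 else inter / union)
    return sum(ious) / len(ious)


def instance_and_class_miou(shapes, parts_of):
    """Return (mIoU over all shapes, mcIoU over categories)."""
    per_category = {}
    for category, pred, gt in shapes:
        score = shape_iou(pred, gt, parts_of[category])
        per_category.setdefault(category, []).append(score)
    all_scores = [s for scores in per_category.values() for s in scores]
    miou = sum(all_scores) / len(all_scores)
    mciou = sum(sum(s) / len(s) for s in per_category.values()) / len(per_category)
    return miou, mciou


# Toy data: two chairs (parts 0 and 1) and one lamp (parts 2 and 3), four points each.
parts_of = {"chair": [0, 1], "lamp": [2, 3]}
shapes = [
    ("chair", [0, 0, 1, 1], [0, 0, 1, 1]),  # perfect prediction
    ("chair", [0, 0, 0, 1], [0, 0, 1, 1]),  # one mislabelled point
    ("lamp", [2, 2, 2, 2], [2, 2, 2, 2]),   # part 3 is absent from this shape
]
miou, mciou = instance_and_class_miou(shapes, parts_of)
print(round(miou, 3), round(mciou, 3))  # 0.861 0.896
```

The two numbers differ whenever categories contain different numbers of shapes, which explains why a method can lead on one column of Table 3 and trail on the other.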

Table 4 Classification accuracy with different numbers of ResG Blocks on the ModelNet40 dataset (%)

Blocks	mAcc	OA
1	87.9	90.1
2	89.3	92.3
3	90.7	93.3
4	89.5	92.7

Table 5 Classification accuracy using different numbers of neighborhood points K (%)

K	mAcc	OA
10	89.9	92.9
20	90.7	93.3
30	89.5	92.1
40	86.8	89.7

Table 6 Classification accuracy using different operators (%)

Operator	mAcc	OA
MLP	86.5	88.3
MLP+Pooling	87.9	90.5
Scalar	89.5	92.2
Ours	90.7	93.3

Table 7 Classification accuracy of networks after integration of the A-GConv module (%)

Network	mAcc	OA
PointNet	86.0	89.2
A-GConv+PointNet	87.5	90.5
DGCNN	—	92.9
A-GConv+DGCNN	—	93.1

References (19)

1	Riegler G, Ulusoy A O, Geiger A. OctNet: learning deep 3D representations at high resolutions[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, 2017: 6620-6629.
2	Klokov R, Lempitsky V. Escape from cells: deep KD-networks for the recognition of 3D point cloud models[C]//2017 IEEE International Conference on Computer Vision (ICCV). Venice, 2017: 863-872.
3	Qi C R, Su H, Mo K C, et al. PointNet: deep learning on point sets for 3D classification and segmentation[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, 2017: 77-85.
4	Shen Y R, Feng C, Yang Y Q, et al. Mining point cloud local structures by kernel correlation and graph pooling[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, 2018: 4548-4557.
5	Wang Y, Sun Y B, Liu Z W, et al. Dynamic graph CNN for learning on point clouds[J]. ACM Transactions on Graphics, 2019, 38(5): 1-12.
6	Devlin J, Chang M W, Lee K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[EB/OL]. 2018: arXiv:1810.04805.
7	Hu H, Zhang Z, Xie Z D, et al. Local relation networks for image recognition[C]//2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, 2019: 3463-3472.
8	Zhao H S, Jia J Y, Koltun V. Exploring self-attention for image recognition[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, 2020: 10073-10082.
9	Zhao H S, Jiang L, Jia J Y, et al. Point transformer[C]//2021 IEEE/CVF International Conference on Computer Vision (ICCV). Montreal, 2021: 16239-16248.
10	Liu Y C, Fan B, Xiang S M, et al. Relation-shape convolutional neural network for point cloud analysis[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, 2019: 8887-8896.
11	Wu Z R, Song S R, Khosla A, et al. 3D ShapeNets: a deep representation for volumetric shapes[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Boston, 2015: 1912-1920.
12	Loshchilov I, Hutter F. SGDR: stochastic gradient descent with warm restarts[EB/OL]. 2016: arXiv:1608.03983.
13	Zhou Y, Tuzel O. VoxelNet: end-to-end learning for point cloud based 3D object detection[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, 2018: 4490-4499.
14	Li Y, Bu R, Sun M, et al. PointCNN: convolution on X-transformed points[J]. Advances in Neural Information Processing Systems, 2018.
15	Thomas H, Qi C R, Deschaud J E, et al. KPConv: flexible and deformable convolution for point clouds[C]//2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, 2019: 6410-6419.
16	Guo M H, Cai J X, Liu Z N, et al. PCT: point cloud transformer[J]. Computational Visual Media, 2021, 7(2): 187-199.
17	Qian G, Li Y, Peng H, et al. PointNeXt: revisiting PointNet++ with improved training and scaling strategies[J]. Advances in Neural Information Processing Systems, 2022, 35: 23192-23204.
18	Uy M A, Pham Q H, Hua B S, et al. Revisiting point cloud classification: a new benchmark dataset and classification model on real-world data[C]//2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, 2019: 1588-1597.
19	Cheng S L, Chen X W, He X W, et al. PRANet: point relation-aware network for 3D point cloud analysis[J]. IEEE Transactions on Image Processing, 2021, 30: 4436-4448.
