Journal of Northeastern University (Natural Science), 2023, Vol. 44, Issue (7): 953-963. DOI: 10.12068/j.issn.1005-3026.2023.07.006

• Mechanical Engineering •

Gesture Recognition in the Complex Environment Based on Gan-St-YOLOv5

HAO Bo1,2, YIN Xing-chao1, YAN Jun-wei1, ZHANG Li1,2   

  1. Key Laboratory of Vibration and Control of Aero-Propulsion System of Ministry of Education, Northeastern University, Shenyang 110819, China; 2. School of Control Engineering, Northeastern University at Qinhuangdao, Qinhuangdao 066004, China.
  • Published: 2023-07-13
  • Corresponding author: HAO Bo
  • Contact: YIN Xing-chao
  • About author: HAO Bo (1963-), male, born in Shenyang, Liaoning; professor and doctoral supervisor at Northeastern University.
  • Supported by:
    National Defense Basic Scientific Research Program (JCKY2018110C012); National Natural Science Foundation of China (51905082).

Abstract: In human-computer interaction based on gesture recognition in the complex environments of intelligent industrial production, gesture features are degraded by partial occlusion, strong illumination, and small distant targets, so fewer gesture features are recognized during target detection and classification errors may even occur. Improving the accuracy of gesture recognition in complex environments has therefore become an urgent problem in human-computer interaction tasks. An innovative Gan-St-YOLOv5 model is proposed: on top of YOLOv5, it adds a generative adversarial network (GAN) and Swin Transformer modules, incorporates the SENet channel attention mechanism, and uses the Confluence bounding-box selection algorithm to improve detection accuracy. To verify the superiority of the model, it is compared with YOLOv5: the mAP_0.5 of Gan-St-YOLOv5 reaches 96.1% on the fully visible test set, 92.3% on the strong-illumination test set, and 86.6% on the partial-occlusion test set, and its accuracy reaches 96.4% on the distant small-target test set. All results surpass the YOLOv5 detector, achieving higher accuracy with only a small loss of efficiency.
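For readers unfamiliar with the SENet channel attention mechanism named in the abstract, the minimal PyTorch sketch below shows a generic squeeze-and-excitation block. It is not the authors' implementation: the class name SEBlock, the channel count, and the reduction ratio of 16 are illustrative assumptions, and where such a block would be inserted in the YOLOv5 network is not specified on this page.

# Minimal squeeze-and-excitation (SENet) channel attention sketch, PyTorch.
# Illustrative only; names and hyperparameters are assumptions, not the paper's code.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation channel attention."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)           # squeeze: global average pooling
        self.fc = nn.Sequential(                      # excitation: two FC layers
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                  # reweight feature channels

if __name__ == "__main__":
    feat = torch.randn(1, 256, 20, 20)                # e.g. a detector neck feature map
    print(SEBlock(256)(feat).shape)                   # torch.Size([1, 256, 20, 20])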

Key words: Gan-St-YOLOv5; gesture recognition; partial occlusion; strong illumination; distant small target

CLC number: