Journal of Northeastern University (Natural Science), 2023, Vol. 44, Issue (7): 953-963. DOI: 10.12068/j.issn.1005-3026.2023.07.006

• Mechanical Engineering •

Gesture Recognition in the Complex Environment Based on Gan-St-YOLOv5

HAO Bo1,2, YIN Xing-chao1, YAN Jun-wei1, ZHANG Li1,2   

  1. Key Laboratory of Vibration and Control of Aero-Propulsion System of Ministry of Education, Northeastern University, Shenyang 110819, China; 2. School of Control Engineering, Northeastern University at Qinhuangdao, Qinhuangdao 066004, China.
  • Published: 2023-07-13
  • Corresponding author: HAO Bo
  • Contact: YIN Xing-chao
  • About author: HAO Bo (1963-), male, born in Shenyang, Liaoning; professor and doctoral supervisor at Northeastern University.
  • Supported by:
    National Defense Basic Scientific Research Program (JCKY2018110C012); National Natural Science Foundation of China (51905082).

Abstract: In human-computer interaction based on gesture recognition in the complex environments of intelligent industrial production, gesture features are degraded by partial occlusion, strong illumination, and small distant targets, so fewer gesture features are recognized during target detection and classification errors may even occur. Improving the accuracy of gesture recognition in complex environments has therefore become an urgent problem in human-computer interaction tasks. An innovative Gan-St-YOLOv5 model is proposed: on top of YOLOv5, it adds a generative adversarial network (GAN) and Swin Transformer modules, incorporates the SENet channel attention mechanism, and uses the Confluence bounding-box selection algorithm to improve detection accuracy. To verify the superiority of the model, it is compared with YOLOv5: the mAP_0.5 of Gan-St-YOLOv5 reaches 96.1% on the fully visible test set, 92.3% on the strong-illumination test set, and 86.6% on the partial-occlusion test set, and its accuracy reaches 96.4% on the distant small-target test set. All results surpass the YOLOv5 detector, achieving higher accuracy with only a small loss of efficiency.
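For readers unfamiliar with the SENet channel attention mechanism named in the abstract, the minimal PyTorch sketch below shows a generic squeeze-and-excitation block. It is not the authors' implementation: the class name SEBlock, the channel count, and the reduction ratio of 16 are illustrative assumptions, and where such a block would be inserted in the YOLOv5 network is not specified on this page.

# Minimal squeeze-and-excitation (SENet) channel attention sketch, PyTorch.
# Illustrative only; names and hyperparameters are assumptions, not the paper's code.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation channel attention."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)           # squeeze: global average pooling
        self.fc = nn.Sequential(                      # excitation: two FC layers
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                  # reweight feature channels

if __name__ == "__main__":
    feat = torch.randn(1, 256, 20, 20)                # e.g. a detector neck feature map
    print(SEBlock(256)(feat).shape)                   # torch.Size([1, 256, 20, 20])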

Key words: Gan-St-YOLOv5; gesture recognition; partial occlusion; strong illumination; distant small target

CLC number: