Journal of Northeastern University (Natural Science) ›› 2021, Vol. 42 ›› Issue (9): 1261-1267. DOI: 10.12068/j.issn.1005-3026.2021.09.007

• Information & Control •

Pedestrian Detection Based on Semantic Segmentation Attention and Visible Region Prediction

WANG Lu1, WANG Shuai1, ZHANG Guo-feng1, XU Li-sheng2,3

  1. School of Computer Science & Engineering, Northeastern University, Shenyang 110169, China; 2. School of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China; 3. Neusoft Research of Intelligent Healthcare Technology, Co., Ltd., Shenyang 110167, China.
  • Revised: 2021-01-04 Accepted: 2021-01-04 Published: 2021-09-16
  • Contact: WANG Lu (Chinese record); XU Li-sheng (English record)
  • About author: WANG Lu (b. 1980), female, from Shenyang, Liaoning, associate professor at Northeastern University; XU Li-sheng (b. 1975), male, from Anqing, Anhui, professor and doctoral supervisor at Northeastern University.
  • Supported by:
    Fundamental Research Funds for the Central Universities (N181604006); Natural Science Foundation of Liaoning Province (20170540312); National Natural Science Foundation of China (61773110); Shenyang Science and Technology Plan Fund (20-201-4-10).

Abstract: To improve detection accuracy for occluded and small-scale pedestrians in images, a pedestrian detection method based on semantic segmentation attention and visible region prediction is proposed. Building on the single shot multi-box detector (SSD) object detection network, the hyperparameter settings of SSD are first optimized to make it better suited to pedestrian detection. A semantic segmentation attention branch is then introduced into the backbone network to strengthen the representational power of the pedestrian detection features. Finally, a detection prediction module is proposed that simultaneously predicts the full body and the visible region of each pedestrian, and that uses the features learned by the visible region prediction branch to guide the learning of the full-body detection features, thereby improving detection accuracy. Experiments on the Caltech pedestrian detection dataset show that the proposed method achieves a log-average miss rate of 5.5%, which is competitive with existing pedestrian detection approaches.
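The log-average miss rate quoted in the abstract is the usual summary metric on the Caltech benchmark. As a minimal sketch, assuming the common Caltech protocol (miss rate sampled at nine false-positives-per-image points evenly spaced in log space over [10^-2, 10^0], then averaged geometrically), it could be computed as follows. The function name `log_average_miss_rate` and its inputs are illustrative assumptions, not the authors' evaluation code.

```python
import numpy as np


def log_average_miss_rate(fppi, miss_rate):
    """Geometric mean of the miss rate sampled at 9 FPPI reference points.

    fppi:      false positives per image, sorted in non-decreasing order
    miss_rate: miss rate at the same operating points (same length as fppi)
    Both arrays are assumed inputs produced by a separate matching step.
    """
    fppi = np.asarray(fppi, dtype=float)
    miss_rate = np.asarray(miss_rate, dtype=float)

    # Sentinel point: if no detection is accepted, every pedestrian is missed.
    fppi = np.concatenate(([-np.inf], fppi))
    miss_rate = np.concatenate(([1.0], miss_rate))

    # Nine reference FPPI values evenly spaced in log space over [1e-2, 1e0].
    refs = np.logspace(-2.0, 0.0, num=9)

    sampled = []
    for ref in refs:
        # Index of the last operating point whose FPPI does not exceed ref.
        idx = np.searchsorted(fppi, ref, side="right") - 1
        sampled.append(miss_rate[idx])

    # Clamp to avoid log(0), then average in log space.
    sampled = np.maximum(np.array(sampled), 1e-10)
    return float(np.exp(np.mean(np.log(sampled))))
```

Averaging in log space means low miss rates at strict FPPI settings are not swamped by higher ones, which is why the metric is reported as a single percentage.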

Key words: pedestrian detection; convolutional neural network; semantic segmentation attention (SSA); pedestrian visible region prediction; multi-task network

CLC Number: