Journal of Northeastern University (Natural Science), 2023, Vol. 44, Issue (1): 26-32. DOI: 10.12068/j.issn.1005-3026.2023.01.004

• Information and Control •

Improved Two-Branch Person Re-identification Algorithm Based on Transformer

LIU Yang, YAN Dong-mei, MENG Fan-wei

  1. School of Control Engineering, Northeastern University at Qinhuangdao, Qinhuangdao 066004, Hebei, China.
  • Published: 2023-01-30
  • Contact: LIU Yang
  • About author: LIU Yang (born 1997), male, from Tengzhou, Shandong, master's student at Northeastern University; YAN Dong-mei (born 1970), female, from Beijing, associate professor and master's supervisor at Northeastern University at Qinhuangdao.
  • Supported by:
    Natural Science Foundation of Hebei Province (F2019501012).

Abstract: To address the insufficient global information modeling of person re-identification algorithms based on convolutional neural networks, the limitations of the convolution operation are analyzed and an improved global-local two-branch person re-identification algorithm based on Transformer is proposed. First, the multi-head self-attention mechanism is improved with relative position encoding and embedded into the ResNet50 backbone network. Then, in the global branch, the image is partitioned geometrically in space and the Transformer's global receptive field is used to strengthen the extraction of abstract features; in the local branch, dimensionality-reduction supervision is applied to the Layer_3 output, and multi-scale pooling is used to obtain richer local features. Experimental results show that the algorithm reaches mAP/Rank-1 of 93.45%/95.61% on the public dataset Market-1501 and 88.79%/90.35% on DukeMTMC-reID, a higher accuracy than algorithms based purely on convolutional neural networks.

Key words: person re-identification; Transformer; convolutional neural network; feature extraction; multi-head self-attention mechanism
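
The abstract reports results as mAP/Rank-1. As a reading aid, the following is a minimal sketch of how these two retrieval metrics are commonly computed from ranked gallery lists. It is not the authors' evaluation code: the function names and the toy data are hypothetical, and the sketch omits the camera-based filtering of gallery images that the standard Market-1501 and DukeMTMC-reID protocols apply.

```python
from typing import List, Sequence, Tuple


def rank1_hit(ranked_ids: Sequence[int], query_id: int) -> bool:
    """True if the top-ranked gallery image shares the query identity."""
    return len(ranked_ids) > 0 and ranked_ids[0] == query_id


def average_precision(ranked_ids: Sequence[int], query_id: int) -> float:
    """Mean of precision@k over the ranks k at which a correct match occurs."""
    hits = 0
    precision_sum = 0.0
    for k, gallery_id in enumerate(ranked_ids, start=1):
        if gallery_id == query_id:
            hits += 1
            precision_sum += hits / k
    # A query with no correct match in the gallery contributes 0 here.
    return precision_sum / hits if hits else 0.0


def evaluate(rankings: List[Sequence[int]],
             query_ids: Sequence[int]) -> Tuple[float, float]:
    """Return (mAP, Rank-1) over all queries, both as fractions in [0, 1]."""
    n = len(query_ids)
    aps = [average_precision(r, q) for r, q in zip(rankings, query_ids)]
    r1 = [rank1_hit(r, q) for r, q in zip(rankings, query_ids)]
    return sum(aps) / n, sum(r1) / n


# Toy data (made up for illustration): for each query, the gallery identity
# labels sorted from most to least similar.
rankings = [[7, 3, 7, 5], [2, 9, 9, 4]]
query_ids = [7, 9]
mean_ap, rank1 = evaluate(rankings, query_ids)
```

Rank-1 only checks the first retrieved image, while mAP rewards placing every correct match near the top of the list, which is why the two numbers are reported together.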
