Journal of Northeastern University (Natural Science), 2025, Vol. 46, Issue (6): 8-15. DOI: 10.12068/j.issn.1005-3026.2025.20230341

• Information and Control •

X-ray Image Prohibited Item Detection Algorithm Based on X-ray-RTDETR

Li-zhen LI, Shu-hua MA, Ze-xu GUO, Xiao-chen CHE

  1. School of Control Engineering, Northeastern University at Qinhuangdao, Qinhuangdao 066004, Hebei, China. Corresponding author: LI Li-zhen, E-mail: lilizhen559@163.com
  • Received: 2023-12-25 Online: 2025-06-15 Published: 2025-09-01
  • About the authors: LI Li-zhen (born 1996), male, from Liaocheng, Shandong, master's student at Northeastern University;
    MA Shu-hua (born 1967), female, from Qinhuangdao, Hebei, professor at Northeastern University at Qinhuangdao.
  • Funding:
    Natural Science Foundation of Hebei Province (F2021501021)

Abstract:

To address the low detection accuracy caused by inconsistent image sizes, high background noise, and large scale variation in X-ray images of prohibited items, RT-DETR-R18 is optimized and an X-ray image prohibited item detection algorithm named X-ray-RTDETR is proposed. First, the algorithm employs CSPRepResNet embedded with efficient multi-scale attention as the backbone network to enhance feature extraction. Second, a simplified fast spatial pyramid pooling module is introduced after the three feature maps output by the backbone network to improve the robustness and generalization ability of the model. Finally, the SPoolFormer encoder is applied to high-level feature maps with richer semantic concepts for intra-scale feature interaction. Experimental results show that X-ray-RTDETR achieves a detection accuracy of 74.6% on the PIDray test set, 8.5% higher than RT-DETR-R18, while reducing the number of parameters and the number of floating-point operations (nFLOP) by 1.67×10^6 and 2.24×10^9, respectively. Comparisons with state-of-the-art object detection algorithms of the same scale show that X-ray-RTDETR not only achieves higher detection accuracy but also has fewer parameters and a lower nFLOP, while its inference speed reaches 85.47 frames/s on an RTX2070 Max-Q GPU.

Key words: prohibited item detection, multi-scale attention, feature extraction, pyramid pooling, SPoolFormer encoder
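The abstract does not describe how the simplified fast spatial pyramid pooling module is built, so the following is only a minimal sketch of the general fast spatial pyramid pooling idea as it is commonly used in detection networks: one small stride-1 max pooling applied three times in sequence, with all intermediate maps kept for channel-wise concatenation. The function names `max_pool_same` and `sppf`, the kernel size of 5, and the toy feature map are assumptions made for illustration. They are not taken from the paper, and the convolution layers that normally surround the pooling are omitted.

```python
def max_pool_same(x, k=5):
    """Stride-1 max pooling over a 2-D list; the window is clipped at the borders."""
    h, w = len(x), len(x[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                x[a][b]
                for a in range(max(0, i - r), min(h, i + r + 1))
                for b in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def sppf(x, k=5):
    """Return the input and three successively pooled maps (hypothetical helper).

    In a real network these four maps would be concatenated along the channel axis.
    """
    p1 = max_pool_same(x, k)
    p2 = max_pool_same(p1, k)
    p3 = max_pool_same(p2, k)
    return [x, p1, p2, p3]


# Toy single-channel feature map (assumed data, for illustration only).
feature_map = [[(3 * i + 7 * j) % 11 for j in range(12)] for i in range(12)]
branches = sppf(feature_map)
```

Chaining the 5×5 pooling reproduces what parallel 9×9 and 13×13 poolings would compute, which is why the sequential form covers several receptive-field sizes at lower cost.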

CLC Number: