东北大学学报(自然科学版) ›› 2025, Vol. 46 ›› Issue (5): 29-36.DOI: 10.12068/j.issn.1005-3026.2025.20230297
Received: 2023-10-27
Online: 2025-05-15
Published: 2025-08-07
Corresponding author: De-fu CHE
About the author: Aisan XIERAILI (b. 1997), male, from Yuli, Xinjiang; master's student at Northeastern University.
Aisan XIERAILI1, De-fu CHE1, Duo WANG2, Tian YU3
Abstract:
Machine-vision-based environment perception is a key task in intelligent transportation. Conventional deep-learning algorithms typically handle only individual detection tasks in a single scene and cannot meet the perception demands of complex traffic environments. To improve a vehicle's perception capability in such environments, an improved YOLOv8 object-detection network is proposed that combines an attention mechanism, an optimized optimizer, and deformable convolution layers to detect multiple target classes in complex urban traffic scenes. Comparative detection experiments on sample images of complex traffic environments were conducted with YOLOv4, YOLOv8, and the improved YOLOv8. The results show that the improved YOLOv8 raises mean average precision by 40.76% and 16.92% relative to YOLOv4 and YOLOv8, respectively. The algorithm's detection accuracy and real-time performance satisfy practical application requirements, and, combined with multi-sensor information fusion, it enables intelligent perception in complex urban traffic environments.
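The abstract reports the gains over YOLOv4 and YOLOv8 as *relative* improvements, while the tables later report absolute mAP values; the two are consistent, as this quick check shows (mAP values copied from the paper, variable names my own):

```python
# mAP/% values for the three models, as reported in the paper.
map_v4, map_v8, map_ours = 44.19, 53.20, 62.20

# Relative improvement of the improved YOLOv8 over each baseline.
rel_vs_v4 = (map_ours - map_v4) / map_v4 * 100
rel_vs_v8 = (map_ours - map_v8) / map_v8 * 100

print(round(rel_vs_v4, 2))  # 40.76
print(round(rel_vs_v8, 2))  # 16.92
```

The absolute gains (18.01 and 9.00 percentage points) divided by each baseline's mAP reproduce the 40.76% and 16.92% figures quoted in the abstract.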
艾散·西尔艾力, 车德福, 王夺, 喻甜. 基于目标检测的复杂城市交通环境感知技术及应用[J]. 东北大学学报(自然科学版), 2025, 46(5): 29-36.
Aisan XIERAILI, De-fu CHE, Duo WANG, Tian YU. Perception Technology and Application of Complex Urban Traffic Environment Based on Target Detection[J]. Journal of Northeastern University(Natural Science), 2025, 46(5): 29-36.
Fig. 5 Schematic diagram of sampling positions of standard convolution and deformable convolution: (a) regular sampling grid of standard convolution; (b) deformable sampling grid with offsets; (c) deformable sampling grid under scale transformation; (d) deformable sampling grid under rotation transformation.
Table 1 Training parameter settings

Parameter | Value | Parameter | Value |
---|---|---|---|
epoch | 100 | imgsize | 640 |
batch | 16 | workers | 4 |
optimizer | Adam | patience | 50 |
save_period | 10 | close_mosaic | 10 |
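The settings in Table 1 map directly onto the training arguments of the Ultralytics YOLOv8 API. A minimal sketch, assuming the `ultralytics` package and a placeholder dataset file `traffic.yaml` (the dataset path and model weights name are not from the paper):

```python
# Training arguments mirroring Table 1 (Ultralytics keyword names).
train_args = dict(
    epochs=100,        # epoch
    imgsz=640,         # imgsize
    batch=16,
    workers=4,
    optimizer="Adam",
    patience=50,       # early-stopping patience
    save_period=10,    # checkpoint every 10 epochs
    close_mosaic=10,   # disable mosaic augmentation for the last 10 epochs
)

# Actual training run (commented out; requires a dataset):
# from ultralytics import YOLO
# model = YOLO("yolov8n.pt")
# model.train(data="traffic.yaml", **train_args)

print(train_args["optimizer"])  # Adam
```

Note that Adam here replaces the library's default SGD-based optimizer, matching the optimizer change described in the abstract.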
Table 2 Experimental results (test images, manually annotated images, and detection results of the proposed model; images not reproduced here)
Table 3 Performance comparison of different network models

Model | Backbone | Input size/pixels | mAP/% | FPS | IoU |
---|---|---|---|---|---|
YOLOv4 | CSPDarknet53 | 416×416 | 44.19 | 47 | 0.5 |
YOLOv8 | CSPDarknet+C2F | 640×640 | 53.20 | 87 | 0.5 |
Improved YOLOv8 | CSPDarknet+C2F | 640×640 | 62.20 | 78 | 0.5 |
Table 4 Comparison of experimental results (%)

Target type | mAP50: YOLOv4 | mAP50: YOLOv8 | mAP50: Improved YOLOv8 | Gain vs. YOLOv4 | Gain vs. YOLOv8 |
---|---|---|---|---|---|
Overall average | 44.19 | 53.20 | 62.20 | 18.01 | 9.00 |
Traffic sign | 55.59 | 68.40 | 68.30 | 12.71 | -0.10 |
Pavement damage | 57.60 | 73.60 | 81.10 | 23.50 | 7.50 |
Manhole cover | 50.95 | 47.50 | 51.10 | 0.15 | 3.60 |
Missing manhole cover | 96.62 | 86.50 | 97.40 | 0.78 | 10.90 |
Trash bin | 53.95 | 71.80 | 75.80 | 21.85 | 4.00 |
Overturned trash bin | 56.93 | 95.00 | 99.50 | 42.57 | 4.50 |
Rainwater grate | 53.26 | 58.80 | 67.40 | 14.14 | 8.60 |
Damaged rainwater grate | 0.00 | 33.20 | 99.50 | 99.50 | 66.30 |
Flame | 45.07 | 63.10 | 65.40 | 20.33 | 2.30 |
Illegal street vending | 44.19 | 47.00 | 54.70 | 10.51 | 7.70 |
[1] Gao De-zhi, Duan Jian-min, Zheng Bang-gui, et al. Application status of intelligent vehicle environmental sensing sensors[J]. Modern Electronic Technology, 2008(19): 151-156.
[2] Wang K, Gou C, Zheng N, et al. Parallel vision for perception and understanding of complex scenes: methods, framework, and perspectives[J]. Artificial Intelligence Review, 2017, 48(3): 299-329.
[3] Aufrère R, Gowdy J, Mertz C, et al. Perception for collision avoidance and autonomous driving[J]. Mechatronics, 2003, 13(10): 1149-1161.
[4] Montemerlo M, Becker J, Bhat S, et al. Junior: the Stanford entry in the urban challenge[J]. Journal of Field Robotics, 2008, 25(9): 569-597.
[5] Leonard J, How J, Teller S, et al. A perception-driven autonomous urban vehicle[J]. Journal of Field Robotics, 2008, 25(10): 727-774.
[6] Xie Zhi-ping, Lei Li-ping. Development and research status of intelligent networked automotive environment awareness technology[J]. Journal of Chengdu Technological University, 2016, 19(4): 87-92.
[7] Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, 2016: 779-788.
[8] Asmaa B, Khalid Z. Optimizing CNN-BiGRU performance: mish activation and comparative analysis[J]. International Journal of Computer Networks & Communications, 2024, 16(3): 69-87.
[9] Wang Y F, Hua C C, Ding W L, et al. Real-time detection of flame and smoke using an improved YOLOv4 network[J]. Signal, Image and Video Processing, 2022, 16(4): 1-8.
[10] Zhang Kai-xiang, Zhu Ming. Multi-task automatic driving environment perception algorithm based on YOLOv5[J]. Computer Systems & Applications, 2022, 31(9): 226-232.
[11] Fei X, Li T H, Xiao Y G, et al. Research on YOLOv3 model compression strategy for UAV deployment[J]. Cognitive Robotics, 2024, 4: 8-18.
[12] Guo K Y, Cheng B H, Min Y, et al. A pavement distresses identification method optimized for YOLOv5s[J]. Scientific Reports, 2022, 12(1): 1-15.
[13] Guo Zhen-yu, Gao Guo-fei. Research on detection algorithm of mixed traffic between people and vehicles at complex intersections based on YOLO v4[J]. Information Technology and Informatization, 2021(2): 236-240.
[14] Łysakowski M, Żywanowski K, Banaszczyk A, et al. Real-time onboard object detection for augmented reality: enhancing head-mounted display with YOLOv8[C]// IEEE International Conference on Edge Computing and Communications. Chicago, 2023: 364-371.
[15] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[J]. Advances in Neural Information Processing Systems, 2017, 30: 6000-6010.
[16] Li Hong, Zou Jun-ying, Tan Qian-cheng, et al. Multi-attention fusion network for medical image segmentation[J]. Journal of Computer Applications, 2022, 42(12): 3891-3899.
[17] Wang Z Y, Zhu H, Liu F. SMSTracker: a self-calibration multi-head self-attention transformer for visual object tracking[J]. Computers, Materials & Continua, 2024, 80(1): 605-623.
[18] Vasanthi P, Mohan L. A reliable anchor regenerative-based transformer model for x-small and dense objects recognition[J]. Neural Networks, 2023, 165: 809-829.
[19] Battiti R. First- and second-order methods for learning: between steepest descent and Newton's method[J]. Neural Computation, 1992, 4(2): 141-166.
[20] Sutskever I, Martens J, Dahl G, et al. On the importance of initialization and momentum in deep learning[C]// International Conference on Machine Learning (ICML). Atlanta, 2013: 1139-1147.
[21] Qian N. On the momentum term in gradient descent learning algorithms[J]. Neural Networks, 1999, 12(1): 145-151.
[22] Duchi J, Hazan E, Singer Y. Adaptive subgradient methods for online learning and stochastic optimization[J]. Journal of Machine Learning Research, 2011, 12: 2121-2159.
[23] Mohamed R, Amany M S. A modified Adam algorithm for deep neural network optimization[J]. Neural Computing and Applications, 2023, 35(23): 17095-17112.
[24] Sarker I H. Deep learning: a comprehensive overview on techniques, taxonomy, applications and research directions[J]. SN Computer Science, 2021, 2(6): 420.
[25] Dai J F, Qi H Z, Xiong Y W, et al. Deformable convolutional networks[C]// Proceedings of the IEEE International Conference on Computer Vision (ICCV). Venice, 2017: 764-773.
[26] Ouyang Ji-hong, Wang Zi-ming, Liu Si-guang. Improved multi-scale feature method for YOLO_v4 target detection[J]. Journal of Jilin University (Science Edition), 2022, 60(6): 1349-1355.
[27] Wei Dong-fei, Xiong Feng, Kong Wei-chang. Improved lightweight target detection method of YOLOv4[J]. Metrology & Measurement Technique, 2022, 49(11): 18-22.