Improved Yolov5s algorithm for pedestrian detection at station entrances

Journal of Nanjing University (Natural Sciences), 2024, Vol. 60, Issue (1): 87–96. doi: 10.13232/j.cnki.jnju.2024.01.009

Linhong Li¹,², Jie Yang¹,², Zhicheng Feng¹,², Hao Zhu¹

1. School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou, 341000, China
2. Jiangxi Provincial Key Laboratory of Maglev Technology, Jiangxi University of Science and Technology, Ganzhou, 341000, China

Received: 2023-10-27 Online: 2024-01-30 Published: 2024-01-29
Contact: Jie Yang E-mail: yangjie@jxust.edu.cn
Funding: National Natural Science Foundation of China (62063009)

Abstract

Existing pedestrian detection methods for station entrances struggle to balance real-time performance and accuracy. To address this, an improved Yolov5s model is proposed for efficient pedestrian detection at station entrances. First, a lightweight backbone network, EfficientNet_c, is derived from EfficientNetV1; its network structure and the stacking counts of its basic units are optimized to improve both the capability and the speed of feature extraction for small targets in the shallow layers. Second, the width factor is set to 1/2 of that of the base model, which changes the number of channels in the feature layers and reduces the parameter count at a small cost in precision. Third, a small-object detection layer is added to strengthen feature extraction and improve the sensitivity and accuracy of the model on small targets. Finally, transfer learning is used to optimize the model, which enhances generalization, reduces the learning cost, and further improves accuracy. Experiments on a dataset collected by the research group show that the proposed algorithm reaches an accuracy of 92.2% with only 1.4 M parameters. Its average inference time on a Tesla P100 GPU is 7.7 ms, so both model accuracy and inference speed improve. The results provide a feasible solution for pedestrian detection and passenger flow statistics at subway and railway station entrances.

Key words: pedestrian detection at station entrances, Yolov5s, EfficientNet_c, width factor, small object detection layer, transfer learning
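
The effect of an extra small-object detection layer can be illustrated with a minimal sketch. The abstract does not state the input resolution or the stride of the added layer, so the 640-pixel input, the stock Yolov5 strides (8, 16, 32), and the stride-4 head used here are all assumptions, and `grid_sizes` and `cells_spanned` are hypothetical helper names.

```python
# Sketch only: the 640-pixel input and all stride values are assumptions
# taken from the stock Yolov5 configuration, not figures from the paper.

def grid_sizes(input_size, strides):
    """Return {stride: cells_per_side} for a square input image."""
    return {s: input_size // s for s in strides}


def cells_spanned(object_px, stride):
    """Number of grid cells an object of `object_px` pixels spans at a stride."""
    return object_px / stride


INPUT_SIZE = 640                     # assumed input resolution
BASE_STRIDES = (8, 16, 32)           # stock Yolov5 detection heads
EXTENDED_STRIDES = (4, 8, 16, 32)    # assumed: one extra stride-4 head

base = grid_sizes(INPUT_SIZE, BASE_STRIDES)
extended = grid_sizes(INPUT_SIZE, EXTENDED_STRIDES)

for stride, cells in extended.items():
    tag = "stock" if stride in base else "added"
    print(f"stride {stride:>2}: {cells}x{cells} grid ({tag})")

# A 16-pixel-wide target spans 2 cells at stride 8 but 4 cells at stride 4,
# which is why a finer head tends to help with small, distant pedestrians.
print(cells_spanned(16, 8), cells_spanned(16, 4))
```

Under these assumptions the added head predicts on a 160x160 grid, four times as many cells as the finest stock head, which also explains why such a layer adds parameters and computation.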

CLC number: TP391.41

Linhong Li, Jie Yang, Zhicheng Feng, Hao Zhu. Improved Yolov5s algorithm for pedestrian detection at station entrances[J]. Journal of Nanjing University (Natural Sciences), 2024, 60(1): 87–96.

Table 1 Performance of Yolov5s on the CrowdHuman dataset under different width factors

Model	$A_p$	F (GFLOPs)	P (M)
Yolov5s (0.25)	74.9%	4.2	1.7
Yolov5s (0.5)	77.9%	15.9	7.0
Yolov5s (0.75)	79.6%	35.2	15.8
Yolov5s (1.0)	81.3%	61.9	28.0
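
The parameter counts in Table 1 can be checked against a simple rule of thumb. A convolution layer has roughly (input channels × output channels × kernel area) weights, so scaling every channel count by a width factor w should scale parameters and FLOPs by about w². The sketch below applies that estimate to the table values; the quadratic rule is an approximation introduced here, not a claim made in the paper, and `quadratic_estimate` is a hypothetical helper.

```python
# Table 1 values: width factor -> (GFLOPs, parameters in millions).
TABLE_1 = {
    0.25: (4.2, 1.7),
    0.5: (15.9, 7.0),
    0.75: (35.2, 15.8),
    1.0: (61.9, 28.0),
}


def quadratic_estimate(reference_value, reference_width, width):
    """Rescale a cost measured at `reference_width`, assuming cost ~ width**2."""
    return reference_value * (width / reference_width) ** 2


ref_flops, ref_params = TABLE_1[1.0]
estimates = {}
for width, (flops, params) in sorted(TABLE_1.items()):
    est_flops = quadratic_estimate(ref_flops, 1.0, width)
    est_params = quadratic_estimate(ref_params, 1.0, width)
    estimates[width] = (est_flops, est_params)
    print(
        f"w={width:<4} params {params:5.1f} M (est. {est_params:5.2f})  "
        f"GFLOPs {flops:5.1f} (est. {est_flops:5.2f})"
    )
```

Starting from the 28.0 M parameters at width 1.0, the estimate gives 15.75 M, 7.0 M, and 1.75 M for widths 0.75, 0.5, and 0.25, against 15.8 M, 7.0 M, and 1.7 M in the table. The FLOPs figures follow the same trend less tightly, which is expected because layers with fixed input or output channels do not scale quadratically.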

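Table 2 Ablation experiment results (√: improvement enabled; -: not used)

Model	EfficientNet_c	Width factor	Small object detection layer	Transfer learning	$A_p$	P (M)
Yolov5s	-	-	-	-	91.3%	7.0
	√	-	-	-	91.4%	4.3
	-	√	-	-	90.6%	1.8
	-	-	√	-	92.0%	7.7
	-	-	-	√	91.7%	7.0
Proposed algorithm	√	√	√	√	92.2%	1.4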
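
The contribution of each improvement is easier to read as a change relative to the Yolov5s baseline. The following sketch recomputes those changes from the Table 2 values; the row labels and the `deltas` helper are illustrative names, not the authors' terminology.

```python
# Rows of Table 2: (label, A_p in %, parameters in millions).
ABLATION = [
    ("Yolov5s baseline", 91.3, 7.0),
    ("+ EfficientNet_c", 91.4, 4.3),
    ("+ width factor", 90.6, 1.8),
    ("+ small object layer", 92.0, 7.7),
    ("+ transfer learning", 91.7, 7.0),
    ("proposed (all four)", 92.2, 1.4),
]


def deltas(rows):
    """Return (label, A_p change in points, parameter change in %) vs. row 0."""
    _, base_ap, base_params = rows[0]
    result = []
    for label, ap, params in rows[1:]:
        ap_delta = round(ap - base_ap, 1)
        param_delta = round(100.0 * (params - base_params) / base_params, 1)
        result.append((label, ap_delta, param_delta))
    return result


for label, ap_delta, param_delta in deltas(ABLATION):
    print(f"{label:<22} A_p {ap_delta:+.1f} pt   params {param_delta:+.1f} %")
```

Read this way, the width factor alone removes about 74% of the parameters at a cost of 0.7 points, the small object detection layer alone adds 0.7 points for 10% more parameters, and the combined model gains 0.9 points while using 80% fewer parameters than the baseline.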

Table 3 Comparison of training results of each model on the dataset used in this study

Detection algorithm	$A_p$	P (M)	T_avg (ms)	O (MB)
Faster RCNN	77.9%	54.8	69.4	104.6
Yolov3	91.0%	61.5	17.5	117.2
Yolov5s	91.3%	7.0	8.0	13.7
Yolov5m	91.6%	20.9	11.5	40.2
Yolov7_tiny	89.6%	6.0	7.0	11.7
Ref. [2]	75.4%	-	-	-
Proposed algorithm	92.2%	1.4	7.7	3.8
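
The accuracy and latency columns of Table 3 can be combined into a quick trade-off check. The sketch below converts average inference time into an implied frame rate (assuming one image per inference call, which the table does not state) and lists the models that no other model beats on both accuracy and latency. The Ref. [2] row is omitted because it reports no timing, and `fps` and `dominated` are hypothetical helpers.

```python
# Rows of Table 3 with complete measurements:
# (name, A_p in %, parameters in M, average inference time in ms, size in MB).
MODELS = [
    ("Faster RCNN", 77.9, 54.8, 69.4, 104.6),
    ("Yolov3", 91.0, 61.5, 17.5, 117.2),
    ("Yolov5s", 91.3, 7.0, 8.0, 13.7),
    ("Yolov5m", 91.6, 20.9, 11.5, 40.2),
    ("Yolov7_tiny", 89.6, 6.0, 7.0, 11.7),
    ("Proposed", 92.2, 1.4, 7.7, 3.8),
]


def fps(latency_ms):
    """Frames per second implied by a per-image latency in milliseconds."""
    return 1000.0 / latency_ms


def dominated(candidate, models):
    """True if another model is no worse on A_p and latency and better on one."""
    _, ap, _, latency, _ = candidate
    for other in models:
        if other is candidate:
            continue
        _, o_ap, _, o_latency, _ = other
        no_worse = o_ap >= ap and o_latency <= latency
        strictly_better = o_ap > ap or o_latency < latency
        if no_worse and strictly_better:
            return True
    return False


for name, ap, _, latency, _ in MODELS:
    print(f"{name:<12} A_p {ap:4.1f}%  {latency:5.1f} ms  ~{fps(latency):6.1f} FPS")

pareto = [m[0] for m in MODELS if not dominated(m, MODELS)]
print("Not dominated on (A_p, latency):", pareto)
```

On these numbers only Yolov7_tiny and the proposed algorithm remain: Yolov7_tiny is 0.7 ms faster, while the proposed algorithm is 2.6 points more accurate and is also the smallest model in the table.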

图8

图9

References

1	China Association of Metros. Urban rail transit 2022 annual statistics and analysis report. https://www.camet.org.cn/tjxx/11944, 2023-03-31.
2	Kang Z, Yang J, Li G L, et al. Pedestrian detection method for station based on improved YOLOv3. Journal of Railway Science and Engineering, 2021, 18(1): 55-63.
3	Papageorgiou C, Poggio T. A trainable system for object detection. International Journal of Computer Vision, 2000, 38(1): 15-33.
4	Dalal N, Triggs B. Histograms of oriented gradients for human detection∥2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Diego, CA, USA: IEEE, 2005: 886-893.
5	Ojala T, Pietikainen M, Maenpaa T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(7): 971-987.
6	Lowe D G. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 2004, 60(2): 91-110.
7	Girshick R, Donahue J, Darrell T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation∥2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA: IEEE, 2014: 580-587.
8	Girshick R. Fast R-CNN∥2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE, 2015: 1440-1448.
9	Ren S Q, He K M, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
10	Liu W, Anguelov D, Erhan D, et al. SSD: Single shot MultiBox detector∥Leibe B, Matas J, Sebe N, et al. 14th European Conference on Computer Vision. Springer, Berlin Heidelberg, 2016: 21-37.
11	Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection∥2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 779-788.
12	Redmon J, Farhadi A. YOLO9000: Better, faster, stronger∥2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 6517-6525.
13	Redmon J, Farhadi A. YOLOv3: An incremental improvement. 2018, arXiv:.
14	Bochkovskiy A, Wang C Y, Liao H Y M. YOLOv4: Optimal speed and accuracy of object detection. 2020, arXiv:.
15	Lin W J, Shao J Y, Zhang N. HDNet: A lightweight anchor-free pedestrian head detection algorithm. Journal of Southeast University (Natural Science Edition), 2022, 52(6): 1152-1160.
16	Xu M, Wang Z, Liu X M, et al. An efficient pedestrian detection for realtime surveillance systems based on modified YOLOv3. Journal of Radio Frequency Identification, 2022, 6: 972-976.
17	Gao F, Cai C X, Jia R H, et al. Improved YOLOX for pedestrian detection in crowded scenes. Journal of Real-Time Image Processing, 2023, 20(2): 24.
18	Li X, He M, Luo H B. Occluded pedestrian detection algorithm based on improved YOLOv3. Acta Optica Sinica, 2022, 42(14): 1415003.
19	Zhang Y H, Zhang P P, He Z F, et al. Lightweight real-time detection model of infrared pedestrian embedded in fine-scale. Acta Photonica Sinica, 2022, 51(9): 0910001.
20	Liu S, Qi L, Qin H F, et al. Path aggregation network for instance segmentation∥2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE, 2018: 8759-8768.
21	Lin T Y, Dollár P, Girshick R, et al. Feature pyramid networks for object detection∥2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 936-944.
22	Tan M X, Le Q. EfficientNet: Rethinking model scaling for convolutional neural networks∥Proceedings of the 36th International Conference on Machine Learning. Long Beach, CA, USA: PMLR Press, 2019: 6105-6114.
23	Tan M X, Le Q V. EfficientNetV2: Smaller models and faster training∥Proceedings of the 38th International Conference on Machine Learning. Virtual Event: PMLR Press, 2021: 10096-10106.
