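1. School of Artificial Intelligence, Anhui University, Hefei, Anhui 230601
2. School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083
3. Department of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 46204, USA
4. State Key Laboratory of Opto-Electronic Information Acquisition and Protection Technology, Hefei, Anhui 230031
5. Anhui University, Hefei, Anhui 230601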
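YIN Shouguo (2002- ), male, is a master's student at the School of Artificial Intelligence, Anhui University. His main research interests include intention inference, trajectory prediction, and intelligent transportation systems.
DU Quancheng (1994- ), male, is a Ph.D. student at the School of Computer and Communication Engineering, University of Science and Technology Beijing. His main research interests include trajectory prediction and vehicle planning and decision-making.
LI Lingxi (1977- ), male, Ph.D., is a professor in the Department of Electrical and Computer Engineering, Purdue University, USA. His main research interests include the modeling, analysis, control, and optimization of complex systems, intelligent transportation systems, and discrete event dynamic systems.
WANG Xiao (1988- ), female, is a professor at the School of Artificial Intelligence, Anhui University. Her main research interests include social computing, group behavior modeling, and unmanned autonomous systems and their parallel testing.
SUN Changyin (1975- ), male, is the president of Anhui University. His main research interests include intelligent control and optimization, reinforcement learning, neural networks, and data-driven control.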
Received: 2025-09-27
Revised: 2025-12-08
Accepted: 2025-12-15
Published online: 2026-01-07
YIN Shouguo, DU Quancheng, LI Lingxi, et al. Pedestrian crossing intention inference method based on multi-feature fusion[J]. Chinese Journal of Intelligent Science and Technology, DOI: 10.11959/j.issn.2096-6652.2025.
Accurately understanding and predicting pedestrians' crossing intentions is crucial to the driving safety of autonomous vehicles. Existing approaches rely mostly on visual motion cues such as pedestrian trajectories or whole-body poses and pay little attention to the interaction signals exchanged between pedestrians and vehicles, so they struggle to capture the crossing intent that pedestrians convey through subtle signals such as gestures and head orientation. To address these limitations, ARPCI (accurate reasoning for pedestrian crossing intent), a multi-feature fusion framework for pedestrian intent inference, is proposed. Specifically, a pedestrian feature module is designed that first focuses on skeleton features to capture the pedestrian's motion trend, then uses MobileNet to extract head pose features and YOLOv8 to recognize hand gestures, thereby obtaining pedestrian-vehicle interaction signals. In addition, a scene encoding module and an ego-vehicle feature module are introduced to fuse environmental context with vehicle dynamics, which improves the model's adaptability to complex traffic scenes and its prediction accuracy. Experiments on the widely used JAAD dataset show that the approach achieves an accuracy of 88%, surpassing several state-of-the-art (SOTA) counterparts, and ablation studies further verify the effectiveness of each input feature.
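The multi-feature fusion idea described in the abstract can be illustrated with a minimal sketch. It is not the ARPCI implementation: the abstract does not state the fusion operator or the classifier head, so plain concatenation followed by a logistic layer is assumed here as a stand-in. The stream names, feature dimensions, weights, and the functions `fuse` and `crossing_probability` are all hypothetical, and the feature vectors are hand-written lists standing in for whatever the skeleton, MobileNet, YOLOv8, scene, and ego-vehicle branches would output.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical stream names that mirror the abstract; order fixes the layout
# of the fused vector. Dimensions in the example below are illustrative only.
STREAMS = ("skeleton", "head_pose", "hand_gesture", "scene", "ego_vehicle")


def fuse(features: Dict[str, Sequence[float]]) -> List[float]:
    """Concatenate per-stream feature vectors in a fixed order (assumed fusion)."""
    missing = [name for name in STREAMS if name not in features]
    if missing:
        raise KeyError(f"missing feature streams: {missing}")
    fused: List[float] = []
    for name in STREAMS:
        fused.extend(float(v) for v in features[name])
    return fused


def crossing_probability(
    fused: Sequence[float], weights: Sequence[float], bias: float
) -> float:
    """Logistic head: sigmoid(w . x + b), read as P(pedestrian will cross)."""
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused feature length")
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Placeholder features for one pedestrian at one time step.
example = {
    "skeleton": [0.2, -0.1, 0.4],   # e.g. pooled joint-motion descriptors
    "head_pose": [0.9, 0.1],        # e.g. yaw/pitch-like values
    "hand_gesture": [1.0],          # e.g. gesture-detected flag
    "scene": [0.3, 0.3],            # e.g. encoded context
    "ego_vehicle": [0.5],           # e.g. normalized ego speed
}
fused_vector = fuse(example)
# Untrained (all-zero) weights, so the head has no preference either way.
probability = crossing_probability(fused_vector, [0.0] * len(fused_vector), 0.0)
```

With untrained zero weights the head returns 0.5 for any input; in a trained model the learned weights would determine how much each stream contributes, which is the kind of per-feature effect an ablation study measures.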
