School of Computer Science and Technology, Beijing Jiaotong University, Beijing 100044, China
LIU Zeyu (2000- ), male, is a master's student at the School of Computer Science and Technology, Beijing Jiaotong University. His main research interest is multimodal fusion perception.
ZHANG Hui (1993- ), female, is an associate professor at the School of Computer Science and Technology, Beijing Jiaotong University. Her main research interests include vehicle perception in complex environments, cooperative detection via multi-sensor fusion, multimodal object detection, swarm intelligence, and parallel vision.
LI Yidong (1982- ), male, is a professor, dean, and doctoral supervisor at the School of Computer Science and Technology, Beijing Jiaotong University, and director of the Key Laboratory of Big Data & Artificial Intelligence in Transportation, Ministry of Education. His main research interests include big data intelligence, data privacy protection, advanced computing, and intelligent transportation.
Received: 2024-10-16
Revised: 2025-01-10
Published in print: 2025-03-15
LIU Zeyu, ZHANG Hui, LI Yidong. A review of integrated perception and decision-making for autonomous driving[J]. Chinese Journal of Intelligent Science and Technology, 2025, 7(1): 4-20. DOI: 10.11959/j.issn.2096-6652.202502.
In recent years, end-to-end solutions that integrate perception and decision-making have made groundbreaking progress, offering new approaches to improving the safety and reliability of autonomous driving. Existing surveys mostly focus on a single vehicle's perception of its external environment, overlooking the complex interactions that arise under multi-agent collaboration. Based on the current state of research, this review examines integrated perception and decision-making methods from an application perspective, covering both single-agent methods and methods under multi-agent collaborative conditions. First, the foundational technologies of integrated perception and decision-making for autonomous driving are summarized. Second, the latest research advances in these technologies are highlighted. Third, the large-scale public datasets commonly used in the field of autonomous driving are described. Next, the evaluation methods currently employed for integrated perception and decision-making are introduced. Finally, a summary and outlook on integrated perception and decision-making for autonomous driving are presented.