Advantages: strong interpretability; each module has a clearly defined function and can be debugged independently. Limitations: (1) each module is optimized toward a different objective (for example, the perception module may optimize mean average precision (mAP), while the planning module focuses on driving safety and comfort), so the optimization objectives of the system as a whole are hard to align; (2) because the modules run sequentially, errors accumulate across module boundaries and information is progressively lost, which not only increases the computational burden but can also lead to suboptimal use of computational resources.
Advantages: (1) a simpler system structure: training the whole model jointly gives a seamless path from perception to control output and improves overall performance; (2) higher computational efficiency: a shared backbone reduces redundant computation, and the data-driven optimization means that performance can improve markedly as training resources grow. End-to-end learning also reduces error accumulation between modules, makes better use of computational resources, and improves the system's robustness and generalization.
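To make the shared-backbone point concrete, here is a minimal PyTorch sketch (not any particular published model; the layer sizes, the single-camera input, and the two heads are illustrative assumptions): one backbone is computed once and feeds both an auxiliary perception head and a waypoint-planning head, so the whole stack can be trained jointly with a single combined loss.

```python
import torch
import torch.nn as nn

class TinyEndToEndDriver(nn.Module):
    """Toy end-to-end driver: one shared backbone, two task heads (all sizes hypothetical)."""

    def __init__(self, num_waypoints: int = 4, num_classes: int = 10):
        super().__init__()
        self.num_waypoints = num_waypoints
        # Shared backbone: features are computed once and reused by every head,
        # which is where the "reduced redundant computation" claim comes from.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Auxiliary perception head (coarse scene classification as a stand-in for mAP-style tasks).
        self.percep_head = nn.Linear(64, num_classes)
        # Planning head: predicts future (x, y) waypoints directly from the shared features.
        self.plan_head = nn.Linear(64, num_waypoints * 2)

    def forward(self, image: torch.Tensor):
        feat = self.backbone(image)
        return self.percep_head(feat), self.plan_head(feat).view(-1, self.num_waypoints, 2)

# Joint training: a single combined loss lets the planning objective shape the shared
# features, instead of each module optimizing its own disconnected metric.
model = TinyEndToEndDriver()
image = torch.randn(2, 3, 128, 128)
percep_logits, waypoints = model(image)
percep_target = torch.randint(0, 10, (2,))
waypoint_target = torch.zeros(2, 4, 2)
loss = nn.functional.cross_entropy(percep_logits, percep_target) \
       + nn.functional.mse_loss(waypoints, waypoint_target)
loss.backward()
```

The auxiliary perception head is optional, but keeping one is a common way to retain some interpretability while still letting planning gradients flow into the shared features.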
[3]: D. A. Pomerleau, "Alvinn: An autonomous land vehicle in a neural network," in NeurIPS, 1988.
[5]: S. Casas, A. Sadat, and R. Urtasun, "MP3: A unified model to map, perceive, predict and plan," in CVPR, 2021.
[6]: A. Prakash, K. Chitta, and A. Geiger, "Multi-modal fusion transformer for end-to-end autonomous driving," in CVPR, 2021.
[8]: M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, and J. Zhang, "End to end learning for self-driving cars," arXiv preprint arXiv:1604.07316, 2016.
[10]: M. Müller, “End-to-end imitation learning with conditional adversarial networks,” arXiv preprint arXiv:1805.01987, 2018.
[16]: A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez, and V. Koltun, "CARLA: An open urban driving simulator," in CoRL, 2017.
[21]: A. Sadat, S. Casas, M. Ren, X. Wu, P. Dhawan, and R. Urtasun, "Perceive, predict, and plan: Safe motion planning through interpretable semantic representations," in ECCV, 2020.
[25]: F. Codevilla, E. Santana, A. M. Lopez, and A. Gaidon, "Exploring the limitations of behavior cloning for autonomous driving," in ICCV, 2019.
[28]: A. Prakash, K. Chitta, and A. Geiger, "Multi-modal fusion transformer for end-to-end autonomous driving," in CVPR, 2021.
[39]: A. Sadat, S. Casas, M. Ren, X. Wu, P. Dhawan, and R. Urtasun, "Perceive, predict, and plan: Safe motion planning through interpretable semantic representations," in ECCV, 2020.
[52]: W. Zeng, W. Luo, S. Suo, A. Sadat, B. Yang, S. Casas, and R. Urtasun, "End-to-end interpretable neural motion planner," in CVPR, 2019.
[56]: L. I. Kunze, F. Landgraf, T. Ruhkopf, D. Gill, and K. Dietmayer, “Meta-learning with non-iid data for class-incremental continual learning,” in CVPR Workshops, 2021.
[58]: J. Wen, Y. Li, T. Luo, H. Wang, and W. Li, "DRIFT: A framework for improving the generalization of object detection models under distribution shift," in NeurIPS, 2020.
This is the approach adopted by most end-to-end autonomous driving models that have actually been deployed on vehicles; 李小毛 adds a few more examples here:
Winner of the CVPR Autonomous Driving Challenge! Hydra-MDP: End-to-End Multimodal Planning with Multi-Target Hydra Distillation
IROS 2024 | ParkingE2E: An End-to-End Automated Parking Model
Imitation learning - Inverse Optimal Control (IOC): a reward function that explains the expert's behavior is inferred from expert demonstration data, and the agent's policy is then optimized with respect to that reward function. In continuous, high-dimensional autonomous driving scenarios, however, the reward is only implicitly defined and is difficult to optimize.
In practice, inverse optimal control methods use the expert policy (i.e., its demonstrated trajectories) to learn a cost function or cost volume, and planning then amounts to selecting the candidate trajectory with the lowest learned cost.
Representative works: trajectory costs evaluated over trajectories sampled from a fixed expert trajectory set [1]; cost volumes learned from a bird's-eye-view (BEV) representation [32]; cost volumes evaluated with trajectories sampled from kinematic models [32, 39, 70]; joint cost volumes computed from the future motion of other agents [69]; cost volumes computed from probabilistic semantic occupancy or freespace layers [39, 70, 71]; Generative Adversarial Imitation Learning (GAIL) [65, 66, 67].
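As a rough illustration of the learned-cost-volume idea behind several of the works above (a simplified sketch, not any of the original implementations; the grid resolution, the random trajectory sampler, and the hinge loss are assumptions made here for brevity): a small network predicts a per-cell BEV cost map, each candidate trajectory is scored by summing the cost of the cells it traverses, and training pushes the expert trajectory to be cheaper than the sampled alternatives.

```python
import torch
import torch.nn as nn

class CostVolumePlanner(nn.Module):
    """Toy IOC-style planner: predict a BEV cost map, pick the cheapest candidate trajectory."""

    def __init__(self, bev_channels: int = 8):
        super().__init__()
        self.cost_net = nn.Sequential(
            nn.Conv2d(bev_channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=1),  # one scalar cost per BEV cell
        )

    def trajectory_cost(self, cost_map: torch.Tensor, trajs: torch.Tensor) -> torch.Tensor:
        # cost_map: (1, 1, H, W); trajs: (N, T, 2) integer (row, col) cell indices.
        per_step = cost_map[0, 0, trajs[..., 0], trajs[..., 1]]  # (N, T)
        return per_step.sum(dim=1)                               # total cost per trajectory

    def forward(self, bev: torch.Tensor, candidates: torch.Tensor):
        cost_map = self.cost_net(bev)
        costs = self.trajectory_cost(cost_map, candidates)
        return costs, candidates[costs.argmin()]                 # minimum-cost plan

# Simplified max-margin training signal: the expert demonstration should be cheaper
# than every sampled candidate by some margin (here 1.0), so the cost map is shaped
# by imitation rather than by a hand-written reward.
planner = CostVolumePlanner()
bev = torch.randn(1, 8, 64, 64)                 # fake BEV feature map
candidates = torch.randint(0, 64, (32, 10, 2))  # 32 sampled trajectories, 10 steps each
expert = torch.randint(0, 64, (1, 10, 2))       # expert trajectory (toy placeholder)
cost_map = planner.cost_net(bev)
loss = torch.relu(planner.trajectory_cost(cost_map, expert)
                  - planner.trajectory_cost(cost_map, candidates) + 1.0).mean()
loss.backward()
```

The appeal of this formulation is that the reward is never hand-specified: it is shaped entirely by the requirement that expert behaviour score well under the learned cost.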
[1]: S. Casas, A. Sadat, and R. Urtasun, "MP3: A unified model to map, perceive, predict and plan," in CVPR, 2021.
[32]: W. Zeng, W. Luo, S. Suo, A. Sadat, B. Yang, S. Casas, and R. Urtasun, "End-to-end interpretable neural motion planner," in CVPR, 2019.
[39]: A. Sadat, S. Casas, M. Ren, X. Wu, P. Dhawan, and R. Urtasun, "Perceive, predict, and plan: Safe motion planning through interpretable semantic representations," in ECCV, 2020.
[67]: G. Lee, D. Kim, W. Oh, K. Lee, and S. Oh, "Mixgail: Autonomous driving using demonstrations with mixed qualities," in IROS, 2020.
[69]: H. Wang, P. Cai, R. Fan, Y. Sun, and M. Liu, "End-to-end interactive prediction and planning with optical flow distillation for autonomous driving," in CVPR Workshops, 2021.
[70]: P. Hu, A. Huang, J. Dolan, D. Held, and D. Ramanan, "Safe local motion planning with self-supervised freespace forecasting," in CVPR, 2021.
[71]: T. Khurana, P. Hu, A. Dave, J. Ziglar, D. Held, and D. Ramanan, "Differentiable raycasting for self-supervised occupancy forecasting," in ECCV, 2022.
(2) Reinforcement Learning (RL): a family of methods that learn a policy by trial and error, executing sequences of actions in an environment to discover the best strategy. RL is particularly well suited to complex problems where a clear objective cannot be specified directly; in autonomous driving, its applications have so far been concentrated mainly in simulation environments.
In reinforcement learning methods, the system learns through repeated interaction with the environment, and the policy is updated to maximize the cumulative reward collected over those interactions.
Representative works: lane keeping on empty streets [4]; combining RL with supervised learning (SL) [18, 19]; fine-tuning networks pre-trained with imitation learning (IL) [17, 79]; training an RL agent on privileged BEV semantic maps and then using that policy to automatically collect a dataset for training a downstream IL agent [21]; using a Q-function and tabular dynamic programming to generate additional or improved labels for a static dataset [20]; giving the RL network access to simulator information [80, 81].
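For intuition about the trial-and-error loop described above, here is a minimal REINFORCE sketch on a toy one-dimensional lane-keeping task (the environment, reward, and network are hypothetical stand-ins invented for this example; real work in this area typically trains in full simulators such as CARLA [16]): the policy samples steering actions, receives higher reward for staying near the lane centre, and is updated to make high-return actions more likely.

```python
import torch
import torch.nn as nn

# Toy environment: state = lateral offset from the lane centre; two actions steer left/right.
def step(offset: torch.Tensor, action: int):
    offset = offset + (0.1 if action == 1 else -0.1) + 0.02 * torch.randn(1)
    reward = 1.0 - offset.abs().item()      # staying near the centre gives higher reward
    done = offset.abs().item() > 1.0        # leaving the lane ends the episode
    return offset, reward, done

policy = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 2))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

for _ in range(200):                        # episodes of trial and error
    offset, log_probs, rewards = torch.zeros(1), [], []
    for _ in range(50):
        probs = torch.softmax(policy(offset), dim=-1)
        dist = torch.distributions.Categorical(probs)
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        offset, reward, done = step(offset.detach(), action.item())
        rewards.append(reward)
        if done:
            break
    # REINFORCE update: make actions that were followed by high return more likely.
    returns = torch.tensor([sum(rewards[t:]) for t in range(len(rewards))])
    loss = -(torch.stack(log_probs) * returns).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Even in this toy setting the key RL ingredients from the works above are visible: no expert labels are needed, but the agent must be free to explore and fail, which is a major reason the cited methods rely on simulation.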
[4]: A. Kendall, J. Hawke, D. Janz, et al., “Learning to drive in a day,” in ICRA, 2019.
[17]: X. Liang, T. Wang, L. Yang, and E. Xing, “CIRL: Controllable imitative reinforcement learning for vision-based self-driving,” in ECCV, 2018.
[18]: M. Toromanoff, E. Wirbel, and F. Moutarde, “End-to-end model-free reinforcement learning for urban driving using implicit affordances,” in CVPR, 2020.
[19]: R. Chekroun, M. Toromanoff, S. Hornauer, and F. Moutarde, “GRI: General reinforced imitation and its application to vision-based autonomous driving,” Robotics, 2023.
[20]: D. Chen, V. Koltun, and P. Krähenbühl, “Learning to drive from a world on rails,” in ICCV, 2021.
[21]: Z. Zhang, A. Liniger, D. Dai, F. Yu, and L. Van Gool, “End-to-end urban driving by imitating a reinforcement learning coach,” in ICCV, 2021.
[79]: E. Ohn-Bar, A. Prakash, A. Behl, K. Chitta, and A. Geiger, “Learning situational driving,” in CVPR, 2020.
[80]: W. B. Knox, A. Allievi, H. Banzhaf, F. Schmitt, and P. Stone, “Reward (mis)design for autonomous driving,” Artificial Intelligence, 2023.
[81]: C. Zhang, R. Guo, W. Zeng, et al., “Rethinking closed-loop training for autonomous driving,” in ECCV, 2022.