JRM Vol.37 No.5, pp. 1024-1033 (2025)
doi: 10.20965/jrm.2025.p1024

Paper:

Application and Experimental Verification of a Delay-Aware Deep Reinforcement Learning Method in End-to-End Autonomous Driving Control

Kazuya Emura and Ryuzo Hayashi

Tokyo University of Science
6-3-1 Niijuku, Katsushika-ku, Tokyo 125-8585, Japan

Received: April 7, 2025
Accepted: June 9, 2025
Published: October 20, 2025
Keywords: autonomous driving, reinforcement learning, end-to-end
Abstract

This study investigates the effect of sensor-to-actuation delay in end-to-end autonomous driving using deep reinforcement learning (DRL). Although DRL-based methods have demonstrated success in tasks such as lane keeping and obstacle avoidance, numerous challenges remain in real-world applications. A key issue is that real-world latency can violate the assumptions of the Markov decision process (MDP), resulting in degraded performance. To address this problem, a method is introduced wherein past actions are appended to the current state, thereby preserving the MDP property even under delayed control signals. The efficacy of this approach was evaluated by comparing three scenarios in simulation: no delay, delay without compensation, and delay compensation by including past actions. The results revealed that the scenario without delay compensation failed to learn effectively. Subsequently, the trained policy was deployed on a 1/10 scale experimental vehicle, demonstrating that explicitly modeling delay significantly enhances both stability and reliability in simulation and in physical trials. Moreover, when a longer delay was imposed, the learning process became slower and the action-value estimation was less stable, yet the simulated vehicle still performed successfully. Although experimental vehicle tests under extended delays exhibited some instability, it was confirmed that the approach accounted for such delays to a certain extent, thereby compensating effectively for latency in real-world environments.
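To make the delay-compensation idea from the abstract concrete, below is a minimal Python sketch (not the authors' implementation) of augmenting the observation with the most recently issued, not-yet-applied actions so that the augmented state remains Markovian under a known sensor-to-actuation delay. The class name, delay length, and dimensions are illustrative assumptions.

```python
import numpy as np
from collections import deque

class DelayAugmentedState:
    """Sketch of delay-aware state augmentation: keep a buffer of the last
    `delay_steps` actions and append them to the current observation, so the
    policy sees the actions still "in flight" toward the actuators."""

    def __init__(self, delay_steps: int, action_dim: int):
        self.delay_steps = delay_steps
        # Buffer of the last delay_steps actions, initialized to zeros.
        self.action_buffer = deque(
            [np.zeros(action_dim) for _ in range(delay_steps)],
            maxlen=delay_steps,
        )

    def augment(self, observation: np.ndarray) -> np.ndarray:
        # Augmented state = current observation + pending (delayed) actions.
        return np.concatenate([observation, *self.action_buffer])

    def push_action(self, action: np.ndarray) -> None:
        # Record the action just issued; it takes effect only after the delay.
        self.action_buffer.append(np.asarray(action, dtype=float))


# Hypothetical usage: 2-D steering/throttle action, 3-step actuation delay.
aug = DelayAugmentedState(delay_steps=3, action_dim=2)
obs = np.zeros(10)                      # placeholder sensor observation
state = aug.augment(obs)                # feed this augmented state to the DRL policy
aug.push_action(np.array([0.1, 0.5]))   # store the chosen action for the next step
```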

Comparison of simulation and experiment

Cite this article as:
K. Emura and R. Hayashi, “Application and Experimental Verification of a Delay-Aware Deep Reinforcement Learning Method in End-to-End Autonomous Driving Control,” J. Robot. Mechatron., Vol.37 No.5, pp. 1024-1033, 2025.