JACIII Vol.28 No.4 pp. 882-892
doi: 10.20965/jaciii.2024.p0882
(2024)

Research Paper:

Imitating with Sequential Masks: Alleviating Causal Confusion in Autonomous Driving

Huanghui Zhang* and Zhi Zheng*,**,†

*College of Computer and Cyber Security, Fujian Normal University
No.8 Xuefu South Road, Shangjie, Minhou, Fuzhou, Fujian 350117, China

**College of Control Science and Engineering, Zhejiang University
No.38 Zheda Road, West Lake District, Hangzhou, Zhejiang 310027, China

†Corresponding author

Received: December 15, 2023
Accepted: March 21, 2024
Published: July 20, 2024
Keywords: causal confusion, invariant feature learning, imitation learning

Abstract

Imitation learning, which uses only expert demonstrations, is well suited to safety-critical tasks such as autonomous driving. However, it suffers from causal confusion: when more features are offered, an agent may perform even worse. We therefore aim to improve agents’ imitation ability in driving scenarios under a sequential setting, using a novel method that we propose, sequential masking imitation learning (SEMI). Inspired by the idea of Granger causality, we improve the imitator’s performance through a random masking operation on the encoded features in a sequential setting. With this design, the imitator is forced to focus on critical features, which leads to a more robust model. We demonstrate that this method can alleviate causal confusion in driving simulations by deploying it in the CARLA simulator and comparing it with other methods. The experimental results show that SEMI can effectively reduce causal confusion during autonomous driving.
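
The random masking operation on encoded features mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `mask_encoded_sequence`, the `keep_prob` parameter, and the choice to share one mask across all time steps are assumptions made only for illustration, since the abstract does not specify how the mask is drawn.

```python
import random


def mask_encoded_sequence(encoded, keep_prob=0.5, seed=None):
    """Randomly zero out feature dimensions of a sequence of encoded features.

    encoded   : list of time steps, each a list of floats of equal length.
    keep_prob : probability that a feature dimension is kept (hypothetical parameter).
    seed      : optional seed for reproducibility.

    Assumption: a single mask is drawn per sequence and applied to every
    time step. The paper may draw its masks differently.
    """
    rng = random.Random(seed)
    dim = len(encoded[0])
    # One Bernoulli draw per feature dimension: 1.0 keeps it, 0.0 masks it.
    mask = [1.0 if rng.random() < keep_prob else 0.0 for _ in range(dim)]
    # Apply the same mask to the encoded features at every time step.
    masked = [[value * m for value, m in zip(step, mask)] for step in encoded]
    return masked, mask


# Toy sequence: 3 time steps, 4 encoded features each (made-up numbers).
sequence = [
    [0.2, -1.0, 0.5, 0.9],
    [0.1, -0.8, 0.4, 1.1],
    [0.0, -0.7, 0.6, 1.0],
]
masked_sequence, mask = mask_encoded_sequence(sequence, keep_prob=0.5, seed=0)
```

In a training loop, a policy fed `masked_sequence` instead of `sequence` cannot rely on any single feature always being present, which is the intuition behind forcing the imitator to focus on critical features.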

Alleviating causal confusion

Cite this article as:
H. Zhang and Z. Zheng, “Imitating with Sequential Masks: Alleviating Causal Confusion in Autonomous Driving,” J. Adv. Comput. Intell. Intell. Inform., Vol.28 No.4, pp. 882-892, 2024.


Last updated on Sep. 09, 2024