Paper:
Adaptive Cruise Control Based on Reinforcement Learning with Shaping Rewards
Zhaohui Hu and Dongbin Zhao
Institute of Automation, Chinese Academy of Sciences, No.95, Zhongguancun East Road, Beijing 100190, China
This paper proposes a Supervised Reinforcement Learning (SRL) algorithm for the Adaptive Cruise Control (ACC) system that complies with human driving habits; the control task can be viewed as a dynamic programming problem with stochastic demands. In short, the ACC problem amounts to the host vehicle selecting different control parameters (accelerations in the upper controller, brake and throttle commands in the lower controller) while following a preceding vehicle or handling other driving situations, according to the driver’s behavior. We discretize the relative speed and the relative distance to construct two-dimensional states and map them to a one-dimensional state space; we discretize the acceleration to generate the action set; and we design additional speed-improvement and distance-improvement shaping rewards to construct the supervisor. We apply the SRL algorithm to the ACC problem in different scenarios. The results show that the SRL control policy is more robust and accurate than traditional control methods.
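As a rough illustration of the state/action encoding and the shaped learning update described in the abstract, the following Python sketch builds the two-dimensional discretization, maps it to a one-dimensional state index, and adds speed-improvement and distance-improvement bonuses to a tabular Q-learning update. All bin edges, learning parameters, and the ±1 improvement bonuses are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

# Hypothetical discretization: edges and ranges are illustrative, not the paper's.
REL_SPEED_EDGES = np.linspace(-5.0, 5.0, 11)   # relative speed (m/s) -> 10 bins
REL_DIST_EDGES = np.linspace(0.0, 100.0, 21)   # relative distance (m) -> 20 bins
ACTIONS = np.linspace(-3.0, 2.0, 11)           # discretized accelerations (m/s^2)

N_SPEED = len(REL_SPEED_EDGES) - 1
N_DIST = len(REL_DIST_EDGES) - 1

def encode_state(rel_speed, rel_dist):
    """Map the two discretized dimensions to a single one-dimensional index."""
    i = int(np.clip(np.digitize(rel_speed, REL_SPEED_EDGES) - 1, 0, N_SPEED - 1))
    j = int(np.clip(np.digitize(rel_dist, REL_DIST_EDGES) - 1, 0, N_DIST - 1))
    return i * N_DIST + j

def shaping_reward(prev_speed_err, speed_err, prev_dist_err, dist_err):
    """Speed-improvement and distance-improvement bonuses: reward a step that
    shrinks the tracking error, penalize one that grows it (assumed magnitudes)."""
    r_speed = 1.0 if abs(speed_err) < abs(prev_speed_err) else -1.0
    r_dist = 1.0 if abs(dist_err) < abs(prev_dist_err) else -1.0
    return r_speed + r_dist

Q = np.zeros((N_SPEED * N_DIST, len(ACTIONS)))
ALPHA, GAMMA = 0.1, 0.95

def q_update(s, a, base_reward, shaped, s_next):
    """One tabular Q-learning step with the shaping term added to the base reward."""
    target = base_reward + shaped + GAMMA * Q[s_next].max()
    Q[s, a] += ALPHA * (target - Q[s, a])
```

At control time, a greedy policy would then apply `ACTIONS[np.argmax(Q[s])]` for the current encoded state `s`.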
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.