Hybrid Bidirectional Rapidly Exploring Random Tree Path Planning Algorithm with Reinforcement Learning

Junkui Wang; Kaoru Hirota; Xiangdong Wu; Yaping Dai; Zhiyang Jia

doi:10.20965/jaciii.2021.p0121

single-jc.php

« previous

JACIII Vol.25 No.1 pp. 121-129

doi: 10.20965/jaciii.2021.p0121

(2021)

Paper:

Views over last 60 days: 1,078

Hybrid Bidirectional Rapidly Exploring Random Tree Path Planning Algorithm with Reinforcement Learning

Junkui Wang, Kaoru Hirota, Xiangdong Wu, Yaping Dai, and Zhiyang Jia^†

School of Automation, Beijing Institute of Technology
No.5 Zhongguancun South Street, Haidian District, Beijing 100081, China

^†Corresponding author

Received:

October 25, 2020

Accepted:

November 17, 2020

Published:

January 20, 2021

Keywords:

path planning, reinforcement learning, Q-learning, rapidly exploring random tree, target gravitational strategy

Abstract

The randomness of path generation and slow convergence to the optimal path are two major problems in the current rapidly exploring random tree (RRT) path planning algorithm. Herein, a novel reinforcement-learning-based hybrid bidirectional rapidly exploring random tree (H-BRRT) is presented to solve these problems. To model the random exploration process, a target gravitational strategy is introduced. Reinforcement learning is applied to the improved target gravitational strategy using two operations: random exploration and target gravitational exploration. The algorithm is controlled to switch operations adaptively according to the accumulated performance. It not only improves the search efficiency, but also shortens the generated path after the proposed strategy is applied to a bidirectional rapidly exploring random tree (BRRT). In addition, to solve the problem of the traditional RRT continuously falling into the local optimum, an improved exploration strategy with collision weight is applied to the BRRT. Experimental results implemented in a robot operating system indicate that the proposed H-BRRT significantly outperforms alternative approaches such as the RRT and BRRT. The proposed algorithm enhances the capability of identifying unknown spaces and avoiding local optima.

Cite this article as:

J. Wang, K. Hirota, X. Wu, Y. Dai, and Z. Jia, “Hybrid Bidirectional Rapidly Exploring Random Tree Path Planning Algorithm with Reinforcement Learning,” J. Adv. Comput. Intell. Intell. Inform., Vol.25 No.1, pp. 121-129, 2021.

Data files:

References

[1] A. V. Le, V. Prabakaran, S. Sivanantham, and R. E. Mohan, “Modified A-Star Algorithm for Efficient Coverage Path Planning in Tetris Inspired Self-Reconfigurable Robot with Integrated Laser Sensor,” Sensors, Vol.18, No.8, p. 2585, 2018.
[2] K. P. Cheng, R. E. Mohan, K. N. N. Huu, and A. V. Le, “Multi-Objective Genetic Algorithm-Based Autonomous Path Planning for Hinged-Tetro Reconfigurable Tiling Robot,” IEEE Access, Vol.8, pp. 121267-121284, 2020.
[3] H. Ali, D. Gong, M. Wang, and X. Dai, “Path planning of mobile robot with improved ant colony algorithm and MDP to produce smooth trajectory in grid-based environment,” Frontiers Neurorobotics, Vol.14, p. 44, 2020.
[4] S. Askari, N. Montazerin, and M. H. Fazel Zarandi, “Modeling energy flow in natural gas networks using time series disaggregation and fuzzy systems tuned by particle swarm optimization,” Appl. Soft Comput., Vol.92, p. 106332, 2020.
[5] M. A. Baumann, S. Léonard, E. A. Croft, and J. J. Little, “Path Planning for Improved Visibility Using a Probabilistic Road Map,” IEEE Trans. Robotics, Vol.26, No.1, pp. 195-200, 2010.
[6] M. Kothari and I. Postlethwaite, “A Probabilistically Robust Path Planning Algorithm for UAVs Using Rapidly-Exploring Random Trees,” J. Intell. Robotic Syst. Robotics, Vol.71, No.2, pp. 231-253, 2013.
[7] S. M. LaValle, “Rapidly-exploring random trees: A new tool for path planning,” Iowa State University, Computer Science Dept., TR 98-11, Tech. Rep., 1998.
[8] A. H. Qureshi and Y. Ayaz, “Intelligent bidirectional rapidly-exploring random trees for optimal motion planning in complex cluttered environments,” Robotics and Autonomous Systems, Vol.68, pp. 1-11, 2015.
[9] N. A. Melchior and R. Simmons, “Particle RRT for path planning with uncertainty,” Proc. of IEEE Int. Conf. on Robotics and Automation, pp. 1617-1624, 2007.
[10] S. M. LaValle and J. J. Kuffner Jr., “Randomized kinodynamic planning,” Int. J. Robotics Res., Vol.20, No.5, pp. 378-400, 2001.
[11] A. Aldahak and A. Elnagar, “A Practical pursuit-evasion algorithm: detection and tracking,” Proc. of IEEE Int. Conf. on Robotics and Automation, Rome, pp. 343-348, 2007.
[12] L. Jaillet, A. Yershova, S. M. LaValle, and T. Siméon, “Adaptive tuning of the sampling domain for dynamic-domain RRTs,” Proc. of IEEE/RSJ Int. Conf. on Intelligent Robots and Systems. Edmonton, pp. 2851-2856, 2005.
[13] A. Yershova, L. Jaillet, T. Simeon, and S. M. LaValle, “Dynamic domain RRTs: Efficient exploration by controlling the sampling domain,” Proc. of IEEE Int. Conf. on Robotics and Automation, Barcelona, pp. 3856-3861, 2005.
[14] B. Burns and O. Brock, “Single-query motion planning with utility guided random trees,” Proc. of IEEE Int. Conf. on Robotics and Automation, Rome, pp. 2007-3307, 2007.
[15] L. Kang, C. X. Zhao, and J. H. Guo, “Path Planning Based on Fuzzy Rolling Rapidly-exploring Random Tree for Mobile Robot,” J. of Nanjing University of Science and Technology (Nature Science), Vol.34, No.5, pp. 642-648, 2010.
[16] J. Z. Song, B. Dai, E. Z. Shan, and H. G. He, “An improved RRT path planning algorithm,” Acta Electronica Sinica, Vol.38, No.2A, pp. 225-228, 2010.
[17] H. Peng, L. Wang, and L. C. Shen, “The Modified RRT-based Real-time Route Planning for UAV Area Target Searching,” J. of National University of Defense Technology, Vol.31, No.5, pp. 86-91, 2009.
[18] S. Li, X. Xu, and L. Zuo, “Dynamic Path Planning of a Mobile Robot with Improved Q-Learning Algorithm,” IEEE Int. Conf. on Information and Automation, pp. 409-414, 2015.
[19] H. Wu, L. Cai, and X. Gao, “Online Pheromone Stringency Guiding Heuristically Accelerated Q-Learning,” Application Research of Computers, Vol.35, No.8, pp. 2323-2327, 2018.
[20] E. Wiewiora, “Potential-based shaping and Q-value initialization are equivalent,” J. of Artificial Intelligence Research, Vol.19, pp. 205-208, 2003.
[21] Y. Song, Y. Li, C. Li, and G. Zhang, “An efficient initialization approach of Q-learning for mobile robots,” Int. J. of Control Automation & Systems, Vol.10, pp. 166-172, 2012.
[22] M. Simsek, A. Czylwik, A. Galindo-Serrano, and L. Giupponi, “Improved decentralized Q-learning algorithm for interference reduction in LTE-femtocells,” Wireless Advanced, pp. 138-143, 2011.
[23] C. Yan and X. Xiang, “A path planning algorithm for UAV based on improved Q-learning,” Int. Conf. on Robotics and Automation Sciences, 2018.

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 Internationa License.

[1] [1] A. V. Le, V. Prabakaran, S. Sivanantham, and R. E. Mohan, “Modified A-Star Algorithm for Efficient Coverage Path Planning in Tetris Inspired Self-Reconfigurable Robot with Integrated Laser Sensor,” Sensors, Vol.18, No.8, p. 2585, 2018.

[2] [2] K. P. Cheng, R. E. Mohan, K. N. N. Huu, and A. V. Le, “Multi-Objective Genetic Algorithm-Based Autonomous Path Planning for Hinged-Tetro Reconfigurable Tiling Robot,” IEEE Access, Vol.8, pp. 121267-121284, 2020.

[3] [3] H. Ali, D. Gong, M. Wang, and X. Dai, “Path planning of mobile robot with improved ant colony algorithm and MDP to produce smooth trajectory in grid-based environment,” Frontiers Neurorobotics, Vol.14, p. 44, 2020.

[4] [4] S. Askari, N. Montazerin, and M. H. Fazel Zarandi, “Modeling energy flow in natural gas networks using time series disaggregation and fuzzy systems tuned by particle swarm optimization,” Appl. Soft Comput., Vol.92, p. 106332, 2020.

[5] [5] M. A. Baumann, S. Léonard, E. A. Croft, and J. J. Little, “Path Planning for Improved Visibility Using a Probabilistic Road Map,” IEEE Trans. Robotics, Vol.26, No.1, pp. 195-200, 2010.

[6] [6] M. Kothari and I. Postlethwaite, “A Probabilistically Robust Path Planning Algorithm for UAVs Using Rapidly-Exploring Random Trees,” J. Intell. Robotic Syst. Robotics, Vol.71, No.2, pp. 231-253, 2013.

[7] [7] S. M. LaValle, “Rapidly-exploring random trees: A new tool for path planning,” Iowa State University, Computer Science Dept., TR 98-11, Tech. Rep., 1998.

[8] [8] A. H. Qureshi and Y. Ayaz, “Intelligent bidirectional rapidly-exploring random trees for optimal motion planning in complex cluttered environments,” Robotics and Autonomous Systems, Vol.68, pp. 1-11, 2015.

[9] [9] N. A. Melchior and R. Simmons, “Particle RRT for path planning with uncertainty,” Proc. of IEEE Int. Conf. on Robotics and Automation, pp. 1617-1624, 2007.

[10] [10] S. M. LaValle and J. J. Kuffner Jr., “Randomized kinodynamic planning,” Int. J. Robotics Res., Vol.20, No.5, pp. 378-400, 2001.

[11] [11] A. Aldahak and A. Elnagar, “A Practical pursuit-evasion algorithm: detection and tracking,” Proc. of IEEE Int. Conf. on Robotics and Automation, Rome, pp. 343-348, 2007.

[12] [12] L. Jaillet, A. Yershova, S. M. LaValle, and T. Siméon, “Adaptive tuning of the sampling domain for dynamic-domain RRTs,” Proc. of IEEE/RSJ Int. Conf. on Intelligent Robots and Systems. Edmonton, pp. 2851-2856, 2005.

[13] [13] A. Yershova, L. Jaillet, T. Simeon, and S. M. LaValle, “Dynamic domain RRTs: Efficient exploration by controlling the sampling domain,” Proc. of IEEE Int. Conf. on Robotics and Automation, Barcelona, pp. 3856-3861, 2005.

[14] [14] B. Burns and O. Brock, “Single-query motion planning with utility guided random trees,” Proc. of IEEE Int. Conf. on Robotics and Automation, Rome, pp. 2007-3307, 2007.

[15] [15] L. Kang, C. X. Zhao, and J. H. Guo, “Path Planning Based on Fuzzy Rolling Rapidly-exploring Random Tree for Mobile Robot,” J. of Nanjing University of Science and Technology (Nature Science), Vol.34, No.5, pp. 642-648, 2010.

[16] [16] J. Z. Song, B. Dai, E. Z. Shan, and H. G. He, “An improved RRT path planning algorithm,” Acta Electronica Sinica, Vol.38, No.2A, pp. 225-228, 2010.

[17] [17] H. Peng, L. Wang, and L. C. Shen, “The Modified RRT-based Real-time Route Planning for UAV Area Target Searching,” J. of National University of Defense Technology, Vol.31, No.5, pp. 86-91, 2009.

[18] [18] S. Li, X. Xu, and L. Zuo, “Dynamic Path Planning of a Mobile Robot with Improved Q-Learning Algorithm,” IEEE Int. Conf. on Information and Automation, pp. 409-414, 2015.

[19] [19] H. Wu, L. Cai, and X. Gao, “Online Pheromone Stringency Guiding Heuristically Accelerated Q-Learning,” Application Research of Computers, Vol.35, No.8, pp. 2323-2327, 2018.

[20] [20] E. Wiewiora, “Potential-based shaping and Q-value initialization are equivalent,” J. of Artificial Intelligence Research, Vol.19, pp. 205-208, 2003.

[21] [21] Y. Song, Y. Li, C. Li, and G. Zhang, “An efficient initialization approach of Q-learning for mobile robots,” Int. J. of Control Automation & Systems, Vol.10, pp. 166-172, 2012.

[22] [22] M. Simsek, A. Czylwik, A. Galindo-Serrano, and L. Giupponi, “Improved decentralized Q-learning algorithm for interference reduction in LTE-femtocells,” Wireless Advanced, pp. 138-143, 2011.

[23] [23] C. Yan and X. Xiang, “A path planning algorithm for UAV based on improved Q-learning,” Int. Conf. on Robotics and Automation Sciences, 2018.

Hybrid Bidirectional Rapidly Exploring Random Tree Path Planning Algorithm with Reinforcement Learning

Junkui Wang, Kaoru Hirota, Xiangdong Wu, Yaping Dai, and Zhiyang Jia†

Junkui Wang, Kaoru Hirota, Xiangdong Wu, Yaping Dai, and Zhiyang Jia^†