Paper:

# A Curiosity-Based Autonomous Navigation Algorithm for Maze Robot

## Xiaoping Zhang, Yihao Liu^{†}, Li Wang, Dunli Hu, and Lei Liu

School of Electrical and Control Engineering, North China University of Technology

No.5 Jinyuanzhuang Road, Shijingshan District, Beijing 100144, China

^{†}Corresponding author

External rewards play an important role in reinforcement learning, and the quality of their design determines the final performance of the algorithm. However, in several real-world scenarios, rewards extrinsic to the agent are extremely sparse, which is particularly evident in mobile robot navigation. To address this problem, this paper proposes a curiosity-based autonomous navigation algorithm that consists of a reinforcement learning framework and a curiosity system. The curiosity system has three parts: a prediction network, an associative memory network, and curiosity rewards. The prediction network predicts the next state, and the associative memory network represents the world. Based on the associative memory network, an inference algorithm and a distance calibration algorithm are designed. Curiosity rewards are combined with extrinsic rewards as complementary inputs to the Q-learning algorithm. Simulation results show that the algorithm helps the agent reduce repeated exploration of the environment during autonomous navigation and that it also exhibits better convergence.

*J. Adv. Comput. Intell. Intell. Inform.*, Vol.26 No.6, pp. 893-904, 2022.
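
The idea of feeding curiosity rewards and extrinsic rewards jointly into Q-learning can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the form of the curiosity reward, so a simple count-based novelty bonus stands in here for the prediction network and associative memory network. The maze layout `MAZE`, the weight `beta`, the helper names, and all hyperparameter values are hypothetical.

```python
import math
import random
from collections import defaultdict

# Hypothetical 4x4 grid maze: 'S' start, 'G' goal, '#' wall, '.' free cell.
MAZE = ["S..#",
        ".#..",
        "...#",
        "#..G"]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
START = (0, 0)


def step(state, action):
    """Apply an action; hitting a wall or the border leaves the state unchanged."""
    r, c = state
    nr, nc = r + action[0], c + action[1]
    inside = 0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0])
    if not inside or MAZE[nr][nc] == "#":
        nr, nc = r, c
    done = MAZE[nr][nc] == "G"
    extrinsic = 1.0 if done else 0.0  # sparse: reward only at the goal
    return (nr, nc), extrinsic, done


def curiosity_bonus(visits, state):
    """Assumed stand-in for the curiosity reward: decays with prior visits."""
    return 1.0 / math.sqrt(1 + visits[state])


def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.1, beta=0.1, seed=0):
    """Tabular Q-learning on extrinsic reward plus beta * curiosity bonus."""
    rng = random.Random(seed)
    q = defaultdict(float)     # Q[(state, action_index)]
    visits = defaultdict(int)  # number of times each state was entered
    n_actions = len(ACTIONS)
    for _ in range(episodes):
        state = START
        for _ in range(200):  # cap on steps per episode
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: q[(state, i)])
            nxt, r_ext, done = step(state, ACTIONS[a])
            # Combine the two reward signals before the Q-learning update.
            reward = r_ext + beta * curiosity_bonus(visits, nxt)
            visits[nxt] += 1
            best_next = 0.0 if done else max(q[(nxt, i)] for i in range(n_actions))
            q[(state, a)] += alpha * (reward + gamma * best_next - q[(state, a)])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the greedy policy from the start; may stop before the goal."""
    state, path = START, [START]
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[(state, i)])
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path
```

Calling `greedy_path(train())` returns the sequence of cells visited under the learned greedy policy. Setting `beta=0` recovers plain Q-learning on the sparse extrinsic reward, which gives a baseline for judging how much the bonus reduces repeated exploration in this toy setting.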

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.