Technical Paper:
VGG-16-Based Map-Less Navigation Architecture with Temporal Vision Mosaic for Autonomous Ground Robots
Antonio Galiza Cerdeira Gonzalez*, Gentiane Venture**, Ikuo Mizuuchi***, and Bipin Indurkhya*

*Center for Cognitive Science, Jagiellonian University
ul. Ingardena 3, Kraków 30, Poland
Corresponding author
**Department of Mechanical Engineering, Graduate School of Engineering, The University of Tokyo
Tokyo, Japan
***Division of Advanced Mechanical Systems Engineering, Institute of Engineering, Tokyo University of Agriculture and Technology
Tokyo, Japan
This paper introduces a novel VGG-16-based visual navigation architecture for a differential-drive robot, Social Plantroid, using a temporal vision mosaic: a novel approach that joins the robot's current and previous vision frames for heading estimation. As a minor contribution, it integrates a novel sunlight/shadow detection algorithm based on Gabor filters. The neural network is trained on simulated data generated with the artificial potential field method, which is another novelty for map-less robot navigation. Virtual and real-world experiments validate the effectiveness of the architecture for obstacle avoidance and navigation.
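To make the temporal vision mosaic idea concrete, the sketch below joins the previous and current camera frames into a single mosaic image and passes it to a VGG-16 backbone with a small regression head that outputs a heading estimate. This is a minimal illustration only: the side-by-side concatenation layout, the input resolution, the regression head, and the class name `MosaicHeadingNet` are assumptions for this sketch, not details taken from the paper.

```python
# Minimal sketch of a "temporal vision mosaic" heading estimator.
# Assumptions (not from the paper): frames are stacked side by side,
# the backbone is torchvision's VGG-16, and a single heading angle is
# regressed from the mosaic.
import torch
import torch.nn as nn
from torchvision.models import vgg16


class MosaicHeadingNet(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = vgg16(weights=None)           # VGG-16 feature extractor
        self.features = backbone.features
        self.pool = nn.AdaptiveAvgPool2d((7, 7))
        self.head = nn.Sequential(               # small regression head (assumed)
            nn.Flatten(),
            nn.Linear(512 * 7 * 7, 256),
            nn.ReLU(inplace=True),
            nn.Linear(256, 1),                   # predicted heading (e.g., radians)
        )

    def forward(self, prev_frame: torch.Tensor, curr_frame: torch.Tensor) -> torch.Tensor:
        # Join the previous and current frames into one mosaic image
        # (here: horizontal concatenation along the width axis).
        mosaic = torch.cat([prev_frame, curr_frame], dim=3)   # (N, 3, H, 2W)
        x = self.features(mosaic)
        x = self.pool(x)
        return self.head(x)


if __name__ == "__main__":
    net = MosaicHeadingNet()
    prev = torch.rand(1, 3, 224, 224)   # previous RGB frame
    curr = torch.rand(1, 3, 224, 224)   # current RGB frame
    print(net(prev, curr).shape)        # torch.Size([1, 1])
```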
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.