Technical Paper:
VGG-16-Based Map-Less Navigation Architecture with Temporal Vision Mosaic for Autonomous Ground Robots
Antonio Galiza Cerdeira Gonzalez*, Gentiane Venture**, Ikuo Mizuuchi***, and Bipin Indurkhya*

*Center for Cognitive Science, Jagiellonian University
ul. Ingardena 3, Kraków 30, Poland
Corresponding author
**Department of Mechanical Engineering, Graduate School of Engineering, The University of Tokyo
Tokyo, Japan
***Division of Advanced Mechanical Systems Engineering, Institute of Engineering, Tokyo University of Agriculture and Technology
Tokyo, Japan
This paper introduces a novel VGG-16-based visual navigation architecture for a differential-drive robot, Social Plantroid, using a temporal vision mosaic: a novel approach that joins the robot's current and previous vision frames for heading estimation. As a minor contribution, it integrates a novel sunlight/shadow detection algorithm based on Gabor filters. The neural network is trained on simulated data generated with the artificial potential field method, which is another novelty for map-less robot navigation. Virtual and real-world experiments validate the effectiveness of the architecture for obstacle avoidance and navigation.
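To make the temporal vision mosaic idea concrete, the sketch below joins the previous and current camera frames into a single mosaic image and passes it to a VGG-16 backbone with a small regression head that outputs a heading estimate. This is a minimal illustration only: the side-by-side concatenation layout, the input resolution, the regression head, and the class name `MosaicHeadingNet` are assumptions for this sketch, not details taken from the paper.

```python
# Minimal sketch of a "temporal vision mosaic" heading estimator.
# Assumptions (not from the paper): frames are stacked side by side,
# the backbone is torchvision's VGG-16, and a single heading angle is
# regressed from the mosaic.
import torch
import torch.nn as nn
from torchvision.models import vgg16


class MosaicHeadingNet(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = vgg16(weights=None)           # VGG-16 feature extractor
        self.features = backbone.features
        self.pool = nn.AdaptiveAvgPool2d((7, 7))
        self.head = nn.Sequential(               # small regression head (assumed)
            nn.Flatten(),
            nn.Linear(512 * 7 * 7, 256),
            nn.ReLU(inplace=True),
            nn.Linear(256, 1),                   # predicted heading (e.g., radians)
        )

    def forward(self, prev_frame: torch.Tensor, curr_frame: torch.Tensor) -> torch.Tensor:
        # Join the previous and current frames into one mosaic image
        # (here: horizontal concatenation along the width axis).
        mosaic = torch.cat([prev_frame, curr_frame], dim=3)   # (N, 3, H, 2W)
        x = self.features(mosaic)
        x = self.pool(x)
        return self.head(x)


if __name__ == "__main__":
    net = MosaicHeadingNet()
    prev = torch.rand(1, 3, 224, 224)   # previous RGB frame
    curr = torch.rand(1, 3, 224, 224)   # current RGB frame
    print(net(prev, curr).shape)        # torch.Size([1, 1])
```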
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.