
JACIII Vol.30 No.1 pp. 78-95
doi: 10.20965/jaciii.2026.p0078
(2026)

Research Paper:

Saccade-Mimicking Grid Cell Network for Image Recognition

Yuichi Matsuda*, Kazuma Niwa*, Takeru Aoki**, Keiki Takadama***, and Hiroyuki Sato*

*Graduate School of Informatics and Engineering, The University of Electro-Communications
1-5-1 Chofugaoka, Chofu, Tokyo 182-8585, Japan

**Faculty of Engineering, Tokyo University of Science
6-3-1 Niijuku, Katsushika-ku, Tokyo 125-8585, Japan

***Graduate School of Information Science and Technology, The University of Tokyo
6-2-3 Kashiwanoha, Kashiwa, Chiba 277-0882, Japan

Received: May 16, 2025
Accepted: August 16, 2025
Published: January 20, 2026
Keywords: grid cell net, cortical learning algorithm, saccadic eye movement, time-series prediction

Abstract

This study proposes an autonomous focal location transition mechanism that mimics saccadic eye movements for image recognition. A grid cell net (GCN) is an image recognition algorithm inspired by the human neocortex. At each time step, it focuses on a specific region of the input image and sequentially shifts its attention across the visual field. The GCN receives both the feature pattern at the current focal location and the transition vector from the previous location, and performs recognition by integrating these time-series signals. However, the conventional GCN selects focal locations randomly and lacks a mechanism for autonomously determining effective transitions. To address this limitation, we introduce a novel method that incorporates candidate class selection and next-location prediction to guide the transition process. Candidate class selection identifies the most probable class in each step, and the next-location predictor effectively reduces the number of remaining candidate classes. Experiments on the MNIST dataset demonstrate that the proposed GCN learns to focus on the image edges during the early stages of recognition. Furthermore, the proposed GCN autonomously controlled its focal transitions and consistently outperformed the conventional GCN in terms of recognition accuracy.
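The interplay between candidate class selection and next-location choice can be illustrated with a toy sketch. This is not the authors' GCN: it assumes hypothetical 3x3 binary templates (`TEMPLATES`), exact feature matching, and a simple greedy rule in place of the learned next-location predictor. The names `next_location` and `recognize` are illustrative only.

```python
from collections import Counter

# Hypothetical 3x3 binary templates, one per class (toy data, not MNIST).
TEMPLATES = {
    "A": (1, 1, 1,
          1, 0, 1,
          1, 1, 1),
    "B": (1, 1, 1,
          0, 1, 0,
          0, 1, 0),
    "C": (1, 0, 0,
          1, 0, 0,
          1, 1, 1),
}


def next_location(candidates, visited):
    """Greedy stand-in for a next-location predictor (an assumption).

    Picks the unvisited location whose worst-case observation would
    leave the fewest candidate classes. Returns None if all are visited.
    """
    best_loc, best_worst = None, None
    for loc in range(9):
        if loc in visited:
            continue
        counts = Counter(TEMPLATES[c][loc] for c in candidates)
        worst = max(counts.values())
        if best_worst is None or worst < best_worst:
            best_loc, best_worst = loc, worst
    return best_loc


def recognize(image, start=0):
    """Return (remaining candidate classes, visited focal locations)."""
    candidates = set(TEMPLATES)
    visited = []
    loc = start
    while len(candidates) > 1 and loc is not None:
        visited.append(loc)
        # Candidate class selection: keep classes that agree with the
        # feature observed at the current focal location.
        candidates = {c for c in candidates if TEMPLATES[c][loc] == image[loc]}
        if len(candidates) > 1:
            loc = next_location(candidates, visited)
    return candidates, visited


if __name__ == "__main__":
    print(recognize(TEMPLATES["B"]))
```

In this toy setting, each focal transition is chosen to split the remaining candidates as evenly as possible, so recognition ends after a few glimpses rather than a full scan. A random-transition baseline could be sketched by replacing `next_location` with a random choice among unvisited locations.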

The proposed GCN mimics saccades to autonomously acquire focal locations


Cite this article as:
Y. Matsuda, K. Niwa, T. Aoki, K. Takadama, and H. Sato, “Saccade-Mimicking Grid Cell Network for Image Recognition,” J. Adv. Comput. Intell. Intell. Inform., Vol.30 No.1, pp. 78-95, 2026.


Last updated on Jan. 21, 2026