JACIII Vol.11 No.2 pp. 155-161
doi: 10.20965/jaciii.2007.p0155


Reinforcement Learning Scheme for Flocking Behavior Emergence

Koichiro Morihiro*, **, Teijiro Isokawa**, Haruhiko Nishimura***, Masahito Tomimasu**, Naotake Kamiura**, and Nobuyuki Matsui**

*Hyogo University of Teacher Education, 942-1 Shimokume, Kato-shi, Hyogo 673-1494, Japan

**Division of Computer Engineering, Graduate School of Engineering, University of Hyogo, 2167 Shosha, Himeji, Hyogo 671-2201, Japan

***Graduate School of Applied Informatics, University of Hyogo, 1-3-3 Chuo-ku, Kobe, Hyogo 650-0044, Japan

Received: October 24, 2005
Accepted: December 22, 2006
Published: February 20, 2007
Keywords: collective behavior, flocking, perceptual internal space, reinforcement learning, Q-learning
Collective behavior such as bird flocking, land animal herding, and fish schooling is well known in nature. Many observations have shown that no leader controls the behavior of the group. Several models have been proposed to describe this grouping behavior, which we regard as a distinctive example of aggregate motion. In these models, a fixed interaction rule is given to each individual a priori, in a reductive and rigid manner. In contrast, we propose a new framework for the self-organized grouping of agents by reinforcement learning. Introducing a learning scheme is important for producing collective behavior in artificial autonomous distributed systems. The behavior of the agents is demonstrated and evaluated through computer simulations, which show that their grouping behavior emerges as a result of learning.
Cite this article as:
K. Morihiro, T. Isokawa, H. Nishimura, M. Tomimasu, N. Kamiura, and N. Matsui, “Reinforcement Learning Scheme for Flocking Behavior Emergence,” J. Adv. Comput. Intell. Intell. Inform., Vol.11 No.2, pp. 155-161, 2007.
