Reinforcement Learning Scheme for Flocking Behavior Emergence

Koichiro Morihiro; Teijiro Isokawa; Haruhiko Nishimura; Masahito Tomimasu; Naotake Kamiura; Nobuyuki Matsui

doi:10.20965/jaciii.2007.p0155

JACIII Vol.11 No.2 pp. 155-161 (2007)

Paper:

Koichiro Morihiro*, **, Teijiro Isokawa**, Haruhiko Nishimura***, Masahito Tomimasu**, Naotake Kamiura**, and Nobuyuki Matsui**

*Hyogo University of Teacher Education, 942-1 Shimokume, Kato-shi, Hyogo 673-1494, Japan

**Division of Computer Engineering, Graduate School of Engineering, University of Hyogo, 2167 Shosha, Himeji, Hyogo 671-2201, Japan

***Graduate School of Applied Informatics, University of Hyogo, 1-3-3 Chuo-ku, Kobe, Hyogo 650-0044, Japan

Received: October 24, 2005

Accepted: December 22, 2006

Published: February 20, 2007

Keywords: collective behavior, flocking, perceptual internal space, reinforcement learning, Q-learning

Abstract

Collective behavior such as bird flocking, land animal herding, and fish schooling is well known in nature. Many observations have shown that no leader controls the behavior of such a group. Several models have been proposed to describe this grouping behavior, which we regard as a distinctive example of aggregate motion. In these models, a fixed interaction rule is given to each individual a priori, in a reductive and rigid manner. In contrast, we propose a new framework for the self-organized grouping of agents by reinforcement learning. Introducing a learning scheme is important for bringing about collective behavior in artificial autonomous distributed systems. The behavior of the agents is demonstrated and evaluated through computer simulations, which show that grouping behavior emerges as a result of learning.
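
The keywords name Q-learning as the reinforcement learning method, and the tabular update rule from Watkins and Dayan can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not specify the state representation, the action set, the reward, or any parameter values, so the state labels, the `ACTIONS` names, `ALPHA`, `GAMMA`, and the epsilon-greedy selection rule below are all assumptions chosen for the example.

```python
import random
from collections import defaultdict

ALPHA = 0.1  # learning rate (assumed value, not from the paper)
GAMMA = 0.9  # discount factor (assumed value, not from the paper)
ACTIONS = ("approach", "parallel", "avoid")  # hypothetical action labels


def make_q_table():
    """Return a Q-table mapping state -> {action: value}, initialised to 0."""
    return defaultdict(lambda: {a: 0.0 for a in ACTIONS})


def choose_action(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy selection (an assumed exploration rule)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    values = q[state]
    return max(ACTIONS, key=lambda a: values[a])


def q_update(q, state, action, reward, next_state):
    """One-step Q-learning update:
    Q(s, a) += ALPHA * (r + GAMMA * max_a' Q(s', a') - Q(s, a)).
    """
    best_next = max(q[next_state].values())
    td_error = reward + GAMMA * best_next - q[state][action]
    q[state][action] += ALPHA * td_error
    return q[state][action]


if __name__ == "__main__":
    q = make_q_table()
    # Hypothetical transition: an agent in state "near" takes "avoid",
    # receives reward 1.0, and ends up in state "far".
    print(q_update(q, "near", "avoid", 1.0, "far"))
```

In a multi-agent setting, each agent would typically hold its own table and apply this update after every interaction with another agent; whether the paper does so is not stated in the abstract.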

Cite this article as:

K. Morihiro, T. Isokawa, H. Nishimura, M. Tomimasu, N. Kamiura, and N. Matsui, “Reinforcement Learning Scheme for Flocking Behavior Emergence,” J. Adv. Comput. Intell. Intell. Inform., Vol.11 No.2, pp. 155-161, 2007.

References

[1] I. Aoki, “A Simulation Study on the Schooling Mechanism in Fish,” Bulletin of the Japanese Society of Scientific Fisheries, 48(8), pp. 1081-1088, 1982.
[2] P. Y. Glorennec, “Reinforcement Learning: an Overview,” in Proceedings of European Symposium on Intelligent Techniques 2000, pp. 17-35, Aachen, Germany, 2000.
[3] A. Huth and C. Wissel, “The Simulation of the Movement of Fish Schools,” Journal of Theoretical Biology, 156, pp. 365-385, 1992.
[4] L. P. Kaelbling, M. L. Littman, and A. W. Moore, “Reinforcement Learning: A Survey,” Journal of Artificial Intelligence Research, 4, pp. 237-285, 1996.
[5] B. L. Partridge, “The Structure and Function of Fish Schools,” Scientific American, 246, pp. 90-99, 1982.
[6] C. W. Reynolds, “Flocks, Herds, and Schools: A Distributed Behavioral Model,” Computer Graphics, 21(4), pp. 25-34, 1987.
[7] E. Shaw, “Schooling Fishes,” American Scientist, 66, pp. 166-175, 1978.
[8] R. S. Sutton and A. G. Barto, “Reinforcement Learning,” MIT Press, Cambridge, MA, 1998.
[9] C. J. C. H. Watkins, “Learning from delayed rewards,” Ph.D. thesis, University of Cambridge, 1989.
[10] C. J. C. H. Watkins and P. Dayan, “Q-learning,” Machine Learning, 8, pp. 279-292, 1992.

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.

