Cooperative Behavior Learning Based on Social Interaction of State Conversion and Reward Exchange Among Multi-Agents
Kun Zhang*, Yoichiro Maeda**, and Yasutake Takahashi**
*Dept. of System Design Engineering, Graduate School of Engineering, University of Fukui
**Dept. of Human and Artificial Intelligent Systems, Graduate School of Engineering, University of Fukui, 3-9-1 Bunkyo, Fukui 910-8507, Japan
In multi-agent systems, autonomous agents must interact with one another to achieve good cooperative performance. We have therefore studied social interaction among agents to determine how they acquire cooperative behavior. We previously found that sharing environmental states improves agent cooperation in reinforcement learning, and that converting shared environmental states into target-related individual states improves it further. To improve cooperation still more, we propose reward redistribution based on reward exchange among agents. By receiving rewards from both the environment and other agents, agents learn how to adapt to the environment and how to explore and strengthen cooperation in tasks that a single agent cannot accomplish alone. Agents thus cooperate best through the combined interaction of state conversion and reward exchange.
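As a concrete illustration of these two mechanisms, the following is a minimal sketch of state conversion and reward exchange for two tabular Q-learning agents. The grid encoding, exchange rate, and function names are our own illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

# Sketch of the two interaction mechanisms for two tabular Q-learning
# agents. ALPHA, GAMMA, EXCHANGE_RATE, and the 4x4 grid encoding are
# illustrative assumptions, not the authors' exact settings.

ALPHA = 0.1          # learning rate
GAMMA = 0.9          # discount factor
EXCHANGE_RATE = 0.3  # share of reward passed to the partner (assumed)

N_STATES, N_ACTIONS = 16, 4
q_tables = [np.zeros((N_STATES, N_ACTIONS)) for _ in range(2)]

def convert_state(agent_pos, target_pos, grid=4):
    """State conversion: map the shared environmental state into a
    target-related individual state (here, the agent's offset to the
    target on a small toroidal grid; the encoding is assumed)."""
    dx = (target_pos[0] - agent_pos[0]) % grid
    dy = (target_pos[1] - agent_pos[1]) % grid
    return dx * grid + dy

def exchange_rewards(r0, r1):
    """Reward exchange: each agent keeps part of its environmental
    reward and receives a share of its partner's reward."""
    return ((1 - EXCHANGE_RATE) * r0 + EXCHANGE_RATE * r1,
            (1 - EXCHANGE_RATE) * r1 + EXCHANGE_RATE * r0)

def q_update(agent, s, a, r, s_next):
    """Standard tabular Q-learning update on the redistributed reward."""
    q = q_tables[agent]
    q[s, a] += ALPHA * (r + GAMMA * q[s_next].max() - q[s, a])
```

In one time step of this sketch, each agent would convert its observation with `convert_state`, select an action from its Q-table, pass the environmental rewards through `exchange_rewards`, and then call `q_update` with the redistributed reward, so that each agent's learning signal reflects its partner's outcome as well as its own.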
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.