Cooperative Behavior Learning Based on Social Interaction of State Conversion and Reward Exchange Among Multi-Agents
Kun Zhang*, Yoichiro Maeda**, and Yasutake Takahashi**
*Dept. of System Design Engineering, Graduate School of Engineering, University of Fukui
**Dept. of Human and Artificial Intelligent Systems, Graduate School of Engineering, University of Fukui, 3-9-1 Bunkyo, Fukui 910-8507, Japan
In multi-agent systems, autonomous agents must interact with one another to achieve good cooperative performance. We have therefore studied social interaction among agents to determine how they acquire cooperative behavior. We previously found that sharing environmental states improves agent cooperation in reinforcement learning, and that converting shared environmental states into target-related individual states improves it further. To improve cooperation still more, we propose reward redistribution based on reward exchange among agents. By receiving rewards from both the environment and other agents, agents learn how to adapt to the environment and how to explore and strengthen cooperation in tasks that a single agent cannot accomplish alone. Agents thus cooperate best through the combined interaction of state conversion and reward exchange.
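As a concrete illustration of these two mechanisms, the following is a minimal sketch of state conversion and reward exchange for two tabular Q-learning agents. The grid encoding, exchange rate, and function names are our own illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

# Sketch of the two interaction mechanisms for two tabular Q-learning
# agents. ALPHA, GAMMA, EXCHANGE_RATE, and the 4x4 grid encoding are
# illustrative assumptions, not the authors' exact settings.

ALPHA = 0.1          # learning rate
GAMMA = 0.9          # discount factor
EXCHANGE_RATE = 0.3  # share of reward passed to the partner (assumed)

N_STATES, N_ACTIONS = 16, 4
q_tables = [np.zeros((N_STATES, N_ACTIONS)) for _ in range(2)]

def convert_state(agent_pos, target_pos, grid=4):
    """State conversion: map the shared environmental state into a
    target-related individual state (here, the agent's offset to the
    target on a small toroidal grid; the encoding is assumed)."""
    dx = (target_pos[0] - agent_pos[0]) % grid
    dy = (target_pos[1] - agent_pos[1]) % grid
    return dx * grid + dy

def exchange_rewards(r0, r1):
    """Reward exchange: each agent keeps part of its environmental
    reward and receives a share of its partner's reward."""
    return ((1 - EXCHANGE_RATE) * r0 + EXCHANGE_RATE * r1,
            (1 - EXCHANGE_RATE) * r1 + EXCHANGE_RATE * r0)

def q_update(agent, s, a, r, s_next):
    """Standard tabular Q-learning update on the redistributed reward."""
    q = q_tables[agent]
    q[s, a] += ALPHA * (r + GAMMA * q[s_next].max() - q[s, a])
```

In one time step of this sketch, each agent would convert its observation with `convert_state`, select an action from its Q-table, pass the environmental rewards through `exchange_rewards`, and then call `q_update` with the redistributed reward, so that each agent's learning signal reflects its partner's outcome as well as its own.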
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.