# Designing Internal Reward of Reinforcement Learning Agents in Multi-Step Dilemma Problem

## Yoshihiro Ichikawa and Keiki Takadama

The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu, Tokyo 182-8585, Japan

*J. Adv. Comput. Intell. Intell. Inform.*, Vol.17 No.6, pp. 926-931, 2013.

- [1] C. J. C. H. Watkins and P. Dayan, “Technical note: Q-learning,” Machine Learning, Vol.8, pp. 55-58, 1992.
- [2] R. S. Sutton and A. G. Barto, “Reinforcement Learning -An Introduction-,” The MIT Press, 1998.
- [3] M. Tan, “Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents,” The 10th Int. Conf. on Machine Learning, pp. 330-337, 1993.
- [4] G. Weiss, “Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence,” The MIT Press, 1999.
- [5] P. Stone and M. Veloso, “Multiagent Systems: A Survey from a Machine Learning Perspective,” Autonomous Robots, Vol.8, pp. 345-383, 1997.
- [6] E. Yang and D. Gu, “Multiagent Reinforcement Learning for Multi-Robot Systems: A Survey,” Technical Report CSM-404, Department of Computer Science, University of Essex, 2004.
- [7] Y.-M. D. Hauwere, P. Vrancx, and A. Nowé, “Learning multiagent state space representations,” Proc. of the 9th Int. Conf. on Autonomous Agents and Multiagent Systems, Vol.1, pp. 715-722, 2010.
- [8] Y. Ichikawa, K. Sato, K. Hattori, and K. Takadama, “Entropy-based Conflict Avoidance According to Learning Progress in Multi-Agent Q-learning,” Proc. of the IADIS Int. Conf. on Intelligent Systems and Agents 2012 (ISA2012), 2012.
- [9] M. L. Littman, “Markov Games as a Framework for Multi-Agent Reinforcement Learning,” Proc. of the Eleventh Int. Conf. on Machine Learning, pp. 157-163, 1994.
- [10] J. Hu and M. P. Wellman, “Nash Q-Learning for General-Sum Stochastic Games,” J. of Machine Learning Research, Vol.4, pp. 1039-1069, 2003.

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.