Paper:
Multiple-Timescale PIA for Model-Based Reinforcement Learning
Tomohiro Yamaguchi* and Eri Imatani**
*Nara National College of Technology, 22 Yata-cho, Yamatokoriyama, Nara 639-1080, Japan
**Nara Institute of Science and Technology, 8916-5 Takayama, Ikoma, Nara 630-0192, Japan
- [1] M. Bowling and M. M. Veloso, “An analysis of stochastic game theory for multiagent reinforcement learning,” Technical report CMU-CS-00-165, Carnegie Mellon University, 2000.
- [2] R. S. Sutton and A. G. Barto, “Reinforcement Learning: An Introduction,” MIT Press, 1998.
- [3] A. G. Barto, S. J. Bradtke, and S. P. Singh, “Learning to act using real-time dynamic programming,” Artificial Intelligence, Vol.72, pp. 81-138, 1995.
- [4] M. L. Puterman, “Markov Decision Processes: Discrete Stochastic Dynamic Programming,” John Wiley & Sons, Inc., pp. 385-388, 1994.
- [5] Y. Shoham, R. Powers, and T. Grenager, “Multi-agent reinforcement learning: a critical survey,” Technical report, Stanford University, 2003.
- [6] J. Hu and M. P. Wellman, “Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm,” Proc. of the 15th Int. Conf. on Machine Learning, pp. 242-250, 1998.
- [7] M. Littman, “Markov games as a framework for multi-agent reinforcement learning,” Proc. of the 11th Int. Conf. on Machine Learning, pp. 157-163, 1994.
- [8] D. Leslie, “Multiple timescales for multiagent learning,” NIPS 2002 Workshop on Multi-Agent Learning: Theory and Practice, 2002.
- [9] M. Bowling and M. M. Veloso, “Multiagent learning using a variable learning rate,” Artificial Intelligence, Vol.136, pp. 215-250, 2002.
- [10] F. Kaplan, P-Y. Oudeyer, E. Kubinyi, and A. Miklosi, “Robotic clicker training,” Robotics and Autonomous Systems, Vol.38, No.3-4, pp. 197-206, 2002.
- [11] K. Satoh and T. Yamaguchi, “Preparing various policies for interactive reinforcement learning,” Proc. of the SICE-ICASE Int. Joint Conf., 2006.
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.