Paper:
About Profit Sharing Considering Infatuate Actions
Wataru Uemura
Ryukoku University
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.