Paper:
Language: English:
Journal ref: Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol.11, No.6 pp. 668-676, 2007
[1] P. Abbeel and A. Y. Ng, “Exploration and apprenticeship learning in reinforcement learning,” Proceedings of the 21st International Conference on Machine Learning, pp. 1-8, 2005.
[2] L. Chrisman, “Reinforcement Learning with perceptual aliasing: The Perceptual Distinctions Approach,” Proceedings of the 10th National Conference on Artificial Intelligence, pp. 183-188, 1992.
[3] H. Kimura and S. Kobayashi, “An analysis of actor/critic algorithms using eligibility traces: reinforcement learning with imperfect value function,” Proceedings of the 15th International Conference on Machine Learning, pp. 278-286, 1998.
[4] H. Kimura, “Reinforcement Learning in multi-dimensional stateaction space using random tiling and Gibbs sampling,” Transactionof the Society of Instrument and Control Engineers, Vol.42, No.12, 2006 (in Japanese).
[5] H. Kita, I. Ono, and S. Kobayashi, “Theoretical Analysis of the Unimodal Normal Distribution Crossover for Real-coded Genetic Algorithm,” Proceedings of 1998 IEEE Int. Conf. on Evolutionary Computation, pp. 529-534, 1998.
[6] K. Miyazaki, M. Yamamura, and S. Kobayashi, “On the Rationality of Profit Sharing in Reinforcement Learning,” 3rd International Conference on Fuzzy Logic, Neural Nets and Soft Computing, Iizuka, Japan, pp. 285-288, 1994.
[7] K. Miyazaki and S. Kobayashi, “Learning Deterministic Policies in Partially Observable Markov Decision Processes,” International Conference on Intelligent Autonomous System (IAS) 5, pp. 250-257, 1998.
[8] K. Miyazaki and S. Kobayashi, “Reinforcement Learning for Penalty Avoiding Policy Making,” 2000 IEEE International Conference on Systems, Man, and Cybernetics, pp. 206-211, 2000.
[9] K. Miyazaki and S. Kobayashi, “An Extension of Profit Sharing to Partially Observable Markov Decision Processes: Proposition of PS-r* and its Evaluation,” Journal of the Japanese Society for Artificial Intelligence, Vol.18, No.5, pp. 286-296, 2003 (in Japanese).
[10] K. Miyazaki and S. Kobayashi, “Reinforcement Learning Systems based on Profit Sharing in Robotics,” Proceedings of the 36th International Symposium on Robotics, 2005.
[11] A. Y. Ng, D. Harada, and S. J. Russell, “Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping,” Proceedings of the 17th International Conference on Machine Learning, pp. 278-287, 1999.
[12] A. Y. Ng and S. J. Russell, “Algorithms for Inverse Reinforcement Learning,” Proceedings of the 17th International Conference on Machine Learning, pp. 663-670, 2000.
[13] I. Ono and S. Kobayashi, “A Real-coded Genetic Algorithm for Function Optimization Using Unimodal Normal Distribution Crossover,” Proceedings of the 7th International Conference on Genetic Algorithms, pp. 246-253, 1997.
[14] J. C. Santamaria, R. S. Sutton, and A. Ram, “Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces,” Adaptive Behavior, Vol.6, No.2, pp. 163-218, 1998.
[15] R. S. Sutton, “Learning to Predict by the Methods of Temporal Differences,” Machine Learning, 3, pp. 9-44, 1988.
[16] R. S. Sutton and A. Barto, “Reinforcement Learning: An Introduction,” A Bradford Book, The MIT Press, 1998.
[17] T. Tateyama, S. Kawata, and Y. Shimomura, “A Reinforcement Learning Algorithm for Continuous State Spaces using Multiple Fuzzy-ART Networks,” Proceedings of SICE-ICCAS 2006, 2006.
[18] C. J. H. Watkins and P. Dayan, “Technical note: Q-learning,” Machine Learning, 8, pp. 55-68, 1992.
[Notice]
* "Preview" is the first 2 pages of the article. You don't need the registration.
* To read the PDF file you will then need to download and install the Adobe Reader.
Adobe Reader is free and available for download here: