Paper:
Construction of Semi-Markov Decision Process Models of Continuous State Space Environments Using Growing Cell Structures and Multiagent k-Certainty Exploration Method
Takeshi Tateyama*, Seiichi Kawata**, and Yoshiki Shimomura***
*Tokyo Metropolitan University, 6-6 Asahigaoka, Hino, Tokyo 191-0065, Japan
**Advanced Institute of Industrial Technology, 1-10-40 Higashiohi, Shinagawa-ku, Tokyo 140-0011, Japan
***Tokyo Metropolitan University, 6-6 Asahigaoka, Hino, Tokyo 191-0065, Japan
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.