Paper:
Opposition-Based Reinforcement Learning
Hamid R. Tizhoosh
Pattern Analysis and Machine Intelligence Laboratory, Systems Design Engineering, University of Waterloo, 200 University Avenue West, Waterloo, Ontario, Canada N2L 3G1
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.