JRM Vol.36 No.3 pp. 580-588
doi: 10.20965/jrm.2024.p0580


Enhanced Naive Agent in Angry Birds AI Competition via Exploitation-Oriented Learning

Kazuteru Miyazaki

National Institution for Academic Degrees and Quality Enhancement of Higher Education
1-29-1 Gakuennishimachi, Kodaira, Tokyo 185-8587, Japan

Received: November 15, 2023
Accepted: February 24, 2024
Published: June 20, 2024
Keywords: reinforcement learning, multi-agent learning, profit sharing, Q-learning, game

The Angry Birds AI Competition pits artificial intelligence agents against one another in a contest based on the game Angry Birds. The tournament has been held annually since 2012, with participants competing for high scores. The organizers provide a basic agent, termed the “Naive Agent,” as a baseline. This study enhances the Naive Agent by integrating a profit-sharing approach known as exploitation-oriented learning, a type of experience-enhanced learning, and substantiates its effectiveness through numerical experiments. Additionally, the study explores level-selection learning in a multi-agent environment and validates the utility of the rationality theorem concerning indirect rewards in that setting.
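The profit-sharing idea named in the abstract can be sketched as follows: at the end of an episode, the obtained reward is distributed backward along the visited state-action pairs with a geometrically decreasing credit. This is a minimal illustrative sketch, not the paper's implementation; the function name and the specific decay ratio are assumptions (a ratio below 1/num_actions is one common way to satisfy the rationality condition for profit sharing).

```python
def profit_sharing_update(weights, episode, reward, num_actions):
    """Distribute an episode-end reward backward over the visited
    (state, action) pairs with geometrically decreasing credit.

    weights:     dict mapping (state, action) -> accumulated weight
    episode:     list of (state, action) pairs in visit order
    reward:      scalar reward received at the end of the episode
    num_actions: number of actions available in each state

    Illustrative assumption: a decay ratio of 1/(num_actions + 1),
    which is stricter than 1/num_actions, so credit on a rewarded
    path always dominates credit leaking onto detour rules.
    """
    decay = 1.0 / (num_actions + 1)
    credit = reward
    for state, action in reversed(episode):
        # The pair closest to the reward receives the largest credit.
        weights[(state, action)] = weights.get((state, action), 0.0) + credit
        credit *= decay
    return weights
```

Unlike Q-learning, this update bootstraps nothing between episodes: credit flows only when a reward is actually obtained, which is the exploitation-oriented character the study builds on.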

Screenshot of the Angry Birds game

Cite this article as:
K. Miyazaki, “Enhanced Naive Agent in Angry Birds AI Competition via Exploitation-Oriented Learning,” J. Robot. Mechatron., Vol.36 No.3, pp. 580-588, 2024.
