Research Paper:
Resource-Constrained and Time-Aware Reinforcement Learning Framework for Sustainable Fertilization Strategies
Muhammad Alkaff*,**, Abdullah Basuhail*, Yuslena Sari**, and Kamal Jambi*

*Department of Computer Science, Faculty of Computing and Information Technology, King Abdulaziz University
P.O. Box 80200, Abdullah Alsulaiman Road, Jeddah 21589, Saudi Arabia
**Department of Information Technology, Faculty of Engineering, Universitas Lambung Mangkurat
Jalan Brigjen H. Hasan Basry, Banjarmasin, Kalimantan Selatan 70123, Indonesia
Corresponding author
Achieving sustainable fertilization is critical for balancing crop productivity with environmental stewardship and resource efficiency. However, conventional fertilization methods rely on fixed schedules and generalized routines, resulting in inefficient nitrogen use and environmental risks owing to over- or under-application. Sustainable fertilization requires adaptive strategies that optimize resource use while preserving long-term soil health and productivity. Reinforcement learning (RL) offers a promising alternative by continuously adapting fertilization strategies to real-time data such as soil conditions, crop growth stages, and weather patterns. This study introduces the time-aware, idle-biased, Lagrangian-based, and resource-constrained approach with proximal policy optimization (TILARC-PPO), a novel RL framework designed to adaptively optimize fertilization. TILARC-PPO integrates (1) idle-biased action selection to prevent unnecessary fertilization, (2) time-awareness to optimize decision timing, and (3) Lagrangian-based resource constraints to dynamically regulate nitrogen application. Experimental results show that TILARC-PPO reduces nitrogen consumption by 32% relative to expert fertilization while maintaining a comparable grain yield, with a reduction of only 7.93%. It also achieves the highest nitrogen use efficiency (30.8 kg grain per kg N), surpassing both the expert-based and vanilla proximal policy optimization (PPO) approaches, and it improves training stability and policy convergence, learning effective fertilization strategies within 300,000 timesteps. These findings highlight TILARC-PPO as a scalable, intelligent solution for sustainable precision agriculture, aligned with global efforts to enhance resource efficiency, maintain soil health, and promote sustainable food production.
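To make the three mechanisms named above concrete, the sketch below illustrates, in minimal form, how idle-biased action selection, a time-aware observation, and a Lagrangian (dual-ascent) nitrogen constraint could wrap a generic discrete policy. It is a hypothetical sketch, not the paper's implementation: all names and values (IDLE, N_RATES, NITROGEN_BUDGET, LAMBDA_LR, idle_bonus) are illustrative assumptions.

```python
import numpy as np

IDLE = 0                     # assumed index of the "do not fertilize" action
N_RATES = [0.0, 20.0, 40.0]  # assumed nitrogen dose (kg/ha) per discrete action
NITROGEN_BUDGET = 160.0      # assumed per-season nitrogen cap (kg/ha)
LAMBDA_LR = 1e-3             # step size for dual ascent on the multiplier

lam = 0.0  # Lagrange multiplier on cumulative nitrogen use

def time_aware_obs(obs, t, horizon):
    """Time-awareness: append the normalized remaining time to the state."""
    return np.append(np.asarray(obs, dtype=float), (horizon - t) / horizon)

def idle_biased_action(logits, idle_bonus=1.0, rng=None):
    """Idle bias: shift the idle logit so 'apply nothing' is the default choice."""
    rng = rng or np.random.default_rng()
    z = np.asarray(logits, dtype=float).copy()
    z[IDLE] += idle_bonus
    p = np.exp(z - z.max())
    p /= p.sum()
    return rng.choice(len(p), p=p)

def penalized_reward(reward, n_applied):
    """Lagrangian relaxation: charge each kg of applied nitrogen at lambda."""
    return reward - lam * n_applied

def update_lambda(season_n_used):
    """Dual ascent: raise lambda when the budget is exceeded, relax otherwise."""
    global lam
    lam = max(0.0, lam + LAMBDA_LR * (season_n_used - NITROGEN_BUDGET))
```

In this sketch, the penalized rewards would feed a standard PPO update, with lambda adjusted once per simulated season; the policy network and PPO loss themselves are omitted.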
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.