[1] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. MIT Press, 2018.
[2] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski et al., “Human-level control through deep reinforcement learning,” Nature, vol. 518, no. 7540, pp. 529–533, 2015.
[3] C. Li, P. Zheng, Y. Yin, B. Wang, and L. Wang, “Deep reinforcement learning in smart manufacturing: A review and prospects,” CIRP Journal of Manufacturing Science and Technology, vol. 40, pp. 75–101, 2023.
[4] A. Perera and P. Kamalaruban, “Applications of reinforcement learning in energy systems,” Renewable and Sustainable Energy Reviews, vol. 137, p. 110618, 2021.
[5] J. Kober, J. A. Bagnell, and J. Peters, “Reinforcement learning in robotics: A survey,” The International Journal of Robotics Research, vol. 32, no. 11, pp. 1238–1274, 2013.
[6] A. Esteso, D. Peidro, J. Mula, and M. Díaz-Madroñero, “Reinforcement learning applied to production planning and control,” International Journal of Production Research, vol. 61, no. 16, pp. 5772–5789, 2023.
[7] R. Nian, J. Liu, and B. Huang, “A review on reinforcement learning: Introduction and applications in industrial process control,” Computers & Chemical Engineering, vol. 139, p. 106886, 2020.
[8] R. N. Boute, J. Gijsbrechts, W. Van Jaarsveld, and N. Vanvuchelen, “Deep reinforcement learning for inventory control: A roadmap,” European Journal of Operational Research, vol. 298, no. 2, pp. 401–412, 2022.
[9] C. Blum and A. Roli, “Metaheuristics in combinatorial optimization: Overview and conceptual comparison,” ACM Computing Surveys (CSUR), vol. 35, no. 3, pp. 268–308, 2003.
[10] Y. Li, “Deep reinforcement learning: An overview,” arXiv preprint arXiv:1701.07274, 2017.
[11] K. Arulkumaran, M. P. Deisenroth, M. Brundage, and A. A. Bharath, “Deep reinforcement learning: A brief survey,” IEEE Signal Processing Magazine, vol. 34, no. 6, pp. 26–38, 2017.
[12] C. D. Hubbs, C. Li, N. V. Sahinidis, I. E. Grossmann, and J. M. Wassick, “A deep reinforcement learning approach for chemical production scheduling,” Computers & Chemical Engineering, vol. 141, p. 106982, 2020.
[13] D. Shi, W. Fan, Y. Xiao, T. Lin, and C. Xing, “Intelligent scheduling of discrete automated production line via deep reinforcement learning,” International Journal of Production Research, vol. 58, no. 11, pp. 3362–3380, 2020.
[14] F. Guo, Y. Li, A. Liu, and Z. Liu, “A reinforcement learning method to scheduling problem of steel production process,” in Journal of Physics: Conference Series, vol. 1486, no. 7. IOP Publishing, 2020, p. 072035.
[15] M. Mowbray, D. Zhang, and E. A. D. R. Chanona, “Distributional reinforcement learning for scheduling of chemical production processes,” arXiv preprint arXiv:2203.00636, 2022.
[16] N. N. Sultana, H. Meisheri, V. Baniwal, S. Nath, B. Ravindran, and H. Khadilkar, “Reinforcement learning for multi-product multi-node inventory management in supply chains,” arXiv preprint arXiv:2006.04037, 2020.
[17] B. J. De Moor, J. Gijsbrechts, and R. N. Boute, “Reward shaping to improve the performance of deep reinforcement learning in perishable inventory management,” European Journal of Operational Research, vol. 301, no. 2, pp. 535–545, 2022.
[18] M. Khirwar, K. S. Gurumoorthy, A. A. Jain, and S. Manchenahally, “Cooperative multi-agent reinforcement learning for inventory management,” in Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 2023, pp. 619–634.
[19] R. Leluc, E. Kadoche, A. Bertoncello, and S. Gourvenec, “MARLIM: Multi-agent reinforcement learning for inventory management,” arXiv preprint arXiv:2308.01649, 2023.
[20] O. Ogunfowora and H. Najjaran, “Reinforcement and deep reinforcement learning-based solutions for machine maintenance planning, scheduling policies, and optimization,” Journal of Manufacturing Systems, vol. 70, pp. 244–263, 2023.
[21] N. Yousefi, S. Tsianikas, and D. W. Coit, “Reinforcement learning for dynamic condition-based maintenance of a system with individually repairable components,” Quality Engineering, vol. 32, no. 3, pp. 388–408, 2020.
[22] ——, “Dynamic maintenance model for a repairable multi-component system using deep reinforcement learning,” Quality Engineering, vol. 34, no. 1, pp. 16–35, 2022.
[23] P. Andrade, C. Silva, B. Ribeiro, and B. F. Santos, “Aircraft maintenance check scheduling using reinforcement learning,” Aerospace, vol. 8, no. 4, p. 113, 2021.
[24] J. Thomas, M. P. Hernandez, A. K. Parlikad, and R. Piechocki, “Network maintenance planning via multi-agent reinforcement learning,” in 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC). IEEE, 2021, pp. 2289–2295.
[25] Z. J. Viharos and R. Jakab, “Reinforcement learning for statistical process control in manufacturing,” Measurement, vol. 182, p. 109616, 2021.
[26] A. Kuhnle, M. C. May, L. Schäfer, and G. Lanza, “Explainable reinforcement learning in production control of job shop manufacturing system,” International Journal of Production Research, vol. 60, no. 19, pp. 5812–5834, 2022.
[27] M. Mowbray, R. Smith, E. A. Del Rio-Chanona, and D. Zhang, “Using process data to generate an optimal control policy via apprenticeship and reinforcement learning,” AIChE Journal, vol. 67, no. 9, p. e17306, 2021.
[28] Y. Li, J. Du, and W. Jiang, “Reinforcement learning for process control with application in semiconductor manufacturing,” IISE Transactions, pp. 1–15, 2023.
[29] D. Azuatalam, W.-L. Lee, F. de Nijs, and A. Liebman, “Reinforcement learning for whole-building HVAC control and demand response,” Energy and AI, vol. 2, p. 100020, 2020.
[30] D. Jang, L. Spangher, M. Khattar, U. Agwan, and C. Spanos, “Using meta reinforcement learning to bridge the gap between simulation and experiment in energy demand response,” in Proceedings of the Twelfth ACM International Conference on Future Energy Systems, 2021, pp. 483–487.
[31] M. Ahrarinouri, M. Rastegar, and A. R. Seifi, “Multiagent reinforcement learning for energy management in residential buildings,” IEEE Transactions on Industrial Informatics, vol. 17, no. 1, pp. 659–666, 2020.
[32] R. Lu, R. Bai, Z. Luo, J. Jiang, M. Sun, and H.-T. Zhang, “Deep reinforcement learning-based demand response for smart facilities energy management,” IEEE Transactions on Industrial Electronics, vol. 69, no. 8, pp. 8554–8565, 2021.
[33] R. Lu, Y.-C. Li, Y. Li, J. Jiang, and Y. Ding, “Multi-agent deep reinforcement learning based demand response for discrete manufacturing systems energy management,” Applied Energy, vol. 276, p. 115473, 2020.
[34] X. Zhang, R. Lu, J. Jiang, S. H. Hong, and W. S. Song, “Testbed implementation of reinforcement learning-based demand response energy management system,” Applied Energy, vol. 297, p. 117131, 2021.
[35] T. A. Nakabi and P. Toivanen, “Deep reinforcement learning for energy management in a microgrid with flexible demand,” Sustainable Energy, Grids and Networks, vol. 25, p. 100413, 2021.
[36] R. Hu and A. Kwasinski, “Energy management for microgrids using a reinforcement learning algorithm,” in 2021 IEEE Green Energy and Smart Systems Conference (IGESSC). IEEE, 2021, pp. 1–6.
[37] B. Zhang, Z. Chen, and A. M. Ghias, “Deep reinforcement learning-based energy management strategy for a microgrid with flexible loads,” in 2023 International Conference on Power Energy Systems and Applications (ICoPESA). IEEE, 2023, pp. 187–191.
[38] W. Zhang, H. Qiao, X. Xu, J. Chen, J. Xiao, K. Zhang, Y. Long, and Y. Zuo, “Energy management in microgrid based on deep reinforcement learning with expert knowledge,” in International Workshop on Automation, Control, and Communication Engineering (IWACCE 2022), vol. 12492. SPIE, 2022, pp. 275–284.
[39] A. Shojaeighadikolaei, A. Ghasemi, A. G. Bardas, R. Ahmadi, and M. Hashemi, “Weather-aware data-driven microgrid energy management using deep reinforcement learning,” in 2021 North American Power Symposium (NAPS). IEEE, 2021, pp. 1–6.
[40] Y. Du and F. Li, “Intelligent multi-microgrid energy management based on deep neural network and model-free reinforcement learning,” IEEE Transactions on Smart Grid, vol. 11, no. 2, pp. 1066–1076, 2019.
[41] T. Yang, L. Zhao, W. Li, and A. Y. Zomaya, “Reinforcement learning in sustainable energy and electric systems: A survey,” Annual Reviews in Control, vol. 49, pp. 145–163, 2020.
[42] D. Cao, W. Hu, J. Zhao, G. Zhang, B. Zhang, Z. Liu, Z. Chen, and F. Blaabjerg, “Reinforcement learning and its applications in modern power and energy systems: A review,” Journal of Modern Power Systems and Clean Energy, vol. 8, no. 6, pp. 1029–1042, 2020.
[43] X. Chen, G. Qu, Y. Tang, S. Low, and N. Li, “Reinforcement learning for selective key applications in power systems: Recent advances and future challenges,” IEEE Transactions on Smart Grid, vol. 13, no. 4, pp. 2935–2958, 2022.
[44] K. Sivamayil, E. Rajasekar, B. Aljafari, S. Nikolovski, S. Vairavasundaram, and I. Vairavasundaram, “A systematic study on reinforcement learning based applications,” Energies, vol. 16, no. 3, p. 1512, 2023.
[45] X. Zhong, Z. Zhang, R. Zhang, and C. Zhang, “End-to-end deep reinforcement learning control for HVAC systems in office buildings,” Designs, vol. 6, no. 3, p. 52, 2022.
[46] S. Sierla, H. Ihasalo, and V. Vyatkin, “A review of reinforcement learning applications to control of heating, ventilation and air conditioning systems,” Energies, vol. 15, no. 10, p. 3526, 2022.
[47] H.-Y. Liu, B. Balaji, S. Gao, R. Gupta, and D. Hong, “Safe HVAC control via batch reinforcement learning,” in 2022 ACM/IEEE 13th International Conference on Cyber-Physical Systems (ICCPS). IEEE, 2022, pp. 181–192.
[48] X. Yuan, Y. Pan, J. Yang, W. Wang, and Z. Huang, “Study on the application of reinforcement learning in the operation optimization of HVAC system,” in Building Simulation, vol. 14. Springer, 2021, pp. 75–87.
[49] M. Biemann, F. Scheller, X. Liu, and L. Huang, “Experimental evaluation of model-free reinforcement learning algorithms for continuous HVAC control,” Applied Energy, vol. 298, p. 117164, 2021.
[50] D. Zhou, R. Jia, and H. Yao, “Robotic arm motion planning based on curriculum reinforcement learning,” in 2021 6th International Conference on Control and Robotics Engineering (ICCRE). IEEE, 2021, pp. 44–49.
[51] T. Yu and Q. Chang, “Reinforcement learning based user-guided motion planning for human-robot collaboration,” arXiv preprint arXiv:2207.00492, 2022.
[52] Y. Cao, S. Wang, X. Zheng, W. Ma, X. Xie, and L. Liu, “Reinforcement learning with prior policy guidance for motion planning of dual-arm free-floating space robot,” Aerospace Science and Technology, vol. 136, p. 108098, 2023.
[53] M. Schuck, J. Brüdigam, A. Capone, S. Sosnowski, and S. Hirche, “Dext-gen: Dexterous grasping in sparse reward environments with full orientation control,” arXiv preprint arXiv:2206.13966, 2022.
[54] S. Joshi, S. Kumra, and F. Sahin, “Robotic grasping using deep reinforcement learning,” in 2020 IEEE 16th International Conference on Automation Science and Engineering (CASE). IEEE, 2020, pp. 1461–1466.
[55] D. Wang, H. Deng, and Z. Pan, “MRCDRL: Multi-robot coordination with deep reinforcement learning,” Neurocomputing, vol. 406, pp. 68–76, 2020.
[56] X. Lan, Y. Qiao, and B. Lee, “Towards pick and place multi robot coordination using multi-agent deep reinforcement learning,” in 2021 7th International Conference on Automation, Robotics and Applications (ICARA). IEEE, 2021, pp. 85–89.