International Journal of Reliability, Risk and Safety: Theory and Application

Optimal Preventive Maintenance Policy for Non-Identical Components: Traditional Renewal Theory vs Modern Reinforcement Learning

Document Type : Original Research Article

Authors
School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
Abstract
This paper compares the traditional renewal-theory approach with reinforcement learning algorithms for finding the optimal preventive maintenance policy for equipment composed of multiple non-identical components with different time-to-failure distributions. As an application, we used data from military trucks, which consist of components with very different failure behaviors, such as tires, transmissions, wheel rims, couplings, motors, brakes, steering wheels, and gearshifts. The literature proposes four different preventive maintenance strategies for these components. To find the optimal preventive maintenance policy, we applied the traditional (renewal-theory-based) approach and conventional reinforcement learning algorithms and compared their performance. The main advantage of the latter approach is that, unlike the traditional approach, it does not require estimating model parameters (e.g., transition probabilities): without any explicit mathematical formula, it converges to the optimal solution. Our results show that the traditional approach works best when the component time-to-failure distributions are available, whereas the reinforcement learning approach outperforms it when no such information is available or the distributions are misspecified.
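The article itself does not include code; the following is a minimal illustrative sketch of the two approaches being contrasted, for a single component with an increasing failure rate. All parameters (Weibull shape/scale, cost values, discretization, learning rates) are hypothetical and chosen for illustration only. The traditional side evaluates the classical renewal-theory cost rate for age replacement, which requires the time-to-failure distribution; the RL side runs tabular Q-learning on simulated failures without ever using the distribution in closed form.

```python
import math
import random

# Hypothetical parameters, for illustration only (not from the paper).
BETA, ETA = 2.5, 10.0        # Weibull shape/scale: increasing failure rate
C_P, C_F = 1.0, 5.0          # preventive vs. corrective replacement cost
DT, N_AGES = 1.0, 30         # age discretization: N_AGES buckets of width DT

def surv(t):
    """Weibull survival function S(t)."""
    return math.exp(-(t / ETA) ** BETA)

# --- Traditional approach (renewal theory): needs the distribution. ---
# Long-run cost rate of age replacement at age T:
#   g(T) = [C_P * S(T) + C_F * (1 - S(T))] / integral_0^T S(t) dt
def cost_rate(T, n=1000):
    integral = sum(surv(i * T / n) for i in range(n)) * T / n
    return (C_P * surv(T) + C_F * (1 - surv(T))) / integral

T_star = min((k * DT for k in range(1, N_AGES + 1)), key=cost_rate)

# --- RL approach: tabular Q-learning, distribution-free. ---
def p_fail(k):
    """Per-period failure probability at age bucket k (used only by the
    simulator; the learner never sees it)."""
    return 1.0 - surv((k + 1) * DT) / surv(k * DT)

random.seed(0)
Q = [[0.0, 0.0] for _ in range(N_AGES)]   # Q[state][action]: 0=keep, 1=replace
alpha, gamma, eps = 0.1, 0.95, 0.1
state = 0
for _ in range(200_000):
    # epsilon-greedy action (minimizing cost, so greedy = argmin)
    if random.random() < eps:
        a = random.randrange(2)
    else:
        a = 0 if Q[state][0] <= Q[state][1] else 1
    if a == 1:                              # preventive replacement
        cost, nxt = C_P, 0
    elif random.random() < p_fail(state):   # failure -> corrective replacement
        cost, nxt = C_F, 0
    else:                                   # component survives one period
        cost, nxt = 0.0, min(state + 1, N_AGES - 1)
    Q[state][a] += alpha * (cost + gamma * min(Q[nxt]) - Q[state][a])
    state = nxt

# First age bucket at which the learned policy prefers replacement.
learned = next((k for k in range(N_AGES) if Q[k][1] < Q[k][0]), None)
```

In this toy setting both approaches recover a threshold (replace-at-age) policy; the renewal-theory optimum `T_star` is exact given the correct distribution, while the Q-learning threshold is approximate but needs only simulated (or observed) failure data, which mirrors the trade-off the abstract describes.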

  1. A. Crespo Márquez and J. N. D. Gupta, “Contemporary maintenance management: process, framework and supporting pillars,” Omega, vol. 34, no. 3, pp. 313–326, Jun. 2006, doi: https://doi.org/10.1016/j.omega.2004.11.003
  2. S. Ravichandiran, “Hands-on reinforcement learning with Python: Master reinforcement and deep reinforcement learning using OpenAI Gym and TensorFlow”. Birmingham, England: Packt Publishing, 2023.
  3. Wang, H. Wang, and Q. Chen, “Multi-agent reinforcement learning based maintenance policy for a resource constrained flow line system,” Journal of Intelligent Manufacturing, vol. 27, no. 2, pp. 325–333, Jan. 2014, doi: https://doi.org/10.1007/s10845-013-0864-5
  4. Liang, T. Deng, and Z.-J. M. Shen, “Demand-side energy management under time-varying prices,” IISE Transactions, vol. 51, no. 4, pp. 422–436, Feb. 2019, doi: https://doi.org/10.1080/24725854.2018.1504357
  5. Yousefi, S. Tsianikas, and D. W. Coit, “Reinforcement learning for dynamic condition-based maintenance of a system with individually repairable components,” Quality Engineering, vol. 32, no. 3, pp. 388–408, Jun. 2020, doi: https://doi.org/10.1080/08982112.2020.1766692
  6. Adsule, M. S. Kulkarni, and A. Tewari, “Reinforcement learning for optimal policy learning in condition‐based maintenance,” IET Collaborative Intelligent Manufacturing, vol. 2, no. 4, pp. 182–188, Oct. 2020, doi: https://doi.org/10.1049/iet-cim.2020.0022
  7. A. Haleem and S. Yacout, “Simulation of Components Replacement Policies for a Fleet of Military Trucks,” Quality Engineering, vol. 11, no. 2, pp. 303–308, Dec. 1998, doi: https://doi.org/10.1080/08982119808919242
  8. Barde, S. Yacout, and H. Shin, “Optimal preventive maintenance policy based on reinforcement learning of a fleet of military trucks,” Journal of Intelligent Manufacturing, vol. 30, no. 1, pp. 147–161, Jun. 2016, doi: https://doi.org/10.1007/s10845-016-1237-7
  9. W. B. Powell, “Approximate Dynamic Programming: Solving the Curses of Dimensionality” (Wiley Series in Probability and Statistics). Hoboken, NJ: Wiley, 2007. [Online]. Available: https://dl.acm.org/citation.cfm?id=1324761
  10. R. S. Sutton and A. G. Barto, “Reinforcement learning: An introduction”, 2nd ed. Cambridge, MA: MIT Press, 2018.
  11. J. N. Tsitsiklis, "On the convergence of optimistic policy iteration," Journal of Machine Learning Research, vol. 3, pp. 59-72, 2002, doi: https://doi.org/10.1162/153244303768966102
Volume 6, Issue 1
July 2023
Pages 77-85

  • Receive Date 29 May 2023
  • Revise Date 31 July 2023
  • Accept Date 05 August 2023