PhD, Associate Professor, Kazakh-British Technical University, Kazakhstan, Almaty
PREDICTIVE MAINTENANCE IN INDUSTRIAL SYSTEMS: A COST-SENSITIVE COMPARISON OF MACHINE LEARNING AND DEEP LEARNING APPROACHES
ABSTRACT
Predictive maintenance improves the dependability of industrial systems through early identification of potential equipment failures, which in turn reduces unexpected downtime. The objective of this research was to provide a comparative analysis of machine learning and deep learning models using an industrial dataset from the UCI Machine Learning Repository. Four models (Random Forest, Logistic Regression, TabNet, and Multilayer Perceptron) were evaluated within an identical experimental framework using stratified cross-validation and class imbalance handling. The results demonstrate that Random Forest offers the best accuracy and precision, whereas TabNet achieves superior recall and a better balance of performance in identifying failures. To reflect realistic industrial conditions, a cost-sensitive metric was developed based on the relative repercussions of false positives versus false negatives. The results show that model selection depends on operational priorities: Random Forest produces the fewest false alarms, while TabNet is more successful at minimizing the costs of missed failures.
АННОТАЦИЯ
Прогнозное техническое обслуживание - это подход, который помогает повысить надежность промышленных систем за счет раннего выявления возможных отказов оборудования, что, в свою очередь, сокращает непредвиденные простои. Цель этого исследования состояла в том, чтобы провести сравнительный анализ моделей машинного и глубокого обучения с использованием промышленного набора данных из репозитория машинного обучения UCI. Анализируемыми моделями являются: случайный лес, логистическая регрессия, TabNet и многослойный персептрон в рамках идентичной экспериментальной структуры с использованием стратифицированной перекрестной проверки и обработки классового дисбаланса. В целом, результаты показывают, что модель случайного леса обеспечивает наилучшие значения accuracy и precision; однако TabNet обладает превосходной полнотой (recall) и обеспечивает наилучший баланс показателей при выявлении отказов. Чтобы отразить реальные производственные условия, был разработан показатель, учитывающий затраты, основанный на относительном соотношении последствий ложных срабатываний и ложноотрицательных результатов. Результаты показывают, что выбор модели зависит от операционных приоритетов: Random Forest генерирует наименьшее количество ложных тревог, в то время как TabNet успешнее минимизирует затраты, связанные с пропущенными отказами.
Keywords: machine learning, deep learning, industrial system, cost minimization.
Ключевые слова: машинное обучение, глубокое обучение, промышленная система, минимизация затрат.
Introduction
Equipment downtime remains a major challenge for industrial systems, causing substantial financial losses and limiting production output [1-4]. Pre-scheduled maintenance strategies (preventive or reactive) generate unplanned downtime, and together with aging infrastructure they can cut production capacity by as much as 20%, resulting in significant monetary losses [3]. The principal causes of downtime are technical faults, human error, and a lack of automation [1, 4-6]. These limitations motivate the use of advanced technologies for monitoring and failure prediction.
Conventional maintenance strategies, whether preventive or reactive, are increasingly inadequate for modern manufacturing operations. Replacing components such as gears on a fixed schedule causes unnecessary maintenance and downtime because parts are replaced prematurely (preventive maintenance) [12, 13]; waiting until they break leads to repairs that are more expensive than under predictive maintenance (reactive maintenance). By contrast, predictive maintenance (PdM) uses historical sensor data to predict when and what type of failure is expected, enabling better planning of maintenance schedules and reducing overall operational losses [3, 7, 8]. Studies have shown that PdM can lower maintenance costs by as much as 25%, reduce breakdown rates by up to 70% [14], and extend equipment lifespan by 20-30% [9].
Predictive maintenance systems are increasingly driven by artificial intelligence (AI), primarily machine learning (ML) and deep learning (DL) [11, 15]. These technologies make it possible to analyze the vast amounts of data generated by sensors and to identify the complex degradation patterns that precede failure. The most commonly used ML algorithms for this task are Random Forest and Support Vector Machines, owing to their robustness and high predictive performance [4, 5, 10, 16]. Logistic Regression, the simplest ML approach, provides a useful baseline but is a limited statistical method when the data are strongly non-linear [12, 17, 18]. The more non-linear the data, the greater the advantage of DL models, such as neural networks and TabNet, in capturing complex dependencies [17, 19-22].
Another persistent challenge in PdM is class imbalance. Because failures are rare, overall accuracy can be misleadingly high even when a model fails to detect the actual critical defects. Many previous studies report performance metrics, but the metrics and evaluation protocols are inconsistent across studies [13]. Moreover, most studies do not examine the cost of prediction errors, even though missed failures are typically far more costly than false alarms.
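The pitfall of plain accuracy under imbalance can be illustrated with a trivial majority-class baseline. This is a minimal sketch using synthetic labels drawn at roughly the failure rate of the dataset studied here; the feature values are irrelevant to the point being made:

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, recall_score

# Synthetic labels: ~3.4% failures, mirroring the rarity of failure events
rng = np.random.default_rng(0)
y = (rng.random(10_000) < 0.034).astype(int)
X = rng.normal(size=(10_000, 5))  # dummy features, unused by this baseline

# A classifier that always predicts the majority class ("no failure")
baseline = DummyClassifier(strategy="most_frequent").fit(X, y)
y_pred = baseline.predict(X)

print(f"accuracy: {accuracy_score(y, y_pred):.3f}")  # high, despite being useless
print(f"recall:   {recall_score(y, y_pred):.3f}")    # 0.0: no failure is ever detected
```

A model like this scores above 96% accuracy while detecting zero failures, which is why recall and cost-sensitive metrics matter in this setting.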
Recent studies show that data preprocessing and optimization methods such as PCA, clustering, and metaheuristic algorithms are essential for obtaining the best possible model performance [4, 23]. However, differences in experimental setups and evaluation methods make it difficult to draw reliable conclusions about how ML models compare with DL models.
This study therefore addresses these gaps by developing a common evaluation framework for predictive maintenance that combines stratified cross-validation, class imbalance handling, and cost-sensitive analysis. The goal is to provide a systematic comparison of traditional ML models and current DL models and to assess their value under different error costs that are meaningful to industrial decision-makers.
Materials and methods
The AI4I 2020 dataset, sourced from the UCI Machine Learning Repository, contains information about industrial equipment operating under a variety of conditions, as measured by temperature, rotational speed, torque, and tool wear [24]. The dataset comprises 10,000 observations with a binary dependent variable indicating whether the machine failed; the class imbalance reflects the rarity of failure events in industry (approximately one failure for every 28 non-failure records). To remedy this imbalance, class-weighted learning was incorporated into each model to emphasize failure cases, and stratified 5-fold cross-validation was used to preserve the class distribution across folds and validate each model's performance.
The four models selected for comparison are Random Forest (RF) and Logistic Regression (LR) as traditional machine learning approaches, and TabNet and Multilayer Perceptron (MLP) as deep learning approaches. Each model was trained on the same normalized input features and under the same validation settings to ensure a fair comparison.
After the models were validated against standard metrics (accuracy, precision, recall, F1-score, and specificity), an additional cost-sensitive evaluation framework was put in place to assess each model with regard to real-life business decision-making in industrial settings. This framework is based on the relative cost ratio R = C_FN / C_FP, where C_FN and C_FP are the costs of a false negative (missed failure) and a false positive (false alarm), respectively; false negatives are assumed to be more critical than false positives.
For each value of R, the confusion matrix components were used to compute the overall prediction cost, Total cost = FP + R × FN, allowing model performance to be compared across a range of industrial cost scenarios and models to be evaluated by both statistical and economic outcomes.
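The cost computation above can be sketched as a small helper (`total_cost` is a hypothetical name introduced here; the cost unit is one false positive, with false negatives weighted R times as heavily):

```python
from sklearn.metrics import confusion_matrix

def total_cost(y_true, y_pred, R):
    """Total prediction cost: FP + R * FN, with the cost of one
    false positive as the unit and R = C_FN / C_FP."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return fp + R * fn

# Toy example: predictions contain 2 false positives and 1 false negative
y_true = [1, 1, 0, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1]
for R in (1, 5, 10):
    print(f"R={R:2d}: cost {total_cost(y_true, y_pred, R)}")  # 3, 7, 12
```

As R grows, models with many missed failures are penalized progressively more, which is exactly the effect explored in the results below.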
Results and discussion
The models were assessed with stratified 5-fold cross-validation. The findings in Table 1 demonstrate that class imbalance has a considerable impact on performance and that each model exhibits a trade-off between precision and recall. Random Forest produced the highest overall accuracy (98.19%) and precision (91.28%), indicating that it identifies many true positives while limiting false positives; however, its moderate recall of 51.94% means that many failure events would go undetected. TabNet produced a much higher recall (90.26%), allowing better detection of failures, but also many more false positives, which lowered its overall performance relative to Random Forest.
Table 1.
Stratified 5-Fold Cross-Validation Results

| Model | Accuracy, % | Precision, % | Recall, % | F1-score, % | Specificity, % |
|--------|--------------|--------------|--------------|--------------|----------------|
| RF | 98.19 ± 0.15 | 91.28 ± 4.93 | 51.94 ± 5.56 | 65.88 ± 3.99 | 99.81 ± 0.12 |
| LR | 81.50 ± 0.52 | 13.38 ± 0.31 | 81.42 ± 3.89 | 22.97 ± 0.58 | 81.50 ± 0.65 |
| TabNet | 93.65 ± 0.99 | 34.17 ± 4.15 | 90.26 ± 2.00 | 49.42 ± 4.14 | 93.77 ± 1.05 |
| MLP | 87.88 ± 2.91 | 21.42 ± 4.05 | 89.68 ± 6.95 | 34.20 ± 4.44 | 87.82 ± 3.25 |
To simulate real-world industrial conditions, the models were evaluated at different cost ratios, where R = 1 serves as the reference point at which false negatives and false positives carry equal cost. At this ratio, Random Forest achieved the lowest total cost of all models, owing to its high precision. As the cost ratio increased, that is, as a missed failure was treated as more costly than a false alarm, TabNet became the lowest-cost model. Details are shown in Table 2.
Table 2.
Total Cost of Models for Different Cost Ratios

| Value of R | RF | LR | TabNet | MLP |
|------------|------|------|--------|------|
| 1 | 181 | 1850 | 637 | 1212 |
| 5 | 833 | 2102 | 769 | 1352 |
| 10 | 1648 | 2417 | 934 | 1527 |
| 20 | 3278 | 3047 | 1264 | 1877 |
| 50 | 8168 | 4937 | 2254 | 2927 |
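The costs in Table 2 follow the linear model Total cost = FP + R × FN exactly, which lets the per-model error counts be recovered from any two rows. The sketch below uses those inferred counts (a reconstruction from the reported costs, not figures reported directly by the study) to reproduce the table and to locate the ratio at which TabNet overtakes Random Forest:

```python
# False-positive / false-negative counts implied by Table 2, obtained by
# solving Cost = FP + R * FN from the reported costs at R = 1 and R = 5.
counts = {
    "RF":     {"FP": 18,   "FN": 163},
    "LR":     {"FP": 1787, "FN": 63},
    "TabNet": {"FP": 604,  "FN": 33},
    "MLP":    {"FP": 1177, "FN": 35},
}

def cost(model, R):
    c = counts[model]
    return c["FP"] + R * c["FN"]

# Reproduce the Table 2 rows from the inferred counts
for R in (1, 5, 10, 20, 50):
    print(R, {m: cost(m, R) for m in counts})

# Crossover ratio where TabNet becomes cheaper than RF:
# 18 + 163*R = 604 + 33*R  =>  R = 586 / 130 ≈ 4.5
R_cross = (counts["TabNet"]["FP"] - counts["RF"]["FP"]) / (
    counts["RF"]["FN"] - counts["TabNet"]["FN"])
print(f"TabNet is cheaper than RF for R > {R_cross:.1f}")
```

The crossover near R ≈ 4.5 matches the table: Random Forest is cheapest at R = 1, while TabNet is already cheaper at R = 5 and dominates at every larger ratio.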
The findings of this study suggest that model selection is a function of operational priorities. When minimizing false alarms is critical, Random Forest is preferred. However, for safety-critical applications, TabNet may be better suited due to the higher economic and/or operational risks associated with missing a failure.
Conclusion
This study compared machine learning and deep learning models for predicting when maintenance will be required in an industrial environment, using a single common evaluation framework. Both model performance and the practical value of the predictions were strongly affected by class imbalance and by the trade-off between precision and recall.
The machine learning model Random Forest produced the best accuracy and precision and is thus well suited to minimizing false positives. The deep learning model TabNet produced the highest recall and is therefore better at detecting failure events. However, traditional evaluation metrics alone proved insufficient for sound maintenance decision-making in an industrial environment.
To address this inadequacy, a cost-based evaluation framework was introduced that weighs the cost of a false positive against that of a false negative. The analysis indicated that the final model choice depends on the operational priorities of the manufacturer: if the priority is to reduce unnecessary maintenance actions, Random Forest is the best choice; if the priority is to minimize the risk of a missed failure with severe economic or safety consequences, TabNet is the best choice.
The method described above shows the need to combine statistically valid and economically valid criteria when developing a sound predictive maintenance plan. Future work should extend the analysis to actual industrial environments, optimize the hyperparameters of the models used, and investigate real-time decision support for predictive maintenance.
References:
- Haraguchi, N., Fang, C., Cheng, C., and Smeets, E., The Importance of Manufacturing in Economic Development: Has This Changed?, World Development, Vol. 93 (2017), pp. 293–315.
- Hong, Z., Predictive Maintenance in Manufacturing: Reducing Downtime and Costs with AI, Medium (2024).
- Pennel, M., Hsiung, J., and Putcha, V. B., Detecting Failures and Optimizing Performance in Artificial Lift Using Machine Learning Models, SPE Western Regional Meeting (2018), SPE-190090-MS.
- Turanoglu, B. E., Nyqvist, P., and Skoogh, A., An Intelligent Approach for Data Pre-processing and Analysis in Predictive Maintenance with an Industrial Case Study, Advances in Mechanical Engineering, Vol. 12 (2020), pp. 1–14.
- Nwanya, S. C., Udofia, J. I., and Ajayi, O. O., Optimization of Machine Downtime in the Plastic Manufacturing Industry, Cogent Engineering, Vol. 4 (2017), Article 1335444.
- Centomo, S., Dall’ora, A., and Fummi, F., The Design of a Digital Twin for Predictive Maintenance, Proceedings of the 2020 IEEE 31st International Symposium on Software Reliability Engineering Workshops, 2020, pp. 220–225.
- Ayvaz, S., and Alpay, K., Predictive Maintenance System for Production Lines in Manufacturing: A Machine Learning Approach Using IoT Data in Real Time, Expert Systems with Applications, Vol. 173 (2021), Article 114598.
- Jamwal, A., Goyal, A., Kumar, V., and Gupta, O. P., Models and Applications in Industry 4.0 and Implications, Materials Science for Energy Technologies, Vol. 5 (2022), Article 100073.
- Lee, W. J., Wu, H., Yun, H., Kim, H., Jun, M. B. G., and Sutherland, J. W., Predictive Maintenance of Machine Tool Systems Using Artificial Intelligence Techniques Applied to Machine Condition Data, Procedia CIRP, Vol. 80 (2019), pp. 506–511.
- Dada, E. G., Joseph, S., Oyewola, D., Fadele, A. A., Chiroma, H., and Abdulhamid, S. M., Application of Grey Wolf Optimization Algorithm: Recent Trends, Issues, and Possible Horizons, Gazi University Journal of Science, Vol. 35 (2022), pp. 485–504.
- Patil, D., Artificial Intelligence-Driven Predictive Maintenance in Manufacturing: Enhancing Operational Efficiency, Minimizing Downtime, and Optimizing Resource Utilization, SSRN Working Paper (2024).
- Carvalho, T. P., Soares, F. A. A. M. N., Vita, R., Francisco, R. P., Tavares Vieira Basto, J. P., and Gomes Soares Alcala, S., A Systematic Literature Review of Machine Learning Methods Applied to Predictive Maintenance, Computers & Industrial Engineering, Vol. 137 (2019), Article 106024.
- Samatas, G. G., Moumgiakmas, S. S., and Papakostas, G. A., Predictive Maintenance: Bridging Artificial Intelligence and IoT, arXiv preprint arXiv:2103.11148 (2021).
- Deloitte Global, Predictive Maintenance—Deloitte’s Approach (2022), online article.
- Potter, K., and Broklyn, P., AI-Based Predictive Maintenance in Manufacturing Industries, preprint (2024).
- Zhang, J., Arinez, J., Chang, Q., Gao, R. X., and Xu, C., Artificial Intelligence in Advanced Manufacturing: Current Status and Future Outlook, Journal of Manufacturing Science and Engineering, Vol. 142 (2020), Article 111003.
- Authors, Predicting Machine Failures Using Machine Learning and Deep Learning Algorithms, Sustainable Manufacturing and Service Economics, Vol. 3 (2024), Article 100029.
- Hector, I., and Panjanathan, R., Predictive Maintenance in Industry 4.0: A Survey of Planning Models and Machine Learning Techniques, PeerJ Computer Science, Vol. 10 (2024).
- Bhave, D., Adiga, D. T., Powar, N., and McKinley, T., Remaining Useful Life Prediction of Turbo Actuators for Predictive Maintenance of Diesel Engines, PHM Society European Conference, Vol. 6 (2021), pp. 516–526.
- Zhang, B., Jin, X., Liang, W., Chen, X., Li, Z., Panoutsos, G., Liu, Z., and Tang, Z., TabNet: Locally Interpretable Estimation and Prediction for Advanced Proton Exchange Membrane Fuel Cell Health Management, Electronics, Vol. 13 (2024), Article 1358.
- Qin, Y., Cai, N., Gao, C., Zhang, Y., Cheng, Y., and Chen, X., Remaining Useful Life Prediction Using Temporal Deep Degradation Network for Complex Machinery with Attention-Based Feature Extraction, arXiv preprint arXiv:2202.10916 (2022).
- Phongmoo, S., Leksakul, K., Suedumrong, C., Kuensaen, C., and Sinthavalai, R., Predictive Maintenance in Semiconductor Manufacturing: Comparative Analysis of Machine Learning Models for Downtime Reduction, Computers & Industrial Engineering, Vol. 205 (2025), Article 111211.
- Zaheer, R., and Shaziya, H., A Study of Optimization Algorithms in Deep Learning, Proceedings of the 3rd International Conference on Inventive Systems and Control (ICISC), 2019, pp. 536–539.
- Matzka, S., AI4I 2020 Predictive Maintenance Dataset, UCI Machine Learning Repository (2020).