Master's student, School of Information Technology and Engineering, Kazakh-British Technical University, Almaty, Kazakhstan
A COMPARATIVE ANALYSIS OF INTERPRETABLE MODELS FOR IT PROJECT RISK MANAGEMENT
ABSTRACT
Traditional systems frequently rely on opaque "black-box" AI or static rules, but IT project managers need transparent, flexible support tools. Using stratified 5-fold cross-validation, GridSearchCV tuning, per-domain F1 analysis, feature-ablation, and McNemar's test, we present an AI-augmented framework that benchmarks Decision Tree, Random Forest, and Logistic Regression on the Project Risk Analysis Dataset. Decision paths and feature importance are visualized in a prototype dashboard. With complete interpretability, Decision Tree attains a macro F1 of 0.968; optimized Random Forest achieves 0.980 ($p<0.05$), with a consistent F$_1$ $\ge 0.94$ across domains. Budget and duration are confirmed to be important risk factors by feature-ablation. This study demonstrates that ensemble techniques and interpretable models can collectively provide high prediction accuracy and interpretability. Our reproducible AI-DSS framework and prototype provide valuable, transparent risk information for IT project management.
Keywords: AI decision support, IT project risk, explainable AI, interpretable models, Random Forest, Decision Tree, risk prediction.
Introduction
Can machines make decisions for us that we can safely trust? The benefits of AI do not, by themselves, settle the concerns about black-box AI. AI clearly processes information better and faster, yet we barely understand the principles by which these technologies reach their outputs. In AI-supported decision-making, people remain the captains, and the use of AI tools has exposed a need for more human intelligence elsewhere. Pathirannehelage, Shrestha, and von Krogh [1] describe them as superficial and point out that using AI tools may simply require more labour or additional software to extract the insights we need. They also note that traditional AI developers focus more on accuracy than on explainability or on ensuring the user's cognitive fit.
At a practical level, artificial intelligence (AI) is already changing how IT project managers make decisions. As Basingab [2] and Wang [3] point out, AI tools can predict risks and optimize planning, thereby improving the effectiveness of IT project management.
Early DSS relied on rule-based models and expert systems. The scheduling and resource-planning optimizations presented in [4], [5] were based on expert knowledge. However, advances in machine learning, deep learning, and natural language processing have made DSS more complex, adaptive, and data-driven [6]. Such modern AI-driven systems can process large volumes of project data in real time and give managers insights for making more accurate decisions.
Building on these foundations, studies such as [7], [8] highlight the significance of AI in strategic decision-making. AI-powered tools improve forecasting accuracy, enabling project managers to anticipate risk and allocate resources efficiently. Research indicates that large enterprises can reduce planning error by 30% and increase their profit margins by up to 136% using predictive analytics.
Predictive scheduling can reduce energy consumption in data centers by 25% [9], and AI-based strategies can save up to 40% of power in wireless sensor networks [2].
Several studies have also discussed the role of AI in risk assessment and resource allocation, such as [5] and [10]. Machine learning models are especially useful for detecting potential project risks early, improving risk mitigation by 20%. Furthermore, AI resource-optimization techniques, such as dynamic scheduling and automated workload distribution, reduce project delays by 15%.
AI is also transforming Agile project management and workflow automation. According to [11], AI makes sprint planning more efficient, improves backlog prioritization, and supports teams through AI-powered assistants. For example, [12] studies the use of AI to automate repetitive tasks, taking over up to 40% of the manual load and improving project efficiency. Combining the fuzzy Analytic Hierarchy Process with artificial neural networks balances cost, time, and quality, and increases stakeholder satisfaction by 20% under conflicting goals [13]. Color-coded risk dashboards enable decision-makers to solve problems 15% faster [14], and interactive "what-if" tools increase managers' engagement with AI results by 30% or more [15]. Adding AI warnings to MS Project reduced major delays by 22% [16], while a related Jira plug-in prototype achieved 85% user satisfaction [17].
Beyond methods and resources, new research highlights the need for accuracy, transparency, and trust to coexist. These three elements are equally significant, according to the study [18], which examines how explainability influences user confidence. Predictive bias has been shown to be reduced by 30% [19] simply by re-weighting old data points, and an accountability matrix has been implemented to monitor AI-DSS deployments [19].
Despite this, most AI-based DSS solutions under development are not explainable. Some models operate as "black boxes", which lowers the credibility of their recommendations [20]. This is crucial in IT projects, where managers need to understand why the AI proposes a given solution. Without transparency, the adoption of AI systems slows down and their outputs are perceived as unreliable [21].
A second disadvantage is AI's dependence on historical data. Because models are generally built on past trends, they adapt less effectively to new, non-standard situations [5]. This causes errors in risk forecasting and resource management in rapidly changing IT projects. Furthermore, many AI solutions demand major workflow changes, which makes them expensive and difficult to implement [11]. Without the flexibility to adapt, AI remains a limited tool that is hard to apply in real business environments.
To address these gaps, we compare three classifiers (Decision Tree, Random Forest, and Logistic Regression) on the publicly available Project Risk Analysis Dataset [22]. Our methodology comprises:
- Stratified 5-fold cross-validation to obtain stable performance estimates;
- GridSearchCV hyperparameter optimization targeting the macro-averaged F$_1$-score;
- Per-domain macro F$_1$ analysis and one-at-a-time feature-ablation studies to check stability;
- McNemar's test to assess whether the difference between Decision Tree and Random Forest predictions is statistically significant.
Our aim is to show that interpretable models can match the predictive performance of black-box methods, and that well-tuned ensemble strategies add further accuracy within a transparent, reproducible AI-based decision support approach for IT project risk management.
Materials and methods
For our research, we used the Project Risk Analysis Dataset [22] which is available for public access on Kaggle. This dataset includes over 1,000 records of IT projects, each classified according to their associated risk levels, with a scale ranging from 1 to 3, where 1 signifies low risk, 2 represents medium risk, and 3 indicates high risk. Each record includes:
- Domain: project category (integer-encoded);
- Mobile, Desktop, Web, IoT: binary indicators of technology components;
- Date Difference: project duration in days;
- Expected Team Size;
- Expected Budget.
Data preprocessing included the following steps: first, median imputation was used to fill in missing numerical values; second, one-hot encoding was applied to the categorical Domain feature to transform it into a suitable format for analysis; third, continuous predictors were standardized to achieve a zero mean and unit variance to ensure a uniform scale; finally, outlier inspection was conducted using interquartile-range filtering to identify and mitigate the influence of extreme values.
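The preprocessing steps described above can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' code: the column names follow the feature list, while the toy values, the 1.5 × IQR fence, the choice to filter outliers before fitting the transformer, and the helper names `iqr_mask` and `build_preprocessor` are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

NUMERIC = ["Date Difference", "Expected Team Size", "Expected Budget"]
BINARY = ["Mobile", "Desktop", "Web", "IoT"]
CATEGORICAL = ["Domain"]


def iqr_mask(df, columns, k=1.5):
    """Boolean mask that is True for rows inside the IQR fences (k is assumed)."""
    mask = pd.Series(True, index=df.index)
    for col in columns:
        q1, q3 = df[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        inside = df[col].between(q1 - k * iqr, q3 + k * iqr)
        # Missing values are kept here; they are imputed by the pipeline.
        mask &= inside | df[col].isna()
    return mask


def build_preprocessor():
    """Median imputation + standardization for numeric columns, one-hot for Domain."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    return ColumnTransformer(
        [
            ("num", numeric, NUMERIC),
            ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
            ("bin", "passthrough", BINARY),
        ],
        sparse_threshold=0,  # always return a dense array
    )


# Invented toy records: one missing duration and one extreme duration.
toy = pd.DataFrame({
    "Domain": [0, 1, 2, 1, 0, 2],
    "Mobile": [1, 0, 0, 1, 0, 1],
    "Desktop": [0, 1, 0, 0, 1, 0],
    "Web": [1, 1, 0, 1, 0, 0],
    "IoT": [0, 0, 1, 0, 0, 1],
    "Date Difference": [30, 45, np.nan, 60, 50, 900],
    "Expected Team Size": [4, 5, 6, 5, 4, 6],
    "Expected Budget": [10_000, 12_000, 11_000, 13_000, 12_500, 11_500],
})

kept = toy[iqr_mask(toy, NUMERIC)]
X = build_preprocessor().fit_transform(kept)
print(kept.shape, X.shape)
```

In a cross-validated setting the transformer would normally be fitted inside each training fold, for example by placing it in a `Pipeline` ahead of the classifier, so that imputation and scaling statistics do not leak from the validation fold.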
Figure 1 illustrates the experimental workflow in detail. We employed stratified 5-fold cross-validation, which preserves class proportions in every fold, to obtain robust performance estimates. All experiments were conducted in Python 3.9 with scikit-learn 1.1, pandas 1.4, and matplotlib 3.5. To enhance reproducibility, we set random_state=42 so that results are consistent across runs.
Figure 1. Overview of the experimental methodology
We trained three supervised classifiers: (1) Decision Tree Classifier (DecisionTreeClassifier(random_state=42)), (2) Random Forest Classifier (RandomForestClassifier(random_state=42)), and (3) Logistic Regression (LogisticRegression(solver='liblinear', max_iter=1000, random_state=42)). Each model was trained on the training partition of every cross-validation fold and evaluated on the held-out partition.
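A minimal sketch of this baseline comparison is given below. It runs on synthetic three-class data from `make_classification` as a stand-in for the real dataset, and `shuffle=True` in `StratifiedKFold` is an assumption (the source only states random_state=42). Logistic Regression is wrapped in `OneVsRestClassifier`, which is also an assumption: it makes the one-vs-rest scheme explicit and keeps the sketch independent of how a given scikit-learn version handles liblinear on multiclass targets.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the risk data (not the Kaggle dataset).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_classes=3, random_state=42
)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
    "Logistic Regression": OneVsRestClassifier(
        LogisticRegression(solver="liblinear", max_iter=1000, random_state=42)
    ),
}

# Stratified folds keep the class proportions of y in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="f1_macro")
    for name, model in models.items()
}
for name, fold_scores in scores.items():
    print(f"{name}: macro F1 = {fold_scores.mean():.3f} +/- {fold_scores.std():.3f}")
```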
Optimization of key hyperparameters was performed using GridSearchCV with stratified 5-fold cross-validation and macro-F1 scoring. Here are the search grids:
- Random Forest: n_estimators ∈ {50, 100, 200}, max_depth ∈ {None, 5, 10}, min_samples_split ∈ {2, 5};
- Decision Tree: max_depth ∈ {None, 5, 10}, min_samples_split ∈ {2, 5}.
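The two grids above map directly onto `GridSearchCV`. The sketch below uses synthetic data in place of the real dataset, and the shuffled fold splitter is an assumption; the parameter values are the ones listed in the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the risk data (not the Kaggle dataset).
X, y = make_classification(
    n_samples=150, n_features=8, n_informative=5, n_classes=3, random_state=42
)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

rf_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
    "min_samples_split": [2, 5],
}
dt_grid = {
    "max_depth": [None, 5, 10],
    "min_samples_split": [2, 5],
}

searches = {
    "Random Forest": GridSearchCV(
        RandomForestClassifier(random_state=42), rf_grid, scoring="f1_macro", cv=cv
    ),
    "Decision Tree": GridSearchCV(
        DecisionTreeClassifier(random_state=42), dt_grid, scoring="f1_macro", cv=cv
    ),
}

for name, search in searches.items():
    search.fit(X, y)  # refits the best configuration on all data by default
    print(name, search.best_params_, round(search.best_score_, 3))
```

The Random Forest grid has 3 × 3 × 2 = 18 configurations and the Decision Tree grid 3 × 2 = 6, so with 5 folds the search fits 90 and 30 models respectively, plus one final refit each.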
We report per-class precision, recall, and F1-score for each model, together with the macro-averaged F1 to account for class imbalance. Confusion matrices were generated with scikit-learn's ConfusionMatrixDisplay.
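As a hedged illustration of these metrics, the sketch below computes the per-class report, the macro F1, and a confusion matrix from out-of-fold predictions. The data are synthetic, and pooling predictions with `cross_val_predict` is an assumption about how the folds were combined.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    ConfusionMatrixDisplay,
    classification_report,
    confusion_matrix,
    f1_score,
)
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic stand-in; labels are shifted to the 1..3 risk scale.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_classes=3, random_state=42
)
y = y + 1

labels = [1, 2, 3]
names = ["Low", "Medium", "High"]

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
y_pred = cross_val_predict(RandomForestClassifier(random_state=42), X, y, cv=cv)

# Per-class precision, recall, and F1, plus macro and weighted averages.
print(classification_report(y, y_pred, labels=labels, target_names=names, digits=3))

# Macro F1 is the unweighted mean of the per-class F1 scores.
per_class_f1 = f1_score(y, y_pred, labels=labels, average=None)
macro_f1 = f1_score(y, y_pred, labels=labels, average="macro")

cm = confusion_matrix(y, y_pred, labels=labels)
display = ConfusionMatrixDisplay(cm, display_labels=names)
# display.plot() renders the matrix when matplotlib is available.
print(cm)
```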
To make the basis of each decision easy to understand, feature importances were extracted from the Decision Tree and Random Forest models via model.feature_importances_. In addition, the Decision Tree was rendered with export_graphviz, giving a visual representation of the tree's decision paths and thus insight into the main project risk drivers.
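The following sketch shows how both interpretability artifacts can be obtained. The feature names mirror the dataset description, but the data are synthetic, so the resulting ranking says nothing about the real projects; `max_depth=5` is an assumed value chosen to keep the exported tree readable.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_graphviz

FEATURES = [
    "Mobile", "Desktop", "Web", "IoT",
    "Date Difference", "Expected Team Size", "Expected Budget",
]

# Synthetic stand-in with one column per feature name.
X, y = make_classification(
    n_samples=300, n_features=len(FEATURES), n_informative=4,
    n_classes=3, random_state=42,
)

tree = DecisionTreeClassifier(max_depth=5, random_state=42).fit(X, y)
forest = RandomForestClassifier(random_state=42).fit(X, y)

# Impurity-based importances; each column sums to 1.
importances = pd.DataFrame(
    {
        "Decision Tree": tree.feature_importances_,
        "Random Forest": forest.feature_importances_,
    },
    index=FEATURES,
).sort_values("Random Forest", ascending=False)
print(importances.round(3))

# With out_file=None the DOT source is returned as a string.
dot_source = export_graphviz(
    tree,
    out_file=None,
    feature_names=FEATURES,
    class_names=["Low", "Medium", "High"],
    filled=True,
    rounded=True,
)
print(dot_source[:200])
```

The DOT string can be rendered with Graphviz or any DOT viewer; each root-to-leaf path in the output corresponds to one decision rule.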
To enhance understanding, the following experiments were conducted:
- Per-Domain Evaluation: The macro F$_1$ was calculated separately for each project domain by filtering the test-set samples on the Domain feature.
- ROC Curve Analysis: One-vs-rest ROC curves and AUC values were generated for each risk class based on the best model’s probability outputs to assess performance effectively.
- Feature Ablation Study: A model was retrained from the ground up after the removal of each major feature, one at a time, to evaluate how the macro F$_1$ score was affected by this feature exclusion.
- Statistical Significance Testing: McNemar’s test was employed to compare the predictions made by Decision Tree and Random Forest models, with a thorough analysis of the significance of differences observed in their performance outcomes.
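The per-domain evaluation and the feature-ablation study from the list above can be sketched as follows. Everything about the data is invented for illustration: the records are random, the labelling rule is constructed so that budget and duration carry signal, and Domain is kept integer-encoded for brevity. The helper name `cv_macro_f1` is hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split

rng = np.random.default_rng(42)
n = 400
df = pd.DataFrame({
    "Domain": rng.integers(0, 4, n),
    "Date Difference": rng.integers(30, 720, n),
    "Expected Team Size": rng.integers(2, 30, n),
    "Expected Budget": rng.uniform(5e3, 5e5, n),
})
# Invented labelling rule: risk grows with budget and duration.
score = df["Expected Budget"] / 5e5 + df["Date Difference"] / 720
df["Risk"] = pd.cut(score, bins=[-np.inf, 0.7, 1.3, np.inf], labels=[1, 2, 3]).astype(int)

features = [c for c in df.columns if c != "Risk"]

# Per-domain macro F1 on a held-out test split.
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["Risk"], test_size=0.25, stratify=df["Risk"], random_state=42
)
model = RandomForestClassifier(random_state=42).fit(X_train, y_train)
pred = pd.Series(model.predict(X_test), index=X_test.index)
per_domain = {
    domain: f1_score(y_test.loc[idx], pred.loc[idx], average="macro")
    for domain, idx in X_test.groupby("Domain").groups.items()
}

# Feature ablation: retrain from scratch without one feature at a time.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)


def cv_macro_f1(columns):
    """Mean cross-validated macro F1 using only the given columns."""
    clf = RandomForestClassifier(random_state=42)
    return cross_val_score(clf, df[columns], df["Risk"], cv=cv, scoring="f1_macro").mean()


baseline = cv_macro_f1(features)
ablation = {
    removed: cv_macro_f1([c for c in features if c != removed])
    for removed in ["Expected Budget", "Date Difference"]
}

print("per-domain macro F1:", {int(k): round(v, 3) for k, v in per_domain.items()})
print("baseline macro F1:", round(baseline, 3))
print("after ablation:", {k: round(v, 3) for k, v in ablation.items()})
```

Comparing each ablated score against the baseline gives the drop attributable to the removed feature; a large drop indicates that the remaining features cannot compensate for it.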
Results and discussions
The Random Forest classifier achieves the highest predictive performance, with a macro-averaged F$_1$-score of 0.980 (Table 1). This surpasses both the Decision Tree, at 0.968, and Logistic Regression, which lags far behind at 0.694.
Table 1.
Macro-averaged classifier performance (5-fold CV)
| Model | Precision | Recall | F1-score |
|---|---|---|---|
| Random Forest | 0.986 | 0.974 | 0.980 |
| Decision Tree | 0.976 | 0.962 | 0.968 |
| Logistic Regression | 0.739 | 0.685 | 0.694 |
To evaluate the model's behavior across application types, we computed the macro F1 score for each domain (Table 2). IoT projects achieve the highest score (0.787), while Desktop projects score lowest (0.429).
Table 2.
Random Forest macro F1 by project domain
| Domain | Macro F1 |
|---|---|
| IoT | 0.787 |
| Desktop | 0.429 |
| Web | 0.556 |
| Mobile | 0.620 |
Figure 2 displays the one-vs-rest ROC curves for the Random Forest model. Every class achieves an AUC above 0.95, indicating robust discriminative ability across all risk levels.
Figure 2. One‐vs‐rest ROC curves for Random Forest (classes Low, Medium, High)
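One-vs-rest ROC curves of this kind can be computed by binarizing the labels and scoring each class against its predicted probability. The sketch below is an illustration on synthetic data with an assumed 70/30 stratified split; the class names are attached by position and the AUC values it prints are unrelated to those in the figure.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

# Synthetic stand-in for the risk data (not the Kaggle dataset).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, n_classes=3, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

model = RandomForestClassifier(random_state=42).fit(X_train, y_train)
proba = model.predict_proba(X_test)                      # columns follow model.classes_
y_bin = label_binarize(y_test, classes=model.classes_)   # one indicator column per class

aucs = {}
for k, name in enumerate(["Low", "Medium", "High"]):
    # Class k is treated as positive, the other two classes as negative.
    fpr, tpr, _ = roc_curve(y_bin[:, k], proba[:, k])
    aucs[name] = auc(fpr, tpr)

for name, value in aucs.items():
    print(f"{name}: AUC = {value:.3f}")
```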
We measured the relevance of the most important features by excluding one feature at a time and retraining the model (Table 3). Removing Expected Budget lowered the macro F1 to 0.812, and removing Date Difference lowered it to 0.615, both well below the full-feature scores in Table 1. This indicates that both features carry substantial predictive signal.
Table 3.
Macro F1 after removing a single feature
| Feature Removed | Macro F1 |
|---|---|
| Expected Budget | 0.812 |
| Date Difference | 0.615 |
McNemar's test comparing the Decision Tree and Random Forest yields a p-value of 0.500, indicating that the performance difference between the two models is not statistically significant at α = 0.05.
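For reference, McNemar's test only uses the discordant pairs, that is, the samples on which exactly one of the two models is correct. The sketch below implements the exact (binomial) two-sided variant in plain Python; whether the study used the exact or the chi-square form is not stated, so this choice is an assumption, and the labels and predictions are invented.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test on the discordant pairs of two classifiers.

    Returns (b, c, p_value), where b counts samples only model A gets right
    and c counts samples only model B gets right.
    """
    b = sum(a == t and p != t for t, a, p in zip(y_true, pred_a, pred_b))
    c = sum(a != t and p == t for t, a, p in zip(y_true, pred_a, pred_b))
    n = b + c
    if n == 0:
        return b, c, 1.0  # the models never disagree on correctness
    # Under H0 each discordant pair favours either model with probability 0.5.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Invented example: the tree makes two errors that the forest gets right.
y_true      = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1]
tree_pred   = [1, 2, 3, 2, 2, 3, 1, 2, 2, 1]
forest_pred = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1]

b, c, p_value = mcnemar_exact(y_true, tree_pred, forest_pred)
print(f"b={b}, c={c}, p={p_value:.3f}")
```

With only two discordant pairs, both favouring the same model, the exact two-sided p-value is 2 × (1/4) = 0.5, which shows how a small number of disagreements limits the power of the test even when one model is never worse.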
Figure 3 shows the confusion matrix for the Random Forest classifier. The high values on the diagonal indicate good class-wise precision and recall; the few errors that occur are between adjacent risk levels.
Figure 3. Confusion matrix for the Random Forest classifier
Figure 4 shows the normalized feature importance scores derived from the Decision Tree model. As anticipated, Expected Budget and Date Difference emerge as the two most significant variables contributing to project risk. Team size and the technology flags follow and also influence the overall risk profile of a project.
Figure 4. Feature importance from the Decision Tree model
Conclusion
According to the findings, IT project risk management can use AI models that are both simple to interpret and combined in one decision system. The fully interpretable Decision Tree reached a macro F1 of 0.968, which the tuned Random Forest improved to 0.980.
Using stratified 5-fold cross-validation, per-domain analysis, feature ablation, and statistical testing, we conclude that the model is robust and that, understandably, budget and duration are the factors most strongly linked to risk. Decision Trees are easy to interpret and perform almost as well as ensembles, so interpretability and accuracy can be combined without much sacrifice.
Previous studies mostly concentrated on accuracy alone or did not compare different model types side by side [2], [3], [10]. Our study, in contrast, directly compares interpretable and black-box models. We go beyond previous work by using SHAP techniques and embedding them in a working dashboard, which makes our study more practical than others [11], [12].
Our initial results indicate that transparent, high-performing IT DSS can benefit project managers by providing actionable risk intelligence and fostering trust through explanations. The proposed framework and proof of concept serve as a roadmap for implementing explainable machine learning in real-world IT project environments. Future work should include user-centered field testing and extend governance procedures so that AI-supported decision-making remains accountable.
References:
- Pathirannehelage S. H., Shrestha Y. R., von Krogh G. Design principles for artificial intelligence-augmented decision making: An action design research study // European Journal of Information Systems. – 2024.
- Basingab M. S., Bukhari H., Serbaya S. H., Fotis G., Vita V., Pappas S., Rizwan A. AI-based decision support system optimizing wireless sensor networks for consumer electronics in e-commerce // Applied Sciences (Switzerland). – 2024. – Vol. 14.
- Wang Q. How to apply AI technology in project management [Electronic resource]. – 2019. – Available at: https://blog.capterra.com/i-project-manager-the-rise-of-artificial-intelligence-in-the-workplace/
- Sathi A., Morton T. E., Roth S. F. Callisto: An intelligent project management system. – 1986.
- Shoushtari F., Daghighi A., Ghafourian E. Application of artificial intelligence in project management [Electronic resource]. – 2024. – Available at: http://ijieor.ir
- Soori M., Jough F. K. G., Dastres R., Arezoo B. AI-based decision support systems in Industry 4.0: A review // Journal of Economy and Technology. – 2024.
- Dodda S., Mohan M., Ayyalasomayajula T. AI-driven decision support systems in management: Enhancing strategic planning and execution [Electronic resource]. – 2024. – Available at: http://www.ijritcc.org
- Smith J. AI-augmented project management: Enhancing decision-making through predictive analytics and automation. – 2024.
- Gholami H. Artificial intelligence techniques for sustainable reconfigurable manufacturing systems: An AI-powered decision-making application using large language models // Big Data and Cognitive Computing. – 2024. – Vol. 8.
- Odejide O. A., Edunjobi T. E., Author C. AI in project management: Exploring theoretical models for decision-making and risk management // Engineering Science Technology Journal. – 2024. – Vol. 5. – P. 1072–1085. – [Electronic resource]. – Available at: www.fepbl.com/index.php/estj
- Zadeh E. K., Khoulenjani A. B., Safaei M. Integrating AI for agile project management: Innovations, challenges, and benefits [Electronic resource]. – 2024. – Available at: http://www.ijiecm.com
- Karamthulla M. J., Malaiyappan J. N. A., Tillu R., Muthusubramanian M. From theory to practice: Implementing AI technologies in project management [Electronic resource]. – 2024. – Available at: www.ijfmr.com
- Cater-Steel A., Shrestha A., Toleman M. Decision support systems for IT service management. – 2016.
- Salimzadeh S., He G., Gadiraju U. A missing piece in the puzzle: Considering the role of task complexity in human-AI decision making // Proceedings of the 31st ACM Conference on User Modeling, Adaptation and Personalization (UMAP 2023). – 2023. – P. 215–227.
- Asolo E. AI-powered decision support systems for sustainable agriculture using AI-chatbot solution // JDFEWS. – 2024. – Vol. 5. – P. 1–10. – [Electronic resource]. – Available at: https://github.com/iamchibu/AI-ChatBot-Powered-Decision-Support-Systems-for-
- Maphosa V., Maphosa M. Artificial intelligence in project management research: A bibliometric analysis // Journal of Theoretical and Applied Information Technology. – 2022. – Vol. 31. – [Electronic resource]. – Available at: www.jatit.org
- Mullangi K., Dhameliya N., Anumandla S. K. R., Yarlagadda V. K., Sachani D. K., Vennapusa S. C. R., Maddula S. S., Patel B. AI-augmented decision-making in management using quantum networks // Asian Business Review. – 2023. – Vol. 13. – P. 73–86.
- Kovari A. AI for decision support: Balancing accuracy, transparency, and trust across sectors // Information. – 2024. – Vol. 15. – P. 725.
- Smith C. J., Wong A. T. Advancements in artificial intelligence-based decision support systems for improving construction project sustainability: A systematic literature review. – 2022.
- Afolabi O., Teslim B. AI-powered decision making in enterprise systems [Electronic resource]. – 2024. – Available at: https://www.researchgate.net/publication/384768679
- Salleh M. H., Aziz K. A. Artificial Intelligence Augmented Project Management. – 2022. – P. 274–284.
- Deelaka K. Project risk analysis dataset [Electronic resource]. – 2023. – Available at: https://www.kaggle.com/datasets/kusaldeelaka/project-risk-analysis-dataset