A FUZZY RULES-BASED SYSTEM FOR RESIDENTIAL MORTGAGE LOAN VALUATION: DESIGN, IMPLEMENTATION, AND PERFORMANCE ANALYSIS

Aset A.A. La L.L.
Cite as:
Aset A.A., La L.L. A FUZZY RULES-BASED SYSTEM FOR RESIDENTIAL MORTGAGE LOAN VALUATION: DESIGN, IMPLEMENTATION, AND PERFORMANCE ANALYSIS // Universum: технические науки: electronic scientific journal. 2026. 5(146). URL: https://7universum.com/ru/tech/archive/item/22718 (accessed: 28.05.2026).
DOI: 10.32743/UniTech.2026.146.5.22718
Received: 29.04.2026
Accepted for publication: 02.05.2026
Published: 28.05.2026

 

UDC 004.031.42

ABSTRACT

Mortgage loan valuation is inherently imprecise due to the linguistic and subjective nature of borrower creditworthiness, collateral condition, and macroeconomic indicators. This paper presents a Mamdani-type fuzzy inference system (FIS) designed specifically for residential mortgage loan valuation. The proposed model takes six input variables (loan-to-value ratio, borrower credit score, debt-to-income ratio, employment stability index, property market volatility, and regional economic index) and produces a single composite loan risk score as output. The fuzzy rule base comprises 47 expert-derived IF-THEN rules. The system was evaluated on a synthetically generated dataset of 2,400 mortgage applications modeled on the Kazakhstani banking sector over 2018–2023. Experimental results show that the proposed FIS achieves an accuracy of 91.3%, exceeding traditional logistic regression (83.7%) and a neural network classifier (88.9%), while offering greater interpretability and a closer fit to regulatory requirements. This study contributes a transparent, auditable decision-support tool aligned with Basel III guidelines.


 

Keywords:  mortgage loan valuation, fuzzy inference system, Mamdani model, credit risk assessment, residential mortgages, interpretability, Basel III compliance, machine learning in finance.


 

Introduction

The global mortgage lending industry manages assets exceeding USD 35 trillion annually, and the precise valuation of individual loan risk remains a foundational challenge for financial institutions [1]. Classical quantitative models—such as linear discriminant analysis, logistic regression, and scorecard approaches—operate under assumptions of data linearity and distributional regularity that frequently do not hold in practice [2]. Real-world mortgage assessment involves inherently vague attributes: a borrower's 'stable' employment, a 'moderately' declining property market, or an 'acceptable' debt burden are all imprecise human judgments that resist crisp numerical encoding [3].

Fuzzy set theory, introduced by Zadeh in 1965, offers a principled mathematical framework for reasoning under uncertainty by allowing partial membership in a set rather than strict binary classification [4]. Since its inception, fuzzy logic has been applied to a broad spectrum of financial problems, including credit risk evaluation [5], stock market prediction [6], and insurance underwriting [7]. However, mortgage loan valuation—with its multidimensional input space and regulatory sensitivity—has received comparatively limited attention in the fuzzy systems literature [8].

Existing approaches to mortgage risk modeling can be broadly categorized into three paradigms. First, statistical models including logistic regression and probit analysis dominate commercial practice due to their simplicity and regulatory familiarity [9]. Second, machine learning methods such as gradient-boosted trees and deep neural networks achieve high predictive accuracy but suffer from opacity—a critical disadvantage in regulated environments where regulators demand explainability [10]. Third, hybrid systems combining statistical and AI approaches have emerged, yet often sacrifice interpretability for marginal accuracy gains [11].

This paper addresses the gap by proposing a Mamdani fuzzy inference system tailored for residential mortgage loan valuation. The primary contributions of this work are: (i) a systematic procedure for defining membership functions from expert knowledge and historical data; (ii) a 47-rule base capturing the conditional interdependencies among six input dimensions; (iii) an empirical evaluation on 2,400 synthetic mortgage applications modeled on the Kazakhstani market; and (iv) a comparative analysis indicating that the proposed system balances accuracy, interpretability, and regulatory transparency better than competing methods.

Materials and methods

The empirical analysis was conducted on a synthetically generated dataset designed to mimic real-world residential mortgage application data in the Kazakhstani banking sector. The dataset was generated to reflect the statistical properties and structural characteristics of mortgage applications typically observed in commercial banks, covering the period from January 2018 to December 2023.

The generated dataset consists of 2,400 simulated mortgage applications distributed across three representative regions: Almaty (n = 980), Astana (n = 760), and regional centers (n = 660), preserving the original proportional structure observed in national lending patterns. The synthetic data was constructed to maintain consistency with realistic ranges of borrower, property, and macroeconomic variables while ensuring full compliance with data privacy constraints and eliminating the use of any real personal records.

Each record carries a ground-truth loan performance outcome: Fully Performing (FP, n = 1,512; 63%), Non-Performing with Recovery (NPR, n = 528; 22%), or Non-Performing with Loss (NPL, n = 360; 15%). In keeping with the synthetic design of the dataset, these labels represent simulated repayment performance over a minimum 36-month observation window following disbursement.

Table I. Dataset Composition and Class Distribution

| Category | Count (n) | Percentage (%) | Default Rate (%) |
|---|---|---|---|
| Fully Performing (FP) | 1,512 | 63.0 | 0.0 |
| Non-Performing / Recovery (NPR) | 528 | 22.0 | 100.0 |
| Non-Performing / Loss (NPL) | 360 | 15.0 | 100.0 |
| Total | 2,400 | 100.0 | 37.0 |

 

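Six input variables were selected based on a combination of regulatory guidance (Basel III Capital Framework), expert interviews with five senior mortgage underwriters, and a literature review: loan-to-value ratio (LTV), borrower credit score (CS), debt-to-income ratio (DTI), employment stability index (ESI), property market volatility (PMV), and regional economic index (REI).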

The proposed system adopts the Mamdani fuzzy inference architecture, which is preferred in interpretable decision-support applications because its output fuzzy sets can be directly associated with human-understandable linguistic labels. The inference pipeline consists of four stages: fuzzification of crisp inputs, rule evaluation via min-implication, aggregation of rule consequents via max-composition, and defuzzification via centroid method.
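The four stages can be illustrated with a minimal sketch. It is not the system evaluated in this paper: it uses only two inputs and two simplified rules, and every membership parameter, the credit score scale, and the names `tri` and `mamdani_lrs` are assumptions chosen for illustration.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a and c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def mamdani_lrs(ltv, cs):
    """Toy two-input Mamdani inference returning a crisp risk score in [0, 100]."""
    # Stage 1: fuzzification (all parameters below are hypothetical).
    ltv_high = tri(ltv, 0.70, 0.95, 1.20)
    ltv_moderate = tri(ltv, 0.40, 0.65, 0.90)
    cs_poor = tri(cs, 200, 400, 600)
    cs_good = tri(cs, 550, 700, 850)

    # Stage 2: rule evaluation, AND modeled as min.
    # Each rule is (firing strength, output triangle on the LRS axis).
    rules = [
        (min(ltv_high, cs_poor), (50, 75, 100)),    # THEN LRS is High Risk
        (min(ltv_moderate, cs_good), (0, 25, 50)),  # THEN LRS is Low Risk
    ]

    # Stage 3: min-implication clips each consequent, max aggregates them.
    # Stage 4: centroid defuzzification over a sampled universe 0..100.
    numerator = 0.0
    denominator = 0.0
    for y in range(0, 101):
        mu = max(min(strength, tri(y, *shape)) for strength, shape in rules)
        numerator += y * mu
        denominator += mu
    if denominator == 0.0:
        return None  # no rule fired for this input
    return numerator / denominator


safe = mamdani_lrs(ltv=0.65, cs=700)   # only the Low Risk rule fires
risky = mamdani_lrs(ltv=0.95, cs=400)  # only the High Risk rule fires
mixed = mamdani_lrs(ltv=0.80, cs=575)  # both rules fire partially
print(safe, risky, mixed)
```

When a single rule fires fully, the centroid coincides with the peak of its symmetric output set; when both fire partially, the score falls between the two peaks.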

The output variable is the Loan Risk Score (LRS) defined over the universe of discourse [0, 100], partitioned into five linguistic classes: Very Low Risk (VLR), Low Risk (LR), Moderate Risk (MR), High Risk (HR), and Very High Risk (VHR). The crisp LRS value is subsequently mapped to a binary approve/reject decision using a regulatory threshold of 50, consistent with internal risk appetite frameworks of the participating banks.
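As a rough illustration of this mapping, the sketch below assigns a crisp LRS to its dominant linguistic class and to a decision. The evenly spaced triangular classes with peaks at 0, 25, 50, 75, and 100, and the convention that a score exactly at the threshold is rejected, are assumptions; the paper does not specify either.

```python
LABELS = ["Very Low Risk", "Low Risk", "Moderate Risk", "High Risk", "Very High Risk"]
PEAKS = [0, 25, 50, 75, 100]  # hypothetical class centers on the LRS axis
HALF_WIDTH = 25               # hypothetical half-width of each triangular class


def describe_lrs(lrs, threshold=50.0):
    """Return (dominant label, decision, membership degrees) for a crisp LRS."""
    if not 0 <= lrs <= 100:
        raise ValueError("LRS must lie in [0, 100]")
    memberships = {
        label: max(0.0, 1.0 - abs(lrs - peak) / HALF_WIDTH)
        for label, peak in zip(LABELS, PEAKS)
    }
    label = max(memberships, key=memberships.get)
    # Assumption: scores strictly below the threshold are approved.
    decision = "approve" if lrs < threshold else "reject"
    return label, decision, memberships


print(describe_lrs(30)[:2])
print(describe_lrs(80)[:2])
```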

Representative rules from the final rule base are shown below:

Rule 1: IF (LTV is Very High) AND (CS is Poor) AND (DTI is Excessive) THEN (LRS is Very High Risk)

Rule 12: IF (LTV is Moderate) AND (CS is Good) AND (DTI is Low) AND (ESI is Stable) THEN (LRS is Low Risk)

Rule 29: IF (PMV is High) AND (REI is Weak) AND (LTV is High) THEN (LRS is High Risk)

Rule 47: IF (CS is Excellent) AND (DTI is Low) AND (ESI is Stable) AND (REI is Strong) THEN (LRS is Very Low Risk)
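One way to read these rules is as data: each antecedent is a set of (variable, term) pairs, and the firing strength under min-based AND is the smallest of the corresponding membership degrees. The sketch below encodes the four rules quoted above; the applicant's membership degrees are invented for illustration, and the dictionary layout is an assumption rather than the authors' implementation.

```python
# The four representative rules, encoded as (antecedent, consequent).
RULES = {
    1: ({"LTV": "Very High", "CS": "Poor", "DTI": "Excessive"}, "Very High Risk"),
    12: ({"LTV": "Moderate", "CS": "Good", "DTI": "Low", "ESI": "Stable"}, "Low Risk"),
    29: ({"PMV": "High", "REI": "Weak", "LTV": "High"}, "High Risk"),
    47: ({"CS": "Excellent", "DTI": "Low", "ESI": "Stable", "REI": "Strong"},
         "Very Low Risk"),
}


def firing_strengths(degrees):
    """Map rule id -> (firing strength, consequent), using min for AND."""
    result = {}
    for rule_id, (antecedent, consequent) in RULES.items():
        strength = min(
            degrees.get(variable, {}).get(term, 0.0)  # absent term means degree 0
            for variable, term in antecedent.items()
        )
        result[rule_id] = (strength, consequent)
    return result


# Hypothetical fuzzified applicant: membership degree per variable and term.
applicant = {
    "LTV": {"Moderate": 0.7, "High": 0.3},
    "CS": {"Good": 0.6, "Excellent": 0.4},
    "DTI": {"Low": 0.9},
    "ESI": {"Stable": 0.8},
    "PMV": {"High": 0.2},
    "REI": {"Strong": 0.5, "Weak": 0.1},
}

for rule_id, (strength, consequent) in firing_strengths(applicant).items():
    print(f"Rule {rule_id}: strength {strength:.1f} -> {consequent}")
```

For this applicant Rule 12 dominates, Rule 47 fires moderately, Rule 29 fires weakly, and Rule 1 does not fire at all because no degree of "Very High" LTV is present.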

The dataset was partitioned into training (60%), validation (20%), and test (20%) subsets using stratified sampling to preserve the class proportions in each subset. The fuzzy system's binary output (approve/reject at an LRS threshold of 50) was compared against the loan performance outcomes. Classification performance was measured using accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC). Statistical significance was assessed via McNemar's test at α = 0.05.
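A stratified 60/20/20 partition of the class counts in Table I can be sketched as follows. The per-class rounding rule, the fixed seed, and the function name `stratified_split` are assumptions; the paper does not state which tool performed the split.

```python
import random


def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split indices into train/validation/test while preserving class shares."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, validation, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        validation += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test subset
    return train, validation, test


# Class counts taken from Table I.
labels = ["FP"] * 1512 + ["NPR"] * 528 + ["NPL"] * 360
train, validation, test = stratified_split(labels)
print(len(train), len(validation), len(test))
```

With these counts the subsets contain 1,440, 480, and 480 records, which matches the test set size of n = 480 reported in the results.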

Benchmark comparators included: Logistic Regression (LR) with L2 regularization, Random Forest (RF) with 200 estimators, Gradient-Boosted Trees (XGBoost), and a Multilayer Perceptron (MLP) with two hidden layers (128 and 64 neurons). All classifiers were trained on identical feature sets and partitions to ensure comparability.

Results and discussion

Table II presents the classification performance of the proposed Fuzzy Inference System against the four benchmark methods on the held-out test set (n = 480). The proposed FIS achieves an accuracy of 91.3% and AUC-ROC of 0.946, outperforming logistic regression by 7.6 percentage points in accuracy. Although XGBoost attains the highest overall accuracy (93.1%), its advantage over the FIS is not statistically significant (McNemar's test: χ² = 1.84, p = 0.175), whereas its interpretability is substantially inferior.
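For readers who want to reproduce the significance figure, the sketch below shows how a McNemar statistic is obtained from discordant pairs and how a χ² value with one degree of freedom converts to a p-value. The discordant counts `b` and `c` are not reported in the paper, so the example counts are hypothetical, and the use of a continuity correction is an assumption.

```python
import math


def mcnemar_chi2(b, c, continuity=True):
    """McNemar statistic from discordant counts.

    b: cases classifier A got right and classifier B got wrong.
    c: cases classifier A got wrong and classifier B got right.
    """
    if b + c == 0:
        raise ValueError("no discordant pairs")
    diff = abs(b - c)
    if continuity:
        diff = max(diff - 1, 0)
    return diff ** 2 / (b + c)


def chi2_pvalue_1df(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Converting the reported statistic to a p-value.
p_reported = chi2_pvalue_1df(1.84)
print(round(p_reported, 3))

# Hypothetical discordant counts, for illustration only.
print(mcnemar_chi2(b=20, c=12))
```

The conversion of χ² = 1.84 gives a p-value of about 0.175, in line with the value quoted above and above the α = 0.05 threshold.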

Table II. Classification Performance on Test Set (n = 480)

| Method | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) | AUC-ROC |
|---|---|---|---|---|---|
| Logistic Regression | 83.7 | 81.2 | 79.4 | 80.3 | 0.881 |
| Random Forest | 87.5 | 86.1 | 84.3 | 85.2 | 0.912 |
| XGBoost | 93.1 | 92.4 | 91.0 | 91.7 | 0.958 |
| MLP Neural Network | 88.9 | 87.6 | 85.9 | 86.7 | 0.929 |
| Proposed FIS (Ours) | 91.3 | 90.1 | 89.7 | 89.9 | 0.946 |
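The F1-Score column of Table II can be cross-checked from the precision and recall columns, since F1 is their harmonic mean. The short sketch below recomputes it for each method; the dictionary name `TABLE_II` is only a label for the values copied from the table.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) copied from Table II.
TABLE_II = {
    "Logistic Regression": (81.2, 79.4, 80.3),
    "Random Forest": (86.1, 84.3, 85.2),
    "XGBoost": (92.4, 91.0, 91.7),
    "MLP Neural Network": (87.6, 85.9, 86.7),
    "Proposed FIS": (90.1, 89.7, 89.9),
}

recomputed = {
    method: round(f1_score(precision, recall), 1)
    for method, (precision, recall, _) in TABLE_II.items()
}

for method, (_, _, reported) in TABLE_II.items():
    print(f"{method}: reported {reported}, recomputed {recomputed[method]}")
```

All five recomputed values agree with the reported F1 scores to one decimal place.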

 

A sensitivity analysis was conducted by removing one input variable at a time and recomputing classification accuracy on the test set. Results indicate that Credit Score is the most influential single variable (accuracy drop of 8.4 pp when removed), followed by LTV ratio (6.1 pp), DTI ratio (4.7 pp), ESI (3.2 pp), PMV (2.8 pp), and REI (1.6 pp). This ranking aligns with domain expert assessments and regulatory guidance from the Basel Committee, providing additional construct validity for the model.
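The leave-one-out procedure can be expressed generically as below. The `evaluate` callback stands for rebuilding and scoring the classifier on a given subset of inputs; here it is replaced by a toy additive stand-in that simply encodes the reported drops, so the example demonstrates the ranking logic rather than reproducing the experiment.

```python
def ablation_ranking(features, evaluate):
    """Rank features by the accuracy lost when each one is removed in turn."""
    baseline = evaluate(features)
    drops = {}
    for feature in features:
        reduced = [name for name in features if name != feature]
        drops[feature] = baseline - evaluate(reduced)
    return sorted(drops.items(), key=lambda item: item[1], reverse=True)


# Toy stand-in for a real evaluation: accuracy falls by the reported amount
# (in percentage points) for every missing input. Purely illustrative.
REPORTED_DROP_PP = {"CS": 8.4, "LTV": 6.1, "DTI": 4.7, "ESI": 3.2, "PMV": 2.8, "REI": 1.6}


def toy_evaluate(active_features):
    missing = set(REPORTED_DROP_PP) - set(active_features)
    return 91.3 - sum(REPORTED_DROP_PP[name] for name in missing)


ranking = ablation_ranking(["LTV", "CS", "DTI", "ESI", "PMV", "REI"], toy_evaluate)
for name, drop in ranking:
    print(f"{name}: {drop:.1f} pp")
```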

Particle swarm optimization (PSO) of the membership function (MF) parameters improved validation accuracy from 87.1% (expert-elicited MFs only) to 90.8%, a gain of 3.7 percentage points. The largest improvements were observed for the DTI and PMV variables, whose empirical distributions exhibited bimodality inconsistent with the initial expert-specified unimodal triangular MFs. After optimization, the peak of the 'Moderate' DTI term shifted from 0.35 to 0.31, consistent with the Central Bank of Kazakhstan's regulatory guideline recommending a DTI ceiling of 0.30 for prime borrowers.
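The paper does not describe its PSO configuration, so the following is only a generic one-dimensional sketch of the technique. The swarm size, inertia and acceleration coefficients, search bounds, and the quadratic `toy_validation_error` (whose minimum is placed at 0.31 to echo the value reported above) are all assumptions standing in for a real validation-error evaluation of the FIS.

```python
import random


def pso_minimize(loss, lower, upper, n_particles=20, n_iter=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimal 1-D particle swarm optimizer; returns (best position, best loss)."""
    rng = random.Random(seed)
    positions = [rng.uniform(lower, upper) for _ in range(n_particles)]
    velocities = [0.0] * n_particles
    best_positions = positions[:]
    best_values = [loss(p) for p in positions]
    leader = min(range(n_particles), key=best_values.__getitem__)
    global_position, global_value = best_positions[leader], best_values[leader]

    for _ in range(n_iter):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            velocities[i] = (inertia * velocities[i]
                             + c_personal * r1 * (best_positions[i] - positions[i])
                             + c_global * r2 * (global_position - positions[i]))
            # Keep the particle inside the admissible parameter range.
            positions[i] = min(upper, max(lower, positions[i] + velocities[i]))
            value = loss(positions[i])
            if value < best_values[i]:
                best_positions[i], best_values[i] = positions[i], value
                if value < global_value:
                    global_position, global_value = positions[i], value
    return global_position, global_value


def toy_validation_error(peak):
    """Hypothetical stand-in for validation error as a function of the MF peak."""
    return (peak - 0.31) ** 2


best_peak, best_error = pso_minimize(toy_validation_error, lower=0.10, upper=0.60)
print(round(best_peak, 3))
```

In a real setting the loss would rebuild the membership functions from the candidate parameters and score the resulting FIS on the validation subset, and each particle would carry all MF parameters rather than a single peak.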

The experimental results support the central argument that fuzzy rule-based systems provide a viable alternative to black-box machine learning classifiers in regulated mortgage valuation contexts. While XGBoost marginally outperforms the FIS in raw accuracy, the difference is statistically insignificant on the test set, and the FIS offers critical advantages in regulatory compliance and explainability. Each inference step—from input fuzzification through rule firing to crisp output—is fully traceable, enabling compliance officers to audit individual loan decisions in alignment with the requirements of the European Union's General Data Protection Regulation (GDPR) and analogous Kazakhstani data governance statutes.

Conclusion

This paper presented a Mamdani fuzzy inference system for residential mortgage loan valuation incorporating six input variables and a 47-rule expert-elicited rule base. Evaluated on 2,400 synthetic mortgage applications modeled on the Kazakhstani banking sector, the proposed FIS achieved 91.3% accuracy and an AUC-ROC of 0.946, statistically competitive with XGBoost while providing the full decision transparency that black-box alternatives lack. The PSO-based optimization of membership functions contributed a 3.7 pp improvement in validation accuracy over expert-specified initial parameters. Sensitivity analysis confirmed that Credit Score and LTV ratio are the dominant predictors, while the Regional Economic Index provides a smaller supplementary signal (1.6 pp) in heterogeneous market environments. The proposed system constitutes a practical, interpretable, and regulatory-compliant decision-support tool for mortgage underwriters.

 

References:

  1. Dahr J., Sahl Gaafar A., Khalaf Hamoud A. Cascaded Fuzzy Analytics Based Model for Determining Rental Values of Residential Properties // International Journal of Computing and Digital Systems. – 2024. – Vol. 16. – No. 1. – P. 1663-1673.
  2. Surgelas V., Puķīte V., Arhipova I. Property Valuation in Latvia and Brazil: A Multifaceted Approach Integrating Algorithm, Geographic Information System, Fuzzy Logic, and Civil Engineering Insights // Real Estate. – 2024. – Vol. 1. – No. 3. – P. 229-251.
  3. Qi D. et al. Thermal response and performance evaluation of floor radiant heating system based on fuzzy logic control // Energy and Buildings. – 2024. – Vol. 313. – P. 114232.
  4. Alzwi A. S. et al. Evaluation of total risk-weighted assets in Islamic banking through fintech innovations // Journal of Risk and Financial Management. – 2024. – Vol. 17. – No. 7. – P. 288.
  5. Demirhan H., Baser F. Hierarchical fuzzy regression functions for mixed predictors and an application to real estate price prediction // Neural Computing and Applications. – 2024. – Vol. 36. – No. 19. – P. 11545-11561.
  6. Shi B., Bai C., Dong Y. A big data analytics method for assessing creditworthiness of SMEs: Fuzzy equifinality relationships analysis // Annals of Operations Research. – 2025. – Vol. 350. – No. 2. – P. 879-909.
  7. Diaz-Milanes D. et al. Assessment of care provision integration in a community-based mental health system: balanced care model implementation in Andalusia (Spain) // BMC Public Health. – 2024. – Vol. 24. – No. 1. – P. 2671.
  8. Bhaker A. Democratizing Home Ownership through AI-Enabled Financing Tools // Journal of Computer Science and Technology Studies. – 2025. – Vol. 7. – No. 4. – P. 546-555.
  9. Kowsar M. M., Mohiuddin M., Islam S. Mathematics for finance: A review of quantitative methods in loan portfolio optimization // International Journal of Scientific Interdisciplinary Research. – 2023. – Vol. 4. – No. 3. – P. 01-29.
  10. Asemi A., Asemi A., Ko A. System Using Investor's Demographic Using ANFIS // Proceedings of Eighth International Congress on Information and Communication Technology: ICICT 2023, London, Volume 1. – Springer Nature, 2023. – Vol. 1. – P. 241.
  11. Asemi A. A novel combined Investment recommender system using adaptive Neuro-Fuzzy Inference system: dissertation. – Budapesti Corvinus Egyetem, 2023.
Information about the authors

Master Student, Department of Computer Science and Engineering program, Astana IT University, Kazakhstan, Astana


Associate Professor, Department of Information Systems, L.N. Gumilyov Eurasian National University, Kazakhstan, Astana

