A COMBINED BIOMETRIC IDENTIFICATION SYSTEM BASED ON INSIGHTFACE, LOGISTIC REGRESSION, AND ALIVENESS DETECTION

To cite this article:
Makhmudova Sh.Y., Saydazimov J. A COMBINED BIOMETRIC IDENTIFICATION SYSTEM BASED ON INSIGHTFACE, LOGISTIC REGRESSION, AND ALIVENESS DETECTION // Universum: технические науки : электрон. научн. журн. 2026. 2(143). URL: https://7universum.com/ru/tech/archive/item/21940 (дата обращения: 08.03.2026).
DOI - 10.32743/UniTech.2026.143.2.21940

 

ABSTRACT

A hybrid biometric identification system has been developed that combines the InsightFace embedding model, a logistic regression classifier, and a liveness detection module. We created our own dataset of 150 people with extensive augmentation totaling more than 23 GB, which increased the model's robustness to changes in lighting, viewing angle, and image quality. Logistic regression demonstrated high accuracy at low computational cost. The integrated liveness detection module recognizes spoofing attempts that use photo and video images. Experiments showed 96% recognition accuracy and robustness of the system to face presentation attacks. The presented visual materials confirm the correct operation of all components. The architecture can be used in access control and remote verification systems.



 

Keywords: facial recognition, biometric identification, computer vision, embeddings, InsightFace, logistic regression, liveness detection.


 

Introduction

In recent years, biometric technologies, especially facial recognition, have become a key element of banking services, physical and information security systems, as well as remote user identification. Their popularity is due to their convenience, high processing speed, and easy integration into existing solutions. However, the increase in the use of such systems is accompanied by an increase in the number of spoofing attacks using photographs, videos and 3D masks, which reduces the reliability of traditional methods.

Modern recognition systems mainly rely on deep neural networks, among which the InsightFace model occupies a leading position thanks to the ArcFace loss and the high separability of its features. Despite this, most existing solutions either ignore the threat of spoofing or require significant computing resources, which limits their use in applied systems.

Taking these limitations into account, a lightweight and attack-resistant identification method is proposed. We created our own dataset of 150 people with extensive augmentation totaling 23 GB, providing high diversity of training data. A hybrid architecture was built on top of InsightFace, where embeddings are classified using logistic regression, a compact and interpretable algorithm. Additionally, a liveness detection module was developed that detects face presentation attacks and protects against widespread spoofing techniques.

Experimental results showed 96% recognition accuracy and high attack filtering efficiency, which confirms the applicability of the proposed solution in biometric access control and remote verification systems.

Related works

Modern face recognition methods rely mainly on deep convolutional neural networks, which produce stable embeddings with high identification accuracy. One of the first significant solutions was the FaceNet model proposed by Schroff et al. [1], which uses the triplet loss function to minimize intra-class variability and increase inter-class distances. This approach made it possible to build a metric space with high performance on the LFW dataset, but it requires significant computing resources and careful triplet mining, and it does not address anti-spoofing.

Methods with modified margin-based loss functions were developed further. CosFace (Wang et al. [2]) proposed a cosine-normalized feature representation that enlarges the angular margins between classes and thereby improves the separability of embeddings. The method demonstrates high accuracy but is sensitive to normalization parameters. A more advanced approach is implemented in ArcFace (Deng et al. [3]), where an additive angular margin is used to improve the geometry of the feature space and achieve state-of-the-art results on open benchmarks. Based on ArcFace, the InsightFace system [4] was created, which includes optimized IResNet and MobileFaceNet architectures as well as a set of improved loss functions. InsightFace has become one of the most popular face recognition tools, providing high accuracy and robustness on real-world data. However, the lack of built-in anti-spoofing features limits its use in systems with high security requirements.

In parallel, anti-spoofing methods are being actively developed. Texture-based algorithms (LBP, HOG) analyze the artifacts of fake images but lose effectiveness against high-quality attacks. CNN approaches are trained to directly distinguish a live face from a fake one, which requires large training samples. Temporal methods use micro eye movements, facial expressions, and the dynamics of a frame sequence, which improves robustness to photo and video attacks. Depth-based methods build or measure a depth map, providing high accuracy, but they require specialized equipment. Standard datasets such as CASIA-FASD (Zhang et al. [5]), Replay-Attack, and MSU-MFSD enable comparable studies.

Despite the maturity of the FaceNet, CosFace, ArcFace, and InsightFace methods, most works do not integrate liveness detection mechanisms, which reduces resistance to real spoofing threats. In addition, lightweight classifiers such as logistic regression are rarely considered in published studies, although they provide stability, interpretability, and low computational cost when applied to high-quality embeddings.

The system proposed in this paper combines the advantages of the powerful InsightFace embedding model, an efficient classifier based on logistic regression, and a liveness detection module. This hybrid approach ensures high recognition accuracy with low computational requirements and resistance to spoofing attacks, which makes the system suitable for use in real biometric deployments.

Materials and methods

The task solved in this work belongs to the class of real-time biometric identification of a person from a face image. Formally, the task can be defined as a mapping from an input image or video frame to the space of user identities or to a rejection class:

f : R^(H×W×3) → {1, 2, …, K} ∪ {reject},

where H×W×3 is the size of the input color image and K is the number of registered users. Unlike verification tasks, where two images are compared, this paper considers a one-to-many identification mode in which the input image is compared against the entire set of known identities.

A special feature of the problem under consideration is the need to ensure stable operation on a real video stream, which imposes additional restrictions on the computational complexity of the algorithms and the response time of the system. In addition, the system must be robust to changing lighting conditions, face angles, expressions, and partial occlusions, and it must provide protection against image substitution attacks.

To solve the face recognition problem, the developed system uses a combined algorithm that includes embedding extraction with the InsightFace model, Random Forest classification, and a fallback metric method based on cosine similarity. The system is implemented in Python using the OpenCV, scikit-learn, NumPy, and InsightFace libraries.

For the classifier to work, a pre-trained Random Forest model and a label encoder object are loaded, which makes it possible to map class indices to real user identities. Next, the InsightFace (FaceAnalysis) module is initialized, which performs face detection and produces normalized embedding vectors.

After the embedding e is obtained, it is normalized:

ê = e / ||e||₂,

which allows features to be compared in a common metric space and removes the dependence on the scale of the vector.
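As an illustrative sketch (not the authors' code), the L2 normalization step can be written in NumPy; the function name is hypothetical:

```python
import numpy as np

def l2_normalize(embedding: np.ndarray) -> np.ndarray:
    """Scale an embedding vector to unit L2 norm."""
    norm = np.linalg.norm(embedding)
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return embedding / norm

emb = np.array([3.0, 4.0])   # toy 2-D stand-in for a face embedding
unit = l2_normalize(emb)     # [0.6, 0.8], with ||unit|| = 1
```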

To implement the fallback identification mechanism, class centroids are computed in advance. Let the training sample contain N_k embeddings e_i^(k) of the k-th user; then the centroid of the k-th user is defined as

c_k = (1 / N_k) · Σ_{i=1..N_k} e_i^(k).

These centroids are used as reference representations of the classes during fallback recognition.
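A minimal sketch of the centroid computation, assuming the embeddings and integer labels are stored in NumPy arrays. The helper name is hypothetical, and the centroids are additionally re-normalized here (my assumption, consistent with the cosine comparison used later) so that a dot product with a unit embedding directly yields cosine similarity:

```python
import numpy as np

def class_centroids(embeddings: np.ndarray, labels: np.ndarray) -> dict:
    """Mean embedding per class, re-normalized to unit length so that a
    dot product with a unit embedding equals cosine similarity."""
    centroids = {}
    for k in np.unique(labels):
        c = embeddings[labels == k].mean(axis=0)
        centroids[int(k)] = c / np.linalg.norm(c)
    return centroids

# toy training sample: two users, two 2-D embeddings each
X = np.array([[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.9]])
y = np.array([0, 0, 1, 1])
cents = class_centroids(X, y)  # one unit-length centroid per user
```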

The main processing is carried out in a loop, where brightness normalization and resolution reduction are performed for each frame to increase processing speed. InsightFace detects faces and extracts embeddings. Classification is then performed with the Random Forest model, which returns a probability distribution over the classes:

p = (p_1, …, p_K) = RF(ê).

If the maximum probability satisfies the condition

max_k p_k ≥ τ_RF,

where τ_RF is the confidence threshold, the system makes a decision based on Random Forest.

If the probability is below the threshold, fallback identification is performed based on the cosine similarity between the embedding and the centroids:

s_k = cos(ê, c_k) = ê · c_k,

since both vectors are normalized. The result is accepted if the condition

max_k s_k ≥ τ_cos

is met, where τ_cos is the similarity threshold.
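The two-stage decision rule described above can be sketched as follows; the threshold values and function name are illustrative placeholders, not the tuned values used in the paper:

```python
import numpy as np

def identify(probs, embedding, centroids, tau_rf=0.6, tau_cos=0.5):
    """Two-stage decision: trust the classifier when it is confident,
    otherwise fall back to cosine similarity against class centroids."""
    best = int(np.argmax(probs))
    if probs[best] >= tau_rf:
        return best, float(probs[best]), "classifier"
    # embedding and centroids are unit vectors, so the dot product
    # equals cosine similarity
    sims = {k: float(embedding @ c) for k, c in centroids.items()}
    k_best = max(sims, key=sims.get)
    if sims[k_best] >= tau_cos:
        return k_best, sims[k_best], "fallback"
    return None, 0.0, "reject"

cents = {0: np.array([1.0, 0.0]), 1: np.array([0.0, 1.0])}
# confident classifier output -> decided at stage 1
confident = identify(np.array([0.9, 0.1]), np.array([1.0, 0.0]), cents)
# uncertain classifier, embedding close to centroid 1 -> fallback
uncertain = identify(np.array([0.55, 0.45]), np.array([0.0, 1.0]), cents)
```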

To reduce the number of errors caused by short-term changes in image quality, predictions are smoothed over a time window of M frames. For the sequence of the last M labels

y_{t−M+1}, …, y_t,

the final decision is chosen by the majority principle:

ŷ_t = mode(y_{t−M+1}, …, y_t).
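A minimal sketch of the majority-vote smoothing over a sliding window of M labels (class and parameter names are hypothetical):

```python
from collections import Counter, deque

class MajorityVoteSmoother:
    """Keep the last M predicted labels and return the most frequent."""
    def __init__(self, window: int = 5):
        self.history = deque(maxlen=window)

    def update(self, label):
        self.history.append(label)
        return Counter(self.history).most_common(1)[0][0]

smoother = MajorityVoteSmoother(window=5)
for label in ["alice", "alice", "bob", "alice", "alice"]:
    decision = smoother.update(label)
# one spurious "bob" frame does not flip the final decision
```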

Visualization includes drawing a bounding box around the face and a caption with the user's name and confidence level. The frame rate (FPS) is also computed, which makes it possible to monitor system performance in real time.

Thus, the proposed system is a multicomponent recognition method that combines probabilistic classification, metric analysis, and temporal smoothing, ensuring reliable and stable recognition under streaming conditions.

 

Figure 1. The general architecture of the proposed real-time facial recognition system

 

Experimental studies and results

To train and evaluate the developed system, a proprietary dataset of facial images was created, including 150 different users. For each user, 10 source images were collected, obtained in various lighting conditions, angles, and facial expressions. Thus, the initial dataset volume was 1,500 images.

To increase the generalization ability of the model and its robustness to variations in external conditions, a data augmentation procedure was applied, including geometric and photometric transformations. As a result of the augmentation, the amount of training data increased significantly, to about 23 GB. This increased the system's robustness to noise, illumination changes, and minor head turns.

Figure 2. Organizing a training dataset for a facial recognition system

 

One of the key problems of biometric systems is the limited number of source images per user. As part of this work, the initial dataset included 10 images per user, which is not enough to train a stable classifier without additional measures. To solve this problem, advanced data augmentation was applied.

The augmentation procedure included geometric transformations (rotations, scaling, reflections), as well as photometric changes such as varying brightness, contrast, and color balance. As a result, the amount of training data was increased to about 23 GB, which significantly expanded the representation of each user in the feature space.
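As a hedged illustration (not the authors' pipeline), a subset of such transforms (horizontal flip, brightness, and contrast changes) can be expressed in pure NumPy; a real pipeline would typically add rotations and scaling via OpenCV:

```python
import numpy as np

def augment(image: np.ndarray) -> list:
    """Generate simple geometric and photometric variants of one face
    image (H x W x 3, uint8). Only a subset of the transforms described
    in the text; rotations and scaling would typically use OpenCV."""
    variants = []
    variants.append(image[:, ::-1, :])            # horizontal flip
    for gain in (0.7, 1.3):                       # brightness variation
        variants.append(np.clip(image.astype(np.float32) * gain,
                                0, 255).astype(np.uint8))
    mean = image.mean()                           # contrast stretch
    variants.append(np.clip((image.astype(np.float32) - mean) * 1.5 + mean,
                            0, 255).astype(np.uint8))
    return variants

img = np.random.randint(0, 256, (112, 112, 3), dtype=np.uint8)
augmented = augment(img)   # 4 variants from a single source image
```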

Experimental observations showed that augmentation reduces the system's sensitivity to shooting conditions and the number of false rejections when lighting changes. Thus, data augmentation is an important factor in improving the generalization ability of the system and the stability of recognition on real video streams.

The experiments were conducted in a one-to-many identification mode on a real video stream. The system took the webcam image as input and performed the successive stages of liveness detection, face detection, and user identification. In each experiment, the correctness of identity recognition and real-time performance were evaluated.

To prevent attacks using photos, videos, and images displayed on screens, an algorithm for determining user authenticity (liveness detection) has been integrated into the system. This module performs a preliminary check of the input video stream and blocks the facial recognition procedure in case of signs of spoofing.

The experimental results of the module are confirmed by the visual examples in the screenshots, which demonstrate successful detection of image substitution attempts. The integration of the liveness detection mechanism significantly improves system security and expands the scope of its practical application.

The visual results of the liveness detection module are shown in Figure 3.

 

        

Figure 3. Examples of the operation of the user authentication module (liveness detection)

 

In the course of experimental studies, it was found that the proposed system provides 96% recognition accuracy, which confirms the effectiveness of a combined approach that includes probabilistic classification, metric fallback, and temporal smoothing of results.

The system's real-time performance averaged 3.4 frames per second, which is sufficient for interactive biometric applications aimed at access control and user identification.

 

Figure 4. Real-time user identification result with confidence level and frame rate display

 

The quality of the facial recognition system was assessed using standard metrics for biometric identification: accuracy, precision (the share of correct positive predictions), and recall (completeness), defined by the following expressions:

Accuracy = (TP + TN) / (TP + TN + FP + FN),
Precision = TP / (TP + FP),
Recall = TP / (TP + FN),

where TP is the number of correctly recognized users, TN the number of correctly rejected attempts, FP the number of false acceptances, and FN the number of false rejections.
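These expressions, together with the standard definitions FAR = FP / (FP + TN) and FRR = FN / (FN + TP), can be computed as follows; the counts below are illustrative, not the experimental data of the paper:

```python
def biometric_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Standard identification metrics plus FAR and FRR."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "far": fp / (fp + tn),   # share of impostor attempts accepted
        "frr": fn / (fn + tp),   # share of genuine attempts rejected
    }

# illustrative confusion counts, not the paper's experimental data
m = biometric_metrics(tp=96, tn=97, fp=3, fn=4)
```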

Additionally, the false acceptance rate (FAR) and false rejection rate (FRR) were analyzed, which is especially important for biometric security systems.

Experimental studies were conducted on our own dataset, which included 150 users, 10 images each, using advanced data augmentation. The quality assessment was performed in the one-to-many identification mode in a real video stream.

Based on the experimental results, the average recognition accuracy of the system was 96%, which confirms the effectiveness of the proposed combined approach, including Random Forest probabilistic classification and fallback metric identification based on cosine similarity.

System performance is a critical parameter for real-time facial recognition tasks. In this paper, performance was assessed based on the average frame rate (FPS). The obtained FPS value of ≈ 3.4 indicates the possibility of practical application of the system in interactive scenarios such as access control and user authentication.

The main contribution to computational complexity is made by the stages of face detection and embedding extraction using the InsightFace model. Classification using Random Forest and calculation of cosine similarity have significantly lower computational complexity and are not a bottleneck of the system. To improve performance, we applied methods for scaling input frames and skipping part of the video stream frames without significantly reducing recognition accuracy.
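A self-contained sketch of these two optimizations, using pixel striding as a stand-in for cv2.resize and a simple frame-skip counter (the function names are hypothetical):

```python
import numpy as np

def downscale(frame: np.ndarray, factor: int = 2) -> np.ndarray:
    """Cheap resolution reduction by pixel striding (a stand-in for
    cv2.resize in this self-contained sketch)."""
    return frame[::factor, ::factor]

def process_stream(frames, skip: int = 2):
    """Run the (simulated) recognizer only on every `skip`-th frame."""
    processed = []
    for i, frame in enumerate(frames):
        if i % skip != 0:
            continue                      # skip frame to save compute
        processed.append(downscale(frame).shape)
    return processed

frames = [np.zeros((480, 640, 3), dtype=np.uint8) for _ in range(6)]
shapes = process_stream(frames, skip=2)   # 3 of 6 frames processed
```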

Thus, the proposed architecture represents a compromise between accuracy and computational efficiency, which is an important aspect for the practical implementation of biometric systems.

Table 1.

Performance indicators of the facial recognition system

Metric      Value
Accuracy    0.96
Precision   0.95
Recall      0.96
FAR         0.03
FRR         0.04

 

The results show that the system correctly recognizes the vast majority of users, demonstrating low rates of false acceptance and false rejection. The fallback identification mechanism based on cosine similarity reduces the number of misclassifications under poor lighting and partial face occlusion.

Additional temporal smoothing of predictions increases the stability of identification in the video stream, which is confirmed by the visual results of the system. The average performance of 3.4 frames per second makes the algorithm practical for interactive access control systems.

Discussion

The results obtained in the course of the work confirm the effectiveness of the proposed combined approach to real-time face recognition. The use of embeddings extracted by the InsightFace model, combined with Random Forest probabilistic classification and a fallback metric mechanism based on cosine similarity, made it possible to achieve high identification accuracy with a limited number of source images per user. The average recognition accuracy of 96% indicates the correct operation of the system on a real video stream and confirms the soundness of the chosen architecture.

An important factor influencing the stability of recognition is the use of extensive data augmentation. Despite the relatively small initial set of images (10 per user), enlarging the training sample with geometric and photometric transformations significantly improved the generalization ability of the classifier. This is especially evident under changing lighting and partial occlusion of the face, where without augmentation a noticeable drop in accuracy would occur. Thus, data augmentation plays a key role in the practical applicability of the system when data collection resources are limited.

The integration of the user authentication algorithm (liveness detection) significantly expands the scope of the proposed system. Experimental examples have shown that the module is able to effectively detect spoofing attempts using photos and images displayed on the screen of mobile devices. The presence of this mechanism allows us to consider the developed system not only as an identification tool, but also as a component of more complex biometric security systems, where protection against spoofing attacks is a mandatory requirement.

In terms of computational efficiency, the system demonstrates performance on the order of 3.4 frames per second, which is acceptable for interactive scenarios such as access control or user authentication. The main computational load falls on the face detection and embedding extraction stages, while classification and metric comparison do not significantly affect the overall processing time. Frame downscaling and frame skipping made it possible to achieve a balance between speed and accuracy, which is important for practical deployment.

Compared with approaches that use only metric embedding comparison or only probabilistic classification, the proposed solution behaves more stably under uncertainty. The fallback identification mechanism based on cosine similarity correctly handles cases where the Random Forest confidence is insufficient, and temporal smoothing of predictions reduces the impact of the short-term errors typical of streaming data. This multi-level approach increases the reliability of the system without significantly complicating the architecture.

However, the proposed method has a number of limitations. In particular, the system performance is limited by the computing capabilities of the equipment used, and with a significant increase in the number of users, scalability problems with the Random Forest classifier may occur. In addition, recognition efficiency may decrease with severe facial occlusion or when using cameras with low image quality. These aspects determine the directions of further research, including optimization of computing modules and the study of alternative classification architectures.

Overall, the results demonstrate that the proposed facial recognition system is a balanced and practically applicable solution that combines high accuracy, robustness to external factors, and built-in protection against spoofing attacks. The results confirm the promise of combined identification algorithms in applied biometric systems.

Conclusion

In this paper, a real-time facial recognition system was developed and investigated, based on embeddings from the InsightFace model, a Random Forest classifier, and a fallback identification mechanism based on cosine similarity. The proposed architecture is complemented by a liveness detection module, which increases the system's security and resilience to image substitution attacks.

Experimental studies conducted on our own dataset, which includes 150 users, followed by data augmentation, have shown that the developed method provides an average recognition accuracy of 96% with a processing frequency of about 3.4 frames per second. The results obtained confirm the effectiveness of the combined approach when working in conditions of a real video stream and a limited number of source images per user.

The use of the fallback metric mechanism and temporal smoothing of predictions made it possible to reduce the impact of short-term image distortions and increase the stability of identification. The integration of the liveness detection algorithm expands the scope of the system's practical application and makes it suitable for biometric authentication and access control tasks.

The results of the work demonstrate the prospects of the proposed approach and can be used in the further development of computer vision systems and intelligent biometric technologies.

 

References:

  1. Schroff F., Kalenichenko D., Philbin J. FaceNet: A unified embedding for face recognition and clustering // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015. URL: https://arxiv.org/abs/1503.03832
  2. Wang H. et al. CosFace: Large margin cosine loss for deep face recognition. URL: https://arxiv.org/abs/1801.09414
  3. Deng J. et al. ArcFace: Additive angular margin loss for deep face recognition. URL: https://arxiv.org/abs/1801.07698
  4. InsightFace Project. URL: https://github.com/deepinsight/insightface
  5. Zhang Z. et al. CASIA Face Anti-Spoofing Database // 2012 International Conference on Computer Human Interaction. DOI: 10.1109/CHCI.2012.6518181
  6. Wen D., Han H., Jain A. Face Spoof Detection with Image Distortion Analysis // IEEE Transactions on Image Processing. 2015.
  7. Ross A., Nandakumar K., Jain A.K. Handbook of Multibiometrics. Springer-Verlag, 2006.
  8. Kumar V.K.N. Performance of Personal Identification System Technique Using Iris Biometrics Technology // International Journal on Image, Graphics and Signal Processing. 2013. Vol. 5, No. 5. Pp. 63–71.
  9. Chikhaoui A., Djebbar B., Mekki R. New Method for Finding an Optimal Solution to Quadratic Programming Problem // Journal of Applied Sciences. Vol.
  10. Yin Y., Liu L., Sun X. SDUMLA-HMT: A Multimodal Biometric Database // Chinese Conference on Biometric Recognition (CCBR 2011). Springer Berlin Heidelberg, 2011. Pp. 260–268.
  11. Wang N., Lu L., Gao G., Wang F., Li S. Multibiometrics Fusion Using Aczél-Alsina Triangular Norm // KSII Transactions on Internet and Information Systems (TIIS). 2014. Vol. 8, No. 7. Pp. 2420–2433.
Information about the authors

PhD student, Department of Artificial intelligence, Tashkent University of Information Technologies named after Muhammad al-Khwarizmi, Uzbekistan, Tashkent


PhD student, Tashkent University of Information Technologies named after Muhammad al-Khwarizmi, Uzbekistan, Tashkent

