Development of a Unimodal RGB-Based Face Spoof Detection System Using Deep Learning with ResNet50

Project leader: Praveen Kumar, PhD, Professor, Astana IT University, Department of Computer Engineering.

Project Goal: To develop a passive, RGB-only face anti-spoofing model based on deep convolutional neural networks (ResNet-50) capable of detecting presentation attacks without additional hardware.

Objectives

Analyze existing face anti-spoofing methods (active, multimodal, RGB-only).
Leverage the large-scale WFAS dataset for robust spoof detection.
Train and evaluate a ResNet-50-based CNN on RGB-only input.
Conduct grid search to optimize hyperparameters.
Perform ablation study on data augmentation (blur, flip).
Measure performance using ISO/IEC 30107-3 metrics: APCER, BPCER, ACER.
Validate model generalization on a held-out test set.
Publish findings and explore future directions like cross-dataset testing and attention-based models.

Results

1. A ResNet-50 CNN model was trained on the WFAS dataset with over 1.3 million images and evaluated using APCER, BPCER, and ACER.

2. The data preprocessing pipeline used for the dataset is described below:

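The evaluation metrics defined in ISO/IEC 30107-3 are computed as follows:

$$\mathrm{APCER} = \frac{FP}{TN + FP}, \qquad \mathrm{BPCER} = \frac{FN}{FN + TP}, \qquad \mathrm{ACER} = \frac{\mathrm{APCER} + \mathrm{BPCER}}{2},$$

where FN, FP, TN, and TP represent false negatives, false positives, true negatives, and true positives, respectively.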
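As an illustration, the three metrics can be computed directly from confusion-matrix counts. The sketch below assumes the common convention in which a live (bona fide) face is the positive class and a spoof is the negative class, so that APCER = FP / (TN + FP), BPCER = FN / (FN + TP), and ACER is their mean, with all attack types pooled together. The function name `pad_metrics` is illustrative and is not taken from the project code.

```python
def pad_metrics(tp, tn, fp, fn):
    """Return (APCER, BPCER, ACER) from confusion-matrix counts.

    Assumed convention: positive = live (bona fide), negative = spoof.
      fp: spoof samples accepted as live
      fn: live samples rejected as spoof
    """
    attacks = tn + fp      # all spoof (attack) presentations
    bona_fide = fn + tp    # all live (bona fide) presentations

    # Guard against empty classes to avoid division by zero.
    apcer = fp / attacks if attacks else 0.0
    bpcer = fn / bona_fide if bona_fide else 0.0
    acer = (apcer + bpcer) / 2
    return apcer, bpcer, acer


if __name__ == "__main__":
    # Made-up counts, for demonstration only.
    apcer, bpcer, acer = pad_metrics(tp=50, tn=75, fp=25, fn=50)
    print(f"APCER={apcer:.2%}  BPCER={bpcer:.2%}  ACER={acer:.2%}")
```

Because ACER averages the two error rates, a lower value is better; a model that accepts every input as live would have BPCER of 0 but APCER of 1, giving an ACER of 0.5.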

3. The live/spoof classification is demonstrated on three examples: live faces, a spoof face, and a mixed photo (containing both live and spoof faces).

4. Grid search identified optimal hyperparameters: learning rate 0.0001, batch size 32, 10 training epochs.

After selecting the best configuration on the development set, we further analyzed the impact of increasing the number of training epochs on the model’s performance. The reported evaluation metrics are APCER, BPCER, and ACER. As shown in Table 2, the optimal configuration was a learning rate of 0.0001 with a batch size of 32, which achieved the lowest ACER, 4.9%, on the validation set.

Figure 1 summarizes the comparison of models trained for different numbers of epochs.
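The grid search over learning rate and batch size described in item 4 can be sketched as follows. This is a minimal illustration rather than the project's actual code: `grid_search` and `toy_evaluate` are hypothetical names, the candidate values in the grid are assumptions, and `toy_evaluate` is a synthetic stand-in for the real step of training ResNet-50 and measuring ACER on the development set.

```python
from itertools import product


def grid_search(evaluate, learning_rates, batch_sizes):
    """Try every (learning rate, batch size) pair and keep the lowest ACER.

    `evaluate(lr, batch_size)` must return the validation ACER
    (lower is better) for a model trained with that configuration.
    """
    best_config, best_acer = None, float("inf")
    for lr, batch_size in product(learning_rates, batch_sizes):
        acer = evaluate(lr, batch_size)
        if acer < best_acer:
            best_config, best_acer = (lr, batch_size), acer
    return best_config, best_acer


def toy_evaluate(lr, batch_size):
    """Synthetic placeholder score, NOT real results.

    In practice this would train the network and return its ACER.
    """
    return abs(lr - 1e-4) * 1000 + abs(batch_size - 32) / 1000


if __name__ == "__main__":
    # Hypothetical grid; the values actually searched are not stated here.
    config, acer = grid_search(toy_evaluate, [1e-3, 1e-4, 1e-5], [16, 32, 64])
    print("best (lr, batch size):", config)
```

Selecting by minimum ACER reflects that ACER is an error rate; the configuration should be chosen on the development set only, so that the test set remains untouched for the final evaluation.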

5. The ablation study showed that data augmentation (blur + flip) significantly improved performance.
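To make the two augmentations concrete, here is a minimal, dependency-free sketch of a flip and a blur with per-augmentation switches of the kind an ablation study toggles. Several details are assumptions, since they are not specified here: the flip is taken to be horizontal, the blur is a simple 3x3 box (mean) filter, each augmentation is applied with probability 0.5, and the image is a single-channel list of rows. A real pipeline would more likely use an image library on RGB tensors.

```python
import random


def horizontal_flip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def box_blur(image, radius=1):
    """Mean filter over a (2*radius+1)^2 window, clipped at the borders."""
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(sum(window) / len(window))
        blurred.append(row)
    return blurred


def augment(image, use_blur=True, use_flip=True, p=0.5, rng=random):
    """Randomly apply the enabled augmentations, each with probability p.

    Setting use_blur / use_flip to False disables that augmentation,
    which is how the ablation variants can be produced.
    """
    if use_flip and rng.random() < p:
        image = horizontal_flip(image)
    if use_blur and rng.random() < p:
        image = box_blur(image)
    return image
```

With both switches off, `augment` returns the image unchanged, which corresponds to the no-augmentation baseline of an ablation.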

The evaluation metrics include APCER, BPCER, and ACER. Table 3 summarizes the results:

6. The final model achieved an ACER of 2.99% on the test set.

The results on the test set are summarized in Table 4.

7. The approach proved suitable for deployment on consumer devices using RGB cameras.

8. One conference publication (Scopus-indexed) was accepted.

Project Team

Praveen Kumar

Project Leader, PhD, Professor

Miras Aliyev

Project Executor, first-year master's student, Astana IT University

Riza Rakhim

Project Executor, first-year bachelor's student, Astana IT University