Facial Anti-Spoofing Using “Clue Maps”

Bibliographic Details
Main Authors: Liang Yu Gong, Xue Jun Li, Peter Han Joo Chong
Format: Article
Language: English
Published: MDPI AG 2024-11-01
Series: Sensors
Subjects: anti-spoofing detection, Swin Transformer, ResNet, auto-encoder
Online Access: https://www.mdpi.com/1424-8220/24/23/7635
author Liang Yu Gong
Xue Jun Li
Peter Han Joo Chong
collection DOAJ
description Spoofing attacks (also called presentation attacks) against facial recognition systems are easy to mount, which leaves online financial systems vulnerable. Given the high demand for spoofing attack detection, an anti-spoofing solution with strong generalization ability is urgently needed. Multi-modality methods, such as combining depth images with RGB images, and feature fusion methods can perform well on certain datasets, but the cost of obtaining depth information and physiological signals, especially biological signals, is relatively high. This paper proposes a representation learning method with an Auto-Encoder structure based on Swin Transformer and ResNet, trained under the supervision of cross-entropy loss, semi-hard triplet loss, and Smooth L1 pixel-wise loss. The architecture has three parts: an Encoder, a Decoder, and an auxiliary classifier. The Encoder extracts features that capture the correlations between patches, and the Decoder generates universal “Clue Maps” for further contrastive learning. Finally, the auxiliary classifier assists the model in making its decision by supplying a preliminary result. Extensive experiments evaluate Attack Presentation Classification Error Rate (APCER), Bonafide Presentation Classification Error Rate (BPCER), and Average Classification Error Rate (ACER) on popular spoofing databases (CelebA, OULU, and CASIA-MFSD) against several existing anti-spoofing models; the proposed approach outperforms them, reaching 1.2% and 1.6% ACER in the intra-dataset experiments. In the inter-dataset experiment, training on CASIA-MFSD and testing on Replay-attack reaches a new state-of-the-art performance with 23.8% Half Total Error Rate (HTER).
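The three training losses named in the description can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the record does not give the loss weights, the margin, or the mining strategy, so the `weights` and `margin` values, the helper names, and the rule of returning zero when no semi-hard negative exists are all assumptions made for illustration.

```python
import math

def smooth_l1(pred, target, beta=1.0):
    """Mean Smooth L1 loss over paired pixel values (quadratic below beta, linear above)."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)

def cross_entropy(probs, label):
    """Cross-entropy for one sample, given class probabilities and the true class index."""
    return -math.log(probs[label])

def semi_hard_triplet(anchor, positive, negatives, margin=0.2):
    """Triplet loss using the closest semi-hard negative.

    A negative is semi-hard when d(a, p) < d(a, n) < d(a, p) + margin.
    Returning 0.0 when none exists is a simplification (assumption).
    """
    d_ap = math.dist(anchor, positive)
    semi_hard = [d for d in (math.dist(anchor, n) for n in negatives)
                 if d_ap < d < d_ap + margin]
    if not semi_hard:
        return 0.0
    return max(0.0, d_ap - min(semi_hard) + margin)

def total_loss(ce, triplet, pixel, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three terms; equal weights are a placeholder assumption."""
    w_ce, w_tri, w_pix = weights
    return w_ce * ce + w_tri * triplet + w_pix * pixel

# Toy values only; none of these numbers come from the paper.
ce = cross_entropy([0.5, 0.5], 0)
tri = semi_hard_triplet((0.0, 0.0), (1.0, 0.0), [(1.1, 0.0), (5.0, 0.0)])
pix = smooth_l1([0.5, 2.0], [0.0, 0.0])
loss = total_loss(ce, tri, pix)
```

In the toy call, only the negative at distance 1.1 is semi-hard (it lies between 1.0 and 1.2), so it alone contributes to the triplet term.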
format Article
id doaj-art-acffdcdacee549529a99bedffcb64f53
institution Kabale University
issn 1424-8220
language English
publishDate 2024-11-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj-art-acffdcdacee549529a99bedffcb64f53
2024-12-13T16:32:15Z
eng
MDPI AG
Sensors
1424-8220
2024-11-01
Volume 24, Issue 23, Article 7635
DOI 10.3390/s24237635
Facial Anti-Spoofing Using “Clue Maps”
Liang Yu Gong, Department of Electrical and Electronic Engineering, Auckland University of Technology, Auckland 1010, New Zealand
Xue Jun Li, Department of Electrical and Electronic Engineering, Auckland University of Technology, Auckland 1010, New Zealand
Peter Han Joo Chong, Department of Electrical and Electronic Engineering, Auckland University of Technology, Auckland 1010, New Zealand
https://www.mdpi.com/1424-8220/24/23/7635
anti-spoofing detection
Swin Transformer
ResNet
auto-encoder
title Facial Anti-Spoofing Using “Clue Maps”
topic anti-spoofing detection
Swin Transformer
ResNet
auto-encoder
url https://www.mdpi.com/1424-8220/24/23/7635
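The error rates quoted in the description (APCER, BPCER, ACER, HTER) follow simple formulas, sketched below in Python. The sketch assumes a single attack type, whereas ISO/IEC 30107-3 reports APCER per attack type and takes the worst case; the `PadCounts` class and the sample counts are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class PadCounts:
    """Confusion counts for a presentation attack detection test (hypothetical container)."""
    attacks_total: int
    attacks_accepted: int    # attack presentations wrongly classified as bona fide
    bonafide_total: int
    bonafide_rejected: int   # bona fide presentations wrongly classified as attacks

def apcer(c: PadCounts) -> float:
    """Attack Presentation Classification Error Rate (single attack type assumed)."""
    return c.attacks_accepted / c.attacks_total

def bpcer(c: PadCounts) -> float:
    """Bonafide Presentation Classification Error Rate."""
    return c.bonafide_rejected / c.bonafide_total

def acer(c: PadCounts) -> float:
    """Average Classification Error Rate: the mean of APCER and BPCER."""
    return (apcer(c) + bpcer(c)) / 2

def hter(far: float, frr: float) -> float:
    """Half Total Error Rate: the mean of false acceptance and false rejection rates."""
    return (far + frr) / 2

# Illustrative counts only: 20 of 1000 attacks accepted, 4 of 1000 bona fide rejected.
counts = PadCounts(attacks_total=1000, attacks_accepted=20,
                   bonafide_total=1000, bonafide_rejected=4)
print(f"APCER={apcer(counts):.3f} BPCER={bpcer(counts):.3f} ACER={acer(counts):.3f}")
```

ACER is used for the intra-dataset results and HTER for the cross-dataset result; both are plain averages of two error rates, so a model can reach a low average while one of the two underlying rates remains high.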