Towards Generalizable Deepfake Face Forgery Detection With Semi-Supervised Learning and Knowledge Distillation
Yuzhen Lin, Han Chen, Bin Li, Junqiang Wu
SPS
Length: 00:11:44
Driven by the appeal of models applicable in the real world, we investigate how temporal and spatial occlusion affect sign language recognition. Using only a crop of the hands and pose flow, we maintain accuracy comparable to an I3D baseline on the WLASL dataset with a video transformer network (VTN), implying that hand crops may contain enough information for accurate prediction. Moreover, we find that a crop of only the right hand provides enough data to train an accurate model, with accuracy 0.2% below the baseline for AUTSL and 4.7% below across all WLASL datasets. Sampling every fifth frame of a video achieves results comparable to the baseline, with 8-frame sequences performing better for AUTSL (0.4% below baseline) and 16-frame sequences performing better for WLASL (0.2% below for WLASL 100 and 300). Our results indicate the feasibility of using less information for sign language recognition; however, more research is needed to apply these findings in real-world scenarios.