USURP: UNIVERSAL SINGLE-SOURCE ADVERSARIAL PERTURBATIONS ON MULTIMODAL EMOTION RECOGNITION
Yin Yin Low, Raphaël C.-W. Phan, Arghya Pal, Xiaojun Chang
The field of affective computing has progressed from traditional unimodal analysis to more complex multimodal analysis, driven by the proliferation of videos posted online. Multimodal learning has shown remarkable performance in emotion recognition tasks, but its robustness in adversarial settings remains largely unknown. This paper investigates the robustness of multimodal emotion recognition models against worst-case adversarial perturbations applied to a single modality. We find that standard multimodal models are susceptible to single-source adversaries and can be fooled by perturbations on any one modality. From this analysis we distill key observations that serve as guidelines for designing universal adversarial attacks on multimodal emotion recognition models. Motivated by these findings, we propose USURP, a novel framework for universal single-source adversarial perturbations on multimodal emotion recognition models. Our analysis of adversarial robustness demonstrates the necessity of studying adversarial attacks on multimodal models, and our experimental results show that USURP achieves high attack success rates and significantly improves adversarial transferability in multimodal settings. The observations and attack methods presented in this paper offer a new understanding of the adversarial robustness of multimodal models, contributing to their safe and reliable deployment in real-world scenarios.
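To make the setting concrete, the sketch below learns one perturbation, shared across all samples, that is added to a single modality of a multimodal emotion classifier. This is a minimal PGD-style universal single-source attack illustrating the threat model, not the authors' USURP method; the model interface, data loader, modality key, and hyperparameters (eps, alpha, epochs) are all illustrative assumptions.

```python
# Minimal sketch of a universal single-source adversarial perturbation,
# assuming a PyTorch multimodal classifier called as model(audio=..., video=..., text=...).
import torch
import torch.nn.functional as F

def universal_single_source_attack(model, loader, modality="audio",
                                   eps=0.05, alpha=0.005, epochs=5,
                                   device="cpu"):
    """Learn a single perturbation delta, reused across all inputs, that is
    added to one modality to maximize the classifier's loss (untargeted)."""
    model.eval()
    for p in model.parameters():           # freeze the victim model
        p.requires_grad_(False)
    delta = None
    for _ in range(epochs):
        for batch in loader:               # batch: {"inputs": {modality: tensor, ...}, "label": tensor}
            inputs = {k: v.to(device) for k, v in batch["inputs"].items()}
            labels = batch["label"].to(device)
            if delta is None:              # match the attacked modality's per-sample shape
                delta = torch.zeros_like(inputs[modality][0], requires_grad=True)
            perturbed = dict(inputs)
            perturbed[modality] = inputs[modality] + delta   # broadcast over the batch
            loss = F.cross_entropy(model(**perturbed), labels)
            loss.backward()
            with torch.no_grad():          # gradient-sign ascent, then L_inf projection
                delta += alpha * delta.grad.sign()
                delta.clamp_(-eps, eps)
            delta.grad.zero_()
    return delta.detach()
```

At test time the returned delta is simply added to the chosen modality of any unseen sample, which is what makes the perturbation "universal": one precomputed pattern attacks the whole input distribution through a single source.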