Multi-Step Test-Time Adaptation With Entropy Minimization and Pseudo-Labeling
Hiroaki Kingetsu, Kenichi Kobayashi, Yoshihiro Okawa, Yasuto Yokota, Katsuhito Nakazawa
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:10:17
Face videos contain abundant structured information and prior knowledge that generative neural networks can exploit to achieve ultra-low-bitrate compression. However, face video compression based on generative neural networks suffers under large head motion, which can easily produce deformed images. In this paper, a dynamic multi-reference prediction method is proposed for generative face video compression. Specifically, a key map is extracted as a compact latent representation of the face image. The key maps of the current frame and of multiple reference frames are used together to estimate multiple dense motion maps. These motion maps are then applied to the corresponding reference frames to generate the final prediction of the current frame. Moreover, the reference frame can be dynamically refreshed during encoding to convert large head motion into relatively small motion. Experimental results show that the proposed method achieves superior compression performance compared with the state-of-the-art VVC standard as well as the latest generative face compression frameworks.
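The dynamic reference refresh described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the refresh decision is made, so everything below is an assumption for illustration: the function names `key_map_distance` and `select_reference_frames`, the mean-absolute-difference motion proxy, the `threshold` and `max_refs` parameters, and the rule of dropping the oldest reference are all hypothetical and are not taken from the paper.

```python
import numpy as np


def key_map_distance(a, b):
    """Mean absolute difference between two key maps (an assumed motion proxy)."""
    return float(np.mean(np.abs(a - b)))


def select_reference_frames(key_maps, threshold=0.5, max_refs=2):
    """Return, for each frame, the indices of the reference frames used to predict it.

    Hypothetical policy: a frame becomes a new reference when its key map is
    farther than `threshold` from every current reference. At most `max_refs`
    references are kept; the oldest is dropped first.
    """
    refs = [0]   # the first frame is always a reference
    plan = [[]]  # an empty list means the frame is coded as a reference itself
    for t in range(1, len(key_maps)):
        dists = [key_map_distance(key_maps[t], key_maps[r]) for r in refs]
        if min(dists) > threshold:
            # Large motion relative to all references: refresh.
            plan.append([])
            refs.append(t)
            refs = refs[-max_refs:]
        else:
            # Small motion relative to at least one reference: predict from all of them.
            plan.append(list(refs))
    return plan


# Toy key maps: frames 0-1 are close together, frames 2-3 follow a large jump.
toy_key_maps = [np.full((4, 4), v) for v in (0.0, 0.1, 1.0, 1.1)]
plan = select_reference_frames(toy_key_maps)
```

In this toy run, frame 2 is far from the only reference and is promoted to a reference, so frame 3 is predicted from frames 0 and 2 and sees only a small motion relative to frame 2. A real codec would additionally estimate a dense motion map per reference and fuse the warped predictions, which this sketch does not attempt.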