Let Them Choose What They Want: A Multi-Task CNN Architecture Leveraging Mid-Level Deep Representations for Face Attribute Classification
Zhenduo Chen, Feng Liu, Zhenglai Zhao
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:10:22
Face Attribute Classification (FAC) is an important task in computer vision that aims to predict the facial attributes of a given image. However, deep learning-based FAC methods often overlook both the value of mid-level feature information and the correlations between face attributes. To address these problems, we propose a novel and effective Multi-task CNN architecture. Instead of predicting all 40 attributes together, we propose an attribute grouping strategy that divides the 40 attributes into 8 task groups according to their correlations. Meanwhile, a Fusion Layer fuses mid-level deep representations into the original feature representations so that both jointly predict the face attributes. Furthermore, Task-unique Attention Modules help each task group learn more task-specific feature representations, which improves FAC accuracy. Extensive experiments on the CelebA dataset demonstrate that our method outperforms state-of-the-art FAC methods.
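The data flow described in the abstract can be illustrated with a minimal, framework-free Python sketch. It shows only the bookkeeping: partitioning 40 attribute indices into 8 task groups, fusing two feature vectors, and merging per-group outputs back into one 40-element prediction. Everything specific here is an assumption made for illustration. The abstract does not list the actual correlation-based groups, so `make_placeholder_groups` uses a round-robin split instead, and `fuse` uses plain concatenation because the Fusion Layer's real operation is not specified. All function names are hypothetical.

```python
from typing import List, Optional, Sequence

NUM_ATTRIBUTES = 40  # number of attributes, as stated in the abstract
NUM_GROUPS = 8       # number of task groups, as stated in the abstract


def make_placeholder_groups(
    num_attributes: int = NUM_ATTRIBUTES, num_groups: int = NUM_GROUPS
) -> List[List[int]]:
    """Partition attribute indices into task groups.

    ASSUMPTION: round-robin placeholder. The paper groups attributes by
    correlation; that grouping is not given in the abstract.
    """
    groups: List[List[int]] = [[] for _ in range(num_groups)]
    for idx in range(num_attributes):
        groups[idx % num_groups].append(idx)
    return groups


def fuse(mid_level: Sequence[float], high_level: Sequence[float]) -> List[float]:
    """Combine mid-level and high-level features.

    ASSUMPTION: concatenation stands in for the Fusion Layer.
    """
    return list(mid_level) + list(high_level)


def merge_group_predictions(
    groups: Sequence[Sequence[int]],
    group_outputs: Sequence[Sequence[float]],
    num_attributes: int = NUM_ATTRIBUTES,
) -> List[float]:
    """Scatter each group's outputs back to its attribute positions."""
    merged: List[Optional[float]] = [None] * num_attributes
    for attr_indices, outputs in zip(groups, group_outputs):
        if len(attr_indices) != len(outputs):
            raise ValueError("each group needs one output per attribute")
        for attr_idx, value in zip(attr_indices, outputs):
            merged[attr_idx] = value
    if any(value is None for value in merged):
        raise ValueError("some attributes received no prediction")
    return [float(value) for value in merged if value is not None]
```

In a real model, each group would have its own prediction head (and, per the abstract, its own attention module) operating on the fused features; `merge_group_predictions` only shows how the 8 group outputs map back to the 40 attribute slots.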