ADDING DISTANCE INFORMATION TO SELF-SUPERVISED LEARNING FOR RICH REPRESENTATIONS
Yeji Kim, Bai-Sun Kong
While contrastive learning has been the mainstream approach in self-supervised learning, negative sampling remains a challenging issue. To avoid it, recent studies instead use a pair of augmented views generated from each image as positive samples and train models by maximizing the similarity between their representation vectors. However, training on positive samples alone may not be enough for models to learn rich representations, because such methods cannot exploit information from different images. To address this issue, we propose a learning method that uses clustered positive samples drawn from different images so that the network can learn better representations. In the first step, our method learns invariant representations from augmented images. Next, the resulting representation vectors are clustered, and the distance to the class centroids is incorporated into the loss to learn richer representations. In our evaluation, the proposed method achieved better performance than models that apply a clustering method from scratch. Moreover, in downstream tasks, our model achieved higher classification accuracy than Barlow Twins under the same experimental settings. Therefore, adding distance information about the class centroids to the loss function of self-supervised learning can improve performance by exploiting rich representations in the data.
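To make the idea concrete, the following is a minimal sketch (not the authors' released code) of how a Barlow Twins-style invariance loss on two augmented views could be combined with a centroid-distance term computed from clusters of the current embeddings. The k-means routine, the number of clusters k, and the trade-off weight beta are illustrative assumptions, not values taken from the paper.

```python
import torch


def barlow_twins_loss(z1, z2, lambda_offdiag=5e-3):
    """Standard Barlow Twins objective on two batches of embeddings of shape (B, D)."""
    z1 = (z1 - z1.mean(0)) / (z1.std(0) + 1e-6)
    z2 = (z2 - z2.mean(0)) / (z2.std(0) + 1e-6)
    c = (z1.T @ z2) / z1.shape[0]                                 # cross-correlation matrix (D, D)
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()                # push diagonal toward 1
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()   # push off-diagonal toward 0
    return on_diag + lambda_offdiag * off_diag


def centroid_distance_loss(z, centroids):
    """Assumed distance term: pull each embedding toward its nearest cluster centroid."""
    d = torch.cdist(z, centroids)        # (B, K) pairwise distances
    nearest = d.min(dim=1).values        # distance to the assigned (closest) centroid
    return nearest.mean()


def kmeans_centroids(z, k=10, iters=10):
    """Tiny k-means on the current embeddings to obtain cluster centroids."""
    centroids = z[torch.randperm(z.shape[0])[:k]].clone()
    for _ in range(iters):
        assign = torch.cdist(z, centroids).argmin(dim=1)
        for j in range(k):
            members = z[assign == j]
            if len(members) > 0:
                centroids[j] = members.mean(0)
    return centroids


if __name__ == "__main__":
    # Stand-in embeddings of two augmented views; in practice these come from the encoder.
    z1, z2 = torch.randn(256, 128), torch.randn(256, 128)
    with torch.no_grad():
        centroids = kmeans_centroids(z1.detach(), k=10)
    beta = 1.0  # assumed trade-off weight between the two terms
    loss = barlow_twins_loss(z1, z2) + beta * centroid_distance_loss(z1, centroids)
```

In this sketch the centroids are recomputed without gradients and only the distance term backpropagates into the encoder, which is one plausible way to realize the two-step scheme (invariance first, then clustering) described in the abstract.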