ICCL: SELF-SUPERVISED INTRA- AND CROSS-MODAL CONTRASTIVE LEARNING WITH 2D-3D PAIRS FOR 3D SCENE UNDERSTANDING

Kyota Higa, Masahiro Yamaguchi, Toshinori Hosoi

Poster 10 Oct 2023

This paper proposes self-supervised intra- and cross-modal contrastive learning (ICCL) with 2D-3D pairs for 3D scene understanding. Learning from multiple modalities has produced substantial results in self-supervised learning. Our method learns a highly transferable model by minimizing contrastive losses computed on 2D, 3D, and 2D-3D features. Whereas a conventional approach minimizes only 3D and 2D-3D contrastive losses, our method additionally minimizes a 2D contrastive loss, which leads to a better feature representation. We evaluate transferability on three downstream tasks: 3D object classification, few-shot object classification, and part segmentation. In 3D object classification, our approach achieves accuracies of 91.7 and 85.4, which are 0.5 and 3.7 points higher than the conventional method. In few-shot object classification and part segmentation, our accuracy is equal to or higher than that of conventional methods. Better feature representations for 2D images and 3D point clouds make transfer learning more accessible, enabling applications in many fields.
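As a rough illustration of the idea of combining a 2D, a 3D, and a 2D-3D contrastive term, the following minimal sketch sums three InfoNCE-style losses over a batch of paired features. It is not the authors' implementation: the InfoNCE form, the equal weighting of the three terms, the temperature value, and the names `info_nce` and `combined_loss` are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def _normalize(vec):
    """Scale a feature vector to unit length (zero vectors are left as-is)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    Row i of `positives` is treated as the positive for row i of `anchors`;
    every other row of `positives` acts as a negative.
    """
    a = [_normalize(v) for v in anchors]
    p = [_normalize(v) for v in positives]
    total = 0.0
    for i, a_i in enumerate(a):
        # Cosine similarities between anchor i and every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(a_i, p_j)) / temperature for p_j in p]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


def combined_loss(img_v1, img_v2, pc_v1, pc_v2, temperature=0.1):
    """Hypothetical combined objective: 2D intra + 3D intra + 2D-3D cross terms.

    img_v1 / img_v2: features of two augmented views of each image (assumed).
    pc_v1 / pc_v2:   features of two augmented views of each point cloud (assumed).
    Equal weighting of the three terms is an assumption, not taken from the paper.
    """
    loss_2d = info_nce(img_v1, img_v2, temperature)    # intra-modal, 2D
    loss_3d = info_nce(pc_v1, pc_v2, temperature)      # intra-modal, 3D
    loss_cross = info_nce(pc_v1, img_v1, temperature)  # cross-modal, 2D-3D
    return loss_2d + loss_3d + loss_cross


if __name__ == "__main__":
    # Toy features: sample i in one view matches sample i in the other.
    feats = [[1.0, 0.0], [0.0, 1.0]]
    swapped = [[0.0, 1.0], [1.0, 0.0]]
    print("aligned pairs:   ", info_nce(feats, feats))
    print("mismatched pairs:", info_nce(feats, swapped))
```

In this sketch, dropping `loss_2d` from the sum would correspond to the conventional setup described above, which minimizes only the 3D and 2D-3D terms. In practice the features would come from trained image and point cloud encoders rather than hand-written lists.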

Tags: contrastive learning, image, point cloud, multi-modality
