MIXED KNOWLEDGE RELATION TRANSFORMER FOR IMAGE CAPTIONING

Tianyu Chen, Zhixin Li, Jiahui Wei, Tiantao Xian

Length: 00:08:23

13 May 2022

Modeling the internal relationships among image objects has contributed significantly to the development of image captioning, especially when combined with the Transformer architecture. Most of these methods compute only the relationships between entities and ignore the information between entities and the background. Moreover, the ways of exploring relational information within an image can be extended further. In this paper, we further explore the relationships between objects from both internal and external perspectives, and embed the image's global information into the internal relationship module. To validate the effectiveness of our model, we conduct extensive experiments on the widely used MSCOCO dataset and achieve state-of-the-art performance on both the online and offline test sets.

Tags: object relation, external knowledge, image captioning

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
