PMP-NET: RETHINKING VISUAL CONTEXT FOR SCENE GRAPH GENERATION
Xuezhi Tong, Rui Wang, Chuan Wang, Sanyi Zhang, Xiaochun Cao
Scene graph generation aims to describe the contents of a scene by identifying the objects and their relationships. In previous works, visual context is widely exploited in message passing networks to generate the representations used for classification. However, noisy estimation of the visual context limits model performance. In this paper, we revisit the idea of incorporating visual context via a baseline built on a bidirectional Long Short-Term Memory (biLSTM) over randomly ordered objects, and show that noisy context estimation performs even worse than random ordering. To alleviate this problem, we propose a new method, dubbed Progressive Message Passing Network (PMP-Net), which estimates the visual context in a coarse-to-fine manner. Specifically, we first estimate the visual context with a randomly initialized scene graph, then refine it with multi-head attention. Experimental results on the benchmark dataset Visual Genome show that PMP-Net achieves better or comparable performance on all three tasks: scene graph generation (SGGen), scene graph classification (SGCls), and predicate classification (PredCls).
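To make the two-stage, coarse-to-fine idea concrete, the sketch below shows one plausible wiring: a biLSTM consumes object features in a random order to produce a coarse context estimate, which multi-head attention then refines. The class name `PMPContextSketch`, all layer sizes, and the exact two-stage composition are illustrative assumptions, not the authors' released architecture.

```python
import torch
import torch.nn as nn


class PMPContextSketch(nn.Module):
    """Coarse-to-fine visual context estimation (illustrative sketch).

    Stage 1 (coarse): a biLSTM over object features in a random order.
    Stage 2 (fine): multi-head self-attention refines that estimate.
    Dimensions and wiring are assumptions, not the paper's exact model.
    """

    def __init__(self, feat_dim=512, hidden_dim=256, num_heads=8):
        super().__init__()
        self.bilstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        self.refine = nn.MultiheadAttention(2 * hidden_dim, num_heads,
                                            batch_first=True)

    def forward(self, obj_feats):
        # obj_feats: (batch, num_objects, feat_dim) region features.
        # Coarse pass: shuffle the object order so the biLSTM sees a
        # random sequence rather than a noisy heuristic ordering.
        perm = torch.randperm(obj_feats.size(1), device=obj_feats.device)
        coarse, _ = self.bilstm(obj_feats[:, perm])
        # Undo the shuffle so outputs align with the input objects.
        coarse = coarse[:, torch.argsort(perm)]
        # Fine pass: multi-head attention refines the coarse context.
        refined, _ = self.refine(coarse, coarse, coarse)
        return refined


# Usage: contextualize 10 object proposals with 512-d features.
feats = torch.randn(2, 10, 512)
ctx = PMPContextSketch()(feats)  # -> (2, 10, 512)
```

The random permutation reflects the abstract's baseline finding: since a noisy context ordering is worse than a random one, the coarse stage deliberately avoids any learned ordering and leaves the refinement to attention.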