SCENE TEXT SEGMENTATION BY PAIRED DATA SYNTHESIS

Quang-Vinh Dang, Guee-Sang Lee

Poster 09 Oct 2023

Scene text segmentation has numerous practical applications. However, the available datasets for scene text segmentation contain too few images to train deep learning-based models effectively, which limits performance. To solve this problem, we address it from two angles: paired data synthesis and methodology. For the former, we propose Text Image-conditional GANs to generate realistic paired data. We exploit real-world images through a self-supervised, inpainting-based pre-training scheme before training the proposed GANs to produce realistic synthetic data. For the latter, we propose a scene text segmentation network, called Multi-task Cascade Transformer, designed to learn effectively from the generated paired data. It comprises two auxiliary tasks and one main task for text segmentation. The two auxiliary tasks learn which text regions to focus on and the structure of the text through its fonts, and they then support the main task. We evaluate on three publicly available scene text segmentation datasets, ICDAR13 FST, Total Text, and TextSeg, to demonstrate the effectiveness of our method. In our experiments, the proposed method outperforms existing methods.
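
The abstract does not say how the inpainting pretext task is set up, so the following is only a minimal sketch of the general idea: hide part of a real image and ask a network to reconstruct it, so that no labels are needed. The helper names `make_inpainting_pair` and `masked_l1`, the single rectangular hole, the zero fill value, the grayscale list-of-lists image, and the L1 reconstruction loss are all illustrative assumptions, not the authors' implementation.

```python
import random


def make_inpainting_pair(image, box_h, box_w, fill=0, rng=None):
    """Hide one rectangle of `image` and return (masked, mask, target).

    `image` is a 2-D list of pixel values. The single-box policy and the
    fill value are assumptions made for illustration only.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    if not (0 < box_h <= height and 0 < box_w <= width):
        raise ValueError("box must fit inside the image")
    # randint is inclusive on both ends, so the box always stays in bounds.
    top = rng.randint(0, height - box_h)
    left = rng.randint(0, width - box_w)
    # mask[r][c] == 1 marks a hidden pixel that must be reconstructed.
    mask = [[1 if top <= r < top + box_h and left <= c < left + box_w else 0
             for c in range(width)] for r in range(height)]
    masked = [[fill if mask[r][c] else image[r][c] for c in range(width)]
              for r in range(height)]
    return masked, mask, image


def masked_l1(prediction, target, mask):
    """Mean absolute error computed over the hidden pixels only."""
    hidden = [(prediction[r][c], target[r][c])
              for r in range(len(mask))
              for c in range(len(mask[0])) if mask[r][c]]
    return sum(abs(p - t) for p, t in hidden) / len(hidden)


# Hypothetical usage: a 4x4 image with a 2x2 hole.
image = [[r * 4 + c + 1 for c in range(4)] for r in range(4)]
masked, mask, target = make_inpainting_pair(image, 2, 2, rng=random.Random(0))
loss_if_unfilled = masked_l1(masked, target, mask)
```

In a real pipeline the masked image would be fed to the network being pre-trained and `masked_l1` (or another reconstruction loss) would compare its output with the untouched original.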

Tags:

Scene Text Segmentation

Paired Data Synthesis

GANs

Transformer

Multi-task Cascade
