LEARNED IMAGE COMPRESSION WITH MULTI-SCAN BASED CHANNEL FUSION
Yuan Li, Weilian Zhou, Pengfeng Lu, Sei-ichiro Kamata
While classical compression standards rely heavily on scan methods, few learned compression methods make use of scanning. The absence of scanning leads to more encoding bits and a loss of relative information. Moreover, recent works struggle to choose between convolutional neural networks and transformers, since images contain both local redundancy and global semantic information: CNN-based methods suffer from higher distortion, while transformer-based methods require additional encoding bits. To address these problems, in this paper we introduce Multi-Scan based Channel Fusion (MSCF) into the hyper-prior VAE structure to reduce redundant bits. We further propose a novel Residual Local-Global Consolidation (RLGC) module that leverages the recent ConvNeXt and Swin transformer to enhance image quality without additional bits. Experiments show that our model outperforms state-of-the-art methods in terms of PSNR, MS-SSIM, and BD-rate.
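To make the local-global idea concrete, below is a minimal PyTorch sketch of a residual block in the spirit of RLGC: a ConvNeXt-style branch models local redundancy, a simplified self-attention branch (standing in for Swin's windowed attention) captures global context, and the two are fused with a residual skip. All class and parameter names here are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a residual local-global block; not the paper's code.
import torch
import torch.nn as nn


class ConvNeXtBlock(nn.Module):
    """Local branch: depthwise 7x7 conv + LayerNorm + pointwise MLP (ConvNeXt-style)."""
    def __init__(self, dim):
        super().__init__()
        self.dwconv = nn.Conv2d(dim, dim, kernel_size=7, padding=3, groups=dim)
        self.norm = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x):                       # x: (B, C, H, W)
        y = self.dwconv(x).permute(0, 2, 3, 1)  # -> (B, H, W, C) for LayerNorm/MLP
        y = self.mlp(self.norm(y)).permute(0, 3, 1, 2)
        return x + y                            # residual connection


class GlobalAttnBlock(nn.Module):
    """Global branch: plain multi-head self-attention over spatial tokens,
    a simplification of Swin's windowed attention."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                       # x: (B, C, H, W)
        b, c, h, w = x.shape
        t = self.norm(x.flatten(2).transpose(1, 2))  # -> (B, H*W, C)
        y, _ = self.attn(t, t, t)
        return x + y.transpose(1, 2).reshape(b, c, h, w)


class ResidualLocalGlobalBlock(nn.Module):
    """Concatenate local and global branch outputs, fuse with a 1x1 conv, add a skip."""
    def __init__(self, dim):
        super().__init__()
        self.local_branch = ConvNeXtBlock(dim)
        self.global_branch = GlobalAttnBlock(dim)
        self.fuse = nn.Conv2d(2 * dim, dim, kernel_size=1)

    def forward(self, x):
        return x + self.fuse(torch.cat([self.local_branch(x), self.global_branch(x)], dim=1))


if __name__ == "__main__":
    x = torch.randn(1, 64, 16, 16)
    print(ResidualLocalGlobalBlock(64)(x).shape)  # torch.Size([1, 64, 16, 16])
```

The residual skip around the fused branches keeps the block easy to insert into a hyper-prior VAE transform without changing feature dimensions; the actual RLGC design in the paper may differ in branch layout and fusion.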