SPS
Length: 00:02:10
Retinopathy of prematurity (ROP) is an abnormal proliferative vascular disorder that causes blindness and vision impairment in premature infants. Early diagnosis of ROP is crucial for reducing the blindness rate. However, traditional convolutional neural networks (CNNs) fail to capture the global information in pathology images, while the recently popular vision transformer (ViT) model neglects local details. In addition, ROP lesions occupy a small region of the image, and relatively little data is available. To address these issues, we propose a dual-branch network, a parallel structure composed of ResNet-50 and a multi-axis vision transformer (MaxViT), which captures local and global information about fundus images simultaneously. Multi-scale features are extracted using the Swin Spatial Pyramid Pooling (Swin-SPP) operation in SwinViT blocks. The convolutional block attention module (CBAM) is used to highlight the channel and spatial relationships of ROP staging features. Experimental results on an internal ROP dataset indicate that our method shows promise for ROP staging.
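To make the attention step more concrete, the following is a minimal NumPy sketch of CBAM-style attention (channel attention followed by spatial attention) applied to features from two branches. It is not the authors' implementation. The abstract does not say how the two branches are fused or where CBAM sits in the network, so the channel-wise concatenation in `fuse_branches`, the random stand-in feature maps, the reduction ratio `r = 4`, and all weight values are assumptions chosen only for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Rescale each channel of feat (C, H, W) by a learned gate in (0, 1).

    w1 (C//r, C) and w2 (C, C//r) form the MLP shared by both pooled vectors.
    """
    avg = feat.mean(axis=(1, 2))  # (C,) average-pooled descriptor
    mx = feat.max(axis=(1, 2))    # (C,) max-pooled descriptor

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # linear, ReLU, linear

    scale = sigmoid(mlp(avg) + mlp(mx))      # (C,)
    return feat * scale[:, None, None]


def spatial_attention(feat, kernel):
    """Rescale each spatial position of feat (C, H, W) by a gate in (0, 1).

    kernel (2, k, k) is convolved over the channel-wise average and max maps.
    """
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    height, width = feat.shape[1:]
    logits = np.empty((height, width))
    for i in range(height):
        for j in range(width):
            logits[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return feat * sigmoid(logits)[None, :, :]


def cbam(feat, w1, w2, kernel):
    """Channel attention first, then spatial attention."""
    return spatial_attention(channel_attention(feat, w1, w2), kernel)


def fuse_branches(local_feat, global_feat):
    """ASSUMPTION: fuse the two branches by channel-wise concatenation."""
    return np.concatenate([local_feat, global_feat], axis=0)


rng = np.random.default_rng(0)
# Random stand-ins for the CNN-branch and transformer-branch feature maps.
local_feat = rng.standard_normal((8, 6, 6))
global_feat = rng.standard_normal((8, 6, 6))

fused = fuse_branches(local_feat, global_feat)  # (16, 6, 6)
C, r = fused.shape[0], 4                        # r: assumed reduction ratio
w1 = rng.standard_normal((C // r, C)) * 0.1     # untrained, illustrative weights
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.1   # 7x7 spatial kernel

out = cbam(fused, w1, w2, kernel)
print(out.shape)  # same shape as the fused input
```

Because both gates pass through a sigmoid, the module can only attenuate features, never amplify them. In a trained network the weights would learn to suppress uninformative channels and background regions, which is plausibly why such a module is attractive when the lesion occupies only a small part of the fundus image.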