AIMNET: ADAPTIVE IMAGE-TAG MERGING NETWORK FOR AUTOMATIC MEDICAL REPORT GENERATION
Jijun Shi, Shanshe Wang, Ronggang Wang, Siwei Ma
SPS
In recent years, medical report generation has received increasing research interest. The goal is to automatically generate long, coherent descriptive paragraphs that detail the observations of normal and abnormal regions in input medical images. Unlike general image captioning, medical report generation is more challenging for data-driven neural models, mainly due to severe visual and textual data biases. To address these problems, we propose an Adaptive Image-Tag Merging Network (AIMNet). AIMNet first predicts disease tags from the input image, then adaptively merges the visual information of the image with the disease information carried by the tags. The resulting disease-oriented visual features better represent abnormal regions of the input image and thus help alleviate the data bias problems. Experiments and analyses on the public MIMIC-CXR and IU-Xray datasets show that AIMNet achieves state-of-the-art results on all metrics, significantly outperforming previous models on MIMIC-CXR and IU-Xray by relative margins of 15.1% and 6.5% in BLEU-4 score, respectively.
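The abstract does not give the fusion equations, but one common way to realize an "adaptive merge" of visual features and tag (disease) embeddings is a learned per-dimension gate that weighs the two sources. The NumPy sketch below is a hypothetical illustration under that assumption; the weights, dimensions, and the `adaptive_merge` function are illustrative and not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adaptive_merge(visual, tag, W, b):
    """Gated fusion (assumed form): a learned gate decides, per feature
    dimension, how much visual vs. disease-tag information enters the
    merged, disease-oriented feature."""
    gate = sigmoid(np.concatenate([visual, tag]) @ W + b)  # shape (d,)
    # Convex combination per dimension: gate -> 1 keeps visual evidence,
    # gate -> 0 keeps tag (disease) evidence.
    return gate * visual + (1.0 - gate) * tag

d = 8                                   # feature dimension (assumed)
visual = rng.normal(size=d)             # image feature vector
tag = rng.normal(size=d)                # embedded disease-tag vector
W = rng.normal(size=(2 * d, d)) * 0.1   # gate projection (hypothetical)
b = np.zeros(d)

merged = adaptive_merge(visual, tag, W, b)
print(merged.shape)
```

Because the gate produces a per-dimension convex combination, each merged component stays between the corresponding visual and tag components, so neither information source can be drowned out entirely; the actual AIMNet module may differ in form and detail.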