A DILATED RESIDUAL VISION TRANSFORMER FOR ATRIAL FIBRILLATION DETECTION FROM STACKED TIME-FREQUENCY ECG REPRESENTATIONS
Sawon Pratiher, Apoorva Srivastava, Yedla Bindu Priyatha, Nirmalya Ghosh, Amit Patra
SPS
Length: 00:15:06
Atrial fibrillation (AF), the most frequent type of cardiac arrhythmia, produces no apparent clinical symptoms in most patients, which makes it challenging to diagnose. However, an altered heart rhythm together with the absence of visible P-waves in an electrocardiogram (ECG) is often taken as a sign of AF. ECG signals are therefore often employed for AF prognosis to minimize the risk of stroke, coronary artery disease, and other cardiovascular diseases. This work proposes a new vision transformer (ViT) variant, the Dilated Residual ViT (DiResViT), which replaces the original patchify stem in ViT with a dilated convolutional stem having residual connections, for improved AF detection from an ensemble of ECG time-frequency representations. Dilated convolutions facilitate dense feature representation by capturing multi-scale contextual information and allowing the receptive field to expand exponentially without losing resolution. The residual connections alleviate the vanishing and exploding gradient problem by letting gradients bypass some of the non-linear activation functions, which improves training convergence. Exhaustive experimental validation shows that DiResViT outperforms the prior art in ECG-based AF detection, and the ablation study shows enhanced performance compared to the existing ViT.
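The two mechanisms named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the stem in the paper operates on 2-D time-frequency images, and its kernel sizes and dilation rates are not given here. The function names, the 1-D setting, the kernel size of 3, and the doubling dilation schedule (1, 2, 4, 8) are all assumptions chosen for illustration.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 convolutions, one layer per dilation rate."""
    rf = 1
    for d in dilations:
        # Each stride-1 layer widens the field by (kernel_size - 1) * dilation.
        rf += (kernel_size - 1) * d
    return rf


def dilated_conv1d_same(x, weights, dilation):
    """Stride-1 dilated 1-D cross-correlation with zero padding.

    Assumes an odd kernel length. The output has the same length as the
    input, so no resolution is lost.
    """
    half = (len(weights) - 1) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = i + (j - half) * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(x):            # out-of-range taps read zero padding
                acc += w * x[idx]
        out.append(acc)
    return out


def residual_dilated_block(x, weights, dilation):
    """Hypothetical block: dilated conv, then ReLU, then an identity skip connection."""
    y = dilated_conv1d_same(x, weights, dilation)
    y = [max(0.0, v) for v in y]              # non-linear activation
    return [xi + yi for xi, yi in zip(x, y)]  # skip path bypasses the activation


if __name__ == "__main__":
    # Assumed schedule: doubling dilation rates versus plain convolutions.
    print(receptive_field(3, [1, 2, 4, 8]))  # dilated stack
    print(receptive_field(3, [1, 1, 1, 1]))  # undilated stack of the same depth
    print(residual_dilated_block([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 0.0, 1.0], 2))
```

With doubling dilation rates, the receptive field roughly doubles with each added layer, while an undilated stack of the same depth grows only linearly; the sequence length is unchanged in both cases. The skip connection adds the input back unchanged, so the gradient has a path that does not pass through the ReLU.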