A Review of Deep Learning Fusion based Multimodal for Disease Diagnosis and Classification.
Abstract
Deep fusion multimodal learning has revolutionized early disease detection and prediction. It integrates data from different sources, such as clinical records, medical imaging, and genomic data, to arrive at optimal results for patients. Recent approaches include Convolutional Neural Networks (CNNs) for feature extraction, Vision Transformers (ViTs) for global attention, self-supervised multimodal transformers, self-attention, cross-attention, and hand-crafted features, which together enable more accurate and efficient predictions. This allows early detection of diseases such as Alzheimer's, and of tumor growth before it becomes visible. Challenges remain, however, in data standardization, interpretability, and clinical validation. We believe that multimodal robustness can be enhanced by developing more interpretable and generalizable models, which would enable accurate monitoring, early disease diagnosis and classification, and disease prediction years before clinical onset.
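As an illustration of the cross-attention fusion mentioned in the abstract, the following is a minimal NumPy sketch and not a method taken from the review itself. It assumes two hypothetical inputs: a set of image tokens (for example, CNN or ViT patch features) and a set of clinical-record tokens. The function name `cross_attention_fuse`, the random projection matrices, the token counts, and the mean-pooling step are all assumptions chosen for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(image_tokens, clinical_tokens, w_q, w_k, w_v):
    """Hypothetical helper: image tokens attend over clinical tokens."""
    q = image_tokens @ w_q                   # queries from the imaging modality
    k = clinical_tokens @ w_k                # keys from the clinical modality
    v = clinical_tokens @ w_v                # values from the clinical modality
    scores = q @ k.T / np.sqrt(q.shape[-1])  # scaled dot-product scores
    weights = softmax(scores, axis=-1)       # each image token's weights sum to 1
    attended = weights @ v                   # clinical context per image token
    # Assumed pooling: average each stream, then concatenate into one vector.
    fused = np.concatenate([image_tokens.mean(axis=0), attended.mean(axis=0)])
    return fused, weights

# Illustrative shapes only: 6 image tokens (dim 8), 3 clinical tokens (dim 5).
rng = np.random.default_rng(0)
d_img, d_clin, d_attn = 8, 5, 4
image_tokens = rng.normal(size=(6, d_img))
clinical_tokens = rng.normal(size=(3, d_clin))
w_q = rng.normal(size=(d_img, d_attn))
w_k = rng.normal(size=(d_clin, d_attn))
w_v = rng.normal(size=(d_clin, d_attn))

fused, weights = cross_attention_fuse(image_tokens, clinical_tokens, w_q, w_k, w_v)
```

In a trained model the projection matrices would be learned and the fused vector would feed a classifier head; here they are random so that only the data flow is shown.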
License
Copyright (c) 2026 Baljit Kaur, Navreet Kaur, Sunaina, Priya Thakur

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.