Explainable Hybrid Deep Learning Framework with Multimodal Inputs for Diabetic Retinopathy Detection

Title: Explainable Hybrid Deep Learning Framework with Multimodal Inputs for Diabetic Retinopathy Detection
Creator: Sahu, Premananda; Kumar, Ashwani; Jain, Rituraj; Upreti, Kamal; Yadav, Dileep Kumar; Radhakrishnan, G.V.
Description: Diabetic Retinopathy (DR) is a leading cause of vision loss, making accurate and interpretable detection critical. This study proposes a hybrid interpretable machine and deep learning framework that integrates multimodal data for enhanced DR severity classification. The model combines unstructured fundus images from EyePACS, Messidor, and APTOS with structured clinical and lifestyle variables such as age, sex, HbA1c, BMI, blood pressure, and diabetes duration. Fundus images undergo preprocessing through resizing, normalization, augmentation, and noise reduction, while clinical data are imputed, normalized, and one-hot encoded. For feature extraction, EfficientNetV2, ResNet50, and Swin Transformer are applied to images, and XGBoost, LightGBM, and TabNet to clinical data. Features are fused via concatenation and attention, followed by classification using Logistic Regression, Random Forest, and MLP. Explainability is provided by Grad-CAM for imaging data and SHAP/LIME for clinical data, supporting clinical interpretability. The proposed model outperformed unimodal baselines, achieving 99.34% accuracy, 98.5% precision, 98.0% recall, 99.0% specificity, 98.2% F1-score, and 0.99 AUC-ROC, with a 10% gain over ResNet50 alone. Performance improvements included a 9% increase in recall and 8% in F1-score, alongside excellent calibration. Confusion matrix analysis confirmed balanced severity detection, and clinicians validated the interpretability outputs. This framework demonstrates robust accuracy, generalization, and clinical applicability for DR screening. © 2026, An-Najah National University. All rights reserved.
Source: An-Najah University Journal for Research - A (Natural Sciences), Volume 40, Issue 3, pp. 319-332
Date: 01-01-2026
Publisher: An-Najah National University
Subject: Diabetic Retinopathy; Explainability; EyePACS; Fundus Image; Grad-CAM; LIME; SHAP
Coverage: Sahu P., School of Computer Science Engineering, Lovely Professional University, Punjab, India; Kumar A., School of Computer Science Engineering and Technology, Bennett University, Greater Noida, India; Jain R., Department of Information Technology, Marwadi University, Gujarat, Rajkot, India; Upreti K., Department of Computer Science, Christ University, Delhi NCR Campus, Ghaziabad, India; Yadav D.K., School of Computer Science Engineering and Technology, Bennett University, Greater Noida, India; Radhakrishnan G.V., Kalinga School of Management, Kalinga Institute of Industrial Technology, Bhubaneswar, India
Rights: Restricted Access; Hardcopy may be available in the library
Relation: ISSN: 1727-2114
Format: online
Language: English
Type: Article
Identifier: https://doi.org/10.35552/anujr.a.40.2.2617
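
Note on the reported metrics: the Description lists accuracy, precision, recall, specificity, and F1-score, but this record does not say how they were computed or averaged across severity grades. The following is a minimal sketch, assuming the standard one-vs-rest definitions from a confusion matrix. The names `BinaryCounts` and `screening_metrics` are hypothetical, and the counts in the usage lines are illustrative only, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class BinaryCounts:
    """Confusion-matrix counts for one class treated as 'positive' (hypothetical helper)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def screening_metrics(c: BinaryCounts) -> dict:
    """Compute the standard metrics named in the Description from raw counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)          # also called sensitivity
    specificity = _ratio(c.tn, c.tn + c.fp)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    f1 = _ratio(2 * precision * recall, precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Illustrative counts only (not results from the study).
example = screening_metrics(BinaryCounts(tp=90, fp=10, tn=190, fn=10))
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

For a multi-grade severity task, one common choice is to compute these per grade and then average them, but whether the authors used macro, micro, or weighted averaging is not stated here.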

https://www.scopus.com/pages/publications/105038182686?origin=resultslist

Collection

Citation

Sahu, Premananda; Kumar, Ashwani; Jain, Rituraj; Upreti, Kamal; Yadav, Dileep Kumar; Radhakrishnan, G.V., “Explainable Hybrid Deep Learning Framework with Multimodal Inputs for Diabetic Retinopathy Detection,” CHRIST (Deemed To Be University) Institutional Repository, accessed July 31, 2026, https://archives.christuniversity.in/items/show/23554.
