Explainable Hybrid Deep Learning Framework with Multimodal Inputs for Diabetic Retinopathy Detection
- Title
- Explainable Hybrid Deep Learning Framework with Multimodal Inputs for Diabetic Retinopathy Detection
- Creator
- Sahu, Premananda; Kumar, Ashwani; Jain, Rituraj; Upreti, Kamal; Yadav, Dileep Kumar; Radhakrishnan, G.V.
- Description
- Diabetic Retinopathy (DR) is a leading cause of vision loss, making accurate and interpretable detection critical. This study proposes a hybrid interpretable machine and deep learning framework that integrates multimodal data for enhanced DR severity classification. The model combines unstructured fundus images from EyePACS, Messidor, and APTOS with structured clinical and lifestyle variables such as age, sex, HbA1c, BMI, blood pressure, and diabetes duration. Fundus images are preprocessed by resizing, normalization, augmentation, and noise reduction, while clinical data are imputed, normalized, and one-hot encoded. For feature extraction, EfficientNetV2, ResNet50, and Swin Transformer are applied to images, and XGBoost, LightGBM, and TabNet to clinical data. Features are fused via concatenation and attention, followed by classification using Logistic Regression, Random Forest, and MLP. Explainability is provided by Grad-CAM for imaging data and SHAP/LIME for clinical data, supporting clinical interpretability. The proposed model outperformed unimodal baselines, achieving 99.34% accuracy, 98.5% precision, 98.0% recall, 99.0% specificity, 98.2% F1-score, and 0.99 AUC-ROC, with a 10% gain over ResNet50 alone. Performance improvements included a 9% increase in recall and 8% in F1-score, alongside excellent calibration. Confusion matrix analysis confirmed balanced severity detection, and clinicians validated the interpretability outputs. This framework demonstrates robust accuracy, generalization, and clinical applicability for DR screening. © 2026, An-Najah National University. All rights reserved.
- Source
- An-Najah University Journal for Research - A (Natural Sciences); Volume 40; Issue 3; pp. 319-332
- Date
- 01-01-2026
- Publisher
- An-Najah National University
- Subject
- Diabetic Retinopathy; Explainability; EyePACS; Fundus Image; Grad-CAM; LIME; SHAP
- Coverage
- Sahu P., School of Computer Science Engineering, Lovely Professional University, Punjab, India; Kumar A., School of Computer Science Engineering and Technology, Bennett University, Greater Noida, India; Jain R., Department of Information Technology, Marwadi University, Gujarat, Rajkot, India; Upreti K., Department of Computer Science, Christ University, Delhi NCR Campus, Ghaziabad, India; Yadav D.K., School of Computer Science Engineering and Technology, Bennett University, Greater Noida, India; Radhakrishnan G.V., Kalinga School of Management, Kalinga Institute of Industrial Technology, Bhubaneswar, India
- Rights
- Restricted Access; Hardcopy may be available in the library
- Relation
- ISSN: 1727-2114
- Format
- online
- Language
- English
- Type
- Article
Citation
Sahu, Premananda; Kumar, Ashwani; Jain, Rituraj; Upreti, Kamal; Yadav, Dileep Kumar; Radhakrishnan, G.V., “Explainable Hybrid Deep Learning Framework with Multimodal Inputs for Diabetic Retinopathy Detection,” CHRIST (Deemed To Be University) Institutional Repository, accessed June 18, 2026, https://archives.christuniversity.in/items/show/23554.
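
Note on the reported metrics
The Description lists accuracy, precision, recall, specificity, and F1-score. As a reading aid, the minimal sketch below shows the standard definitions of these quantities for a single severity grade treated one-vs-rest. It is not the authors' evaluation code: the function name `binary_metrics` is hypothetical, and the counts in the usage line are toy values, not data from the article.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, fn, tn):
    """Compute standard metrics from one-vs-rest confusion-matrix counts.

    tp, fp, fn, tn are the true-positive, false-positive, false-negative,
    and true-negative counts for one class (hypothetical inputs).
    """
    precision = _ratio(tp, tp + fp)      # share of predicted positives that are correct
    recall = _ratio(tp, tp + fn)         # share of actual positives that are found
    specificity = _ratio(tn, tn + fp)    # share of actual negatives that are rejected
    accuracy = _ratio(tp + tn, tp + fp + fn + tn)
    f1 = _ratio(2 * precision * recall, precision + recall)  # harmonic mean
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Toy counts for illustration only; they are not taken from the article.
metrics = binary_metrics(tp=8, fp=2, fn=2, tn=8)
```

For a multi-class severity task, such per-class values would typically be averaged across grades; the record does not state which averaging the article used.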
