Gene Expression Data-Based Interpretable Machine Learning Framework for Classifying Brain Cancer Subtypes
- Title
- Gene Expression Data-Based Interpretable Machine Learning Framework for Classifying Brain Cancer Subtypes
- Creator
- Kushwah, Virendra Singh; Krishnan, Sivaneasan Bala; Upreti, Kamal; Kshirsagar, Pravin; Kumar, Manoj; Shankar, Uma; Radhakrishnan, Ganesh Veluswamy
- Description
- Early detection, therapeutic stratification, and precision medicine all rely on the precise classification of brain cancer subtypes. To categorize brain tumor subtypes, we examine the application of ensemble machine learning models (Random Forest, XGBoost, and LightGBM) using high-dimensional gene expression data from the GSE50161 dataset (CuMiDa). The top 1000 genes were selected using variance thresholding, and models were then trained and evaluated on a stratified split of the dataset. Although existing works report models with similar accuracies (~95-96%), our framework integrates SHAP-based interpretability to identify biologically significant genes, such as CDK4, EGFR, and TP53, offering dual benefits of high predictive power and explainability. The use of SHAP (SHapley Additive exPlanations) values to assess model predictions and identify physiologically important gene features revealed that key gene probes, including CDK4, EGFR, and TP53, were significant across different tumor subtypes. This study demonstrates how SHAP and interpretable ensemble learning may be used to diagnose brain tumors with excellent classification accuracy and physiologically meaningful gene identification. Published by Oriental Scientific Publishing Company 2025.
- Source
- Biomedical and Pharmacology Journal, Volume 18, Issue 3, pp. 2014-2023
- Date
- 01-01-2025
- Publisher
- Oriental Scientific Publishing Company
- Subject
- Biomarker Identification; Brain Cancer; Cell Lines; Gene-expression; Machine Learning; Microarray Data
- Coverage
- Kushwah V.S., Department of CSE and AI, VIT Bhopal University, Madhya Pradesh, Sehore, India; Krishnan S.B., Department of Electrical and Electronics Engineering, Singapore Institute of Technology, Singapore; Upreti K., Department of Computer Science, Christ University, Delhi NCR Campus, Ghaziabad, India; Kshirsagar P., Department of Electronics and Telecommunication Engineering, J D College of Engineering and Management, Nagpur, India; Kumar M., Department of Mathematics and Statistics, Gurukula Kangri University, Uttarakhand, Haridwar, India; Shankar U., Department of Management and Social Sciences, Qaiwan International University, Kurdistan, Sulaymaniyah, Iraq; Radhakrishnan G.V., Department of Management, Kalinga Institute of Industrial Technology, Bhubaneswar, India
- Rights
- All Open Access; Gold Open Access
- Relation
- ISSN: 0974-6242
- Format
- online
- Language
- English
- Type
- Article
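
The description above mentions selecting the top 1000 genes by variance and evaluating on a stratified split. The following is a minimal sketch of those two preprocessing steps in plain Python, not the authors' implementation. The function names `top_variance_genes` and `stratified_split` are hypothetical, the toy matrix and labels are invented for illustration (they are not taken from GSE50161), and ranking genes by variance to keep the top k is only one plausible reading of "variance thresholding". Model training and SHAP analysis are not shown.

```python
import random
from collections import defaultdict
from statistics import pvariance


def top_variance_genes(matrix, k):
    """Return the column indices of the k highest-variance genes.

    matrix: list of samples, each a list of expression values (one per gene).
    """
    n_genes = len(matrix[0])
    variances = [pvariance([row[g] for row in matrix]) for g in range(n_genes)]
    ranked = sorted(range(n_genes), key=lambda g: variances[g], reverse=True)
    return sorted(ranked[:k])


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices so each class keeps roughly the same proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Keep at least one test sample per class.
        n_test = max(1, round(len(indices) * test_fraction))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy data, invented for illustration: 6 samples x 4 genes, two subtypes.
toy_matrix = [
    [1.0, 5.0, 2.0, 0.1],
    [1.1, 9.0, 2.0, 0.9],
    [0.9, 1.0, 2.0, 0.2],
    [1.0, 7.0, 2.1, 0.8],
    [1.0, 3.0, 2.0, 0.1],
    [1.1, 8.0, 1.9, 0.9],
]
toy_labels = ["A", "B", "A", "B", "A", "B"]

selected = top_variance_genes(toy_matrix, k=2)
train_idx, test_idx = stratified_split(toy_labels, test_fraction=0.34, seed=42)
```

In a real pipeline, `k` would be 1000 as stated in the description, and the selected columns would feed the ensemble classifiers.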
Citation
Kushwah, Virendra Singh; Krishnan, Sivaneasan Bala; Upreti, Kamal; Kshirsagar, Pravin; Kumar, Manoj; Shankar, Uma; Radhakrishnan, Ganesh Veluswamy, “Gene Expression Data-Based Interpretable Machine Learning Framework for Classifying Brain Cancer Subtypes,” CHRIST (Deemed To Be University) Institutional Repository, accessed June 18, 2026, https://archives.christuniversity.in/items/show/23223.
