Decoding Breast Cancer Mutational Signatures: A Hybrid ElasticNetXGBoost Approach Using Gene Expression Data
- Title
- Decoding Breast Cancer Mutational Signatures: A Hybrid ElasticNetXGBoost Approach Using Gene Expression Data
- Creator
- Porwal, Omji; Upreti, Kamal; Kshirsagar, Pravin R.; Panwar, Sarika; Sharma, Anurag; Radhakrishnan, Ganesh V.; Jain, Rituraj
- Description
- Somatic mutations in TP53, PIK3CA, and MUC16 are informative for breast cancer progression and prognosis, but direct sequencing-based mutation profiling is not always practicable. Gene expression data can contain indirect transcriptomic patterns linked to underlying mutational states. This paper proposes an expression-based machine learning model to predict mutation status using the METABRIC breast cancer cohort. Instead of directly estimating genetic changes, the proposed method estimates statistical relationships between transcriptomic phenotypes and binary somatic mutation states. A multi-stage gene feature selection pipeline using variance filtering, mutual information ranking, and correlation pruning was used to reduce the number of candidate genes (19,000). A hybrid predictive architecture combining ElasticNet logistic regression and XGBoost was trained on these features, balancing linear regularization against nonlinear interaction modeling. Under five-fold stratified cross-validation, the hybrid model yielded mean ROC-AUC of 0.94 (TP53), 0.92 (PIK3CA), and 0.90 (MUC16), with stable calibration and balanced error rates. Coefficient analysis and SHAP-based explanations were used to investigate model interpretability and to describe how expression patterns relate to mutation status. The proposed framework is a hypothesis-generating, complementary method of transcriptomic analysis that requires external validation to determine its wider generalizability. © 2026, International Journal of Prognostics and Health Management. All rights reserved.
- Source
- International Journal of Prognostics and Health Management, Volume 17, Issue 1
- Date
- 01-01-2026
- Publisher
- Prognostics and Health Management Society
- Coverage
- Porwal O., Faculty of Pharmacy, Qaiwan Research Center, Qaiwan International University, Kurdistan, Sulaymaniyah, 46001, Iran; Upreti K., Department of Computer Science, Christ University, Ghaziabad, 201002, India; Kshirsagar P.R., Department of Electronics & Telecommunication Engineering, J D College of Engineering & Management, Maharashtra, Nagpur, 441501, India; Panwar S., Department of Electronics and Telecommunication Engineering, Academy of Engineering, Pune, India; Sharma A., School of Electrical and Electronic Engineering, Newcastle University, NE1 7RU, Singapore; Radhakrishnan G.V., Kalinga School of Management, Kalinga Institute of Industrial Technology, Bhubaneswar, 751024, India; Jain R., Department of Information Technology, Marwadi University, Gujarat, Rajkot, 360003, India
- Rights
- All Open Access; Gold Open Access
- Relation
- ISSN: 21532648;
- Format
- online
- Language
- English
- Type
- Article
Citation
Porwal, Omji; Upreti, Kamal; Kshirsagar, Pravin R.; Panwar, Sarika; Sharma, Anurag; Radhakrishnan, Ganesh V.; Jain, Rituraj, “Decoding Breast Cancer Mutational Signatures: A Hybrid ElasticNetXGBoost Approach Using Gene Expression Data,” CHRIST (Deemed To Be University) Institutional Repository, accessed June 18, 2026, https://archives.christuniversity.in/items/show/23556.
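
The description names a three-stage gene selection pipeline: variance filtering, mutual information ranking, and correlation pruning. The sketch below illustrates how such stages could be chained, using only the Python standard library and a toy dataset. It is not the authors' implementation: the gene names, the thresholds (`min_var`, `top_k`, `max_abs_corr`), and the median-split mutual information estimate are all assumptions made for illustration, and the ElasticNet and XGBoost modeling step is not covered.

```python
import math
import statistics


def variance_filter(expr, min_var):
    """Keep genes whose population variance across samples is at least min_var."""
    return {g: v for g, v in expr.items() if statistics.pvariance(v) >= min_var}


def mutual_information(values, labels):
    """Mutual information (nats) between a median-binarised gene and a binary label.

    Median binarisation is a simplifying assumption, not the paper's estimator.
    """
    cut = statistics.median(values)
    bins = [1 if v > cut else 0 for v in values]
    n = len(values)
    mi = 0.0
    for b in (0, 1):
        for y in (0, 1):
            joint = sum(1 for bi, yi in zip(bins, labels) if bi == b and yi == y) / n
            if joint > 0:
                p_b = bins.count(b) / n
                p_y = labels.count(y) / n
                mi += joint * math.log(joint / (p_b * p_y))
    return mi


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def select_genes(expr, labels, min_var=0.1, top_k=2, max_abs_corr=0.9):
    """Variance filter -> top-k by mutual information -> greedy correlation pruning."""
    kept = variance_filter(expr, min_var)
    ranked = sorted(kept, key=lambda g: mutual_information(kept[g], labels),
                    reverse=True)[:top_k]
    selected = []
    for g in ranked:
        # Keep a gene only if it is not highly correlated with one already kept.
        if all(abs(pearson(kept[g], kept[s])) < max_abs_corr for s in selected):
            selected.append(g)
    return selected


# Toy data: 8 samples, binary mutation status, 4 hypothetical genes.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
expr = {
    "GENE_A": [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1],   # tracks the label
    "GENE_B": [2.0, 2.4, 1.8, 2.2, 6.0, 6.4, 5.8, 6.2],   # 2 * GENE_A, redundant
    "GENE_C": [5.0, 5.0, 5.0, 5.01, 5.0, 5.0, 5.0, 5.0],  # near-constant
    "GENE_D": [1.0, 3.0, 1.1, 3.1, 0.9, 2.9, 1.2, 3.2],   # variable, unrelated to label
}
selected = select_genes(expr, labels)
print(selected)  # ['GENE_A']
```

In this toy run each stage removes one gene: `GENE_C` fails the variance filter, `GENE_D` falls outside the top-k mutual information ranking, and `GENE_B` is pruned as redundant with `GENE_A`. In a cross-validated setting, selection of this kind would normally be fitted inside each training fold to avoid leaking label information into the held-out fold.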
