Proposed Comprehensive Methodology Integrated with Explainable Artificial Intelligence for Prediction of Possible Biomarkers in Metabolomics Panel of Plasma Samples for Breast Cancer Detection

dc.contributor.authorColak, Cemil
dc.contributor.authorYagin, Fatma Hilal
dc.contributor.authorAlgarni, Abdulmohsen
dc.contributor.authorAlgarni, Ali
dc.contributor.authorAl-Hashem, Fahaid
dc.contributor.authorArdigo, Luca Paolo
dc.date.accessioned2026-04-04T13:31:00Z
dc.date.available2026-04-04T13:31:00Z
dc.date.issued2025
dc.departmentİnönü Üniversitesi
dc.description.abstractBackground and objectives: Breast cancer (BC) is the most common type of cancer in women, accounting for more than 30% of new female cancers each year. Although various treatments are available for BC, most cancer-related deaths are due to incurable metastases. Early diagnosis and treatment of BC, before metastasis occurs, are therefore crucial. Mammography and ultrasonography are the primary clinical tools for the initial identification and staging of BC; they are useful for general screening but have limited sensitivity and specificity. Omics-based biomarkers, such as those from metabolomics, can improve the accuracy of early diagnosis and of disease-progression tracking, and can support personalized treatment plans tailored to each tumor's specific molecular profile. Metabolomics is a feasible and comprehensive approach to early disease detection and biomarker identification at the molecular level. This research aimed to establish an interpretable predictive artificial intelligence (AI) model using plasma-based metabolomics panel data to identify potential biomarkers that distinguish individuals with BC from healthy controls. Materials and methods: A cohort of 138 BC patients and 76 healthy controls was studied. Plasma metabolites were measured using LC-TOFMS and GC-TOFMS techniques. Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Adaptive Boosting (AdaBoost), and Random Forest (RF) were evaluated using the area under the receiver operating characteristic curve (ROC AUC), accuracy, sensitivity, specificity, and F1 score. ROC and Precision-Recall (PR) curves were generated for comparative analysis. SHapley Additive exPlanations (SHAP) analysis was applied to the best-performing prediction model for interpretability. Results: The RF algorithm achieved an accuracy of 0.963 ± 0.043 and a sensitivity of 0.977 ± 0.051, whereas LightGBM achieved the highest ROC AUC (0.983 ± 0.028). RF also achieved the best area under the Precision-Recall curve (PR AUC), at 0.989. SHAP analysis identified glycerophosphocholine and pentosidine as the most significant discriminatory metabolites. Uracil, glutamine, and butyrylcarnitine were also among the significant metabolites. Conclusions: Metabolomics biomarkers combined with an explainable AI (XAI)-based prediction model showed high diagnostic accuracy and sensitivity in the detection of BC. The proposed XAI system, which uses interpretable metabolite data, can serve as a clinical decision support tool to improve early diagnosis.
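The performance metrics named in the abstract (accuracy, sensitivity, specificity, F1 score, ROC AUC) and the "mean ± SD" reporting style can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, and the record does not state how the study computed or aggregated its metrics.

```python
from statistics import mean, stdev


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and F1 from binary labels (1 = BC, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # true positive rate (recall)
    specificity = tn / (tn + fp) if tn + fp else 0.0  # true negative rate
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of the AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_sd(values):
    """Summarise a metric over repeated evaluations as (mean, sample SD)."""
    return mean(values), stdev(values)


# Toy data (hypothetical, not from the study): four patients, four controls.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics)
print(f"ROC AUC: {auc:.4f}")
```

Calling `mean_sd` on a list of per-fold values gives a summary in the same "mean ± SD" form as the figures quoted in the abstract, under the assumption that those figures summarise repeated evaluations.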
dc.description.sponsorshipKing Khalid University [RGP1/369/45]; Deanship of Research and Graduate Studies at King Khalid University
dc.description.sponsorshipThe authors extend their appreciation to the Deanship of Research and Graduate Studies at King Khalid University for funding this work through small group research under grant number: RGP1/369/45.
dc.identifier.doi10.3390/medicina61040581
dc.identifier.issn1010-660X
dc.identifier.issn1648-9144
dc.identifier.issue4
dc.identifier.orcid0000-0002-9848-7958
dc.identifier.orcid0000-0001-7677-5070
dc.identifier.orcid0000-0002-7556-958X
dc.identifier.orcid0000-0001-5406-098X
dc.identifier.orcid0009-0002-6718-517X
dc.identifier.pmid40282875
dc.identifier.scopus2-s2.0-105003583391
dc.identifier.scopusqualityQ1
dc.identifier.urihttps://doi.org/10.3390/medicina61040581
dc.identifier.urihttps://hdl.handle.net/11616/108511
dc.identifier.volume61
dc.identifier.wosWOS:001475112800001
dc.identifier.wosqualityQ1
dc.indekslendigikaynakWeb of Science
dc.indekslendigikaynakScopus
dc.indekslendigikaynakPubMed
dc.language.isoen
dc.publisherMdpi
dc.relation.ispartofMedicina-Lithuania
dc.relation.publicationcategoryArticle - International Peer-Reviewed Journal - Institutional Faculty Member
dc.rightsinfo:eu-repo/semantics/openAccess
dc.snmzKA_WOS_20250329
dc.subjectbreast cancer
dc.subjectmetabolomics
dc.subjectbiomarker
dc.subjectmachine learning
dc.subjectexplainable artificial intelligence
dc.titleProposed Comprehensive Methodology Integrated with Explainable Artificial Intelligence for Prediction of Possible Biomarkers in Metabolomics Panel of Plasma Samples for Breast Cancer Detection
dc.typeArticle
