Cancer Metastasis Prediction and Genomic Biomarker Identification through Machine Learning and eXplainable Artificial Intelligence in Breast Cancer Research

dc.authorid: ÇOLAK, CEMİL/0000-0001-5406-098X
dc.authorid: Kadry, Seifedine/0000-0002-1939-4842
dc.authorid: Yagin, Fatma Hilal/0000-0002-9848-7958
dc.authorid: İnceoğlu, Feyza/0000-0003-1453-0937
dc.authorid: Yagin, Burak/0000-0001-6687-979X
dc.authorwosid: ÇOLAK, CEMİL/ABI-3261-2020
dc.authorwosid: Kadry, Seifedine/C-7437-2011
dc.authorwosid: Yagin, Fatma Hilal/ABI-8066-2020
dc.authorwosid: İnceoğlu, Feyza/GVK-2847-2022
dc.contributor.author: Yagin, Burak
dc.contributor.author: Yagin, Fatma Hilal
dc.contributor.author: Colak, Cemil
dc.contributor.author: Inceoglu, Feyza
dc.contributor.author: Kadry, Seifedine
dc.contributor.author: Kim, Jungeun
dc.date.accessioned: 2024-08-04T20:54:49Z
dc.date.available: 2024-08-04T20:54:49Z
dc.date.issued: 2023
dc.department: İnönü University [en_US]
dc.description.abstract: Aim: This research presents a model combining machine learning (ML) techniques and eXplainable artificial intelligence (XAI) to predict breast cancer (BC) metastasis and reveal important genomic biomarkers in patients with metastasis. Method: A total of 98 primary BC samples were analyzed, comprising 34 samples from patients who developed distant metastases within a 5-year follow-up period and 44 samples from patients who remained disease-free for at least 5 years after diagnosis. Genomic data were subjected to biostatistical analysis, followed by elastic net feature selection, which identified a restricted number of genomic biomarkers associated with BC metastasis. Light gradient boosting machine (LightGBM), categorical boosting (CatBoost), extreme gradient boosting (XGBoost), gradient boosting trees (GBT), and adaptive boosting (AdaBoost) algorithms were used for prediction. To assess the models' predictive abilities, accuracy, F1 score, precision, recall, area under the ROC curve (AUC), and Brier score were calculated as performance evaluation metrics. To promote interpretability and overcome the black box problem of ML models, the SHapley Additive exPlanations (SHAP) method was employed. Results: The LightGBM model outperformed the other models, yielding an accuracy of 96% and an AUC of 99.3%. In addition to the biostatistical evaluation, the XAI-based SHAP results showed that increased expression levels of TSPYL5, ATP5E, CA9, NUP210, SLC37A1, ARIH1, PSMD7, UBQLN1, PRAME, and UBE2T (p <= 0.05) were associated with an increased incidence of BC metastasis. Decreased expression levels of CACTIN, TGFB3, SCUBE2, ARL4D, OR1F1, ALDH4A1, PHF1, and CROCC (p <= 0.05) were also found to increase the risk of metastasis in BC. Conclusion: The findings of this study may help prevent disease progression and metastases and potentially improve clinical outcomes by informing customized treatment approaches for BC patients. [en_US]
dc.description.sponsorship: Technology Development Program of MSS [S3033853]; Kongju National University [en_US]
dc.description.sponsorship: This research was partly supported by the Technology Development Program of MSS (No. S3033853) and by the research grant of the Kongju National University in 2023. [en_US]
dc.identifier.doi: 10.3390/diagnostics13213314
dc.identifier.issn: 2075-4418
dc.identifier.issue: 21 [en_US]
dc.identifier.pmid: 37958210 [en_US]
dc.identifier.scopus: 2-s2.0-85176353499 [en_US]
dc.identifier.scopusquality: Q2 [en_US]
dc.identifier.uri: https://doi.org/10.3390/diagnostics13213314
dc.identifier.uri: https://hdl.handle.net/11616/101666
dc.identifier.volume: 13 [en_US]
dc.identifier.wos: WOS:001100253500001 [en_US]
dc.identifier.wosquality: Q1 [en_US]
dc.indekslendigikaynak: Web of Science [en_US]
dc.indekslendigikaynak: Scopus [en_US]
dc.indekslendigikaynak: PubMed [en_US]
dc.language.iso: en [en_US]
dc.publisher: MDPI [en_US]
dc.relation.ispartof: Diagnostics [en_US]
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institutional Faculty Member [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: breast cancer metastasis [en_US]
dc.subject: machine learning algorithms [en_US]
dc.subject: genomic biomarkers [en_US]
dc.subject: eXplainable artificial intelligence [en_US]
dc.subject: SHAP [en_US]
dc.title: Cancer Metastasis Prediction and Genomic Biomarker Identification through Machine Learning and eXplainable Artificial Intelligence in Breast Cancer Research [en_US]
dc.type: Article [en_US]
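
The workflow summarized in the abstract (elastic net feature selection, a gradient boosting classifier, and evaluation with accuracy, F1, precision, recall, AUC, and Brier score) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic (scikit-learn's make_classification), the sample and feature counts, the 20-feature cutoff, and all hyperparameters are hypothetical, and scikit-learn's GradientBoostingClassifier stands in for the boosting models named in the abstract. The SHAP step is omitted because it needs the separate shap package.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, brier_score_loss, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a gene expression matrix (NOT the study's data):
# rows are samples, columns are hypothetical expression features.
X, y = make_classification(n_samples=200, n_features=500, n_informative=15,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Elastic-net-penalized logistic regression used as a feature selector.
# threshold=-inf with max_features keeps the 20 features with the largest
# absolute coefficients (20 is an arbitrary, illustrative cutoff).
elastic_net = LogisticRegression(penalty="elasticnet", solver="saga",
                                 l1_ratio=0.5, C=0.1, max_iter=5000,
                                 random_state=0)
selector = SelectFromModel(elastic_net, max_features=20, threshold=-np.inf)

# Scaling, selection, and the classifier are fitted on the training split
# only, so the test split does not leak into feature selection.
model = make_pipeline(StandardScaler(), selector,
                      GradientBoostingClassifier(random_state=0))
model.fit(X_train, y_train)

proba = model.predict_proba(X_test)[:, 1]  # predicted probability of class 1
pred = model.predict(X_test)

metrics = {
    "accuracy": accuracy_score(y_test, pred),
    "f1": f1_score(y_test, pred),
    "precision": precision_score(y_test, pred),
    "recall": recall_score(y_test, pred),
    "auc": roc_auc_score(y_test, proba),
    "brier": brier_score_loss(y_test, proba),  # lower is better
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Keeping the selector inside the pipeline is a deliberate choice in this sketch: selecting features on the full dataset before splitting tends to inflate test metrics, which matters when the number of features far exceeds the number of samples.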
