Pilot-Study to Explore Metabolic Signature of Type 2 Diabetes: A Pipeline of Tree-Based Machine Learning and Bioinformatics Techniques for Biomarkers Discovery

dc.authorid: AL-Hashem, Fahaid/0000-0001-5795-9966
dc.authorid: AHMAD, IRSHAD/0000-0003-0077-3065
dc.authorid: Yagin, Fatma Hilal/0000-0002-9848-7958
dc.authorid: Ahmad, Fuzail/0000-0002-1189-2206
dc.authorid: Alkhateeb, Abedalrhman/0000-0002-1751-7570
dc.authorid: Ahmad, Irshad/0000-0002-6012-9207
dc.authorwosid: AL-Hashem, Fahaid/GPS-8057-2022
dc.authorwosid: AHMAD, IRSHAD/R-4469-2018
dc.authorwosid: Yagin, Fatma Hilal/ABI-8066-2020
dc.contributor.author: Yagin, Fatma Hilal
dc.contributor.author: Al-Hashem, Fahaid
dc.contributor.author: Ahmad, Irshad
dc.contributor.author: Ahmad, Fuzail
dc.contributor.author: Alkhateeb, Abedalrhman
dc.date.accessioned: 2024-08-04T20:56:04Z
dc.date.available: 2024-08-04T20:56:04Z
dc.date.issued: 2024
dc.department: İnönü Üniversitesi [en_US]
dc.description.abstract: Background: This study aims to identify unique metabolomics biomarkers associated with Type 2 Diabetes (T2D) and to develop an accurate diagnostic model using tree-based machine learning (ML) algorithms integrated with bioinformatics techniques. Methods: Univariate and multivariate analyses, namely fold change, receiver operating characteristic (ROC) curve analysis, and Partial Least-Squares Discriminant Analysis (PLS-DA), were used to identify biomarker metabolites whose concentrations differed significantly in T2D patients. Three tree-based algorithms that have demonstrated robustness in high-dimensional data analysis [eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Adaptive Boosting (AdaBoost)] were used to create a diagnostic model for T2D. Results: The biomarker discovery process, validated with three different approaches, identified Pyruvate, D-Rhamnose, AMP, pipecolate, Tetradecenoic acid, Tetradecanoic acid, Dodecanediothioic acid, Prostaglandin E3/D3 (isobars), ADP and Hexadecenoic acid as potential biomarkers for T2D. The XGBoost model [accuracy = 0.831, F1-score = 0.845, sensitivity = 0.882, specificity = 0.774, positive predictive value (PPV) = 0.811, negative predictive value (NPV) = 0.857 and area under the ROC curve (AUC) = 0.887] had slightly higher performance measures than the other models. Conclusions: ML integrated with bioinformatics techniques offers an accurate and promising approach to discovering candidate T2D biomarkers. The XGBoost model can successfully distinguish T2D based on metabolites. [en_US]
dc.description.sponsorship: Deanship of Scientific Research, King Khalid University, Kingdom of Saudi Arabia [en_US]
dc.description.sponsorship: No Statement Available [en_US]
dc.identifier.doi: 10.3390/nu16101537
dc.identifier.issn: 2072-6643
dc.identifier.issue: 10 [en_US]
dc.identifier.pmid: 38794775 [en_US]
dc.identifier.scopus: 2-s2.0-85194219122 [en_US]
dc.identifier.scopusquality: Q1 [en_US]
dc.identifier.uri: https://doi.org/10.3390/nu16101537
dc.identifier.uri: https://hdl.handle.net/11616/102017
dc.identifier.volume: 16 [en_US]
dc.identifier.wos: WOS:001231595400001 [en_US]
dc.identifier.wosquality: N/A [en_US]
dc.indekslendigikaynak: Web of Science [en_US]
dc.indekslendigikaynak: Scopus [en_US]
dc.indekslendigikaynak: PubMed [en_US]
dc.language.iso: en [en_US]
dc.publisher: MDPI [en_US]
dc.relation.ispartof: Nutrients [en_US]
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: type 2 diabetes [en_US]
dc.subject: biomarker discovery [en_US]
dc.subject: metabolomics [en_US]
dc.subject: machine learning [en_US]
dc.subject: bioinformatics [en_US]
dc.title: Pilot-Study to Explore Metabolic Signature of Type 2 Diabetes: A Pipeline of Tree-Based Machine Learning and Bioinformatics Techniques for Biomarkers Discovery [en_US]
dc.type: Article [en_US]
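
The performance measures quoted in the abstract are linked by standard definitions: the F1-score is the harmonic mean of PPV (precision) and sensitivity (recall). As a rough consistency check, and not a reproduction of the authors' pipeline, the sketch below recomputes F1 from the PPV and sensitivity reported for the XGBoost model. The function and variable names are hypothetical, chosen only for this illustration.

```python
def f1_from_ppv_and_sensitivity(ppv: float, sensitivity: float) -> float:
    """Return the harmonic mean of PPV (precision) and sensitivity (recall)."""
    if ppv + sensitivity == 0:
        # Avoid division by zero when both inputs are zero.
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Values reported for the XGBoost model in the abstract above.
reported_ppv = 0.811
reported_sensitivity = 0.882
reported_f1 = 0.845

recomputed_f1 = f1_from_ppv_and_sensitivity(reported_ppv, reported_sensitivity)
print(f"recomputed F1 = {recomputed_f1:.3f}, reported F1 = {reported_f1:.3f}")
```

Rounded to three decimals, the recomputed value agrees with the reported F1-score of 0.845, so the three figures are mutually consistent.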
