Enhancing type 2 diabetes mellitus prediction by integrating metabolomics and tree-based boosting approaches

dc.contributor.author: Arslan, Ahmet Kadir
dc.contributor.author: Yagin, Fatma Hilal
dc.contributor.author: Algarni, Abdulmohsen
dc.contributor.author: Karaaslan, Erol
dc.contributor.author: Al-Hashem, Fahaid
dc.contributor.author: Ardigo, Luca Paolo
dc.date.accessioned: 2026-04-04T13:31:18Z
dc.date.available: 2026-04-04T13:31:18Z
dc.date.issued: 2024
dc.department: İnönü Üniversitesi
dc.description.abstract: Background: Type 2 diabetes mellitus (T2DM) is a global health problem characterized by insulin resistance and hyperglycemia. Early detection and accurate prediction of T2DM are crucial for effective management and prevention. This study explores the integration of machine learning (ML) and explainable artificial intelligence (XAI) approaches based on metabolomics panel data to identify biomarkers and develop predictive models for T2DM. Methods: Metabolomics data from T2DM patients (n = 31) and healthy controls (n = 34) were analyzed for biomarker discovery (mostly amino acids, fatty acids, and purines) and T2DM prediction. Feature selection was performed using least absolute shrinkage and selection operator (LASSO) regression to enhance the model's accuracy and interpretability. Three advanced tree-based ML algorithms (KTBoost: Kernel-Tree Boosting; XGBoost: eXtreme Gradient Boosting; NGBoost: Natural Gradient Boosting) were employed to predict T2DM using these biomarkers. The SHapley Additive exPlanations (SHAP) method was used to explain the effects of the metabolomics biomarkers on the model's predictions. Results: The study identified multiple metabolites associated with T2DM, with LASSO feature selection highlighting the important biomarkers. KTBoost [accuracy: 0.938, CI: 0.880-0.997; sensitivity: 0.971, CI: 0.847-0.999; area under the curve (AUC): 0.965, CI: 0.937-0.994] handled the complex metabolomics data effectively and outperformed the other two models. According to the SHAP analysis of the KTBoost model, high levels of phenyllactate (pla) and taurine, as well as low concentrations of cysteine, L-aspartate, and L-cysteate, are strongly associated with the presence of T2DM. Conclusion: The integration of metabolomics profiling and XAI offers a promising approach to predicting T2DM. The use of tree-based algorithms, in particular KTBoost, provides a robust framework for analyzing complex datasets and improves the prediction accuracy of T2DM onset. Future research should focus on validating these biomarkers and models in larger, more diverse populations to solidify their clinical utility.
dc.description.sponsorship: University Higher Education Fund under Research Support Program for Central labs at King Khalid University [CL/CO/C/6]
dc.description.sponsorship: The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. The authors extend their appreciation to the University Higher Education Fund for funding this research work under the Research Support Program for Central labs at King Khalid University through project number CL/CO/C/6.
dc.identifier.doi: 10.3389/fendo.2024.1444282
dc.identifier.issn: 1664-2392
dc.identifier.orcid: 0000-0002-9848-7958
dc.identifier.orcid: 0000-0002-7556-958X
dc.identifier.pmid: 39588339
dc.identifier.scopus: 2-s2.0-85210072294
dc.identifier.scopusquality: N/A
dc.identifier.uri: https://doi.org/10.3389/fendo.2024.1444282
dc.identifier.uri: https://hdl.handle.net/11616/108709
dc.identifier.volume: 15
dc.identifier.wos: WOS:001362654500001
dc.identifier.wosquality: Q1
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.indekslendigikaynak: PubMed
dc.language.iso: en
dc.publisher: Frontiers Media SA
dc.relation.ispartof: Frontiers in Endocrinology
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/openAccess
dc.snmz: KA_WOS_20250329
dc.subject: type 2 diabetes
dc.subject: metabolomics
dc.subject: machine learning
dc.subject: explainable artificial intelligence
dc.subject: biomarkers
dc.subject: predictive modeling
dc.title: Enhancing type 2 diabetes mellitus prediction by integrating metabolomics and tree-based boosting approaches
dc.type: Article
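
Note on the reported intervals: the abstract gives KTBoost's accuracy as 0.938 with a confidence interval of 0.880-0.997 but does not say how the interval was computed or at what level. The sketch below is one plausible reading, not the authors' stated method. It assumes a 95% normal-approximation (Wald) interval, and it assumes 61 of the 65 participants were classified correctly, a count inferred from 0.938 x 65 rather than given in the record. The function name `wald_interval` is illustrative.

```python
from statistics import NormalDist


def wald_interval(successes: int, n: int, level: float = 0.95) -> tuple[float, float]:
    """Normal-approximation (Wald) confidence interval for a proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    half_width = z * (p * (1 - p) / n) ** 0.5
    # Clip to [0, 1]: the Wald interval can overshoot near the boundaries.
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Sample sizes from the abstract: 31 T2DM patients and 34 healthy controls.
n_total = 31 + 34
# Assumption: 61 correct classifications, inferred from the reported 0.938.
n_correct = 61

lower, upper = wald_interval(n_correct, n_total)
print(f"accuracy = {n_correct / n_total:.3f}, CI = ({lower:.3f}, {upper:.3f})")
```

Under these assumptions the script prints an accuracy of 0.938 with an interval of (0.880, 0.997), which agrees with the figures in the abstract. With only 65 samples the interval spans almost 12 percentage points, which is in line with the abstract's call for validation in larger populations.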
