Browsing by author "Al-Hashem, Fahaid"
Showing 1 - 12 of 12
Item: Biomarker discovery and development of prognostic prediction model using metabolomic panel in breast cancer patients: a hybrid methodology integrating machine learning and explainable artificial intelligence (Frontiers Media Sa, 2024) Yagin, Fatma Hilal; Gormez, Yasin; Al-Hashem, Fahaid; Ahmad, Irshad; Ahmad, Fuzail; Ardigo, Luca Paolo
Background: Breast cancer (BC) is a significant cause of morbidity and mortality in women. Although the important role of metabolism in the molecular pathogenesis of BC is known, there is still a need for robust metabolomic biomarkers and predictive models that will enable the detection and prognosis of BC. This study aims to identify targeted metabolomic biomarker candidates based on explainable artificial intelligence (XAI) for the specific detection of BC. Methods: Data obtained after targeted metabolomics analyses using plasma samples from BC patients (n = 102) and healthy controls (n = 99) were used. Machine learning (ML) models based on raw data were developed, then feature selection methods were applied, and the results were compared. SHapley Additive exPlanations (SHAP), an XAI method, was used to clinically explain the decisions of the optimal model in BC prediction. Results: The results revealed that variable selection increased the performance of ML models in BC classification, and the optimal model was obtained with the logistic regression (LR) classifier after support vector machine (SVM)-SHAP-based feature selection. SHAP annotations of the LR model revealed that Leucine, isoleucine, L-alloisoleucine, norleucine, and homoserine acids were the most important potential BC diagnostic biomarkers.
Combining the identified metabolite markers provided robust BC classification measures with precision, recall, and specificity of 89.50%, 88.38%, and 83.67%, respectively. Conclusion: This study adds valuable information to the discovery of BC biomarkers and underscores the potential of targeted metabolomics-based diagnostic advances in the management of BC.

Item: Enhancing type 2 diabetes mellitus prediction by integrating metabolomics and tree-based boosting approaches (Frontiers Media Sa, 2024) Arslan, Ahmet Kadir; Yagin, Fatma Hilal; Algarni, Abdulmohsen; Karaaslan, Erol; Al-Hashem, Fahaid; Ardigo, Luca Paolo
Background: Type 2 diabetes mellitus (T2DM) is a global health problem characterized by insulin resistance and hyperglycemia. Early detection and accurate prediction of T2DM are crucial for effective management and prevention. This study explores the integration of machine learning (ML) and explainable artificial intelligence (XAI) approaches based on metabolomics panel data to identify biomarkers and develop predictive models for T2DM. Methods: Metabolomics data from T2DM (n = 31) and healthy controls (n = 34) were analyzed for biomarker discovery (mostly amino acids, fatty acids, and purines) and T2DM prediction. Feature selection was performed using the least absolute shrinkage and selection operator (LASSO) regression to enhance the model's accuracy and interpretability. Three advanced tree-based ML algorithms (KTBoost: Kernel-Tree Boosting; XGBoost: eXtreme Gradient Boosting; NGBoost: Natural Gradient Boosting) were employed to predict T2DM using these biomarkers. The SHapley Additive exPlanations (SHAP) method was used to explain the effects of metabolomics biomarkers on the prediction of the model. Results: The study identified multiple metabolites associated with T2DM, where LASSO feature selection highlighted important biomarkers.
KTBoost [Accuracy: 0.938; CI: (0.880-0.997), Sensitivity: 0.971; CI: (0.847-0.999), Area under the Curve (AUC): 0.965; CI: (0.937-0.994)] demonstrated its effectiveness in using complex metabolomics data for T2DM prediction and achieved better performance than other models. According to KTBoost's SHAP, high levels of phenylactate (pla) and taurine metabolites, as well as low concentrations of cysteine, laspartate, and lcysteate, are strongly associated with the presence of T2DM. Conclusion: The integration of metabolomics profiling and XAI offers a promising approach to predicting T2DM. The use of tree-based algorithms, in particular KTBoost, provides a robust framework for analyzing complex datasets and improves the prediction accuracy of T2DM onset. Future research should focus on validating these biomarkers and models in larger, more diverse populations to solidify their clinical utility.

Item: Explainable Boosting Machines Identify Key Metabolomic Biomarkers in Rheumatoid Arthritis (Mdpi, 2025) Yagin, Fatma Hilal; Colak, Cemil; Algarni, Abdulmohsen; Algarni, Ali; Al-Hashem, Fahaid; Ardigo, Luca Paolo
Background and Objectives: Rheumatoid arthritis (RA) is a chronic autoimmune disease characterised by joint inflammation and pain. Metabolomics approaches, which provide high-throughput profiling of small-molecule metabolites in the plasma or serum of RA patients, have so far supported biomarker discovery in the literature for clinical subgroups, risk factors, and predictors of treatment response using classical statistical approaches or machine learning models. Despite these recent developments, an explainable artificial intelligence (XAI)-based methodology has not been used to identify RA metabolomic biomarkers and distinguish patients with RA. This study constructed an XAI-based EBM model using global plasma metabolomics profiling to identify metabolites predictive of RA patients and to develop a classification model that can distinguish RA patients from healthy controls.
Materials and Methods: Global plasma metabolomics data were analysed from RA patients (49 samples) and healthy individuals (10 samples). The SMOTE technique was used to address class imbalance during data preprocessing. EBM, LightGBM, and AdaBoost algorithms were applied to generate a discriminatory model between RA and controls. Comprehensive performance metrics were calculated, and the interpretability of the optimal model was assessed using global and local feature descriptions. Results: A total of 59 samples were analysed, 49 from RA patients and 10 from healthy subjects. The EBM generated better results than LightGBM and AdaBoost by attaining an AUC of 0.901 (95% CI: 0.847-0.955) with 87.8% sensitivity, which helps prevent false-negative early RA diagnoses. The primary biomarkers identified by EBM-based XAI were N-acetyleucine, pyruvic acid, and glycerol-3-phosphate. EBM global explanation analysis indicated that elevated pyruvic acid levels were significantly correlated with RA, whereas N-acetyleucine exhibited a nonlinear relationship, implying possible protective effects at specific concentrations. Conclusions: This study underscores the promise of XAI and evidence-based medicine methodology in developing biomarkers for RA through metabolomics. The discovered metabolites offer significant insights into RA pathophysiology and may function as diagnostic biomarkers or therapeutic targets. Incorporating EBM methodologies integrated with XAI improves model transparency and increases the therapeutic applicability of predictive models for RA diagnosis/management.
Furthermore, the transparent structure of the EBM model empowers clinicians to understand and verify the reasoning behind each prediction, thereby fostering trust in AI-assisted decision-making and facilitating the integration of metabolomic insights into routine clinical practice.

Item: Identification of a Novel Lipidomic Biomarker for Hepatocyte Carcinoma Diagnosis: Advanced Boosting Machine Learning Techniques Integrated with Explainable Artificial Intelligence (Mdpi, 2025) Yagin, Fatma Hilal; Colak, Cemil; Al-Hashem, Fahaid; Alzakari, Sarah A.; Alhussan, Amel Ali; Aghaei, Mohammadreza
Background: Hepatocellular carcinoma (HCC) is a leading cause of cancer-related mortality worldwide, often diagnosed at late stages due to the limited sensitivity of current screening tools. This study explores whether blood-based lipidomic profiling, combined with explainable artificial intelligence (XAI), can improve early and interpretable detection of HCC. Methods: We analyzed lipidomic data from 219 HCC patients and 219 matched healthy controls using liquid chromatography-mass spectrometry. An Explainable Boosting Machine (EBM) was employed to identify discriminatory lipid biomarkers and was compared against several standard machine learning algorithms. Results: The EBM model achieved superior performance with 87.0% accuracy, 87.7% sensitivity, 86.3% specificity, and an AUC of 91.8%, outperforming other models. Key lipid biomarkers identified included specific phosphatidylcholines (PC 38:2, PC 40:4), sphingomyelins (SM d40:2 B), and lysophosphatidylcholines (LPC 18:2), which exhibited significant alterations in HCC patients and highlighted disruptions in sphingolipid metabolism. Conclusions: Integration of lipidomics with explainable machine learning offers a powerful, transparent approach for HCC biomarker discovery, achieving high diagnostic accuracy while providing biological insights.
This strategy holds promise for developing non-invasive, clinically interpretable screening tools to improve early detection of liver cancer.

Item: Leveraging Explainable Automated Machine Learning (AutoML) and Metabolomics for Robust Diagnosis and Pathophysiological Insights in Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) (Mdpi, 2025) Yagin, Fatma Hilal; Colak, Cemil; Al-Hashem, Fahaid; Alzakari, Sarah A.; Alhussan, Amel Ali; Aghaei, Mohammadreza
Background/Objectives: Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) is a debilitating complex disease with an elusive etiology, lacking objective diagnostic biomarkers. This study leverages advanced Automated Machine Learning (AutoML) to analyze plasma metabolomic and lipidomic profiles for the purpose of ME/CFS detection. Methods: We utilized a publicly available dataset comprising 888 metabolic features from 106 ME/CFS patients and 91 matched controls. Three AutoML frameworks (TPOT, Auto-Sklearn, and H2O AutoML) were benchmarked under identical time constraints. Univariate ROC and PLS-DA analyses with cross-validation, permutation testing, and VIP-based feature selection were applied to standardized, log-transformed omics data to identify significant discriminatory metabolites/lipids and assess their intercorrelations. Results: TPOT significantly outperformed its counterparts, achieving an area under the curve (AUC) of 92.1%, accuracy of 87.3%, sensitivity of 85.8%, and specificity of 89.0%. The PLS-DA model revealed a moderate but statistically significant discrimination between ME/CFS and controls. Explainable artificial intelligence (XAI) via SHAP analysis of the optimal TPOT model identified key metabolites implicating dysregulated pathways in mitochondrial energy metabolism (succinic acid, pyruvic acid, leucine), chronic inflammation (prostaglandin D2, 11,12-EET), gut-brain axis communication (glycocholic acid), and cell membrane integrity (pc(35:2)a).
Conclusions: Our results demonstrate that TPOT-derived models not only provide a highly accurate and robust diagnostic tool but also yield biologically interpretable insights into the pathophysiology of ME/CFS, highlighting their potential for clinical decision support and elucidating novel therapeutic targets.

Item: Metabolomics Biomarker Discovery to Optimize Hepatocellular Carcinoma Diagnosis: Methodology Integrating AutoML and Explainable Artificial Intelligence (Mdpi, 2024) Yagin, Fatma Hilal; El Shawi, Radwa; Algarni, Abdulmohsen; Colak, Cemil; Al-Hashem, Fahaid; Ardigo, Luca Paolo
Background: This study aims to assess the efficacy of combining automated machine learning (AutoML) and explainable artificial intelligence (XAI) in identifying metabolomic biomarkers that can differentiate between hepatocellular carcinoma (HCC) and liver cirrhosis in patients with hepatitis C virus (HCV) infection. Methods: We investigated publicly accessible data encompassing HCC patients and cirrhotic controls. The TPOT tool, which is an AutoML tool, was used to optimize the preparation of features and data, as well as to select the most suitable machine learning model. The TreeSHAP approach, which is a type of XAI, was used to interpret the model by assessing each metabolite's individual contribution to the categorization process. Results: TPOT had superior performance in distinguishing between HCC and cirrhosis compared to the other AutoML approaches (AutoSKlearn and H2O AutoML), as well as to traditional machine learning models such as random forest, support vector machine, and k-nearest neighbor. The TPOT technique attained an AUC value of 0.81, showcasing superior accuracy, sensitivity, and specificity in comparison to the other models. Key metabolites, including L-valine, glycine, and DL-isoleucine, were identified as essential by TPOT and subsequently verified by TreeSHAP analysis.
TreeSHAP provided a comprehensive explanation of the contribution of these metabolites to the model's predictions, thereby increasing the interpretability and dependability of the results. This thorough assessment highlights the strength and reliability of the AutoML framework in the development of clinical biomarkers. Conclusions: This study shows that AutoML and XAI can be used together to create metabolomic biomarkers that are specific to HCC. The exceptional performance of TPOT in comparison to traditional models highlights its capacity to identify biomarkers. Furthermore, TreeSHAP boosted model transparency by highlighting the relevance of certain metabolites. This comprehensive method has the potential to enhance the identification of biomarkers and generate precise, easily understandable, AI-driven solutions for diagnosing HCC.

Item: Pilot-Study to Explore Metabolic Signature of Type 2 Diabetes: A Pipeline of Tree-Based Machine Learning and Bioinformatics Techniques for Biomarkers Discovery (Mdpi, 2024) Yagin, Fatma Hilal; Al-Hashem, Fahaid; Ahmad, Irshad; Ahmad, Fuzail; Alkhateeb, Abedalrhman
Background: This study aims to identify unique metabolomics biomarkers associated with Type 2 Diabetes (T2D) and develop an accurate diagnostics model using tree-based machine learning (ML) algorithms integrated with bioinformatics techniques. Methods: Univariate and multivariate analyses such as fold change, a receiver operating characteristic curve (ROC), and Partial Least-Squares Discriminant Analysis (PLS-DA) were used to identify biomarker metabolites that showed significant concentration in T2D patients. Three tree-based algorithms [eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Adaptive Boosting (AdaBoost)] that demonstrated robustness in high-dimensional data analysis were used to create a diagnostic model for T2D.
Results: As a result of the biomarker discovery process validated with three different approaches, Pyruvate, D-Rhamnose, AMP, pipecolate, Tetradecenoic acid, Tetradecanoic acid, Dodecanediothioic acid, Prostaglandin E3/D3 (isobars), ADP and Hexadecenoic acid were determined as potential biomarkers for T2D. Our results showed that the XGBoost model [accuracy = 0.831, F1-score = 0.845, sensitivity = 0.882, specificity = 0.774, positive predictive value (PPV) = 0.811, negative-PV (NPV) = 0.857 and Area under the ROC curve (AUC) = 0.887] had slightly the highest performance measures. Conclusions: ML integrated with bioinformatics techniques offers accurate and positive T2D candidate biomarker discovery. The XGBoost model can successfully distinguish T2D based on metabolites.

Item: Platelet Metabolites as Candidate Biomarkers in Sepsis Diagnosis and Management Using the Proposed Explainable Artificial Intelligence Approach (Mdpi, 2024) Yagin, Fatma Hilal; Aygun, Umran; Algarni, Abdulmohsen; Colak, Cemil; Al-Hashem, Fahaid; Ardigo, Luca Paolo
Background: Sepsis is characterized by an atypical immune response to infection and is a dangerous health problem leading to significant mortality. Current diagnostic methods exhibit insufficient sensitivity and specificity and require the discovery of precise biomarkers for the early diagnosis and treatment of sepsis. Platelets, known for their hemostatic abilities, also play an important role in immunological responses. This study aims to develop a model integrating machine learning and explainable artificial intelligence (XAI) to identify novel platelet metabolomics markers of sepsis. Methods: A total of 39 participants, 25 diagnosed with sepsis and 14 control subjects, were included in the study. The profiles of platelet metabolites were analyzed using quantitative 1H-nuclear magnetic resonance (NMR) technology. Data were processed using the synthetic minority oversampling method (SMOTE)-Tomek to address the issue of class imbalance.
In addition, missing data were filled using a technique based on random forests. Three machine learning models, namely extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and kernel tree boosting (KTBoost), were used for sepsis prediction. The models were validated using cross-validation. Clinical annotations of the optimal sepsis prediction model were analyzed using SHapley Additive exPlanations (SHAP), an XAI technique. Results: The results showed that the KTBoost model (0.900 accuracy and 0.943 AUC) achieved better performance than the other models in sepsis diagnosis. SHAP results revealed that metabolites such as carnitine, glutamate, and myo-inositol are important biomarkers in sepsis prediction and intuitively explained the prediction decisions of the model. Conclusion: Platelet metabolites identified by the KTBoost model and XAI have significant potential for the early diagnosis and monitoring of sepsis and improving patient outcomes.

Item: Precision Enhanced Bioactivity Prediction of Tyrosine Kinase Inhibitors by Integrating Deep Learning and Molecular Fingerprints Towards Cost-Effective and Targeted Cancer Therapy (Mdpi, 2025) Yagin, Fatma Hilal; Gormez, Yasin; Colak, Cemil; Algarni, Abdulmohsen; Al-Hashem, Fahaid; Ardigo, Luca Paolo
Background and Objective: Dysregulated tyrosine kinase signaling is a central driver of tumorigenesis, metastasis, and therapeutic resistance. While tyrosine kinase inhibitors (TKIs) have revolutionized targeted cancer treatment, identifying compounds with optimal bioactivity remains a critical bottleneck. This study presents a robust machine learning framework, leveraging deep artificial neural networks (dANNs), convolutional neural networks (CNNs), and structural molecular fingerprints, to accurately predict TKI bioactivity, ultimately accelerating the preclinical phase of drug development.
Methods: A curated dataset of 28,314 small molecules from the ChEMBL database targeting 11 tyrosine kinases was analyzed. Using Morgan fingerprints and physicochemical descriptors (e.g., molecular weight, LogP, hydrogen bonding), ten supervised models, including dANN, SVM, CatBoost, and CNN, were trained and optimized through a randomized hyperparameter search. Model performance was evaluated using F1-score, ROC-AUC, precision-recall curves, and log loss. Results: SVM achieved the highest F1-score (87.9%) and accuracy (85.1%), while dANNs yielded the lowest log loss (0.25096), indicating superior probabilistic reliability. CatBoost excelled in ROC-AUC and precision-recall metrics. The integration of Morgan fingerprints significantly improved bioactivity prediction across all models by enhancing structural feature recognition. Conclusions: This work highlights the transformative role of machine learning, particularly dANNs and SVM, in rational drug discovery. By enabling accurate bioactivity prediction, our model pipeline can effectively reduce experimental burden, optimize compound selection, and support personalized cancer treatment design. The proposed framework advances kinase inhibitor screening pipelines and provides a scalable foundation for translational applications in precision oncology.
By enabling early identification of bioactive compounds with favorable pharmacological profiles, the results of this study may support more efficient candidate selection for clinical drug development, particularly with regard to cancer therapy and kinase-associated disorders.

Item: Proposed Comprehensive Methodology Integrated with Explainable Artificial Intelligence for Prediction of Possible Biomarkers in Metabolomics Panel of Plasma Samples for Breast Cancer Detection (Mdpi, 2025) Colak, Cemil; Yagin, Fatma Hilal; Algarni, Abdulmohsen; Algarni, Ali; Al-Hashem, Fahaid; Ardigo, Luca Paolo
Background and objectives: Breast cancer (BC) is the most common type of cancer in women, accounting for more than 30% of new female cancers each year. Although various treatments are available for BC, most cancer-related deaths are due to incurable metastases. Therefore, the early diagnosis and treatment of BC are crucial before metastasis. Mammography and ultrasonography are primarily used in the clinic for the initial identification and staging of BC; these methods are useful for general screening but have limitations in terms of sensitivity and specificity. Omics-based biomarkers, like metabolomics, can make early diagnosis much more accurate, make tracking the disease's progression more accurate, and help make personalized treatment plans that are tailored to each tumor's specific molecular profile. Metabolomics technology is a feasible and comprehensive method for early disease detection and biomarker identification at the molecular level. This research aimed to establish an interpretable predictive artificial intelligence (AI) model using plasma-based metabolomics panel data to identify potential biomarkers that distinguish BC individuals from healthy controls. Method and materials: A cohort of 138 BC patients and 76 healthy controls was studied. Plasma metabolites were examined using LC-TOFMS and GC-TOFMS techniques.
Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Adaptive Boosting (AdaBoost), and Random Forest (RF) were evaluated using performance metrics such as Receiver Operating Characteristic-Area Under the Curve (ROC AUC), accuracy, sensitivity, specificity, and F1 score. ROC and Precision-Recall (PR) curves were generated for comparative analysis. SHapley Additive exPlanations (SHAP) analysis was used to assess the interpretability of the optimal prediction model. Results: The RF algorithm showed improved accuracy (0.963 +/- 0.043) and sensitivity (0.977 +/- 0.051); however, LightGBM achieved the highest ROC AUC (0.983 +/- 0.028). RF also achieved the best Precision-Recall Area under the Curve (PR AUC) at 0.989. SHAP analysis identified glycerophosphocholine and pentosidine as the most significant discriminatory metabolites. Uracil, glutamine, and butyrylcarnitine were also among the significant metabolites. Conclusions: Metabolomics biomarkers and an explainable AI (XAI)-based prediction model showed significant diagnostic accuracy and sensitivity in the detection of BC. The proposed XAI system using interpretable metabolite data can serve as a clinical decision support tool to improve early diagnosis processes.

Item: The REDOX balance in the prefrontal cortex is positively modulated by aerobic exercise and altered by overfeeding (Nature Portfolio, 2025) Silva, Deyvison Guilherme Martins; de Santana, Jonata Henrique; Bernardo, Elenilson Maximino; Fernandes, Matheus Santos de Sousa; Yagin, Fatma Hilal; Al-Hashem, Fahaid; Fernandes, Mariana P.
While obesity rates increase worldwide, physical activity levels are reduced. Obesity and physical inactivity may be inversely related to the production of reactive oxygen species (ROS) and cause oxidative stress in the central nervous system.
In this study, we aimed to investigate the effects of aerobic physical exercise on the oxidative balance of the prefrontal cortex of rats subjected to overnutrition during lactation. For this, male Wistar rats were subjected to overnutrition during lactation from postnatal day 3 to 21. On postnatal day 23, the two groups of animals were subdivided into trained and untrained animals. Trained rats were subjected to a treadmill training protocol for four weeks, five days/week, 60 min/day, at 50% of maximum running capacity. Our findings demonstrate that overnutrition impairs REDOX balance in the prefrontal cortex through increased prooxidants and reduced antioxidant defenses. On the contrary, exercise tends to restore most of these measures to control levels, possibly due to the increase in mRNA levels of Sirt1 and reduction in Il-6 in the prefrontal cortex. Overnutrition causes oxidative stress in the prefrontal cortex, while exercise recovers most of its adverse effects through activating anti-inflammatory mechanisms.

Item: Untargeted Lipidomic Biomarkers for Liver Cancer Diagnosis: A Tree-Based Machine Learning Model Enhanced by Explainable Artificial Intelligence (Mdpi, 2025) Colak, Cemil; Yagin, Fatma Hilal; Algarni, Abdulmohsen; Algarni, Ali; Al-Hashem, Fahaid; Ardigo, Luca Paolo
Background and Objectives: Liver cancer ranks among the leading causes of cancer-related mortality, necessitating the development of novel diagnostic methods. Deregulated lipid metabolism, a hallmark of hepatocarcinogenesis, offers compelling prospects for biomarker identification. This study aims to employ explainable artificial intelligence (XAI) to identify lipidomic biomarkers for liver cancer and to develop a robust predictive model for early diagnosis. Materials and Methods: This study included 219 patients diagnosed with liver cancer and 219 healthy controls. Serum samples underwent untargeted lipidomic analysis with LC-QTOF-MS.
Lipidomic data underwent univariate and multivariate analyses, including fold change (FC), t-tests, PLS-DA, and Elastic Net feature selection, to identify significant biomarker candidate lipids. Machine learning models (AdaBoost, Random Forest, Gradient Boosting) were developed and evaluated utilizing these biomarkers to differentiate liver cancer. The AUC metric was employed to identify the optimal predictive model, whereas SHAP was utilized to achieve interpretability of the model's predictive decisions. Results: Notable alterations in lipid profiles were observed: decreased sphingomyelins (SM d39:2, SM d41:2) and increased fatty acids (FA 14:1, FA 22:2) and phosphatidylcholines (PC 34:1, PC 32:1). AdaBoost exhibited a superior classification performance, achieving an AUC of 0.875. SHAP identified PC 40:4 as the most efficacious lipid for model predictions. The SM d41:2 and SM d36:3 lipids were specifically associated with an increased risk of low-onset cancer and elevated levels of the PC 40:4 lipid. Conclusions: This study demonstrates that untargeted lipidomics, in conjunction with explainable artificial intelligence (XAI) and machine learning, may effectively identify biomarkers for the early detection of liver cancer. The results suggest that alterations in lipid metabolism are crucial to the progression of liver cancer and provide valuable insights for incorporating lipidomics into precision oncology.











