Browse by author "Gormez, Yasin"
Showing 1 - 8 of 8
Item: Biomarker discovery and development of prognostic prediction model using metabolomic panel in breast cancer patients: a hybrid methodology integrating machine learning and explainable artificial intelligence (Frontiers Media SA, 2024)
Yagin, Fatma Hilal; Gormez, Yasin; Al-Hashem, Fahaid; Ahmad, Irshad; Ahmad, Fuzail; Ardigo, Luca Paolo
Background: Breast cancer (BC) is a significant cause of morbidity and mortality in women. Although the important role of metabolism in the molecular pathogenesis of BC is known, there is still a need for robust metabolomic biomarkers and predictive models that will enable the detection and prognosis of BC. This study aims to identify targeted metabolomic biomarker candidates based on explainable artificial intelligence (XAI) for the specific detection of BC. Methods: Data obtained from targeted metabolomics analyses of plasma samples from BC patients (n = 102) and healthy controls (n = 99) were used. Machine learning (ML) models were first developed on the raw data, then feature selection methods were applied, and the results were compared. SHapley Additive exPlanations (SHAP), an XAI method, was used to clinically explain the decisions of the optimal model in BC prediction. Results: Variable selection increased the performance of ML models in BC classification, and the optimal model was obtained with the logistic regression (LR) classifier after support vector machine (SVM)-SHAP-based feature selection. SHAP annotations of the LR model revealed that leucine, isoleucine, L-alloisoleucine, norleucine, and homoserine acids were the most important potential BC diagnostic biomarkers.
Combining the identified metabolite markers provided robust BC classification measures with precision, recall, and specificity of 89.50%, 88.38%, and 83.67%, respectively. Conclusion: This study adds valuable information to the discovery of BC biomarkers and underscores the potential of targeted metabolomics-based diagnostic advances in the management of BC.

Item: Estimation of Obesity Levels through the Proposed Predictive Approach Based on Physical Activity and Nutritional Habits (MDPI, 2023)
Gozukara Bag, Harika Gozde; Yagin, Fatma Hilal; Gormez, Yasin; Gonzalez, Pablo Prieto; Colak, Cemil; Gulu, Mehmet; Badicu, Georgian
Obesity is the excessive accumulation of adipose tissue in the body that leads to health risks. The study aimed to classify obesity levels using a tree-based machine-learning approach considering physical activity and nutritional habits. Methods: The current study employed an observational design, using a public dataset collected via a web-based survey to assess eating habits and physical activity levels. The data included gender, age, height, weight, family history of being overweight, dietary patterns, physical activity frequency, and more. Data preprocessing involved addressing class imbalance using Synthetic Minority Over-sampling TEchnique-Nominal Continuous (SMOTE-NC) and feature selection using Recursive Feature Elimination (RFE). Three classification algorithms (logistic regression (LR), random forest (RF), and Extreme Gradient Boosting (XGBoost)) were used for obesity level prediction, and Bayesian optimization was employed for hyperparameter tuning. The performance of the models was evaluated using accuracy, recall, precision, F1-score, area under the curve (AUC), and the precision-recall curve. The LR model showed the best performance across most metrics, followed by RF and XGBoost. Feature selection improved the performance of the LR and RF models, while its effect on XGBoost was mixed.
The study contributes to the understanding of obesity classification using machine-learning techniques based on physical activity and nutritional habits. The LR model demonstrated the most robust performance, and feature selection was shown to enhance model efficiency. The findings underscore the importance of considering both physical activity and nutritional habits in addressing the obesity epidemic.

Item: Estimation of Obesity Levels with a Trained Neural Network Approach optimized by the Bayesian Technique (MDPI, 2023)
Yagin, Fatma Hilal; Gulu, Mehmet; Gormez, Yasin; Castaneda-Babarro, Arkaitz; Colak, Cemil; Greco, Gianpiero; Fischetti, Francesco
Background: Obesity, which causes physical and mental problems, is a global health problem with serious consequences. The prevalence of obesity is increasing steadily, and therefore new research is needed that examines the factors influencing obesity and how to predict the occurrence of the condition from these factors. This study aimed to predict the level of obesity based on physical activity and eating habits using a trained neural network model. Methods: The chi-square, F-Classify, and mutual information classification algorithms were used to identify the most critical factors associated with obesity. The models' performances were compared using a trained neural network with different feature sets. The hyperparameters of the models were optimized using Bayesian optimization techniques, which are faster and more effective than traditional techniques. Results: The neural network predicted the level of obesity with an average accuracy of 93.06% using all features, and with average accuracies of 89.04%, 90.32%, and 86.52% using the features selected by the chi-square, F-Classify, and mutual information classification algorithms, respectively.
The results showed that physical activity, alcohol consumption, use of technological devices, frequent consumption of high-calorie meals, and frequency of vegetable consumption were the most important factors affecting obesity. Conclusions: The F-Classify score algorithm identified the most essential features for obesity level estimation. Furthermore, physical activity and eating habits were the most critical factors for obesity prediction.

Item: Explainable Artificial Intelligence Paves the Way in Precision Diagnostics and Biomarker Discovery for the Subclass of Diabetic Retinopathy in Type 2 Diabetics (MDPI, 2023)
Yagin, Fatma Hilal; Yasar, Seyma; Gormez, Yasin; Yagin, Burak; Pinar, Abdulvahap; Alkhateeb, Abedalrhman; Ardigo, Luca Paolo
Diabetic retinopathy (DR), a common ocular microvascular complication of diabetes, contributes significantly to diabetes-related vision loss. This study addresses the imperative need for early diagnosis of DR and precise treatment strategies based on an explainable artificial intelligence (XAI) framework. The study integrated clinical, biochemical, and metabolomic biomarkers associated with the following classes: non-DR (NDR), non-proliferative diabetic retinopathy (NPDR), and proliferative diabetic retinopathy (PDR) in type 2 diabetes (T2D) patients. To create machine learning (ML) models, the data were split into a validation set (10%) and a discovery set (90%). The validation dataset was used for the hyperparameter optimization and feature selection stages, while the discovery dataset was used to measure the performance of the models. A 10-fold cross-validation technique was used to evaluate the performance of the ML models. Biomarker discovery was performed using minimum redundancy maximum relevance (mRMR), Boruta, and explainable boosting machine (EBM).
The proposed predictive framework compares the results of eXtreme Gradient Boosting (XGBoost), natural gradient boosting for probabilistic prediction (NGBoost), and EBM models in determining the DR subclass. The hyperparameters of the models were optimized using Bayesian optimization. Combining EBM feature selection with XGBoost, the optimal model achieved (91.25 ± 1.88)% accuracy, (89.33 ± 1.80)% precision, (91.24 ± 1.67)% recall, (89.37 ± 1.52)% F1-score, and (97.00 ± 0.25)% area under the ROC curve (AUROC). According to the EBM explanation, the six most important biomarkers in determining the course of DR were tryptophan (Trp), phosphatidylcholine diacyl C42:2 (PC.aa.C42.2), butyrylcarnitine (C4), tyrosine (Tyr), hexadecanoyl carnitine (C16), and total dimethylarginine (DMA). The identified biomarkers may provide a better understanding of the progression of DR, paving the way for more precise and cost-effective diagnostic and treatment strategies.

Item: Hybrid Explainable Artificial Intelligence Models for Targeted Metabolomics Analysis of Diabetic Retinopathy (MDPI, 2024)
Yagin, Fatma Hilal; Colak, Cemil; Algarni, Abdulmohsen; Gormez, Yasin; Guldogan, Emek; Ardigo, Luca Paolo
Background: Diabetic retinopathy (DR) is a prevalent microvascular complication of diabetes mellitus, and early detection is crucial for effective management. Metabolomics profiling has emerged as a promising approach for identifying potential biomarkers associated with DR progression. This study aimed to develop a hybrid explainable artificial intelligence (XAI) model for targeted metabolomics analysis of patients with DR, utilizing a focused approach to identify specific metabolites exhibiting varying concentrations among individuals without DR (NDR), those with non-proliferative DR (NPDR), and individuals with proliferative DR (PDR) who have type 2 diabetes mellitus (T2DM).
Methods: A total of 317 T2DM patients, including 143 NDR, 123 NPDR, and 51 PDR cases, were included in the study. Serum samples underwent targeted metabolomics analysis using liquid chromatography and mass spectrometry. Several machine learning models, including Support Vector Machines (SVC), Random Forest (RF), Decision Tree (DT), Logistic Regression (LR), and Multilayer Perceptrons (MLP), were implemented as solo models and in a two-stage ensemble hybrid approach. The models were trained and validated using 10-fold cross-validation. SHapley Additive exPlanations (SHAP) were employed to interpret the contributions of each feature to the model predictions. Statistical analyses were conducted using the Shapiro-Wilk test for normality, the Kruskal-Wallis H test for group differences, and the Mann-Whitney U test with Bonferroni correction for post-hoc comparisons. Results: The hybrid SVC + MLP model achieved the highest performance, with an accuracy of 89.58%, a precision of 87.18%, an F1-score of 88.20%, and an F-beta score of 87.55%. SHAP analysis revealed that glucose, glycine, and age were consistently important features across all DR classes, while creatinine and various phosphatidylcholines exhibited higher importance in the PDR class, suggesting their potential as biomarkers for severe DR. Conclusion: The hybrid XAI models, particularly the SVC + MLP ensemble, demonstrated superior performance in predicting DR progression compared to solo models. The application of SHAP facilitates the interpretation of feature importance, providing valuable insights into the metabolic and physiological markers associated with different stages of DR. 
These findings highlight the potential of hybrid XAI models combined with explainable techniques for early detection, targeted interventions, and personalized treatment strategies in DR management.

Item: Machine Learning Classification of Cognitive Status in Community-Dwelling Sarcopenic Women: A SHAP-Based Analysis of Physical Activity and Anthropometric Factors (MDPI, 2025)
Gormez, Yasin; Yagin, Fatma Hilal; Aygun, Yalin; Alzakari, Sarah A.; Alhussan, Amel Ali; Aghaei, Mohammadreza
Background and Objectives: Sarcopenia, characterized by progressive loss of skeletal muscle mass and function, has increasingly been recognized not only as a physical health concern but also as a potential risk factor for cognitive decline. This study investigates the application of machine learning algorithms to classify cognitive status based on Mini-Mental State Examination (MMSE) scores in community-dwelling sarcopenic women. Materials and Methods: A dataset of 67 participants was analyzed, with MMSE scores categorized into severe (<= 17) and mild (> 17) cognitive impairment. Eight classification models (MLP, CatBoost, LightGBM, XGBoost, Random Forest (RF), Gradient Boosting (GB), Logistic Regression (LR), and AdaBoost) were evaluated using a repeated holdout strategy over 100 iterations. Hyperparameter optimization was performed via Bayesian optimization, and model performance was assessed using metrics including weighted F1-score (w_f1), accuracy, precision, recall, PR-AUC, and ROC-AUC. Results: Among the models, CatBoost achieved the highest w_f1 (87.05 ± 2.85%) and ROC-AUC (90 ± 5.65%), while AdaBoost and GB showed superior PR-AUC scores (92.49% and 91.88%, respectively), indicating strong performance in handling class imbalance and threshold sensitivity.
SHAP (SHapley Additive exPlanations) analysis revealed that moderate physical activity (moderatePA minutes), walking days, and sitting time were among the most influential features, with higher physical activity associated with reduced risk of cognitive impairment. Anthropometric factors such as age, BMI, and weight also contributed significantly. Conclusions: The results highlight the effectiveness of boosting-based models in capturing complex patterns in clinical data and provide interpretable evidence supporting the role of modifiable lifestyle factors in cognitive health. These findings suggest that machine learning, combined with explainable AI, can enhance risk assessment and inform targeted interventions for cognitive decline in older women.

Item: Precision Enhanced Bioactivity Prediction of Tyrosine Kinase Inhibitors by Integrating Deep Learning and Molecular Fingerprints Towards Cost-Effective and Targeted Cancer Therapy (MDPI, 2025)
Yagin, Fatma Hilal; Gormez, Yasin; Colak, Cemil; Algarni, Abdulmohsen; Al-Hashem, Fahaid; Ardigo, Luca Paolo
Background and Objective: Dysregulated tyrosine kinase signaling is a central driver of tumorigenesis, metastasis, and therapeutic resistance. While tyrosine kinase inhibitors (TKIs) have revolutionized targeted cancer treatment, identifying compounds with optimal bioactivity remains a critical bottleneck. This study presents a robust machine learning framework, leveraging deep artificial neural networks (dANNs), convolutional neural networks (CNNs), and structural molecular fingerprints, to accurately predict TKI bioactivity, ultimately accelerating the preclinical phase of drug development. Methods: A curated dataset of 28,314 small molecules from the ChEMBL database targeting 11 tyrosine kinases was analyzed.
Using Morgan fingerprints and physicochemical descriptors (e.g., molecular weight, LogP, hydrogen bonding), ten supervised models, including dANN, SVM, CatBoost, and CNN, were trained and optimized through a randomized hyperparameter search. Model performance was evaluated using F1-score, ROC-AUC, precision-recall curves, and log loss. Results: SVM achieved the highest F1-score (87.9%) and accuracy (85.1%), while dANNs yielded the lowest log loss (0.25096), indicating superior probabilistic reliability. CatBoost excelled in ROC-AUC and precision-recall metrics. The integration of Morgan fingerprints significantly improved bioactivity prediction across all models by enhancing structural feature recognition. Conclusions: This work highlights the transformative role of machine learning, particularly dANNs and SVM, in rational drug discovery. By enabling accurate bioactivity prediction, our model pipeline can effectively reduce experimental burden, optimize compound selection, and support personalized cancer treatment design. The proposed framework advances kinase inhibitor screening pipelines and provides a scalable foundation for translational applications in precision oncology. By enabling early identification of bioactive compounds with favorable pharmacological profiles, the results of this study may support more efficient candidate selection for clinical drug development, particularly with regard to cancer therapy and kinase-associated disorders.

Item: Prediction of obesity levels based on physical activity and eating habits with a machine learning model integrated with explainable artificial intelligence (Frontiers Media SA, 2025)
Gormez, Yasin; Yagin, Fatma Hilal; Yagin, Burak; Aygun, Yalin; Boke, Hulusi; Badicu, Georgian; De Sousa Fernandes, Matheus Santos
Objectives: This study aims to build a machine learning (ML) prediction model integrated with explainable artificial intelligence (XAI) to categorize obesity levels from physical activity and dietary patterns.
The inclusion of XAI methodologies facilitates a comprehensive understanding of the risk factors influencing the model predictions and thus increases transparency in the identification of obesity risk factors. Methods: Six ML models were used: Bernoulli Naive Bayes, CatBoost, Decision Tree, Extra Trees Classifier, Histogram-based Gradient Boosting, and Support Vector Machine. For each model, hyperparameters were tuned by random search and model effectiveness was evaluated by repeated holdout testing. The SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) interpretability methods were used to generate local and global feature importance measures. Results: The CatBoost model exhibited the highest overall performance and achieved superior results in accuracy, precision, F1 score, and AUC. Nonetheless, other models such as Decision Tree and Histogram-based Gradient Boosting also yielded strong and competitive results. The results also highlighted age, weight, height, and specific food patterns as key predictors of obesity. In terms of interpretability, LIME showed superior fidelity, whereas SHAP showed improved sparsity and consistency across models, facilitating a comprehensive understanding of feature importance. Conclusion: This research demonstrates that ML algorithms, when integrated with XAI technologies, can accurately predict obesity levels and explain important contributing risk factors. The use of SHAP and LIME increases model transparency, facilitating the identification of specific lifestyle patterns linked to obesity risk. These findings help to formulate more precise intervention techniques guided by a reliable and understandable predictive framework.











