Classification of colorectal cancer based on gene sequencing data with XGBoost model: An application of public health informatics

dc.authorid: Akbulut, Sami/0000-0002-6864-7711
dc.authorid: ÇOLAK, CEMİL/0000-0001-5406-098X
dc.authorwosid: Akbulut, Sami/L-9568-2014
dc.authorwosid: ÇOLAK, CEMİL/ABI-3261-2020
dc.contributor.author: Akbulut, Sami
dc.contributor.author: Kucukakcali, Zeynep
dc.contributor.author: Colak, Cemil
dc.date.accessioned: 2024-08-04T20:10:35Z
dc.date.available: 2024-08-04T20:10:35Z
dc.date.issued: 2022
dc.department: İnönü Üniversitesi [en_US]
dc.description.abstract: Purpose: This study aims to classify open-access colorectal cancer gene data and to identify essential genes using XGBoost, a machine learning method. Materials and Methods: An open-access colorectal cancer gene dataset was used in the study. The dataset included gene sequencing results for 10 mucosal samples from healthy controls and colonic mucosa from 12 patients with colorectal cancer. XGBoost was used to classify the disease. Model performance was evaluated with accuracy, balanced accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score. Results: The variable selection method selected 17 genes, and modeling was performed with these input variables. The accuracy, balanced accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score obtained from the model were 95.5%, 95.8%, 91.7%, 100%, 100%, 90.9%, and 95.7%, respectively. According to the variable importance obtained from the XGBoost results, the CYR61, NR4A, FOSB, and NR4A2 genes can be employed as biomarkers for colorectal cancer. Conclusion: This research identified genes that may be linked to colorectal cancer and candidate genetic biomarkers for the disease. In the future, the reliability of the detected genes can be verified, therapeutic procedures can be established based on these genes, and their usefulness in clinical practice may be documented. [en_US]
dc.identifier.doi: 10.17826/cumj.112853
dc.identifier.endpage: 1186 [en_US]
dc.identifier.issn: 2602-3032
dc.identifier.issn: 2602-3040
dc.identifier.issue: 3 [en_US]
dc.identifier.startpage: 1179 [en_US]
dc.identifier.trdizinid: 1122188 [en_US]
dc.identifier.uri: https://doi.org/10.17826/cumj.112853
dc.identifier.uri: https://search.trdizin.gov.tr/yayin/detay/1122188
dc.identifier.uri: https://hdl.handle.net/11616/92887
dc.identifier.volume: 47 [en_US]
dc.identifier.wos: WOS:000889635700030 [en_US]
dc.identifier.wosquality: N/A [en_US]
dc.indekslendigikaynak: Web of Science [en_US]
dc.indekslendigikaynak: TR-Dizin [en_US]
dc.language.iso: en [en_US]
dc.publisher: Cukurova Univ, Fac Medicine [en_US]
dc.relation.ispartof: Cukurova Medical Journal [en_US]
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institutional Faculty Member [en_US]
dc.rights: info:eu-repo/semantics/closedAccess [en_US]
dc.subject: Colorectal cancer [en_US]
dc.subject: genomics [en_US]
dc.subject: machine learning [en_US]
dc.subject: XGboost model [en_US]
dc.title: Classification of colorectal cancer based on gene sequencing data with XGBoost model: An application of public health informatics [en_US]
dc.type: Article [en_US]
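
Note on the reported metrics: the percentages in the abstract are mutually consistent with a single confusion matrix over all 22 samples (12 patients, 10 controls). The sketch below recomputes them in Python. The counts used (11 true positives, 1 false negative, 10 true negatives, 0 false positives) are an assumption inferred from the reported percentages, not figures stated in the record, and the function name `confusion_metrics` is hypothetical.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),            # positive predictive value
        "npv": tn / (tn + fn),            # negative predictive value
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Assumed counts (inferred, not stated in the record):
# 12 patients -> 11 detected, 1 missed; 10 controls -> all correctly classified.
metrics = confusion_metrics(tp=11, fn=1, tn=10, fp=0)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts, specificity and positive predictive value both come out at 100%, and the remaining values match the other percentages given in the abstract.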
