Classification of Breast Cancer on the Strength of Potential Risk Factors with Boosting Models: A Public Health Informatics Application

dc.authorid: ÇOLAK, CEMİL/0000-0001-5406-098X
dc.authorid: Akbulut, Sami/0000-0002-6864-7711
dc.authorwosid: ÇOLAK, CEMİL/ABI-3261-2020
dc.authorwosid: Akbulut, Sami/L-9568-2014
dc.contributor.author: Akbulut, Sami
dc.contributor.author: Cicek, Ipek Balikci
dc.contributor.author: Colak, Cemil
dc.date.accessioned: 2024-08-04T20:10:12Z
dc.date.available: 2024-08-04T20:10:12Z
dc.date.issued: 2022
dc.department: İnönü Üniversitesi [en_US]
dc.description.abstract: Aim: Breast cancer can be diagnosed using an algorithm or an early detection model that estimates breast cancer risk from determining factors. In the present study, gradient boosting machines (GBM), extreme gradient boosting (XGBoost), and light gradient boosting (LightGBM) models were applied and their performances were compared. Methods: The open-access Breast Cancer Wisconsin Dataset, which includes 10 features of breast tumors and results from 569 patients, was used for this study. The GBM, XGBoost, and LightGBM models for classifying breast cancer were established using repeated stratified K-fold cross-validation. The performance of each model was evaluated with accuracy, recall, precision, and area under the curve (AUC). Results: The accuracy, recall, AUC, and precision values obtained from the GBM, XGBoost, and LightGBM models were (93.9%, 93.5%, 0.984, 93.8%), (94.6%, 94%, 0.985, 94.6%), and (95.3%, 94.8%, 0.987, 95.5%), respectively. According to these results, the LightGBM model achieved the best performance metrics. When the effects of the variables in the dataset on breast cancer were assessed, the five most significant factors for the LightGBM model were, in order, concave points mean, texture mean, concavity mean, radius mean, and perimeter mean. Conclusion: According to the findings of the study, the LightGBM model gave more successful predictions for breast cancer classification than the other models. Unlike similar studies examining the same dataset, this study presented variable significance for breast cancer-related variables. Applying the LightGBM approach in the medical field can help doctors make a quick and precise diagnosis. [en_US]
dc.identifier.doi: 10.4274/haseki.galenos.2022.8440
dc.identifier.endpage: 203 [en_US]
dc.identifier.issn: 1302-0072
dc.identifier.issn: 2147-2688
dc.identifier.issue: 3 [en_US]
dc.identifier.scopus: 2-s2.0-85133955884 [en_US]
dc.identifier.scopusquality: Q4 [en_US]
dc.identifier.startpage: 196 [en_US]
dc.identifier.trdizinid: 530930 [en_US]
dc.identifier.uri: https://doi.org/10.4274/haseki.galenos.2022.8440
dc.identifier.uri: https://search.trdizin.gov.tr/yayin/detay/530930
dc.identifier.uri: https://hdl.handle.net/11616/92658
dc.identifier.volume: 60 [en_US]
dc.identifier.wos: WOS:000823150600002 [en_US]
dc.identifier.wosquality: N/A [en_US]
dc.indekslendigikaynak: Web of Science [en_US]
dc.indekslendigikaynak: Scopus [en_US]
dc.indekslendigikaynak: TR-Dizin [en_US]
dc.language.iso: en [en_US]
dc.publisher: Galenos Publ House [en_US]
dc.relation.ispartof: Haseki Tip Bulteni-Medical Bulletin of Haseki [en_US]
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institutional Faculty Member [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: Breast cancer [en_US]
dc.subject: boosting algorithm [en_US]
dc.subject: gradient boosting algorithm [en_US]
dc.subject: XGBoost algorithm [en_US]
dc.subject: LightGBM algorithm [en_US]
dc.title: Classification of Breast Cancer on the Strength of Potential Risk Factors with Boosting Models: A Public Health Informatics Application [en_US]
dc.type: Article [en_US]
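
The abstract names repeated stratified K-fold cross-validation and reports accuracy, recall, and precision, but this record does not give the number of folds, the number of repeats, or any implementation details. The following is only a minimal standard-library sketch of that evaluation protocol, not the authors' code: the function names, `n_splits=5`, `n_repeats=3`, the seed, and the placeholder class split are all assumptions made for illustration. In practice a library routine such as scikit-learn's `RepeatedStratifiedKFold` would normally take the place of the hand-written splitter, and AUC is left out here because it needs predicted scores rather than labels.

```python
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, n_splits=5, n_repeats=3, seed=0):
    """Yield (train_indices, test_indices) pairs.

    Within each repeat, every class is shuffled and dealt round-robin
    across the folds, so each fold keeps roughly the overall class ratio.
    n_splits, n_repeats and seed are illustrative defaults (assumptions).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            for position, index in enumerate(shuffled):
                folds[position % n_splits].append(index)
        for k in range(n_splits):
            test = sorted(folds[k])
            train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
            yield train, test


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, recall and precision for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# Placeholder labels: 569 entries to match the patient count in the abstract;
# the 212/357 class split is assumed here purely for illustration.
labels = [1] * 212 + [0] * 357
splits = list(repeated_stratified_kfold(labels))
```

Each `(train, test)` pair would be used to fit a classifier on the training indices and score it on the test indices with `binary_metrics`; averaging the per-fold scores over all repeats gives a single figure per model of the kind reported in the abstract.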
