An investigation of ensemble learning methods in classification problems and an application on non-small-cell lung cancer data

Kıvrak, Mehmet; Çolak, Cemil

dc.contributor.author	Kıvrak, Mehmet
dc.contributor.author	Çolak, Cemil
dc.date.accessioned	2022-12-27T09:35:52Z
dc.date.available	2022-12-27T09:35:52Z
dc.date.issued	2022
dc.department	İnönü Üniversitesi	en_US
dc.description.abstract	This study aims to classify death status in non-small-cell lung cancer (NSCLC) using patient records with 24 variables drawn from an open-source dataset on a cancer data site. The base classifiers SMO (Sequential Minimal Optimization), K-NN (K-Nearest Neighbor), random forest, and XGBoost (Extreme Gradient Boosting) were used, together with the voting, bagging, boosting, and stacking ensemble learning methods. Models were compared in terms of accuracy, specificity, sensitivity, precision, and ROC curve. The random forest, SMO, K-NN, and XGBoost classifiers were each evaluated as base classifiers, within the bagging ensemble learning method, and within the boosting ensemble learning method. In addition, the performances of Model 1 (random forest + SMO), Model 2 (XGBoost + K-NN), Model 3 (random forest + K-NN), Model 4 (XGBoost + SMO), Model 5 (SMO + K-NN + random forest), Model 6 (SMO + K-NN + XGBoost), and Model 7 (SMO + K-NN + random forest + XGBoost) were reported across the different metrics. The boosting ensemble learning method with XGBoost provided the maximum classification performance, achieving an accuracy of 0.982, a sensitivity of 0.971, a precision of 0.989, a specificity of 0.989, and a ROC curve value of 0.998. Ensemble learning methods are recommended for classification problems in patient populations with a high prevalence of cancer in order to achieve successful results.	en_US
dc.identifier.citation	KIVRAK M, ÇOLAK C (2022). An investigation of ensemble learning methods in classification problems and an application on non-small-cell lung cancer data. Medicine Science, 11(2), 924 - 933. 10.5455/medscience.2021.10.339	en_US
dc.identifier.doi	10.5455/medscience.2021.10.339	en_US
dc.identifier.endpage	933	en_US
dc.identifier.issn	2147-0634
dc.identifier.issue	2	en_US
dc.identifier.startpage	924	en_US
dc.identifier.trdizinid	529902	en_US
dc.identifier.uri	https://doi.org/10.5455/medscience.2021.10.339
dc.identifier.uri	https://hdl.handle.net/11616/85924
dc.identifier.uri	https://search.trdizin.gov.tr/yayin/detay/529902
dc.identifier.volume	11	en_US
dc.indekslendigikaynak	TR-Dizin	en_US
dc.language.iso	en	en_US
dc.relation.ispartof	Medicine Science	en_US
dc.relation.publicationcategory	Article - National Peer-Reviewed Journal - Institutional Faculty Member	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.title	An investigation of ensemble learning methods in classification problems and an application on non-small-cell lung cancer data	en_US
dc.type	Article	en_US
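
The abstract above names voting ensembles and four evaluation metrics (accuracy, sensitivity, specificity, precision) without giving formulas. The following is a minimal illustrative sketch, not the authors' implementation: it combines the binary predictions of several classifiers by simple majority (hard) voting and computes the four metrics from the resulting confusion counts. The function names `hard_vote` and `binary_metrics` and the toy label vectors are hypothetical and are not taken from the study's data or code.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Combine per-model label lists by majority vote.

    predictions_per_model: list of equal-length label lists, one per model.
    On a tie, the label encountered first among the models wins.
    """
    combined = []
    for labels in zip(*predictions_per_model):
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, specificity and precision."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Return NaN rather than raising when a class is absent.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "precision": ratio(tp, tp + fp),    # positive predictive value
    }


# Hypothetical toy labels (1 = death, 0 = survival); not from the study.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [1, 0, 1, 0, 0, 1, 1, 0]
model_b = [1, 0, 0, 1, 0, 1, 1, 0]
model_c = [0, 0, 1, 1, 1, 0, 1, 0]

voted = hard_vote([model_a, model_b, model_c])
metrics = binary_metrics(y_true, voted)
print(voted)
print(metrics)
```

With an odd number of models and two classes, as in this toy case, ties cannot occur; with an even number of models (as in the two-classifier combinations listed in the abstract) a tie-breaking rule or probability-based (soft) voting would be needed, and the source does not say which was used.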

Files

Original bundle

Now showing 1 - 1 of 1

Name: document - 2022-12-27T123537.367.pdf
Size: 1.14 MB
Format: Adobe Portable Document Format

Collection

TR-Dizin Indexed Publications Collection