Open Access

Prediction of Recurrence of Differentiated Thyroid Cancer with Hybrid SMOTE-Stacking Model

Erkan Akkur1, Serkan Cizmecioğulları2*, Ahmet Cankat Öztürk3
1Turkish Medicine and Medical Devices Agency, Ankara, Türkiye
2Kırşehir Ahi Evran University, Kırşehir, Türkiye
3Presidency of The Republic of Turkey Secretariat of Defence Industries, Ankara, Türkiye
* Corresponding author: serkan.cizmeciogullari@ahievran.edu.tr

Presented at the International Trend of Tech Symposium (ITTSCONF2024), İstanbul, Türkiye, Dec 07, 2024

SETSCI Conference Proceedings, 2024, 21, Page(s): 33-37, https://doi.org/10.36287/setsci.21.6.033

Published Date: 12 December 2024

Differentiated thyroid cancer (DTC) is the most common form of thyroid cancer. Although DTC generally has a favourable prognosis, recurrence remains a critical concern. Early and accurate prediction of recurrence risk is essential for improving patient outcomes. This study proposes a hybrid model that combines SMOTE (Synthetic Minority Oversampling Technique) with a stacking ensemble to predict DTC recurrence. First, the dataset is balanced with SMOTE so that the classes are equally represented. Then a stacking ensemble, which combines the predictions of multiple classifiers, is used to improve overall predictive performance. Tested on a publicly available dataset, the model achieved an accuracy of 99.09% and an AUC-ROC of 0.998.

Keywords - Differentiated thyroid cancer, machine learning, ensemble learning, stacking, SMOTE
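
The oversampling step mentioned in the abstract can be illustrated with a minimal sketch of the SMOTE idea (Chawla et al., 2002): each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest minority-class neighbours. This is an illustration under stated assumptions, not the authors' implementation: the function names `smote` and `balance`, the toy data, the binary-class setting, and the choice of k are all hypothetical, and the stacking stage is not shown.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points from a list of minority-class samples.

    Each point is interpolated between a randomly chosen minority sample
    and one of its k nearest minority-class neighbours.
    Assumes at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample the minority class until both classes have equal size.

    Assumes a binary problem (hypothetical simplification).
    """
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new_points = smote(minority, n_majority - len(minority), k, seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Hypothetical toy data: six majority samples (label 0), two minority (label 1).
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.0], [0.2, 0.3],
     [2.0, 2.0], [2.2, 2.1]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal = balance(X, y, minority_label=1, k=1)
```

In practice, oversampling of this kind is usually applied to the training split only, so that synthetic points do not leak into the data used to evaluate the model.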



This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.