Early Detection of Breast Cancer: Comparative Analysis of Machine Learning and Deep Learning Algorithms
Abstract
Breast cancer classification remains a critical challenge in medical diagnostics because the available datasets are imbalanced: the minority (malignant) class is often overshadowed by the majority (benign) class. This study proposes a hybrid model based on logistic regression, enhanced with class-balancing techniques and ant search optimization, to improve identification of the malignant class. The model is compared with Support Vector Machine (SVM), Random Forest, and K-Nearest Neighbors (KNN) classifiers across three stages: prediction before diagnosis, at diagnosis and therapy, and post-treatment outcomes. Experiments conducted in Jupyter on the Wisconsin breast cancer dataset show that the hybrid model achieves an accuracy of 92.98% while significantly reducing false negatives. The study highlights the interpretability of logistic regression, which is crucial for clinical decision-making, especially in comparison with more complex models such as Artificial Neural Networks (ANN). This research offers a reliable and accurate tool for early breast cancer detection and prognosis, and contributes to ongoing efforts to improve patient outcomes by integrating hybrid machine learning models into medical diagnostics.
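To make the abstract's emphasis on class balancing and false negatives concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common "balanced" weighting heuristic (n_samples / (n_classes * class_count)), which the abstract does not specify, and it uses a tiny hypothetical label set rather than the Wisconsin data. The helper names `balanced_class_weights` and `binary_metrics` are illustrative only.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Assumed heuristic: weight = n_samples / (n_classes * class_count).

    Rarer classes receive larger weights, so misclassifying a malignant
    case costs more during training than misclassifying a benign one.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, recall on the positive class, and false negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "recall": recall, "false_negatives": fn}


# Hypothetical toy labels (NOT the study's data): 1 = malignant, 0 = benign.
y_true = [0] * 8 + [1] * 2

# A classifier that always predicts "benign" looks accurate but misses
# every malignant case.
always_benign = [0] * 10

# A classifier that finds both malignant cases at the cost of one false alarm.
recall_oriented = [0] * 7 + [1] + [1, 1]

weights = balanced_class_weights(y_true)
baseline = binary_metrics(y_true, always_benign)
improved = binary_metrics(y_true, recall_oriented)

print("class weights:", weights)
print("always benign:", baseline)
print("recall oriented:", improved)
```

In this toy setting the always-benign classifier still scores well on accuracy while detecting no malignant cases, which is why recall on the malignant class and the false-negative count are more informative than accuracy alone when classes are imbalanced.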
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.