Identification of Breast Cancer Metastasis Using Boosting Algorithms on Cytopathologic Data
Breast cancer is the second most common cancer among women after lung cancer. Early diagnosis can positively affect recovery from the disease. Several machine learning-based approaches have been studied for cancer detection on histopathological images. In this study, the cancer type is identified using the Gradient Boosting Machine (GBM), eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM) algorithms. The performance of these techniques is measured on the Breast Cancer Wisconsin (Diagnostic) dataset. According to the results, GBM achieved the highest accuracy, at 97.02%. Although no prior pathological knowledge about the disease was used, the boosting algorithms achieved high diagnostic success.
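The kind of evaluation described above can be sketched roughly as follows. This is a minimal illustration, not the study's actual pipeline: it assumes scikit-learn's bundled copy of the Breast Cancer Wisconsin (Diagnostic) data, a 70/30 stratified split, and default hyperparameters, none of which are stated in the abstract. It trains only scikit-learn's `GradientBoostingClassifier` as a stand-in for GBM, and the helper name `evaluate_gbm` is hypothetical. The accuracy it prints should not be expected to match the reported 97.02%.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def evaluate_gbm(test_size=0.3, seed=0):
    """Train a gradient boosting classifier and return its held-out accuracy.

    The split ratio, seed, and default hyperparameters are assumptions made
    for illustration; the abstract does not specify them.
    """
    # scikit-learn's copy of the Breast Cancer Wisconsin (Diagnostic) dataset:
    # 569 samples, 30 numeric features, binary benign/malignant target.
    X, y = load_breast_cancer(return_X_y=True)

    # Stratify so both classes keep their proportions in train and test sets.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )

    model = GradientBoostingClassifier(random_state=seed)
    model.fit(X_train, y_train)

    return accuracy_score(y_test, model.predict(X_test))


if __name__ == "__main__":
    print(f"GBM held-out accuracy: {evaluate_gbm():.4f}")
```

If the `xgboost` and `lightgbm` packages are installed, their scikit-learn-compatible classifiers (`XGBClassifier` and `LGBMClassifier`) offer the same `fit`/`predict` interface and could be substituted for the model above to compare the three algorithms on the same split.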