Identification of Breast Cancer Metastasis Using Boosting Algorithms on Cytopathologic Data

Breast cancer is the second most common cancer among women after lung cancer. Early diagnosis can improve the course of recovery from the disease. Several machine learning-based approaches have been studied for cancer detection on histopathological images. In this study, the cancer type is identified using the Gradient Boosting Machine (GBM), eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM) algorithms. The performance of these techniques is measured on the Breast Cancer Wisconsin (Diagnostic) dataset. According to the results, GBM achieved the highest accuracy, 97.02%. Although no prior pathological knowledge about the disease was used, the boosting algorithms achieved high diagnostic accuracy.
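A minimal sketch of this kind of experiment is given below. It is an illustration under stated assumptions, not the authors' pipeline: it uses the copy of the Breast Cancer Wisconsin (Diagnostic) dataset bundled with scikit-learn, scikit-learn's GradientBoostingClassifier as the GBM, default hyperparameters, and a stratified 70/30 split. The function name compare_boosters, the split ratio, and the seed are hypothetical choices, so the resulting accuracy should not be expected to match the reported figure.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def compare_boosters(models=None, test_size=0.3, seed=0):
    """Fit each classifier on the WDBC data and return its test accuracy.

    Assumption: the split ratio, seed, and default hyperparameters are
    illustrative and are not taken from the study.
    """
    # 569 samples, 30 numeric features, binary target (malignant / benign).
    X, y = load_breast_cancer(return_X_y=True)

    # Stratify so both classes keep their proportions in train and test.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )

    if models is None:
        # Only the scikit-learn GBM is included by default.
        models = {"GBM": GradientBoostingClassifier(random_state=seed)}

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    for name, acc in compare_boosters().items():
        print(f"{name}: test accuracy = {acc:.4f}")
```

If the xgboost and lightgbm packages are installed, their scikit-learn-compatible classifiers (xgboost.XGBClassifier and lightgbm.LGBMClassifier) can be passed through the models dictionary so that all three algorithms are scored on the same split.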
