Comparing the Effect of Under-Sampling and Over-Sampling on Traditional Machine Learning Algorithms for Epileptic Seizure Detection

Epilepsy is a neurological disorder that causes recurrent, sudden seizures at unpredictable times. This study classifies electroencephalogram (EEG) signals to predict epileptic seizures. Several machine learning algorithms were evaluated on a dataset extracted from EEG signals. The dataset contains 23.5 seconds of brain activity recorded from each of 500 individuals; each one-second segment has 178 data points, giving 11,500 segments in total. Because the aim was to develop a model that predicts epileptic seizures, the task was reduced to a two-class problem by merging all target categories other than epileptic seizure into a single class. Because this merging left the dataset imbalanced, Random Under-Sampling and Random Over-Sampling were applied to prevent the machine learning algorithms from overfitting the dominant class. Each of the three resulting datasets was then divided into training and test sets at ratios of 60/40, 70/30, and 80/20. The performance of the machine learning algorithms was evaluated and discussed across three different scenarios. Overall, the Random Forest algorithm outperformed the other algorithms in all scenarios in terms of accuracy, sensitivity, and specificity.
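The preprocessing steps described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the label coding (1 for seizure, 2 to 5 for the other categories), the toy data, the fixed seeds, and all function names are hypothetical.

```python
import random
from collections import Counter


def binarize(labels, positive=1):
    """Map the seizure category to 1 and every other category to 0.
    The coding positive=1 is an assumption made for this sketch."""
    return [1 if label == positive else 0 for label in labels]


def _indices_by_class(y):
    """Group row indices by class label."""
    groups = {}
    for i, label in enumerate(y):
        groups.setdefault(label, []).append(i)
    return groups


def random_under_sample(X, y, seed=0):
    """Randomly drop rows of larger classes until every class matches the smallest."""
    rng = random.Random(seed)
    groups = _indices_by_class(y)
    n = min(len(idx) for idx in groups.values())
    keep = sorted(i for idx in groups.values() for i in rng.sample(idx, n))
    return [X[i] for i in keep], [y[i] for i in keep]


def random_over_sample(X, y, seed=0):
    """Randomly duplicate rows of smaller classes until every class matches the largest."""
    rng = random.Random(seed)
    groups = _indices_by_class(y)
    n = max(len(idx) for idx in groups.values())
    keep = []
    for idx in groups.values():
        keep.extend(idx)                                # keep every original row
        keep.extend(rng.choices(idx, k=n - len(idx)))   # add random duplicates
    return [X[i] for i in keep], [y[i] for i in keep]


def split(X, y, train_fraction, seed=0):
    """Shuffle the rows and cut them into training and test parts."""
    rng = random.Random(seed)
    order = list(range(len(y)))
    rng.shuffle(order)
    cut = round(train_fraction * len(order))
    train, test = order[:cut], order[cut:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])


def evaluate(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from the 2x2 confusion counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Toy data: 100 rows, 20 per category (hypothetical, not the EEG dataset).
raw_labels = [c for c in (1, 2, 3, 4, 5) for _ in range(20)]
X = [[float(i)] for i in range(len(raw_labels))]
y = binarize(raw_labels)

X_under, y_under = random_under_sample(X, y)
X_over, y_over = random_over_sample(X, y)
print(Counter(y), Counter(y_under), Counter(y_over))

for fraction in (0.6, 0.7, 0.8):
    X_tr, y_tr, X_te, y_te = split(X_under, y_under, fraction)
    print(fraction, len(y_tr), len(y_te))
```

Merging four of the five toy categories leaves one seizure row for every four non-seizure rows, which is the imbalance that the two sampling methods remove: under-sampling shrinks the majority class, while over-sampling enlarges the minority class by duplication. Any classifier can then be trained on each split and scored with `evaluate`.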
