Optimum, projected, and regularized extreme learning machine methods with singular value decomposition and $L_2$ -Tikhonov regularization

The theory and implementation of the extreme learning machine (ELM) have proved that it is a simple, efficient, and accurate machine learning methodology. In an ELM, the hidden nodes are randomly initiated and fixed without iterative tuning. However, the optimal number of hidden layer neurons ($L_{opt}$) is the key to ELM generalization performance, and choosing this number by trial and error is not satisfactory. Optimizing the hidden layer size with the leave-one-out cross-validation method is a costly approach. In this paper, a fast and reliable statistical approach called optimum ELM (OELM) was developed to determine the minimum hidden layer size that yields optimum performance. A second improvement, which exploits the advantages of orthogonal projections with singular value decomposition (SVD), was proposed to tackle the problem of randomness and correlated features in the input data. This approach, named projected ELM (PELM), achieves more than a 2% gain in average accuracy. The final contribution of this paper was implementing Tikhonov regularization in the form of the $L_2$-penalty with ELM (TRELM), which regularizes and improves the matrix computations using the L-curve criterion and SVD. Unlike iterative methods, the L-curve can estimate the optimum regularization parameter from a curve of only a few points that represents the tradeoff between minimizing the training error and the norm of the output weights. The proposed TRELM was tested in three scenarios of data size: small, moderate, and big datasets. Owing to their simplicity, robustness, and low time consumption, OELM and PELM are recommended for small and even moderate amounts of data. TRELM demonstrated that enhancing ELM performance requires enlarging the hidden layer size ($L$); consequently, for big data, increasing $L$ in TRELM is necessary and concurrently leads to better accuracy. The proposed approaches were compared with state-of-the-art learning approaches on various well-known datasets.
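As a rough illustration of the ideas summarized above, the following Python sketch shows a PELM-style orthogonal projection of the inputs via SVD, an SVD-based solve of the ELM output weights under $L_2$-Tikhonov regularization, and the computation of L-curve points (training residual versus output-weight norm) over a grid of candidate regularization parameters. All function names, the tanh activation, and the parameter grid are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch (not the authors' code): ELM output weights via SVD with
# L2-Tikhonov regularization, plus L-curve points for choosing lambda.
import numpy as np

def pelm_project(X, k):
    """PELM-style step (assumed form): project the centered inputs onto
    their top-k right singular vectors to decorrelate the features."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def trelm_fit(X, T, L=100, lam=1e-3, seed=0):
    """Train a single-hidden-layer ELM: random fixed hidden weights, then a
    Tikhonov-regularized least-squares solve for the output weights beta.

    X: (N, d) inputs; T: (N, m) targets; L: hidden layer size; lam: lambda.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], L))   # random input weights (fixed)
    b = rng.standard_normal(L)                 # random biases (fixed)
    H = np.tanh(X @ W + b)                     # (N, L) hidden layer outputs

    # SVD solve: beta = V diag(s / (s^2 + lam)) U^T T, i.e. Tikhonov
    # filter factors applied to the singular values of H.
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    beta = Vt.T @ ((s / (s**2 + lam))[:, None] * (U.T @ T))
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

def l_curve_points(H, T, lams):
    """For each candidate lambda, return (training residual norm,
    output-weight norm); the 'corner' of this curve balances the two."""
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    UtT = U.T @ T
    points = []
    for lam in lams:
        beta = Vt.T @ ((s / (s**2 + lam))[:, None] * UtT)
        points.append((np.linalg.norm(H @ beta - T), np.linalg.norm(beta)))
    return points
```

Note that a single SVD of $H$ is reused across all candidate values of $\lambda$, which is what makes tracing the L-curve cheap compared with retraining for each parameter; this matches the few-points tradeoff curve described in the abstract.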
