Kübra UYAR, Şakir TAŞDEMİR, İlker Ali ÖZKAN

The analysis and optimization of CNN Hyperparameters with fuzzy tree model for image classification

The strong performance of convolutional neural networks (CNNs) has enabled state-of-the-art solutions to various problems. Although CNNs achieve satisfactory results in computer-vision problems, they still face some difficulties. As CNN models are made deeper to achieve better accuracy, computational cost and complexity increase. It is important to train CNNs with a suitable topology and suitable training hyperparameters, such as initial learning rate, minibatch size, number of epochs, filter size, and number of filters, because the initialization of hyperparameters affects classification results. However, no definite inference can be made about how hyperparameters should be initialized, so the choice involves uncertainty. This study models that uncertainty using a fuzzy inference system (FIS). The designed fuzzy model estimates the classification result from the CNN topology and training hyperparameters. The evaluated CNN models were GoogleNet and Inceptionv3, which contain inception modules; ShuffleNet, which contains shuffle blocks; DenseNet201, which contains dense blocks; EfficientNet, ResNet18, ResNet50, ResNet101, and MobileNetv2, which contain residual blocks; and InceptionResNetv2, which includes both inception modules and residual blocks. The test sample dataset was obtained by training the CNN models with various training hyperparameter combinations. The CNN models were trained on the Animal Diagnostics Lab (ADL) dataset, a histopathological dataset that includes healthy and inflamed kidney, lung, and spleen images. A new FIS tree model that is more computationally efficient and easier to understand than a single FIS was designed, and it was used to predict the classification accuracy of the CNN models for given hyperparameter combinations. The best, worst, and average classification accuracies obtained with CNN models that use the best training hyperparameter set are 97.70%, 93.60%, and 96.30%, respectively. Moreover, experiments were carried out on the Cifar10 and Cifar100 benchmark datasets to reveal the true capability and limitations of the proposed approach. Experimental results indicate that the designed FIS tree model provides a successful hyperparameter evaluation mechanism, with an average RMSE value of 1.2652.
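
To make the idea of an FIS tree concrete, the following is a minimal Python sketch, not the authors' model. It assumes a zero-order Sugeno-style inference (the abstract does not state the inference type), two linguistic terms per input, inputs already normalized to [0, 1], and entirely hypothetical rule tables (`STAGE1_RULES`, `STAGE2_RULES`). The first stage combines learning rate and minibatch size into an intermediate score; the second stage combines that score with the number of epochs to estimate accuracy. An RMSE helper shows how such estimates could be compared against measured accuracies.

```python
import math


def memberships(x):
    """Degrees of membership in 'low' and 'high' for an input clamped to [0, 1]."""
    x = min(1.0, max(0.0, x))
    return {"low": 1.0 - x, "high": x}


def sugeno(x1, x2, rules):
    """Two-input zero-order Sugeno FIS.

    Rule firing strength is the product of the two memberships (AND),
    and the output is the firing-strength-weighted average of the
    constant rule consequents.
    """
    m1, m2 = memberships(x1), memberships(x2)
    num = 0.0
    den = 0.0
    for (term1, term2), consequent in rules.items():
        weight = m1[term1] * m2[term2]
        num += weight * consequent
        den += weight
    return num / den  # den is 1.0 here because low + high = 1 for each input


# Hypothetical rule tables, chosen only for illustration.
# Stage 1: (learning rate, minibatch size) -> setup score in [0, 1].
STAGE1_RULES = {
    ("low", "low"): 0.8,
    ("low", "high"): 0.6,
    ("high", "low"): 0.3,
    ("high", "high"): 0.1,
}
# Stage 2: (setup score, epochs) -> estimated accuracy in percent.
STAGE2_RULES = {
    ("low", "low"): 60.0,
    ("low", "high"): 70.0,
    ("high", "low"): 85.0,
    ("high", "high"): 97.0,
}


def predict_accuracy(lr_norm, batch_norm, epochs_norm):
    """FIS tree: the output of stage 1 feeds stage 2 as an input."""
    score = sugeno(lr_norm, batch_norm, STAGE1_RULES)
    return sugeno(score, epochs_norm, STAGE2_RULES)


def rmse(predicted, measured):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(measured):
        raise ValueError("sequences must have the same length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))
```

Chaining small two-input systems keeps each rule table short. With three terms per input, for example, a full rule base for one three-input FIS has 27 rules, whereas two chained two-input stages have 9 + 9 = 18.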

