Comparison of Deep Learning Models in Image Classification

In this study, deep learning models based on the CNN (Convolutional Neural Network), ResNet (Residual Network) and NIN (Network in Network) approaches were built, and their performance on several datasets was compared. The models are named E-Model, R-Model and A-Model, respectively (ESA and AİA are the Turkish abbreviations for CNN and NIN). For the CIFAR-10 dataset, the models were trained separately on a machine with only a CPU (Central Processing Unit) and on a machine with both a CPU and a GPU (Graphics Processing Unit). On the CPU-only machine, training the R-Model, A-Model and E-Model took approximately 415 hours, 129 hours and 3.5 hours, respectively, and the accuracies on the validation set were 82.76%, 87.64% and 83.47%, respectively. On the machine with both a CPU and a GPU, training the R-Model, A-Model and E-Model took approximately 4.45 hours, 2.20 hours and 1.82 hours, respectively, and the accuracies on the validation set were 82.61%, 87.95% and 82.43%, respectively. For the other datasets, experimental results were obtained by training the models on the machine with both a CPU and a GPU. The structures of the constructed models, the parameter values used for training, the confusion matrices obtained for the validation data, and the accuracy and loss curves are given in detail.
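The reported training times imply very different GPU speedups for the three models. The short Python sketch below works through that arithmetic using only the figures quoted in the abstract. The variable and function names (`cpu_hours`, `gpu_hours`, `speedup`) are illustrative choices and do not come from the study.

```python
# Training times in hours for CIFAR-10, as quoted in the abstract.
cpu_hours = {"R-Model": 415.0, "A-Model": 129.0, "E-Model": 3.5}
gpu_hours = {"R-Model": 4.45, "A-Model": 2.20, "E-Model": 1.82}


def speedup(cpu: float, gpu: float) -> float:
    """Return how many times shorter the GPU run was than the CPU-only run."""
    return cpu / gpu


# Speedup factor per model (hypothetical helper structure, for illustration only).
speedups = {name: speedup(cpu_hours[name], gpu_hours[name]) for name in cpu_hours}

for name, factor in speedups.items():
    print(f"{name}: about {factor:.1f}x shorter training with the GPU")
```

Under these figures the R-Model benefits most (roughly 93 times shorter training), the A-Model comes next (roughly 59 times) and the E-Model gains least (roughly 1.9 times), while the validation accuracies change by about one percentage point or less.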
