Trade-off Assessment of Deep Learning Models Based on Accuracy, Time, and Size

Machine learning models, and deep learning models in particular, need to be optimized over three main criteria concurrently in order to be operationalized in real-time field applications: the model's prediction accuracy, its training and testing times, and its file size. Related work considers only two of these criteria at a time (e.g., accuracy vs. time). However, it has been observed that deep neural networks (DNNs) designed to improve model accuracy tend to grow in training time and size, while efforts to reduce model size can lower accuracy, so a trade-off must be made among the three criteria. In this paper, to demonstrate the effects of different optimization techniques on model performance, we tested the ResNet50, ResNet101, VGG16, VGG19, and EfficientNet pre-trained models on the CIFAR10 and CIFAR100 image datasets, all of which are commonly used in DNN research. Performance results obtained on Google Colab Pro with TensorFlow show that weight quantization is so far the most successful technique for multi-dimensional optimization, while weight clustering and transfer learning remain useful in only two dimensions. In addition, we designed and tested a new DNN operational score and a model-to-model layer transfer method for the first time in the literature. We hope that our framework will serve as a multi-dimensional evaluation reference for DNN models before they are operationalized.
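Of the techniques above, weight quantization stands out as the most successful across all three criteria. As a minimal illustrative sketch (not the paper's exact experimental setup), post-training dynamic-range quantization can be applied to one of the tested pre-trained models with TensorFlow Lite as follows; the model choice and output file name are assumptions made for demonstration:

    import tensorflow as tf

    # Load one of the pre-trained models evaluated in the paper
    # (illustrative choice; the paper also tests ResNet101, VGG16/19,
    # and EfficientNet).
    model = tf.keras.applications.ResNet50(weights="imagenet")

    # Post-training dynamic-range quantization: 32-bit float weights
    # are stored as 8-bit integers, typically shrinking the file
    # roughly 4x with a small, task-dependent accuracy cost.
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_model = converter.convert()

    # Persist the quantized model so its on-disk size can be compared
    # against the original SavedModel/HDF5 file.
    with open("resnet50_quantized.tflite", "wb") as f:
        f.write(tflite_model)

Because this trades a small drop in accuracy for large reductions in size, and often in inference time as well, it scores well when all three criteria are weighed together.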

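Weight clustering, found useful above in only two of the three dimensions, constrains each layer's weights to a small set of shared centroid values so that the model compresses well, usually at some cost in accuracy. A hedged sketch using the TensorFlow Model Optimization toolkit is shown below; the model, cluster count, and fine-tuning configuration are illustrative assumptions rather than the paper's reported setup:

    import tensorflow as tf
    import tensorflow_model_optimization as tfmot

    # Pre-trained model to compress (illustrative choice).
    model = tf.keras.applications.VGG16(weights="imagenet")

    # Restrict the weights of each supported layer to 16 shared
    # centroids, initialized evenly over each layer's weight range.
    clustered_model = tfmot.clustering.keras.cluster_weights(
        model,
        number_of_clusters=16,
        cluster_centroids_init=(
            tfmot.clustering.keras.CentroidInitialization.LINEAR
        ),
    )

    # Clustered models are normally fine-tuned for a few epochs to
    # recover accuracy before deployment.
    clustered_model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )

    # Remove the clustering wrappers before export; standard file
    # compression (e.g. gzip) then exploits the repeated values.
    final_model = tfmot.clustering.keras.strip_clustering(clustered_model)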