Comparison of Adaptive Sigmoid, Logistic Sigmoid, and Hyperbolic Tangent Activation Functions in Fully Connected and Convolutional Neural Networks

The activation function is a parameter that directly affects the performance of a neural network. Nonlinear activation functions are generally used to solve complex problems. The logistic sigmoid and hyperbolic tangent functions are among the most commonly used activation functions. The presence of a free parameter in an activation function makes the function adaptive. In this study, a generalized sigmoid-based adaptive activation function is investigated and compared with the standard logistic sigmoid and hyperbolic tangent functions in fully connected and convolutional neural network architectures. The experimental results show that the free parameters allow the network to train faster, but also make it more prone to overfitting.
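The abstract does not specify the exact parametrization of the generalized adaptive sigmoid, so the following is only a minimal sketch of the idea, assuming a form a·σ(b·x) with trainable gain a and slope b; the class name AdaptiveSigmoid and the chosen form are illustrative assumptions, not the paper's definition.

```python
import torch
import torch.nn as nn

class AdaptiveSigmoid(nn.Module):
    """Sigmoid with free parameters learned jointly with the network weights (illustrative)."""
    def __init__(self, a_init: float = 1.0, b_init: float = 1.0):
        super().__init__()
        self.a = nn.Parameter(torch.tensor(a_init))  # output gain (free parameter)
        self.b = nn.Parameter(torch.tensor(b_init))  # input slope (free parameter)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Generalized sigmoid: reduces to the standard logistic sigmoid when a = b = 1.
        return self.a * torch.sigmoid(self.b * x)

# Example: drop-in replacement for nn.Sigmoid() in a small fully connected network;
# the optimizer updates a and b by backpropagation together with the layer weights.
model = nn.Sequential(nn.Linear(784, 128), AdaptiveSigmoid(), nn.Linear(128, 10))
```

Because the free parameters receive gradients like any other weight, the shape of the activation adapts during training, which is consistent with the reported behaviour of faster training but a higher tendency to overfit.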
