Mini-Batching for Artificial Neural Network Training

When large data sets are modeled with artificial neural networks, the training set is fed to the network in mini-batches rather than as a single batch so that the training phase can be parallelized, which reduces training time. This study investigates the effect of mini-batch training when it is applied to small data sets. In experiments with 4 learning algorithms on 11 data sets, mini-batch training proved more successful than full-batch training for 3 of the learning algorithms.
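
A minimal sketch of the idea follows, assuming a one-hidden-layer tanh network trained with plain gradient descent on synthetic regression data; the network size, learning rate, and data are illustrative assumptions, not the study's setup (which compares 4 learning algorithms on 11 data sets). Setting batch_size equal to the data set size reduces the loop to full-batch training; smaller values give mini-batch training.

```python
# Sketch: full-batch vs. mini-batch gradient descent on a tiny network.
# All sizes, rates, and data here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def init_params(n_in, n_hidden, n_out):
    return {
        "W1": rng.normal(0, 0.5, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0, 0.5, (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(p, X):
    h = np.tanh(X @ p["W1"] + p["b1"])   # hidden activations
    return h, h @ p["W2"] + p["b2"]      # linear output layer

def grads(p, X, y):
    h, out = forward(p, X)
    err = (out - y) / len(X)             # gradient of 1/2 * MSE w.r.t. outputs
    dh = err @ p["W2"].T * (1 - h ** 2)  # backprop through tanh
    return {"W1": X.T @ dh, "b1": dh.sum(axis=0),
            "W2": h.T @ err, "b2": err.sum(axis=0)}

def train(p, X, y, batch_size, epochs=200, lr=0.1):
    n = len(X)
    for _ in range(epochs):
        order = rng.permutation(n)       # reshuffle each epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            g = grads(p, X[idx], y[idx])
            for k in p:                  # one update per mini-batch
                p[k] -= lr * g[k]
    return p

# Synthetic task: full batch (batch_size = n) vs. mini-batches of 8.
X = rng.uniform(-1, 1, (64, 2))
y = np.sin(X.sum(axis=1, keepdims=True))
for bs in (len(X), 8):
    p = train(init_params(2, 8, 1), X, y, batch_size=bs)
    mse = np.mean((forward(p, X)[1] - y) ** 2)
    print(f"batch_size={bs:2d}  train MSE={mse:.4f}")
```

In this sketch the mini-batch updates run sequentially; in the parallel setting the abstract refers to, the per-example gradients within a batch are computed concurrently across workers and averaged before each update, which is where the reduction in training time comes from.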
