Comparison of Optimization Algorithms for the FairMOT Multi-Object Tracking Algorithm

Many methods are currently used for multi-object tracking, and deep learning algorithms are among the most widely studied. Achieving high performance with a deep learning based system requires tuning many parameters, and one of the most influential is the choice of optimization algorithm. This study compares the Adam, RMSProp, Rprop, and SGD optimization algorithms for training the FairMOT algorithm, using the MOT20 dataset. On the MOT20 validation set, the highest average accuracy, 76.7%, was obtained with the RMSProp optimization algorithm.
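The four optimizers compared here differ only in how they turn a gradient into a parameter update. As a minimal sketch, and not the training code used in the study, the following pure-Python functions apply textbook forms of each rule to a toy one-dimensional loss. The loss, the hyperparameter defaults, and all function names are illustrative assumptions, and the Rprop variant shown is one common form that skips the update after a gradient sign change.

```python
import math


def grad(x):
    # Gradient of the toy loss f(x) = (x - 3)^2, which is minimized at x = 3.
    return 2.0 * (x - 3.0)


def sgd(x, steps, lr=0.1):
    # Plain gradient descent: step proportional to the gradient.
    for _ in range(steps):
        x -= lr * grad(x)
    return x


def rmsprop(x, steps, lr=0.01, alpha=0.99, eps=1e-8):
    # Scale each step by a running average of squared gradients.
    v = 0.0
    for _ in range(steps):
        g = grad(x)
        v = alpha * v + (1.0 - alpha) * g * g
        x -= lr * g / (math.sqrt(v) + eps)
    return x


def adam(x, steps, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    # Bias-corrected running averages of the gradient and its square.
    m = 0.0
    v = 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


def rprop(x, steps, step=0.1, eta_plus=1.2, eta_minus=0.5,
          step_min=1e-6, step_max=50.0):
    # Use only the sign of the gradient; adapt the step size from sign changes.
    prev_g = 0.0
    for _ in range(steps):
        g = grad(x)
        if g * prev_g > 0:
            step = min(step * eta_plus, step_max)   # same sign: speed up
        elif g * prev_g < 0:
            step = max(step * eta_minus, step_min)  # sign flip: slow down
            g = 0.0                                 # and skip this update
        if g > 0:
            x -= step
        elif g < 0:
            x += step
        prev_g = g
    return x


def run_all(x0=0.0):
    # Run every optimizer from the same starting point and collect the results.
    return {
        "SGD": sgd(x0, 200),
        "RMSProp": rmsprop(x0, 2000),
        "Adam": adam(x0, 2000),
        "Rprop": rprop(x0, 200),
    }


if __name__ == "__main__":
    for name, x_final in run_all().items():
        print(f"{name}: x = {x_final:.4f}")
```

On this toy problem all four rules approach the minimum at x = 3. That says nothing about their relative ranking on a tracking network, which is the empirical question the study addresses.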
