Comparison of Object Detection and Classification Methods for Mobile Robots

Mobile robots, one of today's popular research fields, are widely used in entertainment, search and rescue, healthcare, military, agriculture, and many other domains thanks to ongoing technological developments. Object detection is one of the methods mobile robots use to gather and report information about their environment during these tasks. With the ability to detect and classify objects, a robot can determine the type and number of objects around it and use this knowledge for movement and path planning, or for reporting objects with desired features. The dimensions of mobile robots and the weight constraints of flying robots limit the use of these algorithms: while such platforms must remain relatively small and light, successful object classification algorithms demand processors with high computational power. In this study, object detection and classification methods were examined so that detection output can feed mapping and path planning, and the detection algorithms were compared with one another for deployment on lightweight, low-power platforms built around developer boards.
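As an illustration of the kind of lightweight detection pipeline such a comparison involves, the sketch below runs a YOLOv3-tiny detector through OpenCV's DNN module, the sort of setup that is feasible on a single-board computer such as a Raspberry Pi or Jetson. This is a minimal sketch, not the study's benchmark code: the model files ("yolov3-tiny.cfg", "yolov3-tiny.weights", "coco.names"), the input image name, and the 0.5/0.4 thresholds are placeholder assumptions chosen for the example.

```python
# Minimal sketch: lightweight object detection with OpenCV's DNN module.
# YOLOv3-tiny is used here as a representative low-compute detector; all
# file names and thresholds below are illustrative assumptions.
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov3-tiny.cfg", "yolov3-tiny.weights")
with open("coco.names") as f:
    classes = [line.strip() for line in f]

img = cv2.imread("input.jpg")
h, w = img.shape[:2]

# YOLO expects a square, normalized input; 416x416 trades accuracy for speed.
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
outputs = net.forward(net.getUnconnectedOutLayersNames())

boxes, confidences, class_ids = [], [], []
for output in outputs:
    for det in output:
        # Each row: [cx, cy, bw, bh, objectness, class scores...].
        scores = det[5:]
        class_id = int(np.argmax(scores))
        conf = float(scores[class_id])
        if conf > 0.5:
            cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
            boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
            confidences.append(conf)
            class_ids.append(class_id)

# Non-maximum suppression drops overlapping duplicate detections.
keep = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
for i in np.array(keep).flatten():
    x, y, bw, bh = boxes[i]
    print(f"{classes[class_ids[i]]}: {confidences[i]:.2f} at ({x}, {y}, {bw}, {bh})")
```

On a developer board, the same pipeline can be sped up by shrinking the input resolution or by switching the DNN backend and target (for example, to an available GPU or inference accelerator), which is exactly the accuracy-versus-latency trade-off a comparison of detectors on constrained hardware must quantify.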
