A Comprehensive Performance Comparison of Dedicated and Embedded GPU Systems

General-purpose computing on graphics processing units (GPGPU) is becoming increasingly important as graphics processing units (GPUs) grow more powerful and more widely used in performance-oriented computing. GPGPUs are mainstream performance hardware in workstation and cluster environments, and their behavior in such setups has been extensively analyzed. Recently, NVIDIA, the leading hardware and software vendor in GPGPU computing, started to produce more energy-efficient embedded GPGPU systems, the Jetson series GPUs, to make GPGPU computing more applicable in domains where energy and space are limited. Although the architecture of the GPUs in Jetson systems is the same as that of traditional dedicated desktop graphics cards, the interaction between the GPU and the other components of the system, such as main memory, the central processing unit (CPU), and the hard disk, differs considerably from traditional desktop solutions. To fully understand the capabilities of the Jetson series embedded solutions, in this paper we run several applications from many different domains and compare the performance characteristics of these applications on both embedded and dedicated desktop GPUs. After analyzing the collected data, we have identified certain application domains and program behaviors for which the Jetson series can deliver performance comparable to that of dedicated GPUs.

___

Dicle Üniversitesi Mühendislik Fakültesi Mühendislik Dergisi
  • ISSN: 1309-8640
  • Established: 2009
  • Publisher: DÜ Mühendislik Fakültesi / Dicle Üniversitesi