An optimized FPGA design of inverse quantization and transform for HEVC decoding blocks and validation in an SW/HW environment

This paper presents an optimized hardware architecture for the inverse quantization and inverse transform (IQ/IT) of a High Efficiency Video Coding (HEVC) decoder. The highly parallel, pipelined architecture supports all HEVC transform unit (TU) sizes: 4 × 4, 8 × 8, 16 × 16, and 32 × 32. The IQ/IT was described in VHDL (VHSIC hardware description language) and synthesized for a Xilinx XC7Z020 field-programmable gate array (FPGA) and for the TSMC 180 nm standard-cell library. In the worst case, the architecture sustains 1080p at 33 fps at 146 MHz when mapped to the FPGA and 1080p at 25 fps at 110 MHz when mapped to standard cells. The architecture was validated on the ZC702 platform in a software/hardware (SW/HW) environment in order to compare a pure SW implementation with SW/HW implementations in terms of power consumption and run time. The experimental results show that the SW/HW designs improve run-time speed by more than 70% relative to the SW solution and reduce power consumption by nearly 60%.
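For readers unfamiliar with the IQ/IT stage, the following is a minimal software sketch of what it computes for the smallest TU size (4 × 4). It is not the hardware architecture described in this paper. It assumes 8-bit video, a flat scaling factor of 16, and the 4 × 4 DCT core matrix of the HEVC standard; the function names `inverse_quantize` and `inverse_transform_4x4` are hypothetical and chosen for illustration only.

```python
# Illustrative software model of HEVC IQ/IT for one 4x4 TU.
# Assumptions (not taken from the paper): 8-bit samples, flat scaling
# factor m = 16, DCT-based 4x4 transform (no DST, no transform skip).

LEVEL_SCALE = [40, 45, 51, 57, 64, 72]  # indexed by QP % 6

DCT4 = [
    [64, 64, 64, 64],
    [83, 36, -36, -83],
    [64, -64, -64, 64],
    [36, -83, 83, -36],
]


def clip16(x):
    """Clamp an intermediate value to the signed 16-bit range."""
    return max(-32768, min(32767, x))


def inverse_quantize(levels, qp, bit_depth=8, log2_size=2, m=16):
    """Scale quantized levels back to transform coefficients."""
    bd_shift = bit_depth + log2_size - 5
    scale = LEVEL_SCALE[qp % 6] << (qp // 6)
    rnd = 1 << (bd_shift - 1)
    return [[clip16((lv * m * scale + rnd) >> bd_shift) for lv in row]
            for row in levels]


def inverse_transform_4x4(coeffs, bit_depth=8):
    """Separable 2-D inverse transform: columns first, then rows."""
    n = 4
    # Stage 1: 1-D inverse transform on each column, shift by 7.
    tmp = [[0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            acc = sum(DCT4[k][y] * coeffs[k][x] for k in range(n))
            tmp[y][x] = clip16((acc + 64) >> 7)
    # Stage 2: 1-D inverse transform on each row, shift by 20 - bit_depth.
    shift2 = 20 - bit_depth
    rnd2 = 1 << (shift2 - 1)
    res = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            acc = sum(DCT4[k][x] * tmp[y][k] for k in range(n))
            res[y][x] = (acc + rnd2) >> shift2
    return res


if __name__ == "__main__":
    # A TU holding only a DC level of 1, dequantized at QP 28.
    levels = [[1, 0, 0, 0]] + [[0, 0, 0, 0] for _ in range(3)]
    residual = inverse_transform_4x4(inverse_quantize(levels, qp=28))
    for row in residual:
        print(row)
```

A DC-only input produces a flat residual block, which is a convenient sanity check. Larger TU sizes follow the same two-stage structure with larger coefficient matrices, which is where a parallel, pipelined hardware design of the kind presented here pays off.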
