A New Arabic Coding Scheme

In this paper, we design a new Arabic letter encoding scheme, based on the characteristics of the Arabic language, to solve several Arabic coding problems, especially those related to formulation. The proposed scheme represents an Arabic letter together with its accent mark in one byte instead of two, reducing the size of the Arabic text by half. It can be used as a bilingual coding scheme in place of ASCII in an Arabic platform environment, or as a text compression scheme.
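To make the one-byte idea concrete, the following is a minimal sketch and not the code table proposed in the paper. It assumes a hypothetical layout in which the high 5 bits of a byte index one of the 28 basic Arabic letters and the low 3 bits index an optional accent mark. The names `LETTERS`, `MARKS`, `encode`, and `decode` are illustrative, and the shadda and all non-letter characters are left out to keep the example short.

```python
# Hypothetical bit layout (an assumption, not the paper's actual table):
#   bits 7..3 -> index of one of the 28 basic Arabic letters
#   bits 2..0 -> index of an optional accent mark (0 means "no mark")

# 28 basic letters, built from Unicode code points (alef, beh, teh..ghain, feh..waw, yeh).
LETTERS = [chr(c) for c in (0x0627, 0x0628, *range(0x062A, 0x063B),
                            *range(0x0641, 0x0649), 0x064A)]

# "No mark", then fathatan, dammatan, kasratan, fatha, damma, kasra, and sukun.
# Shadda (U+0651) is omitted so that the marks fit in 3 bits.
MARKS = [""] + [chr(c) for c in range(0x064B, 0x0651)] + [chr(0x0652)]

LETTER_INDEX = {ch: i for i, ch in enumerate(LETTERS)}
MARK_INDEX = {ch: i for i, ch in enumerate(MARKS) if ch}


def encode(text: str) -> bytes:
    """Pack each letter and the accent mark that follows it into one byte."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] not in LETTER_INDEX:
            raise ValueError(f"unsupported character at position {i}")
        letter = LETTER_INDEX[text[i]]
        i += 1
        mark = 0
        # Consume at most one accent mark directly after the letter.
        if i < len(text) and text[i] in MARK_INDEX:
            mark = MARK_INDEX[text[i]]
            i += 1
        out.append((letter << 3) | mark)
    return bytes(out)


def decode(data: bytes) -> str:
    """Unpack each byte back into a letter followed by its accent mark."""
    return "".join(LETTERS[b >> 3] + MARKS[b & 0b111] for b in data)


# Kaf, teh, and beh, each followed by a fatha: six code points in total.
word = "\u0643\u064E\u062A\u064E\u0628\u064E"
packed = encode(word)
```

Under these assumptions the six-character word packs into three bytes, half of what a one-byte-per-character Arabic code page would need, and `decode(packed)` restores the original string. Text with fewer accent marks would shrink by less than half in this sketch.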
