Detecting Defective Expressions in Turkish Sentences Using a Hybrid Deep Learning Method
(Original Turkish title: Hibrit bir Derin Öğrenme Yöntemi Kullanarak Türkçe Cümlelerdeki Anlatım Bozukluklarının Tespiti)

Defective expression (Turkish: anlatım bozukluğu) is a grammatical term that refers to both semantic and morphological ambiguities in Turkish sentences. Earlier studies applied Natural Language Processing (NLP) techniques by constructing rule-based, language-specific models. However, although rule-based systems have less demanding annotation requirements and make it easy to incorporate external knowledge, they face significant obstacles in terms of processing efficiency. Deep learning techniques such as long short-term memory (LSTM) and convolutional neural networks (CNN) have advanced considerably in recent years, leading to an unprecedented performance boost in NLP applications. In this study, a hybrid of LSTM and CNN (C-LSTM) is proposed for detecting defective expressions, and traditional machine learning classifiers, namely support vector machine (SVM) and random forest (RF), are included so that the results can be compared in terms of accuracy. The proposed hybrid approach achieved higher accuracy than the existing CNN and LSTM models as well as the traditional SVM and RF classifiers. This result indicates that deep neural approaches are coming into prominence over traditional classifiers for text classification.

Keywords:

Defective expression, Machine learning, Natural language processing, Semantic ambiguity, Turkish
