Co-occurrence Weight Selection for Word Embeddings to Enhance Test Performance

This study revisits the problem of optimizing the performance of mathematical word representations for a given task. Count-based methods for building word representations, which rely on word co-occurrence statistics, conventionally use counting weights. The study proposes novel weights in their place in order to improve performance on analogy and similarity tasks. Turkish was chosen as the language of study. When the corpus was built, the stem-suffix structure characteristic of Turkish was handled by treating every suffixed word form as a separate word. The performance of the proposed co-occurrence weights is analyzed as a function of the varied parameter, and the results are presented in the paper.
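To make the idea of a co-occurrence weight concrete, the following is a minimal sketch of how a count-based method might accumulate weighted co-occurrences within a context window. The abstract does not state which weights the study proposes, so everything here is an assumption for illustration: the function name `weighted_cooccurrence`, the `window` size, and the power-law decay `d ** -alpha` with its parameter `alpha` are hypothetical. Tokens are split on whitespace, so each suffixed word form counts as its own vocabulary entry, in line with the corpus treatment described above.

```python
from collections import defaultdict


def weighted_cooccurrence(tokens, window=5, weight=lambda d: 1.0):
    """Accumulate symmetric co-occurrence weights within a context window.

    `weight` maps the distance d (1..window) between two tokens to the
    amount added for that pair. The default adds 1.0 per pair, which
    corresponds to plain counting. Hypothetical helper, not the paper's code.
    """
    counts = defaultdict(float)
    for i, target in enumerate(tokens):
        for d in range(1, window + 1):
            j = i + d
            if j >= len(tokens):
                break
            w = weight(d)
            # Record the pair in both directions so the table is symmetric.
            counts[(target, tokens[j])] += w
            counts[(tokens[j], target)] += w
    return counts


# Each surface form ("ev", "evde", "evden") is a distinct vocabulary entry.
tokens = "ev evde evden ev evde".split()

# Plain counting: every pair inside the window contributes 1.
plain = weighted_cooccurrence(tokens, window=2)

# Assumed alternative: distance-decayed weights controlled by `alpha`.
alpha = 1.0
decayed = weighted_cooccurrence(tokens, window=2, weight=lambda d: d ** -alpha)
```

Passing the weight as a function keeps the counting loop unchanged while the weighting scheme and its parameter are varied, which is the kind of comparison the abstract describes.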

Uludağ Üniversitesi Mühendislik Fakültesi Dergisi
  • ISSN: 2148-4147
  • Publication frequency: 3 issues per year
  • First published: 2002
  • Publisher: Bursa Uludağ Üniversitesi, Mühendislik Fakültesi