Automated Synonym Dictionary Generation Tool for Turkish (ASDICT)

This paper briefly describes an Automated Synonym Dictionary Generation Tool for Turkish (ASDICT) and presents the development of its algorithms in detail. Applying ASDICT to the data of the Contemporary Turkish Dictionary (Güncel Türkçe Sözlük), published by the Turkish Language Association (Türk Dil Kurumu, TDK), yielded a synonym database. The synonym dictionary was generated by applying four processes. Words that these processes identified as certain synonyms were classified as Definite Synonym (Dn) and placed in the Synonym List (SLi). Some words that could not be classified as Dn were classified as Ambiguity and stored in the Ambiguity File (AF), to be checked by supervised methods in order to build a more reliable synonym database. The resulting synonym database for the Contemporary Turkish Dictionary, called the Definite Synonyms Database (DSDB), was made available on the official web site of TDK (TDK 2009).

___

Agnes, Michael E. (1999). Webster's New World College Dictionary, Fourth Edition. ISBN 978-0028631189.

Chief, Lian-Cheng (2000). "What can near synonyms tell us?", Computational Linguistics and Chinese Language Processing (5/1): 47-60.

Huang, Chu-Ren and Jia-Fei Hong (2005). "Deriving Conceptual Structures from Sense: A Study of Near Synonymous Sensation Verbs", Journal of Chinese Language and Computing (JCLC) (15/3). Singapore.

Donnelly, Colleen (1994). Linguistics for Writers, Albany: State University of New York Press.

Edmonds, Philip (1999). Semantic Representations of Near-Synonyms for Automatic Lexical Choice, Ph.D. thesis, Computer Science Department, University of Toronto, Toronto.

Martin, Marilyn (1984). Advanced Vocabulary Teaching: The Problem of Synonyms, The Modern Language Journal (68/2): 130-137.

Reiter, Ehud and Somayajulu Sripada (2002). "The SMART Retrieval System - Experiments in Automatic Document Processing", Computational Linguistics (28/4): 447-485.

Salton, Gerard (1971). The SMART Retrieval System: Experiments in Automatic Document Processing, Englewood Cliffs, NJ: Prentice-Hall Inc.

Turkish Language Association (Türk Dil Kurumu: TDK), tdk.org.tr/esveyakin/ (25.10.2009)

Wen, Ji-Rong, Jian-Yun Nie and Hong-Jiang Zhang (2002). "Query clustering using user logs", ACM Transactions on Information Systems (20/1): 59-81.