Character Level Authorship Attribution for Turkish Text Documents

Individuals have their own style of speaking and writing. The style of a text can therefore serve as a distinctive feature for recognizing its author. In recent years, practical applications of authorship attribution have grown in areas such as criminal law, civil law and computer security. Recent research has applied techniques from machine learning, information retrieval and natural language processing to authorship attribution. In this paper, statistical language modeling is applied to authorship attribution. Each author is represented by feature statistics: the letters, punctuation marks and special characters that make up the feature set are used to calculate the authors' profiles.
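
As a rough illustration of what a character-level author profile might look like, the sketch below computes the relative frequency of every letter, punctuation mark and special character in a text and attributes an unseen text to the author with the closest profile. It is only a minimal sketch under stated assumptions, not the method of this paper: the function names (char_profile, profile_distance, attribute), the use of single-character frequencies, and the L1 distance are all hypothetical choices made for the example. Case is left untouched, since naive lowercasing mishandles the Turkish dotted and dotless i.

```python
from collections import Counter


def char_profile(text):
    """Relative frequency of each non-whitespace character in text.

    Assumption: the profile is a plain unigram distribution over
    letters, punctuation marks and special characters.
    """
    counts = Counter(ch for ch in text if not ch.isspace())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ch: n / total for ch, n in counts.items()}


def profile_distance(p, q):
    """L1 distance between two profiles (a hypothetical choice of metric)."""
    chars = set(p) | set(q)
    return sum(abs(p.get(ch, 0.0) - q.get(ch, 0.0)) for ch in chars)


def attribute(text, author_texts):
    """Return the author whose profile is closest to that of text.

    author_texts maps an author name to that author's known writing.
    """
    target = char_profile(text)
    profiles = {name: char_profile(sample) for name, sample in author_texts.items()}
    return min(profiles, key=lambda name: profile_distance(target, profiles[name]))
```

For example, attribute(unknown_text, {"Author A": sample_a, "Author B": sample_b}) returns the name of the author whose character statistics are nearest to those of unknown_text, where the two samples stand for hypothetical training texts.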