Ali Aydın SELÇUK, M. Buğra TORUSDAĞ, Mucahid KUTLU

Evaluation of social bot detection models

Social bots automate activity on online social networks and can therefore be used to spread misinformation and malware. Consequently, many researchers have focused on detecting social bots automatically to reduce their negative impact on society. However, existing studies are hard to evaluate and compare because of difficulties and limitations in sharing datasets and models. In this study, we compare four bot detection systems in various settings using 20 public datasets. We show that high-quality datasets covering various types of social bots are critical for a reliable evaluation of bot detection methods. In addition, our experiments suggest that Botometer is preferable to the other systems we evaluated for detecting social bots.
