Discovering the same job ads expressed with the different sentences by using hybrid clustering algorithms

Text mining studies on job ads have become widespread in recent years as a way to determine the qualifications required for each position. Research on Turkish remains limited, whereas a large pool of resources exists for English. Kariyer.Net is the largest job-ad company in Turkey, and 99% of its ads are in Turkish. There is therefore a need to develop novel Natural Language Processing (NLP) models for Turkish in order to analyze this large database. In this study, the job ads of Kariyer.Net were analyzed, and the hidden associations in this big dataset were discovered using a hybrid clustering algorithm. First, all ads, which were stored as HTML code, were transformed into regular sentences by extracting the inner text from the HTML. Then, these inner texts containing the core ads were converted into sub-ads by traditional methods. After these NLP steps, hybrid clustering algorithms were applied, and the same ads expressed in different sentences could be detected. The analysis focused on 57 positions in the Information Technology sector, covering 6,897 ad texts. The resulting clusters contain useful outcomes, and the proposed model can be used to discover the common and unique ads for each position.
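The pipeline described above (HTML to inner text, inner text to sub-ads, then grouping of equivalent sub-ads) can be illustrated with a minimal Python sketch. The abstract does not specify the extraction tooling, the splitting rules, or the hybrid clustering algorithm, so everything here is an assumption: the names `InnerTextExtractor`, `html_to_sub_ads`, `jaccard`, and `group_sub_ads` are hypothetical, the splitting rule is a simple heuristic, and the greedy token-overlap grouping is only a stand-in for the clustering step, not the method used in the study.

```python
import re
from html.parser import HTMLParser


class InnerTextExtractor(HTMLParser):
    """Collects text nodes and treats block-level tags as line breaks."""

    # Assumed set of tags that separate one requirement from the next.
    BLOCK_TAGS = {"p", "li", "br", "div", "ul", "ol"}

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_sub_ads(html):
    """Strip the markup, then split the inner text into sub-ads."""
    parser = InnerTextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Split on line breaks, or on sentence punctuation followed by
    # whitespace or end of text, so tokens such as ".NET" stay intact.
    pieces = re.split(r"\n+|[.;!?]+(?:\s+|$)", text)
    return [" ".join(p.split()) for p in pieces if p.strip()]


def jaccard(a, b):
    """Token-set overlap between two sub-ads, ignoring word order."""
    ta = set(re.findall(r"\w+", a.lower()))
    tb = set(re.findall(r"\w+", b.lower()))
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def group_sub_ads(sub_ads, threshold=0.6):
    """Greedy grouping: join the first group whose representative
    (its first member) is similar enough, otherwise start a new group.
    This is a stand-in, not the hybrid clustering of the study."""
    groups = []
    for sub_ad in sub_ads:
        for group in groups:
            if jaccard(sub_ad, group[0]) >= threshold:
                group.append(sub_ad)
                break
        else:
            groups.append([sub_ad])
    return groups


# Two invented ads that state the same requirement in different words.
ad_a = ("<p>Aranan nitelikler:</p><ul><li>En az 3 yıl Java deneyimi.</li>"
        "<li>SQL bilgisi olan</li></ul>")
ad_b = "<div>Java deneyimi en az 3 yıl<br>Takım çalışmasına yatkın</div>"

sub_ads = html_to_sub_ads(ad_a) + html_to_sub_ads(ad_b)
for group in group_sub_ads(sub_ads):
    print(group)
```

In this toy input, the two differently worded Java requirements share the same token set and therefore fall into one group, while the remaining sub-ads stay in groups of their own. A realistic pipeline for Turkish would likely also need morphological normalization, which this sketch omits.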
