Automatic concept identification of software requirements in Turkish

  Software requirements describe the features of the target system and express the expectations of users. In the analysis phase, requirements are transformed into easy-to-understand conceptual models that facilitate communication between stakeholders. Although analysts still create conceptual models from requirements mostly by hand, the number of models that automate this process has increased recently. Most of these models and tools are developed to analyze requirements in English, and there is no study for agglutinative languages such as Turkish or Finnish. In this study, we propose an automatic concept identification model that transforms Turkish requirements into Unified Modeling Language class diagrams to ease the work of the software team and reduce the cost of software projects. The proposed work is based on natural language processing techniques, and a new rule set of twenty-six rules is created to identify object-oriented design elements in requirements. Since no publicly available dataset exists in online repositories, we have created a well-defined dataset containing twenty software requirements in Turkish and have made it publicly available on GitHub for use by other researchers. We also propose a novel evaluation model based on the analytic hierarchy process that takes experts' views into account, and we measure the performance of the overall system as 89%. This result is promising for future work in this domain.
