Integration of Business Process Data using Advanced ETL tools
The rapid spread of the business process management (BPM) discipline has considerably advanced companies' information systems, which have become ubiquitous. Consequently, a huge volume of data is generated during business process (BP) execution and stored in dedicated IT systems. However, such data are often heterogeneous, and integrating them is an acute problem that must be solved to enable data analysis and to coordinate companies' activities. Different techniques and approaches have been suggested in the research literature to tackle this issue. In the specific context of business intelligence, the challenge is addressed by deploying conventional ETL software tools, which build a single container that supports various data and is perceived as a data warehouse. Nevertheless, current distributed information systems span enterprise boundaries and deal with a large variety of data sources that have various formats and produce data continuously. Thus, current ETL tools seem unsuitable for handling massive data, and they remain limited in facing the issues raised by business process execution data stored in log files. In this paper, we propose an enhancement of the structure and functionalities of standard ETL tools in order to handle the integration of heterogeneous data generated by business process execution. The improved system, named OLE-ST, constitutes a fundamental enrichment of the existing ETL mechanism. The proposed approach has been implemented in a software tool that ensures the best performance when exploiting the target data warehouse to be built.