A Novel Approach to Improve the Performance of the Database Storing Big Data with Time Information

Big data refers to data sets that are too large or complex to be processed by classical data processing methods. Big data analysis is essential because it allows institutions and organizations to use their data to make better-informed business decisions, run more efficient operations, and achieve higher profits. However, large datasets are difficult to analyze because they are produced quickly, require large amounts of storage, and are highly diverse. This study proposes a new denormalization-based approach to reduce query response time in database systems that store large volumes of data containing temporal information. Denormalization is the process of adding redundant rows or columns to a normalized database in order to improve read performance. The approach is evaluated on a large dataset of real spatial data from Kayseri Metropolitan Municipality, which contains temporal information and comprises approximately 96,000,000 rows. To improve the performance of temporal queries over large volumes of data recorded in date format, the proposed approach records the time information as numbers, which shortens query response time. The performance of the proposed method is compared with that of the normalization method using real data on Microsoft SQL Server and Oracle database systems. The experimental evaluations show that the proposed method runs approximately eight times faster. In addition, the experimental results show that the performance advantage of the proposed method over the normalization-based method grows as the data size increases.
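The idea of recording time information as numbers can be illustrated with a minimal sketch. It is not the implementation evaluated in the study: it uses an in-memory SQLite database rather than Microsoft SQL Server or Oracle, and the table name `vehicle_positions`, the column names, and the `YYYYMMDDHHMMSS` integer encoding are all assumptions made for illustration, since the abstract does not specify the schema or the encoding. The sketch only shows that a range query on the numeric column selects the same rows as the equivalent query on the date-format column; it makes no claim about speed.

```python
import sqlite3
from datetime import datetime, timedelta

DATE_FMT = "%Y-%m-%d %H:%M:%S"


def to_numeric(ts: datetime) -> int:
    """Encode a timestamp as an integer such as 20210315083000.

    Assumed encoding (not taken from the paper): YYYYMMDDHHMMSS.
    Integer order matches chronological order for this layout.
    """
    return int(ts.strftime("%Y%m%d%H%M%S"))


def build_demo_db(n_rows: int = 1000) -> sqlite3.Connection:
    """Create a hypothetical table with one synthetic row per hour."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vehicle_positions ("
        " id INTEGER PRIMARY KEY,"
        " recorded_at TEXT NOT NULL,"       # time stored in date format
        " recorded_num INTEGER NOT NULL)"   # same time stored as a number (assumed redundant column)
    )
    start = datetime(2021, 1, 1)
    rows = []
    for i in range(n_rows):
        ts = start + timedelta(hours=i)
        rows.append((ts.strftime(DATE_FMT), to_numeric(ts)))
    conn.executemany(
        "INSERT INTO vehicle_positions (recorded_at, recorded_num) VALUES (?, ?)",
        rows,
    )
    # Index on the numeric column so range queries can use integer comparisons.
    conn.execute("CREATE INDEX idx_recorded_num ON vehicle_positions (recorded_num)")
    return conn


def count_by_date(conn: sqlite3.Connection, lo: datetime, hi: datetime) -> int:
    """Count rows in [lo, hi) by comparing the date-format column."""
    sql = (
        "SELECT COUNT(*) FROM vehicle_positions"
        " WHERE datetime(recorded_at) >= datetime(?)"
        " AND datetime(recorded_at) < datetime(?)"
    )
    return conn.execute(sql, (lo.strftime(DATE_FMT), hi.strftime(DATE_FMT))).fetchone()[0]


def count_by_number(conn: sqlite3.Connection, lo: datetime, hi: datetime) -> int:
    """Count rows in [lo, hi) by comparing the numeric column."""
    sql = (
        "SELECT COUNT(*) FROM vehicle_positions"
        " WHERE recorded_num >= ? AND recorded_num < ?"
    )
    return conn.execute(sql, (to_numeric(lo), to_numeric(hi))).fetchone()[0]


if __name__ == "__main__":
    conn = build_demo_db()
    lo, hi = datetime(2021, 1, 10), datetime(2021, 1, 20)
    print("date-format query:", count_by_date(conn, lo, hi))
    print("numeric query:    ", count_by_number(conn, lo, hi))
```

Keeping both columns in the sketch reflects the denormalization trade-off described above: the numeric copy costs extra storage and must be kept consistent with the date column on every write, in exchange for simpler integer comparisons at read time. Any actual speedup depends on the database engine, indexing, and data volume, and would have to be measured on the target system.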
