Knowledge Discovery Using Clustering Methods in Medicine: A Case Study for Reflux Disease

As digitalization spreads around the world, the amount of data collected keeps rising. This growth makes it worthwhile to exploit the data with methods such as data mining, which is already used in several industries. Medical data are especially important to understand, so working on them is crucial. Reflux disease is a painful illness that is spreading worldwide and is more common than previously reported patient numbers suggest. Although reflux is not as fatal as cancer, it lowers quality of life, makes many people suffer in their daily lives, and thereby directly affects mental health. Easing the diagnosis of reflux may therefore give people a better quality of life. In this study, various data mining algorithms are applied, and the results indicate that medical care can be improved. Artificial intelligence applications in gastroenterology now feature prominently in the literature. However, the large reflux-specific database needed to implement such applications is available only at the Reflux Research Center at Ege University in Turkey. Using the Short Form-36 and Quadrad12 questionnaire data in this database, responses from 3,909 patients were analyzed with several artificial intelligence algorithms to discover hidden associations among the quality-of-life responses of these patients. The algorithms tested were Apriori, Frequent Pattern Growth, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Self-Organizing Map (SOM), and KMeans. The tests showed that KMeans was the most successful algorithm given the structure of the data, and a set of 27 remarkable rules was obtained at the optimal Sum of Squared Errors (SSE) value.
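The idea of choosing a KMeans solution by its sum of squared errors can be illustrated with a minimal sketch. It is not the study's implementation: the data are synthetic two-dimensional scores generated for illustration only (`make_toy_scores` and its blob centers are hypothetical), the clustering is plain Lloyd's algorithm with random restarts, and the range of cluster counts is an assumption.

```python
import random


def sse(points, centers, labels):
    """Sum of squared Euclidean distances from each point to its assigned center."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(point, centers[label]))
        for point, label in zip(points, labels)
    )


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm with random initial centers; returns (centers, labels, sse)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are stable too
        labels = new_labels
        # Update step: move each non-empty cluster's center to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels, sse(points, centers, labels)


def make_toy_scores(seed=1):
    """Hypothetical data: three groups of 2-D scores, NOT the study's questionnaire data."""
    rng = random.Random(seed)
    blobs = [(20, 30), (50, 80), (85, 40)]
    return [
        (cx + rng.uniform(-5, 5), cy + rng.uniform(-5, 5))
        for cx, cy in blobs
        for _ in range(30)
    ]


points = make_toy_scores()

# For each candidate k, keep the lowest SSE found over a few random restarts.
sse_by_k = {
    k: min(kmeans(points, k, seed=s)[2] for s in range(5))
    for k in range(1, 7)
}

for k, value in sse_by_k.items():
    print(f"k={k}: SSE={value:.1f}")
```

Reading the printed values for the point where adding clusters stops reducing the SSE substantially (the "elbow") is one common way to pick the number of clusters; whether the study used exactly this criterion is not stated in the abstract.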
