Classification of Apple Varieties: Comparison of Ensemble Learning and Naive Bayes Algorithms in H2O Framework

In this study, H2O machine learning classification techniques were used to classify apples according to fruit skin color. Sixty samples of each variety were evaluated. Fruit color was measured with a portable spectrophotometer and expressed in the L*, a*, b* color space. The data set was built from randomly selected fruits of the Red Delicious, Golden Delicious, and Granny Smith varieties. The H2O Gradient Boosting Machine, H2O Random Forest, and H2O Naive Bayes algorithms were used for data analysis. The data set was partitioned into 70% for training and 30% for testing. Classifier performance is reported in the conclusion section of the study in terms of accuracy (%), error (%), F-measure, Cohen's kappa, recall, precision, and true positive (TP), false positive (FP), true negative (TN), and false negative (FN) counts. Accuracy was 100.0% for the H2O Gradient Boosting Machine, 98.4% for H2O Random Forest, and 100.0% for H2O Naive Bayes.
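The performance measures listed above can all be derived from a confusion matrix. The following is a minimal Python sketch of how they are commonly computed for a three-class problem. It is not the study's code and does not use H2O. The confusion matrix values, the class order, and all function names are hypothetical and chosen only for illustration. The 18 test samples per variety assume an evenly stratified 30% split of 60 samples per variety, which the abstract does not state.

```python
from typing import Dict, List

Matrix = List[List[int]]  # rows = actual class, columns = predicted class


def per_class_counts(cm: Matrix, k: int) -> Dict[str, int]:
    """One-vs-rest TP, FP, FN, TN counts for class index k."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                 # actual k, predicted as another class
    fp = sum(row[k] for row in cm) - tp  # another class, predicted as k
    tn = total - tp - fn - fp
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


def class_metrics(cm: Matrix, k: int) -> Dict[str, float]:
    """Precision, recall, and F-measure (F1) for class index k."""
    c = per_class_counts(cm, k)
    precision = c["TP"] / (c["TP"] + c["FP"]) if c["TP"] + c["FP"] else 0.0
    recall = c["TP"] / (c["TP"] + c["FN"]) if c["TP"] + c["FN"] else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f_measure": f_measure}


def accuracy(cm: Matrix) -> float:
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def cohens_kappa(cm: Matrix) -> float:
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = accuracy(cm)
    row_sums = [sum(cm[i]) for i in range(n)]
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    p_expected = sum(row_sums[i] * col_sums[i] for i in range(n)) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical test-set confusion matrix (NOT the study's results).
# Assumed class order: Red Delicious, Golden Delicious, Granny Smith.
cm = [
    [18, 0, 0],
    [0, 17, 1],
    [0, 0, 18],
]

if __name__ == "__main__":
    acc = accuracy(cm)
    print(f"accuracy = {acc:.3%}, error = {1 - acc:.3%}")
    print(f"Cohen's kappa = {cohens_kappa(cm):.4f}")
    for k, name in enumerate(["Red Delicious", "Golden Delicious", "Granny Smith"]):
        print(name, per_class_counts(cm, k), class_metrics(cm, k))
```

With this hypothetical matrix, one misclassified sample out of 54 gives an accuracy of 53/54 and a kappa of 35/36. A classifier with 100.0% accuracy would have a purely diagonal matrix and a kappa of 1.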
