Data Mining and Application of Decision Tree Modelling on Electrochemical Data Used for Damaged Starch Detection

In this study, unsupervised and supervised machine learning techniques, namely principal component analysis and classification tree modelling, were applied to iodine oxidation voltammetric data in order to determine routes to, and extract information about, the electrochemical conditions that lead to different damaged starch ratios in flour. For this purpose, a database of 3542 observations was used after normalization and removal of outliers. Although it was almost impossible to generalize or to determine correlations from voltammetric data collected at different conditions, principal component analysis indicated that, on the platinum electrode, UCD values of 16.5 were mostly seen at high potentials. The optimized decision tree indicated that the impact of the variables on UCD values can be ordered as current density > potential > electrode type > KI concentration, and it gave routes to UCD values through leaf nodes with high class membership. The classification tree model could be improved further with additional input variables. Machine learning with decision tree modelling could therefore open perspectives for practical and fast prediction of the damaged starch ratio, which would help the food industry speed up flour analysis and reduce its cost.
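
To make the variable-ranking idea concrete, the following is a minimal Python sketch of how a classification tree scores candidate input variables by Gini impurity reduction after outliers have been filtered out. It is not the authors' code or data: the 300 synthetic observations, the injected outlier, the |z| <= 3 cutoff, the "high"/"low" class rule, and all function and variable names are assumptions made for illustration. The root-level gain computed here is only a rough proxy for the variable importance reported by a fully grown, optimized tree.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def zscore(column):
    """Standardize a column to zero mean and unit (population) variance."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma if sigma else 0.0 for v in column]


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split_gain(values, labels):
    """Largest Gini impurity reduction over all thresholds on one variable."""
    parent, n, best = gini(labels), len(labels), 0.0
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


# Synthetic stand-in for the voltammetric database (hypothetical values).
names = ["current_density", "potential", "electrode_type", "KI_concentration"]
random.seed(0)
rows = []
for _ in range(300):
    j = random.uniform(0, 10)    # current density, arbitrary units
    e = random.uniform(0, 1)     # potential, arbitrary units
    el = random.choice([0, 1])   # electrode type encoded as 0/1
    ki = random.uniform(0, 1)    # KI concentration, arbitrary units
    label = "high" if j + 2 * e > 6 else "low"  # invented class rule
    rows.append((j, e, el, ki, label))
rows.append((1000.0, 0.5, 0, 0.5, "high"))  # one deliberately extreme record

# Outlier filter: drop rows whose continuous variables have |z| > 3.
cols = list(zip(*rows))
z = [zscore(cols[i]) for i in (0, 1, 3)]
clean = [r for k, r in enumerate(rows) if all(abs(zc[k]) <= 3.0 for zc in z)]

# Rank variables by their best single-split impurity reduction.
labels = [r[4] for r in clean]
gains = {
    name: best_split_gain([r[i] for r in clean], labels)
    for i, name in enumerate(names)
}
for name, g in sorted(gains.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} {g:.3f}")
```

Because the synthetic class rule is driven mainly by current density, that variable receives the largest gain by construction; the ordering says nothing about the real data. Threshold splits are unaffected by monotonic rescaling, so the z-scores are used here only for the outlier filter.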
