Classification of Baby Cries Using Machine Learning Algorithms

People communicate with each other constantly, mostly through language. Until a newborn acquires language, crying is its most effective means of communication. Although adults often find baby cries bothersome, the cries can carry a wealth of information. In this study, we aim to interpret the information embedded in baby cry audio signals using sound processing methods and to classify the signals with machine learning algorithms. To this end, we used a dataset of baby cry audio signals divided into five classes. Features were extracted from the dataset, and the performance of several classification algorithms was measured on those features. To examine the effect of data augmentation, each recording was then partitioned into equal segments and the performance metrics were measured again. Comparing the metrics before and after augmentation showed that the segmentation-based augmentation improved classification accuracy.
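The augmentation step described above, partitioning each recording into equal segments, can be illustrated with a minimal sketch. The abstract does not give the segment count, the handling of leftover samples, or the class names, so `n_segments`, the choice to drop trailing samples, the function names, and the placeholder labels below are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def split_into_equal_segments(signal: Sequence, n_segments: int) -> List[Sequence]:
    """Cut one recording into n_segments pieces of equal length.

    Assumption: trailing samples that do not fill a whole segment are dropped.
    Works on any sliceable sequence (a Python list or a NumPy array).
    """
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    seg_len = len(signal) // n_segments
    if seg_len == 0:
        raise ValueError("signal has fewer samples than n_segments")
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]


def augment_by_segmentation(
    clips: Sequence[Sequence], labels: Sequence[str], n_segments: int
) -> Tuple[List[Sequence], List[str]]:
    """Replace each clip by its segments; each segment inherits the clip's label."""
    out_clips: List[Sequence] = []
    out_labels: List[str] = []
    for clip, label in zip(clips, labels):
        for segment in split_into_equal_segments(clip, n_segments):
            out_clips.append(segment)
            out_labels.append(label)
    return out_clips, out_labels


if __name__ == "__main__":
    # Toy integer "recordings" standing in for audio samples (hypothetical data).
    toy_clips = [list(range(10)), list(range(7))]
    toy_labels = ["class_a", "class_b"]  # placeholder class names
    segments, segment_labels = augment_by_segmentation(toy_clips, toy_labels, n_segments=3)
    print(len(segments), segment_labels)
```

One design point worth keeping in mind with this kind of augmentation: if segments of the same recording end up in both the training and the test set, the measured accuracy can be optimistic, so a common precaution is to split the recordings first and segment afterwards. Whether the study did so is not stated in the abstract.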
