Segmentation of Portrait Images Using a Deep Residual Network Architecture

Segmenting portrait images into semantic regions is an important step towards scene understanding and image analysis. Although segmentation is a very active field of study, few studies address portrait segmentation. One of the most crucial steps in portrait segmentation is fine-grained segmentation, in which semantically related pixels are grouped together into regions such as hair, face, body, and background. This is a challenging problem because of the extreme variation in hair shape and color and in the background. To handle such variation, we propose a deep residual network based on the ERFNet architecture and use geometrically normalized faces as input to the network. Experiments on Adobe's Portrait Segmentation dataset (EG1800, two classes) and the LFW Part Labels dataset (three classes) show that the proposed method achieves state-of-the-art mIoU (mean intersection over union) and pixel-based accuracy. We obtained 96.37% mIoU and 98.17% pixel-based accuracy on the EG1800 dataset, and 90.1% mIoU and 97.14% pixel-based accuracy on the LFW dataset.
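
As an illustration of the two reported metrics, the following minimal sketch computes pixel-based accuracy and mIoU from a pair of label masks. It is not the evaluation code used in the study: the function names, the toy masks, and the class ids (0 = background, 1 = hair, 2 = face) are assumptions made for the example, and the abstract does not state whether scores are averaged per image or accumulated over the whole test set.

```python
def confusion_matrix(truth, pred, num_classes):
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for truth_row, pred_row in zip(truth, pred):
        for t, p in zip(truth_row, pred_row):
            matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """Fraction of pixels whose predicted class equals the ground truth."""
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def mean_iou(matrix):
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                       # class c pixels missed
        fp = sum(matrix[r][c] for r in range(n)) - tp  # pixels wrongly labeled c
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both masks
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Hypothetical 2x4 masks: 0 = background, 1 = hair, 2 = face.
truth = [[0, 0, 1, 1],
         [0, 2, 2, 2]]
pred = [[0, 1, 1, 1],
        [0, 2, 2, 0]]

cm = confusion_matrix(truth, pred, num_classes=3)
print(f"pixel accuracy: {pixel_accuracy(cm):.4f}")
print(f"mIoU: {mean_iou(cm):.4f}")
```

Because mIoU averages over classes rather than over pixels, a small region such as hair weighs as much as the background, which is why mIoU is typically lower than pixel-based accuracy on the same predictions.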

Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen ve Mühendislik Dergisi
  • ISSN: 1302-9304
  • Publication frequency: 3 issues per year
  • First published: 1999
  • Publisher: Dokuz Eylül Üniversitesi Mühendislik Fakültesi