Anıl ÇELİK, Nafiz ARICA

Enhancing face pose normalization with deep learning

In this study, we propose a hybrid method for face pose normalization that combines a 3-D model-based method with a stacked denoising autoencoder (SDAE) deep network. Instead of applying a mirroring operation to the invisible face parts of the posed image, the SDAE learns how to fill in those regions from a large set of training samples. In the performance evaluation, we compare the proposed method to four different pose normalization methods and investigate their effects on facial emotion recognition and verification problems, in addition to visual quality tests. The methods evaluated in the experiments include 2-D alignment, a 3-D model-based method, a pure SDAE-based method, and a generative adversarial network-based normalization method. Experiments performed on the Multi-PIE dataset show that the proposed method produces visually reasonable results and outperforms the others in facial emotion recognition. On the other hand, 2-D alignment is sufficient in the verification problem, where the detailed face characteristics should be preserved in the normalization process.
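
To make the contrast between mirroring and learned filling concrete, the following is a minimal sketch and not the authors' implementation. It assumes a frontalized grayscale image stored as nested lists together with a hypothetical per-pixel visibility mask. `mirror_fill` illustrates the mirroring baseline, and `corrupt` shows one plausible way to build the masked input of a denoising autoencoder training pair, with the complete frontal image as the target. Both function names, the mask representation, and the fill value are assumptions made for illustration.

```python
from typing import List

Image = List[List[float]]
Mask = List[List[bool]]


def mirror_fill(image: Image, visible: Mask) -> Image:
    """Fill each invisible pixel with its horizontally mirrored counterpart.

    A pixel is copied only when the mirrored pixel is itself visible;
    otherwise the original value is left unchanged.
    """
    filled = [row[:] for row in image]
    for y, row in enumerate(image):
        width = len(row)
        for x in range(width):
            mx = width - 1 - x  # column of the mirrored pixel
            if not visible[y][x] and visible[y][mx]:
                filled[y][x] = row[mx]
    return filled


def corrupt(image: Image, visible: Mask, fill_value: float = 0.0) -> Image:
    """Blank out invisible pixels to form a denoising-autoencoder input.

    Assumption: the network would be trained to map this masked image
    back to the complete frontal image.
    """
    return [
        [pixel if seen else fill_value for pixel, seen in zip(row, mask_row)]
        for row, mask_row in zip(image, visible)
    ]


if __name__ == "__main__":
    face = [[1.0, 2.0, 3.0, 4.0],
            [5.0, 6.0, 7.0, 8.0]]
    mask = [[True, True, False, False],
            [True, True, True, False]]
    print(mirror_fill(face, mask))
    print(corrupt(face, mask))
```

Mirroring assumes the face is symmetric and evenly lit, which is the limitation that a learned fill is meant to avoid.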

