Determining overfitting and underfitting in generative adversarial networks using Fréchet distance

Generative adversarial networks (GANs) are useful in a wide range of applications that require drawing samples from a data distribution without representing it explicitly. Unlike deep convolutional neural networks (CNNs) trained to map an input to one of several outputs, GANs do not classify data but generate it, so monitoring overfitting and underfitting in them is not trivial. For CNNs, training-set and validation-set accuracy give a direct indication of overfitting and underfitting during training, whereas the evaluation of GANs relies mainly on visual inspection of the generated samples and on the generator and discriminator costs. Unfortunately, visual inspection is far from objective, and the generator and discriminator costs are hard to interpret. This paper proposes a method for quantitatively determining overfitting and underfitting in GANs during training by calculating the approximate derivative of the Fréchet distance between the generated data distribution and the real data distribution, either unconditionally or conditioned on a specific class. Both distributions can be obtained from the distribution of the embedding in the discriminator network of the GAN. The method is independent of the architecture and the cost function of the GAN, and empirical results on MNIST and CIFAR-10 support its effectiveness.
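
The two quantities named in the abstract can be illustrated with a minimal sketch. It assumes the discriminator embeddings of real and generated samples have already been extracted into arrays, and that each set is summarized by a Gaussian, so the Fréchet distance takes the closed form for multivariate normal distributions (the squared form, as commonly reported). The function names `frechet_distance` and `approx_derivative`, the finite-difference scheme, and the synthetic data are illustrative assumptions, not the authors' implementation. The abstract does not state how the derivative is turned into an overfitting or underfitting decision, so no decision rule is shown.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Squared Fréchet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (n_samples, n_features), n_samples >= 2.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))
    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b; these are real and non-negative in exact
    # arithmetic, so small numerical negatives are clipped to zero.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)


def approx_derivative(values, step=1):
    """Finite-difference slope of a sequence of distances (hypothetical scheme)."""
    values = np.asarray(values, dtype=float)
    return (values[step:] - values[:-step]) / step


if __name__ == "__main__":
    # Synthetic stand-ins for discriminator embeddings (assumption for the demo).
    rng = np.random.default_rng(0)
    real = rng.normal(size=(1000, 8))
    # Pretend the generator drifts toward the real data over four checkpoints.
    history = [
        frechet_distance(real, rng.normal(loc=shift, size=(1000, 8)))
        for shift in (2.0, 1.0, 0.5, 0.25)
    ]
    print("distances:", history)
    print("approximate derivative:", approx_derivative(history))
```

For the class-conditional variant mentioned in the abstract, the same function could be applied to the embeddings of a single class only; how the classes are selected is left open here.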

Turkish Journal of Electrical Engineering and Computer Sciences
  • ISSN: 1300-0632
  • Publication frequency: 6 issues per year
  • Publisher: TÜBİTAK