On the consistency of Bayes estimates for the infinite continuous mixture of Dirichlet distributions

In this paper, we introduced the infinite continuous mixture of Dirichlet distributions as a generalization of the infinite mixture of Dirichlet distributions, in order to remove the limitation of having to choose the \textit{a priori} sample size for the expectation \textit{a posteriori} estimator. Since the mixture of posterior distributions is difficult to obtain analytically, Monte Carlo sampling was used to approximate it. A new parametrization of the proposed distribution was derived, and a mixture expectation \textit{a posteriori} estimator of the unknown parameters was then suggested. The proposed estimator solves the problem of constructing a Bayesian estimator of proportions without specifying particular parameters or a sample size for the prior knowledge. Some asymptotic properties of this estimator were derived, namely its bias and variance. Its consistency and asymptotic normality as the sample size tends to infinity were also established, and its credible interval was determined. The performance of the proposed estimator was assessed both theoretically and through a simulation study. Finally, a comparative simulation study of the learned estimates (the proposed mixture expectation \textit{a posteriori} estimator, the standard Bayesian estimator, the maximum likelihood estimator and the Jeffreys estimator) was carried out. Based on this simulation, we concluded that the prior infinite mixture of Dirichlet distributions offers higher accuracy and flexibility for modeling and learning data.
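The Monte Carlo construction summarized above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the mixing density over the Dirichlet hyperparameters is taken here as i.i.d. Gamma(2, 1) purely for concreteness, and the function name `mixture_eap` is hypothetical. It shows the general recipe: draw hyperparameter vectors from a continuous prior, form the conjugate Dirichlet posterior mean for each draw, and average those means weighted by the marginal likelihood of the observed counts.

```python
import numpy as np
from math import lgamma

rng = np.random.default_rng(0)
lg = np.vectorize(lgamma)  # elementwise log-gamma on arrays

def mixture_eap(counts, n_draws=5000):
    """Monte Carlo sketch of a mixture expectation a posteriori (EAP)
    estimator of multinomial proportions. Assumption: the continuous
    mixing density over the Dirichlet hyperparameters is i.i.d.
    Gamma(2, 1), chosen only for illustration."""
    x = np.asarray(counts, dtype=float)
    n, k = x.sum(), x.size
    # Draw hyperparameter vectors alpha from the assumed mixing density.
    alpha = rng.gamma(shape=2.0, scale=1.0, size=(n_draws, k))
    a0 = alpha.sum(axis=1)
    # Dirichlet-multinomial conjugacy: posterior mean given each alpha draw.
    post_mean = (x + alpha) / (n + a0)[:, None]
    # Log marginal likelihood of the counts given alpha (the multinomial
    # coefficient is omitted, being constant in alpha).
    log_m = lg(a0) - lg(a0 + n) + (lg(alpha + x) - lg(alpha)).sum(axis=1)
    # Self-normalized importance weights, stabilized via the max-log trick.
    w = np.exp(log_m - log_m.max())
    w /= w.sum()
    # Mixture EAP: likelihood-weighted average of the posterior means.
    return w @ post_mean

est = mixture_eap([8, 1, 1])
print(est)
```

Because each posterior mean lies on the simplex and the weights sum to one, the resulting estimate is itself a valid proportion vector, shrunk toward the prior mean for small samples.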
