Infinite Gaussian Mixture Modeling with an Improved Estimation of the Number of Clusters

Avi Matza, Yuval Bistritz

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Infinite Gaussian mixture modeling (IGMM) is a modeling method that determines all the parameters of a Gaussian mixture model (GMM), including its order. It is well documented that IGMM is a consistent estimator of probability density functions (PDFs) in the sense that, given enough training data from a sufficiently regular PDF, it converges to the shape of the original density curve. It is also known, however, that IGMM provides an inconsistent estimate of the number of clusters. This paper shows that the inconsistency takes the form of an overestimation and pinpoints the problem as an inherent part of the training algorithm. It stems mostly from a “self-reinforcing feedback”: a relation between the likelihood function of one of the model hyperparameters (α) and the probability of sampling the number of components that sustains their mutual growth during the Gibbs iterations. We show that this problem can be resolved by using informative priors for α and propose a modified training procedure that uses the inverse χ² distribution for this purpose. The modified algorithm successfully recovers the “known” order in all the experiments with synthetic data sets. It also demonstrates good results on real-world databases when compared to other methods used to evaluate model order. Furthermore, the improved performance is attained without undermining the fidelity of estimating the original PDFs and with a significant reduction in computational cost.
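The abstract does not give the sampler's equations, so the following is only a rough illustration of why α and the number of components are coupled. It assumes the standard Dirichlet-process (Chinese restaurant process) prior commonly used for infinite mixtures, under which the prior expected number of occupied components for n points is the sum of α / (α + i) for i = 0, ..., n - 1. The function name `expected_num_components` is hypothetical and is not taken from the paper; the sketch shows only the prior side of the relation, not the authors' modified training procedure.

```python
def expected_num_components(alpha: float, n: int) -> float:
    """Prior expected number of occupied components for n points.

    Assumes a Chinese restaurant process with concentration alpha:
    the i-th point (0-based) opens a new component with probability
    alpha / (alpha + i), so the expectation is the sum of these terms.
    """
    if alpha <= 0 or n < 1:
        raise ValueError("alpha must be positive and n must be at least 1")
    return sum(alpha / (alpha + i) for i in range(n))


if __name__ == "__main__":
    # Larger alpha favors more components for the same amount of data.
    for alpha in (0.1, 1.0, 10.0):
        print(alpha, expected_num_components(alpha, n=1000))
```

Because this expectation grows with α, a sampled α that drifts upward makes additional components more likely, which is consistent with the feedback the abstract describes; an informative prior on α is one way to limit that drift.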

Original language: English
Title of host publication: 35th AAAI Conference on Artificial Intelligence, AAAI 2021
Publisher: Association for the Advancement of Artificial Intelligence
Pages: 8921-8929
Number of pages: 9
ISBN (Electronic): 9781713835974
State: Published - 2021
Event: 35th AAAI Conference on Artificial Intelligence, AAAI 2021 - Virtual, Online
Duration: 2 Feb 2021 to 9 Feb 2021

Publication series

Name: 35th AAAI Conference on Artificial Intelligence, AAAI 2021
Volume: 10B

Conference

Conference: 35th AAAI Conference on Artificial Intelligence, AAAI 2021
City: Virtual, Online
Period: 2/02/21 to 9/02/21
