Reliable clustering of Bernoulli mixture models

Najafi, A ; Sharif University of Technology | 2020

  1. Type of Document: Article
  2. DOI: 10.3150/19-BEJ1173
  3. Publisher: International Statistical Institute, 2020
  4. Abstract:
  5. A Bernoulli Mixture Model (BMM) is a finite mixture of random binary vectors with independent dimensions. The problem of clustering BMM data arises in a variety of real-world applications, ranging from population genetics to activity analysis in social networks. In this paper, we analyze the clusterability of BMMs from a theoretical perspective, when the number of clusters is unknown. In particular, we stipulate a set of conditions on the sample complexity and dimension of the model in order to guarantee the Probably Approximately Correct (PAC)-clusterability of a dataset. To the best of our knowledge, these findings are the first non-asymptotic bounds on the sample complexity of learning or clustering BMMs. © 2020 ISI/BS
  6. Keywords:
  7. High-dimensional statistics ; Mixture model analysis ; PAC-learnability ; Sample complexity
  8. Source: Bernoulli ; Volume 26, Issue 2, May 2020, Pages 1535-1559
  9. URL: https://projecteuclid.org/journals/bernoulli/volume-26/issue-2/Reliable-clustering-of-Bernoulli-mixture-models/10.3150/19-BEJ1173.short
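
The abstract describes a BMM as a finite mixture of random binary vectors with independent dimensions. The generative model behind that definition can be sketched as follows. This is a minimal illustration only, not the paper's clustering method: the function name `sample_bmm`, its parameters, and the example values are all hypothetical and chosen for this sketch.

```python
import random


def sample_bmm(weights, probs, n, seed=0):
    """Draw n binary vectors from a Bernoulli mixture (illustrative sketch).

    weights: mixing proportions, one per cluster (assumed to sum to 1).
    probs:   probs[k][j] is the probability that coordinate j equals 1
             given that the vector comes from cluster k.
    Returns (samples, labels), where labels[i] is the cluster of samples[i].
    """
    rng = random.Random(seed)
    samples, labels = [], []
    for _ in range(n):
        # Pick a cluster according to the mixing proportions.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        # Given the cluster, each coordinate is an independent Bernoulli draw.
        x = [1 if rng.random() < p else 0 for p in probs[k]]
        samples.append(x)
        labels.append(k)
    return samples, labels


# Hypothetical example: two well-separated clusters in 8 dimensions.
weights = [0.5, 0.5]
probs = [[0.9] * 8, [0.1] * 8]
samples, labels = sample_bmm(weights, probs, n=1000)
```

In the clustering problem the abstract studies, only `samples` would be observed; the labels and the number of clusters are unknown.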