Simplex Factor Models for Multivariate Unordered Categorical Data
Gaussian latent factor models are routinely used for modeling of dependence in continuous, binary, and ordered categorical data. For unordered categorical variables, Gaussian latent factor models lead to challenging computation and complex modeling structures. As an alternative, we propose a novel class of simplex factor models. In the single-factor case, the model treats the different categorical outcomes as independent with unknown marginals. The model can characterize flexible dependence structures parsimoniously with few factors, and as factors are added, any multivariate categorical data distribution can be accurately approximated. Using a Bayesian approach for computation and inferences, a Markov chain Monte Carlo (MCMC) algorithm is proposed that scales well with increasing dimension, with the number of factors treated as unknown. We develop an efficient proposal for updating the base probability vector in hierarchical Dirichlet models. Theoretical properties are described, and we evaluate the approach through simulation examples. Applications are described for modeling dependence in nucleotide sequences and prediction from high-dimensional categorical features.
Year of publication: | 2012 |
---|---|
Authors: | Bhattacharya, Anirban ; Dunson, David B. |
Published in: | Journal of the American Statistical Association. - Taylor & Francis Journals, ISSN 0162-1459. - Vol. 107.2012, 497, p. 362-377 |
Publisher: | Taylor & Francis Journals |
Similar items by person
- Simplex Factor Models for Multivariate Unordered Categorical Data. Bhattacharya, Anirban, (2012)
- Unequal life chances : equity and the demographic transition in India. Mander, Harsh, (2019)
- When the mask came off : lockdown 2020 : a people's history of cruelty and compassion. Mander, Harsh, (2021)