2 A New Regression Model for Count Compositions

Count compositions are vectors of non-negative integers that sum to a fixed constant. The most popular distribution for this kind of data is the multinomial, which has many advantages but a poorly parameterized covariance matrix. One way to overcome this issue is to compound the multinomial distribution with a distribution defined on the simplex. For example, compounding the multinomial with the Dirichlet leads to the well-known Dirichlet-multinomial (DM). Thanks to an additional parameter, the DM distribution fits real data better than the multinomial, but its covariance structure may still be too rigid. The aim of this work is to propose a new distribution for count compositions and to develop a regression model based on it. The new distribution is obtained by compounding the multinomial with the flexible Dirichlet, and it can be expressed as a structured finite mixture with particular DM components. Thanks to this mixture structure and the additional parameters it introduces, the new distribution can provide a better fit and an interpretation in terms of latent groups. We compare the regression models based on these distributions through a simulation study and an application to a real dataset. Inference is carried out with a Bayesian approach using the Hamiltonian Monte Carlo algorithm.
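The compounding idea described above can be illustrated by simulation. The following is a minimal sketch, not the chapter's implementation: it draws a probability vector on the simplex and then draws multinomial counts given that vector. The function names, parameter values and sample sizes are hypothetical. The flexible Dirichlet sampler assumes the commonly used mixture representation, in which component j is a Dirichlet with parameter alpha + tau * e_j and mixing weight weights[j]; the chapter's own parameterization may differ.

```python
import numpy as np


def sample_dm(n_trials, alpha, size, rng):
    """Dirichlet-multinomial draws: p ~ Dirichlet(alpha), y | p ~ Multinomial(n_trials, p)."""
    probs = rng.dirichlet(alpha, size=size)  # one simplex point per observation
    return np.array([rng.multinomial(n_trials, p) for p in probs])


def sample_fdm(n_trials, alpha, weights, tau, size, rng):
    """Multinomial compounded with a flexible Dirichlet (assumed mixture form).

    Assumption: the flexible Dirichlet is a finite mixture whose j-th component
    is Dirichlet(alpha + tau * e_j), selected with probability weights[j].
    """
    alpha = np.asarray(alpha, dtype=float)
    labels = rng.choice(len(alpha), size=size, p=weights)  # latent group of each draw
    counts = np.empty((size, len(alpha)), dtype=np.int64)
    for i, j in enumerate(labels):
        shifted = alpha.copy()
        shifted[j] += tau  # component j shifts mass toward category j
        counts[i] = rng.multinomial(n_trials, rng.dirichlet(shifted))
    return counts, labels


rng = np.random.default_rng(0)
alpha = np.array([2.0, 3.0, 5.0])    # illustrative values only
weights = np.array([0.5, 0.3, 0.2])  # illustrative mixing weights
n_trials, n_obs = 50, 5000

mn = rng.multinomial(n_trials, alpha / alpha.sum(), size=n_obs)  # plain multinomial
dm = sample_dm(n_trials, alpha, n_obs, rng)
fdm, labels = sample_fdm(n_trials, alpha, weights, 4.0, n_obs, rng)

# Compare the sample variance of the first category across the three models.
print(mn[:, 0].var(), dm[:, 0].var(), fdm[:, 0].var())
```

With the same mean vector, the DM draws show visibly larger variance than the multinomial draws, which is the extra dispersion that the additional parameter buys. The `labels` array returned by the second sampler plays the role of the latent groups mentioned above.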

2.1. Introduction

In this section, we briefly recall the main results ...
