A mixture model is a probabilistic model of a sub-population within a population. These models are used to make statistical inferences about a sub-population, given the observations of pooled populations.
A Gaussian Mixture Model (GMM) is a mixture model represented as a weighted sum of Gaussian component densities. Its model coefficients are estimated from training data using the iterative Expectation-Maximization (EM) algorithm or Maximum A Posteriori (MAP) estimation from a trained model.
The spark.ml implementation uses the EM algorithm.
It has the following parameters:
- k: Number of desired clusters
- convergenceTol: Maximum change in log-likelihood at which one considers convergence achieved
- maxIterations: Maximum ...