10 Expectation Maximization

10.1 Introduction

In machine learning and statistics, computing the maximum likelihood (ML) or maximum a posteriori (MAP) parameter estimate relies on the availability of complete data. However, if the model contains latent variables or there is missing data, ML and MAP estimation become challenging. In such cases, gradient descent methods can be used to find a local minimum of the negative log likelihood (NLL) [77]:

$$\mathrm{NLL}(\boldsymbol{\theta}) = -\frac{1}{N+1}\,\log p_{\boldsymbol{\theta}}(\mathcal{D}), \tag{10.1}$$

where $\boldsymbol{\theta}$ represents the set of model parameters and $\mathcal{D}$ denotes the set of $N+1$ observed data points indexed by $k = 0, \ldots, N$. It is often necessary to impose additional constraints on the model, such as requiring that mixing weights be normalized and covariance matrices be positive definite. The expectation-maximization (EM) algorithm paves the way for addressing this issue. As an iterative algorithm, EM enforces the required constraints and handles the missing-data problem by alternating between two steps [77], illustrated by the sketch following this list:

  • E‐step: inferring the missing values given ...
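To make the alternation concrete, the following is a minimal sketch (not taken from the book) of EM for a two-component one-dimensional Gaussian mixture. The synthetic data, initial values, and variable names are assumptions chosen for illustration; the E-step infers the missing component labels via responsibilities, and the M-step re-estimates the parameters so that the mixing weights stay normalized and the variances stay positive. The printed quantity is the average NLL per data point, in the spirit of (10.1).

```python
import numpy as np

# Illustrative EM for a two-component 1-D Gaussian mixture.
# Data and initial guesses below are made up for demonstration only.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 1.0, 150), rng.normal(3.0, 0.5, 100)])
N = len(x)

# Initial parameter guesses: mixing weights, means, variances.
w = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])

def log_gauss(x, mu, var):
    # Log-density of a univariate Gaussian, evaluated element-wise.
    return -0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

for it in range(100):
    # E-step: infer the missing component labels given the current parameters
    # by computing the posterior responsibility of each component for each point.
    log_p = np.log(w) + log_gauss(x[:, None], mu, var)            # shape (N, 2)
    log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)  # log-marginal per point
    r = np.exp(log_p - log_norm)                                  # responsibilities

    # M-step: re-estimate parameters from the "filled-in" data; the mixing
    # weights are normalized by construction and the variances remain positive.
    Nk = r.sum(axis=0)
    w = Nk / N
    mu = (r * x[:, None]).sum(axis=0) / Nk
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk

    # Average NLL per data point; EM never increases this value.
    nll = -log_norm.sum() / N
    if it % 10 == 0:
        print(f"iter {it:3d}  NLL {nll:.4f}")
```

Running this sketch shows the NLL decreasing monotonically until the parameters settle near the values used to generate the data, which is the behavior the two-step alternation is designed to guarantee.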
