We have defined the likelihood as a filtering term in the Bayes formula. In general, it has the following form:

$$L(\theta \mid X) = \log P(X \mid \theta) = \sum_{i} \log P(x_i \mid \theta)$$
Here, the first term expresses the actual likelihood of a hypothesis θ, given the dataset X. As you can imagine, this formula no longer contains any a priori probabilities, so maximizing it neither implies accepting a theoretically preferred hypothesis nor penalizing a priori unlikely ones. A very common approach, known as Expectation Maximization (EM) and used in many algorithms (we're going to see an example in logistic regression), is split into two main parts:
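To make this concrete, here is a minimal sketch (not taken from the text) of maximum likelihood estimation for a univariate Gaussian: we compute the log-likelihood of the data and check that the closed-form estimates (sample mean and standard deviation) score higher than any perturbed parameters. The dataset and parameter values are arbitrary choices for illustration.

```python
import math
import random

def gaussian_log_likelihood(data, mu, sigma):
    """Sum of log p(x_i | mu, sigma) for i.i.d. samples under N(mu, sigma^2)."""
    const = -0.5 * math.log(2 * math.pi * sigma ** 2)
    return sum(const - (x - mu) ** 2 / (2 * sigma ** 2) for x in data)

random.seed(0)
data = [random.gauss(5.0, 2.0) for _ in range(1000)]

# Closed-form maximum-likelihood estimates for a Gaussian
mu_hat = sum(data) / len(data)
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in data) / len(data))

ll_mle = gaussian_log_likelihood(data, mu_hat, sigma_hat)
# Any other parameter choice yields a lower log-likelihood
ll_other = gaussian_log_likelihood(data, mu_hat + 1.0, sigma_hat)
print(mu_hat, sigma_hat, ll_mle > ll_other)
```

Note that no prior over (μ, σ) appears anywhere: the estimate is driven purely by how well the parameters explain the observed data.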
- Determining a log-likelihood expression ...
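As an illustration of the two-part EM loop (a sketch under my own assumptions, not the book's implementation), the following fits a two-component 1-D Gaussian mixture: the E-step computes each point's responsibility under the current parameters, and the M-step re-estimates the parameters to maximize the expected log-likelihood. The initialization strategy and iteration count are arbitrary choices.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def em_gmm_1d(data, n_iter=50):
    """Minimal EM for a two-component 1-D Gaussian mixture."""
    # Crude initialization: put the components at the data extremes
    mu1, mu2 = min(data), max(data)
    s1 = s2 = (max(data) - min(data)) / 4 or 1.0
    pi1 = 0.5
    for _ in range(n_iter):
        # E-step: responsibility of component 1 for each sample
        r = []
        for x in data:
            p1 = pi1 * normal_pdf(x, mu1, s1)
            p2 = (1 - pi1) * normal_pdf(x, mu2, s2)
            r.append(p1 / (p1 + p2))
        # M-step: weighted re-estimation of means, variances, and mixing weight
        n1 = sum(r)
        n2 = len(data) - n1
        mu1 = sum(ri * x for ri, x in zip(r, data)) / n1
        mu2 = sum((1 - ri) * x for ri, x in zip(r, data)) / n2
        s1 = math.sqrt(sum(ri * (x - mu1) ** 2 for ri, x in zip(r, data)) / n1) or 1e-6
        s2 = math.sqrt(sum((1 - ri) * (x - mu2) ** 2 for ri, x in zip(r, data)) / n2) or 1e-6
        pi1 = n1 / len(data)
    return (mu1, s1), (mu2, s2), pi1

random.seed(1)
data = [random.gauss(0.0, 1.0) for _ in range(300)] + \
       [random.gauss(6.0, 1.0) for _ in range(300)]
(m1, _), (m2, _), w = em_gmm_1d(data)
print(sorted([m1, m2]))  # recovered means, roughly 0 and 6
```

Each iteration is guaranteed not to decrease the log-likelihood, which is why alternating these two steps converges to a (local) maximum.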