It is humbling to learn that there are entire classes of optimization problems for which most methods—including those described previously—are utterly useless.
We will pose these problems as maximizations except in Section 3.3, although in nonstatistical contexts minimization is often customary. For statistical applications, recall that maximizing the log likelihood is equivalent to minimizing the negative log likelihood.
Let us assume that we are seeking the maximum of f(θ) with respect to θ = (θ1, . . . , θp), where θ ∈ Θ and Θ consists of N elements for a finite positive integer N. In statistical applications, it is not uncommon for a likelihood function to depend on configuration parameters that describe the form of a statistical model and for which there are many discrete choices, as well as a small number of other parameters that could be easily optimized if the best configuration were known. In such cases, we may view f(θ) as the log profile likelihood of a configuration, θ, that is, the highest log likelihood attainable using that configuration. Section 3.1.1 provides several examples.
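To make the profile-likelihood view concrete, here is a minimal sketch for one such problem: variable selection in Gaussian linear regression. Each candidate solution θ is a subset of predictors (a discrete configuration), and the profile log likelihood is obtained by maximizing analytically over the regression coefficients and error variance via least squares. The simulated data, the AIC-style penalty used to score configurations, and the exhaustive search are illustrative assumptions, not prescriptions from the text; exhaustive enumeration is feasible only because N = 2^p = 16 here.

```python
import itertools
import numpy as np

# Illustrative data: n observations, p candidate predictors, with only
# columns 0 and 2 truly related to the response (assumed setup).
rng = np.random.default_rng(0)
n, p = 50, 4
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] - 1.0 * X[:, 2] + rng.normal(size=n)

def profile_loglik(subset):
    """Gaussian log likelihood maximized over the coefficients and the
    error variance for a given subset of predictors (the 'easy' inner
    optimization described in the text)."""
    Xs = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ beta
    sigma2 = resid @ resid / n          # MLE of the error variance
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)

# Theta: all 2^p subsets of the predictors.  Score each configuration by
# its profile log likelihood minus an AIC-style complexity penalty
# (otherwise larger subsets would always win), then search exhaustively.
candidates = [s for r in range(p + 1)
              for s in itertools.combinations(range(p), r)]
best = max(candidates, key=lambda s: profile_loglik(s) - len(s))
```

For p much larger than a dozen or so, Θ is far too large to enumerate, which is precisely the situation motivating the search heuristics developed later in this chapter.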
Each θ ∈ Θ is termed a candidate solution. Let fmax denote the global maximum value of f(θ) achievable for θ ∈ Θ, and let the set of global maxima be . If there ...