MISSING DATA AND MATRIX COMPLETION 177
to the global optimum. In the absence of additional structure (such as strong
convexity), this is the fastest rate that can be expected from a first-order gra-
dient method (Nemirovski and Yudin 1983). Interestingly, in certain settings,
it can be shown that simple first-order methods converge at a much faster
geometric rate, meaning that O(log(1/δ)) iterations are sufficient to compute
a δ-optimum. For instance, Agarwal, Negahban and Wainwright (2012a) an-
alyze an algorithm closely related to the Soft-Impute algorithm; they show
that under the same conditions that guarantee good statistical performance
of the nuclear norm estimator, this first-order algorithm is guaranteed to con-
verge at the geometric rate.
7.3.3 ...