Gradient descent finds the cubic-regularized nonconvex Newton step

Yair Carmon, John Duchi

Research output: Contribution to journal › Article › peer-review


We consider the minimization of a nonconvex quadratic form regularized by a cubic term, which may exhibit saddle points and a suboptimal local minimum. Nonetheless, we prove that, under mild assumptions, gradient descent approximates the global minimum to within ε accuracy in O(ε⁻¹ log(1/ε)) steps for large ε and O(log(1/ε)) steps for small ε (compared to a condition number we define), with at most logarithmic dependence on the problem dimension. When we use gradient descent to approximate the cubic-regularized Newton step, our result implies a rate of convergence to second-order stationary points of general smooth nonconvex functions.
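As a minimal illustration of the setting (not the paper's analysis), the sketch below runs plain gradient descent on a cubic-regularized quadratic f(x) = ½xᵀAx + bᵀx + (ρ/3)‖x‖³ with an indefinite A. All names (A, b, rho, eta, the iteration count, and the small initialization along −b) are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10
Q = rng.standard_normal((d, d))
A = (Q + Q.T) / 2            # symmetric and generally indefinite
b = rng.standard_normal(d)
rho = 1.0                    # cubic regularization weight (assumed value)

def grad(x):
    # Gradient of f(x) = 0.5 x^T A x + b^T x + (rho/3) ||x||^3:
    # grad f(x) = A x + b + rho * ||x|| * x
    return A @ x + b + rho * np.linalg.norm(x) * x

# Initialize with a tiny step along -b rather than exactly at the origin,
# which can sit near a saddle of the regularized model.
x = -1e-3 * b / max(np.linalg.norm(b), 1e-12)

eta = 0.01                   # step size, small relative to the smoothness of f
for _ in range(20000):
    x = x - eta * grad(x)

# The gradient norm at the final iterate should be near zero, indicating an
# approximate stationary point of the regularized objective.
residual = np.linalg.norm(grad(x))
```

Despite the nonconvexity (saddle points and a possible suboptimal local minimum), on generic instances like this one the iterates settle at a point where the gradient of the regularized objective nearly vanishes.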

Original language: English
Pages (from-to): 2146-2178
Number of pages: 33
Journal: SIAM Journal on Optimization
Issue number: 3
State: Published - 2019
Externally published: Yes


  • Cubic regularization
  • Global optimization
  • Gradient descent
  • Newton's method
  • Nonasymptotic rate of convergence
  • Nonconvex quadratics
  • Power method
  • Trust region methods


