TY - JOUR
T1 - Riemannian optimization with a preconditioning scheme on the generalized Stiefel manifold
AU - Shustin, Boris
AU - Avron, Haim
N1 - Publisher Copyright: © 2022 Elsevier B.V.
PY - 2023/5/15
Y1 - 2023/5/15
N2 - Optimization problems on the generalized Stiefel manifold (and products of it) are prevalent across science and engineering. For example, in computational science they arise in symmetric (generalized) eigenvalue problems, in nonlinear eigenvalue problems, and in electronic structure computations, to name a few. In statistics and machine learning, they arise, for example, in various dimensionality reduction techniques such as canonical correlation analysis. In deep learning, regularization and improved stability can be obtained by constraining some layers to have parameter matrices that belong to the Stiefel manifold. Solving problems on the generalized Stiefel manifold can be approached via the tools of Riemannian optimization. However, using the standard geometric components for the generalized Stiefel manifold has two possible shortcomings: computing some of the geometric components can be too expensive, and convergence can be rather slow in certain cases. Both shortcomings can be addressed using a technique called Riemannian preconditioning, which amounts to using geometric components derived by a preconditioner that defines a Riemannian metric on the constraint manifold. In this paper we develop the geometric components required to perform Riemannian optimization on the generalized Stiefel manifold equipped with a non-standard metric, and illustrate theoretically and numerically the use of those components and the effect of Riemannian preconditioning for solving optimization problems on the generalized Stiefel manifold.
KW - Generalized Stiefel manifold
KW - Preconditioning
KW - Riemannian optimization
UR - http://www.scopus.com/inward/record.url?scp=85145568276&partnerID=8YFLogxK
U2 - 10.1016/j.cam.2022.114953
DO - 10.1016/j.cam.2022.114953
M3 - Article
AN - SCOPUS:85145568276
SN - 0377-0427
VL - 423
JO - Journal of Computational and Applied Mathematics
JF - Journal of Computational and Applied Mathematics
M1 - 114953
ER -