TY - JOUR
T1 - The Sample Complexity of Sparse Multireference Alignment and Single-Particle Cryo-Electron Microscopy
AU - Bendory, T
AU - Edidin, D
PY - 2024
Y1 - 2024
N2 - Multireference alignment (MRA) is the problem of recovering a signal from its multiple noisy copies, each acted upon by a random group element. MRA is mainly motivated by single-particle cryo-electron microscopy (cryo-EM) that has recently joined X-ray crystallography as one of the two leading technologies to reconstruct biological molecular structures. Previous papers have shown that, in the high-noise regime, the sample complexity of MRA and cryo-EM is n = \omega(\sigma^{2d}), where n is the number of observations, \sigma^2 is the variance of the noise, and d is the lowest-order moment of the observations that uniquely determines the signal. In particular, it was shown that, in many cases, d = 3 for generic signals, and thus, the sample complexity is n = \omega(\sigma^6). In this paper, we analyze the second moment of the MRA and cryo-EM models. First, we show that, in both models, the second moment determines the signal up to a set of unitary matrices whose dimension is governed by the decomposition of the space of signals into irreducible representations of the group. Second, we derive sparsity conditions under which a signal can be recovered from the second moment, implying sample complexity of n = \omega(\sigma^4). Notably, we show that the sample complexity of cryo-EM is n = \omega(\sigma^4) if at most one-third of the coefficients representing the molecular structure are nonzero; this bound is near-optimal. The analysis is based on tools from representation theory and algebraic geometry. We also derive bounds on recovering a sparse signal from its power spectrum, which is the main computational problem of X-ray crystallography.
AB - Multireference alignment (MRA) is the problem of recovering a signal from its multiple noisy copies, each acted upon by a random group element. MRA is mainly motivated by single-particle cryo-electron microscopy (cryo-EM) that has recently joined X-ray crystallography as one of the two leading technologies to reconstruct biological molecular structures. Previous papers have shown that, in the high-noise regime, the sample complexity of MRA and cryo-EM is n = \omega(\sigma^{2d}), where n is the number of observations, \sigma^2 is the variance of the noise, and d is the lowest-order moment of the observations that uniquely determines the signal. In particular, it was shown that, in many cases, d = 3 for generic signals, and thus, the sample complexity is n = \omega(\sigma^6). In this paper, we analyze the second moment of the MRA and cryo-EM models. First, we show that, in both models, the second moment determines the signal up to a set of unitary matrices whose dimension is governed by the decomposition of the space of signals into irreducible representations of the group. Second, we derive sparsity conditions under which a signal can be recovered from the second moment, implying sample complexity of n = \omega(\sigma^4). Notably, we show that the sample complexity of cryo-EM is n = \omega(\sigma^4) if at most one-third of the coefficients representing the molecular structure are nonzero; this bound is near-optimal. The analysis is based on tools from representation theory and algebraic geometry. We also derive bounds on recovering a sparse signal from its power spectrum, which is the main computational problem of X-ray crystallography.
KW - X-ray crystallography
KW - cryo-EM
KW - Multireference alignment
KW - Representation theory
KW - Signal processing
KW - Sparsity
UR - https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=tau-cris-version-2&SrcAuth=WosAPI&KeyUT=WOS:001197903300002&DestLinkType=FullRecord&DestApp=WOS_CPL
U2 - 10.1137/23M155685X
DO - 10.1137/23M155685X
M3 - Article
SN - 2577-0187
VL - 6
SP - 254
EP - 282
JO - SIAM Journal on Mathematics of Data Science
JF - SIAM Journal on Mathematics of Data Science
IS - 2
ER -