TY - JOUR
T1 - Data-driven determination of number of discrete conformations in single-particle cryo-EM
AU - Zhou, Ye
AU - Moscovich, Amit
AU - Bartesaghi, Alberto
N1 - Publisher Copyright: © 2022 The Author(s)
PY - 2022/6
Y1 - 2022/6
AB - Background and objective: One of the strengths of single-particle cryo-EM compared to other structural determination techniques is its ability to image heterogeneous samples containing multiple molecular species, different oligomeric states, or distinct conformations. This is achieved using routines for in-silico 3D classification that are now well established in the field and have successfully been used to characterize the structural heterogeneity of important biomolecules. These techniques, however, rely on expert-user knowledge and trial-and-error experimentation to determine the correct number of conformations, making 3D classification a labor-intensive, subjective, and difficult-to-reproduce procedure. Methods: We propose an approach to address the problem of automatically determining the number of discrete conformations present in heterogeneous single-particle cryo-EM datasets. We do this by systematically evaluating all possible partitions of the data and selecting the result that maximizes the average variance of similarities measured between particle images and the corresponding 3D reconstructions. Results: Using this strategy, we successfully analyzed datasets of heterogeneous protein complexes, including: 1) in-silico mixtures obtained by combining closely related antibody-bound HIV-1 Env trimers and other important membrane channels, and 2) naturally occurring mixtures from diverse and dynamic protein complexes representing varying degrees of structural heterogeneity and conformational plasticity. Conclusions: The availability of unsupervised strategies for 3D classification, combined with existing approaches for fully automatic pre-processing and 3D refinement, represents an important step towards converting single-particle cryo-EM into a high-throughput technique.
UR - http://www.scopus.com/inward/record.url?scp=85133100657&partnerID=8YFLogxK
U2 - 10.1016/j.cmpb.2022.106892
DO - 10.1016/j.cmpb.2022.106892
M3 - Article
C2 - 35597206
AN - SCOPUS:85133100657
SN - 0169-2607
VL - 221
JO - Computer Methods and Programs in Biomedicine
JF - Computer Methods and Programs in Biomedicine
M1 - 106892
ER -