TY - JOUR

T1 - Algorithm and data structures for efficient energy maintenance during Monte Carlo simulation of proteins

AU - Lotan, Itay

AU - Schwarzer, Fabian

AU - Halperin, Dan

AU - Latombe, Jean-Claude

PY - 2004

Y1 - 2004

AB - Monte Carlo simulation (MCS) is a common methodology to compute pathways and thermodynamic properties of proteins. A simulation run is a series of random steps in conformation space, each perturbing some degrees of freedom of the molecule. A step is accepted with a probability that depends on the change in value of an energy function. Typical energy functions sum many terms. The most costly ones to compute are contributed by atom pairs closer than some cutoff distance. This paper introduces a new method that speeds up MCS by exploiting the facts that proteins are long kinematic chains and that few degrees of freedom are changed at each step. A novel data structure, called the ChainTree, captures both the kinematics and the shape of a protein at successive levels of detail. It is used to efficiently detect self-collision (steric clash between atoms) and/or find all atom pairs contributing to the energy. It also makes it possible to identify partial energy sums left unchanged by a perturbation, thus allowing the energy value to be incrementally updated. Computational tests on four proteins of sizes ranging from 68 to 755 amino acids show that MCS with the ChainTree method is significantly faster (as much as 10 times faster for the largest protein) than with the widely used grid method. They also indicate that speed-up increases with larger proteins.

KW - Deforming chains

KW - Energy computation

KW - Monte Carlo simulation

KW - Protein folding

KW - Self-collision detection

UR - http://www.scopus.com/inward/record.url?scp=8544262223&partnerID=8YFLogxK

U2 - 10.1089/cmb.2004.11.902

DO - 10.1089/cmb.2004.11.902

M3 - Article

AN - SCOPUS:8544262223

SN - 1066-5277

VL - 11

SP - 902

EP - 932

JO - Journal of Computational Biology

JF - Journal of Computational Biology

IS - 5

ER -