TY - JOUR
T1 - The multiple common point set problem and its application to molecule binding pattern detection
AU - Shatsky, Maxim
AU - Shulman-Peleg, Alexandra
AU - Nussinov, Ruth
AU - Wolfson, Haim J.
PY - 2006/3
Y1 - 2006/3
N2 - Recognition of binding patterns common to a set of protein structures is important for recognition of function, prediction of binding, and drug design. We consider protein binding sites represented by a set of 3D points with assigned physico-chemical and geometrical properties important for protein-ligand interactions. We formulate the multiple binding site alignment problem as detection of the largest common set of such 3D points. We discuss the computational problem of multiple common point set detection and, particularly, the matching problem in K-partite-ε graphs, where K partitions are associated with K structures and edges are defined between ε-close points. We show that the K-partite-ε matching problem is NP-hard in the Euclidean space with dimension larger than one. Consequently, we show that the largest common point set problem between three point sets is NP-hard. On the practical side, we present a novel computational method, MultiBind, for recognition of binding patterns common to a set of protein structures. It performs a multiple alignment between protein binding sites in the absence of overall sequence, fold, or binding partner similarity. Despite the NP-hardness results, in our applications, we practically overcome the exponential number of multiple alignment combinations by applying an efficient branch-and-bound filtering procedure. We show applications of MultiBind to several biological targets. The method recognizes patterns which are responsible for binding small molecules, such as estradiol, ATP/ANP, and transition state analogues.
AB - Recognition of binding patterns common to a set of protein structures is important for recognition of function, prediction of binding, and drug design. We consider protein binding sites represented by a set of 3D points with assigned physico-chemical and geometrical properties important for protein-ligand interactions. We formulate the multiple binding site alignment problem as detection of the largest common set of such 3D points. We discuss the computational problem of multiple common point set detection and, particularly, the matching problem in K-partite-ε graphs, where K partitions are associated with K structures and edges are defined between ε-close points. We show that the K-partite-ε matching problem is NP-hard in the Euclidean space with dimension larger than one. Consequently, we show that the largest common point set problem between three point sets is NP-hard. On the practical side, we present a novel computational method, MultiBind, for recognition of binding patterns common to a set of protein structures. It performs a multiple alignment between protein binding sites in the absence of overall sequence, fold, or binding partner similarity. Despite the NP-hardness results, in our applications, we practically overcome the exponential number of multiple alignment combinations by applying an efficient branch-and-bound filtering procedure. We show applications of MultiBind to several biological targets. The method recognizes patterns which are responsible for binding small molecules, such as estradiol, ATP/ANP, and transition state analogues.
KW - Common point set problem
KW - Consensus binding patterns
KW - K-partite matching
KW - Multiple structure alignment of binding sites
KW - Pattern discovery
KW - Pattern matching
KW - Recognition of functional sites
UR - http://www.scopus.com/inward/record.url?scp=33645465515&partnerID=8YFLogxK
U2 - 10.1089/cmb.2006.13.407
DO - 10.1089/cmb.2006.13.407
M3 - ???researchoutput.researchoutputtypes.contributiontojournal.article???
AN - SCOPUS:33645465515
SN - 1066-5277
VL - 13
SP - 407
EP - 428
JO - Journal of Computational Biology
JF - Journal of Computational Biology
IS - 2
ER -