Recognition of binding patterns common to a set of protein structures is important for recognition of function, prediction of binding, and drug design. We consider protein binding sites represented by a set of 3D points with assigned physico-chemical and geometrical properties important for protein-ligand interactions. We formulate the multiple binding site alignment problem as detection of the largest common set of such 3D points. We discuss the computational problem of multiple common point set detection and, particularly, the matching problem in K-partite-ε graphs, where K partitions are associated with K structures and edges are defined between ε-close points. We show that the K-partite-ε matching problem is NP-hard in the Euclidean space with dimension larger than one. Consequently, we show that the largest common point set problem between three point sets is NP-hard. On the practical side, we present a novel computational method, MultiBind, for recognition of binding patterns common to a set of protein structures. It performs a multiple alignment between protein binding sites in the absence of overall sequence, fold, or binding partner similarity. Despite the NP-hardness results, in our applications, we practically overcome the exponential number of multiple alignment combinations by applying an efficient branch-and-bound filtering procedure. We show applications of MultiBind to several biological targets. The method recognizes patterns which are responsible for binding small molecules, such as estradiol, ATP/ANP, and transition state analogues.
- Common point set problem
- Consensus binding patterns
- K-partite matching
- Multiple structure alignment of binding sites
- Pattern discovery
- Pattern matching
- Recognition of functional sites