We present an efficient technique for comparing protein structures. The algorithm represents secondary structure elements as vectors and searches for recurring spatial configurations of these elements in proteins. In detecting such recurring protein folds, the order of the secondary structure elements along the protein chains is disregarded. The method is based on the geometric hashing paradigm and implements approaches originating in computer vision. It represents and matches the secondary structure element vectors in a manner invariant to 3-D translation and rotation. Matching a pair of proteins takes on average under 3 s on a Silicon Graphics Indigo2 workstation, which allows extensive all-against-all comparisons of a data set of non-redundant protein structures. Here we have carried out such a comparison for a data set of over 500 protein molecules. The detection of recurring topological and non-topological protein folds, the latter being independent of secondary structure element order, may provide further insight into evolution. Moreover, as these recurring folding units are likely to be conformationally favourable, a data set of such topological motifs can serve as rich input for threading routines. Below, we describe this rapid technique and the results it has obtained. While some of the obtained matches conserve the order of the secondary structure elements, others are entirely order independent. As an example, we focus on the results obtained for Che Y, a signal transduction protein, and for the profilin-β-actin complex. The Che Y molecule is composed of a five-stranded, parallel β-sheet flanked by five helices. Here we show its similarity with the Escherichia coli elongation factor, with L-arabinose binding protein, with haloalkane dehalogenase and with adenylate kinase. The profilin-β-actin complex contains an antiparallel β-pleated sheet with α-helical termini. Its similarities to lipase, fructose diphosphatase and β-lactamase are displayed.
- Analysis of crystallographic databases
- Computer vision
- Geometric hashing
- Protein structure comparison
- Topological motifs
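The idea of matching secondary structure element vectors in a translation- and rotation-invariant, order-independent way can be illustrated with a small sketch. This is a simplified stand-in and not the implementation described in the paper: it assumes each element is a line segment given by two 3-D endpoints, describes every pair of segments by the distance between their midpoints and the angle between their directions, and uses these quantized values as hash keys. The names `pair_key`, `build_table` and `vote`, the bin sizes, and the toy coordinates are all hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _norm(v):
    return math.sqrt(sum(x * x for x in v))

def _mid(seg):
    start, end = seg
    return tuple((x + y) / 2 for x, y in zip(start, end))

def pair_key(seg_a, seg_b, dist_bin=2.0, angle_bin=20.0):
    """Quantized (midpoint distance, inter-axis angle) for two segments.

    Both quantities are unchanged by rigid motion and by swapping the
    two segments, so the key ignores chain order. Bin sizes are assumed.
    """
    da, db = _sub(seg_a[1], seg_a[0]), _sub(seg_b[1], seg_b[0])
    dist = _norm(_sub(_mid(seg_a), _mid(seg_b)))
    cos = sum(x * y for x, y in zip(da, db)) / (_norm(da) * _norm(db))
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cos))))
    return (int(dist // dist_bin), int(angle // angle_bin))

def build_table(models):
    """Preprocessing: store every segment pair of every model under its key."""
    table = defaultdict(list)
    for name, segments in models.items():
        for a, b in combinations(segments, 2):
            table[pair_key(a, b)].append(name)
    return table

def vote(table, query):
    """Recognition: each query pair votes for models sharing its key."""
    votes = Counter()
    for a, b in combinations(query, 2):
        for name in table.get(pair_key(a, b), ()):
            votes[name] += 1
    return votes

# Toy data (made-up coordinates, not real protein structures).
model = [((0, 0, 0), (10, 0, 0)), ((0, 5, 0), (0, 5, 10)), ((3, 9, 1), (9, 12, 4))]
other = [((0, 0, 0), (10, 0, 0)), ((0, 30, 0), (10, 30, 0))]

def move(p):
    # Rotate 90 degrees about the z axis, then translate.
    return (-p[1] + 5, p[0] - 3, p[2] + 7)

# Same fold, rigidly moved and listed in reverse chain order.
moved = [(move(s), move(e)) for s, e in reversed(model)]

table = build_table({"fold_a": model, "fold_b": other})
votes = vote(table, moved)
```

In this sketch the moved, reordered copy collects a vote from every one of its segment pairs for `fold_a`, because none of the pair descriptors depends on position, orientation or chain order. A fuller geometric hashing scheme would additionally define a reference frame from a pair of vectors and verify candidate matches by superposition; that step is omitted here.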