While a number of structurally non-redundant datasets of protein monomers exist, there is none for protein-protein interfaces. Yet such a dataset is expected to provide added insight into a number of investigations. First and foremost among these is the analysis of the interfaces to obtain their prevailing architectures, the forces that account for protein-protein association, and their packing considerations. Comparing the interfaces with the monomers is likely to shed additional light on protein-protein recognition on the one hand and on the folding of the polypeptide chain on the other. Docking simulations are also expected to benefit from such a dataset. A major stumbling block to generating a dataset of interfaces has been that an interface is composed of at least two chains. Furthermore, within an interface each chain may be represented by non-contiguous pieces, and the order of these pieces may differ between the interfaces being compared. This discontinuity stems from the definition of an interface: it consists of the residues that interact between the chains, together with those residues of the supporting scaffold that lie in their vicinity, within a certain distance threshold. This definition necessarily yields unordered fragments as well as isolated residues. Our novel, efficient, sequence-order-independent structural comparison technique is ideally suited to generating a library of structurally non-redundant protein-protein interfaces. Being computer-vision based, it treats the atoms as a collection of points in space, disregarding their chain connectivity. In this work, 351 interface families are created. Comparisons of the interfaces and, separately, of the chains that contribute to them yield some interesting cases. In one case, the two interfaces are similar, but only one of the two chains is structurally similar between the two complexes; the second chain of the first complex differs in structure from the second chain of the second complex. Here the structure of the cleft in the first chain dictates the specific binding interactions. In another case, the interfaces in the two complexes are similar, yet both chains composing them differ between the complexes. Lastly, there are cases in which the chains composing the complexes are similar but the interfaces are dissimilar, providing a set of data for investigating the favorable orientations of protein-protein associations.
- Dataset of subunit interfaces
- Hydrophobic interactions
- Protein cores
- Protein folding
- Protein-protein recognition
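
The interface definition given in the abstract (interacting residues plus nearby scaffold residues, selected by a distance threshold) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy coordinates, and both cutoff values (4.0 Å for contacts, 6.0 Å for the vicinity) are assumptions made only for illustration, since the abstract says no more than "a certain distance threshold".

```python
from math import dist

# Hypothetical cutoffs in angstroms; the abstract does not state the values used.
CONTACT_CUTOFF = 4.0
VICINITY_CUTOFF = 6.0


def residues_within(source, target, cutoff):
    """Ids of residues in `source` having an atom within `cutoff` of any atom in `target`."""
    hits = set()
    for res_id, atoms in source.items():
        if any(dist(a, b) <= cutoff
               for a in atoms
               for target_atoms in target.values()
               for b in target_atoms):
            hits.add(res_id)
    return hits


def interface_residues(chain_a, chain_b,
                       contact=CONTACT_CUTOFF, vicinity=VICINITY_CUTOFF):
    """Each chain maps residue number -> list of (x, y, z) atom coordinates.

    Returns, for chain_a and then chain_b, a pair
    (interacting residues, nearby scaffold residues), each sorted.
    """
    result = []
    for own, partner in ((chain_a, chain_b), (chain_b, chain_a)):
        # Residues in direct contact with the partner chain.
        interacting = residues_within(own, partner, contact)
        core = {r: own[r] for r in interacting}
        rest = {r: atoms for r, atoms in own.items() if r not in interacting}
        # Same-chain residues close to the interacting ones.
        nearby = residues_within(rest, core, vicinity)
        result.append((sorted(interacting), sorted(nearby)))
    return result


# Toy coordinates (one atom per residue), invented for the example.
chain_a = {1: [(0, 0, 0)], 2: [(5, 0, 0)], 3: [(20, 0, 0)], 4: [(0, 3, 0)]}
chain_b = {10: [(0, 0, 3.5)], 11: [(0, 0, 30)]}

print(interface_residues(chain_a, chain_b))
# [([1], [2, 4]), ([10], [])]
```

In the toy example, the first chain contributes residues 1, 2 and 4 but not 3, so even this small case produces a non-contiguous fragment of the kind that makes sequence-order-dependent comparison unsuitable.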