TY - JOUR
T1 - GOSSIP
T2 - A method for fast and accurate global alignment of protein structures
AU - Kifer, I.
AU - Nussinov, R.
AU - Wolfson, H. J.
N1 - Funding Information: IK was supported in part by a fellowship from the Edmond J. Safra Bioinformatics Program at Tel Aviv University. The research of HJW has been supported in part by the Israel Science Foundation (grant no. 1403/09) and the TAU Minerva-Minkowski Geometry center. This project has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under contract number HHSN261200800001E. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products or organizations imply endorsement by the US Government. This research was supported (in part) by the Intramural Research Program of the NIH, National Cancer Institute, Center for Cancer Research.
PY - 2011/4
Y1 - 2011/4
AB - Motivation: The database of known protein structures (PDB) is increasing rapidly. This results in a growing need for methods that can cope with the vast amount of structural data. To analyze the accumulating data, it is important to have a fast tool for identifying similar structures and clustering them by structural resemblance. Several excellent tools have been developed for the comparison of protein structures. These usually address the task of local structure alignment, an important yet computationally intensive problem due to its complexity. It is difficult to use such tools for comparing a large number of structures to each other in a reasonable time. Results: Here we present GOSSIP, a novel method for global all-against-all alignment of any set of protein structures. The method detects similarities between structures down to a certain cutoff (a parameter of the program), allowing it to detect similar structures at a much higher speed than local structure alignment methods. GOSSIP compares many structures several orders of magnitude faster than well-known available structure alignment servers, and it is also faster than a database scanning method. We evaluate GOSSIP both on a dataset of short structural fragments and on two large sequence-diverse structural benchmarks. We conclude that, for a threshold of 0.6 and above, the speed of GOSSIP is obtained with no compromise in the accuracy of the alignments or in the number of detected global similarities.
UR - http://www.scopus.com/inward/record.url?scp=79953300931&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btr044
DO - 10.1093/bioinformatics/btr044
M3 - Article
AN - SCOPUS:79953300931
SN - 1367-4803
VL - 27
SP - 925
EP - 932
JO - Bioinformatics
JF - Bioinformatics
IS - 7
M1 - btr044
ER -