Similarity-based methods to predict drug targets, indications and side-effects

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Elucidating drug targets, potential indications and side effects are fundamental challenges in drug development. Key to addressing these challenges are methods that can integrate similarity information on drugs, genes, diseases and side effects from multiple sources. We present an array of similarity-based methods to predict drug properties that are extensible to additional emerging similarity measures among drug- and disease-related entities. The first method, Similarity-based Inference of drug-TARgets (SITAR), incorporates multiple drug-drug and gene-gene similarity measures for drug target prediction. SITAR consists of a new scoring scheme for drug-gene associations based on a given pair of drug-drug and gene-gene similarity measures, combined with a logistic regression component that integrates the scores of multiple measures to yield the final association score. We apply SITAR to predict targets for hundreds of drugs using both commonly used and novel drug-drug and gene-gene similarity measures and compare our results to existing state-of-the-art methods, markedly outperforming them. We then employ our framework to make novel target predictions for hundreds of drugs; we validate these predictions via curated databases that were not used in the learning stage. The second method, PREdiction of Drug IndiCaTions (PREDICT), is designed for the large-scale prediction of drug indications and can handle both approved drugs and novel molecules. PREDICT is based on the observation that similar drugs are indicated for similar diseases, and utilizes multiple drug-drug and disease-disease similarity measures for the prediction task. In cross-validation, it obtains high specificity and sensitivity (AUC=0.9) in predicting drug indications, surpassing existing methods. We validate our predictions by their overlap with drug indications that are currently in clinical trials, and by their agreement with tissue expression information for the drug targets.
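As a rough illustration of the idea that similar drugs tend to share similar targets, the sketch below scores a candidate drug-gene pair by its closest known association under one drug-drug and one gene-gene similarity measure, then passes the scores from several measure pairs through a logistic function. It is a minimal sketch under stated assumptions, not the published SITAR or PREDICT implementation: the geometric-mean scoring rule, the function names `association_scores` and `combine`, the toy matrices, and the hand-set weights are all assumptions made for illustration. In practice the weights would be fitted by logistic regression on known associations.

```python
import math

def association_scores(drug_sim, gene_sim, known_pairs):
    """Score every (drug, gene) pair by its most similar known association.

    drug_sim    : square list-of-lists of drug-drug similarities in [0, 1]
    gene_sim    : square list-of-lists of gene-gene similarities in [0, 1]
    known_pairs : list of (drug_index, gene_index) known associations
    The geometric mean used here is an assumed, illustrative scoring rule.
    """
    n_drugs, n_genes = len(drug_sim), len(gene_sim)
    scores = [[0.0] * n_genes for _ in range(n_drugs)]
    for d in range(n_drugs):
        for g in range(n_genes):
            scores[d][g] = max(
                (math.sqrt(drug_sim[d][d2] * gene_sim[g][g2])
                 for d2, g2 in known_pairs
                 if (d2, g2) != (d, g)),  # leave the query pair itself out
                default=0.0,
            )
    return scores

def combine(per_measure_scores, weights, bias):
    """Logistic combination of the scores obtained from several measure pairs.

    The weights and bias are hypothetical; they would normally be learned.
    """
    z = bias + sum(w * s for w, s in zip(weights, per_measure_scores))
    return 1.0 / (1.0 + math.exp(-z))

# Toy data (made up): 3 drugs, 2 genes, two known associations.
drug_sim = [[1.0, 0.81, 0.0],
            [0.81, 1.0, 0.25],
            [0.0, 0.25, 1.0]]
gene_sim = [[1.0, 0.64],
            [0.64, 1.0]]
known_pairs = [(0, 0), (2, 1)]

scores = association_scores(drug_sim, gene_sim, known_pairs)
# Drug 1 has no known target; rank its candidate genes by combined score.
ranked = sorted(range(len(gene_sim)),
                key=lambda g: -combine([scores[1][g]], [1.0], 0.0))
```

With a single measure pair the logistic step only rescales the score; it becomes useful when several drug-drug and gene-gene measures each contribute one feature per candidate pair.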
We further show that disease-specific genetic signatures can be used to accurately predict drug indications for new diseases (AUC=0.92). This lays the computational foundation for future personalized drug treatments, where gene expression signatures from individual patients would replace the disease-specific signatures. Finally, we present a novel approach to predict the side effects of a given drug, taking into consideration information on other drugs and their side effects. Starting from a query drug, a combination of canonical correlation analysis and network-based diffusion is applied to predict its side effects. We evaluate our method by measuring its performance in a cross-validation setting using a comprehensive data set of 692 drugs and their known side effects derived from package inserts. For 34% of the drugs, the top-scoring predicted side effect matches a known side effect of the drug. Remarkably, even on unseen data, our method is able to infer side effects that closely match existing knowledge. In addition, we show that our method outperforms a prediction scheme that considers each side effect separately. We believe that these methods represent a promising step toward shortcutting the process and reducing the cost of drug development.
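The network-based diffusion step can be pictured with a generic label-propagation sketch: known side effects spread over a drug-drug similarity network, and the scores that reach the query drug give a ranking of candidate side effects. This is only an assumed, simplified form. The canonical correlation analysis component is omitted, and the function name `diffuse`, the symmetric normalization, the value of `alpha`, and the toy data are illustrative choices rather than details taken from the abstract.

```python
import math

def diffuse(sim, known, alpha=0.5, n_iter=100):
    """Propagate known side effects over a drug-drug similarity network.

    sim   : symmetric list-of-lists of drug-drug similarities
    known : list-of-lists, known[i][s] = 1 if drug i has side effect s
            (the query drug's row is all zeros)
    alpha : assumed trade-off between neighbour scores and prior knowledge
    """
    n = len(sim)
    n_effects = len(known[0])
    # Drop self-similarity, then normalize each edge by both endpoint degrees.
    w = [[0.0 if i == j else float(sim[i][j]) for j in range(n)]
         for i in range(n)]
    deg = [sum(row) or 1.0 for row in w]
    w = [[w[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    prior = [[float(x) for x in row] for row in known]
    f = [row[:] for row in prior]
    for _ in range(n_iter):
        # Each drug mixes its neighbours' current scores with its own prior.
        f = [[alpha * sum(w[i][j] * f[j][s] for j in range(n))
              + (1 - alpha) * prior[i][s]
              for s in range(n_effects)]
             for i in range(n)]
    return f

# Toy data (made up): drug 3 is the query and is most similar to drug 0.
sim = [[1.0, 0.2, 0.1, 0.9],
       [0.2, 1.0, 0.3, 0.1],
       [0.1, 0.3, 1.0, 0.1],
       [0.9, 0.1, 0.1, 1.0]]
known = [[1, 0, 0],
         [0, 1, 0],
         [0, 0, 1],
         [0, 0, 0]]

diffused = diffuse(sim, known)
# Rank side effects for the query drug, highest diffused score first.
ranking = sorted(range(3), key=lambda s: -diffused[3][s])
```

Because all side effects are propagated over the same network in one pass, a drug's predicted profile reflects its neighbours' full side-effect sets rather than one independent prediction per side effect.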

Original language: English
Title of host publication: Proceedings - 4th International Conference on SImilarity Search and APplications, SISAP 2011
Number of pages: 2
State: Published - 2011
Event: 4th International Conference on SImilarity Search and APplications, SISAP 2011 - Lipari, Italy
Duration: 30 Jun 2011 → 1 Jul 2011

