One of the critical stages in drug development is the identification of potential side effects for promising drug leads. Large-scale clinical experiments aimed at discovering such side effects are very costly and may miss subtle or rare side effects. Previous attempts to systematically predict side effects are scarce and consider each side effect independently. In this work, we report on a novel approach to predict the side effects of a given drug, taking into consideration information on other drugs and their side effects. Starting from a query drug, a combination of canonical correlation analysis and network-based diffusion is applied to predict its side effects. We evaluate our method by measuring its performance in a cross-validation setting using a comprehensive data set of 692 drugs and their known side effects derived from package inserts. For 34% of the drugs, the top-scoring side effect matches a known side effect of the drug. Remarkably, even on unseen data, our method is able to infer side effects that closely match existing knowledge. In addition, we show that our method outperforms a prediction scheme that considers each side effect separately. Our method thus represents a promising step toward shortcutting the process and reducing the cost of side effect elucidation.
- canonical correlation analysis
- drug side effect
- drug target
- network diffusion
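
To make the network-based diffusion idea concrete, the following is a minimal sketch of label propagation over a drug similarity network. It is a generic illustration, not the formulation used in this work: the function name `diffuse_scores`, the damping parameter `alpha`, the iteration count, and the toy similarity values are all assumptions introduced here, and the canonical correlation analysis component is not shown.

```python
def diffuse_scores(similarity, known, query, alpha=0.8, n_iter=50):
    """Score side effects for a query drug by diffusion (illustrative sketch).

    similarity: n x n list of lists, non-negative drug-drug similarities
                (hypothetical values; how they are derived is not specified here).
    known:      n x m 0/1 matrix; known[i][j] == 1 if drug i has side effect j.
    query:      index of the query drug; its own labels are hidden.
    alpha:      assumed weight on neighbour scores versus the prior labels.
    """
    n = len(similarity)
    m = len(known[0])

    # Row-normalize the similarities (ignoring self-similarity) so that each
    # drug takes a weighted average over its neighbours.
    weights = []
    for i in range(n):
        row = [similarity[i][k] if k != i else 0.0 for k in range(n)]
        total = sum(row)
        weights.append([v / total if total > 0 else 0.0 for v in row])

    # Prior: known side effects, with the query drug's row zeroed out.
    prior = [
        [0.0] * m if i == query else [float(v) for v in known[i]]
        for i in range(n)
    ]

    # Iterate F <- alpha * W F + (1 - alpha) * prior.
    scores = [row[:] for row in prior]
    for _ in range(n_iter):
        scores = [
            [
                alpha * sum(weights[i][k] * scores[k][j] for k in range(n))
                + (1 - alpha) * prior[i][j]
                for j in range(m)
            ]
            for i in range(n)
        ]
    return scores[query]


# Toy example: drug 0 is the query and is most similar to drug 1.
similarity = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
known = [
    [0, 0],  # query drug: labels treated as unknown
    [1, 0],  # drug 1 has side effect 0
    [0, 1],  # drug 2 has side effect 1
]
query_scores = diffuse_scores(similarity, known, query=0)
top_side_effect = max(range(len(query_scores)), key=lambda j: query_scores[j])
```

In this toy network the query drug inherits a higher score for the side effect of its closest neighbour, so `top_side_effect` is the index of that side effect. Checking whether the top-scoring index belongs to a drug's known side effects, for each drug in turn, gives the kind of top-prediction hit rate reported in the evaluation.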