TY - JOUR
T1 - Evolutionary selection against short nucleotide sequences in viruses and their related hosts
AU - Zarai, Yoram
AU - Zafrir, Zohar
AU - Siridechadilok, Bunpote
AU - Suphatrakul, Amporn
AU - Roopin, Modi
AU - Julander, Justin
AU - Tuller, Tamir
N1 - Publisher Copyright: © 2020 The Author(s). Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
PY - 2020/5/19
Y1 - 2020/5/19
N2 - Viruses are under constant evolutionary pressure to effectively interact with the host intracellular factors, while evading its immune system. Understanding how viruses co-evolve with their hosts is a fundamental topic in molecular evolution and may also aid in developing novel viral-based applications such as vaccines, oncologic therapies, and anti-bacterial treatments. Here, based on a novel statistical framework and a large-scale genomic analysis of 2,625 viruses from all classes infecting 439 host organisms from all kingdoms of life, we identify short nucleotide sequences that are under-represented in the coding regions of viruses and their hosts. These sequences cannot be explained by the coding regions' amino acid content, codon, and dinucleotide frequencies. We specifically show that short homooligonucleotide and palindromic sequences tend to be under-represented in many viruses, probably due to their effect on gene expression regulation and the interaction with the host immune system. In addition, we show that more sequences tend to be under-represented in dsDNA viruses than in other viral groups. Finally, we demonstrate, based on in vitro and in vivo experiments, how under-represented sequences can be used to attenuate Zika virus strains.
AB - Viruses are under constant evolutionary pressure to effectively interact with the host intracellular factors, while evading its immune system. Understanding how viruses co-evolve with their hosts is a fundamental topic in molecular evolution and may also aid in developing novel viral-based applications such as vaccines, oncologic therapies, and anti-bacterial treatments. Here, based on a novel statistical framework and a large-scale genomic analysis of 2,625 viruses from all classes infecting 439 host organisms from all kingdoms of life, we identify short nucleotide sequences that are under-represented in the coding regions of viruses and their hosts. These sequences cannot be explained by the coding regions' amino acid content, codon, and dinucleotide frequencies. We specifically show that short homooligonucleotide and palindromic sequences tend to be under-represented in many viruses, probably due to their effect on gene expression regulation and the interaction with the host immune system. In addition, we show that more sequences tend to be under-represented in dsDNA viruses than in other viral groups. Finally, we demonstrate, based on in vitro and in vivo experiments, how under-represented sequences can be used to attenuate Zika virus strains.
KW - Zika virus
KW - systems-biology
KW - under-represented sequences
KW - virus-host co-evolution
UR - http://www.scopus.com/inward/record.url?scp=85087320741&partnerID=8YFLogxK
U2 - 10.1093/dnares/dsaa008
DO - 10.1093/dnares/dsaa008
M3 - Article
C2 - 32339222
AN - SCOPUS:85087320741
SN - 1340-2838
VL - 27
JO - DNA Research
JF - DNA Research
IS - 2
M1 - dsaa008
ER -