TY - JOUR
T1 - A widespread role of the motif environment in transcription factor binding across diverse protein families
AU - Dror, Iris
AU - Golan, Tamar
AU - Levy, Carmit
AU - Rohs, Remo
AU - Mandel-Gutfreund, Yael
N1 - Publisher Copyright: © 2015 Dror et al.
PY - 2015/9/1
Y1 - 2015/9/1
N2 - Transcriptional regulation requires the binding of transcription factors (TFs) to short sequence-specific DNA motifs, usually located at the gene regulatory regions. Interestingly, based on a vast amount of data accumulated from genomic assays, it has been shown that only a small fraction of all potential binding sites containing the consensus motif of a given TF actually bind the protein. Recent in vitro binding assays, which exclude the effects of the cellular environment, also demonstrate selective TF binding. An intriguing conjecture is that the surroundings of cognate binding sites have unique characteristics that distinguish them from other sequences containing a similar motif that are not bound by the TF. To test this hypothesis, we conducted a comprehensive analysis of the sequence and DNA shape features surrounding the core-binding sites of 239 and 56 TFs extracted from in vitro HT-SELEX binding assays and in vivo ChIP-seq data, respectively. Comparing the nucleotide content of the regions around the TF-bound sites to the counterpart unbound regions containing the same consensus motifs revealed significant differences that extend far beyond the core-binding site. Specifically, the environment of the bound motifs demonstrated unique sequence compositions, DNA shape features, and overall high similarity to the core-binding motif. Notably, the regions around the binding sites of TFs that belong to the same TF families exhibited similar features, with high agreement between the in vitro and in vivo data sets. We propose that these unique features assist in guiding TFs to their cognate binding sites.
AB - Transcriptional regulation requires the binding of transcription factors (TFs) to short sequence-specific DNA motifs, usually located at the gene regulatory regions. Interestingly, based on a vast amount of data accumulated from genomic assays, it has been shown that only a small fraction of all potential binding sites containing the consensus motif of a given TF actually bind the protein. Recent in vitro binding assays, which exclude the effects of the cellular environment, also demonstrate selective TF binding. An intriguing conjecture is that the surroundings of cognate binding sites have unique characteristics that distinguish them from other sequences containing a similar motif that are not bound by the TF. To test this hypothesis, we conducted a comprehensive analysis of the sequence and DNA shape features surrounding the core-binding sites of 239 and 56 TFs extracted from in vitro HT-SELEX binding assays and in vivo ChIP-seq data, respectively. Comparing the nucleotide content of the regions around the TF-bound sites to the counterpart unbound regions containing the same consensus motifs revealed significant differences that extend far beyond the core-binding site. Specifically, the environment of the bound motifs demonstrated unique sequence compositions, DNA shape features, and overall high similarity to the core-binding motif. Notably, the regions around the binding sites of TFs that belong to the same TF families exhibited similar features, with high agreement between the in vitro and in vivo data sets. We propose that these unique features assist in guiding TFs to their cognate binding sites.
UR - http://www.scopus.com/inward/record.url?scp=84940998400&partnerID=8YFLogxK
U2 - 10.1101/gr.184671.114
DO - 10.1101/gr.184671.114
M3 - Article
AN - SCOPUS:84940998400
SN - 1088-9051
VL - 25
SP - 1268
EP - 1280
JO - Genome Research
JF - Genome Research
IS - 9
ER -