TY - JOUR
T1 - Transcription factor family-specific DNA shape readout revealed by quantitative specificity models
AU - Yang, Lin
AU - Orenstein, Yaron
AU - Jolma, Arttu
AU - Yin, Yimeng
AU - Taipale, Jussi
AU - Shamir, Ron
AU - Rohs, Remo
N1 - Publisher Copyright: © 2017 The Authors. Published under the terms of the CC BY 4.0 license
PY - 2017/2/1
Y1 - 2017/2/1
AB - Transcription factors (TFs) achieve DNA-binding specificity through contacts with functional groups of bases (base readout) and readout of structural properties of the double helix (shape readout). Currently, it remains unclear whether DNA shape readout is utilized by only a few selected TF families, or whether this mechanism is used extensively by most TF families. We resequenced data from previously published HT-SELEX experiments, the most extensive mammalian TF–DNA binding data available to date. Using these data, we demonstrated the contributions of DNA shape readout across diverse TF families and its importance in core motif-flanking regions. Statistical machine-learning models combined with feature-selection techniques helped to reveal the nucleotide position-dependent DNA shape readout in TF-binding sites and the TF family-specific position dependence. Based on these results, we proposed novel DNA shape logos to visualize the DNA shape preferences of TFs. Overall, this work suggests a way of obtaining mechanistic insights into TF–DNA binding without relying on experimentally solved all-atom structures.
KW - DNA shape
KW - binding specificity
KW - feature selection
KW - quantitative modeling
KW - transcription factor
UR - http://www.scopus.com/inward/record.url?scp=85013904734&partnerID=8YFLogxK
U2 - 10.15252/msb.20167238
DO - 10.15252/msb.20167238
M3 - Article
C2 - 28167566
AN - SCOPUS:85013904734
SN - 1744-4292
VL - 13
JO - Molecular Systems Biology
JF - Molecular Systems Biology
IS - 2
M1 - 910
ER -