Background: mRNA can form local secondary structure within the protein-coding sequence, and the strength of this structure is thought to influence the regulation of gene expression. Previous studies suggest that secondary structure strength may be maintained under selection, but the details of this phenomenon are not well understood.

Results: We perform a comprehensive study of selection on local mRNA folding strength, considering variation between species across the tree of life. We show for the first time that selection on local folding strength tends to follow a conserved characteristic profile in most phyla: selection for weak folding at the two ends of the coding region, selection for strong folding elsewhere in the coding sequence, and an additional peak of selection for strong folding downstream of the start codon. The strength of this pattern varies between species and organism groups, and we highlight cases that contradict it. To better understand the underlying evolutionary process, we show that selection strengths in the different regions are strongly correlated, and we report four factors that have a clear predictive effect on local mRNA folding selection within the coding sequence in different species.

Conclusions: The correlations observed between selection for local secondary structure strength in the different regions, and between this selection and the four genomic and environmental factors, suggest that local folding strengths are shaped by the same evolutionary process throughout the coding sequence, and might be maintained under direct selection related to optimization of gene expression and specifically translation regulation.
- Codon usage
- Comparative genomics
- Gene expression regulation
- Protein-coding sequence evolution
- mRNA secondary structure