TY - JOUR
T1 - Prediction and large-scale analysis of primary operons in plastids reveals unique genetic features in the evolution of chloroplasts
AU - Shahar, Noam
AU - Weiner, Iddo
AU - Stotsky, Lior
AU - Tuller, Tamir
AU - Yacoby, Iftach
N1 - Publisher Copyright: © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License
PY - 2019/4/23
Y1 - 2019/4/23
N2 - While bacterial operons have been thoroughly studied, few analyses of chloroplast operons exist, limiting the ability to study fundamental elements of these structures and utilize them for synthetic biology. Here, we describe the creation of a plastome-specific operon database (link provided below) achieved by combining experimental tools and predictive modeling. Using a reverse-transcription PCR-based method and published data, we determined the transcription state of 213 gene pairs from four plastomes of evolutionarily distinct organisms. By analyzing sequence-based features computed for our dataset, we were able to highlight fundamental characteristics differentiating operon pairs from non-operon pairs. These include an interesting tendency toward maintaining similar messenger RNA-folding profiles in operon gene pairs, a feature that failed to yield any informative separation in cyanobacteria, suggesting that it captures unique traits of operon gene expression that evolved post-endosymbiosis. Subsequently, we used this feature set to train a random-forest classifier for operon prediction. As our results demonstrate the ability of our predictor to obtain accurate (84%) and robust predictions on unlabeled datasets, we proceeded to build operon maps for 2018 sequenced plastids. Our database may now present new opportunities for promoting metabolic engineering and synthetic biology in chloroplasts.
UR - http://www.scopus.com/inward/record.url?scp=85064976851&partnerID=8YFLogxK
U2 - 10.1093/nar/gkz151
DO - 10.1093/nar/gkz151
M3 - Article
AN - SCOPUS:85064976851
SN - 0305-1048
VL - 47
SP - 3344
EP - 3352
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - 7
ER -