The study, published in the journal Science, reveals that the power of plant genomes lies not only in their protein-coding genes, but also in ancient regulatory DNA sequences that control where, when and how strongly genes are expressed.
In animals, many of these regulatory DNA sequences, called cis-regulatory elements, persist across deep evolutionary time as conserved non-coding sequences (CNSs).
These sequences are central to evolution. For example, humans and chimpanzees share about 98% of the same protein-coding genes. The differences between humans and chimps lie not in their genes, but in the regulatory DNA that controls when and where these genes are switched on.
Scientists have long searched for similar ancient regulatory sequences in plants, but with limited success. Now, The Conservatory Project team has revealed ancient regulatory sequences that have been hiding in plain sight.
Professor Madelaine Bartlett, who co-led the study and is a group leader at the Sainsbury Laboratory Cambridge University, explained: “Plant genes are continually shuffling themselves around, which makes the links between genes and their master switches extremely hard to spot.
“Repeated duplication of entire genomes, followed by gene loss and rearrangement, hides relationships between genes and their master switches from us. As a result, it was thought CNSs were rare in plants and those we knew about were thought to be young, in evolutionary terms.”
The missing manual of plant evolution
The team designed a new gene-centric computational platform that used genetic data from 284 plant species, generated by the global plant research community, to detect conserved regulatory DNA across deep time while accounting for gene duplication and rapid divergence.
They identified over two million ancient gene master switches, which control gene expression across 284 plant species from 73 plant families. This includes DNA switches that pre-date the emergence of flowering plants over 300 million years ago.
The vast and previously hidden trove of ancient regulatory DNA sequences has stood the test of evolutionary time, remaining stable and controlling plant development despite millions of years of genetic shuffling.
“The power in plant genomes isn’t just in their genes – it’s also in the DNA switches that control them,” said Bartlett. “By identifying regulatory sequences that have been conserved for hundreds of millions of years, we can begin to pinpoint the most important switches controlling plant traits.”
A new tool to inform crop engineering
The ability to engineer crop traits with speed and precision is crucial as agriculture grapples with the triple threat of climate change, increasing levels of crop disease and rising food demands.
However, the challenge is no longer whether plants can be engineered, but which exact DNA sequences should be targeted to produce predictable and beneficial traits – such as drought tolerance or pest resistance.
In crop gene editing, the focus has moved on from simply ‘knocking out’ or duplicating genes to a more sophisticated approach targeting the DNA sequences that regulate these genes.
Editing coding sequences is a heavy-handed approach. If a gene is knocked out entirely, the result is often a drastic change that is too abnormal for agricultural use. What plant breeders want is the ability to ‘fine-tune’ traits – that’s the job of cis-regulatory elements.
For example, the CLAVATA3 gene in tomatoes plays a crucial role in regulating fruit size. If the CLAVATA3 gene itself is mutated, it results in big, ugly, misshapen tomatoes, but if the regulatory sequences are mutated, the result is something more intermediate and useful. CLAVATA3 genes act similarly in maize.
Mutations in non-coding, regulatory DNA nudge a gene’s expression and function, causing, for example, a fruit to be slightly larger. These subtle shifts are often exactly what agriculture needs. Identifying these ancient non-coding DNA sequences, once dismissed as ‘junk’, will be key for the future of crop trait editing.
“For my lab, and others, this dataset is a treasure trove,” said Bartlett. “We now have thousands of regulatory elements to explore, both to understand plant evolution, and to manipulate in agriculture. We haven’t found all the CNSs yet, but now we have the tools to look.”
The project was led by the labs of Madelaine Bartlett (Sainsbury Laboratory Cambridge University), Idan Efroni (The Hebrew University of Jerusalem), and Zachary Lippman (Cold Spring Harbor Laboratory), together with joint first authors Kirk R. Amundson from University of Massachusetts Amherst and Anat Hendelman from Cold Spring Harbor Laboratory.
The Conservatory data set for 284 plant species is available here.
This research was supported by the United States-Israel Binational Science Foundation, Israel Science Foundation, Howard Hughes Medical Institute, U.S. National Science Foundation, USDA AFRI and The Gatsby Charitable Foundation.
Reference
Kirk R. Amundson, Anat Hendelman, Danielle Ciren, Hailong Yang, Amber E. de Neve, Shai Tal, Adar Sulema, David Jackson, Madelaine E. Bartlett, Zachary B. Lippman, Idan Efroni (Science, 2025). ‘A deep-time landscape of plant cis-regulatory sequence evolution’. DOI: 10.1126/science.adt8983