By tracing seed evolution across 250 million years, NYU and New York Botanical Garden researchers uncovered more than 4,000 ovule genes that could reshape how breeders engineer yield, quality and resilience.
When scientists look for the origins of agriculture’s most powerful invention — the seed — they often stop at the rise of flowering plants. New York University (NYU) and the New York Botanical Garden (NYBG) collaborated to go further back, to the first “naked seeds” that appeared long before corn, soybeans or wheat.
The researchers built a broad evolutionary dataset across conifers, cycads, ginkgo and early flowering plants, then mapped the genetic switches that helped seeds emerge and thrive. Their results point to a deep genetic toolkit that modern breeders can use to build tougher crops with better seed performance.
The study analyzed ovule and leaf tissues from 20 species that bracket key evolutionary splits. That breadth let the team compare reproductive gene activity across seed plants and their closest non-seed relatives, then ask which genes were most decisive in the shift from spores to seeds. The work identified more than 4,000 candidate ovule genes that rose to prominence at major branch points, many of them developmentally regulated during ovule formation. Those same programs still influence how seeds develop and store resources in today’s crops.
“The defining trait of gymnosperms, which survived mass extinctions, are ‘naked seeds’ exposed to environmental stress,” NYU biology professor Gloria Coruzzi says. “Thus, our discovery that genes with strong influence on seed evolution in gymnosperms have counterparts in flowering crop plants opens enormous opportunities for engineering or molecular breeding of environmentally resilient seeds in agriculture.”
A Genomic Time Capsule
The team used deep RNA sequencing to capture which genes are switched on in ovules versus leaves across a wide phylogenetic spread, including conifers such as Metasequoia and Sciadopitys, cycads, ginkgo and a set of angiosperms used as anchors. They then grouped transcripts into ortholog families that share ancestry, reconstructed evolutionary relationships and asked which gene groups best explain the splits that produced modern seed plants. Because the analysis pairs phylogeny with gene expression, it highlights not only what changed over time, but also when certain programs became active in ovule tissues.
That approach revealed a consistent pattern. Genes that flip on in ovules during key developmental windows carry a disproportionate share of the signal that distinguishes seed plants from their ancestors. In other words, the regulatory timing of gene expression, not just gene presence, helped drive the emergence of integuments, nutrient stores and the embryo protection strategies that make seeds so successful. For breeders, that is a practical message: developmental timing is selectable and editable, and it often shows clear links to seed size, seed fill, dormancy and vigor.
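The ovule-versus-leaf comparison described in this section can be illustrated with a minimal Python sketch. It is not the study's pipeline: the ortholog group names, species labels, expression values, pseudocount and fold-change threshold are all hypothetical, chosen only to show how a group might be flagged when its ovule expression exceeds its leaf expression in every species sampled.

```python
import math
from collections import defaultdict

# Hypothetical toy data: (ortholog_group, species, tissue) -> expression level.
# Every name and number here is invented for illustration, not taken from the study.
EXPRESSION = {
    ("OG_A", "ginkgo", "ovule"): 120.0,
    ("OG_A", "ginkgo", "leaf"): 10.0,
    ("OG_A", "cycad", "ovule"): 80.0,
    ("OG_A", "cycad", "leaf"): 9.0,
    ("OG_B", "ginkgo", "ovule"): 15.0,
    ("OG_B", "ginkgo", "leaf"): 14.0,
    ("OG_B", "cycad", "ovule"): 40.0,
    ("OG_B", "cycad", "leaf"): 45.0,
}


def log2_fold_change(ovule, leaf, pseudocount=1.0):
    """Log2 ratio of ovule to leaf expression; the pseudocount avoids division by zero."""
    return math.log2((ovule + pseudocount) / (leaf + pseudocount))


def ovule_enriched_groups(expression, min_log2fc=2.0):
    """Return ortholog groups whose ovule/leaf log2 fold change meets the
    threshold in every species that has both tissues measured."""
    per_group = defaultdict(dict)
    for (group, species, tissue), value in expression.items():
        per_group[group].setdefault(species, {})[tissue] = value

    enriched = []
    for group, by_species in per_group.items():
        changes = [
            log2_fold_change(tissues["ovule"], tissues["leaf"])
            for tissues in by_species.values()
            if "ovule" in tissues and "leaf" in tissues
        ]
        if changes and all(change >= min_log2fc for change in changes):
            enriched.append(group)
    return sorted(enriched)


if __name__ == "__main__":
    print(ovule_enriched_groups(EXPRESSION))
```

A real analysis would use replicated samples and a statistical test rather than a fixed cutoff; the sketch only shows the shape of the comparison.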
The Ancient Blueprint for Modern Traits
Among the most informative sets were the MADS-box and BELL gene families, long studied in flowering plants for roles in floral identity, ovule patterning and stress signaling. The new work pushes their roots deeper, tying these regulators to early seed evolution in gymnosperms. That matters for agriculture because it points to conserved switches that breeders already know how to track with markers, validate in model systems and tune with modern editing tools.
“MADS-box genes have been heavily studied for their connections to seed and flower development and stress resilience,” NYU and New York Botanical Garden researcher Veronica Sondervan says. “These newly discovered genes in gymnosperm ovules have direct relevance to breeding for commercial timber and could potentially be used by plant breeders to introduce desired traits via genetic editing.”
The researchers also mined a well-annotated model, Arabidopsis thaliana, to look for previously overlooked genes that behave like ovule regulators. They flagged more than a thousand candidates that had escaped prior annotations, which suggests there are still hidden levers in the most studied plant genome. That is good news for seed companies that rely on Arabidopsis and maize as functional proving grounds before moving targets into priority crops.
“Ovule genes shared in seed plants would include genetic pathways for subsequent seed growth,” New York Botanical Garden plant genomics curator Barbara Ambrose says. “These genes could be used in crop breeding for increasing seed size among other seed traits.”
When 250 Million Years of Biology Meet the Breeding Pipeline
The genetic themes that allowed gymnosperms to ride out ancient drought, heat and nutrient stress line up with the traits growers need now. The same categories of genes that govern integument development, tissue specification and resource loading in ovules tie directly to seed weight, seed composition and germination behavior. Those traits can be tracked in breeding with expression markers or SNP haplotypes, then tuned with CRISPR when a strong causal variant is in reach.
Three near-term plays stand out for commercial teams:
Genomic mining for seed vigor and longevity. Developmentally regulated ovule genes are natural candidates for seed longevity, a priority for both retail shelf life and carryover inventory. Companies can screen elite lines for alleles at these regulators, test for correlations with accelerated aging assays, then move promising haplotypes with backcrossing or editing.
Stress-readiness without yield drag. Gymnosperm-derived insights into desiccation tolerance and resource buffering can inform seed coat architecture, protective protein accumulation and hormonal timing that supports emergence under low moisture. Because the targets are regulatory nodes rather than large-effect structural genes, there is a path to stress resilience without heavy penalties on yield in good years.
Precision seed size and composition. BELL and MADS-box members influence cell identity and growth in the developing ovule. Tuning their timing or dosage offers a route to predictable shifts in seed size, protein-to-oil balance, or endosperm development. That connects upstream genetics to downstream processors who want more uniformity and less sorting loss.
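The first play above, screening lines for alleles and testing them against accelerated aging results, can be sketched as a simple correlation ranking. This is a first-pass illustration under stated assumptions: the marker names, allele dosages (0, 1 or 2 copies) and germination percentages are hypothetical, and a production pipeline would also need to account for population structure and multiple testing.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = mean(xs), mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


def rank_markers(dosages_by_marker, phenotype):
    """Rank markers by the absolute correlation of allele dosage with the phenotype."""
    scored = [
        (marker, pearson_r(dosages, phenotype))
        for marker, dosages in dosages_by_marker.items()
    ]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical data: six lines, germination (%) after an accelerated aging assay.
GERMINATION = [60, 62, 70, 72, 80, 82]
DOSAGES = {
    "marker_1": [0, 0, 1, 1, 2, 2],  # hypothetical marker near an ovule regulator
    "marker_2": [2, 0, 1, 0, 2, 1],  # hypothetical unlinked marker
}

if __name__ == "__main__":
    for marker, r in rank_markers(DOSAGES, GERMINATION):
        print(f"{marker}: r = {r:.2f}")
```

A strong correlation in a screen like this only nominates a haplotype for follow-up; it does not establish that the variant causes the trait.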
A Practical Path From Lab to Field
The study’s value grows when paired with the tools seed companies already run at scale. Genomic selection models can absorb new markers tied to ovule regulators and improve prediction for traits that are costly to phenotype, like longevity or emergence under stress. Functional screens in maize, soybean, or canola can validate candidate genes from gymnosperm-informed shortlists using transient assays, then confirm in edited lines. Trait integration teams can layer the best variants into existing stacks with low regulatory risk when edits mimic natural alleles.
Translating this kind of discovery also benefits from pre-competitive collaboration. Public datasets reduce duplication, and shared standards for ovule sampling and expression analysis make company-to-company findings more comparable. Once a small set of high-confidence targets is validated in models, the private work of trait integration, germplasm fit and market positioning can proceed faster.
“Our analysis of 14 gymnosperm genomes enabled us to discover 1,224 undocumented candidate ovule-regulated genes in the most highly annotated plant genome, Arabidopsis thaliana,” Coruzzi says. “A public–private partnership would enhance the translation of this gold-mine of candidate genes for seed resilience from lab to field.”
The Next Frontier in Seed Design
There is a broader strategic signal here. Rather than only pushing deeper into variation within a crop, this work pulls knowledge laterally from a lineage that solved similar problems under different constraints. That cross-lineage perspective is how the industry found new herbicide targets, new disease resistance genes and new quality traits in the past. Gymnosperm ovule biology adds another lens, with a focus on the programs that make seeds seeds.
For R&D leaders, the researchers say the next steps are clear. Prioritize a short list of ovule regulators with strong evolutionary support and clean expression patterns. Build quick validation assays that tie those regulators to seed traits in one model and one priority crop. Add the best markers to genomic prediction pipelines, then start the slow, steady work of stacking the right alleles into elite families. Keep an eye on seed longevity, emergence under deficit moisture and uniformity of seed fill, the traits that pay in both farmer fields and processing plants.
As global agriculture faces pressure to grow more with fewer inputs, the ability to design seeds that perform under stress will define the next decade. Gymnosperms may have been first to the seed, but their legacy can help seed companies build the next generation of resilient, high-performing crops. The team says the blueprint is ancient, the opportunity is now.