In the rapidly evolving landscape of clinical genomics, the advent of high-throughput sequencing has made it easier than ever to identify rare and novel genetic variants. However, this progress has also presented a significant diagnostic challenge: once a variant is found, its clinical significance is often ambiguous. For many patients and clinicians, the question remains: does this specific genetic mutation mean that a patient will actually develop a disease? This is the central problem of genetic penetrance, a term that describes the proportion of individuals carrying a particular genetic variant who also express an associated trait or disease.

Traditional approaches to estimating penetrance have often been limited to large-scale population studies that classify outcomes in a binary fashion—either a patient has the disease or they do not. While effective for highly penetrant Mendelian diseases, this binary classification fails to capture the nuanced, spectrum-based nature of complex, multifactorial diseases such as diabetes, cardiovascular disease, or hypertension. These conditions are influenced by a combination of genetic predisposition and environmental factors, manifesting across a wide range of severities.
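To make the binary definition concrete, the short sketch below computes penetrance as the fraction of variant carriers who have the disease. It is illustrative only: the `Carrier` record, the `binary_penetrance` function, and the four example patients are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass
class Carrier:
    """Hypothetical record for one person who carries the variant."""
    patient_id: str
    has_disease: bool  # binary outcome: diagnosed or not


def binary_penetrance(carriers):
    """Return the proportion of carriers who express the disease."""
    if not carriers:
        raise ValueError("at least one carrier is required")
    affected = sum(1 for c in carriers if c.has_disease)
    return affected / len(carriers)


# Made-up example: two of four carriers are affected.
carriers = [
    Carrier("p1", True),
    Carrier("p2", False),
    Carrier("p3", False),
    Carrier("p4", True),
]
print(binary_penetrance(carriers))  # 0.5
```

Because each carrier contributes only a yes or a no, a mildly affected patient and a severely affected one count the same, which is the limitation described above.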

A new study by researchers at the Icahn School of Medicine at Mount Sinai has addressed this critical gap by developing an innovative, data-driven methodology. Their work leverages artificial intelligence (AI) and machine learning to move beyond the simple yes/no diagnosis, offering a more precise and scalable way to assess genetic disease risk. The findings were published in the August 28 online issue of Science.

A Machine Learning-Based Framework for Quantifying Risk

The Mount Sinai team’s methodology represents a paradigm shift in how disease penetrance is evaluated. Instead of relying on a binary outcome, they trained machine learning models to quantify disease on a continuous spectrum. These models were built using vast amounts of real-world patient data sourced from more than 1 million electronic health records (EHRs).

The core of this approach lies in the data used to train the models. Rather than relying solely on a physician’s diagnosis, the researchers incorporated routine clinical and laboratory test results that are readily available within EHRs. These included common metrics such as cholesterol levels, blood counts, and kidney function panels. By training the AI models on these granular, quantitative data points, the system gained a more nuanced understanding of disease progression. For example, instead of simply noting a diagnosis of hypertension, the model could interpret a patient’s blood pressure readings, a much more informative and continuous variable. This ability to analyze and interpret a spectrum of biochemical and physiological data allows the model to predict the likelihood of a given genetic variant contributing to disease expression.

This predictive capability is expressed as an “ML penetrance” score, which ranges from 0 to 1. A score closer to 1 signifies a higher likelihood that the genetic variant will lead to disease development, while a score closer to 0 indicates a minimal or negligible risk. By applying this framework, the team calculated ML penetrance scores for more than 1,600 rare genetic variants, providing actionable insights into variants previously labeled as “uncertain” or “of unknown significance.”
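One way to picture a score on this 0-to-1 scale is sketched below. It assumes that a trained model has already assigned each patient a continuous disease score between 0 and 1, and it summarizes a variant by averaging those scores across its carriers. The averaging step, the function name `ml_penetrance_sketch`, and the example values are assumptions made for illustration; the article does not describe the study's actual computation.

```python
from statistics import mean


def ml_penetrance_sketch(disease_scores, carrier_ids):
    """Average hypothetical model-derived disease scores over variant carriers.

    disease_scores: dict mapping patient ID to a score in [0, 1]
                    (assumed output of a model trained on EHR lab data).
    carrier_ids:    IDs of patients who carry the variant of interest.
    """
    scores = [disease_scores[pid] for pid in carrier_ids if pid in disease_scores]
    if not scores:
        raise ValueError("no scored carriers for this variant")
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("disease scores must lie between 0 and 1")
    return mean(scores)


# Made-up scores for three patients; p1 and p2 carry the variant.
disease_scores = {"p1": 0.75, "p2": 0.25, "p3": 0.5}
print(ml_penetrance_sketch(disease_scores, ["p1", "p2"]))  # 0.5
```

Unlike a binary tally, this kind of summary lets a carrier with borderline lab values contribute less than one with clearly abnormal values.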

“We wanted to move beyond black-and-white answers that often leave patients and providers uncertain about what a genetic test result actually means,” says Ron Do, PhD, senior study author and the Charles Bronfman Professor in Personalized Medicine at the Icahn School of Medicine at Mount Sinai. “By using artificial intelligence and real-world lab data, such as cholesterol levels or blood counts that are already part of most medical records, we can now better estimate how likely disease will develop in an individual with a specific genetic variant. It’s a much more nuanced, scalable, and accessible way to support precision medicine, especially when dealing with rare or ambiguous findings.”

The Power of Data: Unveiling Hidden Connections

The application of this model yielded some surprising results. Genetic variants that had been previously classified as having “uncertain significance” displayed clear and quantifiable disease signals, suggesting a definite role in disease pathogenesis. Conversely, other variants that were traditionally thought to be causative showed minimal or no effect in the real-world data analyzed by the AI models.

These findings highlight a critical advantage of the Mount Sinai methodology: it leverages the immense, previously untapped potential of routine lab data and EHRs. This wealth of information can provide a more accurate and comprehensive picture of a variant’s clinical impact than traditional, smaller-scale genetic studies.

Impact on Clinical Laboratories and Medical Practice

For laboratory professionals, this study has profound implications. The ML penetrance score could be integrated into genetic testing reports, providing clinicians with a more precise tool for patient care. Instead of a vague “variant of unknown significance,” a report could include a data-driven score that helps inform a care plan.

How This Technology Impacts the Lab-to-Clinic Workflow:

Refined Variant Interpretation: The score offers a quantitative metric to assist in the classification of rare and novel variants. This can reduce the number of ambiguous findings, streamlining the variant interpretation process.

Targeted Screening and Prevention: For patients with a high ML penetrance score for a specific disease, the data could trigger a recommendation for earlier or more frequent screenings. For instance, a patient with a rare genetic variant linked to Lynch syndrome could be flagged for earlier and more frequent colonoscopies if their score is high.

Preventing Overtreatment: Conversely, a low ML penetrance score could help clinicians and patients avoid unnecessary worry, aggressive monitoring, or even overtreatment for a variant that poses minimal real-world risk.

As lead study author Iain S. Forrest, MD, PhD, notes, “While our AI model is not meant to replace clinical judgment, it can potentially serve as an important guide, especially when test results are unclear.” The model is a clinical decision support tool, not a substitute for a comprehensive evaluation by a healthcare professional.

The Road Ahead: Scalability and Validation

Looking to the future, the Mount Sinai team is focused on expanding the scope and capabilities of their model. Their ongoing work includes:

Increased Breadth: Expanding the model to cover more diseases and a broader spectrum of genetic changes.

Diverse Populations: Applying the model to more diverse patient cohorts to ensure its accuracy and generalizability across different ethnic and racial groups.

Prospective Validation: Tracking how the predictions hold up over time in real-world patient populations. This will involve long-term studies to determine if individuals with high-risk variants actually go on to develop the predicted diseases and if early clinical action can mitigate the risk.

This approach signifies a new frontier in the application of data science and AI to genetic disease risk assessment. “Ultimately, our study points to a potential future where AI and routine clinical data work hand in hand to provide more personalized, actionable insights for patients and families navigating genetic test results,” says Dr. Do. “Our hope is that this becomes a scalable way to support better decisions, clearer communication, and more confidence in what genetic information really means.”

This work not only advances our understanding of complex genetic interactions but also provides a powerful framework for integrating new technological capabilities into the daily practice of precision medicine, from the diagnostic laboratory to the patient’s bedside.