By grouping patients based on gene activity, scientists show that shared molecular pathways, especially immune-related ones, help explain why some diseases overlap and others diverge, offering clues for treatment and prevention.

Single-strand ribonucleic acid (RNA) molecules, 3D illustration. Study: Patient stratification reveals the molecular basis of disease co-occurrences. Image credit: nobeastsofierce/Shutterstock.com

The use of genomic and transcriptomic data has greatly improved the understanding of many aspects of human physiology. A new paper in PNAS reports molecular-level associations between co-occurring diseases, identified from their RNA expression profiles.

The investigators went a step further by categorizing participants by their gene expression patterns. This revealed more disease groupings, both known and potential, offering possibilities for the systematic discovery of relationships between diseases at the molecular level. This could enhance treatment approaches to such comorbidities.

Introduction

Comorbidity refers to the occurrence of two or more disease conditions in the same patient or set of patients. Specific illnesses confer a higher risk for certain other conditions. These patterns of co-occurrence help predict the course and prognosis of the diseases, as well as the odds of developing specific secondary illnesses as a result of the index condition.

Shared disease-related genes may explain these co-occurrences and can be identified using network analysis. The authors of the present paper previously showed how gene expression profiles predicted disease similarity networks, uncovering known comorbidities.

However, earlier network studies failed to identify many known comorbidities. The current study used publicly available RNA-sequencing data, which offer greater sensitivity and reproducibility than earlier methods.

The investigators built a disease similarity network that recovered a much larger proportion of known comorbidities and added new associations. Next, they used differential gene expression data to build a stratified similarity network, grouping patients by their gene expression profiles.
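To make the idea of a disease similarity network concrete, the following is a minimal sketch, not the authors' pipeline. It assumes each disease is summarized by a vector of differential expression values over the same genes, and it links two diseases when the Pearson correlation of their vectors is strong. The similarity measure, the 0.5 threshold, the function names, and the toy profiles are all hypothetical choices made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def similarity_network(profiles, threshold=0.5):
    """Return {(disease_a, disease_b): "positive" | "negative"}.

    `profiles` maps a disease name to its differential expression vector.
    The threshold is an illustrative assumption, not a value from the study.
    """
    edges = {}
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if abs(r) >= threshold:
            # Similar profiles give a positive link, opposite ones a negative link.
            edges[(a, b)] = "positive" if r > 0 else "negative"
    return edges


# Toy data: four genes per disease, values invented for the example.
toy_profiles = {
    "disease_A": [2.0, 1.5, -1.0, -2.0],
    "disease_B": [1.8, 1.0, -0.5, -1.5],
    "disease_C": [-2.0, -1.0, 1.0, 2.5],
}
network = similarity_network(toy_profiles)
for pair, sign in network.items():
    print(pair, sign)
```

In this toy case, disease_A and disease_B change in the same direction and receive a positive link, while disease_C changes in the opposite direction and receives negative links to both. The same construction could in principle be applied to patient subgroups rather than whole diseases, which is the intuition behind a stratified network.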

Study findings

The networks identified direct and inverse comorbidities, that is, conditions that occur together more or less often than expected by chance. Most importantly, the stratified network recalled ~64% of epidemiologically known comorbidity pairs by analyzing patient subgroups with similar expression profiles. The results correlated with those from epidemiological studies, supporting the methodological soundness of the analysis.
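Recall, and the precision figures reported for the disease categories, can be illustrated with a short sketch. It assumes that predicted links and epidemiologically known comorbidities are both available as unordered disease pairs. The function names and the toy pairs are hypothetical and are not taken from the study's data.

```python
def normalize(pairs):
    """Treat each pair as unordered so ("A", "B") matches ("B", "A")."""
    return {frozenset(pair) for pair in pairs}


def precision_recall(predicted_pairs, known_pairs):
    """Precision: share of predicted pairs that are known comorbidities.
    Recall: share of known comorbidities recovered by the predictions."""
    predicted = normalize(predicted_pairs)
    known = normalize(known_pairs)
    hits = predicted & known
    precision = len(hits) / len(predicted) if predicted else 0.0
    recall = len(hits) / len(known) if known else 0.0
    return precision, recall


# Toy example with invented disease labels.
predicted = [("A", "B"), ("B", "C"), ("C", "D")]
known = [("B", "A"), ("C", "D"), ("D", "E"), ("A", "E")]
precision, recall = precision_recall(predicted, known)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the three predicted pairs are known comorbidities, and two of the four known comorbidities are recovered. A recall of ~64% means the network recovered roughly two-thirds of the known pairs it was compared against.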

Identified associations include those of inflammatory bowel disease (IBD) with lung or liver cancer, and of Kaposi’s sarcoma with HIV infection. Some less obvious associations were also identified, such as that between Kaposi’s sarcoma and immunological diseases like IBD.

Pathway activity also diverged between diseases. Kinesin pathways were enriched in cancer but were lower than expected in Huntington’s disease. Conversely, Huntington’s disease showed increased Th1/IL-12 signaling and complement activation, whereas these pathways were underexpressed in several cancers, illustrating opposite immune tendencies.

Gut-related comorbidities had the highest precision, at 66.4%. Neoplasms showed the lowest precision, while mental disorders tended to have lower recall. Notably, 95.2% of disease similarity network (DSN) interactions that match epidemiology share one or more overexpressed immune pathways. More than 90% share metabolic or extracellular matrix pathways.

Common mechanisms of comorbidities

The study suggests common underlying biological explanations for comorbidities with a strong immune component and reveals multiple deeper relationships between diseases.

These common underlying mechanisms can be of three types: both diseases share the same pathway; one condition alters pathways, causing the second condition; or a third condition causes changes that increase the risk of the other two.

Multiple combinations of these may also occur, especially with chronic medical conditions.

Therefore, not all associations between diseases reflect actual risk increases. Some mirror similarities in dysregulated pathways. Others correspond to comorbidities that have not been widely recognized, such as breast cancer with colorectal or thyroid cancer, and thyroid cancer with ulcers due to radiation treatment.

For instance, metabolic syndrome both results from and drives the progression of the metabolic trajectory that involves obesity, insulin resistance, diabetes, cardiovascular disease, and cancer.

Disease subtypes and comorbidities

Disease patterns and subtypes also modify comorbidity incidence as they involve distinct gene expression patterns. The current study suggests that certain breast cancer patients are more likely to have autism and bipolar disorder, though the supporting evidence is mixed or nonsignificant in some cases.

Down syndrome was also associated with a higher risk of childhood leukemia and multiple autoimmune diseases, especially celiac disease, which has a sixfold higher incidence. These risks are linked to widespread changes in the immune system.

Conclusions

The study relies on disease similarity networks built from gene expression profiles, which provided associations between comorbidities at an unprecedented scale. The networks indicate that “comorbidities have a strong molecular component that is better captured with gene expression profiles than with other molecular sources,” and provide “a systematic framework for translating disease co-occurrences into molecular patterns”.

The study clarifies the biological processes involved, helping explain how these conditions arise and why they co-occur, with a strong emphasis on immune pathways. It could perhaps guide drug repurposing and drug development efforts.

The methodology overcame earlier systemic limitations such as a biased and inadequate knowledge of disease-associated genes and disease interactions. Using uniformly processed RNA-seq with study effect adjustment improved sensitivity and reproducibility; links were further cross-checked against epidemiology and literature.

Stratifying patients by gene expression “phenotypes” excluded non-significant pathway alterations. Finally, the approach identifies both positive and negative (inverse) correlations, although, due to data limitations, only positive links could be systematically compared with epidemiology.

Further research is required to validate the negative associations, obtain generalized epidemiological network data, and correlate demographic and treatment-related data with gene expression differences. Larger sample sizes would help achieve these goals.

Journal reference:

Urda-Garcia, B., Sanchez-Valle, J., Lepore, R., et al. (2025). Patient stratification reveals the molecular basis of disease co-occurrences. PNAS. doi: https://doi.org/10.1073/pnas.2421060122. https://www.pnas.org/doi/10.1073/pnas.2421060122