Forecasting disease

After the training phase, the researchers could fine-tune the model for different tasks.

First, they tested the model on standard sleep analysis tasks, such as classifying different stages of sleep and diagnosing the severity of sleep apnea. SleepFM performed as well as or better than state-of-the-art models used today.

Then the researchers tackled a more ambitious goal: predicting future disease onset from sleep data. To identify which conditions could be forecast, they needed to pair the training polysomnography data with the long-term health outcomes of the same participants. Fortunately, they had access to more than half a century’s worth of health records from a sleep clinic.

The Stanford Sleep Medicine Center was founded in 1970 by the late William Dement, MD, PhD, widely considered the father of sleep medicine. The largest cohort of patients used to train SleepFM — some 35,000 patients ranging in age from 2 to 96 — had their polysomnography data recorded at the clinic between 1999 and 2024. The researchers paired these patients’ polysomnography data with their electronic health records, which provided up to 25 years of follow-up for some patients.

(The clinic’s polysomnography recordings go back even further, but only on paper, said Mignot, who directed the sleep center from 2010 to 2019.)

SleepFM analyzed more than 1,000 disease categories in the health records and found 130 that could be predicted with reasonable accuracy from a patient’s sleep data. The model’s predictions were particularly strong for cancers, pregnancy complications, circulatory conditions and mental disorders, for which it achieved a C-index higher than 0.8.

The C-index, or concordance index, is a common measure of a model’s predictive performance. Specifically, it measures the model’s ability to predict which of any two individuals in a group will experience an event first.

“For all possible pairs of individuals, the model gives a ranking of who’s more likely to experience an event — a heart attack, for instance — earlier. A C-index of 0.8 means that 80% of the time, the model’s prediction is concordant with what actually happened,” Zou said.
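
To make the pairwise idea concrete, here is a minimal Python sketch of a concordance calculation. It is an illustration only, not the study's evaluation code: the function name, the toy data and the treatment of ties are assumptions made for this example. It also reflects a standard detail of the metric, which is that a pair can be compared only when the person with the shorter follow-up actually had the event observed.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Fraction of comparable pairs that the model ranks in the right order.

    times:       follow-up time per person (until the event or last contact)
    events:      True if the event was observed, False if follow-up ended first
    risk_scores: model output; a higher score means the event is expected sooner
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `first` has the shorter follow-up time.
        first, second = (i, j) if times[i] <= times[j] else (j, i)
        # Simplifying assumption: skip pairs with identical times, and skip
        # pairs where the earlier person was never observed to have the event.
        if times[first] == times[second] or not events[first]:
            continue
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0   # model ranked the earlier event as higher risk
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5   # tied scores count as half, by convention
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: years of follow-up, whether the event occurred,
# and a made-up risk score for each of five people.
times = [2, 5, 7, 10, 12]
events = [True, True, False, True, False]
risk_scores = [0.9, 0.4, 0.6, 0.5, 0.1]

print(concordance_index(times, events, risk_scores))  # 0.75
```

In this toy cohort, eight pairs are comparable and the scores order six of them correctly, giving 0.75. A score of 0.5 corresponds to random ranking and 1.0 to perfect ranking.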

SleepFM excelled at predicting Parkinson’s disease (C-index 0.89), dementia (0.85), hypertensive heart disease (0.84), heart attack (0.81), prostate cancer (0.89), breast cancer (0.87) and death (0.84).

“We were pleasantly surprised that for a pretty diverse set of conditions, the model is able to make informative predictions,” Zou said.

Less accurate models, with C-indices around 0.7, such as those that predict a patient’s response to different cancer treatments, have proven useful in clinical settings, he added.

Interpreting the model

The team is working on ways to further improve SleepFM’s predictions, perhaps by adding data from wearables, and to understand exactly what the model is interpreting.

“It doesn’t explain that to us in English,” Zou said. “But we have developed different interpretation techniques to figure out what the model is looking at when it’s making a specific disease prediction.”

The researchers note that even though heart signals factor more prominently in heart disease predictions and brain signals factor more prominently in mental health predictions, it was the combination of all the data modalities that achieved the most accurate predictions.

“The most information we got for predicting disease was by contrasting the different channels,” Mignot said. Body constituents that were out of sync — a brain that looks asleep but a heart that looks awake, for example — seemed to spell trouble.

Mignot and Zou are members of the Wu Tsai Neurosciences Institute.

Rahul Thapa, a PhD student in biomedical data science, and Magnus Ruud Kjaer, a PhD student at Technical University of Denmark, are co-lead authors of the study. Thapa is a Knight-Hennessy scholar.

Researchers from the Technical University of Denmark, Copenhagen University Hospital – Rigshospitalet, BioSerenity, University of Copenhagen and Harvard Medical School contributed to the work.

The study received funding from the National Institutes of Health (grant R01HL161253), Knight-Hennessy Scholars and Chan-Zuckerberg Biohub.