{"id":216846,"date":"2025-10-11T13:40:19","date_gmt":"2025-10-11T13:40:19","guid":{"rendered":"https:\/\/www.newsbeep.com\/us\/216846\/"},"modified":"2025-10-11T13:40:19","modified_gmt":"2025-10-11T13:40:19","slug":"population-scale-gene-based-analysis-of-whole-genome-sequencing-provides-insights-into-metabolic-health","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/us\/216846\/","title":{"rendered":"Population-scale gene-based analysis of whole-genome sequencing provides insights into metabolic health"},"content":{"rendered":"<p>Ethics<\/p>\n<p>Our research complies with all relevant ethical regulations. All studies included in this research were approved by the relevant board or committee. UKBB has approval from the North West Multicentre Research Ethics Committee (REC reference 13\/NW\/0157) as a Research Tissue Bank (RTB) approval, and informed consent was provided by each participant. This approval means that researchers do not require separate ethical clearance and can operate under RTB approval. This RTB approval was granted initially in 2011 and is renewed every 5\u2009years; hence, UKBB successfully renewed approval in 2016 and 2021. All work in UKBB reported in this manuscript was performed under UKBB application numbers 20361 and 52293. The collection of participant information adhered to the AoU Research Program Operational Protocol (<a href=\"http:\/\/allofus.nih.gov\/article\/all-us-research-program-protocol\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/allofus.nih.gov\/article\/all-us-research-program-protocol<\/a>). The AoU Institutional Review Board (IRB) (<a href=\"https:\/\/allofus.nih.gov\/about\/who-we-are\/institutional-review-board-irb-of-all-of-us\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/allofus.nih.gov\/about\/who-we-are\/institutional-review-board-irb-of-all-of-us<\/a>) is charged with reviewing the protocol, informed consent and other participant-facing materials for the AoU Research Program. 
The IRB follows the regulations and guidance of the Office for Human Research Protections for all studies, ensuring that the rights and welfare of research participants are overseen and protected uniformly. For ALSPAC, ethical approval for the study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committees.<\/p>\n<p>UKBB WGS data processing<\/p>\n<p>The WGS of UKBB participants is described in detail in ref. <a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 7\" title=\"The UK Biobank Whole-Genome Sequencing Consortium Whole-genome sequencing of 490,640 UK Biobank participants. Nature 645, 692&#x2013;701 (2025).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR7\" id=\"ref-link-section-d185104123e2323\" rel=\"nofollow noopener\" target=\"_blank\">7<\/a>. In brief, 490,640 UKBB participants were sequenced to an average depth of 32.5\u00d7 using Illumina NovaSeq 6000 platform. Variants were jointly called using Graphtyper<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 52\" title=\"Eggertsson, H. P. et al. Graphtyper enables population-scale genotyping using pangenome graphs. Nat. Genet. 49, 1654&#x2013;1660 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR52\" id=\"ref-link-section-d185104123e2327\" rel=\"nofollow noopener\" target=\"_blank\">52<\/a>, which resulted in 1,037,556,156 and 101,188,713 high-quality (AAscore &lt;0.5 and &lt;5 duplicate inconsistencies) single-nucleotide polymorphisms (SNPs) and indels, respectively.<\/p>\n<p>We further processed the jointly called genotype data in Hail v.0.2<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 53\" title=\"Hail Team. Hail 0.2.108-fc03e9d5dc08. 
GitHub &#010;                  https:\/\/github.com\/hail-is\/hail\/releases\/tag\/0.2.108&#010;                  &#010;                 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR53\" id=\"ref-link-section-d185104123e2334\" rel=\"nofollow noopener\" target=\"_blank\">53<\/a>, where multi-allelic sites were first split and normalized. Variants were then filtered based on low allelic balance (ABHet &lt;0.175, ABHom &lt;0.9), low quality-by-depth (QD) normalized score (QD\u2009&lt;\u20096), low phred-scaled quality score (QUAL\u2009&lt;\u200910) and high missingness (call rate &lt;90%). For the analysis in the European-ancestry cohort (see below), we further removed variants that failed the test for Hardy\u2013Weinberg equilibrium (P\u2009&lt;\u20091\u2009\u00d7\u200910<sup>\u2212100<\/sup>) within this cohort.<\/p>\n<p>Variants were annotated using the Ensembl variant effect predictor (VEP)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 54\" title=\"McLaren, W. et al. The Ensembl variant effect predictor. Genome Biol. 17, 122 (2016).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR54\" id=\"ref-link-section-d185104123e2346\" rel=\"nofollow noopener\" target=\"_blank\">54<\/a> v.108.2 with the LOFTEE plugin<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 55\" title=\"Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434&#x2013;443 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR55\" id=\"ref-link-section-d185104123e2350\" rel=\"nofollow noopener\" target=\"_blank\">55<\/a>. 
Combined annotation-dependent depletion (CADD) annotations were based on precomputed CADD<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 56\" title=\"Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J. &amp; Kircher, M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 47, D886&#x2013;D894 (2019).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR56\" id=\"ref-link-section-d185104123e2354\" rel=\"nofollow noopener\" target=\"_blank\">56<\/a> v.1.7 annotations for all SNPs and gnomAD v.4 indels. REVEL<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 15\" title=\"Ioannidis, N. M. et al. REVEL: an ensemble method for predicting the pathogenicity of rare missense variants. Am. J. Hum. Genet. 99, 877&#x2013;885 (2016).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR15\" id=\"ref-link-section-d185104123e2358\" rel=\"nofollow noopener\" target=\"_blank\">15<\/a> annotations were obtained from the 3 May 2021 release of precomputed REVEL scores for all SNPs. We prioritized the individual consequence for each variant based on severity, which was defined by VEP. The PTV category is the combination of stop-gained, frameshift, splice acceptor and splice donor variants. The missense and synonymous variants were adopted directly from VEP. 
Only variants on the autosomes and chromosome X that fell within Ensembl protein-coding transcripts were included in our downstream analysis.<\/p>\n<p>Demographics of the study population are presented in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#MOESM4\" rel=\"nofollow noopener\" target=\"_blank\">16<\/a>.<\/p>\n<p>European ancestry definition in UKBB WGS<\/p>\n<p>We defined a European-ancestry cohort as people who most resembled the NFE (non-Finnish European) population as labeled in the gnomAD v.3.1 dataset<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 55\" title=\"Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434&#x2013;443 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR55\" id=\"ref-link-section-d185104123e2376\" rel=\"nofollow noopener\" target=\"_blank\">55<\/a>. This NFE group was one of nine ancestry groups labeled in gnomAD, which were based on Human Genome Diversity Project and 1000 Genomes Project samples. Variant loadings for 76,399 high-quality informative variants from gnomAD were used to project all UKBB WGS samples onto the first 16 principal components. A random forest classifier trained on the nine ancestry labels in gnomAD was then used to calculate probabilities that reflect the similarity between each UKBB participant and each of the gnomAD ancestry labels.<\/p>\n<p>Phenotype preparation in UKBB<\/p>\n<p>Binary outcomes were prepared using a combination of hospital episode statistics (UKBB showcase IDs: 41202, 41204, 41200, 41210), primary care records (UKBB showcase ID: 42040), death certificates (UKBB showcase IDs: 40001, 40002) and self-reported medical conditions (UKBB showcase ID: 20002). 
Qualifying codes pertaining to each condition are listed in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#MOESM4\" rel=\"nofollow noopener\" target=\"_blank\">10<\/a>. Any participant with a qualifying code was considered a case; those without a qualifying code were considered controls. For T2D, qualifying terms included codes specifying diagnoses of noninsulin-dependent diabetes, T2D, and insulin-treated T2D. Participants who self-reported a history of T2D were also classified as cases. For CKD, diagnostic codes included those specifying chronic renal failure, chronic renal impairment, CKD, end-stage renal failure, hypertensive renal disease with renal failure, or codes indicative of preparation for or receipt of renal replacement therapy. Participants who self-reported renal\/kidney failure, dialysis or procedures to prepare for peritoneal dialysis or hemodialysis were also classified as cases. For T2D and CKD phenotype definitions, all participants who did not meet the qualifying terms were classified as controls (see Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#MOESM4\" rel=\"nofollow noopener\" target=\"_blank\">10<\/a> for a full list of qualifying codes). Thinness was defined as having a BMI in the lowest 5% of values. Metabolic dysfunction-associated steatotic liver disease required the presence of steatosis and a qualifying metabolic risk factor, namely obesity, T2D or other metabolic dysregulation<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 57\" title=\"Tacke, F. et al. EASL-EASD-EASO clinical practice guidelines on the management of metabolic dysfunction-associated steatotic liver disease (MASLD). J. Hepatol. 
81, 492&#x2013;542 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR57\" id=\"ref-link-section-d185104123e2394\" rel=\"nofollow noopener\" target=\"_blank\">57<\/a>. Steatosis was defined using the fatty liver index, a composite measurement of triglycerides, glutamyl-transferase, waist circumference and BMI that ranges from 0 to 100 (ref. <a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 58\" title=\"Bedogni, G. et al. The fatty liver index: a simple and accurate predictor of hepatic steatosis in the general population. BMC Gastroenterol. 6, 33 (2006).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR58\" id=\"ref-link-section-d185104123e2398\" rel=\"nofollow noopener\" target=\"_blank\">58<\/a>). Specific fatty liver index cut-offs according to participant sex, BMI and waist circumference as described in ref. <a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 59\" title=\"Wu, J., Li, H., Xu, Z., Ran, L. &amp; Kong, L.-Q. Population-specific cut-off points of fatty liver index for the diagnosis of hepatic steatosis. J. Hepatol. 75, 726&#x2013;728 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR59\" id=\"ref-link-section-d185104123e2402\" rel=\"nofollow noopener\" target=\"_blank\">59<\/a> were applied to define presence of steatosis.<\/p>\n<p>Genome-wide gene-burden testing in the UKBB<\/p>\n<p>BOLT-LMM<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 60\" title=\"Loh, P. R., Kichaev, G., Gazal, S., Schoech, A. P. &amp; Price, A. L. Mixed-model association for biobank-scale datasets. Nat. Genet. 
50, 906&#x2013;908 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR60\" id=\"ref-link-section-d185104123e2415\" rel=\"nofollow noopener\" target=\"_blank\">60<\/a> v.2.4.1 was used as our primary analytical software to conduct gene-burden tests.<\/p>\n<p>To run BOLT-LMM, we first derived a set of genotypes consisting of common (MAF &gt;0.01) linkage disequilibrium (LD)-pruned (LD r<sup>2<\/sup>\u2009&lt;\u20090.1) variants in people with WGS data to build the null model. Pruning was conducted using PLINK2<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 61\" title=\"Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR61\" id=\"ref-link-section-d185104123e2426\" rel=\"nofollow noopener\" target=\"_blank\">61<\/a> on a random subset of 50,000 individuals (options in effect: --maf 0.01 --thin-indiv-count 50000 --indep-pairwise 1000kb 0.1).<\/p>\n<p>We adopted the same strategy as in our previous analyses of WES data<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 9\" title=\"Zhao, Y. et al. Protein-truncating variants in BSN are associated with severe adult-onset obesity, type 2 diabetes and fatty liver disease. Nat. Genet. 56, 579&#x2013;584 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR9\" id=\"ref-link-section-d185104123e2433\" rel=\"nofollow noopener\" target=\"_blank\">9<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 11\" title=\"Gardner, E. J. et al. Damaging missense variants in IGF1R implicate a role for IGF-1 resistance in the etiology of type 2 diabetes. Cell Genom. 
2, 100208 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR11\" id=\"ref-link-section-d185104123e2436\" rel=\"nofollow noopener\" target=\"_blank\">11<\/a>. We generated dummy genotype files in which each gene-mask combination was represented by a single variant, as required for the genotype input to BOLT-LMM. We coded people with a qualifying variant within a gene as heterozygous, regardless of the total number of variants they carried in that gene. We created dummy genotypes for MAF &lt;0.1% high-confidence PTVs as defined by LOFTEE, missense variants with REVEL &gt;0.5 and missense variants with REVEL &gt;0.7. With all required inputs prepared, BOLT-LMM was used to analyze BMI and T2D using default parameters except for the inclusion of the \u2018lmmInfOnly\u2019 flag. The covariates included in our analysis were age, age<sup>2<\/sup>, sex, age\u2009\u00d7\u2009sex, the first 20 principal components as calculated from all WGS samples and the WGS-released batch (Vanguard project, Sanger: 49,932, Sanger: 193,075, deCode: 247,504). Unlike in our previous studies, we included all samples without restricting their ancestries to maximize the sample size. Only people who withdrew consent or had missing phenotypes or covariates were excluded; filtering resulted in 481,137 and 489,941 samples remaining for BMI and T2D, respectively.<\/p>\n<p>To identify single variants driving a given association within a single gene, we performed a leave-one-out analysis for all identified genes using a generalized linear model (GLM) in R v.4.0.2 by dropping the variants contained in the gene-mask combination one at a time.<\/p>\n<p>To test whether our significant burden test results were independent of common-variant GWAS associations, we generated polygenic risk scores for each trait and included these as covariates in our linear mixed model. 
Independent genome-wide significant (P\u2009&lt;\u20095\u2009\u00d7\u200910<sup>\u22128<\/sup>) variants from existing single-variant GWAS summary statistics for each trait were first identified using GCTA-COJO<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 62\" title=\"Yang, J. et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 44, 369&#x2013;375 (2012).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR62\" id=\"ref-link-section-d185104123e2454\" rel=\"nofollow noopener\" target=\"_blank\">62<\/a>. Polygenic risk scores in each UKBB participant were then calculated as the weighted sum of the person\u2019s genotypes across the significant single variants, where weights were derived from the variants\u2019 beta coefficients in the corresponding GWAS. This score was then included as an additional covariate in the burden analysis as implemented in BOLT-LMM described above. As BOLT-LMM uses a linear mixed model, we estimated and reported the OR using the generalized linear model in R v.4.0.2 for all T2D-associated genes.<\/p>\n<p>In an additional analysis designed to exclude the possibility that our new, replicated rare variant associations were the result of confounding by LD with common variants, we interrogated marker-level results from WGS analyses of BMI and T2D. Regional common variants that could conceivably be driving the rare variant associations (MAF &gt;0.001, P\u2009&lt;\u20096.15\u2009\u00d7\u200910<sup>\u22127<\/sup>, \u00b1\u2009500\u2009kb from index gene) were extracted and clumped (r<sup>2<\/sup>\u2009&lt;\u20090.001) to identify approximately independent variants, which were then included as covariates in a generalized linear model with the cognate gene-burden mask as the predictor variable of interest. 
As in our discovery analysis, age, age2, sex, age\u2009\u00d7\u2009sex and the first 20 principal components as calculated from all WGS samples, and the WGS-released batch were included as covariates.<\/p>\n<p>Replication in AoU study<\/p>\n<p>Participants analyzed in this study were selected from the AoU Research Program cohort<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 36\" title=\"All of Us Research Program Genomics Investigators Genomic data in the All of Us research program. Nature 627, 340&#x2013;346 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR36\" id=\"ref-link-section-d185104123e2480\" rel=\"nofollow noopener\" target=\"_blank\">36<\/a>. The collection of participant information adhered to the AoU Research Program Operational Protocol (<a href=\"http:\/\/allofus.nih.gov\/article\/all-us-research-program-protocol\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/allofus.nih.gov\/article\/all-us-research-program-protocol<\/a>). Detailed methodologies regarding genotyping, ancestry classification, quality control measures and the methodology for excluding related participants are thoroughly documented in the AoU Research Program Genomic Research Data Quality Report (<a href=\"https:\/\/support.researchallofus.org\/hc\/en-us\/articles\/4617899955092-All-of-Us-Genomic-Quality-Report\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/support.researchallofus.org\/hc\/en-us\/articles\/4617899955092-All-of-Us-Genomic-Quality-Report<\/a>).<\/p>\n<p>We conducted our analysis on short-read WGS data (v.7.1) subsetted to the protein-coding exome, focusing on two phenotypes: BMI and T2D. 
The analysis encompassed 219,015 unrelated people, including 112,526 of European ancestry, 46,414 of African\/African American ancestry, 34,865 of American Admixed\/Latino ancestry and 25,210 of various other ancestries (see Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#MOESM4\" rel=\"nofollow noopener\" target=\"_blank\">3<\/a> for detailed sample size information). Ancestry assignment was conducted centrally by AoU. Briefly, a random forest classifier was trained on data from the Human Genome Diversity Project and 1000 Genomes Project. This classifier was then applied to the AoU data. Further information is available from the AoU (<a href=\"https:\/\/support.researchallofus.org\/hc\/en-us\/articles\/4617899955092-All-of-Us-Genomic-Quality-Report-ARCHIVED-C2022Q4R9-CDR-v7\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/support.researchallofus.org\/hc\/en-us\/articles\/4617899955092-All-of-Us-Genomic-Quality-Report-ARCHIVED-C2022Q4R9-CDR-v7<\/a>).<\/p>\n<p>BMI data were derived from the \u2018body mass index (BMI) [Ratio]\u2019 metric (Concept Id 3038553) within the \u2018Labs and Measurements\u2019 domain. BMI values &lt;10 or &gt;100 were excluded, and the earliest remaining recorded value and the corresponding age were used. The \u2018Type 2 diabetes mellitus\u2019 identifier (Concept Id 201826, <a href=\"https:\/\/databrowser.researchallofus.org\/ehr\/conditions\/201826\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/databrowser.researchallofus.org\/ehr\/conditions\/201826<\/a>) in the \u2018Conditions\u2019 domain was used to identify T2D cases, and the age corresponding to the earliest diagnosis of T2D was used. The participants\u2019 ages were calculated by subtracting the birth year from the timestamp of the earliest record. 
Among these people, 32,462 were identified as T2D cases, and 186,553 served as controls. Only people aged over 18\u2009years were included in the analyses. Only a small proportion of episodes that indicated a diagnosis of T2D had a contemporaneous BMI measurement. As such, to adjust T2D for BMI, we used two approaches: the median BMI value recorded was included as a covariate in the model or the BMI record closest to T2D was used. Population demographics by ancestry are described in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#MOESM4\" rel=\"nofollow noopener\" target=\"_blank\">17<\/a>.<\/p>\n<p>Gene-based burden tests were applied to variants with MAF &lt;0.001 that met prespecified bioinformatic criteria and were in selected genes (for example, those significant in UKBB discovery). Note that, due to different population composition, variant MAF will differ between AoU and UKBB. Burden tests were conducted using STAAR (variant-set test for association using annotation information)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 63\" title=\"Li, X. et al. Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. Nat. Genet. 52, 969&#x2013;983 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR63\" id=\"ref-link-section-d185104123e2527\" rel=\"nofollow noopener\" target=\"_blank\">63<\/a> implemented in STAARpipeline<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 64\" title=\"Li, Z. et al. A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies. Nat. 
Methods 19, 1599&#x2013;1611 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR64\" id=\"ref-link-section-d185104123e2531\" rel=\"nofollow noopener\" target=\"_blank\">64<\/a> (R package v.0.9.7), with covariate adjustment for age, age<sup>2<\/sup>, sex, age\u2009\u00d7\u2009sex, and the first 16 principal components. The criteria for gene-burden masks followed the methodology of the main UKBB analyses.<\/p>\n<p>Power calculations<\/p>\n<p>To estimate statistical power for replication in the AoU study, we first corrected effect estimates in the discovery analysis for winner\u2019s curse using the bootstrap method<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 65\" title=\"Forde, A., Hemani, G. &amp; Ferguson, J. Review and further developments in statistical corrections for Winner&#x2019;s Curse in genetic association studies. PLoS Genet. 19, e1010546 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR65\" id=\"ref-link-section-d185104123e2545\" rel=\"nofollow noopener\" target=\"_blank\">65<\/a> implemented in the winnerscurse package in R (<a href=\"https:\/\/amandaforde.github.io\/winnerscurse\/\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/amandaforde.github.io\/winnerscurse\/<\/a>). For T2D, the resulting effect estimates (betas from a linear mixed model) were transformed to odds ratios<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 66\" title=\"Lloyd-Jones, L. R., Robinson, M. R., Yang, J. &amp; Visscher, P. M. Transformation of summary statistics from linear mixed model association on all-or-none traits to odds ratio. 
Genetics 208, 1397&#x2013;1408 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR66\" id=\"ref-link-section-d185104123e2556\" rel=\"nofollow noopener\" target=\"_blank\">66<\/a> (<a href=\"https:\/\/shiny.cnsgenomics.com\/LMOR\/\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/shiny.cnsgenomics.com\/LMOR\/<\/a>). Power calculations using the relevant winners\u2019 curse corrected effect estimates were then conducted in genpwr (<a href=\"https:\/\/cran.r-project.org\/web\/packages\/genpwr\/vignettes\/vignette.html\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/cran.r-project.org\/web\/packages\/genpwr\/vignettes\/vignette.html<\/a>).<\/p>\n<p>UKBB WES processing<\/p>\n<p>To quantify the gain from WGS versus WES in UKBB, we compared variant counts between our WGS data with those from the 450,000 original quality functional equivalence release of the UKBB WES data (454,756 participants total). We processed multisample pVCFs using Hail<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 53\" title=\"Hail Team. Hail 0.2.108-fc03e9d5dc08. GitHub &#010;                  https:\/\/github.com\/hail-is\/hail\/releases\/tag\/0.2.108&#010;                  &#010;                 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR53\" id=\"ref-link-section-d185104123e2582\" rel=\"nofollow noopener\" target=\"_blank\">53<\/a> v.0.2, where multi-allelic sites were first split and normalized. Sites were then excluded if they failed the following quality metrics: for SNPs, ABHet &lt;0.175, QD &lt;2, QUAL &lt;30, SOR &gt;30, FS &gt;60, MQ &lt;40, MQRankSum &lt;\u221212.5 and ReadPosRankSum &lt;\u22128; for indels: ABHet &lt;0.175, QD &lt;2, QUAL &lt;30, FS &gt;200 and ReadPosRankSum &lt;\u221220, resulting in 23,273,514 variants available for analysis. 
People with high heterozygosity rates, WES genotypes discordant with array genotypes, or discordance between reported and genetic sex were removed, leaving 453,931 participants. Variants were annotated using the same VEP, LOFTEE, CADD and REVEL annotations as described for WGS.<\/p>\n<p>PheWAS of identified BMI-associated and T2D-associated genes in UKBB<\/p>\n<p>We tested for association between carrier status for each identified gene and a list of representative phenotypes available in the UKBB (full list can be found in Supplementary Tables <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#MOESM4\" rel=\"nofollow noopener\" target=\"_blank\">10<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#MOESM4\" rel=\"nofollow noopener\" target=\"_blank\">11<\/a>) using R v.4.0.2, including the same covariates used in our genome-wide gene-burden tests. We also extracted the phenotypic associations with P\u2009&lt;\u20090.05 for all genes we identified in our analysis from the AstraZeneca PheWAS Portal<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 67\" title=\"Wang, Q. et al. Rare variant contribution to human disease in 281,104 UK Biobank exomes. 
Nature 597, 527&#x2013;532 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR67\" id=\"ref-link-section-d185104123e2603\" rel=\"nofollow noopener\" target=\"_blank\">67<\/a> (version: UKBB 470\u2009K WES v.5; Supplementary Tables <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#MOESM4\" rel=\"nofollow noopener\" target=\"_blank\">18<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#MOESM4\" rel=\"nofollow noopener\" target=\"_blank\">19<\/a>).<\/p>\n<p>BMI and T2D GWAS lookup<\/p>\n<p>Identified genes were queried for proximal BMI and T2D GWAS signals, using data from the largest published GWAS meta-analyses. For BMI, we used data from the GIANT consortium<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 68\" title=\"Yengo, L. et al. Meta-analysis of genome-wide association studies for height and body mass index in &#x223C;700000 individuals of European ancestry. Hum. Mol. Genet. 27, 3641&#x2013;3649 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR68\" id=\"ref-link-section-d185104123e2622\" rel=\"nofollow noopener\" target=\"_blank\">68<\/a>, which includes data on up to 806,834 individuals. For T2D, we used data from the DIAGRAM consortium<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 69\" title=\"Suzuki, K. et al. Genetic drivers of heterogeneity in type 2 diabetes pathophysiology. 
Nature 627, 347&#x2013;357 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR69\" id=\"ref-link-section-d185104123e2626\" rel=\"nofollow noopener\" target=\"_blank\">69<\/a>, which included up to 428,452 T2D cases and 2,107,149 controls.<\/p>\n<p>For each of these GWAS, we performed signal selection and prioritized causal GWAS genes using the \u2018GWAS to Genes\u2019 pipeline as described elsewhere<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\" title=\"Kentistou, K. A. et al. Understanding the genetic complexity of puberty timing across the allele frequency spectrum. Nat. Genet. 56, 1397&#x2013;1411 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR18\" id=\"ref-link-section-d185104123e2633\" rel=\"nofollow noopener\" target=\"_blank\">18<\/a>. The genes identified previously were annotated if their start or end sites were within 500\u2009kb up- or downstream of GWAS signals in the two meta-analyses, using the National Center for Biotechnology Information RefSeq gene map for GRCh37, and overlayed with further supporting functional dataset information. For further details about the specific application of this method, see ref. <a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\" title=\"Kentistou, K. A. et al. Understanding the genetic complexity of puberty timing across the allele frequency spectrum. Nat. Genet. 
56, 1397&#x2013;1411 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR18\" id=\"ref-link-section-d185104123e2637\" rel=\"nofollow noopener\" target=\"_blank\">18<\/a>.<\/p>\n<p>Assessment for severe insulin resistance in carriers of IRS2 PTVs in a UK birth cohort<\/p>\n<p>ALSPAC is a prospective birth cohort from the southwest of England that recruited &gt;75% of all pregnancies delivered in the Greater Bristol area between 1990 and 1992 (refs. <a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Northstone, K. et al. The Avon Longitudinal Study of Parents and Children (ALSPAC): an update on the enrolled sample of index children in 2019. Wellcome Open Res. 4, 51 (2019).\" href=\"#ref-CR70\" id=\"ref-link-section-d185104123e2654\">70<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Fraser, A. et al. Cohort Profile: the Avon Longitudinal Study of Parents and Children: ALSPAC mothers cohort. Int. J. Epidemiol. 42, 97&#x2013;110 (2013).\" href=\"#ref-CR71\" id=\"ref-link-section-d185104123e2654_1\">71<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Lawlor, D. A. et al. The second generation of the Avon Longitudinal Study of Parents and Children (ALSPAC-G2): a cohort profile. Wellcome Open Res. 4, 36 (2019).\" href=\"#ref-CR72\" id=\"ref-link-section-d185104123e2654_2\">72<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 73\" title=\"Boyd, A. et al. Cohort Profile: the &#x2018;children of the 90s&#x2019;&#x2014;the index offspring of the Avon Longitudinal Study of Parents and Children. Int. J. Epidemiol. 
42, 111&#x2013;127 (2013).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR73\" id=\"ref-link-section-d185104123e2657\" rel=\"nofollow noopener\" target=\"_blank\">73<\/a>). The study has currently enrolled 14,833 unique women (G0 mothers), 3,807 G0 partners and 14,901 children. Full details of the cohort and study design are available at <a href=\"http:\/\/www.alspac.bris.ac.uk\/\" rel=\"nofollow noopener\" target=\"_blank\">http:\/\/www.alspac.bris.ac.uk<\/a>. Please note that the study website contains details of all the data that is available through a fully searchable data dictionary and variable search tool (<a href=\"http:\/\/www.bristol.ac.uk\/alspac\/researchers\/our-data\/\" rel=\"nofollow noopener\" target=\"_blank\">http:\/\/www.bristol.ac.uk\/alspac\/researchers\/our-data\/<\/a>). Exome sequencing data from 8,605 children and 3,389 of their parents was interrogated for carriers of any high-confidence protein truncating variants in IRS2 as defined by LOFTEE with MAF &lt;1%. Two such carriers were identified (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#MOESM4\" rel=\"nofollow noopener\" target=\"_blank\">14<\/a>), and insulin and glucose measurements extracted from available data and insulin levels were compared to population specific reference ranges (2.5th and 97.5th centile) and commonly used clinical cut-offs for severe insulin resistance (&lt;150\u2009pmol\u2009l\u22121). Insulin was measured using either an ELISA (Mercodia) or an ECLIA (Roche)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 74\" title=\"Lawlor, D. A. et al. Association between general and central adiposity in childhood, and change in these, with cardiovascular risk factors in adolescence: prospective cohort study. 
BMJ 341, 1205 (2010).\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#ref-CR74\" id=\"ref-link-section-d185104123e2684\" rel=\"nofollow noopener\" target=\"_blank\">74<\/a>. Details of exome sequencing, quality control, variant calling and annotation have been described in <a href=\"https:\/\/wellcomeopenresearch.org\/articles\/9-390\/v1\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/wellcomeopenresearch.org\/articles\/9-390\/v1<\/a>.<\/p>\n<p>Lookup of effects of IRS2 PTVs on CKD-related traits in the AoU cohort<\/p>\n<p>To provide supporting evidence of an effect of loss-of-function variants in IRS2 on CKD, we leveraged the results of a recent PheWAS conducted in the AoU cohort, publicly accessible using the \u2018All by All web browser\u2019 (<a href=\"https:\/\/allbyall.researchallofus.org\/\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/allbyall.researchallofus.org\/<\/a>). The relevant gene page is available directly at <a href=\"https:\/\/allbyall.researchallofus.org\/app?state=%7B%22regionId%22%3Anull%2C%22geneId%22%3A%22ENSG00000185950%22%2C%22resultIndex%22%3A%22gene-phewas%22%2C%22resultLayout%22%3A%22full%22%2C%22analysisId%22%3A%223027114%22%2C%22variantId%22%3Anull%2C%22burdenSet%22%3A%22pLoF%22%2C%22ancestryGroup%22%3A%22meta%22%2C%22phewasOpts%22%3Atrue%2C%22selectedContig%22%3A%22all%22%2C%22hideGeneOpts%22%3Afalse%7D\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/allbyall.researchallofus.org\/app?state=%7B%22regionId%22%3Anull%2C%22geneId%22%3A%22ENSG00000185950%22%2C%22resultIndex%22%3A%22gene-phewas%22%2C%22resultLayout%22%3A%22full%22%2C%22analysisId%22%3A%223027114%22%2C%22variantId%22%3Anull%2C%22burdenSet%22%3A%22pLoF%22%2C%22ancestryGroup%22%3A%22meta%22%2C%22phewasOpts%22%3Atrue%2C%22selectedContig%22%3A%22all%22%2C%22hideGeneOpts%22%3Afalse%7D<\/a>.<\/p>\n<p>Reporting summary<\/p>\n<p>Further information on research design is available in the <a data-track=\"click\" data-track-label=\"link\" 
data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41588-025-02364-2#MOESM2\" rel=\"nofollow noopener\" target=\"_blank\">Nature Portfolio Reporting Summary<\/a> linked to this article.<\/p>\n","protected":false},"excerpt":{"rendered":"Ethics Our research complies with all relevant ethical regulations. All studies included in this research were approved by&hellip;\n","protected":false},"author":2,"featured_media":216847,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[50],"tags":[2342,13114,258,8869,13113,257,200,3869,34411,3870,986,79,3629],"class_list":{"0":"post-216846","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-genetics","8":"tag-agriculture","9":"tag-animal-genetics-and-genomics","10":"tag-biomedicine","11":"tag-cancer-research","12":"tag-gene-function","13":"tag-general","14":"tag-genetics","15":"tag-genetics-research","16":"tag-genome-wide-association-studies","17":"tag-human-genetics","18":"tag-obesity","19":"tag-science","20":"tag-type-2-diabetes"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/216846","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/comments?post=216846"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/216846\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media\/216847"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media?parent=216846"}],"wp:term":[{"taxonom
y":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/categories?post=216846"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/tags?post=216846"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}