{"id":482805,"date":"2026-02-21T23:32:29","date_gmt":"2026-02-21T23:32:29","guid":{"rendered":"https:\/\/www.newsbeep.com\/us\/482805\/"},"modified":"2026-02-21T23:32:29","modified_gmt":"2026-02-21T23:32:29","slug":"large-scale-environmental-dna-survey-reveals-niche-axes-of-a-regional-coastal-fish-community","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/us\/482805\/","title":{"rendered":"Large-scale environmental DNA survey reveals niche axes of a regional coastal fish community"},"content":{"rendered":"<p>The workspace and equipment in the pre-PCR area were sterilised with hypochlorite solution before library preparation. Low-retention filtered pipette tips and microtubes were used and pre- and post-PCR manipulations were performed in two different dedicated rooms to safeguard against carryover contamination. Two-step PCR (first-round PCR and index PCR) was applied for paired-end next generation sequencer (NGS) library preparation, following previously developed methods<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 43\" title=\"Miya, M. et al. MiFish, a set of universal PCR primers for metabarcoding environmental DNA from fishes: detection of more than 230 subtropical marine species. R Soc. Open. Sci. 2, 150088 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-31307-4#ref-CR43\" id=\"ref-link-section-d172878278e2145\" rel=\"nofollow noopener\" target=\"_blank\">43<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 44\" title=\"Minamoto, T. et al. An illustrated manual for environmental DNA research: water sampling guidelines and experimental protocols. Environ. DNA. 3, 8&#x2013;13 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-31307-4#ref-CR44\" id=\"ref-link-section-d172878278e2148\" rel=\"nofollow noopener\" target=\"_blank\">44<\/a>. For the first-round PCR (1st PCR), a mixture of two primer pairs was used: MiFish-U-forward (5\u2032\u2013ACA CTC TTT CCC TAC ACG ACG CTC TTC CGA TCT NNN NNN GTC GGT AAA ACT CGT GCC AGC\u20133\u2032), MiFish-U-reverse (5\u2032\u2013GTG ACT GGA GTT CAG ACG TGT GCT CTT CCG ATC TNN NNN NCA TAG TGG GGT ATC TAA TCC CAG TTT G\u20133\u2032), MiFish-E-forward-v2 (5\u2032\u2013ACA CTC TTT CCC TAC ACG ACG CTC TTC CGA TCT NNN NNN RGT TGG TAA ATC TCG TGC CAG C\u20133\u2032) and MiFish-E-reverse-v2 (5\u2032\u2013GTG ACT GGA GTT CAG ACG TGT GCT CTT CCG ATC TNN NNN NGC ATA GTG GGG TAT CTA ATC CTA GTT TG\u20133\u2032). The 1st PCR was carried out with a 12-\u00b5l reaction volume in eight replications using a strip of eight 0.2-ml tubes. A 1st PCR blank (1B) was also prepared during this process. To reduce the experiment costs, a single 12-\u00b5l reaction was prepared for each type of blank (FB, EB, and 1B). After the 1st PCR, equal volumes of the PCR products were pooled from the eight replications in a single tube and purified using a GeneRead Size Selection Kit (Qiagen, Hilden, Germany). Subsequently, the purified target products (ca. 300 bp) were quantified using TapeStation 2200 (Agilent Technologies, Tokyo, Japan). After diluting samples to 0.1 ng\/\u00b5l with Milli-Q water, 1.5 \u00b5l of each diluted product was subjected to the index PCR (2nd PCR) in a 12-\u00b5l reaction volume to append dual-indexed sequences and flow cell-binding sites for NGS platform. For the three types of blanks (FB, EB, and 1B), the 1st PCR products were purified in the same manner but were not quantified. They were diluted at the average dilution ratio of the positive samples to be used as templates for the 2nd PCR. The following two primers were used for the 2nd PCR to append dual-indexed sequences (eight nucleotides indicated by Xs) and flow cell-binding sites for the MiSeq platform (5\u2032\u2013ends of the sequences before eight Xs): 2nd-PCR-forward (5\u2013AAT GAT ACG GCG ACC ACC GAG ATC TAC ACX XXX XXX XAC ACT CTT TCC CTA CAC GAC GCT CTT CCG ATC T\u20133\u2032) and 2nd-PCR-reverse (5\u2032\u2013CAA GCA GAA GAC GGC ATA CGA GAT XXX XXX XXG TGA CTG GAG TTC AGA CGT GTG CTC TTC CGA TCT\u20133\u2032). A 2nd PCR blank (2B) was made during this process.<\/p>\n<p>To monitor contamination during on-site filtration, DNA extraction, and 1st and 2nd PCRs, 321 blanks were made (FB\u2009=\u2009191, EB\u2009=\u200952, 1B\u2009=\u200965, 2B\u2009=\u200913) and subjected to the library preparation procedure. They were divided into four sets and the libraries were pooled in equal volumes into four 1.5-ml tubes. The target amplicons (ca. 370\u00a0bp) were purified using a 2% E-Gel Size Select agarose gel (Invitrogen, Carlsbad, CA). The concentration of the size-selected libraries was measured using a Qubit dsDNA HS Assay Kit and a Qubit fluorometer (Life Technologies, Carlsbad, CA), diluted to 12.0 pM with HT1 buffer (Illumina), and sequenced on the MiSeq platform (Illumina, San Diego, CA) using a MiSeq v2 Reagent Kit, 300 cycles (Illumina) with a PhiX Control v3 (Illumina, San Diego, CA) spike-in (expected at 5%), following the manufacturer\u2019s protocol. To remove residual contamination from the MiSeq flow path, the flow channel was washed with hypochlorous acid before each operation. All raw DNA sequence data and associated information were deposited in DDBJ\/EMBL\/GenBank (accession number DRA007474).<\/p>\n<p>Data preprocessing and taxonomic assignment<\/p>\n<p>Data preprocessing and analyses were performed using MiSeq raw reads from the 528 samples. Using USEARCH ver. 10.0.240<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 45\" title=\"Edgar, R. C. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26, 2460&#x2013;2461 (2010).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-31307-4#ref-CR45\" id=\"ref-link-section-d172878278e2162\" rel=\"nofollow noopener\" target=\"_blank\">45<\/a>, quality-filtered forward and reverse reads were merged, primer sequences were removed, low-quality reads were filtered, and reads were dereplicated and denoised to obtain amplicon sequence variants (ASVs). Finally, fish species (molecular operational taxonomic units; MOTUs) were assigned to ASVs with a sequence identity of &gt;\u200980% with the reference sequences. A custom reference database (see details in Supplementary Information S1) was used. Four expert ichthyologists (M.M., H.Motomura, T.Sado, and H.S.) revised the taxonomic assignments based on phylogenetic relationships combined with their knowledge of species distributions and dominance. The details of data processing, database assembly, and taxonomic assignment are described in Supplementary Information S1.<\/p>\n<p>To focus on coastal fishes, all fishes principally inhabiting regions outside of the coastal areas (deep-sea fishes, oceanic epipelagic fishes, pure freshwater fishes, and all non-native fishes originating from food materials and carryover contamination) were excluded (Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-31307-4#MOESM2\" rel=\"nofollow noopener\" target=\"_blank\">S2<\/a>). Read counts were subsequently rarefied to the approximate minimum numbers (Supplementary Table S3).<\/p>\n<p>Collecting environmental covariates<\/p>\n<p>To reduce the burden of fieldwork, only seawater temperature and salinity were measured at each site after water sampling and on-site filtration. Additionally, four environmental covariates were collected from publicly available satellite data<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 46\" title=\"NASA Ocean Biology Processing Group. MODIS-Aqua Level-3 Mapped Ocean Color Data, version R2022.0, NASA Ocean Biology Distributed Active Archive Center (2018). Accessed 10 July  2025.\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-31307-4#ref-CR46\" id=\"ref-link-section-d172878278e2180\" rel=\"nofollow noopener\" target=\"_blank\">46<\/a>: annual minimum and maximum sea surface temperatures in 2017, mean chlorophyll a concentration in June-August 2017 and mean particulate inorganic carbon concentration in June-August 2017. The resolution of all collected satellite data is 4 km \u00d7 4 km. The field-measured seawater temperature and salinity were missing at ten sites due to instrument failure. Therefore, data from these sites were removed from the subsequent analysis. Furthermore, the following MOTU data were removed: (1) MOTUs categorised as species complex (i.e. including several species) and (2) MOTUs occurring at fewer than six sites. Consequently, the subsequent analyses included 519 species from 518 sampling sites.<\/p>\n<p>Nonlinear multivariate analysis<\/p>\n<p>To better understand the regional community structure, sampling sites were geographically grouped into eight districts based on commonly used climate classifications and geographic proximity: Hokkaido Island (HKD, n = 71), East Main Island along the Pacific Ocean (EMP, n = 39), East Main Island along the Japan Sea (EMJ, n = 39), West Main Island along the Pacific Ocean (WMP, n = 95), West Main Island along the Japan Sea (WMJ, n = 90), West Main Island along inland sea (WMI, n = 63), Izu-Ogasawara Island (IOI, n = 51) and Satsuma-Ryukyu Island (SRI, n = 80). A nonlinear dimensionality reduction method, t-distributed stochastic neighbor embedding (t-SNE)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 47\" title=\"van der Maaten, L. J. P. &amp; Hinton, G. E. Visualizing high-dimensional data using t-SNE. J. Mach. Learn. Res. 9, 2579&#x2013;2605 (2008).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-31307-4#ref-CR47\" id=\"ref-link-section-d172878278e2218\" rel=\"nofollow noopener\" target=\"_blank\">47<\/a>, was used to capture the structures of the entire regional fish community based on Jaccard distances between 518 sites.<\/p>\n<p>Generalised linear latent variable model<\/p>\n<p>Joint species distribution models are used to analyse co-occurrence patterns while accounting for missing predictors. In this study, a generalised linear latent variable model (GLLVM)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Harpole, W. S. &amp; Tilman, D. Grassland species loss resulting from reduced niche dimension. Nature 446, 791&#x2013;793 (2007).\" href=\"#ref-CR7\" id=\"ref-link-section-d172878278e2230\">7<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" title=\"Harpole, W. S. et al. Addition of multiple limiting resources reduces grassland diversity. Nature 537, 93&#x2013;96 (2016).\" href=\"#ref-CR8\" id=\"ref-link-section-d172878278e2230_1\">8<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 9\" title=\"Wiens, J. J. &amp; Graham, C. H. Niche conservatism: integrating evolution, ecology, and conservation biology. Annu. Rev. Ecol. Evol. Sys. 36, 519&#x2013;539 (2005).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-31307-4#ref-CR9\" id=\"ref-link-section-d172878278e2233\" rel=\"nofollow noopener\" target=\"_blank\">9<\/a> with spatially autocorrelated latent variables (LVs) was used as a joint species distribution model to explore the hidden niche space. The GLLVM decomposes high-dimensional residual correlations using low-dimensional LVs, thus allowing fitting to relatively large datasets. The model structure was a logistic regression with environmental covariates (\\(\\:{\\varvec{x}}_{\\varvec{i}}\\)) and spatially autocorrelated LVs (\\(\\:{\\varvec{u}}_{\\varvec{i}}\\)) as explanatory variables and species presence-absence (\\(\\:{y}_{ij}\\)) as response variables:<\/p>\n<p>$$\\:{y}_{ij}\\:\\sim\\:\\text{B}\\text{e}\\text{r}\\text{n}\\text{o}\\text{u}\\text{l}\\text{l}\\text{i}\\left({p}_{ij}\\right),\\:\\text{l}\\text{o}\\text{g}\\text{i}\\text{t}\\left({p}_{ij}\\right)={\\beta\\:}_{j0}+{\\varvec{x}}_{i}^{\\top\\:}\\:{\\varvec{\\beta\\:}}_{j}+{\\varvec{u}}_{i}^{\\top\\:}\\:{\\varvec{\\gamma\\:}}_{j}+{\\alpha\\:}_{i},$$<\/p>\n<p>for fish species \\(\\:j\\) = 1, \u2026, 519 at sampling site \\(\\:i\\) = 1, \u2026, 518. \\(\\:{\\varvec{\\beta\\:}}_{j}\\) and \\(\\:{\\varvec{\\gamma\\:}}_{j}\\) are species-specific responses related to the environmental covariates and LVs, respectively. \\(\\:{\\beta\\:}_{j0}\\) is the species-specific intercept and \\(\\:{\\alpha\\:}_{i}\\) is random site effects. The random site effects can mitigate the bias owing to the variation in total eDNA concentrations among sites<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 9\" title=\"Wiens, J. J. &amp; Graham, C. H. Niche conservatism: integrating evolution, ecology, and conservation biology. Annu. Rev. Ecol. Evol. Sys. 36, 519&#x2013;539 (2005).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-31307-4#ref-CR9\" id=\"ref-link-section-d172878278e2300\" rel=\"nofollow noopener\" target=\"_blank\">9<\/a>. The spatial autocorrelation was modelled by Gaussian Markov random field<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 48\" title=\"Lindgren, F., Rue, H. &amp; Lindstr&#xF6;m, J. An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. J. R Statist Soc. B. 73, 423&#x2013;498 (2011).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-31307-4#ref-CR48\" id=\"ref-link-section-d172878278e2304\" rel=\"nofollow noopener\" target=\"_blank\">48<\/a> (Supplementary Figure S4). The details of the model structure are described in Supplementary Information S2.<\/p>\n<p>This study considered three candidate models with different sets of environmental covariates. As environmental covariates, Model (a) included the two field-measured variables, Model (b) included the four satellite variables and Model (c) included both the two field-measured variables and the four satellite variables. Furthermore, models with different numbers of latent variables (from 0 to 5) were also considered for each candidate model. Using five-fold cross validation with negative log-likelihoods as the validation loss, the Model (a) with three spatially autocorrelated LVs was selected as the best-fit model from 18 candidate models. In the main text, the results of this best-fit model were mainly reported for simplicity, whereas the results of the best-fit model of Models (b\u2013c) were reported in Supplementary Information S2. Model (a) exhibited considerably higher model predictive ability for the regional fish community than Models (b\u2013c) (Supplementary Table S5). Nevertheless, these models estimated similar geological variations in latent variables 1 and 2 and reproduced the five hypothetical biogeographic boundaries described in the following paragraph (Supplementary Figure S5).<\/p>\n<p>As several previous studies<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 49\" title=\"Kreft, H. &amp; Jetz, W. A framework for delineating biogeographical regions based on species distributions. J. Biogeogr. 37, 2029&#x2013;2053 (2010).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-31307-4#ref-CR49\" id=\"ref-link-section-d172878278e2314\" rel=\"nofollow noopener\" target=\"_blank\">49<\/a>,<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 50\" title=\"Pang, S. E. H., Silk, J. W. F., Zurell, D. &amp; Webb, E. L. The clustering of spatially associated species unravels patterns in tropical tree species distributions. Ecosphere 14, e4589 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-31307-4#ref-CR50\" id=\"ref-link-section-d172878278e2317\" rel=\"nofollow noopener\" target=\"_blank\">50<\/a> did, our study defined biogeographic boundaries based on the ordination space (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-31307-4#Fig2\" rel=\"nofollow noopener\" target=\"_blank\">2<\/a>). Given that GLLVM directly relates ordination axes (i.e. latent variables) to species occurrence probabilities, community compositions are expected to change largely at the boundary of the zero-value line. Thus, zero-value lines were used to identify hypothetical biogeographic boundaries. The first and second LVs revealed five hypothetical boundaries between distinct local communities (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-31307-4#Fig2\" rel=\"nofollow noopener\" target=\"_blank\">2<\/a> and Supplementary Figure <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-31307-4#MOESM2\" rel=\"nofollow noopener\" target=\"_blank\">S2<\/a>). The third LV showed broad zero-value areas (not zero-value lines), suggesting no boundaries between communities (Supplementary Figure <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-31307-4#MOESM2\" rel=\"nofollow noopener\" target=\"_blank\">S2<\/a>). Species niche centres were defined as the position with the highest occurrence probability in the niche space. For this, species occurrence records in geographic space were mapped onto the niche space and then the occurrence probability on the niche space was calculated using a kernel density estimation method.<\/p>\n<p>Calculation of response diversity<\/p>\n<p>To calculate the response diversity of a local community to each LV, 100 communities were randomly generated with estimated occurrence probabilities (\\(\\:\\varvec{p}\\)). The interquartile range of the response to each LV (\\(\\:\\varvec{\\gamma\\:}\\)) was calculated for the species comprising each generated community and the interquartile ranges over all 100 generated communities were averaged as the expected response diversity. The interquartile range was used instead of the other dispersion proxies (e.g. variance) to reduce the influence of outliers due to estimation error.<\/p>\n<p>Spectral clustering and linear regression were used to investigate the relationship between species richness and response diversity. Local communities were classified into high and low expected response diversity using spectral clustering (the number of clusters was set to two). Then, linear regression was applied to each cluster:<\/p>\n<p>$$\\:\\text{l}\\text{o}\\text{g}\\left(response\\:diversity\\right)\\:\\sim\\:\\text{l}\\text{o}\\text{g}\\left(richness\\right).$$<\/p>\n<p>To ensure homogeneity of variances, both species richness and response diversity were log-transformed. Because spectral clustering can produce several different results due to the randomness in the optimisation process, clustering was repeated 100 times and finally the clustering result with the highest log-likelihood was selected for linear regressions. All codes are available at https:\/\/doi.org\/10.5281\/zenodo.17823192.<\/p>\n","protected":false},"excerpt":{"rendered":"The workspace and equipment in the pre-PCR area were sterilised with hypochlorite solution before library preparation. Low-retention filtered&hellip;\n","protected":false},"author":2,"featured_media":482806,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[47],"tags":[6470,125475,222427,125476,192,1159,1160,79],"class_list":{"0":"post-482805","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-environment","8":"tag-biodiversity","9":"tag-biogeography","10":"tag-community-ecology","11":"tag-ecological-modelling","12":"tag-environment","13":"tag-humanities-and-social-sciences","14":"tag-multidisciplinary","15":"tag-science"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/482805","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/comments?post=482805"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/482805\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media\/482806"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media?parent=482805"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/categories?post=482805"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/tags?post=482805"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}