{"id":145389,"date":"2025-11-17T23:36:22","date_gmt":"2025-11-17T23:36:22","guid":{"rendered":"https:\/\/www.newsbeep.com\/ie\/145389\/"},"modified":"2025-11-17T23:36:22","modified_gmt":"2025-11-17T23:36:22","slug":"profiling-of-vaginal-microbial-communities-in-chilean-women-via-self-sampling-and-nanopore-sequencing","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/ie\/145389\/","title":{"rendered":"Profiling of vaginal microbial communities in Chilean women via self-sampling and nanopore sequencing"},"content":{"rendered":"<p>In silico validation of 16S rRNA gene metataxonomic analysisVaginal mock community and database selection<\/p>\n<p>To select the most suitable database for this dataset, a comparison of different 16S rRNA gene taxonomic databases was conducted using an in silico-generated vaginal mock community. Following information from the literature<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 23\" title=\"Ravel, J. et al. Vaginal microbiome of reproductive-age women. Proc. Natl. Acad. Sci. U. S. A. 108(Suppl 1), 4680&#x2013;4687 (2011).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#ref-CR23\" id=\"ref-link-section-d43362747e1063\" rel=\"nofollow noopener\" target=\"_blank\">23<\/a>, the mock community was designed to represent the most common bacterial taxa found in vaginal samples across all CSTs. A total of 1,000 in silico 16S rRNA gene amplicons were simulated using Grinder (version 0.5.3)<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 36\" title=\"Angly, F. E., Willner, D., Rohwer, F., Hugenholtz, P. &amp; Tyson, G. W. Grinder: a versatile amplicon and shotgun sequence simulator. Nucleic Acids Res. 
40, e94 (2012).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#ref-CR36\" id=\"ref-link-section-d43362747e1067\" rel=\"nofollow noopener\" target=\"_blank\">36<\/a> with its in silico PCR option, employing genomic DNA sequences from the selected taxa and standard full-length 16S rRNA primer sequences. Of the 11 taxa chosen initially, only 7 were successfully amplified under these simulation conditions; amplification failed for the strains Gardnerella vaginalis, Ureaplasma parvum ATCC 27815, Bifidobacterium breve, and Megasphaera paucivorans. The estimated relative abundances of the amplicons for each taxon are summarized in Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"table anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#Tab2\" rel=\"nofollow noopener\" target=\"_blank\">2<\/a>.<\/p>\n<p>Table 2 Species present in in-silico mock community. All in silico amplicons are in an equal relative abundance of 12.5% in the mock community.<\/p>\n<p>To mimic the error profile of Oxford Nanopore Technologies (ONT) long-read sequencing, we introduced errors into the in silico simulated amplicons using CuReSim-LoRM<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 37\" title=\"Mesloub, Y., Beury, D., Vandermeeren, F. &amp; Caboche, S. CuReSim-LoRM: A tool to simulate metabarcoding long reads. Int. J. Mol. Sci. 24, 14005 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#ref-CR37\" id=\"ref-link-section-d43362747e1253\" rel=\"nofollow noopener\" target=\"_blank\">37<\/a>, a simulator that generates realistic ONT-like read errors based on empirical models. The resulting reads constituted the final vaginal mock community. 
Considering the benchmarking performance of the Emu tool<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 15\" title=\"Curry, K. D. et al. Emu: Species-level microbial community profiling for full-length nanopore 16S reads. (2021) &#010;                  https:\/\/doi.org\/10.1101\/2021.05.02.442339&#010;                  &#010;                .\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#ref-CR15\" id=\"ref-link-section-d43362747e1257\" rel=\"nofollow noopener\" target=\"_blank\">15<\/a>, this method was selected for 16S taxonomic assignment. Three 16S rRNA databases were evaluated with this tool using the generated mock community: the Emu database, the SILVA database version available for Emu, and a GTDB database version adapted for Emu. Relative abundances obtained for the mock community at the genus level for each database are shown in Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"table anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#Tab3\" rel=\"nofollow noopener\" target=\"_blank\">3<\/a> and graphically represented in Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#Fig1\" rel=\"nofollow noopener\" target=\"_blank\">1<\/a>. The estimated relative abundances from all databases were close to the expected values, with minor deviations attributable to simulated sequencing errors or taxonomic misassignments. Among the three databases, the Emu database demonstrated the highest concordance with expected abundances, showing relative abundances closest to the expected values, the lowest mean absolute error (0.0116), and misassigning only two genera.<\/p>\n<p>Table 3 Relative genus abundances of in silico vaginal mock community estimated with different databases.Fig. 
1<a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41598-025-23837-8\/figures\/1\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Fig1\" src=\"https:\/\/www.newsbeep.com\/ie\/wp-content\/uploads\/2025\/11\/41598_2025_23837_Fig1_HTML.png\" alt=\"figure 1\" loading=\"lazy\" width=\"685\" height=\"456\"\/><\/a><\/p>\n<p>Relative abundance plot of the in silico vaginal mock community comparing the three databases used. Bars represent relative abundances estimated in the mock community with different databases using the Emu tool.<\/p>\n<p>The same analysis was conducted at the species level for the three databases, with a summarized version of the results presented in Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"table anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#Tab4\" rel=\"nofollow noopener\" target=\"_blank\">4<\/a>. In all cases, the relative abundances closely align with the expected values; however, certain species are either underrepresented or overrepresented. For instance, Lactobacillus crispatus is consistently overrepresented across all databases, while Prevotella bivia is underrepresented. The SILVA database showed a significant limitation, with five taxa being assigned only at the genus level rather than resolving to the species level, a drawback not observed with the other databases. In contrast, the GTDB and Emu databases demonstrated better species-level resolution but exhibited more misassigned species. Specifically, the GTDB database misassigned 14 species, while the Emu database misassigned 11 species, with the latter showing lower relative abundances for these misassignments. 
Considering the importance of resolving taxa at the species level for subsequent analyses and minimizing taxonomic assignment errors in data comparable to the generated mock community, the Emu database was selected for further analyses.<\/p>\n<p>Table 4 Relative species abundances of in silico vaginal mock community estimated with different databases.<\/p>\n<p>Given the clinical relevance of G. vaginalis, the in silico PCR simulation was repeated using a previously described forward degenerate primer (10F:5\u2032-RGTTYGATYCTGGCTCAG-3\u2032) known to amplify this species<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 43\" title=\"Verhelst, R. et al. Cloning of 16S rRNA genes amplified from normal and disturbed vaginal microflora suggests a strong association between Atopobium vaginae, Gardnerella vaginalis and bacterial vaginosis. BMC Microbiol. 4, 16 (2004).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#ref-CR43\" id=\"ref-link-section-d43362747e2133\" rel=\"nofollow noopener\" target=\"_blank\">43<\/a>. Using the modified primer, the 16S rRNA gene of nine strains\u2014including G. vaginalis\u2014was successfully amplified and correctly assigned by Emu, confirming that the lack of detection in the initial analysis was due to primer specificity rather than the absence of target sequences. 
These results are presented in Supplementary Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">1<\/a> and Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S3<\/a>.<\/p>\n<p>ONT sequencing and 16S rRNA gene metataxonomic analysis of self-collected vaginal samples<\/p>\n<p>Characteristics of participants<\/p>\n<p>In this study, 20 Chilean women between the ages of 22 and 36\u00a0years (median age 32\u00a0years) consented to vaginal self-sampling. Each participant responded to an interview related to her lifestyle and health history. All participants declared themselves sexually active. Among them, 6 out of 20 reported recurrent infections at least thrice a year. In addition, 25% of the participants reported using antibiotics regularly. Regarding hygienic practices, responses varied but mainly involved water use, and 6 participants reported daily use of intimate hygiene products. During the self-sampling procedure, only one participant experienced discomfort when inserting the swab.<\/p>\n<p>MinION-based 16S rRNA sequencing and taxonomic profiling of vaginal microbiota<\/p>\n<p>To characterize the bacterial composition of vaginal samples, DNA extraction was performed, yielding concentrations ranging from 30 to 223\u00a0ng\/\u00b5l, except for one sample with an insufficient concentration (&lt;\u20090.2\u00a0ng\/\u00b5l). Library preparation and purification were carried out for all extracted samples, which were successfully sequenced using the Oxford Nanopore Technologies (ONT) MinION platform. Sequencing and basecalling yielded an average read length of 1.6\u00a0Kb. 
For taxonomic profiling, the 20 barcoded samples were length-filtered using fastp<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 34\" title=\"Chen, S., Zhou, Y., Chen, Y. &amp; Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884&#x2013;i890 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#ref-CR34\" id=\"ref-link-section-d43362747e2167\" rel=\"nofollow noopener\" target=\"_blank\">34<\/a>, retaining only reads between 1400 and 1700 base pairs, corresponding to the estimated full-length 16S rRNA gene. The number of reads obtained per sample and read counts after filtering are summarized in Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"table anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#Tab5\" rel=\"nofollow noopener\" target=\"_blank\">5<\/a>. On average, 7.7% (SD\u2009=\u20091.9%) of reads per sample were retained after filtering.<\/p>\n<p>Table 5 Reads for each vaginal sample after sequencing and length filtering.<\/p>\n<p>Filtered reads from each sample were taxonomically profiled using Emu<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 15\" title=\"Curry, K. D. et al. Emu: Species-level microbial community profiling for full-length nanopore 16S reads. (2021) &#010;                  https:\/\/doi.org\/10.1101\/2021.05.02.442339&#010;                  &#010;                .\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#ref-CR15\" id=\"ref-link-section-d43362747e2693\" rel=\"nofollow noopener\" target=\"_blank\">15<\/a> and the Emu 16S rRNA database. Notably, there were no unassigned reads across all samples, indicating that the Emu pipeline successfully classified all 16S rRNA genes present. 
The relative abundances obtained were processed in R as a phyloseq object for downstream analysis and visualization. The estimated abundances per sample at the genus level are presented in Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#Fig2\" rel=\"nofollow noopener\" target=\"_blank\">2<\/a>, with the complete dataset available in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S4<\/a>. As expected, the genus Lactobacillus was the most abundant across samples, with 14 out of 20 samples showing a relative abundance above 75%, 12 exceeding 90%, and 5 surpassing 99%.<\/p>\n<p>Fig. 2<a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41598-025-23837-8\/figures\/2\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Fig2\" src=\"https:\/\/www.newsbeep.com\/ie\/wp-content\/uploads\/2025\/11\/41598_2025_23837_Fig2_HTML.png\" alt=\"figure 2\" loading=\"lazy\" width=\"685\" height=\"527\"\/><\/a><\/p>\n<p>Bar plot of the top ten most abundant genera across all analyzed vaginal 16S samples. The bars represent relative abundances estimated by Emu for each sequenced sample, displaying only the top ten genera. Remaining genera are grouped as &#8220;Others.&#8221; The x-axis indicates the percentage of estimated abundances, while the y-axis lists samples ordered ascendingly by Lactobacillus abundance. 
The \u201cNot defined\u201d category corresponds to a particular taxon (Peptostreptococcaceae bacterium oral taxon 929) that lacks an assigned genus in the database used for analysis.<\/p>\n<p>The second most abundant classification was the \u201cOthers\u201d category, encompassing less represented genera that did not rank among the top 10. This pattern suggests that in samples dominated by Lactobacillus, the less abundant genera are a mix of low-abundance taxa without a consistent secondary dominant genus. Conversely, in the six samples where Lactobacillus relative abundance fell below 50%, the second most abundant and, in some cases, the dominant genus was Dialister. This genus and others evenly represented in these samples align with CST IV composition, characterized by the absence of a dominant genus<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 44\" title=\"Ma, Z. S. &amp; Li, L. Quantifying the human vaginal community state types (CSTs) with the species specificity index. PeerJ 5, e3366 (2017).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#ref-CR44\" id=\"ref-link-section-d43362747e2742\" rel=\"nofollow noopener\" target=\"_blank\">44<\/a>.<\/p>\n<p>Relative abundances were analyzed at the species level, with results presented in Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#Fig3\" rel=\"nofollow noopener\" target=\"_blank\">3<\/a>, while the complete dataset is available in Supplementary Table S5. Among the Lactobacillus species, L. crispatus, L. iners, and L. gasseri were the most relatively abundant, correlating strongly with CST I, III, and II, respectively<a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 23\" title=\"Ravel, J. et al. 
Vaginal microbiome of reproductive-age women. Proc. Natl. Acad. Sci. U. S. A. 108(Suppl 1), 4680&#x2013;4687 (2011).\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#ref-CR23\" id=\"ref-link-section-d43362747e2765\" rel=\"nofollow noopener\" target=\"_blank\">23<\/a>. These species exhibited relative abundances often exceeding 90% in dominant samples, such as M07, M12, and M14. Notably, 7 out of the 20 samples exhibited a relative abundance of L. crispatus above 80%, aligning with healthy vaginal microbiota profiles. Interestingly, only one sample (M11) showed a dominant abundance of L. jensenii (98%), which is characteristic of CST V. It is also noteworthy that in cases where L. iners was the most abundant species (M05, M09, M13), its relative abundance never exceeded 80%, and the second most abundant species was also a Lactobacillus species.<\/p>\n<p>Fig. 3<a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41598-025-23837-8\/figures\/3\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Fig3\" src=\"https:\/\/www.newsbeep.com\/ie\/wp-content\/uploads\/2025\/11\/41598_2025_23837_Fig3_HTML.png\" alt=\"figure 3\" loading=\"lazy\" width=\"685\" height=\"478\"\/><\/a><\/p>\n<p>Bar plot of the top ten most abundant species across all analyzed vaginal 16S samples. The bars represent the species relative abundances estimated by Emu for each sequenced sample, displaying only the top ten most abundant across all samples. 
Remaining species are grouped as &#8220;Others.&#8221; The x-axis indicates the percentage of estimated abundances, while the y-axis lists samples ordered by ascending abundance of species from the Lactobacillus genus.<\/p>\n<p>Samples with reduced Lactobacillus dominance, such as M01 and M19, displayed more diverse microbial communities, including significant proportions of Mageeibacillus indolicus (30.6% in M01) and Megasphaera elsdenii (23.2% in M19). The \u201cOthers\u201d category, representing less abundant taxa, contributed up to 30\u201340% in the more diverse samples (M01, M06, M10, M18, M19, M20), indicating substantial microbial complexity. Samples M19 and M20, where Lactobacillus species were minimal, exhibited a marked increase in diversity, with dominance by genera such as Mageeibacillus and Megasphaera. This shift toward increased diversity and reduced Lactobacillus dominance is consistent with CST IV profiles.<\/p>\n<p>Considering that the species-level analysis yielded results consistent with previously reported key taxa and relative abundances characteristic of each CST, we performed a preliminary classification of the samples into groups based on these features. Samples were classified according to their dominant species, defined as the species representing more than 80% of the total relative abundance. 
The results of this preliminary classification are presented in Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"table anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#Tab6\" rel=\"nofollow noopener\" target=\"_blank\">6<\/a>.<\/p>\n<p>Table 6 Classification of samples into groups based on relative abundances of Lactobacillus species.<\/p>\n<p>Diversity analysis of vaginal microbiome samples<\/p>\n<p>Genus diversity analysis correlates with a potential dysbiosis state<\/p>\n<p>We conducted a diversity analysis at the genus level to evaluate the information provided at this taxonomic resolution and compare it with species-level resolution. This comparison aims to highlight the advantages of full-length 16S rRNA gene sequencing. Alpha diversity, measured using the Shannon and Simpson indices, was calculated for each sample at the genus level. The results are presented in Supplementary Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">2<\/a>, with statistical comparisons of the diversity indices across previously determined groups shown in Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S6<\/a>. As expected, samples classified in group IV exhibited the highest diversity among all groups. Although the sample size is limited, group I had the lowest mean diversity, suggesting that Lactobacillus predominantly dominates this state with minimal presence of other bacterial genera. 
Given that group IV is consistent with CST IV, which is often associated with a potential dysbiotic state, its significantly higher mean diversity suggests that alpha diversity at the genus level may serve as an indicator of dysbiosis when considered independently.<\/p>\n<p>A similar analysis was performed for beta diversity at the genus level using Bray\u2013Curtis distance, with results shown in Supplementary Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">3<\/a>. Principal component analysis results are provided in Supplementary Tables <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S7<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S8<\/a>. Similar to the alpha diversity analysis, all samples clustered according to their previously determined groups. These results indicate that, at the genus level, it is not possible to distinguish between the different Lactobacillus-dominated groups. Still, samples corresponding to the diverse group IV can be separated. In Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#Fig5\" rel=\"nofollow noopener\" target=\"_blank\">5<\/a>, the principal component on the X-axis explains 75.2% of the variance, demonstrating that this type of analysis at the genus level provides a strong statistical representation. 
The loading analysis further indicates that the primary contributors to this separation are the relative abundances of Lactobacillus and various other bacterial genera.<\/p>\n<p>Species-level diversity analysis allows community state classification<\/p>\n<p>To further assess taxonomic resolution and validate the CST classification of our samples, we first conducted a beta diversity analysis at the species level using Bray\u2013Curtis distance (Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#Fig4\" rel=\"nofollow noopener\" target=\"_blank\">4<\/a>). Additional results are provided in Supplementary Tables <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S9<\/a> and <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S10<\/a>. The results show that samples clustered according to the preliminary groups and their correlative Community State Types (CSTs), supporting the classification based on dominant species and confirming concordance with previously defined CST structures. Statistical grouping analysis showed that the PERMANOVA test was significant (p\u2009=\u20090.001, R2\u2009=\u20090.79), indicating strong differences among CSTs. However, PERMDISP was also significant (p\u2009=\u20090.001), suggesting that part of the observed separation may be attributed to differences in group dispersion, an effect likely amplified by the relatively small number of samples per CST. 
To complement these findings, we also analyzed alpha diversity grouped by CSTs (Fig.\u00a0<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#Fig5\" rel=\"nofollow noopener\" target=\"_blank\">5<\/a>; Supplementary Table <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"supplementary material anchor\" href=\"http:\/\/www.nature.com\/articles\/s41598-025-23837-8#MOESM1\" rel=\"nofollow noopener\" target=\"_blank\">S11<\/a>). CST IV exhibited the highest diversity values, with an average Shannon index of 2.42 and Simpson index of 0.851, indicating a more heterogeneous bacterial composition. In contrast, CSTs dominated by Lactobacillus species (I, II, III, and V) displayed lower diversity values, consistent with a more uniform microbiota. These results support the use of alpha diversity as a quantitative marker of potentially dysbiotic states and further highlight the added value of species-level analysis in distinguishing CSTs. Unlike the genus-level analysis, species-level beta diversity enabled a clearer separation between all CSTs, reflecting a finer resolution of microbial composition. The first principal component (X-axis) explained 44% of the variance, indicating a more complex diversity distribution at the species level. Notably, CST IV showed the greatest dispersion, consistent with its highly diverse and heterogeneous microbial profile. This variability suggests possible sub-classifications within CST IV, potentially representing distinct dysbiotic states or ecological shifts. The loading analysis (highlighting the 15 most influential taxa) revealed that several species, including Neisseria gonorrhoeae, Prevotella corporis, and Campylobacter hominis, contribute significantly to CST IV differentiation, contrasting with taxa predominant in Lactobacillus-dominated CSTs.<\/p>\n<p>Fig. 
4<a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41598-025-23837-8\/figures\/4\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Fig4\" src=\"https:\/\/www.newsbeep.com\/ie\/wp-content\/uploads\/2025\/11\/41598_2025_23837_Fig4_HTML.png\" alt=\"figure 4\" loading=\"lazy\" width=\"685\" height=\"527\"\/><\/a><\/p>\n<p>Beta diversity analysis by PCoA based on Bray\u2013Curtis distance at the species level. The graph represents the Bray\u2013Curtis distances of vaginal samples analyzed at the species level. Each sample is color-coded according to its previously determined community state (CST). Ellipses were added to group samples by CST; however, due to limited sample size, only CST I and CST IV are displayed. Principal Component Analysis (PCA) loadings are represented as arrows labeled with the corresponding species, showing only the 15 most contributing taxa. The x-axis represents the first principal component, while the y-axis represents the second principal component.<\/p>\n<p>Fig. 5<a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s41598-025-23837-8\/figures\/5\" rel=\"nofollow noopener\" target=\"_blank\"><img decoding=\"async\" aria-describedby=\"Fig5\" src=\"https:\/\/www.newsbeep.com\/ie\/wp-content\/uploads\/2025\/11\/41598_2025_23837_Fig5_HTML.png\" alt=\"figure 5\" loading=\"lazy\" width=\"685\" height=\"457\"\/><\/a><\/p>\n<p>Alpha diversity analysis at the species level in vaginal samples. Boxplots represent species-level diversity in vaginal samples, grouped according to community state types (CSTs). CSTs include: CST I (L. crispatus-dominant), CST II (L. gasseri-dominant), CST III (L. 
iners-dominant), CST IV (high bacterial diversity), and CST V (L. jensenii-dominant). The x-axis displays the different CSTs, where each point represents an individual sample. The y-axis indicates alpha diversity, measured using Shannon and Simpson indices.<\/p>\n","protected":false},"excerpt":{"rendered":"In silico validation of 16S rRNA gene metataxonomic analysisVaginal mock community and database selection To select the most&hellip;\n","protected":false},"author":2,"featured_media":145390,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[10],"tags":[5428,103,2026,61,60,2166,2027,83132,82],"class_list":{"0":"post-145389","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-health","8":"tag-dna-sequencing","9":"tag-health","10":"tag-humanities-and-social-sciences","11":"tag-ie","12":"tag-ireland","13":"tag-microbiome","14":"tag-multidisciplinary","15":"tag-next-generation-sequencing","16":"tag-science"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/posts\/145389","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/comments?post=145389"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/posts\/145389\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/media\/145390"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/media?parent=145389"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/categories?post=145
389"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/ie\/wp-json\/wp\/v2\/tags?post=145389"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}