The Microbiome and Epidemiology
Incorporating microbiota data into epidemiologic models: examples from vaginal microbiota research

https://doi.org/10.1016/j.annepidem.2016.03.004

Abstract

Purpose

Next generation sequencing and quantitative polymerase chain reaction technologies are now widely available, and research incorporating these methods is growing exponentially. In the vaginal microbiota (VMB) field, most research to date has been descriptive. The purpose of this article is to provide an overview of different ways in which next generation sequencing and quantitative polymerase chain reaction data can be used to answer clinical epidemiologic research questions using examples from VMB research.

Methods

We reviewed relevant methodological literature and VMB articles (published between 2008 and 2015) that incorporated these methodologies.

Results

VMB data have been analyzed using ecologic methods, methods that compare the presence or relative abundance of individual taxa or community compositions between different groups of women or sampling time points, and methods that first reduce the complexity of the data into a few variables followed by the incorporation of these variables into traditional biostatistical models.

Conclusions

To make future VMB research more clinically relevant (such as studying associations between VMB compositions and clinical outcomes and the effects of interventions on the VMB), it is important that these methods are integrated with rigorous epidemiologic methods (such as appropriate study designs, sampling strategies, and adjustment for confounding).

Introduction

The improved availability and affordability of high-throughput molecular techniques is revolutionizing microbiota research [1], including vaginal microbiota (VMB) research [2]. VMB dysbiosis (also known by its clinical name bacterial vaginosis [BV]) has long been recognized as a common clinical condition with potentially devastating consequences (such as preterm birth), but its etiology and pathogenesis have never been fully understood. BV is treated empirically in most clinical settings, diagnosed by the Amsel criteria (clinical signs and microscopy) in some specialized clinics [3], and diagnosed by Gram stain Nugent scoring (microscopy) in research settings [4]. Microscopy and culture studies had already shown that the VMB of healthy asymptomatic women predominantly consists of lactobacilli, and that BV is associated with a reduction of lactobacilli and an overgrowth of other (facultative) anaerobic bacteria. However, high-throughput molecular techniques have characterized VMB compositions in much more detail, identified novel bacterial taxa in the vaginal niche, and allowed the field to get a better handle on determinants of VMB composition, VMB fluctuations over the menstrual cycle and over a lifetime, VMB associations with clinical outcomes, and the effects of interventions on the VMB [2].

While early studies between 2002 and 2013 used a variety of molecular techniques (DNA fingerprinting, DNA microarrays, quantitative polymerase chain reaction (qPCR), and sequencing of DNA isolated from culture colonies or directly from genital samples using many different sequencing platforms; [2]), studies in 2014 and 2015 almost exclusively used next generation sequencing (NGS) and/or (multiplex) qPCR of DNA extracted directly from genital samples. For that reason, we have focused this article on the latter two techniques. Furthermore, in the VMB field thus far, the vast majority of studies have targeted the 16S ribosomal DNA (rDNA) gene for bacterial identification. We therefore limited this review to NGS and qPCR of the 16S rDNA gene, but note that shotgun sequencing is increasingly available and affordable and will likely increase in importance in future VMB research.

We wrote this article for epidemiologists who are interested in studying the effects of microbiota composition on clinical outcomes but are not experts in genomic laboratory methods or bioinformatics. Throughout the article, we used examples from VMB research. While the first decade of VMB genomics has been dominated by the development and initial applications of the technologies in relatively small, mostly descriptive studies, we believe that the time has now come for incorporation into clinical epidemiologic studies to answer biomedical research questions or test interventions on a much wider scale.

This paragraph briefly summarizes the principles of 16S rDNA-based NGS, but more detailed explanations can be found in the Appendix. In microbiota studies, the conserved regions of the 16S rDNA gene are used for the initial amplification of 16S rDNA present in a sample, and portions of one or more variable regions are sequenced to allow for identification of bacterial species, genera, or higher order taxa (collectively referred to as taxa in this article; [5], [6], [7]). The ability to classify sequencing reads to species level depends on various factors including choice of NGS platform [8], [9], variable region(s), and alignment databases (see the following paragraphs). Most NGS platforms allow for multiplexing (the use of a unique barcode sequence to identify DNA originating from a specific sample), so that samples can be pooled during sequencing and subsequently sorted by barcode.

A multiplex 16S rDNA NGS run typically results in thousands of sequence reads per sample [8], [9]. The sequence reads are first checked for quality and preprocessed, a process that is known to introduce biases (i.e., an observed microbiota composition that differs from the actual microbiota composition; [10], [11]). The processed reads are then used to identify bacterial taxa present in each sample by sequence alignment [12], [13], [14]. The reads are usually first assigned to operational taxonomic units (OTUs; based on a sequence similarity threshold, usually 97%, within the experimental data set), which are subsequently compared with known bacterial taxa sequences in publicly available databases [15], [16], [17]. These databases do not always allow for assignment of sequences at species level, and some laboratories have designed their own customized databases to fill the gaps (see e.g., [18]). Some researchers report phylotypes (based on sequence similarity with an external database) instead of OTUs. We will refer to OTUs in the remainder of the article, but all methods described also apply to phylotypes unless explicitly stated otherwise.

From the resulting sequence alignment, phylogenetic analysis can be conducted to assess the sequences' shared evolutionary origins. Methods for estimating phylogenies, each with their own strengths and weaknesses, include neighbor-joining, unweighted pair group method with arithmetic mean, maximum parsimony, maximum likelihood, and Bayesian inference of phylogeny (reviewed in [19]). Phylogenies are typically visualized using a dendrogram (Appendix: Fig. 1).

Rarefaction curves are used to determine whether most taxa present in a sample were in fact identified (Appendix: Fig. 2; [20]). Most articles only report on taxa that constitute at least 0.1% of the overall bacterial community; taxa constituting less than 0.1% are referred to as rare taxa. However, if a bacterial community has 10^8 bacteria per mL of biological sample, then rare taxa may represent up to 10^5 bacteria per mL. Such “rare” taxa could cause disease (e.g., if they produce toxins or have a high pathogenicity index for other reasons), play important roles in the bacterial community, or constitute a “seed bank” of taxa whose numbers increase under conditions that favor their growth [21].
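The idea behind a rarefaction curve can be sketched with the classical analytic formula for the expected number of taxa observed when a fixed number of reads is drawn without replacement. This is a minimal illustration only; the read counts and the function name `expected_richness` are hypothetical and do not come from the article.

```python
from math import comb

def expected_richness(counts, depth):
    """Expected number of taxa seen when `depth` reads are drawn without replacement."""
    total = sum(counts)
    if depth > total:
        raise ValueError("depth cannot exceed the total number of reads")
    all_draws = comb(total, depth)
    # For each taxon: 1 - P(none of its reads are among the drawn reads)
    return sum(1 - comb(total - n, depth) / all_draws for n in counts if n > 0)

# Hypothetical read counts for one sample: 5 taxa, 1000 reads in total
counts = [900, 80, 15, 4, 1]
curve = [(d, expected_richness(counts, d)) for d in (10, 100, 500, 1000)]
```

A curve that is still rising steeply at the actual sequencing depth suggests that additional taxa, typically rare ones, would be found with deeper sequencing.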

The number of sequence reads per individual sample within one study can be vastly different for a number of reasons. This is usually dealt with by normalizing the data in the following ways: (1) base analyses on the relative abundance of each species; or (2) rarefy, which refers to the process of throwing away sequences from samples with high numbers of reads so that all samples have the same number of reads [22]. Although the former does not address heteroscedasticity (different species might have different variability), the latter omits potentially large amounts of available valid data. Some experts therefore object to both these options and argue in favor of a third option, which is to use negative binomial models to account for differences in read numbers between samples (for an in-depth discussion, see [22]).
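The first two normalization options can be sketched as follows. The sample names, taxa, and counts are hypothetical, and the choice to rarefy to the smallest library size is an assumption made for illustration (the negative binomial approach is not shown).

```python
import random

def relative_abundance(counts):
    """Option 1: convert read counts to proportions of the sample total."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}

def rarefy(counts, depth, seed=0):
    """Option 2: keep `depth` randomly chosen reads and discard the rest."""
    reads = [taxon for taxon, n in counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("sample has fewer reads than the requested depth")
    rng = random.Random(seed)
    subsample = rng.sample(reads, depth)  # sampling without replacement
    return {taxon: subsample.count(taxon) for taxon in counts}

# Hypothetical read counts for two samples with very different library sizes
samples = {
    "woman_A": {"L. crispatus": 4500, "L. iners": 400, "G. vaginalis": 100},
    "woman_B": {"L. crispatus": 50, "L. iners": 600, "G. vaginalis": 350},
}
depth = min(sum(c.values()) for c in samples.values())
proportions = {sid: relative_abundance(c) for sid, c in samples.items()}
rarefied = {sid: rarefy(c, depth) for sid, c in samples.items()}
```

In this example, 4000 of the 5000 reads of the larger sample are discarded by rarefying, which illustrates the loss of valid data mentioned above.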

The field of microbial and/or environmental ecology existed long before human microbiota research soared, and ecologic terminology and methodology have been incorporated into human microbiota research. The term “richness” refers to the number of taxa present in an ecological community (not taking the abundance of each taxon into account), and “evenness” refers to how close in abundance these taxa are. Diversity takes both richness and evenness into account. The total diversity (“gamma diversity”) consists of the diversity at one ecologic niche or in one (type of) sample (“alpha diversity”) and the differentiation between ecologic niches or (types of) samples (“beta diversity”). Popular alpha diversity measures include the Shannon (also referred to as Shannon–Wiener) diversity index and (inverse) Simpson diversity index. Popular beta diversity measures include the Bray–Curtis dissimilarity (uses counts of shared and unshared OTUs between two samples), Jensen-Shannon divergence (measures the similarity between two probability distributions), and UniFrac measures (uses shared and unshared branches in a phylogenetic tree [23]). Diversity is often visualized by a heatmap showing each OTU (on the vertical axis) for each participant or sampling time point (on the horizontal axis) with the proportion of sequence reads assigned to each OTU (often referred to as the relative abundance) shown in a different color (Appendix: Fig. 1). Alternatively, an interpolated bar plot is shown, with the relative abundance on the vertical axis and the participant or time point on the horizontal axis and each OTU shown in a different color (Appendix: Fig. 3).
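The most common alpha and beta diversity measures have simple closed forms. The sketch below uses their standard textbook definitions (natural logarithm for Shannon, abundance-based Bray–Curtis); the function names are illustrative and the inputs are assumed to be per-taxon read counts listed in the same taxon order.

```python
from math import log

def shannon(counts):
    """Shannon index: -sum(p * ln p) over taxa that are present."""
    total = sum(counts)
    return -sum((n / total) * log(n / total) for n in counts if n > 0)

def inverse_simpson(counts):
    """Inverse Simpson index: 1 / sum(p^2)."""
    total = sum(counts)
    return 1 / sum((n / total) ** 2 for n in counts)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples with the same taxon order."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1 - 2 * shared / (sum(a) + sum(b))
```

For a perfectly even community of four taxa, `shannon` returns ln(4) and `inverse_simpson` returns 4; with this formula a community consisting of a single taxon has a Shannon index of 0 and an inverse Simpson index of 1.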

After multiple samples have been sequenced and the sequencing data have been organized into OTUs for each sample, it is time to consider how these data can be used to answer biomedical research questions. We have divided this section into (1) methods that compare the presence or relative abundance of individual OTUs, or community compositions, between different groups of women or sampling time points; and (2) methods that first reduce the complexity of the data into a few variables followed by the incorporation of these variables into traditional biostatistical models. A third group of methods of potential interest, but not discussed further in this article, are bioinformatics methods such as sequence mining (which identifies statistically relevant patterns) and alignment-free sequence analysis (when alignment is not possible, e.g., because sequences are not closely related). The NGS data input for most methods in the second and third category is a distance matrix. A distance matrix in this context is a two-dimensional array containing the distances (the degree of similarity) of all pairwise sequences and/or OTUs in the data set.

The first step is usually to determine the relative abundances of OTUs of interest. In the VMB field, the focus is often on the relative abundance of the Lactobacillus genus, specific Lactobacillus species, and/or the most common BV-associated bacteria (Gardnerella vaginalis, Atopobium vaginae, among others) because of their known association with vaginal health. It is important to note that the relative abundances derived from NGS data are semi-quantitative and can be biased by the laboratory and sequence processing methods used, as discussed earlier. If answering the research question requires quantitative data, and the bacterial taxa of interest are known a priori, qPCR assays of those predefined taxa are likely to be more appropriate (see below). The mean or median relative abundance of an OTU of interest for multiple samples from a group of study participants (e.g., HIV-positive vs. HIV-negative women), and/or samples collected at a specific time point (e.g., pretreatment and posttreatment), can also be calculated. Traditional biostatistical methods can then be used to compare these mean and/or median relative abundances of individual taxa between groups or sampling time points, or determine correlations or associations with other variables of interest (such as Gram stain Nugent scores, the presence of clinical signs and symptoms or behaviors; see e.g., [24], [25], [26]). This approach works well when focusing the analysis on just a few taxa but does not make optimal use of all the available data related to entire bacterial communities.

A relatively easy way to compare community compositions is to compare alpha or beta diversity measures (see e.g., [27]). One should keep in mind, however, that diversity measures only provide information about richness and evenness of multiple taxa and not about which taxa are present. Whether this is clinically meaningful depends on the ecological niche and the research question. After a decade of molecular VMB studies, we now know that a healthy VMB is dominated by lactobacilli and therefore has low diversity, whereas BV is associated with high diversity [2]; diversity measures can therefore be clinically meaningful in this context. In fact, some studies have shown “dose response” relationships between VMB diversity and clinically relevant outcomes such as prevalence of sexually transmitted infections, shedding of HIV in the female genital tract, and vaginal mucosal degradation and inflammation [28], [29]. However, caution is needed because some organisms that can cause symptomatic vaginitis or adverse outcomes can be present in a low-diversity, Lactobacillus-dominated VMB (such as Candida yeasts, which are not measured by 16S techniques, and Streptococcus agalactiae, which is often present in low relative abundance).

The alpha diversity Shannon and inverse Simpson indices are continuous variables with a lower bound of 0 and no upper bound (with a higher value indicating increased diversity). A mean or median diversity for a group can therefore be determined, and groups can be compared using traditional biostatistical methods such as analysis of variance or correlation methods. Beta diversity dissimilarity or distance measures (referred to as distance measures in the remainder of the article) have a value between 0 and 1, with a score of 0 indicating that the two sequence profiles are exactly the same in bacterial presence and abundance, whereas a score of 1 represents profiles with no overlapping sequences. The distance measures themselves are an indication of the degree of similarity between two samples, but many research questions involve the comparison of groups containing multiple samples. This requires the compilation of a distance matrix including the distance measures of all possible sample pairs, which can subsequently be used for clustering or other dimensionality reduction methods, as described in the following.
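Compiling a distance matrix amounts to evaluating the chosen distance measure for every pair of samples. The sketch below assumes Bray–Curtis as the measure and uses hypothetical relative abundance profiles; any other pairwise measure could be substituted.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two profiles with the same taxon order."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1 - 2 * shared / (sum(a) + sum(b))

def distance_matrix(profiles):
    """Return sample ids and the square matrix of all pairwise distances."""
    ids = sorted(profiles)
    matrix = [[bray_curtis(profiles[i], profiles[j]) for j in ids] for i in ids]
    return ids, matrix

# Hypothetical relative abundances (Lactobacillus, G. vaginalis, A. vaginae)
profiles = {
    "s1": [0.9, 0.1, 0.0],
    "s2": [0.8, 0.2, 0.0],
    "s3": [0.0, 0.1, 0.9],
}
ids, dm = distance_matrix(profiles)
```

The matrix is symmetric with zeros on the diagonal; here the two Lactobacillus-dominated samples are close to each other (about 0.1) and far from the third sample (about 0.9).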

Indicator value analysis is an ecologic analysis method with an emphasis on identifying taxa responsible for community composition differences [30]. Indicator value analysis quantifies the fidelity (the proportion of samples in a group that contain the taxon) and specificity (relative abundance) of taxa in each group of samples in a user-specified classification of these groups, and tests for the statistical significance of the associations by permutation. It was used by Hickey et al. [31] to determine if individual taxa were more strongly associated with the vaginal or the vulval environment.
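A minimal sketch of the indicator value calculation for a single taxon is given below, assuming the common formulation in which the indicator value is the product of specificity (the group's mean abundance divided by the sum of group means) and fidelity, with significance assessed by shuffling group labels. The data and function names are hypothetical.

```python
import random

def indval(abundances, groups, target):
    """Indicator value of one taxon for group `target`: specificity x fidelity."""
    labels = sorted(set(groups))
    means = {
        g: sum(a for a, lab in zip(abundances, groups) if lab == g) / groups.count(g)
        for g in labels
    }
    total = sum(means.values())
    specificity = means[target] / total if total > 0 else 0.0
    in_target = [a for a, lab in zip(abundances, groups) if lab == target]
    fidelity = sum(a > 0 for a in in_target) / len(in_target)
    return specificity * fidelity

def permutation_p(abundances, groups, target, n_perm=999, seed=1):
    """Observed indicator value and its permutation p-value."""
    rng = random.Random(seed)
    observed = indval(abundances, groups, target)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real link between taxon and group
        if indval(abundances, shuffled, target) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical relative abundances of one taxon in 4 vaginal and 4 vulval samples
abund = [0.40, 0.35, 0.50, 0.45, 0.0, 0.02, 0.0, 0.01]
site = ["vagina"] * 4 + ["vulva"] * 4
obs, p = permutation_p(abund, site, "vagina")
```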

The main aim of most metagenomic studies is to identify organisms, genes, or pathways that consistently explain the differences between two or more microbial communities. All the methods described previously, derived from either biostatistics or microbial ecology, have important limitations in that regard: they do not make use of all the available data, and do not adequately address the multidimensionality of the data, the considerable experimental variance embedded in NGS procedures, the considerable intersubject variability, and adjustment for multiple comparisons. Therefore, in recent years, many new analysis tools combining statistics and microbial ecology have been developed. Explaining each of these in detail is beyond the scope of this article, but we will mention a few that have been used in VMB research in recent years. These include the Integral-LIBSHUFF tool [32], the linear discriminant analysis effect size method [33], and the analysis of variance–like Differential Expression (ALDEx) tool [34]. The Integral-LIBSHUFF tool uses the exact integral form of the Cramér-von Mises statistic (a goodness-of-fit statistic) and Monte-Carlo sampling to calculate the probability that the observed differences between two sequence profiles are due to chance. It was used by Kim et al. [35] to determine whether VMB composition differed statistically significantly between women, and whether the microbiota differed between body sites within women. Linear discriminant analysis effect size determines the taxa (or genes or functions) most likely to explain differences between groups by first detecting taxa with significant differential abundance between groups using the nonparametric factorial Kruskal–Wallis sum-rank test, then investigating association consistency using (unpaired) Wilcoxon rank-sum tests, and finally estimating the effect size of each differentially abundant taxon by linear discriminant analysis.
Linear discriminant analysis effect size was used to determine which VMB taxa were associated with race [36], with high risk human papillomavirus infection [37], and with the vaginal versus seminal microbiome within sexual relationships [27]. The ALDEx tool decomposes sample-to-sample variation into four parts (within-condition variation, between-condition variation, sampling variation, and general unexplained error) using Monte-Carlo sampling from a Dirichlet distribution. The outcome is a log2-transformed abundance value per taxon, which represents its abundance relative to the mean abundance of all taxa in the sample. It was used by Macklaim et al. [38], in combination with Welch's t-tests and Benjamini-Hochberg false discovery rate correction, to determine differentially abundant VMB taxa pretreatment and posttreatment of BV with oral tinidazole and probiotics.
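The core transformation behind the ALDEx outcome can be illustrated as follows. This is a sketch of the general idea only, not the ALDEx implementation: it assumes that "mean abundance" refers to the geometric mean (a centred log-ratio), that a prior of 0.5 is added to each count before Dirichlet sampling, and that the Monte-Carlo draws are simply averaged. The function names and counts are hypothetical.

```python
import random
from math import log2

def dirichlet_draw(counts, rng, prior=0.5):
    """One draw of taxon proportions from Dirichlet(counts + prior)."""
    gammas = [rng.gammavariate(n + prior, 1.0) for n in counts]
    total = sum(gammas)
    return [g / total for g in gammas]

def clr_log2(proportions):
    """log2 abundance of each taxon relative to the geometric mean of all taxa."""
    logs = [log2(p) for p in proportions]
    centre = sum(logs) / len(logs)
    return [value - centre for value in logs]

def monte_carlo_clr(counts, n_draws=128, seed=0):
    """Average the centred log-ratio values over repeated Dirichlet draws."""
    rng = random.Random(seed)
    sums = [0.0] * len(counts)
    for _ in range(n_draws):
        for i, value in enumerate(clr_log2(dirichlet_draw(counts, rng))):
            sums[i] += value
    return [s / n_draws for s in sums]

# Hypothetical read counts for five taxa in one sample (one taxon unobserved)
clr = monte_carlo_clr([900, 80, 15, 5, 0])
```

Because the values are centred within each sample, they sum to zero; positive values indicate taxa that are more abundant than the sample's geometric mean.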

Although the methods described in the previous paragraph make optimal use of NGS data to analyze differences in bacterial communities, they can only incorporate a limited number of demographic, behavioral, or clinical covariates. They are therefore not ideal for addressing certain clinical epidemiologic research questions, for example, those that require adjustment for multiple confounders. Dimensionality reduction techniques use all the available NGS data (usually in the form of a distance matrix) but reduce the data to a small number of random variables that can subsequently be incorporated into bivariable or multivariable biostatistical models, including models with repeated measures over time.

A commonly used dimensionality reduction method in microbiome research is clustering. Many clustering methods and software tools are available, each with their own advantages and disadvantages [39]. VMB studies to date have identified 3–9 clusters per study (the number depending on the sample size and study population), usually including multiple clusters dominated by one Lactobacillus species (Lactobacillus crispatus, L. iners, L. gasseri, L. jensenii, and/or L. vaginalis), at least one cluster with intermediate diversity (typically dominated by lactobacilli and G. vaginalis), and at least one cluster containing mixtures of anaerobes with or without L. iners [2]. After clustering, each sample (each woman at each sampling time point) can be assigned to a microbiota cluster, and biostatistical methods can be used to determine correlations or associations between variables of interest and membership of specific clusters [21], [28], [31], [40], [41], [42], [43]. Cluster membership can be incorporated into multivariable statistical models as a categorical or ordinal variable (if the clusters are ranked, e.g., based on diversity; [44]), but can also be analyzed as indicator variables with the cluster of interest coded as 1 and all other clusters as 0 [37], [45], [46]. Mehta et al. [44] used multinomial mixed-effects modeling with the VMB clusters ranked by relative Lactobacillus abundance as multinomial outcome, a subject-specific random slope and intercept, and various independent variables to determine cluster membership trends over time by HIV status and other factors. Romero et al. [45] used generalized estimating equations models and different types of linear mixed-effects models to determine associations between VMB cluster membership (with indicator variables for each cluster of interest) and pregnancy status.
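The two ways of coding cluster membership for use in a regression model can be sketched as follows. The cluster labels and their ranking are hypothetical and serve only to show the coding.

```python
def indicator(assignments, cluster_of_interest):
    """Code the cluster of interest as 1 and all other clusters as 0."""
    return [1 if c == cluster_of_interest else 0 for c in assignments]

def ordinal(assignments, ranking):
    """Code clusters by their position in a user-specified ranking."""
    rank = {cluster: i for i, cluster in enumerate(ranking)}
    return [rank[c] for c in assignments]

# Hypothetical cluster assignments for four samples
assignments = ["L. crispatus", "L. iners", "mixed anaerobes", "L. iners"]
# Hypothetical ranking from lowest to highest diversity
ranking = ["L. crispatus", "L. iners", "mixed anaerobes"]

iners_indicator = indicator(assignments, "L. iners")
diversity_rank = ordinal(assignments, ranking)
```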

Microbiome cluster membership may change over time. Gajer et al. [47] introduced the concept of “community classes,” which they defined as the dominant VMB cluster over 16 weeks of observation. This new variable was assessed as a predictor of human papillomavirus status by Fisher's exact test [42]. They also calculated the frequency of transitions between each pair of clusters and the rate of change of the log of Jensen-Shannon distances between consecutive clusters [47]. The latter was used as the dependent variable in a linear mixed-effects model, which included “subject” as the random effect (to account for correlations between repeated measurements in the same subject) and various independent variables as fixed effects to determine their associations with cluster transition.
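Deriving a dominant cluster and transition frequencies from a woman's series of cluster assignments is straightforward. The sketch below assumes that "dominant" means the most frequently observed cluster; the cluster labels and the series are hypothetical.

```python
from collections import Counter

def community_class(clusters):
    """Most frequently observed cluster over the observation period."""
    return Counter(clusters).most_common(1)[0][0]

def transition_counts(clusters):
    """Count transitions between consecutive sampling time points."""
    return Counter(zip(clusters, clusters[1:]))

# Hypothetical cluster assignments for one woman at six consecutive visits
series = ["crispatus", "crispatus", "iners", "diverse", "crispatus", "crispatus"]
dominant = community_class(series)
transitions = transition_counts(series)
```

Pairs with identical first and second elements, such as `("crispatus", "crispatus")`, represent visits at which the woman stayed in the same cluster.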

Other dimension reduction methods include those that are used when the number of dimensions in the data is not known (such as principal components analysis [PCA], principal coordinates analysis [PCoA], nonmetric multidimensional scaling, and factor analysis) or is known (such as linear discriminant analysis and canonical correlation analysis). PCA and PCoA are by far the most used in VMB research [18], [21], [27], [31], [37], [43], [46], [47], [48], [49], but nonmetric multidimensional scaling has also been used [46], [50]. PCA and PCoA use an eigenvector-based approach to represent multidimensional data in as few dimensions (the principal components [PCs]) as possible. They aim for the first PC to account for as much of the variability in the data as possible with each succeeding component in turn accounting for the highest variance possible under the constraint that it is orthogonal to the preceding components. The PCs are often plotted in two- or three-dimensional plots to visualize groupings (Appendix: Fig. 4), and the first PC is often used as a dependent variable in statistical models to identify factors associated with it. The difference between PCA and PCoA is in the types of matrix input that can be used. PCA requires the input of a covariance or correlation matrix, whereas PCoA can use matrices based on any distance metric.
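The eigenvector-based approach can be illustrated for the first PC only, using power iteration on the covariance matrix. This is a teaching sketch under stated assumptions: real analyses would use an established statistical package, and the data below are hypothetical log10 concentrations of two taxa.

```python
def first_principal_component(rows, n_iter=200):
    """Loadings, per-sample scores, and explained variance of the first PC."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(p)]
           for a in range(p)]
    # Asymmetric start vector, so it is unlikely to be orthogonal to the first PC
    vec = [1.0 / (j + 1) for j in range(p)]
    for _ in range(n_iter):
        nxt = [sum(cov[a][b] * vec[b] for b in range(p)) for a in range(p)]
        norm = sum(v * v for v in nxt) ** 0.5
        vec = [v / norm for v in nxt]
    # Rayleigh quotient gives the variance along the first PC
    eigenvalue = sum(vec[a] * sum(cov[a][b] * vec[b] for b in range(p))
                     for a in range(p))
    explained = eigenvalue / sum(cov[a][a] for a in range(p))
    scores = [sum(c[j] * vec[j] for j in range(p)) for c in centred]
    return vec, scores, explained

# Hypothetical log10 concentrations: (Lactobacillus, G. vaginalis) per sample
data = [[8.0, 3.0], [7.5, 3.5], [4.0, 7.0], [3.5, 8.0], [7.0, 4.0], [3.0, 7.5]]
loadings, scores, explained = first_principal_component(data)
```

In this example the two loadings have opposite signs, so the first PC describes a gradient from Lactobacillus-dominated to G. vaginalis-dominated samples, and the per-sample `scores` could be used as a variable in a regression model.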

Whereas NGS characterizes “all” 16S rDNA present in a sample, qPCR quantifies one specific taxon (or, in the case of a multiplex qPCR assay, a few specific taxa). qPCR assays use primers directed to the taxon of interest to amplify this taxon (usually resulting in higher sensitivity and specificity for this taxon than NGS), and the fluorescent DNA-binding dye SYBR Green for determination of its concentration by comparing the fluorescence level of the sample to a standard curve. The bacterial load of specific taxa is sometimes a better indicator of health or disease, or the effect of an intervention, than the mere presence or absence of the taxa. When this is the case, and when the taxa of interest are known a priori, it is likely better to use (multiplex) qPCR than NGS. qPCR assays have been developed for several VMB taxa such as the Lactobacillus genus, several Lactobacillus species, and several BV-associated taxa (G. vaginalis, A. vaginae, Leptotrichia genus, Sneathia genus, Prevotella genus, among others; [51], [52], [53]). Similar to NGS data, qPCR data can be used to answer biomedical questions that: (1) compare the presence or concentrations of taxa (also referred to as bacterial loads) between groups or sampling time points; and (2) combine the prevalence and/or concentrations of a number of taxa into one or more variables.
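Quantification against a standard curve can be sketched as a linear regression followed by inversion. The sketch assumes the usual qPCR readout, the quantification cycle (Cq) at which fluorescence crosses a threshold, which is approximately linear in log10 concentration; the term Cq, the standards, and the function names are assumptions for illustration and are not taken from the article.

```python
from math import log10

def fit_standard_curve(standards):
    """Least-squares fit of Cq = slope * log10(concentration) + intercept."""
    xs = [log10(conc) for conc, _ in standards]
    ys = [cq for _, cq in standards]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept

def quantify(cq, slope, intercept):
    """Invert the standard curve to estimate a sample's concentration."""
    return 10 ** ((cq - intercept) / slope)

# Hypothetical standards: (known concentration in geq/mL, measured Cq)
standards = [(1e2, 33.0), (1e3, 29.7), (1e4, 26.4), (1e5, 23.1), (1e6, 19.8)]
slope, intercept = fit_standard_curve(standards)
sample_concentration = quantify(24.75, slope, intercept)
```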

The outcome of a qPCR assay is the concentration of the targeted bacterial taxon (number of 16S rDNA gene equivalents (geq) or copies/mL). This information can be reduced to a binary variable (indicating presence or absence of the taxon in the sample) or categorical variable (e.g., not quantifiable, <10^6 geq/mL, ≥10^6 geq/mL [53]) but is usually used as a continuous variable, often log10-transformed. These binary, categorical, or continuous variables can subsequently be compared between groups using standard biostatistical tests or used in bivariable or multivariable biostatistical models. In VMB research, such qPCR-based studies compared the presence and/or concentrations of taxa of interest between women with or without BV by Amsel criteria or Nugent scoring (e.g., [53], [54], [55], [56], [57], [58]), women with or without vaginal symptoms [59], [60], premenopausal and postmenopausal women [61], young women before and after sexual debut [62], women with pelvic inflammatory disease [63] or HIV [64], [65], and longitudinally over menstrual cycles [55], [66] and in pregnancy [67]. Current research is beginning to zoom in on pathogenesis of dysbiosis, such as associations between the presence and/or concentrations of taxa of interest and vaginal mucosal inflammatory markers [68], [69] or bacterial metabolites [70], the potential use of multiplex qPCR as a new diagnostic tool for BV [71], [72], and the inclusion of qPCR data in multivariable statistical models (typically linear mixed-effects models). When including qPCR data in multivariable models, a value of half the lower limit of detection is sometimes assigned to women who do not have detectable concentrations of a particular taxon in their VMB to avoid losing a large proportion of the data [73]. qPCR is particularly useful when studying interventions, such as antibiotic treatment [74], [75].
For example, unchanged concentrations of taxa that should be sensitive to the antibiotic used might suggest antibiotic resistance, whereas declining but still detectable concentrations might lead to recurrence [74].
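The variable codings described above (log10 transformation, substitution of half the lower limit of detection, and categorization) can be sketched as follows. The lower limit of 100 geq/mL is hypothetical, and representing an undetectable result as `None` is an assumption of this example.

```python
from math import log10

def log10_load(concentration, lower_limit):
    """log10 geq/mL; undetectable results are set to half the lower limit."""
    if concentration is None or concentration < lower_limit:
        concentration = lower_limit / 2
    return log10(concentration)

def load_category(concentration, lower_limit, cutoff=1e6):
    """Three-level categorical coding of a qPCR concentration."""
    if concentration is None or concentration < lower_limit:
        return "not quantifiable"
    return "<10^6 geq/mL" if concentration < cutoff else ">=10^6 geq/mL"

# Hypothetical G. vaginalis concentrations (geq/mL) for four women
raw = [None, 2.5e4, 3.0e7, 40.0]
lower_limit = 100.0  # hypothetical lower limit of detection
continuous = [log10_load(c, lower_limit) for c in raw]
categorical = [load_category(c, lower_limit) for c in raw]
```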

The dimensionality reduction techniques described previously for NGS data can also be applied to qPCR data, reducing multiple taxa concentrations to a few variables. For example, Jespers et al. [72] conducted PCA on log10-transformed qPCR data of L. crispatus, L. iners, L. jensenii, L. gasseri, L. vaginalis, A. vaginae, and G. vaginalis (Appendix: Fig. 4), and found that the first and second PC explained 58% of the variance, with the first PC describing a gradient from a Lactobacillus-dominated VMB to a G. vaginalis- and A. vaginae-dominated VMB and the second PC the concentration of L. crispatus versus L. iners. This study also tested various combinations of individual taxa concentrations in their ability to diagnose BV (using receiver operating characteristics analyses) and concluded that a combination of Lactobacillus genus, G. vaginalis, and A. vaginae qPCR performed best with a sensitivity of 93.4% and a specificity of 83.6% compared with a Nugent score of 7–10.
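Sensitivity and specificity against a Nugent score of 7–10 as the reference standard are computed from a two-by-two table. The sketch below uses entirely hypothetical data and does not reproduce the cited study's results or its qPCR decision rule.

```python
def sensitivity_specificity(test_positive, reference_positive):
    """Sensitivity and specificity of a binary test against a reference standard."""
    pairs = list(zip(test_positive, reference_positive))
    tp = sum(1 for t, r in pairs if t and r)
    fn = sum(1 for t, r in pairs if not t and r)
    tn = sum(1 for t, r in pairs if not t and not r)
    fp = sum(1 for t, r in pairs if t and not r)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical Nugent scores and qPCR-based classifications for ten women
nugent = [8, 9, 7, 10, 2, 0, 3, 5, 1, 4]
qpcr_positive = [True, True, False, True, False, False, True, False, False, False]
reference = [score >= 7 for score in nugent]  # Nugent 7-10 defines BV
sens, spec = sensitivity_specificity(qpcr_positive, reference)
```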

In recent years, NGS and qPCR are increasingly being used in combination. These studies typically use NGS first to identify the taxa that are most important in the context of the research question, followed by qPCR assays of those taxa of interest to quantify their relative importance or evaluate the quantitative effects of interventions (see e.g., [73] for use of this strategy to study cervicitis). qPCR concentrations can also be correlated with results from -omics methods, such as proteomics and metabolomics. For example, Srinivasan et al. [70] recently correlated the concentrations of key vaginal bacteria with the 30% most variable metabolites in their vaginal samples using Pearson correlation coefficients.

Conclusions

NGS and (multiplex) qPCR technologies are now widely available and affordable, and research incorporating these methods is growing exponentially. However, in the VMB field, most research to date is still laboratory and bioinformatics focused and does not include rigorous epidemiologic methods such as adequate statistical power to answer clinically relevant research questions, appropriate sampling methods, study designs that minimize biases, and statistical models that address confounding. We

Acknowledgments

The authors thank Hanneke Borgdorff for critically reviewing the manuscript and providing figures and Jacques Ravel for providing figures.

References (75)

  • G.M. Weinstock. Genomic approaches to studying the human microbiota. Nature (2012)
  • M. Ronaghi et al. A sequencing method based on real-time pyrophosphate. Science (1998)
  • M.A. Quail et al. A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers. BMC Genomics (2012)
  • J.P. Brooks et al. The truth about metagenomics: quantifying and counteracting bias in 16S rRNA studies. BMC Microbiol (2015)
  • S.J. Salter et al. Reagent and laboratory contamination can critically impact sequence-based microbiome analyses. BMC Biol (2014)
  • J.D. Thompson et al. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res (1997)
  • J.G. Caporaso et al. QIIME allows analysis of high-throughput community sequencing data. Nat Methods (2010)
  • J.R. Cole et al. Ribosomal Database Project: data and tools for high throughput rRNA analysis. Nucleic Acids Res (2014)
  • D. McDonald et al. An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea. ISME J (2012)
  • D.A. Benson et al. GenBank. Nucleic Acids Res (2013)
  • S. Srinivasan et al. Bacterial communities in women with bacterial vaginosis: high resolution phylogenetic analyses reveal relationships of microbiota to clinical criteria. PLoS One (2012)
  • Z. Yang et al. Molecular phylogenetics: principles and practice. Nat Rev Genet (2012)
  • N.J. Gotelli et al. Quantifying biodiversity: procedures and pitfalls in the measurement and comparison of species richness. Ecol Lett (2001)
  • J. Ravel et al. Vaginal microbiome of reproductive-age women. Proc Natl Acad Sci U S A (2011)
  • P.J. McMurdie et al. Waste not, want not: why rarefying microbiome data is inadmissible. PLoS Comput Biol (2014)
  • C. Lozupone et al. UniFrac: a new phylogenetic method for comparing microbial communities. Appl Environ Microbiol (2005)
  • R. Hummelen et al. Deep sequencing of the vaginal microbiota of women with HIV. PLoS One (2010)
  • R. Hummelen et al. Vaginal microbiome and epithelial gene array in post-menopausal women with moderate to severe dryness. PLoS One (2011)
  • C.A. Muzny et al. Characterization of the vaginal microbiota among sexual risk behavior groups of women with bacterial vaginosis. PLoS One (2013)
  • H. Borgdorff et al. Lactobacillus-dominated cervicovaginal microbiota associated with reduced HIV/STI prevalence and genital HIV viral load in African women. ISME J (2014)
  • H. Borgdorff et al. Cervicovaginal dysbiosis is associated with proteome changes related to immune activation and cell death. Mucosal Immunol (2015)
  • M. Dufrêne et al. Species assemblages and indicator species: the need for a flexible asymmetrical approach. Ecol Monogr (1997)
  • R.J. Hickey et al. Vaginal microbiota of adolescent girls prior to the onset of menarche resemble those of reproductive-age women. MBio (2015)
  • P.D. Schloss et al. Integration of microbial ecology and statistics: a test to compare gene libraries. Appl Environ Microbiol (2004)
  • N. Segata et al. Metagenomic biomarker discovery and explanation. Genome Biol (2011)
  • A.D. Fernandes et al. ANOVA-like differential expression (ALDEx) analysis for mixed population RNA-Seq. PLoS One (2013)
  • T.K. Kim et al. Heterogeneity of vaginal microbial communities within individuals. J Clin Microbiol (2009)