Objectives To systematically review the evidence for the impact of study design and setting on the interpretation of tuberculosis (TB) transmission using clustering derived from Mycobacterial Interspersed Repetitive Units-Variable Number Tandem Repeats (MIRU-VNTR) strain typing.
Data sources MEDLINE, EMBASE, CINAHL, Web of Science and Scopus were searched for articles published before 21 October 2014.
Review methods Studies in humans that reported the proportion of clustering of TB isolates by MIRU-VNTR were included in the analysis. Univariable meta-regression analyses were conducted to assess the influence of study design and setting on the proportion of clustering.
Results The search identified 27 eligible articles reporting clustering between 0% and 63%. The number of MIRU-VNTR loci typed, requiring consent to type patient isolates (as a proxy for sampling fraction), the TB incidence and the maximum cluster size explained 14%, 14%, 27% and 48% of between-study variation, respectively, and had a significant association with the proportion of clustering.
Conclusions Although MIRU-VNTR typing is being adopted worldwide there is a paucity of data on how study design and setting may influence estimates of clustering. We have highlighted study design variables for consideration in the design and interpretation of future studies.
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Strengths and limitations of this study
This is a timely evaluation of the impact of study design on estimates of tuberculosis clustering using Mycobacterial Interspersed Repetitive Units-Variable Number Tandem Repeats strain typing because it has been incorporated into national typing services globally.
The strength of this meta-analysis was limited by the lack of detail reported by the included studies, highlighting the need for better quality reporting in primary studies.
We have shown that the proportion of clustering derived from MIRU-VNTR typing is influenced by the number of loci typed, whether consent is required to type isolates, TB incidence in the study setting, and the maximum cluster size, highlighting these as important considerations in the design and interpretation of future studies.
The introduction of molecular typing methods has improved our understanding of Mycobacterium tuberculosis (TB) transmission and has changed local and national control policies.1–5 The proportion of cases that are clustered is often used to estimate the amount of ongoing transmission within the population, based on the assumption that cases with indistinguishable strain types are part of a chain of transmission. TB molecular typing methodology is changing rapidly and it is important that we better understand how to interpret the outputs and thus act.
TB molecular typing methods include Spoligotyping,6 insertion sequence 6110 (IS6110) restriction fragment length polymorphism (RFLP) analysis (the recent gold standard),7 Mycobacterial Interspersed Repetitive Units-Variable Number Tandem Repeats (MIRU-VNTR) typing,8 and whole genome sequencing.9–11 Published reviews have identified factors that might influence or bias clustering by IS6110 RFLP.12 ,13 No study has repeated this analysis using more up-to-date typing methods, which is important for understanding of the epidemiology of TB and to shape the application of molecular typing to improve TB control.
Published meta-analyses and modelling studies using IS6110 RFLP data show that the proportion of clustering observed can be affected by (1) study design (affecting the proportion of eligible cases that are included in the study); (2) features of the typing method (such as the ability to type isolates with low copy numbers); and (3) study setting (such as characteristics of the study population). For example, the proportion of clustering increases when the fraction of the total data sampled increases13–15 and when study duration increases.16
MIRU-VNTR is currently the preferred method of molecular typing,17–21 and can be used together with Spoligotyping.8 Relative to IS6110 RFLP, MIRU-VNTR does not have to exclude isolates with a low IS6110 copy number, has a faster turnaround time, is high throughput and yields numeric strain types that are more easily compared. MIRU-VNTR strain typing is increasingly being adopted worldwide,1 ,22–27 yet unlike IS6110 RFLP, the evidence for the interpretation of the findings, such as the impact of study design and setting on clustering, has not been reviewed. Although the two typing methods have been shown to have a similar discriminatory value, the markers evolve independently and at different rates, resulting in a difference in clustering between the two methods.28 This suggests that there could be differences in the way study design, typing method and setting affect clustering by the two methods. We conducted a systematic review to assess the evidence for the impact of study design and setting on the interpretation of TB transmission using clustering derived from MIRU-VNTR strain typing, as has been shown using IS6110 RFLP typing.
Five electronic databases were searched (EMBASE, ISI Web of Science, CINAHL, Scopus and Medline (Ovid)) up to 20 October 2014. The search strategy combined the following terms with Boolean operators: Tuberculosis, strain typing, and transmission (see online supplementary appendix 1). The search was limited to studies using the standard MIRU-VNTR method,8 in humans only, and in English.
All titles and abstracts from each of the searches were examined. The full text of each paper was obtained and reviewed if the study reported MIRU-VNTR strain typing of M. tuberculosis complex isolates with at least 15 of the standardised 24 loci (Exact Tandem Repeat A, B, C, D, E; MIRU 2, 10, 16, 20, 23, 24, 26, 27, 39, 40; VNTR 424, 1955, 2163b, 2347, 2401, 3171, 3690, 4052, 4156).8 ,29 ,30
Studies using fewer than 15 loci were not included because the level of discrimination is inadequate for epidemiological use (n=121).8 Studies that used loci different to the standardised 15 and 24 set were not included in the analysis in order to reduce the heterogeneity between studies (n=19). All publication types were included in this first screen to ensure that no relevant data were missed.
Reviews, letters, editorials, outbreaks or case reports (n=103) were excluded in the second screen. Studies that used incomplete sampling (eg, random samples, studies using subsets of populations such as multidrug-resistant patients; n=47) and studies that had a sample size of less than 50 (n=4) were also excluded.
A reviewer (JM) extracted the following data items from all included studies using a form developed in Excel (Microsoft 2010): publication details (year, authors, study country), study details (study duration, loci typed, secondary typing method, study population, whether participant consent was required (a characteristic of the study design that was used as proxy for sampling fraction, assuming that where consent was required the sampling fraction was low)), the number of clustered and unique isolates and the covariates of interest: the maximum size of clusters; the proportion of clusters containing two cases; the proportion of the population that was culture positive; the proportion of culture positive isolates typed; risk factors for clustering; and the Hunter-Gaston Discriminatory Index (HGDI).31 IA extracted data from 10% of the papers for external validity; disagreements were discussed and a consensus was agreed.
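The HGDI extracted from each study summarises how well a typing scheme separates isolates. As a minimal sketch, assuming the standard Hunter-Gaston formula D = 1 − Σ n_j(n_j − 1) / (N(N − 1)), where n_j is the number of isolates with strain type j and N is the total number of isolates, it could be computed as follows. The function name and the example labels are illustrative and are not taken from the included studies.

```python
from collections import Counter


def hunter_gaston_index(strain_types):
    """Return the Hunter-Gaston discriminatory index for a list of strain type labels.

    The index is the probability that two isolates drawn at random,
    without replacement, have different strain types.
    """
    n = len(strain_types)
    if n < 2:
        raise ValueError("at least two isolates are needed")
    counts = Counter(strain_types).values()
    # Number of ordered pairs of isolates that share a strain type.
    same_type_pairs = sum(c * (c - 1) for c in counts)
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical example: six isolates falling into three strain types.
example = ["A", "A", "A", "B", "B", "C"]
print(round(hunter_gaston_index(example), 3))
```

A value close to 1 means almost every isolate has its own strain type; a value of 0 means all isolates share one type.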
The main outcome measure, the proportion of TB isolates clustered by MIRU-VNTR strain typing, was calculated as the number of clustered isolates divided by the total number of typed isolates (clustered plus unique). Where there were uncertainties JM consulted with IA.
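The outcome calculation can be sketched as follows, assuming that each isolate is represented by a tuple of repeat counts and that isolates with identical tuples form a cluster. The function name, the returned keys and the three-locus example profiles are hypothetical and are used only for illustration.

```python
from collections import Counter


def proportion_clustered(profiles):
    """Summarise clustering for a list of strain type profiles (one per isolate).

    An isolate is 'clustered' if at least one other isolate has an identical
    profile, and 'unique' otherwise.
    """
    counts = Counter(profiles)
    clustered = sum(c for c in counts.values() if c >= 2)
    unique = sum(1 for c in counts.values() if c == 1)
    total = clustered + unique
    if total == 0:
        raise ValueError("no isolates supplied")
    cluster_sizes = [c for c in counts.values() if c >= 2]
    return {
        "clustered": clustered,
        "unique": unique,
        "proportion": clustered / total,
        "max_cluster_size": max(cluster_sizes, default=0),
    }


# Hypothetical profiles truncated to three loci for readability.
profiles = [(2, 3, 4), (2, 3, 4), (2, 3, 5), (1, 3, 4), (1, 3, 4), (1, 3, 4)]
summary = proportion_clustered(profiles)
print(summary["clustered"], summary["unique"], summary["max_cluster_size"])
```

The same pass over the data also yields the maximum cluster size, one of the covariates examined in this review.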
Authors were contacted if TB incidence rate was not reported. Where no response was received WHO country estimates of TB incidence for the study year were used.32 As so few studies reported the proportion coinfected with TB/HIV, these estimates for the study country were taken from a European Union-wide survey and WHO country profiles.33 ,34 Owing to poor recording of the sampling fraction (the number of isolates typed/the total number of culture positive TB cases diagnosed during the study period (n=19)), whether the study required the consent of participants (yes/no) was included as a proxy for (low/high) sampling fraction. The risk of bias within each study was assessed using the STROME-ID checklist.35
Data were analysed in Stata V.12. Where studies reported data from more than one set of loci, the method with the highest discriminatory value was included (ie, MIRU-VNTR 24 would be chosen over MIRU-VNTR 15, and MIRU-VNTR 15 plus Spoligotyping would be chosen over MIRU-VNTR 15 alone; n=8). This review was not concerned with summary measures of clustering, but with factors that influenced clustering; therefore articles must have included at least one of the covariates. Continuous variables were transformed where the distribution was skewed. The proportion clustered was transformed using the Freeman-Tukey transformation.36 Study heterogeneity was assessed using a forest plot and the χ² test of heterogeneity. Univariable meta-regression analyses were carried out to determine the effect of the study design covariates on the proportion of clustered isolates. All covariates in the analysis were hypothesised a priori to influence the proportion clustered.
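The transformation and heterogeneity test can be illustrated with a short sketch. It assumes the double arcsine form of the Freeman-Tukey transformation, t = arcsin√(x/(n+1)) + arcsin√((x+1)/(n+1)) with approximate variance 1/(n+0.5), and Cochran's Q as the χ² heterogeneity statistic. The text does not state which variant was used in Stata, so this is not a reproduction of the original analysis, and the study counts below are hypothetical.

```python
import math


def freeman_tukey(clustered, total):
    """Freeman-Tukey double arcsine transform of a proportion clustered/total.

    Returns the transformed value and its approximate variance. The transform
    stays well defined when the observed proportion is exactly 0 or 1.
    """
    t = (math.asin(math.sqrt(clustered / (total + 1)))
         + math.asin(math.sqrt((clustered + 1) / (total + 1))))
    variance = 1.0 / (total + 0.5)
    return t, variance


def cochran_q(effects, variances):
    """Cochran's Q statistic and its degrees of freedom (k - 1).

    Under homogeneity, Q approximately follows a chi-squared distribution
    with k - 1 degrees of freedom.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, len(effects) - 1


# Hypothetical studies as (clustered isolates, total isolates typed).
studies = [(0, 50), (120, 400), (310, 500)]
transformed = [freeman_tukey(x, n) for x, n in studies]
q, df = cochran_q([t for t, _ in transformed], [v for _, v in transformed])
print(f"Q = {q:.1f} on {df} degrees of freedom")
```

A Q value that is large relative to its degrees of freedom indicates more between-study variation than sampling error alone would produce, which is the situation in which meta-regression on study-level covariates is informative.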
Sensitivity analyses were conducted to see the effect of removing studies reporting 0% clustering, with only extrapulmonary TB cases, only Mycobacterium bovis cases, studies using the ‘old 12’ MIRU loci as part of their 15 loci, and studies assessed as having a high likelihood of bias (STROME-ID score less than 20).
The search identified 7274 references resulting in 27 studies (25 journal articles and 2 conference abstracts) included after deduplication and title/abstract/full text screening (figure 1). The main characteristics of the included studies are shown in table 1. Studies were published between 2007 and 2014 and the clustering reported varied from 0%37 to 62.8%.38 In all studies, clustered isolates were defined as having identical strain types based on the MIRU-VNTR loci typed, with or without Spoligotyping. Seventeen studies included isolates from newly diagnosed TB cases, three studies reported including isolates from new and chronic cases of TB, and seven did not report this information. In addition, 10 studies did not include repeat isolates from the same patient, one study included a repeat isolate from one patient and the remaining 17 did not report whether repeat isolates were included or not. Furthermore, four studies included isolates with missing loci in the cluster analysis, whereas four excluded isolates with missing loci and the remaining 20 did not report how they dealt with missing loci. The number of studies reporting each variable of interest is shown in table 2. STROME-ID scores can be found in online supplementary appendix 2.
A forest plot shows the spread of clustering reported by number of loci and additional typing method (figure 2). Significant heterogeneity was identified between the studies (p<0.001), suggesting that a metaregression would be an appropriate analysis.
The univariable metaregression shows evidence for the proportion of clustering to decrease as the number of MIRU-VNTR loci typed increased from 15 to 24 (p=0.04; table 3), accounting for 14% of the between-study variation, and to increase when the study participants consented to being included in the study (p=0.03), accounting for 14% of the between-study variation. The proportion of clustering increased as the TB incidence in the population increased (p=0.007, adjusted R²=26.7%). There was also evidence for the proportion of clustering to increase as the maximum cluster size increased (p=0.001), accounting for 48% of between-study variation. There was no evidence of the other study design or study setting variables significantly influencing the proportion clustered. Though non-significant (p>0.05), the TB/HIV coinfection rate in the population explained 2% of the between-study variation. Too few studies included information on the proportion of clusters containing two cases, or on the proportion of the study sample with previous TB or with pulmonary TB, so these could not be included in the analysis (table 2).
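The percentages of between-study variation quoted above are adjusted R² values. As a minimal sketch, assuming the usual meta-regression definition in which adjusted R² is the relative reduction in the between-study variance (τ²) when a covariate is added to the model, the calculation is shown below. The function name and the τ² values are hypothetical and are not estimates from this review.

```python
def between_study_variance_explained(tau2_null, tau2_model):
    """Percentage of between-study variance explained by a covariate.

    tau2_null:  between-study variance from the model with no covariates.
    tau2_model: between-study variance from the model including the covariate.
    The result can be negative if the covariate model has a larger tau2.
    """
    if tau2_null <= 0:
        raise ValueError("the null model shows no between-study variance to explain")
    return 100.0 * (tau2_null - tau2_model) / tau2_null


# Hypothetical variances: the covariate roughly halves tau2.
print(round(between_study_variance_explained(0.050, 0.026), 1))
```

Because each univariable model is compared with the same covariate-free model, the percentages for different covariates are not additive.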
Sensitivity analyses to examine the effect of excluding studies reporting 0% clustering,37 only M. bovis cases,39 studies using the ‘old 12’ MIRU loci,39–44 and studies assessed as having a high risk of bias,37 ,38 ,45–47 did not generally change the results. The proportion of culture-positive TB in the population remained non-significant but explained 2.6% of the between-study variation when excluding 0% clustering (p=0.278 and adjusted R²=2.62%). Similarly, the proportion of culture-positive TB in the population remained non-significant but explained 2.6% of the between-study variation when excluding studies with the highest risk of bias (p=0.278 and adjusted R²=2.62%). The number of loci typed became non-significant, but explained 9.6% and 10.5% of the between-study variation when excluding studies using the ‘old 12’ loci and the highest risk of bias, respectively (p=0.106, adjusted R²=9.63%; p=0.111, adjusted R²=10.51%, respectively).
This review identified 27 studies that met the inclusion criteria. We illustrate that the interpretation of studies using MIRU-VNTR to estimate clustering is subject to bias relating to study design and setting; however, there were insufficient data available to fully explore this impact.
As expected, we found that the proportion of clustering decreased with a greater number of MIRU-VNTR loci typed, and increased with increasing TB incidence and with increasing maximum cluster size. We found that requiring consent to type patient isolates increased the proportion of clustering, which was not expected, given that the sampling fraction would be lower in these studies.
The other study design variables included in this analysis, such as study duration, did not significantly influence the proportion of isolates that were clustered, contrary to previous findings.12 This is likely to be because of a lack of good quality evidence: of the 27 studies that met the inclusion criteria for the review, none reported all the variables of interest, reducing the power of the analysis and precluding multivariable metaregression (table 2). Importantly, key details of cluster analyses were not reported consistently across the studies, such as whether repeat isolates from the same patients were included, or typing profiles with missing loci were included, introducing new, unmeasured biases. In addition, the range of the variables may have been too limited to show any impact on clustering estimates. For example, the proportion of culture-positive isolates typed ranged from 34.5% to 100%, with 17 of the 19 studies reporting this variable from 81.9% to 100%. Furthermore, most of the studies (17/27=63%) were from low TB burden settings and therefore may be reflecting the rate at which imported cases have matching strain types by chance, rather than rates of recent transmission.
The sensitivity analysis suggested that, when excluding the studies with the greatest risk of bias, the culture-positivity in the population might explain a small amount of the between-study variation. This is consistent with estimates of the influence of sampling on the proportion of clustering using IS6110 RFLP typing.48 In the sensitivity analysis excluding studies that used the ‘old 12’ loci, the effect of the number of loci typed becomes non-significant. This is likely because studies using the ‘old 12’ accounted for six out of 10 studies reporting 15 loci, reducing the number of studies and the power of the model.
This study is a timely evaluation of the impact of study design on estimates of TB clustering using MIRU-VNTR strain typing because it has been incorporated into national typing services globally.23 ,49 The findings are relevant where strain typing is used to evaluate TB control systems across different settings because the proportion of clustering is influenced by the number of loci typed, the TB incidence and the maximum cluster size. Given that strain typing methods are advancing beyond MIRU-VNTR typing and that the application of whole genome sequencing to TB control and public health strategies has been demonstrated,9–11 ,50 it is important that the biases in the analysis of such methods are explored and compared. Understanding how to design and compare research studies for public health will greatly improve the benefit gained from newer technologies.
The strength of this meta-analysis was limited by the lack of detail reported by the included studies. This review has highlighted the need for better quality reporting in primary studies to enable future reviews to be more robust. Recently published standards for reporting of molecular epidemiology for infectious diseases should improve the quality of reporting.35 This review is further limited by our inability to access 58 of the title/abstract screened articles for full text screening.
The use of TB strain typing as a public health tool in TB control programmes is increasing globally. We have identified a lack of good quality studies that can contribute to our understanding in interpreting the molecular typing of TB. We have also shown that the proportion of clustering derived from MIRU-VNTR typing is influenced by the number of loci typed, whether consent is required to type isolates, TB incidence in the study setting and the maximum cluster size, highlighting these as important considerations in the design and interpretation of future studies.
The authors would like to acknowledge Ross Harris from the Statistics Unit at Public Health England for his advice on metaregression.
Contributors JM drafted the article and PS, IA, TDM and TC revised it critically for important intellectual content. All authors made substantial contributions to the conception and design of the review, and the analysis and interpretation of data. All authors approved the final version for publication.
Funding JM is funded through a Public Health England and University College London Impact Studentship. IA is funded through an NIHR Senior Research Fellowship.
Competing interests None.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional data are available.