There is considerable ongoing debate about the practical usefulness of a priori power calculations, with some suggesting that “underpowered” studies are not unethical and that a little scientific projection would still be better than no projection at all [1-4]. Some authors argue that “being underpowered is unethical” is a “widespread misconception which is only plausible when presented in vague, qualitative terms but does not hold when examined in detail” [1, 2]. Further review of the arguments reveals that the crucial assumptions implied in the reasoning do not reflect actual scientific practice. The main theoretical arguments assume a perfect “frequentist world” in which one big trial may be substituted by a corresponding number of small trials that, once aggregated in a formal evidence synthesis (i.e. a meta-analysis), would accumulate the same information as the big one [2, 4]. If the individual studies are non-representative samples of the target population, however, the practical value of estimating a pooled effect that is a weighted average of potentially disparate effects in different subpopulations is questionable.
A widely considered answer to the threat of effect heterogeneity in meta-analyses is the random-effects confidence interval, which is often assumed to reflect variation in the effects across subpopulations better than the fixed-effects confidence interval. However, while such intervals offer a valid solution to inference regarding the average effect across all contributing effects, they continue to suffer from the principal limitations of effect estimates that are based on non-representative samples: the location and width of these confidence intervals will ultimately depend on the representation of subpopulations and therefore on the selection mechanisms inherent to the data.
While these are well-known and widely debated limitations of most sample-based research studies, another fundamental interpretational issue applies to confidence intervals: they refer to the mean (or average) effect across subpopulations. In the context of meta-analyses, with a large enough number of studies, neither random-effects nor fixed-effects confidence intervals will cover the actual range of observed study-specific effect estimates. In other words, the intervals provide a precise estimate of a parameter that does not actually exist, as it represents a weighted average of an underlying set of parameters in homogeneous subpopulations.
In a recent plea for routinely presenting prediction intervals in meta-analysis, IntHout et al. promote reporting prediction intervals in addition to confidence intervals [5]. Prediction intervals reflect the variation in treatment or exposure effects over different settings, and allow inference about the effect to be expected in future individuals, such as a patient whom a clinician intends to treat. In contrast to confidence intervals, prediction intervals do not shrink to zero width as the sample size increases but cover a prespecified range of expected effects in the underlying population. The authors conclude that prediction intervals should be routinely reported to allow for more informative inferences in meta-analyses.
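The contrast between the two kinds of interval can be made concrete with a minimal sketch. It assumes a DerSimonian-Laird estimate of the between-study variance and the commonly used approximate prediction interval (pooled estimate ± t × sqrt(tau² + SE²)). The function name, the study data and the critical values are hypothetical choices made for illustration; they are not taken from IntHout et al.

```python
import math


def random_effects_intervals(effects, variances, z_crit=1.96, t_crit=None):
    """Return (pooled, ci, pi, tau2) for a random-effects meta-analysis.

    effects   : study-specific effect estimates (e.g. log odds ratios)
    variances : their within-study variances
    z_crit    : normal critical value used for the confidence interval
    t_crit    : critical value for the prediction interval; the usual choice
                is the t quantile with k - 2 degrees of freedom, supplied by
                the caller (assumption: falls back to z_crit when omitted)
    """
    k = len(effects)
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # DerSimonian-Laird moment estimator of the between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects pooled estimate and its standard error.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    ci = (pooled - z_crit * se, pooled + z_crit * se)

    # The prediction interval adds the between-study variance to the
    # uncertainty of the mean, so it does not collapse as k grows.
    if t_crit is None:
        t_crit = z_crit
    half = t_crit * math.sqrt(tau2 + se ** 2)
    pi = (pooled - half, pooled + half)
    return pooled, ci, pi, tau2


if __name__ == "__main__":
    # Hypothetical heterogeneous studies, all with the same precision.
    effects = [0.10, 0.30, 0.50, 0.70, 0.90]
    variances = [0.01] * 5
    # 3.182 is the 97.5% quantile of the t distribution with 3 df.
    print(random_effects_intervals(effects, variances, t_crit=3.182))
    # Twenty times as many studies: the CI narrows, the PI barely moves.
    print(random_effects_intervals(effects * 20, variances * 20))
```

In this sketch the confidence interval narrows as studies accumulate, whereas the prediction interval stays wide because its width is driven by the between-study variance.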
We suggest that prediction intervals are not only meaningful in the context of meta-analyses but, as implied by the generally applicable concept of variance decomposition [6], may be relevant in a very similar way to the reporting of individual studies or trials.
The interest in subgroup analyses in individual studies is often not properly addressed at the analysis stage due to a general claim of “lack of power” that would arise in stratified analyses or modelling approaches including interaction terms. As a result, a single point estimate is often reported along with a confidence interval that implies homogeneity of the effect across all known subgroups. Such subgroups do, in our view, constitute subpopulations similar to the subpopulations (studies) in meta-analyses. We therefore ask: why should we not consider reporting prediction intervals for single-study effect estimates based on pre-specified subgroups, such as strata used for randomization or purposive sampling in the context of clinical trials?
1. Bacchetti P, McCulloch CE, Segal MR. Being ‘underpowered’ does not make a study unethical. Statistics in Medicine 2011; 30:2785–2792.
2. Bacchetti P, Wolf LE, Segal MR, McCulloch CE. Ethics and sample size. American Journal of Epidemiology 2005; 161:105–110.
3. Bacchetti P, McCulloch CE, Segal MR. Simple, defensible sample sizes based on cost efficiency. Biometrics 2008; 64:577–585.
4. Edwards SJL, Lilford RJ, Braunholtz D, Jackson J. Why “underpowered” trials are not necessarily unethical. Lancet 1997; 350:804–807.
5. IntHout J, Ioannidis JP, Rovers MM, Goeman JJ. Plea for routinely presenting prediction intervals in meta-analysis. BMJ open. 2016 Jul 1;6(7):e010247.
6. Weiss NA. A course in probability. Addison-Wesley; 2006.
As postpartum hemorrhage (PPH) researchers, and leaders in education and care of maternal health emergencies, from the United States, UK, Canada, India, Peru, Honduras, Zambia, Kenya, Tanzania, Colombia and Nepal, we read the paper by Dumont et al. with great interest. We would like to share our review:
The most fundamental flaw of this paper is that the authors confuse an intention-to-treat study of a clinical pathway of interventions and behaviors with the efficacy of a device. These are two very different research questions. In order to test the latter via a randomized controlled trial (RCT), the two groups would need to be similar, and subjects who did not even receive the device (or received it in desperation two hours after the diagnosis of uncontrolled PPH) certainly could not be included in the intervention group. Thus, this study attempts to test intention-to-treat, not the efficacy of the uterine balloon tamponade (UBT) device.
The second most obvious flaw is that degrees of illness are not accounted for. Clinically defined "uncontrolled PPH" is in no way a homogeneous group. For example, someone who has been referred in and is moribund from advanced shock is an entirely different subject from someone who has mild uncontrolled PPH. Since this is not controlled for, these two groups are likely incomparable.
Even taking into account the two issues described above, the two groups are different and heavily favor the non-intervention group. For example, in the intervention group the following were considerably worse than in the non-intervention group: late uterotonics (54% vs 37%) and retained products of conception/placenta (19% vs 10%). Additionally, UBTs were placed more than 30 minutes after diagnosis of uncontrolled PPH in 58% of cases.
Therefore, while this study truly does not test the UBT device, what it does do is tell us that the care providers were not able to provide quality care to women defined as having uncontrolled PPH, despite being within the framework of a study that encouraged best practice. This is indeed extremely important. This study adds to the growing literature describing that performance of health care providers may often be inconsistent and suboptimal in maternal health emergencies, and that these poor practices may contribute to the flawed nature of RCTs in maternal health interventions.
Prof Burke, Prof Arulkumaran, Prof. Rogo, Dr. Manasyn, Susana Ku, RMW, Monica Oguttu, RNMW, Dr. Thapa, Prof Ochoa, Prof. Tarimo, Dr. Eckardt, Dr. Suarez and Dr. Garg,
We have read the study conducted by Diniz et al. on the possible association between mammography and breast cancer-related mortality in the state of São Paulo, Brazil with the greatest of care. Despite the detailed statistical analysis, the ecological study design implies limitations to the hypothesis generated, as pointed out by the authors themselves (1). In our opinion, both the authors’ main conclusion and the assumed association of cause and effect are inappropriate.
The factors associated with the incidence of breast cancer in Brazil and its resulting mortality have recently been evaluated in different studies (2-4). Mortality rates have been found to vary as a function of geospatial location (rural areas versus urban centers)(4). In addition, the reduction encountered in mortality was associated with the regions in which the human development index (HDI) was higher. On the other hand, the highest mortality rates have been found to occur in the states with the highest HDI (5). Diniz et al. and many other investigators have mentioned that a higher incidence of breast cancer occurs among more affluent women living in urban areas and in large cities (1,5). In this respect, we are certain that mortality is also related to the incidence of the disease; hence, the higher the incidence, the greater the resulting mortality will be. Conversely, women who do not have breast cancer will obviously not die from the disease.
Therefore, we believe that the study conducted by Diniz et al. should be interpreted in another manner. According to their report, breast cancer-related mortality was associated with three characteristics of the more affluent female population in Brazil: healthcare within the private healthcare system, nulliparity, and access to mammography (1). Nevertheless, the women with greatest access to mammography are those with better socioeconomic conditions living in urban areas and for this reason are more likely to develop breast cancer and, consequently, to die from this disease.
The authors take advantage of the results obtained to criticize mammography screening. In fact, screening programs must reach at least 70% of the target population to be considered effective. In 2016, the coverage provided by mammography screening within the Brazilian National Health Service (Sistema Único de Saúde - SUS) reached only 24% of the female population of 50 to 69 years of age. This percentage almost doubles within the supplementary healthcare system (6,7), and that fact also permits a critical analysis to be made in relation to the results obtained by Diniz et al. (1). In other population-based studies, improved access to healthcare and an increase in the number of mammograms performed was found to be associated with a considerable reduction in cases of breast cancer diagnosed at an advanced stage in some regions (7-9). When well conducted, mammography screening can reduce mortality by approximately 30% (10).
In terms of radiological protection, mammography screening is justifiable if the imaging procedures performed are of excellent quality and conducted using the lowest possible dose of radiation (11,12). Diniz et al. concluded that the number of cases of radiation-induced cancer caused by the ionizing radiation from imaging could be increasing the mortality rate (1). This association is wrong, since it would be necessary to take the dose at which the imaging procedure is performed into consideration, together with other factors, to enable the number of cases of radiation-induced cancer in a population to be calculated (13). Corrêa et al. found that the number of imaging procedures performed was not associated with the number of breast cancer deaths induced by women’s exposure to radiation and concluded that the final magnitude of the risk was exclusively determined by the dose of radiation emitted by the mammography equipment (13).
In the study conducted by Corrêa et al., the likelihood of developing a radiation-induced cancer as a result of mammography was found to be 0.2 per 100,000 women. Furthermore, the probability of death at 85 years of age as a result of the accumulated radiation dose to which a woman would be exposed at imaging procedures conducted every two years was estimated at 0.18 per 100,000 women annually. Therefore, the risk of dying from radiation-induced cancer is much lower than the risk associated with other factors that could result in death, including not undergoing imaging for the early detection of breast cancer and thus being diagnosed at an advanced stage (13,14). In a recent study, women receiving care within the private healthcare system were found to be half as likely to have advanced lesions at diagnosis, reflected in an increase in survival of approximately 10% in relation to the women treated within the Brazilian National Healthcare Service (SUS) (15). These data also conflict with those reported by Diniz et al. (1).
In the statistical analysis, Diniz et al. selected and analyzed a substantial set of explanatory variables. They also performed multiple regression analyses and applied the Akaike Information Criterion (AIC), which identifies the best-fitting model among the candidates considered, to the set of resulting models (1). Nevertheless, the AIC ranks models only relative to one another; it does not indicate whether the best-fit model found is actually satisfactory, bearing in mind that all the models could be inadequate. The final fitted model used by the authors contains the following variables: the mammography ratio, the percentage of women with private healthcare and the proportion of women of childbearing age who did not have children. Therefore, many important variables were largely ignored in the analysis, including the gross domestic product, human development index, and income, among others.
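The point about the AIC can be illustrated with a small sketch. It assumes Gaussian least-squares models, for which the AIC can be written (up to an additive constant) as n·ln(RSS/n) + 2k. The data and the two candidate models are entirely hypothetical and bear no relation to the variables analysed by Diniz et al.

```python
import math


def line_rss(x, y):
    """Residual sum of squares of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def total_ss(y):
    """Sum of squares around the mean (RSS of the intercept-only model)."""
    my = sum(y) / len(y)
    return sum((yi - my) ** 2 for yi in y)


def gaussian_aic(rss, n, n_params):
    """AIC for a Gaussian model, up to a constant shared by all candidates."""
    return n * math.log(rss / n) + 2 * n_params


def r_squared(rss, y):
    """Share of the outcome variance explained by a model with this RSS."""
    return 1.0 - rss / total_ss(y)


# Hypothetical data: the outcome is essentially unrelated to x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [5, 1, 6, 2, 7, 1, 6, 3]
n = len(y)

# Each candidate: (RSS, number of parameters including the error variance).
candidates = {
    "intercept only": (total_ss(y), 2),
    "linear in x": (line_rss(x, y), 3),
}
aic = {name: gaussian_aic(rss, n, k) for name, (rss, k) in candidates.items()}
best = min(aic, key=aic.get)
fit = {name: r_squared(rss, y) for name, (rss, _) in candidates.items()}

if __name__ == "__main__":
    print("AIC by model:", aic)
    print("selected:", best, "R^2 of selected model:", fit[best])
```

Here the criterion dutifully selects a "best" model even though neither candidate explains any meaningful share of the variation, which is why an absolute measure of fit needs to be reported alongside it.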
Figure 1 in the paper merits further scrutiny, together with charts on the human development index, gross domestic product and income. Despite the availability of computer software programs such as Geographically Weighted Regression (GWR) (16), the authors elected to use the Global Moran’s Index, even in the presence of spatial clusters from the municipalities with higher and lower mortality rates as shown in Figure 1. Apparently, these clusters occur in metropolitan regions where socioeconomic factors are high but whose population is larger and exposed to greater levels of pollution, stress and anxiety. Therefore, the Local Moran’s Index rather than the Global Moran’s Index should be applied to this map (17). In this respect, we believe that the analysis of the results was inadequate, since the study’s findings were not discussed in relation to the social, economic and environmental factors associated with the occurrence of mortality in the state of São Paulo, Brazil. Finally, a critical analysis should be conducted with respect to the data collection and registry of mammograms performed within the Brazilian National Health Service (SUS). Individuals who have to travel to another municipality to undergo mammography (18), sometimes within the private healthcare system, and the bias in the mammography registries caused by discrepancies in the public funding of the exams should be emphasized.
Based on the aforementioned considerations, we conclude that the study conducted by Diniz et al., although contributing a fairly important analysis, fails to reach an adequate conclusion (1). According to our analysis, breast cancer-related mortality is associated with factors linked to social and medical development in the state of São Paulo.
Rosemar Macedo Sousa Rahal
Rosangela da Silveira Corrêa
Danielle Cristina Netto Rodrigues
Nilson Clementino Ferreira
Noley Vicente Ribeiro
Leonardo Ribeiro Soares
1. Diniz CSG, Pellini ACG, Ribeiro AG, et al. Breast cancer mortality and associated factors in São Paulo State, Brazil: an ecological analysis. BMJ Open 2017;7(8):e016395.
2. Cecilio AP, Takakura ET, Jumes JJ, et al. Breast cancer in Brazil: epidemiology and treatment challenges. Breast Cancer (Dove Med Press) 2015;7:43-9.
3. Gonzaga CM, Freitas-Junior R, Curado MP, et al. Temporal trends in female breast cancer mortality in Brazil and correlations with social inequalities: ecological time-series study. BMC Public Health 2015;15:96.
4. Gonzaga CM, Freitas-Junior R, Souza MR, et al. Disparities in female breast cancer mortality rates between urban centers and rural areas of Brazil: ecological time-series study. Breast 2014;23(2):180-7.
5. Liedke PER, Finkelstein DM, Szymonifka J, et al. Outcomes of Breast Cancer in Brazil Related to Health Care Coverage: A Retrospective Cohort Study. Cancer Epidemiol Biomarkers Prev 2014; 23:126-33.
6. Freitas-Junior R, Rodrigues DCN, Corrêa RS, et al. Contribution of the Unified Health Care System to mammography screening in Brazil, 2013. Radiol Bras 2016; 49(5): 305-10.
7. Smith RA, Duffy SW, Gabe R, et al. The randomized trials of breast cancer screening: what have we learned? Radiol Clin North Am 2004; 42:793-806.
8. Martins E, Freitas-Junior R, Curado MP, et al. [Temporal evolution of breast cancer stages in a population-based cancer registry in the Brazilian central region]. Rev Bras Ginecol Obstet 2009;31(5):219-23.
9. Nunes RD, Martins E, Freitas-Junior R, et al. Descriptive study of breast cancer cases in Goiânia between 1989 and 2003. Rev Col Bras Cir 2011;38(4):212-6.
10. Corrêa R. da S, Freitas-Júnior R, Peixoto JE, et al. [Estimated mammogram coverage in Goiás State, Brazil]. Cad Saude Publica 2011;27(9):1757-67.
11. Berrington de González A, Reeves G. Mammographic screening before age 50 years in the UK: comparison of the radiation risks with the mortality benefits. Br J Cancer 2005;93:590-6.
12. Corrêa R da S, Peixoto JE, Ferreira RDS, et al. Risco de câncer radioinduzido em rastreamento mamográfico. Rio de Janeiro, RJ, Brazil: IX Latin American IRPA Regional Congress on Radiation Protection and Safety - IRPA 2013, 2013. http://www.iaea.org/inis/collection/NCLCollectionStore/_Public/45/071/45...
13. Yaffe MJ, Mainprize JG. Risk of radiation-induced breast cancer from mammographic screening. Radiology 2011;258:98-105.
14. Young KC, Faulkner K, Wall B, et al. UK National Health Service Breast Screening Programme (NHSBSP). Review of radiation risk in breast screening. NHSBSP Publication no. 54, 2003.
15. Anselin L. Exploring Spatial Data with GeoDa™: A Workbook. Center for Spatially Integrated Social Science, 2005.
16. Wheeler D, Tiefelsdorf M. Multicollinearity and correlation among local regression coefficients in geographically weighted regression. J Geog Syst 2005;7:161-87.
17. Henley SJ, Anderson RN, Thomas CC, Massetti GM, Peaker B, Richardson LC. Invasive Cancer Incidence, 2004-2013, and Deaths, 2006-2015, in Nonmetropolitan and Metropolitan Counties - United States. MMWR Surveill Summ 2017;66(14):1-13.
18. Vieira RADC, Formenton A, Bertolini SR. Breast cancer screening in Brazil. Barriers related to the health system. Rev Assoc Med Bras (1992). 2017;63:466-74.
Please see my article with the above name, published in the New Zealand Medical Journal, which used the Caerphilly data. This has not been mentioned elsewhere, presumably because of its obscure site. There was no relationship at all between cholesterol and heart attacks and only 4% of the variance was associated with strokes. NZ Med. J. 2012, 125, 1364.
This study asks the important question: what proportion of systematic reviews searched for and made use of unpublished data? However, an important follow-up question remains to be addressed: among those cases in which unpublished data was used, how was it used? Unpublished data can of course address study publication bias, i.e. data from unpublished studies can simply be added to data obtained from the published literature. However, unpublished data can also address outcome reporting bias,[1-3] i.e. a trial publication conveys that the intervention is safe and/or effective while unpublished data on the same trial tell a different story. For example, in a study of 74 industry-sponsored antidepressant trials, in addition to 23 (31%) unpublished trials, we found 11 (15%) trials with outcome reporting bias.[4] If we had corrected for the former while ignoring the latter, we would have obtained an effect size estimate that was still inflated. Returning to the current study,[5] an informative follow-up would be to look within the cohort of systematic reviews that made use of unpublished data and determine how many used it to verify the published results.
1 Kirkham JJ, Dwan KM, Altman DG, et al. The impact of outcome reporting bias in randomised controlled trials on a cohort of systematic reviews. BMJ 2010;340:c365.
2 Chan A-W, Altman DG. Identifying outcome reporting bias in randomised trials on PubMed: review of publications and survey of authors. BMJ 2005;330:753. doi:10.1136/bmj.38356.424606.8F
3 Chan A-W, Hróbjartsson A, Haahr MT, et al. Empirical evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles. JAMA 2004;291:2457–65. doi:10.1001/jama.291.20.2457
4 Turner EH, Matthews AM, Linardatos E, et al. Selective publication of antidepressant trials and its influence on apparent efficacy. N Engl J Med 2008;358:252–60. doi:10.1056/NEJMsa065779
5 Ziai H, Zhang R, Chan A-W, et al. Search for unpublished data by systematic reviewers: an audit. BMJ Open 2017;7:e017737. doi:10.1136/bmjopen-2017-017737
We thank Dr. Fowler and colleagues for taking the time to consider and comment on our BMJ Rapid Recommendation (1). They speculate on reasons why tenofovir and emtricitabine increased the risk of neonatal mortality and early preterm delivery in their trial (2) and then say that the current evidence does not support a recommendation for alternative NRTIs over a tenofovir-based antiretroviral therapy (ART) regimen. We do agree that most, but not all, of the evidence comes from a single study, which may have overestimated harm. Our systematic review attempted to generate the current best evidence, and is not definitive: it is moderate-to-low quality for key outcomes (3). However, we disagree with the implication that based on this evidence, most women would choose a tenofovir-based ART regimen.
The PROMISE authors suggest that results of the comparison between tenofovir-ART and AZT-ART are untrustworthy because the risk of neonatal death was lower in the AZT-ART arm in the earlier period 1 before the tenofovir-ART arm was introduced (2). However, the difference between the two time-periods in the AZT-ART arm could easily be explained by chance (neonatal mortality 1.4% in period 1 vs. 0.6% in period 2, p=0.39; very preterm delivery 3.4% in period 1 vs. 2.6% in period 2, p=0.60). Regardless, the only reliable comparison between tenofovir-ART and AZT-ART is during period 2 when randomisation to both AZT and tenofovir-based ART occurred. Despite these reservations, we performed sensitivity analyses that included data from the AZT-arm in period 1 before the tenofovir-ART arm was introduced (3). The increased risk of early preterm delivery and stillbirth with tenofovir/emtricitabine remained statistically significant and interpretation does not change when data from period 1 is included. Dr. Fowler and colleagues have also suggested that there may have been “some unknown confounder” wherein tenofovir-ART caused harm during period 2, but would not have been harmful to the participants in period 1 (2, 4). We consider this unlikely. Even if true, no such confounder has been identified and women faced with choosing an ART regimen will not know whether or not tenofovir-ART has the potential for harm in their case.
We agree that when tenofovir and emtricitabine are used in combination with lopinavir/ritonavir, it is possible that the risk is higher than with efavirenz; however, if tenofovir is indeed the ‘culprit’ medication, it is unlikely that there would be no risk at all when it is combined with efavirenz. Put another way, even if the risk of premature delivery and neonatal death is low with tenofovir/emtricitabine plus efavirenz, based on the available evidence, the risk with AZT/lamivudine plus efavirenz may be even lower.
We did not state that the pathophysiology of stillbirth and early neonatal death are the same. Perinatal mortality has long been a global standard outcome measure of maternal and perinatal healthcare (5) and is likely to be similarly important to women, thus our panel pre-specified that it was appropriate to combine them in our evidence summary.
We agree with their concern regarding the possibility that all combination ART regimens may increase the risk of prematurity (versus no ART or monotherapy), albeit this is uncertain and not the focus of this guidance. Given the unique physiology (and pathophysiologies) of pregnancy, the lack of an understood biological rationale at this stage should neither lead to a definitive conclusion nor reassurance. It remains possible that potential pharmacokinetic interactions, and failing or restoring immune systems are different in pregnancy. These are all good reasons to recognise that work from non-pregnant male and female adults cannot always be applied directly to pregnant women. Instead, these are strong justifications for further pregnancy-specific research. We believe that pregnant women (and their babies) should have an equitable standard of research evidence, and thus disagree that it is unlikely that there will be other randomised trials. It is imperative that further randomised trials are conducted. Regulatory authorities, and perhaps the WHO, have a responsibility to ensure that the appropriate studies are performed by the pharmaceutical industry to ensure that pregnant women are not disadvantaged.
Fowler et al. assert that the available observational evidence should provide reassurance to pregnant women. In this, we believe they are misguided. We reviewed the entirety of the observational evidence, including the single observational study that they cite (6); it cannot provide such assurance. First, even the highest quality observational studies are at high risk of residual confounding (7). Second, none of the studies controlled for all of the most important known confounders, including HIV disease status (CD4 count and viral load), socioeconomic status, and availability and quality of healthcare. Third, the studies were inconsistent with some showing harm with tenofovir and others benefit. Fourth, the results were imprecise with the confidence intervals including a magnitude of harm that almost all women would find important.
We strongly disagree with any implication that most women would be willing to risk the health of their child when other options exist. The decision about which vertical transmission strategy or combination ART regimen to use should rest squarely with each informed woman, based on her own values and preferences. This message was consistent from the linked systematic review on the values and preferences of women living with HIV (8), from the three women living with HIV on the guideline panel, as well as an associated opinion piece written by a woman living with HIV (9). Avoiding death in a newborn child is tremendously important to all or almost all women and even if the increased risk of stillbirth or neonatal mortality is extremely low with tenofovir/emtricitabine, almost all women would choose to use a different regimen. Unless future randomised trials show that tenofovir/emtricitabine is safe, we believe that most fully informed women would choose an alternative. Efforts should be made to share the best available evidence and empower women who are pregnant or might consider pregnancy to choose their medications for themselves rather than imposing a “one size fits all” approach to HIV treatment.
Reed A.C. Siemieniuk, Graham P. Taylor, Gordon H. Guyatt, Lyubov Lytvyn, Yaping Chang, Paul E. Alexander, Yung Lee, Thomas Agoritsas, Arnaud Merglen, Haresh Kirpalani, Susan Bewley
1. Siemieniuk RA, Lytvyn L, Ming JM et al. Antiretroviral therapy in pregnant women living with HIV: a clinical practice guideline. BMJ 2017;358:j3961.
2. Fowler MG, Qin M, Fiscus SA, et al. Benefits and risks of antiretroviral therapy for perinatal prevention. N Engl J Med 2016;375:1726-37.
3. Siemieniuk RA, Foroutan F, Mirza R et al. Antiretroviral therapy for pregnant women living with HIV or hepatitis B: a systematic review and meta-analysis. BMJ Open 2017;7:e019022.
4. Peer review of Siemieniuk RA, Foroutan F, Mirza R et al. Antiretroviral therapy for pregnant women living with HIV or hepatitis B: a systematic review and meta-analysis. BMJ Open 2017;7:e019022. Available at: http://bmjopen.bmj.com/content/bmjopen/7/9/e019022.reviewer-comments.pdf Accessed October 9, 2017.
5. World Health Organization. “Maternal and perinatal health.” http://www.who.int/maternal_child_adolescent/topics/maternal/maternal_pe... Accessed October 9, 2017.
6. Zash R, Jacobson DL, Diseko M, et al. Comparative Safety of Antiretroviral Treatment Regimens in Pregnancy. JAMA Pediatr. 2017 Oct 2;171(10):e172222.
7. Agoritsas T, Merglen A, Shah ND, O'Donnell M, Guyatt GH. Adjusted Analyses in Studies Addressing Therapy and Harm: Users' Guides to the Medical Literature. JAMA. 2017 Feb 21;317(7):748-759.
8. Lytvyn L, Siemieniuk RA, Dilmitis S, et al. Values and preferences of women living with HIV who are pregnant, postpartum or considering pregnancy on choice of antiretroviral therapy during pregnancy. BMJ Open. 2017 Sep 11;7(9):e019023.
9. Welbourn, A. WHO and the rights of women living with HIV. BMJ Opinion. Available at: http://blogs.bmj.com/bmj/2017/09/11/alice-welbourn-who-and-the-rights-of... Accessed October 9, 2017.
We agree with Professors Salemi and Zoorob (1) that the 2016 UK guidelines’ message regarding the consumption of alcohol during pregnancy is clear: it is best to be avoided. However, this has not always been the case and previous guidelines implicitly suggested that drinking up to 4 units per week was likely to be safe. We also recognise the methodological challenges of studying long term effects of low maternal alcohol consumption in pregnancy. Like Salemi and Zoorob, we feel that the paucity of evidence is unfortunate and should be addressed. We also maintain that in light of the evidence that is available, basing guidelines on the precautionary principle is reasonable.
Specifically, Salemi and Zoorob raise concerns about the inclusion of a study by Salihu et al. (2, reference 27 in the review) in the review (3), because maternal alcohol consumption was determined retrospectively, i.e. after delivery. Our predefined eligibility criterion was explicit: we required alcohol consumption to be ascertained prior to pregnancy. Salemi and Zoorob therefore question the inclusion of Salihu et al.’s study in our review. In the Methods section, Salihu et al. state that they used “the Missouri linked cohort data files…1989 through 2005.” No further details are provided. The Discussion states that “Alcohol was based on maternal recall and therefore subject to bias. Women may underreport actual alcohol consumption because of societal stigmas, biasing study results toward the null.” Only later in the Discussion do they also note that “…women with an adverse pregnancy outcome may be even more likely to underreport alcohol consumption,” suggesting that alcohol consumption was ascertained retrospectively.
Two independent reviewers read this paper and deemed it eligible for inclusion. As stated in our Methods section, we did not contact authors for further information. Having been alerted by Salemi and Zoorob to the use of birth certificates as the data source in Salihu et al., we are still unsure where and when these data were ascertained. Were they abstracted from antenatal records (and hence prospectively assessed)? Were they self-reported post-delivery (and hence retrospectively ascertained)? Do all hospitals in Missouri collect these data in the same manner? We have now contacted the authors of the original paper for clarification but have not yet had a response.
In light of these concerns, we carried out sensitivity analyses omitting the Salihu et al. study from the meta-analyses for preterm delivery and SGA. For preterm delivery, the pooled odds ratio including Salihu et al., as reported in the review, was 1.10 (95% CI: 0.95, 1.28; I2 = 59%). Excluding Salihu et al., it is 1.13 (95% CI: 0.92, 1.38; I2 = 41%). The respective results for SGA are 1.08 (95% CI: 1.02, 1.14; I2 = 0%) with Salihu et al. and 0.96 (95% CI: 0.80, 1.15; I2 = 0%) without. These changes would not affect our main findings as reported in the original review: “the two main findings are: (1) a surprisingly limited number of prospective studies specifically addressing the question of whether light maternal alcohol consumption (ie, up to 32g/week (or 4 UK units) has any causal effect (adverse or beneficial) on infant and later offspring outcomes and pregnancy outcomes, and, as a result, (2) a paucity of evidence demonstrating a clear detrimental effect, or safe limit, of light alcohol consumption on outcomes.”
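For readers unfamiliar with this kind of sensitivity analysis, the following is a minimal, generic sketch of leave-one-out pooling of odds ratios. It is not the review's analysis code. It assumes a DerSimonian-Laird random-effects model (the model actually used in the review is not stated in this response), and the study names and values in `hypothetical_studies` are invented purely for illustration.

```python
import math

Z = 1.96  # normal quantile for a 95% confidence interval


def pool(studies):
    """DerSimonian-Laird random-effects pooling (an assumed model).

    studies: dict mapping name -> (odds ratio, lower 95% limit, upper 95% limit).
    Returns (pooled OR, lower limit, upper limit, I2 in percent).
    """
    values = list(studies.values())
    y = [math.log(or_) for or_, lo, hi in values]
    # Recover each study's variance of the log OR from the width of its CI.
    var = [((math.log(hi) - math.log(lo)) / (2 * Z)) ** 2 for or_, lo, hi in values]
    w = [1.0 / v for v in var]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Cochran's Q and the derived heterogeneity measures.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if df > 0 and c > 0 else 0.0
    i2 = 100.0 * max(0.0, (q - df) / q) if df > 0 and q > 0 else 0.0
    # Random-effects weights add the between-study variance tau2.
    w_re = [1.0 / (v + tau2) for v in var]
    swr = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / swr
    se = math.sqrt(1.0 / swr)
    return (math.exp(pooled), math.exp(pooled - Z * se),
            math.exp(pooled + Z * se), i2)


def leave_one_out(studies):
    """Re-pool the evidence once per study, each time omitting that study."""
    return {
        omitted: pool({k: v for k, v in studies.items() if k != omitted})
        for omitted in studies
    }


# Invented values, NOT taken from the review or from any real study.
hypothetical_studies = {
    "Study A": (1.20, 0.90, 1.60),
    "Study B": (0.95, 0.75, 1.20),
    "Study C": (1.10, 1.00, 1.21),
}

if __name__ == "__main__":
    print("all studies:", pool(hypothetical_studies))
    for name, result in leave_one_out(hypothetical_studies).items():
        print("omitting", name, "->", result)
```

Comparing the pooled estimate and I2 with and without a given study, as done above for Salihu et al., shows how much that single study drives the overall result.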
Salemi and Zoorob also question why we included specific results from Ernhart et al. (4, reference 59 in the review) in Table 2. We presented results of analyses in which Ernhart et al. used maternal self-reported alcohol consumption during pregnancy. We did not include results of analyses in which alcohol consumption prior to or just after conception was estimated from an equation developed in a separate sample. We feel that prioritising the results based on prospective maternal report, which is comparable to other studies, was a sensible decision.
Loubaba Mamluk, Luisa Zuccolo and Abigail Fraser on behalf of all co-authors
1. Salemi JL and Zoorob RJ. The importance of methodological rigor and communication of information. http://bmjopen.bmj.com/content/7/7/e015410.responses#the-importance-of-m....
2. Salihu HM, Kornosky JL, Lynch O, Alio AP, August EM, Marty PJ. Impact of prenatal alcohol consumption on placenta-associated syndromes. Alcohol (Fayetteville, NY). 2011;45(1):73-79.
3. Mamluk L, Edwards HB, Savović J, et al. Low alcohol consumption and pregnancy and childhood outcomes: time to change guidelines indicating apparently ‘safe’ levels of alcohol during pregnancy? A systematic review and meta-analyses. BMJ Open 2017;7:e015410. doi: 10.1136/bmjopen-2016-015410
4. Ernhart CB, Sokol RJ, Ager JW, Morrow-Tlucak M, Martier S. Alcohol-related birth defects: assessing the risk. Annals of the New York Academy of Sciences. 1989;562:159-172.
During the peer review process of this manuscript, it came to our attention that one of the reviewers, Erling Solheim, may have an undeclared conflict of interest. We were told that he had worked in the same research group as the authors from June 2006 to September 2007 and had published research with the authors in 2008 and 2013.
We attempted to contact Erling Solheim a number of times to verify these claims, but he has not responded to our emails. If true, we do not feel that this undeclared conflict of interest would have compromised the peer review process or altered our decision to publish the manuscript.
In their recent paper,1 Mamluk et al describe the results of a systematic review and meta-analyses of the association between low alcohol intake in pregnancy and several adverse birth outcomes and long-term outcomes in children. The authors conclude that there is “limited evidence for a causal role of light drinking in pregnancy, compared with abstaining, on most of the outcomes examined” and that their “extensive review shows that this specific question is not being researched thoroughly enough, if at all.”
I sympathize with the thorough work performed by the authors. Even so, a word of caution is needed with respect to the second part of their conclusion. The authors defined the intake of interest as a maximum of 32 g of alcohol per week corresponding to 4 standard UK drinks/week. While this makes sense from a British point of view, the definition of a standard drink being 8 g of alcohol, this cut-off makes little sense in most non-British countries, where a standard drink is typically defined as containing 10-12 g. Hence, a large number of non-British studies using three or four drinks as their cut-off for low intake have not been included in the present review, only because their definition did not correspond with the British.
Several meta-analyses in recent years have used slightly more embracing definitions.2 3 These meta-analyses included far more studies, simply because a limit of 32 g fits poorly with international definitions. Using slightly higher cut-offs would have enabled the authors to include far more studies. The conclusion from all the reviews and meta-analyses remains the same, i.e. little evidence of any association between low alcohol intake on a weekly basis and any adverse outcome.1 2 3 But the conclusion that the evidence is sparse does not seem correct. There are, in fact, quite a few studies if the limit for low intake is set slightly higher; and if (the majority of) studies with a higher cut-off of e.g. 36 or 48 g/week or something similar generally show no association with the outcomes of interest, the conclusion should not be that there is a paucity of evidence for intake up to 32 g/week. There is plenty of evidence of no association; it is only that this evidence fits not the British definition of 32 g but mainly slightly higher cut-off points.
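The arithmetic behind this argument can be sketched as follows. The 8 g UK unit and the 10-12 g range for other countries come from the text above; the function names are hypothetical, and treating the 32 g/week limit as a strict eligibility test is an assumption made only to illustrate the point.

```python
UK_UNIT_GRAMS = 8                         # UK standard drink, as stated above
REVIEW_CUTOFF_GRAMS = 4 * UK_UNIT_GRAMS   # 4 UK units/week = 32 g/week


def weekly_grams(drinks_per_week, grams_per_drink):
    """Convert a cut-off expressed in drinks per week to grams of alcohol per week."""
    return drinks_per_week * grams_per_drink


def fits_review_cutoff(drinks_per_week, grams_per_drink):
    """True if a study's 'low intake' cut-off is at or below 32 g/week (assumed rule)."""
    return weekly_grams(drinks_per_week, grams_per_drink) <= REVIEW_CUTOFF_GRAMS


if __name__ == "__main__":
    # Cut-offs of three or four drinks under 10 g and 12 g standard drinks.
    for drinks in (3, 4):
        for grams in (10, 12):
            total = weekly_grams(drinks, grams)
            print(drinks, "drinks x", grams, "g =", total, "g/week;",
                  "fits 32 g cut-off:", fits_review_cutoff(drinks, grams))
```

With a 12 g standard drink, cut-offs of three and four drinks correspond to the 36 and 48 g/week mentioned above, both of which exceed 32 g/week.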
Ulrik Schiøler Kesmodel
Professor in obstetrics and gynaecology
Herlev University Hospital, Copenhagen
Institute for Clinical Medicine, University of Copenhagen
1. Mamluk L, Edwards HB, Savović J et al. Low alcohol consumption and pregnancy and childhood outcomes: time to change guidelines indicating apparently ‘safe’ levels of alcohol during pregnancy? A systematic review and meta-analyses. BMJ Open 2017;7:e015410. doi:10.1136/bmjopen-2016-015410.
2. Patra J, Bakker R, Irving H, et al. Dose-response relationship between alcohol consumption before and during pregnancy and the risks of low birthweight, preterm birth and small for gestational age (SGA) - a systematic review and meta-analyses. BJOG 2011; 118: 1411-1421. DOI: 10.1111/j.1471-0528.2011.03050.x
3. Flak AL, Su S, Bertrand J, Denny CH, Kesmodel US, Cogswell ME. The association between mild, moderate, and binge prenatal alcohol exposure and child
Thank you for the opportunity to respond to Lassman et al.’s re-analysis of our study titled, “Clinical trial registration, reporting, publication and FDAAA Compliance: A cross-sectional analysis and ranking of new drugs approved by the FDA in 2012”.1 Our original study assessed the clinical trial transparency of novel drugs approved by the FDA in 2012 that were sponsored by large drug companies. We assessed the drugs by two sets of transparency standards: U.S. legal requirements and an ethical standard that all human subjects research should be publicly accessible to contribute to generalizable knowledge.
Our original analysis included a review of 15 drugs, sponsored by 10 large companies, involving 342 trials. Lassman and colleagues’ reassessment examined 69 of these 342 trials and focused on only the U.S. legal requirements standard. Lassman et al did not elaborate on why they limited their assessment to this subset of trials and on compliance with legal requirements. As a reminder, US clinical trial disclosure requirements are defined by the Food and Drug Administration Amendments Act (FDAAA), passed in 2007.2
We applaud efforts to replicate studies. We are glad that our policy of publicly sharing data through the Dryad Digital Repository enabled replication and re-analysis.3 Additionally, we generally agree with Lassman and colleagues’ re-assessment of our study using today’s new and updated knowledge base and world-view. However, we do not agree with their implication that our study was lacking in some way. Our work reflected prevailing standards at the time.
It is important to note that, when we conducted our analysis (starting around 2013), the implementation and interpretation of FDAAA were very different from today. Our study used the leading interpretation of the law at that time. Our interpretation was shared by many drug companies and researchers, and was reflected in top academic publications, including the banner NEJM publication on the subject by Anderson et al. titled “Compliance with Results Reporting at ClinicalTrials.gov.”4
As such, the Lassman et al paper is not so much a re-analysis of our work as a reflection on and update of our work, given the learning and evolution of practices that have emerged since our original study was conducted. Updated interpretations of FDAAA were generally catalyzed by the NIH issuance of a Final Rule, on September 21, 2016, clarifying key ambiguous language in FDAAA.5 In particular, thanks in large part to the Final Rule, there is now general consensus that under FDAAA, there was no requirement to report trial results for unapproved drugs (for trials studying initial use indications).
Prior interpretations involved results reporting by one year after a trial’s primary completion date, unless a certificate of delay was filed asking for an extension of time to report. In the case of a certificate being filed, results reporting could be delayed until 30 days after FDA approval of the indication. Under the Final Rule, this will change. Results will be required to be reported for both approved and unapproved drugs.
Lassman et al acknowledge that interpretations of FDAAA have been varied and evolved over time. They state: “FDAAA is complex, and after its passage “a spectrum of interpretations” emerged, particularly with respect to the deadlines for results reporting and the necessity of “certificates of delay” (CODs)….The COD provisions… are ambiguous in some cases.”
Below is a table showing our original 2012 analysis, along with a re-analysis of our 2012 sample of drugs using today’s interpretation of FDAAA. Lassman and colleagues’ 2017 reanalysis is included in the table as well (Table 1). Our analyses are in general agreement.
Our aim as investigators on the Good Pharma Scorecard project is to ensure that we employ an iterative, open and adaptive learning system when establishing standards and reporting on clinical trial transparency. We strive to continuously improve and refine our methods as appropriate. We also continuously reach out to and convene stakeholders, soliciting feedback to ensure the accuracy and timeliness of our work and that the project is advancing the needs of patients and public health broadly.
We recently published an assessment of clinical trial transparency for novel drugs approved by the FDA in 2014 that were sponsored by large companies.6 It uses the updated and current interpretations of FDAAA catalyzed by the passing of the Final Rule. They are similar to those of Lassman and colleagues. There is less debate at this point about how to interpret most sections of FDAAA reporting requirements. Experts should generally be in agreement on requirements and be better positioned to advance a more open and transparent healthcare innovation and drug development enterprise that strives towards trustworthiness and patient-centricity.
Table 1 (found here: http://blogs.bmj.com/bmjopen/files/2017/09/Miller-1.png): Table showing our (Miller et al) original analysis, along with our re-analysis of our 2012 drugs using current FDAAA interpretations and Lassman and colleagues’ analysis.
The Good Pharma Scorecard is funded by a grant from the Laura and John Arnold Foundation to Bioethics International. In the past 36 months, Dr. Ross has received support through Yale University from Johnson and Johnson to develop methods of clinical trial data sharing, from Medtronic, Inc. and the Food and Drug Administration (FDA) to develop methods for post-market surveillance of medical devices, from the Food and Drug Administration (FDA) to establish the Yale-Mayo Center for Excellence in Regulatory Science and Innovation (CERSI), from the Blue Cross Blue Shield Association to better understand medical technology evaluation, from the Centers of Medicare and Medicaid Services (CMS) to develop and maintain performance measures that are used for public reporting, from the Agency for Healthcare Research and Quality, and from the Laura and John Arnold Foundation to support the Collaboration on Research Integrity and Transparency (CRIT) at Yale.
1 Miller JE, Korn D, Ross JS. Clinical trial registration, reporting, publication and FDAAA compliance: a cross-sectional analysis and ranking of new drugs approved by the FDA in 2012. BMJ Open 2015; 5: e009758. doi: 10.1136/bmjopen-2015-009758
2 Food and Drug Administration Amendments Act of 2007 (FDAAA), Public Law No. 110-85, § 801, 121 Stat. 904 (2007) (codified at 42 U.S.C. § 282).
3 Miller JE, Korn D, Ross JS (2015) Data from: Clinical trial registration, reporting, publication and FDAAA compliance: a cross-sectional analysis and ranking of new drugs approved by the FDA in 2012. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.t8n07.
4 Anderson ML, Chiswell K, Peterson ED, et al. Compliance with results reporting at ClinicalTrials.gov. N Engl J Med 2015;372:1031–9. doi:10.1056/NEJMsa1409364
5 Clinical Trials Registration and Results Information Submission; Final Rule (NIH Final Rule), 81 Fed. Reg. 64982 (Sept. 21, 2016) (codified at 42 C.F.R. Part 11).
6 Miller JE, Wilenzick M, Ritcey N, Ross JS, Mello MM. Measuring clinical trial transparency: An empirical analysis of newly approved drugs and large pharmaceutical companies. BMJ Open (in press).