Original Article
A simple method for analyzing matched designs with double controls: McNemar's test can be extended
Introduction
Collecting data is often slow, expensive, awkward, and exhausting. The advent of big data can overcome such challenges by assembling large amounts of digital information as a byproduct of daily work. One example is an analysis of trends around Internet search queries for tracking influenza epidemics across the globe [1]. Interpreting big data for medical research, however, can be a challenge because the lack of randomization means that unmeasured confounders may bias causal interpretations. Statistical measures such as regression, stratification, or propensity score matching attempt to mimic a randomized trial in big data, yet none of these methods is ideal and new designs that control for confounding are a priority for future research.
Pair-matched designs are an elegant approach for providing statistical power and eliminating multiple unmeasured confounders. One example is a case–crossover study suggesting a fourfold increase in the risk of a motor vehicle crash when individuals use a cellular telephone while driving [2], [3]. In some investigations, these matched designs have led to new visual displays and new statistical approaches for interpreting observed associations [4]. Interestingly, the combination of big data and matched designs has also motivated further reconsideration of a few classic problems in statistical science [5]. Indeed, the feasibility and power of big data now means that classic problems have growing real-world relevance and are no longer academic puzzles.
Quinn McNemar [6] introduced the McNemar test in 1947 as a basic statistic for evaluating pair-matched binary data in a 2 × 2 contingency table. A few extensions were developed over the next half-century to account for small cell sizes, outcomes that were multinomial rather than binary, cases with missing data, and ways of calculating sample size estimates based on anticipated statistical power [7], [8]. The advent of high-performance maximum likelihood estimation, in addition, has served as a substitute for similar problems but offers no closed-form expression and no easy method for data visualization. We know of no past study, however, that extended the McNemar test to studies involving double controls rather than single controls.
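For readers less familiar with the classic single-control case, the McNemar statistic depends only on the two discordant cells of the paired 2 × 2 table. The following is a minimal sketch in Python, not code from the article; the function name `mcnemar` and the counts are hypothetical and chosen only for illustration. It uses the uncorrected chi-square form with 1 degree of freedom.

```python
import math

def mcnemar(b: int, c: int) -> dict:
    """Uncorrected McNemar test from the discordant cells of a paired 2 x 2 table.

    b: pairs in which only the case is exposed
    c: pairs in which only the control is exposed
    Concordant pairs carry no information and are not needed.
    """
    if b + c == 0:
        raise ValueError("no discordant pairs")
    chi2 = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    # Matched-pair odds ratio is the ratio of the discordant cells
    odds_ratio = b / c if c > 0 else math.inf
    return {"chi2": chi2, "p_value": p_value, "odds_ratio": odds_ratio}

# Hypothetical counts, for illustration only (not data from the article)
result = mcnemar(b=40, c=60)
print(result)
```

The closed form is what makes the test easy to verify by hand, a property the double-control setting would ideally preserve.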
Here we highlight matched designs with double controls, defined as studies where exactly two controls are linked to each case using comprehensive data. We illustrate the design with data examining the association between overcast weather and the risk of a life-threatening traffic crash (Box). We demonstrate that a chi-square test can be misleading because it ignores the matching and that the McNemar test can be imprecise because it ignores half the controls. We also show that conditional logistic regression is reasonable but can be replaced by a simpler approximation that yields a visual display and easy verification. The new approach is generalizable to experimental studies where randomization assigns two controls to each case (or two cases for each control).
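To make the data structure concrete, the sketch below tabulates triplets (one case day and two control days) and contrasts two summaries. It is an illustration under stated assumptions, not the authors' extension: the closed-form estimate shown is the standard Mantel-Haenszel odds ratio for 1:2 matched sets, swapped in as a familiar hand-calculable quantity. All function names and the example data are hypothetical.

```python
from collections import Counter

def tabulate_triplets(triplets):
    """Count triplets by (case exposed 0/1, number of exposed controls 0-2)."""
    counts = Counter()
    for case, control_a, control_b in triplets:
        counts[(int(case), int(control_a) + int(control_b))] += 1
    return counts

def mantel_haenszel_or(counts):
    """Mantel-Haenszel odds ratio for 1:2 matching (stratum size cancels)."""
    numerator = 2 * counts[(1, 0)] + counts[(1, 1)]
    denominator = counts[(0, 1)] + 2 * counts[(0, 2)]
    if denominator == 0:
        raise ValueError("no informative triplets with an unexposed case")
    return numerator / denominator

def single_control_or(triplets):
    """McNemar-style odds ratio that uses only the first control of each triplet."""
    b = sum(1 for case, ctrl, _ in triplets if case and not ctrl)
    c = sum(1 for case, ctrl, _ in triplets if not case and ctrl)
    if c == 0:
        raise ValueError("no discordant pairs with an exposed control")
    return b / c

# Hypothetical exposure records (case, control 1, control 2); 1 = exposed
triplets = (
    [(1, 0, 0)] * 3 + [(1, 1, 0)] * 2 + [(0, 1, 0)] * 4
    + [(0, 1, 1)] * 3 + [(0, 0, 0)] * 5 + [(1, 1, 1)] * 2
)
counts = tabulate_triplets(triplets)
or_both_controls = mantel_haenszel_or(counts)
or_first_control = single_control_or(triplets)
print(or_both_controls, or_first_control)
```

Triplets in which all three days share the same exposure contribute nothing to either estimate. The single-control version also discards every triplet whose only discordance involves the second control, which is one way to see why ignoring half the controls costs precision.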
Background
Consider a study of life-threatening traffic crashes that selected consecutive patients and characterized the circumstances of each adverse event. The purpose was to explore the association between weather and traffic risks without invoking an ecologic fallacy (that typically assesses different regions rather than different individuals). The specific study question tested if exposure to overcast weather can lead to a decrease in the short-term risk of a traffic crash, perhaps because of the
Statistical analysis
One approach to analyzing double controls uses conditional logistic regression. Specifically, each triplet is considered a stratum, the binary indicator for overcast weather on crash days is compared with the indicator on the two control days, and the contrast is tested against the null ratio of 1:2. In this case, the odds ratio is 0.82 (95% confidence interval: 0.75, 0.89) and indicates about an 18% relative reduction in crash risk associated with overcast weather. This conditional logistic
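With a single binary exposure and 1:2 matching, the conditional likelihood can be written directly in terms of four triplet counts, so the stratified analysis described above can be sketched without a regression package. The code below is a minimal illustration, not the authors' implementation; the function name `conditional_or`, the bisection solver, the Wald interval, and the counts are all assumptions made for this example.

```python
import math

def conditional_or(n10, n11, n01, n02, z=1.96, tol=1e-10):
    """Conditional maximum likelihood odds ratio for 1:2 matched triplets.

    n10: case exposed, 0 of 2 controls exposed
    n11: case exposed, 1 of 2 controls exposed
    n01: case unexposed, 1 of 2 controls exposed
    n02: case unexposed, 2 of 2 controls exposed
    Returns (odds ratio, lower limit, upper limit) with a Wald interval.
    """
    if n10 + n11 == 0 or n01 + n02 == 0:
        raise ValueError("estimate is not finite for these counts")

    def score(beta):
        # Derivative of the conditional log-likelihood with respect to log(OR)
        psi = math.exp(beta)
        return (n10 - (n10 + n01) * psi / (psi + 2)
                + n11 - (n11 + n02) * 2 * psi / (2 * psi + 1))

    # The score decreases in beta, so bisection finds its single root
    lo, hi = -20.0, 20.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    beta = (lo + hi) / 2
    psi = math.exp(beta)

    # Observed information gives the standard error of log(OR)
    info = ((n10 + n01) * 2 * psi / (psi + 2) ** 2
            + (n11 + n02) * 2 * psi / (2 * psi + 1) ** 2)
    se = 1 / math.sqrt(info)
    return psi, math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical counts that match the null ratio of 1:2 (not the article's data)
odds_ratio, lower, upper = conditional_or(n10=100, n11=200, n01=200, n02=100)
print(odds_ratio, lower, upper)
```

In the hypothetical counts, the case day is the exposed day in one third of triplets with a single exposed day and is exposed in two thirds of triplets with two exposed days, which is the 1:2 null pattern, so the estimated odds ratio is 1.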
Discussion
In this study, we introduce a new approach for analyzing double controls for matched studies. We illustrate the approach using data on the association between overcast weather and the risk of a life-threatening traffic crash. In agreement with statistical theory, the results of the approach are more powerful than an analysis that ignores the matching and more precise than an analysis that ignores half the controls (Table 1). Overall, the approach can provide results that agree closely with
Acknowledgments
We thank Peter Austin, Alex Kiss, Sheharyar Raza, Michael Schull, Deva Thiruchelvam, Jason Woodfine, and Christopher Yarnell for helpful suggestions on specific points.
Funding: This project was supported by a Canada Research Chair in Medical Decision Sciences, the Canadian Institutes of Health Research, and the BrightFocus Foundation. The views expressed are those of the authors and do not necessarily reflect the Ontario Ministry of Health & Long-term Care.
Accountability: The lead author
References (13)
The exposure-crossover design is a new method for studying sustained changes in recurrent events. J Clin Epidemiol (2013)
et al. Detecting influenza epidemics using search engine query data. Nature (2009)
et al. Association between cellular-telephone calls and motor vehicle collisions. N Engl J Med (1997)
et al. Role of mobile phones in motor vehicle crashes resulting in hospital attendance: a case-crossover study. BMJ (2005)
et al. An introduction to statistical learning (2013)
Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika (1947)
Conflicts: The funding organizations had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the article. All authors have no financial or personal relationships or affiliations that could influence the decisions and work in this article.