Elsevier

Journal of Clinical Epidemiology

Volume 81, January 2017, Pages 51-55.e2

Original Article
A simple method for analyzing matched designs with double controls: McNemar's test can be extended

https://doi.org/10.1016/j.jclinepi.2016.08.006

Abstract

Objectives

To introduce a new analytic approach for matched studies, where exactly two controls are linked to each case (double controls rather than solitary controls). The intent is to extend McNemar's test for one-to-two matching (instead of one-to-one matching) when evaluating binary predictors and outcomes.

Study Design and Setting

We review McNemar's approach for analyzing matched data, demonstrate the Mantel–Haenszel approach for integrating two overlapping McNemar's estimates, review conditional logistic regression as an alternative analytic approach, and introduce a new method that yields a visual display and easy verification.

Results

We illustrate the new approach with real data testing the association between overcast weather and the risk of a life-threatening traffic crash (n = 6,962). We show that results from the new approach agree closely with conditional logistic regression and are sufficiently simple as to be computed on a handheld calculator. We further validate the approach by conducting simulations when a positive association was predefined and when a null association was predefined.

Conclusion

The new approach provides a feasible, simple, and efficient method for analyzing matched designs with double controls.

Introduction

Collecting data is often slow, expensive, awkward, and exhausting. The advent of big data can overcome such challenges by assembling large amounts of digital information as a byproduct of daily work. One example is an analysis of trends around Internet search queries for tracking influenza epidemics across the globe [1]. Interpreting big data for medical research, however, can be a challenge because the lack of randomization means that unmeasured confounders may bias causal interpretations. Statistical measures such as regression, stratification, or propensity score matching attempt to mimic a randomized trial in big data, yet none of these methods is ideal and new designs that control for confounding are a priority for future research.

Pair-matched designs are an elegant approach for providing statistical power and eliminating multiple unmeasured confounders. One example is a case–crossover study suggesting a fourfold increase in the risk of a motor vehicle crash when individuals use a cellular telephone while driving [2], [3]. In some investigations, these matched designs have led to new visual displays and new statistical approaches for interpreting observed associations [4]. Interestingly, the combination of big data and matched designs has also motivated further reconsideration of a few classic problems in statistical science [5]. Indeed, the feasibility and power of big data now means that classic problems have growing real-world relevance and are no longer academic puzzles.

Quinn McNemar [6] introduced the McNemar test in 1947 as a basic statistic for evaluating pair-matched binary data in a 2 × 2 contingency table. A few extensions were developed over the next half-century to account for small cell sizes, outcomes that were multinomial rather than binary, cases with missing data, and ways of calculating sample size estimates based on anticipated statistical power [7], [8]. The advent of high-performance maximum likelihood estimation, in addition, has served as a substitute for similar problems but offers no closed-form expression and no easy method for data visualization. We know of no past study, however, that extended the McNemar test to studies involving double controls rather than single controls.
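As a brief illustration of the classic single-control case, the sketch below computes McNemar's chi-square statistic, its two-sided P value, and the matched odds ratio from the two discordant cells of a pair-matched 2 × 2 table. The function name `mcnemar` and the counts in the usage line are hypothetical and are not taken from the study data; no continuity correction is applied.

```python
import math


def mcnemar(b: int, c: int) -> tuple[float, float, float]:
    """McNemar's test for pair-matched binary data (no continuity correction).

    b: pairs where only the case is exposed (hypothetical name)
    c: pairs where only the control is exposed (hypothetical name)
    Concordant pairs do not enter the statistic.
    """
    if b + c == 0:
        raise ValueError("no discordant pairs")
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = b / c if c else math.inf
    return chi2, p_value, odds_ratio


# Illustrative counts only: 40 pairs with only the case exposed,
# 60 pairs with only the control exposed.
chi2, p_value, odds_ratio = mcnemar(40, 60)
```

Only the discordant pairs carry information, which is why the whole test reduces to two counts.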

Here we highlight matched designs with double controls, defined as studies where exactly two controls are linked to each case using comprehensive data. We illustrate the design with data examining the association between overcast weather and the risk of a life-threatening traffic crash (Box). We demonstrate that a chi-square test can be misleading because it ignores the matching and that McNemar's test can be imprecise because it ignores half the controls. We also show that conditional logistic regression is reasonable but can be replaced by a simpler approximation that yields a visual display and easy verification. The new approach is generalizable to experimental studies where randomization assigns two controls to each case (or two cases for each control).
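One standard way to use both controls while respecting the matching is the Mantel–Haenszel odds ratio with each triplet treated as its own stratum. The sketch below is a minimal illustration under that assumption and is not presented as the new method introduced in this article; the function name `mantel_haenszel_or` and the triplets in the usage lines are hypothetical. Each triplet contributes two case-control pairs, so the numerator counts pairs where only the case is exposed and the denominator counts pairs where only the control is exposed.

```python
def mantel_haenszel_or(triplets):
    """Mantel-Haenszel odds ratio for 1:2 matched sets.

    triplets: iterable of (case, control_1, control_2) exposure flags (0 or 1).
    Each triplet is one stratum of size 3, so the common 1/3 weight cancels.
    """
    case_only = 0     # case-control pairs with the case exposed, control not
    control_only = 0  # case-control pairs with the control exposed, case not
    for case, control_1, control_2 in triplets:
        exposed_controls = control_1 + control_2
        if case:
            case_only += 2 - exposed_controls
        else:
            control_only += exposed_controls
    if control_only == 0:
        raise ValueError("odds ratio undefined: no control-only discordant pairs")
    return case_only / control_only


# Illustrative triplets only (not the study data).
example = (
    [(1, 0, 0)] * 3 + [(1, 0, 1)] * 2
    + [(0, 1, 0)] * 6 + [(0, 1, 1)] * 5
    + [(1, 1, 1), (0, 0, 0)]  # fully concordant triplets add nothing
)
estimate = mantel_haenszel_or(example)
```

Dropping the second control from every triplet would reduce this calculation to the ordinary McNemar odds ratio.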

Section snippets

Background

Consider a study of life-threatening traffic crashes that selected consecutive patients and characterized the circumstances of each adverse event. The purpose was to explore the association between weather and traffic risks without invoking an ecologic fallacy (that typically assesses different regions rather than different individuals). The specific study question tested if exposure to overcast weather can lead to a decrease in the short-term risk of a traffic crash, perhaps because of the

Statistical analysis

One approach to analyzing double controls uses conditional logistic regression. Specifically, each triplet is considered a stratum, the binary indicator for overcast weather on crash days is compared with the indicator on the two control days, and the contrast is tested against the null ratio of 1:2. In this case, the odds ratio is 0.82 (95% confidence interval: 0.75, 0.89) and indicates about an 18% relative reduction in crash risk associated with overcast weather. This conditional logistic
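For a single binary exposure with exactly two controls per case, the conditional likelihood has a simple form, so the estimate can be sketched without a regression package. The code below is a minimal illustration under that assumption: it solves the score equation by bisection and attaches a Wald confidence interval. The function name, argument names, and counts are hypothetical; the counts are not the study data, and the sketch is not the new method introduced in this article.

```python
import math


def conditional_or_double_controls(n10, n11, n01, n02, z=1.96):
    """Conditional maximum-likelihood odds ratio for 1:2 matched triplets.

    Hypothetical argument names (not from the article):
      n10: case exposed,   0 of 2 controls exposed
      n11: case exposed,   1 of 2 controls exposed
      n01: case unexposed, 1 of 2 controls exposed
      n02: case unexposed, 2 of 2 controls exposed
    Triplets with all three or none exposed carry no information.
    Returns (odds ratio, lower limit, upper limit).
    """
    if n10 + n11 == 0 or n01 + n02 == 0:
        raise ValueError("estimate is not finite for these counts")
    m1 = n10 + n01  # triplets with exactly one exposed member
    m2 = n11 + n02  # triplets with exactly two exposed members

    def score(beta):
        psi = math.exp(beta)
        # Observed exposed cases minus the number expected at odds ratio psi.
        return (n10 + n11) - m1 * psi / (psi + 2) - m2 * 2 * psi / (2 * psi + 1)

    low, high = -20.0, 20.0  # the score decreases in beta, so bisection works
    for _ in range(200):
        mid = (low + high) / 2
        if score(mid) > 0:
            low = mid
        else:
            high = mid
    beta = (low + high) / 2
    psi = math.exp(beta)
    # Fisher information for beta, used for the Wald interval.
    info = m1 * 2 * psi / (psi + 2) ** 2 + m2 * 2 * psi / (2 * psi + 1) ** 2
    se = 1 / math.sqrt(info)
    return psi, math.exp(beta - z * se), math.exp(beta + z * se)


# Illustrative counts only, chosen to sit exactly at the null ratio of 1:2.
null_estimate = conditional_or_double_controls(10, 20, 20, 10)
```

Among triplets with exactly one exposed day, the null hypothesis expects the case day to be the exposed one a third of the time, which is the 1:2 ratio mentioned above; the usage line is built so that the estimated odds ratio is 1.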

Discussion

In this study, we introduce a new approach for analyzing double controls for matched studies. We illustrate the approach using data on the association between overcast weather and the risk of a life-threatening traffic crash. In agreement with statistical theory, the results of the approach are more powerful than an analysis that ignores the matching and more precise than an analysis that ignores half the controls (Table 1). Overall, the approach can provide results that agree closely with

Acknowledgments

We thank Peter Austin, Alex Kiss, Sheharyar Raza, Michael Schull, Deva Thiruchelvam, Jason Woodfine, and Christopher Yarnell for helpful suggestions on specific points.

Funding: This project was supported by a Canada Research Chair in Medical Decision Sciences, the Canadian Institutes of Health Research, and the BrightFocus Foundation. The views expressed are those of the authors and do not necessarily reflect the Ontario Ministry of Health & Long-term Care.

Accountability: The lead author

References (13)

  • D.A. Redelmeier. The exposure-crossover design is a new method for studying sustained changes in recurrent events. J Clin Epidemiol (2013)
  • J. Ginsberg et al. Detecting influenza epidemics using search engine query data. Nature (2009)
  • D.A. Redelmeier et al. Association between cellular-telephone calls and motor vehicle collisions. N Engl J Med (1997)
  • S.P. McEvoy et al. Role of mobile phones in motor vehicle crashes resulting in hospital attendance: a case-crossover study. BMJ (2005)
  • G. James et al. An introduction to statistical learning (2013)
  • Q. McNemar. Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika (1947)
There are more references available in the full text version of this article.

Conflicts: The funding organizations had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the article. All authors have no financial or personal relationships or affiliations that could influence the decisions and work in this article.
