The detection of gene-environment interaction for continuous traits: should we deal with measurement error by bigger studies or better measurement?

M Y Wong; N E Day; J A Luan; K P Chan; N J Wareham

doi:10.1093/ije/dyg002

Int J Epidemiol. 2003 Feb;32(1):51-7. doi: 10.1093/ije/dyg002.

Authors

M Y Wong¹, N E Day, J A Luan, K P Chan, N J Wareham

Affiliation

¹ Department of Mathematics, The Hong Kong University of Science & Technology, Hong Kong.

PMID: 12690008
DOI: 10.1093/ije/dyg002

Abstract

Background: The search for biologically relevant gene-environment interactions has been facilitated by technological advances in genotyping. The design of studies to detect interactions on continuous traits such as blood pressure and insulin sensitivity is attracting increasing attention. We have previously described power calculations for such studies, and this paper describes the extension of those calculations to take account of measurement error.

Methods: The model considered in this paper is a simple linear regression relating a continuous outcome to a continuously distributed exposure variable, in which the ratio of the slopes for each genotype is taken as the interaction parameter. The classical measurement error model is used to describe the uncertainty in the measurement of the outcome and the exposure. The sample sizes required to detect differing magnitudes of interaction at varying minor allele frequencies are calculated for a given main effect observed with error in both the exposure and the outcome. The sample size required to detect a given interaction at a given minor allele frequency is then calculated for differing degrees of measurement error in the assessment of the exposure and the outcome.
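As an illustration of the model described in the Methods, the following is a minimal simulation sketch, not the authors' calculation. It assumes a two-group genotype (carrier versus non-carrier of the minor allele, with a hypothetical parameter `carrier_freq`), a standard normal true exposure, and a true outcome with unit variance in the non-carrier group. All function and parameter names are invented for this example. Error standard deviations are chosen so that the correlation between the measured and true values equals `rho_tx` and `rho_ty` under the classical error model.

```python
import math
import random


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate_observed_slopes(n, beta_common=0.25, slope_ratio=2.0,
                             carrier_freq=0.2, rho_tx=0.7, rho_ty=0.7, seed=0):
    """Return (non-carrier slope, carrier slope) fitted to error-prone data.

    Assumptions (illustrative only): binary genotype, true exposure ~ N(0, 1),
    true outcome with unit variance in non-carriers, independent additive errors.
    """
    rng = random.Random(seed)
    # Classical error: measured = true + error. With unit true variance,
    # corr(measured, true) = rho requires error variance 1/rho^2 - 1.
    sd_err_x = math.sqrt(1.0 / rho_tx ** 2 - 1.0)
    sd_err_y = math.sqrt(1.0 / rho_ty ** 2 - 1.0)
    # Residual SD giving the true outcome unit variance in non-carriers.
    sd_resid = math.sqrt(1.0 - beta_common ** 2)
    data = {False: ([], []), True: ([], [])}
    for _ in range(n):
        carrier = rng.random() < carrier_freq
        beta = beta_common * slope_ratio if carrier else beta_common
        true_x = rng.gauss(0.0, 1.0)
        true_y = beta * true_x + rng.gauss(0.0, sd_resid)
        xs, ys = data[carrier]
        xs.append(true_x + rng.gauss(0.0, sd_err_x))
        ys.append(true_y + rng.gauss(0.0, sd_err_y))
    return ols_slope(*data[False]), ols_slope(*data[True])


if __name__ == "__main__":
    common, carrier = simulate_observed_slopes(100_000, seed=1)
    print("observed slope, non-carriers:", round(common, 3))
    print("observed slope, carriers:    ", round(carrier, 3))
    print("observed slope ratio:        ", round(carrier / common, 2))
```

Under these assumptions, both fitted slopes shrink towards zero by roughly the factor `rho_tx ** 2`, so the expected slope ratio is preserved, but each slope is estimated with more noise. That loss of precision is what drives up the sample size needed to detect the interaction.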

Results: The required sample size depends on the magnitude of the interaction, the allele frequency and the strength of the association in those with the common allele. As an example, we take the situation in which the effect size in those with the common allele is a quarter of a standard deviation change in the outcome for a standard deviation change in the exposure. If a minor allele with a frequency of 20% leads to a doubling of that effect size, then the sample size is highly dependent upon the precision with which the exposure and outcome are measured. rho(Tx) and rho(Ty) denote the correlations of the measured exposure and the measured outcome, respectively, with their true values. If poor measures of the exposure and outcome are used (e.g. rho(Tx) = 0.3, rho(Ty) = 0.4), then a study size of 150 989 people would be required to detect the interaction with 95% power at a significance level of 10⁻⁴. Such an interaction could be detected in study samples of under 10 000 people if more precise measurements of exposure and outcome were made (e.g. rho(Tx) = 0.7, rho(Ty) = 0.7), and possibly in samples of under 5000 if the precision of estimation were enhanced by taking repeated measurements.
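To give a feel for why precision matters so much in the example above, the sketch below applies standard results for the classical measurement error model. It does not reproduce the paper's sample size formulae or the figures quoted in the Results. It assumes independent additive errors; the function names are hypothetical, and the repeated-measures step uses the Spearman-Brown relation for the mean of k replicates with independent errors.

```python
import math


def observed_slope(true_slope, rho_tx):
    """Slope of the outcome on the error-prone exposure.

    Under classical error the slope is multiplied by the reliability
    of the exposure measure, which equals rho_tx squared.
    """
    return true_slope * rho_tx ** 2


def observed_correlation(true_corr, rho_tx, rho_ty):
    """Correlation between measured exposure and measured outcome,
    assuming the two measurement errors are independent."""
    return true_corr * rho_tx * rho_ty


def rho_of_mean(rho_single, k):
    """Correlation with the true value when k replicate measurements
    with independent errors are averaged (Spearman-Brown)."""
    reliability = rho_single ** 2
    return math.sqrt(k * reliability / (1.0 + (k - 1) * reliability))


if __name__ == "__main__":
    # True standardized association of 0.25 in those with the common allele.
    for rho_tx, rho_ty in [(0.3, 0.4), (0.7, 0.7)]:
        r_obs = observed_correlation(0.25, rho_tx, rho_ty)
        print(f"rho(Tx)={rho_tx}, rho(Ty)={rho_ty}: observed correlation {r_obs:.4f}")
    # Effect of averaging two replicates of a measure with rho = 0.7.
    print(f"rho of mean of 2 replicates: {rho_of_mean(0.7, 2):.3f}")
```

With the poor measures the observed association is about a quarter of that seen with the better measures, and averaging replicates raises the effective correlation with the true value further. This is consistent in direction with the much smaller samples reported for more precise and repeated measurement.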

Conclusions: The formulae for calculating the sample size required to study the interaction between a continuous exposure and a genetic factor on a continuous outcome variable in the face of measurement error will be of considerable utility in designing studies with appropriate power. These calculations suggest that smaller studies with repeated and more precise measurement of the exposure and outcome will be as powerful as studies even 20 times bigger, which necessarily employ less precise measures because of their size. Even though the cost of genotyping is falling, the magnitude of the effect of measurement error on the power to detect interaction on continuous traits suggests that investment in studies with better measurement may be a more appropriate strategy than attempting to deal with error by increasing sample sizes.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Environmental Exposure
Epidemiologic Methods*
Gene Frequency
Genetic Predisposition to Disease*
Genotype
Humans
Linear Models
Models, Statistical*
Quantitative Trait, Heritable*
Regression Analysis
Research Design
Sample Size