Original Article
In an occupational health surveillance study, auxiliary data from administrative health and occupational databases effectively corrected for nonresponse

https://doi.org/10.1016/j.jclinepi.2013.10.017

Abstract

Objectives

To show how reweighting can correct for unit nonresponse bias in an occupational health surveillance survey by using data from administrative databases in addition to classic sociodemographic data.

Study Design and Setting

In 2010, about 10,000 workers covered by a French health insurance fund were randomly selected and were sent a postal questionnaire. Simultaneously, auxiliary data from routine health insurance and occupational databases were collected for all these workers. To model the probability of response to the questionnaire, logistic regressions were performed with these auxiliary data to compute weights for correcting unit nonresponse. Corrected prevalences of questionnaire variables were estimated under several assumptions regarding the missing data process. The impact of reweighting was evaluated by a sensitivity analysis.

Results

Respondents had more reimbursement claims for medical services than nonrespondents but fewer reimbursements for medical prescriptions or hospitalizations. Salaried workers, workers in service companies, and those who had held their job for longer than 6 months were more likely to respond. Corrected prevalences after reweighting were slightly different from crude prevalences for some variables but meaningfully different for others.

Conclusion

Linking health insurance and occupational data effectively corrects for nonresponse bias using reweighting techniques. Sociodemographic variables alone may not be sufficient to correct for nonresponse.

Introduction

What is new?

What this adds to what was known?

  1. This study shows not only the value of linking routine health insurance and occupational data to study nonresponse bias but also how these data can be used to model the response probability and to estimate prevalences by reweighting techniques.

What is the implication and what should change now?

  1. In an epidemiological surveillance survey, it is not sufficient to correct nonresponse bias solely with sociodemographic variables. The health and occupation-related data available for both respondents and nonrespondents should also be used.

A decline in participation rates in epidemiological studies has been observed in recent decades [1]. It is a particular concern in epidemiological surveillance surveys that aim to provide descriptive statistics that may be extrapolated to a target population. A nonresponse bias occurs when the response probability (also called response propensity) and the outcome variable are correlated [2]. It can be corrected when this correlation is completely explained by a known set of variables. Two main techniques can be used for dealing with nonresponse [3]. The first is imputation, which consists of modeling the outcome variable and replacing each missing data item by its predicted value. The second is reweighting, which broadly consists of modeling the response probability and then weighting each respondent's data by the inverse of the estimated response probability, so-called inverse probability weighting (IPW). Imputation is generally recommended for partial nonresponse (subjects answered the questionnaire but did not fill in all the questions), and reweighting is recommended for unit nonresponse (subjects did not answer the questionnaire at all) [3], [4]. As we focus on participation in epidemiological studies, we are specifically interested in unit nonresponse and thus in reweighting. Still, it should be noted that both imputation and reweighting require that some data be known for both respondents and nonrespondents. This may be particularly challenging for unit nonresponse, but it can be done when survey data can be linked to existing databases such as medical administrative databases containing health-care, occupational, or sociodemographic information [5]. The aim is then to model the probability of response accurately using the variables available.

Several epidemiological studies have already addressed this issue and have shown that nonresponse is associated with gender, age, marital status, unhealthy lifestyle, healthcare reimbursement, or occupational status [6], [7], [8], [9], [10]. Few studies, however, have used these results to correct the prevalence estimates for nonresponse bias [11], [12]. Reweighting methods are in fact rarely used and are poorly known in the epidemiological community.
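
To make the reweighting argument above concrete, one possible formalization is sketched here. The notation (r_i, y_i, x_i, p_i, beta) is introduced for illustration and is not taken from the article. The normalized form of the weighted estimator and the assumption of equal sampling weights are simplifying choices, not necessarily those of the authors.

```latex
% r_i = 1 if subject i responded, 0 otherwise; y_i = outcome of interest;
% x_i = auxiliary variables known for respondents and nonrespondents alike.
% All notation is illustrative.
p_i = \Pr(r_i = 1 \mid x_i), \qquad
\operatorname{logit}(p_i) = x_i^{\top}\beta

% Approximate bias of the unweighted respondent mean \bar{y}_r:
\operatorname{Bias}(\bar{y}_r) \approx \frac{\operatorname{Cov}(p, y)}{\bar{p}}

% Inverse probability weighted (IPW) estimate of a prevalence:
\hat{\bar{y}}_{\mathrm{IPW}} =
  \frac{\sum_{i:\, r_i = 1} y_i / \hat{p}_i}{\sum_{i:\, r_i = 1} 1 / \hat{p}_i}
```

Under this reading, the bias vanishes when the response probability and the outcome are uncorrelated, and the weighted estimate is approximately unbiased when, given the auxiliary variables x_i, response no longer depends on the outcome.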

The principal objective of the present study was to show how reweighting can correct for unit nonresponse bias in an occupational surveillance survey by using data from administrative databases related to health and occupation, in addition to the sociodemographic data traditionally used. We then evaluated the impact on prevalence estimates of reweighting corrections with these auxiliary data.
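
The comparison between crude and reweighted prevalences can be illustrated with a minimal Python sketch. It is not the authors' analysis: the data are invented, the cell names are hypothetical, and the response probability is estimated as the observed response rate within each auxiliary cell, which is what a logistic model with a single categorical covariate would yield. The crude estimate corresponds to assuming that data are missing completely at random, the reweighted one to assuming that they are missing at random given the cell.

```python
from collections import defaultdict

def make_cell(cell, n_selected, n_respondents, n_cases):
    """Build toy rows (cell, responded, outcome); outcome is None for nonrespondents."""
    rows = [(cell, True, 1) for _ in range(n_cases)]
    rows += [(cell, True, 0) for _ in range(n_respondents - n_cases)]
    rows += [(cell, False, None) for _ in range(n_selected - n_respondents)]
    return rows

def response_propensities(sample):
    """Estimate the response probability in each cell as its observed response rate."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [n_selected, n_respondents]
    for cell, responded, _ in sample:
        counts[cell][0] += 1
        counts[cell][1] += int(responded)
    return {cell: r / n for cell, (n, r) in counts.items()}

def crude_prevalence(sample):
    """Unweighted prevalence among respondents only."""
    ys = [y for _, responded, y in sample if responded]
    return sum(ys) / len(ys)

def ipw_prevalence(sample, propensities):
    """Prevalence among respondents, each weighted by 1 / estimated response probability."""
    num = den = 0.0
    for cell, responded, y in sample:
        if responded:
            w = 1.0 / propensities[cell]
            num += w * y
            den += w
    return num / den

# Invented data: the cell that responds less often has the higher prevalence.
sample = (
    make_cell("salaried", n_selected=100, n_respondents=40, n_cases=10)
    + make_cell("non_salaried", n_selected=100, n_respondents=10, n_cases=5)
)

propensities = response_propensities(sample)
print(f"crude:      {crude_prevalence(sample):.3f}")              # crude:      0.300
print(f"reweighted: {ipw_prevalence(sample, propensities):.3f}")  # reweighted: 0.375
```

In this toy example the crude prevalence understates the reweighted one because the cell with the lower response rate has the higher prevalence; with real data, the direction and size of the correction depend on how strongly the auxiliary variables predict both response and outcome.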

Section snippets

The Coset-MSA cohort

The Coset-MSA study is part of the overall Coset program (Cohort for Epidemiological Surveillance in Connection with Occupation), which aims to study health characteristics and morbidity trends in relation to occupational factors in the French working population [13]. This program relies on data from three cohorts of individuals insured through the three main social welfare funds in France, which cover 95% of the population: the Constances cohort [14], conducted by the French National Institute

Results

Of the 9,307 persons included in the study, 57.2% were salaried, 67.6% were men, and their median age was 43 years. Around 90% of the sample had claimed for reimbursement of medical services and 10% had been hospitalized. The date of last entry was in 2008 for 4% of the sample and 2009 for the remaining 96%. The participation rate in the postal questionnaire was 24.8% (2,320 respondents).

Discussion

In the Coset-MSA study, response probability was related to not only sociodemographic variables but also health and occupational variables. Comparison of the estimated prevalence on outcome variables yielded by questionnaires, under different assumptions on the missing data process (MCAR or MAR), showed moderate but noticeable differences whose magnitude varied according to the variables studied; these differences reflect the association between estimated response probabilities and outcome

Conclusion

This study not only demonstrates the interest of linking routine health insurance and occupational data to study nonresponse bias but also shows how these data can be used to take into account the nonresponse bias for estimating prevalence. The results are quite promising even with a response rate as low as 25%. They indicate that in addition to the response rate, the major concern is the relevance of the data used to correct for nonresponse bias [24], [26]. In our study, work- and

Acknowledgments

The authors thank the Mutualité Sociale Agricole personnel (Alain Pelc, Nicolas Viarouge, Florian Brémaud, and Yves Cosset) for their fruitful collaboration on the Coset-MSA project, the CnamTS for the access to their databases, David Haziza and Jean-Luc Marchand for their precious advice during the analysis step, Marie Zins and her Constances team for the exchanges during the study, and Ellen Imbernon and Marcel Goldberg for their fruitful comments on the manuscript.

References (31)

  • E. Zanutto et al. Using administrative records to impute for nonresponse.

  • M. Goldberg et al. Socioeconomic, demographic, occupational, and health factors associated with participation in a long-term epidemiologic survey: a prospective study of the French GAZEL cohort and its target population. Am J Epidemiol (2001).

  • A.K. Knudsen et al. The health status of nonparticipants in a population-based health study. Am J Epidemiol (2010).

  • P. Martikainen et al. Does survey non-response bias the association between occupational social class and health? Scand J Public Health (2007).

  • B. Geoffroy-Perez et al. Coset: un nouvel outil généraliste pour la surveillance épidémiologique des risques professionnels. Bull Epidemiol Hebd (Paris) (2012).

Cited by (16)

    • Carpal Tunnel Syndrome Among Male French Farmers and Agricultural Workers: Is It Only Associated With Physical Exposure?

      2020, Safety and Health at Work
      Citation Excerpt :

      Among the 10,000 selected workers, 9,477 had a valid postal address, and 2,363 responded to a self-administered postal questionnaire (participation rate: 24.9%) (Fig. 1). Salaried workers, workers in service companies, or those who had held their job for longer than 6 months were more likely to respond [38]. Analyses were carried out only on the data of the cross-sectional pilot study implemented in 2010 and restricted to individuals aged over 30 (there was no CTS in individuals under 30 in the data set), who were active in farming when filling in the questionnaire and who had been working for at least 12 months.

    • A two-phase sampling survey for nonresponse and its paradata to correct nonresponse bias in a health surveillance survey

      2017, Revue d'Epidemiologie et de Sante Publique
      Citation Excerpt :

      They received a 40-page self-administered postal questionnaire about working conditions and health. A postal reminder was mailed one month later [17]. This first phase of the survey (hereafter the “initial survey”) was conducted in February 2010, the response rate being 23.6%.

    • Selection bias was reduced by recontacting nonparticipants

      2016, Journal of Clinical Epidemiology
      Citation Excerpt :

      To compare MI-MNAR with the MI-MAR methods, we also impute the deaths and hospital visits during the follow-up. As these are available for the full cohort, the imputation-based estimates can be benchmarked against the real data [29,31]. In the imputation, it is assumed that the deaths and hospital visits are available for participation groups 1 and 2 but missing for participation group 3, and the imputation model is similar to the model used for the questionnaire variables.

Conflict of interest: None.

Funding: None.