Objectives Selective participation can bias results in epidemiological surveys. The importance of health status is often suggested as a possible explanation for non-participation but few empirical studies exist. In a population-based study, explicitly focused on sickness absence, health and work, we examined whether a history of high levels of sickness absence was associated with non-participation.
Design The study is based on data from official sickness absence registers from participants, non-participants and the total target population of the baseline survey of the Health Assets Project (HAP).
Setting HAP is a population-based cohort study in the Västra Götaland region in South Western Sweden.
Participants HAP included a random population cohort (n=7984) and 2 cohorts with recent sickness absence (employees (n=6140) and non-employees (n=990)), extracted from the same overall general working-age population.
Primary outcome measures We examined differences in participation rates between cohorts (2008), and differences in previous sickness absence (2001–2008) between participants (individual-level data) and non-participants or the target population (group-level data) within cohorts.
Results Participants had statistically significantly less registered sickness absence in the past than non-participants and the target population for some, but not all, of the years analysed. Yet these differences were not of substantial size. Factors other than sickness absence were more important in explaining differences in participation: participants were more likely than non-participants to be women, older, born in Nordic countries, married and to have higher incomes.
Conclusions Although the survey specifically addressed sickness absence, having such experience did not add any substantial layer to selective participation in the present survey. Detailed measures are needed to gain a better understanding of health selection in health-related surveys such as those addressing sickness absence, for instance in order to discriminate between selection due to ability or motivation for participation.
- Selection bias
- Sickness absence
- participation rate
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Strengths and limitations of this study
Selective participation by history of sickness absence was examined employing official registries of sickness absence across 8 years. Such health data have rarely been applied in former studies on survey representativeness.
The sickness absence data on participants, non-participants and the target population alike are based on all reimbursements from the Social Insurance Agency, and are not self-reported, which is a strength with regard to common methodological problems such as attrition and recall bias.
Since data from a population-based survey were employed, the observed results may reflect general tendencies concerning selective survey participation.
Both recent and more distant sickness absence were included as predictors for participation, which may provide evidence on representativeness of participants concerning both recent time and recurrent sickness absence.
The study does not investigate mechanisms driving an association between sickness absence and survey participation, such as obstacles or motivations, which also are important to clarify to provide decision support for how to best approach potential participants.
Sickness absence is a major challenge and policy development requires high-quality and unbiased data. In sickness absence research, surveys and cohort studies remain important to gain better understanding of variations in level, causes, consequences and mechanisms of sickness absence across social groups and gender. A crux of any survey is to ensure sample representativeness; if participants are different from non-participants in the variables of interest, estimates may suffer from bias.1 ,2 The declining participation rates in epidemiological surveys observed across Western countries in the past 30 years are therefore worrying.3 ,4 Registry data can circumvent issues regarding participation, but often lack the required depth of information for sickness absence research to move forward. Consequently, knowledge about selective survey participation and, in particular, concerning the key variable, sickness absence, is needed to provide researchers with decision support in how to contact participants and, perhaps more importantly, to evaluate the accuracy of results from such surveys.
In surveys across topics, demographic factors such as female gender, being married and higher socioeconomic position are consistently found to predict survey participation,5–9 whereas the evidence regarding age groups and ethnicity is less conclusive.10 Existing evidence further suggests health selection whereby participants have better general6 ,8 ,11 ,12 and mental health,13 are less likely to be on6–8 or at risk for disability pension award,11 and also have a higher life expectancy14 than non-participants. Studies of health status and survey participation have mostly examined rare health-related events (such as hospitalisation), or severe or long-lasting illness (like disability pension award and mortality). Barriers and selection mechanisms may be different in these cases than for sickness absence, which is common in the entire population, fluctuates and, in the majority of cases, concerns common musculoskeletal and mental illnesses. Sickness absence is moreover a measure of health that reflects aspects related to functional and working ability, which might be more relevant than diagnoses in explaining survey behaviour.
If and how sickness absence predicts survey participation is uncertain. Linkages to administrative registries are expedient, as they enable unbiased and complete data from participants and non-participants.15 Of the few studies that have employed such data, some have found that participants have lower sickness absence rates than non-participants, in line with health selection to survey participation.5 ,16–18 Others have found this among men only,9 or report weak19 or no7 association between sickness absence and survey participation. The equivocal findings may relate to variations in measures and follow-up time, as well as complex selection mechanisms involving reachability, ability and motivation to participate.20
Concerning motivation, it is commonly proposed that people will be more prone to participate if the survey topic is relevant for them personally.20–23 In interviews with participants and responding non-participants, perceived value or personal gain of contributing to advances in research in the topic has been highlighted as decisive.22 ,24 Following this line of thought, studies addressing sickness absence should lead to increased inclusion of current and previous sickness absentees. Direct measures of relevance are difficult to obtain in representative samples of study participants, and a feasible compromise is to match characteristics of sampled individuals and the core topic of the survey, and infer topic relevance via these characteristics.25 Based on this approach, personal relevance selection is found through randomised controlled designs,26 observed by the general experience that cases are easier to recruit than controls in case–control studies10 and relating to consent giving in medical record follow-ups.27
Only one study has addressed personal relevance selection in surveys on sickness absence specifically16 in which, in contrast to the personal relevance hypothesis, participants were found to have less sickness absence than non-participants. Owing to a small study population from one company only, the finding might not be generalisable to a general population context.
Taken together, it remains empirically unsettled whether sickness absence history influences survey participation, particularly in surveys where sickness absence is the main topic. The generally decreasing participation rates call for studies that can provide a basis for how to approach potential participants in the future. In the current study, we analysed associations between registered sickness absence and survey participation in a large population-based survey-linkage study that explicitly focused on sickness absence (the Health Assets Project, HAP). HAP started in 2008 with the main aim of comparing workers with sickness absence experiences to those without such experience concerning health, work life and family affairs. To this end, a unique feature of HAP was the use of a ‘case–control’ sampling technique, sampling two cohorts with a recent, new sickness absence episode (employees and non-employees) in addition to a random population cohort (not recently sick-listed ‘controls’), all extracted from a working age population of the Västra Götaland region in Sweden. This technique has, for example, enabled studies of differences in individual and structural factors between workers with and without sickness absence28 ,29 and predictors of return to work.30 ,31 The data collection included links to official registries covering demographics and sickness absence days per year across 9 years (2001–2009), extracted at an individual level for participants and group level for the target populations for each of the three cohorts. This specific design allowed for examining our research aim through the following research questions:
Were the participation rates higher in the two cohorts with a recent, new episode of sickness absence (employees and non-employees) than in the random population cohort?
Within each of the three cohorts, respectively, did participants have more sickness absence days annually in the years preceding the survey (2001–2008) than non-participants or the target population?
Within each of the three cohorts, respectively, were the proportions of individuals with registered sickness absence annually in the years preceding the survey (towards 2001) higher among participants than non-participants or the target population?
This study is based on registry data from participants, non-participants and the target population of the baseline survey of HAP 2008. Figure 1 depicts the sampling procedure in HAP, which specific components are compared and the data available for each component in the current study.
Target population and cohorts in HAP
The study base in HAP was the working age population (19–64 years) in Västra Götaland in Sweden, a region with both urban and rural areas comprising 17% of the Swedish population. In Sweden, all inhabitants are covered by the national sickness insurance. For employees, the employer covers the first 14 days of a sickness absence episode (except one qualifying day); thereafter, benefits are granted from the Social Insurance Agency (SIA). Non-employed (eg, self-employed, unemployed and students) can apply through self-report for benefit from SIA for sickness absence beyond 1 day. SIA thus has registries of all covered sickness absence beyond 14 days for employees and beyond 1 day for non-employees. With help from SIA and Statistics Sweden, the following three cohorts were extracted from the study base to obtain groups with and without recent sickness absence (see also figure 1 and ref. 28 for more details):
A recent sick-listed cohort of employees (employee cohort), of which the target population consisted of all employed individuals with a new sickness absence episode >14 days during 18 February to 15 April 2008 (n=12 543).
A recent sick-listed cohort of non-employees (non-employee cohort), where the target population included all other insured with a new sickness absence episode >1 day during 18 February to 1 April 2008 (n=5004). The sampling frame for these cohorts only included those registered in SIA by 15 April 2008 (n=6140 in the employee cohort and n=4240 in the non-employee cohort), as the survey ideally should be conducted as close as possible to the current absence episode. In the employee cohort, the total sampling frame was invited to participate (n=6140), whereas a random sample of the non-employee-sampling frame was invited (n=990).
Finally, a random population cohort (population cohort, n=7984) was invited. A negative coordination was performed to ensure non-overlapping cohorts; thus, the population cohort included no cases with new registrations of sickness absence during inclusion.
Data collection: Eligible participants were invited through a postal survey, sent out on 15th and 25th of April 2008 with two reminders (ie, up until 2 months after onset of the registered sickness absence episode for the 2 sick-listed cohorts). The invitation letter included a description of the study aim, data collection procedures, contact details and information that withdrawal from the study was possible at any time. It was explicitly stated that the SIA would not have access to information on participation status and that participation would not affect the invitee's sickness allowance. Participants gave informed consent to link survey data to official registry data on sociodemographic factors, sickness absence and employment status. For the current study, we extracted the corresponding registry data for each of the three cohorts' target populations, which are officially available at a grouped level.
In the following, the registry data employed in the current study will be described in more detail, including amendments made to enable comparisons between the individual-level data (participants) and group-level data (non-participants and target population).
Data source and measures on demographic variables
Regarding demographic variables, group-level data from all invited were extracted from Statistics Sweden: Participation (yes, no), gender (male, female), age group (19–30, 31–50, 51–64), country of birth (Nordic, others), marital status (married, not married (includes cohabitants)) and gross income in intervals (Swedish Krona (SEK)≤149 000, 150 000–299 000, ≥300 000).
Data sources on registered sickness absence
Data on sickness absence benefit granted from SIA during the years 2001–2008 were extracted from the ‘Longitudinal integrated database for sickness insurance and labour market research (LISA)’. The data included annual number of reimbursed sickness absence days (including sickness absence, rehabilitation and work injury allowance; see footnote i). Data on participants were available at an individual level and data on the target populations at a group level, distributed by gender and age groups (employee cohort and non-employee cohort: age groups 19–30, 31–50, 51–64; Västra Götaland population: age groups 20–29, 30–39, 40–49, 50–59 for data on sickness absence days and 16–19, 20–24, 25–29, 30–34, 35–39, 40–44, 45–49, 50–54, 55–59 for data on sickness absence cases).
To achieve appropriate comparison groups, the following accommodations were made: First, since the data from the target populations for the employee cohort and the non-employee cohort included those granted reimbursement, we excluded participants with no registered sickness absence days in 2008 from the participant groups. Second, to approximate non-participation groups, we subtracted participants in the employee cohort and non-employee cohort from their respective target populations. Finally, we handled problems with age-related left censoring back in time (towards 2001) by only including those aged 31–64 in 2008 in the employee cohort and non-employee cohort. In the population cohort, to correspond to available official statistics, we included participants aged 20–59 per calendar year when comparing sickness absence days, and participants aged 16–59 per calendar year when comparing sickness absence cases.
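The subtraction step can be illustrated with a short sketch. It assumes that the group-level data give the size and mean number of absence days of a target population, and that the individual-level data give the same two quantities for participants; the non-participant mean then follows from removing the participants' total days and head count. The function name and all figures are hypothetical and are not taken from HAP.

```python
def non_participant_mean(n_target, mean_target, n_participants, mean_participants):
    """Mean absence days among non-participants, derived by subtraction.

    Hypothetical helper: the target population is treated as the union of
    participants and non-participants, so group totals can be subtracted.
    """
    n_non = n_target - n_participants
    if n_non <= 0:
        raise ValueError("target population must be larger than the participant group")
    total_days_target = n_target * mean_target
    total_days_participants = n_participants * mean_participants
    return (total_days_target - total_days_participants) / n_non


# Illustrative numbers only (not HAP data).
example = non_participant_mean(n_target=100, mean_target=50.0,
                               n_participants=40, mean_participants=35.0)
print(example)
```

The derived group is only an approximation of the invited non-participants whenever the target population also contains individuals who were never invited.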
Measures on registered sickness absence
Participation rates between cohorts. As a first crude step, we examined whether participation rate in the two cohorts with a recent registered sickness absence episode (employee cohort and non-employee cohort) differed from that in the population cohort.
Days with registered sickness absence annually. We compared mean number of registered sickness absence days per year (2001–2008) between participants and non-participants (employee cohort and non-employee cohort) and the target population (all 3 cohorts).
Proportions with previous sickness absence annually. Finally, we compared the proportion of individuals with registered sickness absence per year between participants and non-participants (employee cohort and non-employee cohort, 2001–2007) or the target population (population cohort, 2001–2008).
The data were analysed using Microsoft Excel 2010 and Stata V.12. Differences in participation rate and distribution across demographic characteristics between participants and non-participants in each of the three cohorts were examined as relative proportions and χ2 tests for group-level data. Regarding sickness absence, we first compared participation rates with 95% CIs between the cohorts and performed χ2 tests for group-level data. Second, we performed one-sample Student's t-tests to examine differences in mean number of sickness absence days per year, and across years, from 2001 until 2008 between participants and their comparison groups in each cohort, respectively. To account for gender and age differences between the comparison groups, we calculated means weighted for the distribution in the respective participant groups. Finally, to compare proportions with registered sickness absence per year, gender-stratified ORs (95% CIs) were calculated comparing participants and their comparison groups in each cohort, respectively.
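As a rough illustration of the weighting and the one-sample test described above, the sketch below weights comparison-group stratum means by the participants' gender and age distribution and then computes a one-sample t statistic against that weighted reference value. The analyses in the study were run in Excel and Stata; the Python function names, strata and numbers here are assumptions made for illustration only.

```python
import math
import statistics


def weighted_reference_mean(stratum_means, participant_counts):
    """Weight comparison-group stratum means by the participants' distribution."""
    total = sum(participant_counts.values())
    return sum(stratum_means[s] * n for s, n in participant_counts.items()) / total


def one_sample_t(sample, reference_mean):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - reference_mean) / standard_error, n - 1


# Hypothetical strata (gender, age group) and values, not HAP data.
comparison_means = {("women", "31-50"): 30.0, ("women", "51-64"): 45.0,
                    ("men", "31-50"): 20.0, ("men", "51-64"): 35.0}
participants_per_stratum = {("women", "31-50"): 400, ("women", "51-64"): 300,
                            ("men", "31-50"): 200, ("men", "51-64"): 100}

reference = weighted_reference_mean(comparison_means, participants_per_stratum)
print(reference)

# Toy individual-level sample of absence days for participants.
t_stat, df = one_sample_t([1, 2, 3, 4, 5], reference_mean=2.0)
print(t_stat, df)
```

The reference mean is treated as a fixed value because only group-level data exist for the comparison groups, which is why a one-sample rather than a two-sample test is used.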
Demographic characteristics of participants and non-participants
Table 1 displays demographic characteristics and participation rates across groups between participants and invited non-participants in the three cohorts. Participants were more likely than non-participants to be women, older, born in Nordic countries, married and have higher incomes in both the population cohort and the recent sick-listed employee cohort. The demographic distribution was more even between participants and non-participants in the non-employee cohort, though participants were more likely than non-participants to be women and to be born in Nordic countries.
Differences in participation rates between cohorts
The participation rate was 3.5 percentage points higher in the employee cohort (53.9%, 95% CI 52.7% to 55.2%) than in the population cohort (50.4%, 95% CI 49.3% to 51.5%) (χ2=16.75, df=1, p<0.001). The participation rate in the non-employee cohort (50.3%, 95% CI 47.2% to 53.5%) was similar to that in the population cohort (χ2=0.00, df=1, p=0.936). As detailed in table 1, there were more variations overall in participation rates across demographic groups within cohorts than between the cohorts.
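The comparison of participation rates can be sketched as follows, assuming a normal-approximation (Wald) interval and an uncorrected Pearson χ2 test on a 2×2 table; the text does not state which variants were used. The numbers invited are those reported for the cohorts, whereas the participant counts are approximations back-calculated from the reported rates, so the resulting statistic need not match the published value exactly.

```python
import math


def wald_ci(successes, n, z=1.96):
    """Proportion with a normal-approximation (Wald) 95% CI."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, df=1
    return chi2, p_value


# Invited counts are from the text; participant counts are ASSUMED values
# back-calculated from the reported 53.9% and 50.4%.
employee_invited, employee_participants = 6140, 3310
population_invited, population_participants = 7984, 4024

employee_rate = wald_ci(employee_participants, employee_invited)
population_rate = wald_ci(population_participants, population_invited)
chi2, p_value = chi2_2x2(employee_participants,
                         employee_invited - employee_participants,
                         population_participants,
                         population_invited - population_participants)
print(employee_rate, population_rate, chi2, p_value)
```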
Differences in mean days of registered sickness absence between participants and comparison groups within cohorts
Overall, there were no substantial differences in registered sickness absence between participants and their comparison groups across the three cohorts. Participants in the population cohort had a lower mean number of sickness absence days per year than the corresponding level in the population in the years 2001–2003. Weighted for gender and age distribution among participants, the differences were statistically significant through 2001–2008, except 2007. Yet the raw differences in annual mean number of registered sickness absence days only ranged from 1.7 to 5.3 days (table 2). The same tendency was found in the employee cohort; however, it was only statistically significant when comparing participants to non-participants in the years 2001–2003 and 2007, weighted for gender and age distribution (table 2). By contrast, participants in the non-employee cohort had a higher mean number of sickness absence days per year than non-participants and the target population in 2008 and 2007, gender and age weighted (table 2).
Differences in proportions with registered sickness absence between participants and comparison groups within cohorts
Regarding individuals with registered sickness absence per year, the proportions were lower overall among participants than non-participants or the target population. In the population cohort, compared with the target population, participants had statistically significantly lower odds of having had an episode of sickness absence only in 2001 and 2003 for women, and in 2001, 2002 and 2003 for men (ORs ranging from 0.84 to 0.91 for women and 0.76 to 0.80 for men, table 3). In the employee cohort, compared with non-participants, participants had statistically significantly lower odds of having had an episode of sickness absence in most of the comparisons per year from 2001 to 2007 (ORs ranging from 0.87 to 0.95 for women and 0.77 to 0.88 for men, see table 3). The corresponding comparisons in the non-employee cohort resulted in small and generally non-significant differences, and in opposing directions for men and women (table 3).
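For readers unfamiliar with the measure, an OR with a 95% CI for one gender stratum and one year can be computed from four counts, as sketched below. The log-based (Woolf) interval is an assumption, since the text does not state which CI method was used, and it presumes two independent groups. The function name and counts are hypothetical.

```python
import math


def odds_ratio_ci(part_with, part_without, comp_with, comp_without, z=1.96):
    """Odds ratio (participants vs comparison group) with a Woolf 95% CI.

    Arguments are counts of individuals with and without registered
    sickness absence in each group.
    """
    odds_ratio = (part_with * comp_without) / (part_without * comp_with)
    se_log = math.sqrt(1 / part_with + 1 / part_without
                       + 1 / comp_with + 1 / comp_without)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts for a single stratum and year (not HAP data).
or_value, or_lower, or_upper = odds_ratio_ci(10, 90, 20, 80)
print(or_value, or_lower, or_upper)
```

An OR below 1 with an interval excluding 1 would correspond to participants having statistically significantly lower odds of registered sickness absence than the comparison group.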
Participants in the HAP study, which specifically invited people to a survey on sickness absence, health and work, had less registered sickness absence in the past than non-participants and the target population in some but not all of the years analysed. The differences found in sickness absence were moreover not of substantial size. Secondary findings harmonise with commonly observed differences in sociodemographic characteristics as participants were more likely than non-participants to be women, older, born in Nordic countries, married and have higher incomes.
Strengths and limitations
The strengths of this study were chiefly related to our application of objective registry data on sickness absence from participants and the target population. First, this enabled investigation of selection effects by sickness absence, which has rarely been achievable in prior research and restricted in many countries by lack of available registries. This study examined sickness absence history across more years than in previous studies. Many non-participation analyses on health variables are based on supplementary surveys of ‘participating non-participants’ willing to complete a shortened version of the survey, with the inherent risk of partly reproducing the same non-participation bias.32 Second, the use of registries reduced common methodological problems such as recall bias and missing responses.15 Third, since the registry data are based on financial reimbursement from the SIA, they are considered to be accurate and reliable. Finally, examining sickness absence several years before the survey is a particular advantage when studying selection by sickness absence, as the phenomenon on the one hand is common, with a 1-year cumulative incidence of 11.3% in the working population in Western Sweden in 200833 and, on the other hand, in some cases is prolonged and recurrent. Thus, the findings might inform representativeness of participants regarding both present time and prolonged or recurrent cases. Additionally, most studies on sickness absence as a predictor for survey participation have employed specific occupational5 ,9 ,16 ,18 or diagnostic groups.34 These groups may have specific distributions of sickness absence and demography making the observed results not necessarily applicable to other groups. Since this study examined population-based cohorts, the results may to a greater extent be regarded as general tendencies.
Despite considerable advantages in applying registries in research, the quality and accuracy of an analysis rest on the information available. First, some participants had either no days but one or more episodes of registered absence or vice versa, whereas it was uncertain whether there were corresponding cases among non-participants, due to the use of group-level data. This uncertainty might have produced noise in the analyses. Our results were, however, quite robust across alternative analyses of the data, strengthening our confidence in the observed findings.
Second, the skewed distribution of sickness absence days makes median calculations more appropriate than means.35 The use of group-level data on the target populations precluded calculating median values and SD estimates for the comparison groups. The one-sample Student's t-test was considered a valid approach based on the data available as the Student's t-test is very robust for comparing means, and as the distribution of means, according to ‘the central limit theorem’, will approximate a normal distribution when the sample size increases, even when the distribution in the population is non-normal.36 That said, interpreting the mean values by themselves can be problematic when the distribution of the data is skewed. Though means of sickness absence days arguably are fairly meaningful, interpretations of results should focus more on the differences in means between groups than on the mean values themselves.
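The central limit theorem argument can be illustrated with a small simulation. The sketch draws absence days from a strongly skewed, zero-inflated distribution and compares the skewness of the raw values with the skewness of repeated sample means. The distribution (15% with absence, exponentially distributed with a mean of 60 days), the sample size and all function names are assumptions chosen for illustration and do not describe the HAP data.

```python
import math
import random
import statistics


def skewness(values):
    """Moment-based sample skewness (third standardised moment)."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sum((v - mean) ** 3 for v in values) / (n * sd ** 3)


def draw_absence_days(rng):
    """Hypothetical skewed distribution: most have 0 days, a few have many."""
    return rng.expovariate(1 / 60) if rng.random() < 0.15 else 0.0


def simulate(seed=1, sample_size=400, replicates=1000):
    """Return (skewness of raw values, skewness of sample means)."""
    rng = random.Random(seed)
    raw = [draw_absence_days(rng) for _ in range(50_000)]
    means = [statistics.fmean(draw_absence_days(rng) for _ in range(sample_size))
             for _ in range(replicates)]
    return skewness(raw), skewness(means)


raw_skew, mean_skew = simulate()
print(raw_skew, mean_skew)
```

Under these assumptions the sample means are far less skewed than the underlying values, which is the property the t-test relies on; the simulation says nothing about whether the mean is the most informative summary of such data.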
Third, owing to the fluctuating nature of sickness absence and lag in registry administration, our comparison groups for research question 1 were inevitably somewhat overlapping concerning sickness absence status. The population cohort naturally included some ongoing cases and some cases with onset after inclusion (sampling procedures ensured no new cases during inclusion, but 6.7% of the population cohort participants self-reported being currently sickness-absent). Nevertheless, since the employee cohort and non-employee cohort all had recent sickness absence (ie, this was required for inclusion in these cohorts), the comparison of participation rates between the cohorts was regarded as appropriate. As for the within-cohort comparisons, non-participants in the sick-listed cohorts comprised the respective target populations minus participants. These target populations also included some non-invited individuals due to registration in SIA after the predefined inclusion period. Lagged registration in SIA is in general slightly skewed.37 A sensitivity analysis, however, revealed no differences in outcome between those invited in the first and second rounds in the employee cohort, with late registrations presumably over-represented in the latter, indicating fairly comparable sickness absence histories between the invited and non-invited non-participants (numbers not shown).
Finally, we only had access to a limited number of variables characterising the non-participation group. Hence, we cannot rule out an impact from residual confounding, especially from socioeconomic factors,9 ,19 on our results. The data available on income, country of birth and marital status were retrieved separately from the sickness absence data, precluding statistical adjustment. The registry data did not include information on medicolegal cause or specific timing of the sickness absence episodes beyond number of registered days per year, precluding some analyses on how sickness absence might influence survey participation.
Interpretation of the findings
Selection effects by topic relevance are assumed to be a particular statistical concern as associations are more prone to be biased if selection has to do with the key statistics.1 ,10 ,25 Empirical tests of this assumption have thus far not found consequential impact on survey estimates analysing associations,1 in line with most,6 ,10 ,11 ,38 though not all,12 ,39 available studies on non-participation bias. Prevalence estimates are notably more vulnerable to selection bias. Levels of registered sickness absence among participants did not diverge substantially from the target populations in HAP, and selection by sickness absence is thus not likely to be any substantial source of bias in this particular survey.
As described in the introduction, selection mechanisms in surveys are complex and involve reachability, ability and motivation to participate. Sickness absence-related motivators and barriers may have influenced participation in opposite directions, as will be elaborated on in the following, in concert contributing to the finding of relatively similar sickness absence histories between participants and non-participants. The study design did not allow for addressing these nuances directly, but the observed results might shed light on some aspects to be addressed in more detail in future studies. Personal relevance by recent or previous sickness absence seemed not to be a prominent selection mechanism for this survey. Notably, the participation rate was slightly higher in the recent sick-listed employee cohort than in the population cohort. This could be interpreted as a ‘recency effect’ of personal relevance selection, as the finding contrasted with the results regarding more distant sickness absence. The employee cohort nevertheless also included more women than the population cohort, and as women tend to participate more than men,10 this might have contributed to the observed result. The absolute difference of 3.5 percentage points may also be considered of little practical importance. Results for the non-employee cohort diverged somewhat from the two other cohorts as well. This might be explained by numerous factors specific to this cohort, such as absence registration schemes, considerable heterogeneity including students and self-employed people, and finally the small size of this sample.
The overall finding in this study seemed more to reflect a reduced health and functional capacity among non-participants, as we found somewhat less previous sickness absence among those who participated than those who did not. According to the ‘health selection hypothesis’, illness precludes participation in research.6 ,8 ,11 Several potentially opposing mechanisms may have contributed to this finding. Naturally, current or recent sickness absence can simply entail reduced ability to participate due to poor health, fatigue, lack of motivation or hospitalisation, even though the person under normal circumstances would be inclined to participate. Besides, social inequalities are related to both sickness absence40 and differential participation.8 ,9 Barriers and facilitators for survey participation across social groups are not well understood, but may involve structural barriers and differences in norms and perceived social value of research.10 ,41 ,42 Some barriers could be specific to sickness absence: First, ‘oversurveying’ is suggested to contribute to explaining falling participation rates in general.10 Recurrent or long-term sickness absence requires repeated assessments of work capacity to be eligible for sickness insurance, and being approached with yet another questionnaire might not have been welcomed by some of those invited. We do not know anything about “partial participation”, for example, persons who started to answer the questionnaire, which was rather substantial, but gave up due to tiredness or lack of motivation. Second, sensitive questionnaire items decrease participation rates.26 Stigma and shame related to some diagnoses such as mental illnesses10 ,32 or to the sickness absence status per se43 could thus have made some more hesitant to participate.
In concert with this interpretation, an epidemiological survey on mental health found participants to have fewer psychotropic prescriptions than non-participants, although using more medical services for somatic disorders.32 The assurance of confidentiality in the invitation letter in the HAP study, including that the questionnaire was not related to the employer or SIA, probably partly counteracted non-participation due to fear of “exposure”,26 but how much is not easily quantifiable. Diagnoses may also have yielded differences in personal relevance motivation, as the survey overall was directed more towards mental than physical aspects of work, health and sickness absence. In sum, a more direct and specified measure of perceived relevance and attitude towards the topic, although challenging to obtain, could in theory have discriminated better between individual motivations and barriers for participation.
Selective participation remains a challenge in epidemiological surveys, yet again demonstrated by demographic differences between participants and non-participants in the HAP survey. Sickness absence did not seem to add any substantial layer to the selection, based on several registry-based comparisons in the current study. Registry data are a crucial resource for increasing knowledge on selective participation. Detailed measures are needed to gain a better understanding of health selection in health-related surveys such as those addressing sickness absence, for instance in order to discriminate between selection due to ability or motivation for survey participation. Until such studies are performed, the overall findings of this study do not give rise to much concern about the representativeness of survey participants regarding sickness absence history.
The authors thank Carl Högfeldt and Ulrik Lidwall at the Swedish Social Insurance Agency for availability and help in extracting and interpreting the group-level sickness absence registry data.
Contributors MK, JL, GH, SØ and KH designed the study. MK analysed the data and wrote the first draft and main revisions of the manuscript. All authors contributed in interpretation of the data and critical revision of the manuscript, and approved the final version of the manuscript.
Funding The data collection for the Health Assets Project was supported by the Swedish Social Insurance Agency.
Competing interests None declared.
Ethics approval The HAP study was approved by the Ethics Committee at the University of Gothenburg (registration number 039-08) and conducted in accordance with the latest version of the Helsinki protocol.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional data are available.
i The Västra Götaland general population statistics did not include work injury allowance, but this is regarded as negligible for the analyses due to small numbers.