The burden of cancer attributable to modifiable risk factors: the Australian cancer-PAF cohort consortium
  1. Maria E Arriaga1,
  2. Claire M Vajdic1,
  3. Karen Canfell2,3,4,
  4. Robert MacInnis5,
  5. Peter Hull1,
  6. Dianna J Magliano6,
  7. Emily Banks7,
  8. Graham G Giles5,
  9. Robert G Cumming3,8,
  10. Julie E Byles9,
  11. Anne W Taylor10,
  12. Jonathan E Shaw11,
  13. Kay Price12,
  14. Vasant Hirani3,13,
  15. Paul Mitchell14,
  16. Barbara-Ann Adelstein4,
  17. Maarit A Laaksonen1
  1. 1 Centre for Big Data Research in Health, University of New South Wales, Sydney, Australia
  2. 2 Cancer Research Division, Cancer Council New South Wales, Sydney, Australia
  3. 3 School of Public Health, University of Sydney, Sydney, Australia
  4. 4 Prince of Wales Clinical School, University of New South Wales, Sydney, Australia
  5. 5 Cancer Epidemiology Centre, Cancer Council Victoria, Melbourne, Australia
  6. 6 Diabetes and Population Health Laboratory, Baker IDI Heart and Diabetes Institute, Melbourne, Australia
  7. 7 ANU College of Medicine, Biology and Environment, Australian National University, Canberra, Australia
  8. 8 ANZAC Research Institute, University of Sydney and Concord Hospital, Sydney, Australia
  9. 9 Research Centre for Gender, Health and Ageing, University of Newcastle, Newcastle, Australia
  10. 10 School of Medicine, University of Adelaide, Adelaide, Australia
  11. 11 Clinical Diabetes Laboratory, Baker IDI Heart and Diabetes Institute, Melbourne, Australia
  12. 12 School of Nursing and Midwifery, University of South Australia, Adelaide, Australia
  13. 13 School of Life and Environmental Sciences Charles Perkins Centre, University of Sydney, Sydney, Australia
  14. 14 Centre for Vision Research, University of Sydney, Sydney, Australia
  1. Correspondence to Dr. Maarit A Laaksonen; m.laaksonen{at}


Purpose To estimate the Australian cancer burden attributable to lifestyle-related risk factors and their combinations using a novel population attributable fraction (PAF) method that accounts for competing risk of death, risk factor interdependence and statistical uncertainty.

Participants 365 173 adults from seven Australian cohort studies. We linked pooled harmonised individual participant cohort data with population-based cancer and death registries to estimate exposure-cancer and exposure-death associations. Current Australian exposure prevalence was estimated from representative external sources. To illustrate the utility of the new PAF method, we calculated fractions of cancers causally related to body fatness or both tobacco and alcohol consumption avoidable in the next 10 years by risk factor modifications, comparing them with fractions produced by traditional PAF methods.

Findings to date Over 10 years of follow-up, we observed 27 483 incident cancers and 22 078 deaths. Of cancers related to body fatness (n=9258), 13% (95% CI 11% to 16%) could be avoided if those currently overweight or obese had body mass index of 18.5–24.9 kg/m2. Of cancers causally related to both tobacco and alcohol (n=4283), current or former smoking explains 13% (11% to 16%) and consuming more than two alcoholic drinks per day explains 6% (5% to 8%). The two factors combined explain 16% (13% to 19%): 26% (21% to 30%) in men and 8% (4% to 11%) in women. Corresponding estimates using the traditional PAF method were 20%, 31% and 10%. Our PAF estimates translate to 74 000 avoidable body fatness-related cancers and 40 000 avoidable tobacco- and alcohol-related cancers in Australia over the next 10 years (2017–2026). Traditional PAF methods not accounting for competing risk of death and interdependence of risk factors may overestimate PAFs and avoidable cancers.

Future plans We will rank the most important causal factors and their combinations for a spectrum of cancers and inform cancer control activities.

  • cohort
  • pooling
  • burden of disease
  • cancer
  • population attributable fraction
  • modifiable risk factors

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:

Strengths and limitations of this study

  • A large, population-based, pooled prospective cohort with broad demographic and geographical coverage and individual participant data.

  • Risk factor exposure prevalence estimates obtained from representative contemporary data sources to enhance the accuracy of population attributable fraction (PAF) estimates.

  • The first cancer-PAF estimates from large-scale cohort study data that account for competing risk of death.

  • Cancer-PAF estimates for the simultaneous effects of multiple risk factors, thereby accounting for their interdependence.

  • CIs computed to show uncertainty in PAF estimates and differences between population subgroups.

  • Estimates of the future numbers of cancers in Australia preventable by adherence to current recommendations for a healthy lifestyle.

  • Evidence to underpin future evaluations of potential public health policies and interventions designed to reduce the cancer burden.

  • Further improvements in the PAF methods are needed to incorporate the time the risk factor modification takes to be realised and the uncertainty in the exposure prevalence estimates.

  • Larger cohort populations are needed to provide reliable data on some of the rarer cancers, cancer subtypes, risk factor combinations and population subgroups.


Cancer is the leading cause of disease burden and death in Australia.1 2 One of the principal strategies for reducing this burden is to target the key preventable causal factors, focusing activities where the association is strong and the exposure is common, and considering the combination of these two factors both overall and in population subgroups. The population attributable fraction (PAF), a measure of disease burden, can be used to estimate the proportion of cancers that could be prevented if exposure to their risk factors were removed or reduced.3 4 PAF accounts for both the strength of the exposure-cancer association and the exposure prevalence in the population of interest. PAFs are increasingly used to evaluate the national, regional and global burden of cancer and to advocate for changes in public health policy and activity settings to reduce the prevalence of causal risk factors.5 However, limitations in both the available data and the methods used restrict the accuracy of the PAF estimates and the scope of the conclusions.

Most prior cancer-PAF studies have relied on published exposure-cancer associations. As risk factor interaction and population subgroup analyses are rarely available, overall PAF estimates for individual risk factors dominate the literature.6 Even where estimates are available, differences in the measurement and categorisation of risk factors and modelling approaches may limit their comparability. Most studies that have estimated PAFs for combined effects of risk factors have assumed independence between carcinogenic exposures.5 7 8 Yet, in reality, modifiable lifestyle-related risk factors can interact to cause cancer and their effect may be higher for certain subgroups.9 Moreover, these risk factors tend to co-occur or cluster, further adding to the burden of both cancer and death, and the effect of modifying one risk factor may be mediated by changes in other risk factors.9–11

PAFs are best estimated from cohort studies in which the risk factor exposure measurement precedes the cancer incidence.12 Cohort studies also allow ascertainment of multiple outcomes related to an exposure and thus permit analyses to account for potential competing risks, such as death, which can alter PAF results.13 To our knowledge, no previous cancer-PAF study has accounted for competing risk of death. This can be critical for cancers where established risk factors also predict risk of death from other causes and risk factor modifications will thus affect both outcomes. In addition, most previous cancer-PAF cohort studies have estimated exposure prevalence from the cohort population even when it has not been sampled to be representative of the target population of interest.14 This hinders both generalisation and comparison of the findings. Finally, CIs for PAF estimates are often not provided, precluding an evaluation of their precision and also differences between population subgroups.

We addressed these deficiencies by applying our method13 15 for estimating PAF and its CI for cancer incidence, allowing analysis of the simultaneous effects of multiple factors and accounting for competing risk of death, to an Australian cohort consortium and representative external exposure prevalence data.

Cohort consortium description

Australian cancer-PAF cohort consortium

The eligibility criteria for inclusion in the consortium were: well-established population-based Australian prospective cohort studies with comprehensive information on modifiable lifestyle-related exposures at baseline. Seven cohort studies met these criteria: Melbourne Collaborative Cohort Study (MCCS),16 Blue Mountains Eye Study (BMES),17 Australian Longitudinal Study on Women’s Health (ALSWH),18 Australian Diabetes, Obesity and Lifestyle Study (AusDiab),19 North West Adelaide Health Study (NWAHS),20 Concord Health and Ageing in Men Project (CHAMP)21 and 45 and Up Study (45&Up).22 Together they formed a study sample of 369 515 adult Australians of different ages covering the adult lifespan (table 1). Pooling of the cohorts identified 2457 people enrolled in more than one cohort, leaving a final population of 367 058 individuals, 365 173 with consent for record linkage.

Table 1

Characteristics of the data sources used in the Australian cancer-PAF cohort consortium

The cohorts recruited participants between 1990 and 2009 (table 1). Only one cohort, AusDiab with recruitment from 1999, was designed to include a sample representative of the Australian population. Therefore, we used the latest representative external data sources to obtain contemporary age- and sex-specific risk factor prevalence estimates. These sources included the National Health Surveys (NHS) conducted 2014–2015 (NHS3),23 2004–2005 (NHS2)24 and 2001 (NHS1),25 the National Drug Strategy Household Survey (NDSHS) conducted in 201326 and the Learning how Australians Deal with menopausal sYmptoms (LADY) Survey conducted in 201327 (tables 1, 2 and 3), for which de-identified unit record data were available to generate the required exposure prevalences.

Table 2

List of main harmonised modifiable baseline risk factors for cohort studies and external prevalence data sources

Table 3

List of main harmonised non-modifiable baseline risk factors for cohort studies and external prevalence data sources

Data collection and harmonisation

All cohort studies collected baseline information on demographic, medical, lifestyle-related and hormonal exposures through self-completed questionnaires and some also through interviews and medical examinations (MCCS, BMES, AusDiab, NWAHS and CHAMP). We harmonised all available information on the relevant exposures across the cohort studies and the external data sources to the greatest extent possible (tables 2 and 3).

The modifiable exposures examined were regular smoking, alcohol consumption, body fatness (BMI ≥25 kg/m2), physical activity, fruit consumption, vegetable consumption, red and processed meat consumption, oral contraceptive (OC) use, menopausal hormone therapy (MHT) use and breastfeeding. We classified the lifestyle exposures to match current Australian recommendations for healthy living, that is, not smoking, drinking no more than two standard alcoholic drinks per day (ie, 20 g of alcohol per day), maintaining healthy weight (BMI 18.5–24.9 kg/m2), doing at least 150 min of moderate or 75 min of vigorous physical activity per week, eating at least two serves (ie, 300 g) of fruits and five serves (ie, 375 g) of vegetables per day, and not eating more than two serves (130 g) of either red or processed meat 3–4 times a week (table 2).
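As a minimal sketch of how such a classification could be coded, the following functions apply the thresholds listed above to individual records. This is not the consortium's harmonisation code: the function names are hypothetical, and the overweight/obese cut-off of 30 kg/m2 is an assumption taken from the standard WHO BMI categories rather than from the text.

```python
def classify_bmi(bmi_kg_m2: float) -> str:
    """Return a body-fatness category for a BMI value in kg/m2."""
    if bmi_kg_m2 < 18.5:
        return "underweight"
    if bmi_kg_m2 < 25.0:
        return "healthy"  # 18.5-24.9 kg/m2, the recommended range
    if bmi_kg_m2 < 30.0:  # assumed WHO cut-off, not stated in the text
        return "overweight"
    return "obese"


def meets_alcohol_guideline(grams_per_day: float) -> bool:
    """True if intake is at most two standard drinks (20 g alcohol) per day."""
    return grams_per_day <= 20.0


def meets_activity_guideline(moderate_min_week: float,
                             vigorous_min_week: float) -> bool:
    """True if weekly activity reaches 150 min moderate or 75 min vigorous."""
    return moderate_min_week >= 150.0 or vigorous_min_week >= 75.0
```

Exposed groups in a PAF analysis would then be those failing a guideline, for example participants classified as overweight or obese.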

We also harmonised non-modifiable exposures such as age, gender, height, country of birth, marital status, education, socioeconomic status, urban–rural status, health insurance, reproductive history and personal and family medical history to allow population subgroup analyses and assessment of potential confounding factors (table 3).

Data linkage

We linked the pooled cohort to the population-based Australian Cancer Database (ACD) and National Death Index (NDI) to identify cancers and deaths. The ACD is a database of all primary, malignant cancers, except keratinocyte cancers, notified to State and Territory Cancer Registries in Australia since 1982. The NDI records all deaths registered in Australia since 1980. Both the ACD and NDI are maintained by the Australian Institute of Health and Welfare, which facilitates research by conducting record linkage using an established probabilistic linkage algorithm.28

In October 2016, the ACD and NDI records were available until the end of 2012, providing 8–22 years follow-up depending on the individual cohort (table 1).

Data analysis and statistical methods

We classified cancers on the basis of the International Classification of Diseases for Oncology codes. Only invasive cancers identified by data linkage were included, and people with a cancer registration prior to baseline were excluded from the analysis for that malignancy.

We defined follow-up as the time from baseline to the date of diagnosis of the cancer of interest, death or end of follow-up, whichever occurred first. The survival times were assumed to follow a parametric proportional hazards model with piecewise constant baseline hazard function.29 Maximum likelihood estimation with iterative methods was used to obtain the parameter estimates and their estimated covariance matrices.13 We expressed the strength of exposure-cancer and exposure-death associations adjusted for baseline age, sex and study as HRs and their 95% CIs. We computed the corresponding age- and sex-specific exposure prevalence estimates from the most contemporary representative external data source. Participants with missing data for the variables included in the model were excluded from the analyses. We then combined the maximum likelihood estimates and the exposure prevalence estimates to calculate the PAF point estimates using our recently developed PAF formula13 accounting for competing risk of death. The asymptotic variance estimate of PAF was obtained using the delta method, and two-sided 95% CIs for the PAFs were calculated by applying a symmetrising complementary logarithmic transformation of PAF.13 Our PAF method13 and program15 allow a flexible choice of the reference level for the hypothetical risk factor modification and simultaneous analysis of the effects of multiple risk factors. Both individual PAFs for modification of single risk factors and joint PAFs for modification of several risk factors can be calculated.
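The effect of a competing risk on PAF can be illustrated with a deliberately simplified sketch. It assumes a single binary exposure and constant cancer and death hazards (one piece of a piecewise constant model), so it is neither the published PAF formula13 nor the SAS program15. All function names and input values are hypothetical. The CI function assumes that the complementary logarithmic transformation is log(1 − PAF) and takes a standard error that in practice would come from the delta method.

```python
import math


def cumulative_incidence(cancer_hazard: float, death_hazard: float,
                         years: float) -> float:
    """Probability of cancer within `years` when death is a competing event.

    Assumes both hazards are constant over the interval.
    """
    total = cancer_hazard + death_hazard
    return cancer_hazard / total * (1.0 - math.exp(-total * years))


def paf_competing_risk(prevalence: float, base_cancer: float,
                       base_death: float, hr_cancer: float,
                       hr_death: float, years: float) -> float:
    """PAF for removing a binary exposure, accounting for death."""
    risk_unexposed = cumulative_incidence(base_cancer, base_death, years)
    risk_exposed = cumulative_incidence(base_cancer * hr_cancer,
                                        base_death * hr_death, years)
    # Population risk with the current exposure distribution
    observed = (1.0 - prevalence) * risk_unexposed + prevalence * risk_exposed
    # Counterfactual: everyone carries the risk of the unexposed
    return 1.0 - risk_unexposed / observed


def paf_confidence_interval(paf: float, se: float,
                            z: float = 1.959964) -> tuple:
    """Two-sided CI built on the log(1 - PAF) scale and back-transformed."""
    # Delta method: SE of log(1 - PAF) is approximately se / (1 - PAF)
    half_width = z * se / (1.0 - paf)
    lower = 1.0 - (1.0 - paf) * math.exp(half_width)
    upper = 1.0 - (1.0 - paf) * math.exp(-half_width)
    return lower, upper
```

In this sketch, raising `hr_death` above 1 while holding the other inputs fixed lowers the PAF, because exposed people who would otherwise have died earlier remain at risk of cancer for longer once the exposure is removed.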

To illustrate the novel cancer-PAF estimation, we estimated the fractions of cancers causally related to (1) body fatness and (2) both tobacco and alcohol consumption attributable to these risk factors in Australia over the next 10 years. We restricted these analyses to the first 10-year follow-up to generate comparable estimates across the cohorts. We included only those cancers judged by the International Agency for Research on Cancer to be causally associated with body fatness (oesophageal adenocarcinoma, stomach, colorectal, liver, gallbladder, pancreas, postmenopausal breast, corpus uteri, ovary, renal-cell carcinoma, meningioma, thyroid and multiple myeloma) or with both tobacco and alcohol consumption (tongue, mouth, oropharynx, hypopharynx, other pharynx (excluding nasopharynx), oesophagus, colorectal, liver and larynx).30 31 It should be noted that cancers of the lung and breast are not included here as they are not causally related to both tobacco and alcohol consumption. We estimated the individual contribution of body fatness, tobacco and alcohol consumption and the combined contribution of tobacco and alcohol consumption on the burden of the respective cancers. For body fatness, we evaluated scenarios in which (1) those currently obese or overweight had healthy weight and (2) those currently obese were overweight. For smoking, we evaluated scenarios in which (1) current and former smokers had never smoked and (2) current smokers were to quit and become former smokers. For alcohol consumption, we evaluated the scenario in which no-one drank more than two alcoholic drinks per day. We also evaluated potential effect modification of the contribution of smoking by alcohol consumption and all three risk factors by sex. We estimated the numbers of these cancers that could be avoided in Australia under these scenarios by multiplying the PAF estimates by the projected numbers of cancers over the next 10 years (2017–2026).32 33

To demonstrate the potential impact of our methodology on PAF estimates, we compared our results with PAF estimates produced by traditional methods.3 4 These methods were adapted to cohort studies with survival data by replacing the relative risks (RRs) in the original formulas with HRs from survival models.34 They do not account for competing risk of death, and they take a sequential approach to estimating the combined effect of multiple risk factors, assuming their independence.7 14
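For orientation, the traditional calculation can be sketched as follows, assuming it takes the familiar Levin form for a binary exposure and that the sequential combination multiplies the fractions of cases left after each modification. These are assumptions about the general approach, not a reproduction of the cited implementations, and the function names are hypothetical.

```python
def levin_paf(prevalence: float, relative_risk: float) -> float:
    """Levin-type PAF for a binary exposure.

    In the cohort adaptation described above, an HR would be
    supplied in place of the relative risk.
    """
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


def combined_paf_independent(pafs: list) -> float:
    """Combine individual PAFs sequentially, assuming independent effects."""
    remaining = 1.0
    for paf in pafs:
        # Each risk factor removes its fraction of the cases still remaining
        remaining *= 1.0 - paf
    return 1.0 - remaining
```

Neither function has any input for the exposure-death association, which is why this approach cannot reflect a competing risk of death.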

We carried out all statistical analyses using SAS V.9.4 and a publicly available PAF program based on SAS macros.15

Findings to date

Harmonisation and prevalence of lifestyle-related risk factors

Smoking, alcohol consumption and body fatness could be harmonised for all cohorts, while physical activity, fruit and vegetable consumption, red and processed meat consumption, OC and MHT use and breastfeeding were either not collected at baseline or could not be harmonised for some cohorts (table 2).

The participant age and sex distribution varied across the cohorts (table 1) as did the crude risk factor exposure prevalences (table 4), even in cohorts recruited around the same time. The sex- and age-stratified exposure prevalence estimates were more comparable but generally lower for the cohort studies than the representative external data sources from around the same time (see online supplementary table 1).

Table 4

Crude prevalence (%) of main lifestyle-related risk factors for cancer at baseline for cohort studies and external prevalence data sources

Exposure prevalence estimates from the representative external sources (NHS 2001, 2004 and 2014–2015; table 4) showed different temporal trends depending on the specific risk factor. The overall prevalence (men and women) of current smoking decreased over time (22%, 21% and 15%, respectively). The overall prevalences of consuming more than two alcoholic drinks a day (19%, 22% and 17%) and of inadequate fruit consumption (48%, 46% and 50%) were relatively stable over time. The prevalence of body fatness (50%, 54% and 63%), physical inactivity (52%, 61% and 74%) and inadequate vegetable consumption (70%, 86% and 91%) increased over time. Currently, the prevalence of many modifiable lifestyle-related risk factors exceeds 50% among Australians and is generally higher in men than in women (table 4).

The variable cohort age and sex distribution and the temporal trends in exposure prevalences demonstrate the need to use representative and most recent prevalence estimates for reliable PAF calculations. Red and processed meat consumption were the only exposures that could not be obtained from a representative external source (tables 2 and 3); we obtained these prevalence estimates from the largest and latest cohort (The 45 and Up Study) and will perform sensitivity analyses to assess the impact of the uncertainty in this measure.

Cancer and death cases in the pooled cohort

During the maximum 22-year follow-up of the pooled cohort (n=365 173; table 1) with mean age 59 years and 59% women, 35 860 incident cancers and 32 107 deaths were observed (see online supplementary table 2). The distribution of the cancers in the pooled cohort is similar to that for the Australian population.2 During the first 10-year follow-up, we observed 27 483 cancers and 22 078 deaths. There were 9258 participants with a first primary cancer causally related to body fatness and 4283 participants with a first primary cancer causally related to both tobacco and alcohol (see online supplementary table 3). No significant heterogeneity between the cohort-specific HRs for cancers causally related to body fatness or both tobacco and alcohol was found (see online supplementary table 4).

Avoidable cancers causally related to body fatness, and both tobacco and alcohol

Individual and combined contributions of risk factors

According to our estimations, overweight and obesity (table 5) explain 13% (95% CI 11% to 16%) of the 10-year burden of cancers causally associated with body fatness (table 6). If those currently obese were overweight, 5% (95% CI 3% to 7%) of the burden could be avoided. For cancers causally related to both tobacco and alcohol consumption, 13% (95% CI 11% to 16%) is attributable to smoking and could be avoided if current and former smokers had never smoked (table 6). If current smokers were to quit, 3% (95% CI 1% to 4%) of the burden could be avoided. Drinking more than two alcoholic drinks per day explains 6% (95% CI 5% to 8%) of the burden. Excessive alcohol consumption combined with ever smoking explains 16% (95% CI 13% to 19%) and combined with current smoking 8% (95% CI 5% to 10%) of the burden of these cancers over the next 10 years. Current smokers who also consume more than two alcoholic drinks per day are at a significantly higher risk of these cancers compared with smokers whose alcohol consumption does not exceed two daily drinks (HR 2.09 vs 1.41; table 5) and would benefit much more from quitting smoking (PAF 9% (95% CI 4% to 15%) versus 1% (95% CI 0% to 3%)).

Table 5

Exposure prevalence and HRs for cancers causally related to body fatness or both tobacco and alcohol over 10-year follow-up

Table 6

Fractions of cancers causally related to body fatness or both tobacco and alcohol attributable to the respective risk factors over 10 years of follow-up

Contributions by sex

The contribution of body fatness to cancers causally related to this risk factor was 17% (95% CI 11% to 22%) for men and 12% (95% CI 9% to 15%) for women. The contribution of both smoking and excessive alcohol consumption to the burden of cancers causally related to both these exposures was even more pronounced for men (table 6). The PAFs for current and former smoking were 22% (95% CI 17% to 26%) for men versus 7% (95% CI 3% to 10%) for women and for consuming more than two alcoholic drinks per day 9% (95% CI 6% to 12%) versus 2% (95% CI 0% to 4%). Modifications to both of these risk factors could reduce the burden by 26% (95% CI 21% to 30%) for men and 8% (95% CI 4% to 11%) for women.

Comparison of novel and traditional PAF methods

Body fatness and alcohol consumption were weakly associated with death from causes other than the cancers of interest, whereas smoking was a moderate risk factor for mortality during the 10-year follow-up (HR 1.36 for former smokers and 2.23 for current smokers). Accordingly, the PAF estimates based on the novel and traditional methods differed most for smoking, especially for men (table 6). The differences between the two methods were larger when the combined effect of modifying both smoking and alcohol consumption was analysed. The point estimates given by the traditional method no longer fitted within the CIs produced by the novel method, both for the overall population (20% vs 16% (95% CI 13% to 19%)) and for men (31% vs 26% (95% CI 21% to 30%)); however, the lack of CIs around the traditional estimates makes direct comparison difficult.

Projected avoidable numbers of cancers

Based on the projected Australian cancer incidence rates over the next 10 years, around 840 000 people will be diagnosed with cancer, of which over 570 000 are cancers causally related to body fatness and 250 000 cancers causally related to both tobacco and alcohol consumption. Of these, according to our PAF estimates, 74 000 cases can be attributed to body fatness, 32 000 to smoking, 15 000 to consuming more than two alcoholic drinks per day and 40 000 to the latter two exposures combined. The number of cancers preventable through avoiding both smoking and excessive alcohol consumption is overestimated by 10 000 if the competing risk of death is not considered and these two exposures are assumed to act independently.
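The arithmetic behind these figures can be sketched by multiplying a PAF by the projected number of cancers. The inputs below are the rounded values quoted in the text, so the results only approximate the published counts, which were presumably derived from unrounded PAFs and projections. The variable and function names are hypothetical.

```python
# Approximate projected cancers in Australia, 2017-2026 (rounded, from the text)
PROJECTED_BODY_FATNESS = 570_000
PROJECTED_TOBACCO_ALCOHOL = 250_000


def avoidable_cases(paf: float, projected_cases: int) -> float:
    """Number of projected cancers attributable to the modified exposure."""
    return paf * projected_cases


body_fatness = avoidable_cases(0.13, PROJECTED_BODY_FATNESS)
combined_novel = avoidable_cases(0.16, PROJECTED_TOBACCO_ALCOHOL)
combined_traditional = avoidable_cases(0.20, PROJECTED_TOBACCO_ALCOHOL)

# Difference between the traditional and novel combined estimates
overestimate = combined_traditional - combined_novel
```

With these rounded inputs, a four percentage point difference in the combined PAF (20% vs 16%) corresponds to about 10 000 cancers, which matches the overestimate reported above.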

Strengths and limitations


The large cohort and advanced PAF methodology enable analysis of both the individual and joint contribution of risk factors to the burden of cancer, both overall and in subgroups. In the next stage of this research, we will identify and rank the most harmful cancer risk factors and their combinations for specific cancers and evaluate the distribution of their burden. We will also evaluate the contribution of different risk factors across all cancers. We will use this epidemiological evidence to inform future health promotion and other cancer control activities.

Access to individual participant cohort data allowed us to harmonise risk factors, potential confounding factors and effect modifiers. This is expected to increase the comparability and accuracy of the PAF estimates. As recommended,35 we documented our rigorous guidelines for data harmonisation to enable reproducibility and use in subsequent pooling efforts. We aligned our exposure classifications with the Australian recommendations for maintaining a healthy lifestyle, allowing consistency of risk communication.

Utilising corresponding risk factor exposure prevalence estimates from representative data sources also increased the accuracy of our PAF estimates. Cohort studies may have variable age and sex distributions, and they may underestimate the exposure prevalence, likely due to a ‘healthy participant’ bias,36 reinforcing the need to use representative age- and sex-specific exposure prevalence data in PAF calculations.

We provide the first Australian estimates on the potential future burden of cancer avoidable through modification of current harmful exposures, using the latest available exposure prevalence estimates. Exposure to lifestyle-related risk factors is highly prevalent and largely increasing in Australia23 and internationally,5 37 and thus lifestyle modifications can have a large impact on the cancer burden. The exception to these trends is current smoking, as Australia is a world leader in smoking control, and prevalence rates are low and continuing to fall.23 26 38 As we used the latest exposure prevalence data and evidence on cancers causally associated with specific exposures, recently updated for body fatness,31 our PAF estimates are not directly comparable with previous Australian estimates.8 These estimates were also based on published, mostly international, exposure-cancer associations, whereas we use harmonised Australian cohort data.

Accounting for competing risk of death is likely to have further increased the accuracy of our PAF estimates as ignoring competing risk of death can overestimate the fraction of cancers preventable by risk factor modification.13 That is, if cancer and death share the same risk factors, reduction of these risk factors is likely to reduce the risk of cancer and the risk of death, and people living longer have increased opportunity to develop cancer. The bias is higher the more strongly the cancer risk factor is associated with death, the more risk factors are evaluated simultaneously and the longer the follow-up.13 Our PAF method also produced CIs for the PAF estimates, allowing an evaluation of their precision and statistical comparison of subgroup estimates.

Our PAF method also allows a flexible choice of the reference level for a risk factor modification (eg, reducing the risk of current smokers to the level of former smokers) and analysis of the simultaneous effects of multiple risk factors. We showed that the combined contribution of two exposures on the cancer burden was overestimated if their effects on mortality were not accounted for and were assumed to be independent. We found that even a relatively small difference in PAF estimates can translate into a large difference in the number of preventable cancers predicted. Compared with the traditional method, our PAF estimates are thus more likely to reflect the real-world impact of modifying one or more risk factors10 and to better inform future cancer control activities.


Some risk factors were not collected by all studies, not available in the baseline data or the information available was too different for harmonisation, and as a result these studies could not be included in all analyses, reducing the statistical power. Additionally, some risk factors varied in how well they could be harmonised due to different question formulations and definitions (eg, ‘daily’ vs ‘regular’ smoking) or measurement methods (eg, self-reported vs measured BMI). Measurement error, both within and between studies, would generally lead to underestimation of the respective associations and PAF estimates. Additionally, as the exposure prevalence trends over time demonstrate, exposure to risk factors measured at baseline may have changed during follow-up, which would have further contributed to underestimation of the respective associations and PAFs. Some cohort studies performed repeated measurements during follow-up; these measures could be incorporated in future analyses as our PAF method allows the inclusion of time-dependent covariates.

Our illustrative PAF estimates for body fatness-related and tobacco and alcohol-related cancers were adjusted for age, sex and study and are thus subject to residual confounding by other risk factors affecting these associations. In the next stage of the project, we will compute cancer-specific PAF estimates and thoroughly evaluate and adjust for potential confounding factors.

The distribution of cancers in the cohort studies, especially when grouped, may not be the same as in the Australian population, and this may impact the generalisability of the findings. Reassuringly, the rank order of individual cancers in our cohort was similar to that for the Australian population.2

Our risk factor exposure prevalence estimates were obtained from study populations sampled to be representative of all Australians, but these surveys were limited in size and did not achieve 100% response rates, and therefore their representativeness is uncertain. For red and processed meat consumption, an important risk factor for several cancers, no prevalence information from such data sources was available; this further emphasises the importance of reaching a consensus on question formulations and definitions for core risk factors. Furthermore, even though we used the latest available exposure prevalence data, the estimates still lag behind the present situation. Therefore, our PAFs may be either slightly underestimated or overestimated, depending on the current exposure prevalence trends.

We note that probabilistic record linkage will have incurred a low rate of false positive and false negative matches, resulting in slight misclassification of the outcome.39 Also, we were not able to capture loss to follow-up, for example, due to participants leaving Australia. Each of these limitations will likely have resulted in bias towards the null and PAF underestimation.39

Although we provide improved PAF estimates, further improvements are possible. One major assumption in PAF estimation is that risk falls immediately after the hypothetical modification of the exposure of interest. This is unrealistic, and therefore all PAF estimates overestimate the effect of the risk factor modification, or rather how quickly that effect would take place. The extent to which this overestimation is balanced by the various sources of underestimation mentioned above is not known and varies by risk factor. Once reliable evidence on the lag time between an intervention and the reduction in risk is available, it can be incorporated in the estimation of PAF using advanced modelling approaches.34 Furthermore, there is inherent uncertainty in the exposure prevalence estimates that could be incorporated in the PAF estimation, for example, through resampling-based methods such as the bootstrap, which require access to individual-level survey data. Finally, despite the large database available via our consortium, we have insufficient power to provide robust estimates for some of the rarer cancers, cancer subtypes, risk factor interactions and population subgroups. We aim to overcome this limitation by establishing an international cancer-PAF consortium.
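As a rough illustration of how resampling could propagate prevalence uncertainty into a PAF estimate, the following minimal sketch bootstraps individual-level survey records and recomputes a PAF for each resample. It rests on several assumptions that are not part of our method: it uses the simple Levin formula for one binary exposure with a fixed relative risk, it ignores survey weights, competing risk of death and uncertainty in the relative risk itself, and the function names and data are hypothetical.

```python
import random


def levin_paf(prevalence, relative_risk):
    """Levin's formula for one binary exposure (a simplification)."""
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


def bootstrap_paf_interval(exposed_flags, relative_risk,
                           n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for PAF.

    exposed_flags: list of 0/1 values, one per survey respondent
    (hypothetical individual-level data). Only the prevalence is
    resampled; the relative risk is treated as known and fixed.
    """
    rng = random.Random(seed)
    n = len(exposed_flags)
    pafs = []
    for _ in range(n_boot):
        # Resample respondents with replacement and re-estimate prevalence.
        resample = rng.choices(exposed_flags, k=n)
        prevalence = sum(resample) / n
        pafs.append(levin_paf(prevalence, relative_risk))
    pafs.sort()
    lower = pafs[int((alpha / 2) * n_boot)]
    upper = pafs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical survey: 1000 respondents, 200 of them exposed.
survey = [1] * 200 + [0] * 800
point = levin_paf(sum(survey) / len(survey), relative_risk=2.0)
lower, upper = bootstrap_paf_interval(survey, relative_risk=2.0)
print(f"PAF = {point:.3f}, 95% bootstrap interval ({lower:.3f}, {upper:.3f})")
```

A full analysis would resample within the survey design and combine this with the uncertainty in the association estimates, which is why access to individual-level survey data is required.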

Collaborators International cohort studies with risk factor, confounder and effect modifier, cancer and death data and access to representative up-to-date prevalence sources are welcome to participate. We encourage any interested parties to contact the corresponding author.

The authors would like to thank the participating cohort studies and surveys and their participants for the data for this cohort consortium. Specific details of funding and data sources for the 45 and Up Study and the Australian Longitudinal Study on Women’s Health are available at: and The CHAMP study is funded by the National Health and Medical Research Council (ID301916) and the Ageing and Alzheimer’s Institute. We acknowledge the assistance of the Data Linkage Unit at the Australian Institute of Health and Welfare for undertaking the data linkage to the ACD and the NDI.



  • Contributors MA, CMV and MAL were responsible for study concept, analysed and interpreted the data, drafted and revised the manuscript and approved the final version. KC, RM, PH, DJM, EB, GGG, RGC, JEB, AWT, JES, KP, VH, PM, B-AA contributed substantially to drafting the manuscript or revising it critically for important intellectual content and approved the final version.

  • Funding This study was funded by the National Health and Medical Research Council (ID1060991; ID1053642 to MAL; ID1082989 to KC; ID1042717 to EB) and a Cancer Institute New South Wales Fellowship (ID13/ECF/1-07 to MAL). Maria Arriaga was supported by an Australian Postgraduate Award and a Translational Cancer Research Network (TCRN) PhD Scholarship Top-up Award. The funding bodies played no role in the conduct of the study, the writing of the report or the decision to submit the paper for publication.

  • Competing interests None declared.

  • Patient consent All the participating cohort studies have valid ethical approvals from human research ethics committees and have obtained informed consent for participation, collection and use of data for health research from each individual.

  • Ethics approval Our study has been approved by all necessary institutional and jurisdictional ethics committees: University of New South Wales human research ethics committee, Australian Institute of Health and Welfare ethics committee and each state and territory cancer registry human research ethics committee.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement No additional data are available. The data harmonisation guidelines can be obtained from the corresponding author. The SAS macros applied in the estimation of PAF are publicly available (see reference 15). We encourage international cohort studies interested in cohort-specific and pooled PAF analyses to contact the corresponding author.

  • Correction notice This paper has been amended since it was published Online First. Owing to a scripting error, some of the publisher names in the references were replaced with 'BMJ Publishing Group'. This only affected the full text version, not the PDF. We have since corrected these errors and the correct publishers have been inserted into the references.