Does bone mineral density improve the predictive accuracy of fracture risk assessment? A prospective cohort study in Northern Denmark

Objective To evaluate the added predictive accuracy of bone mineral density (BMD) in fracture risk assessment.
Design Prospective cohort study using data collected between 01 January 2010 and 31 December 2012.
Setting North Denmark Osteoporosis Clinic; patients were referred after presenting to the referring doctor with at least one fracture risk factor.
Participants Patients aged 40–90 years who had a BMD T-score recorded at the hip and had not taken osteoporosis-preventing drugs for more than 1 year prior to baseline.
Main outcome measures Incident diagnoses of osteoporotic fractures (hip, spine, forearm, humerus and pelvis) were identified using the National Patient Registry of Denmark during 01 January 2012–01 January 2014. Cox regression was used to develop a fracture model based on predictors in the Fracture Risk Assessment Tool (FRAX®), with and without binary and continuous BMD. Change in Harrell’s C-Index and reclassification tables were used to describe the added statistical value of BMD.
Results Adjusting for predictors included in FRAX®, patients with osteoporosis (T-score ≤−2.5) had a 75% higher hazard of fracture compared with patients with higher BMD (HR: 1.75 (95% CI 1.28 to 2.38)). A 40% lower hazard was found per unit increase in continuous BMD T-score (HR: 0.60 (95% CI 0.52 to 0.69)). Accuracy improved marginally, and Harrell’s C-Index increased by 1.2% when adding continuous BMD (0.76 to 0.77). Reclassification tables showed continuous BMD shifted 529 patients into different risk categories; 292 of these were reclassified correctly (57%; 95% CI 55% to 64%). Adding binary BMD, however, showed no improvement: Harrell’s C-Index decreased by 0.6%.
Conclusions Continuous BMD marginally improves fracture risk assessment. Importantly, this was only found when using continuous BMD measurement for osteoporosis. It is suggested that future focus should be on evaluation of this risk factor using routinely collected data and on the development of more clinically relevant methodology to assess the added value of a new risk factor.


Competing Interests
We have read and understood BMJ policy on declaration of interests and declare that we have no competing interests.
All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare: no support from any organisation for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.

Transparency declaration
The lead author (Dr Dhiman) affirms that this manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.

Article Summary
Strengths and Limitations:
• Addresses a research question recommended by The National Institute for Health and Care Excellence to investigate the added value of bone mineral density to fracture risk prediction.
• Investigates bone mineral density in both the commonly used binary format and the continuous format.
• Uses robustly collected data from Northern Denmark, with 3.2% missing data.
• As the data are from a North Danish population with at least one fracture risk factor, the generalisability of the results is limited.
• Explores replacing current fracture risk factors with bone mineral density, as well as adding it to them.

Osteoporosis causes over 8.9 million fractures worldwide, of which over 4.5 million occur in the USA and Europe, and account for 2.8 million disability adjusted life years (1). Further, 1.2 million disability adjusted life years are accounted for by hip fractures, which are projected to increase to 6 million by 2050 (2).
Given this burden, and the treatment options available for osteoporosis, identifying patients at risk of an osteoporotic fracture is a high priority amongst health policymakers seeking to reduce the risk of future fracture (3). Risk prediction tools have been developed to aid in the identification of patients at risk. For example, the Fracture Risk Assessment Tool (FRAX®) and QFracture® are commonly used to assess fracture risk in patients based on pre-defined risk factors.
Bone mineral density (BMD), a measurement used to aid diagnosis of osteoporosis, has also been identified as a fracture risk factor (4-7). Unlike some other fracture risk factors, treatment options (e.g. bisphosphonate medication) are available that reduce the fracture risk markedly when treatment is initiated based on low BMD.
English national guidelines from the National Institute for Health and Care Excellence (NICE) for fracture risk assessment recommend treatment of osteoporosis to prevent fractures but have not included BMD as a mandatory risk factor for fracture risk prediction tools to incorporate (8). This is partly due to the lack of robust evidence and the limited generalisability of current research, which has focused particularly on postmenopausal women when evaluating the added value of BMD to existing fracture risk factors (5-7).
NICE also recognises this gap in the evidence and has recommended research to assess the added value of BMD as a risk factor in fracture risk assessment (9).
The aim of this study is to assess the value of BMD measurement in addition to the standard fracture risk factors used in the FRAX® risk model, using a robustly collected prospective cohort.

Discrimination measures how well the risk prediction model differentiates between patients who did and did not experience the event during the study. This was quantified by the area under the receiver operating characteristic (ROC) curve (AUC), given by Harrell's C-Index, with higher values indicating better discrimination.
Reclassification tables (17) measure movement between risk categories when a new risk factor is added. The threshold for treatment at 4 years was set at a fracture risk of 8.5%, to be comparable to the treatment threshold of 20% at 10 years. Reclassification was summarised by the total percentage of patients reclassified (correctly and incorrectly) and by the Net Reclassification Index (NRI) (18,19). The NRI gives the net proportion of changes in the right direction; a higher NRI indicates a model that reclassifies better.

Characteristics of the data
The AURORA dataset contained data on 7,912 patients; 1,795 patients were excluded: 440 were not aged between 40 and 90 years at baseline; 156 did not have a recorded T-score value for the total hip at baseline; and 1,199 had been taking anti-osteoporotic drug therapy for more than one year prior to baseline.
The study sample consisted of 6,117 patients, predominantly female (79.6%), with a mean age of 62.9 (SD: 10.9) years. Two-thirds of this sample (n=4,093) was used for the derivation dataset and one-third (n=2,024) was used for the validation dataset. Table 1 presents the baseline characteristics of the study by derivation and validation dataset, and shows little difference between the datasets.
Patients in the derivation dataset experienced 318 (7.8%) osteoporotic fractures during follow up. Of these, 316 fractures were eligible for the analysis (2 patients had a fracture on or prior to baseline and were excluded). Patients contributed 9,352.8 person years of observation, giving a total incidence rate of 337.87 per 10,000 person years (95% CI: 302.60 to 377.25).
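As a worked check of the incidence rate above, the following minimal sketch recomputes the rate per 10,000 person years from the reported 316 fractures and 9,352.8 person years. The confidence interval method is an assumption (a log-scale Wald interval for a Poisson count); the manuscript does not state which method was used, although this approach gives values close to the reported 302.60 to 377.25. The function name `incidence_rate_ci` is illustrative only.

```python
import math


def incidence_rate_ci(events, person_years, per=10_000, z=1.959964):
    """Incidence rate per `per` person-years with a log-scale Wald 95% CI.

    Assumption: the event count is treated as Poisson, so the standard
    error of log(rate) is 1 / sqrt(events).
    """
    rate = events / person_years * per
    factor = math.exp(z / math.sqrt(events))
    return rate, rate / factor, rate * factor


# Figures reported for the derivation dataset.
rate, lower, upper = incidence_rate_ci(events=316, person_years=9352.8)
print(f"{rate:.2f} per 10,000 person years (95% CI {lower:.2f} to {upper:.2f})")
```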
Fractures during follow up were predominantly found in the forearm (27.0%) and hip (17.9%). Higher fracture incidence rates were found in patients classed as osteoporotic, based on their T-score at both the femoral neck (809.73 per 10,000 person years (95% CI: 641.68 to 1021.78)) and spine (L1-L4) (553.59 per 10,000 person years (95% CI: 462.55 to 662.55)) (Supplementary Table 2).

[Table 1, partial row recovered: Hip DXA T-score: -1.13, 1.09, -1.16, 1.08. Footnotes: * out of patients with a fracture; ** proportion out of respective number of females; *** proportion out of respective number of females with menopause.]

Model development
The unadjusted analysis showed a statistically significant association between BMD (continuous and binary) and osteoporotic fracture (p<0.001). Significant associations with fracture were also found with age (p<0.001), previous fracture (p<0.001), BMI (p=0.03), and gender (p=0.05). Further, a time-varying effect was found in patients with a previous fracture; the hazard of a subsequent fracture was highest in the first year of follow up and decreased with each year of follow up (p<0.001).
The adjusted analysis is presented in Table 2. Model 1 showed that, of the standard risk factors, age and previous fracture were significantly associated with fracture; the hazard of fracture increased by 2% per year increase in age (HR=1.02; 95% CI: 1.01 to 1.04) and increased almost 5-fold in patients with a previous fracture (HR=4.88; 95% CI: 3.37 to 7.08).
Non-significant risk factors were also removed (Models 4 and 5). Removing secondary osteoporosis when adding binary BMD (Model 4), and removing secondary osteoporosis, current smoking, and BMI when adding continuous BMD (Model 5), gave similar results with a simpler model.

Model Validation
The 4-year predicted risk of fracture was calculated for all patients in the validation dataset; this was compared to the observed fracture outcome within the 4 year follow up.

Calibration and Discrimination
Calibration improved when adding BMD measurement; particularly when including continuous BMD T-score measurement (Model 3; Supplementary Figure 1).
The largest change in discrimination was found when adding continuous BMD measurement to standard risk factors; Harrell's C-Index increased by 1.15% (Table 3). However, binary BMD measurement, as a measure for osteoporotic patients, was found to reduce Harrell's C-Index by 0.62%.

Reclassification
Reclassification tables showed that risk models with continuous BMD measurement improved classification of patients into their correct risk categories. This was not found when adding binary BMD. Of the 1,960 patients in the validation dataset, 27% (n=529) were reclassified into a different risk category when continuous BMD was included in fracture risk prediction. Two percent (9/529) were reclassified correctly into a higher risk group and 55% (292/529) were reclassified correctly into a lower risk group, indicating that 22% (292/1,342) of patients at high risk in Model 1, which did not account for BMD measurement, were no longer at high risk. The net reclassification improvement when adding continuous BMD to standard risk factors was 0.03; similar results were found when comparing Model 1 with the data-driven models (Table 5).

Summary of Findings
Bone mineral density improved fracture risk prediction. This finding was consistent throughout the analysis; both the unadjusted and adjusted analyses. However, the format of BMD measurement in the fracture risk prediction model affected the results. Calibration, discrimination, and reclassification all improved when adding continuous BMD measurement to standard risk factors. This was not found when adding BMD in a binary format.
Adding BMD to the fracture risk prediction model negated the associations of secondary osteoporosis, current smoking status, and BMI with fracture. Removing these risk factors had minimal impact on model performance.

Strengths and Limitations
Answering Evidence gap
To our knowledge, this is the first study to investigate the added value of BMD, in both a binary and a continuous format, to standard fracture risk factors. It directly informs the NICE research recommendation to assess the added value of BMD to routine fracture risk assessment in primary care (21). It further highlights that the more commonly used binary format of BMD resulted in a loss of predictability in fracture risk prediction, based on comparable measures for discrimination and reclassification.

Robustness of Data
The prospective cohort was well populated with key standard risk factors recorded: BMI, smoking status and alcohol consumption, and personal and parental fracture history. Other than 3.2% of missing data for BMI, in 6,117 patients, complete data was collected for all risk factors (including BMD T-score recorded at the total hip). Further, the cohort was linked to a robust national electronic health record. This Danish National Patient Registry allowed the fracture outcome to be identified and also provided data on the mechanism of the fracture; this helped to phenotype osteoporotic fractures more accurately.

Generalisability
The generalisability is affected in two ways. Firstly, the findings are based on a Danish cohort. Secondly, AURORA data was collected from patients who presented to their doctor with at least one fracture risk factor and were referred to the osteoporosis clinic; this led to a biased study sample with a higher risk of a fracture and increased age. This could overestimate fracture risk amongst patients in a primary care setting.

Methodology
As well as assessing the added value of BMD to standard risk factors, we have also explored the option of replacing existing fracture risk factors with the BMD measurement; this has rarely been explored in the literature but should be considered in future analyses (22,23).
Due to the increased age of the sample, death becomes a competing risk. However, information on death was not collected and could not be retrieved. This limited the analysis of the data, as competing risks could not be accounted for, which may again lead to an overestimation of fracture risk (24). However, as this is an independent study primarily assessing the added value of BMD through deriving and validating fracture risk prediction models, this bias would be present in both sets of models being compared, with and without BMD measurement.
Internal validation was performed to validate the derived risk prediction models. This may lead to over-optimistic estimates of the performance of the risk models (14). To limit this, a commonly practised method which randomly assigns patients to the derivation and validation datasets was used; further, a similar split, with two-thirds of the data used for derivation and one-third for validation, was also used (25)(26)(27).
The study had a 4 year follow up which is shorter than other recognised risk models. To account for this, we adapted the 20% clinical risk threshold for 10 year fracture estimates to 8.5% for 4 year fracture estimates (28, 29).
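To illustrate how a 10-year threshold can be rescaled to a 4-year horizon, the sketch below converts a cumulative risk between horizons. It assumes a constant hazard over time, which is an assumption made here for illustration and not necessarily the method of the cited references (28, 29); under that assumption, 20% at 10 years corresponds to roughly 8.5% at 4 years. The function name `rescale_risk_threshold` is hypothetical.

```python
def rescale_risk_threshold(risk, from_years, to_years):
    """Convert a cumulative risk between time horizons.

    Assumption: constant hazard, so survival at time t is
    (1 - risk) ** (t / from_years).
    """
    return 1.0 - (1.0 - risk) ** (to_years / from_years)


# 20% risk at 10 years expressed as a 4-year risk.
four_year_threshold = rescale_risk_threshold(0.20, from_years=10, to_years=4)
print(f"{four_year_threshold:.1%}")
```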
Traditional methods for assessing the added value of risk factors to existing risk prediction models have been criticised as insensitive to change and lacking in interpretability (30-33), and they do not account for cost implications. Reclassification analysis was used to provide more clinically interpretable results.

Clinical implications
The most notable clinical implication is the more routine use of BMD measurement for fracture risk assessment. Further, evidence suggests continuous BMD adds better predictability compared to the binary format.

Future Research
Further research is recommended to evaluate the added value of BMD to fracture risk prediction; in particular using primary care routinely collected data. However, a brief interrogation into the Clinical Practice Research Datalink, a routinely collected UK primary care database, showed poor availability of BMD measurement in patient records, and thus, strong limitations to potential analyses. Less than 1% of patients had BMD recorded from a sample of 60,658 patients aged 40-90; not on any osteoporotic treatment; and with complete data for age, gender, BMI, smoking status, and alcohol consumption. Thus, prior to UK analysis, BMD recording in primary care databases needs to improve.
In addition, further research is recommended to develop current methodology used to assess the added value of BMD to provide more clinically relevant results, such as cost implications; and to allow for better comparability between new risk factors with respect to their added value, thus improving decision making.

Supplementary Information
Beta coefficients from each Cox regression model were used to create each fracture risk prediction model. Once all 5 models were finalised, their beta coefficients were used to create 5 risk prediction models and calculate risk of fracture for each patient, using the following general equation:

Copyright
The Corresponding Author has the right to grant on behalf of all authors and does grant on behalf of all authors, a worldwide license to the Publishers and its licensees in perpetuity, in all forms, formats and media (whether known now or created in the future), to i) publish, reproduce, distribute, display and store the Contribution, ii) translate the Contribution into other languages, create adaptations, reprints, include within collections and create summaries, extracts and/or, abstracts of the Contribution, iii) create any other derivative work(s) based on the Contribution, iv) to exploit all subsidiary rights in the Contribution, v) the inclusion of electronic links from the Contribution to third party material where-ever it may be located; and, vi) license any third party to do any or all of the above.

Exclusive License Statement
The Corresponding Author has the right to grant on behalf of all authors and does grant on behalf of all authors, an exclusive licence (or non exclusive for government employees) on a worldwide basis to the BMJ Publishing Group Ltd to permit this article (if accepted) to be published in BMJ editions and any other BMJPGL products and sublicenses such use and exploit all subsidiary rights, as set out in our licence.

Ethical Approval
Ethics approval was given through the Region of North Jutland, from the Danish Data Protection Agency ("paraplyanmeldelse 2008-58-0028").

Funding
This study was funded through the National Institute for Health Research (NIHR), School for Primary Care Research (SPCR).

NIHR acknowledgement
This paper presents independent research funded by the National Institute for Health Research School for Primary Care Research (NIHR SPCR).

Disclaimer
The views expressed are those of the author(s) and not necessarily those of the NIHR, the NHS or the Department of Health.

Patient Involvement
Patients were not involved in the development of this research question.

Data Sharing Statement
Data sharing: Technical appendix and statistical code are available from the corresponding author at paula.dhiman@nottingham.ac.uk. Due to restrictions by the Danish Data Protection Agency, data can only be shared on an aggregated level and by special permission.

Contributorship
Contributors: PD wrote the statistical analysis plan, cleaned and analysed the data, and drafted and revised the paper. PV and SA provided the AURORA dataset for analysis and linked patients to the National Patient Registry of Denmark; they also reviewed and revised the draft paper. NQ and TM provided clinical expertise, and reviewed and revised the draft paper.

Conclusions: Continuous bone mineral density marginally improves fracture risk assessment. Importantly, this was only found when using continuous BMD measurement for osteoporosis. It is suggested that future focus should be on evaluation of this risk factor using routinely collected data, and on the development of more clinically relevant methodology to assess the added value of a new risk factor.

Article Summary
Strengths and Limitations:
• Addresses a research question recommended by The National Institute for Health and Care Excellence to investigate the added value of bone mineral density to fracture risk prediction.
• Investigates bone mineral density in both the commonly used binary format and the continuous format.
• Presents changes in calibration, discrimination, and reclassification to describe the added value of bone mineral density.
• Uses robustly collected data from Northern Denmark, with 3.2% missing data.
• As the data are from a North Danish population with at least one fracture risk factor, the generalisability of the results is limited.

Given this burden, and the treatment options available for osteoporosis, identifying patients at risk of an osteoporotic fracture is a high priority amongst health policymakers seeking to reduce the risk of future fracture (3). Risk prediction tools have been developed to aid in the identification of patients at risk. For example, the Fracture Risk Assessment Tool (FRAX®) and QFracture® are commonly used to assess fracture risk in patients based on pre-defined risk factors.
Bone mineral density (BMD), a measurement used to aid diagnosis of osteoporosis, has also been identified as a fracture risk factor (4-7). Unlike some other fracture risk factors, treatment options (e.g. bisphosphonate medication) are available that reduce the fracture risk markedly when treatment is initiated based on low BMD.
English national guidelines from the National Institute for Health and Care Excellence (NICE) for fracture risk assessment recommend treatment of osteoporosis to prevent fractures but have not included BMD as a mandatory risk factor for fracture risk prediction tools to incorporate (8). This is partly due to the lack of robust evidence and the limited generalisability of current research, which has focused particularly on postmenopausal women when evaluating the added value of BMD to existing fracture risk factors (5-7).
NICE also recognises this gap in the evidence and has recommended research to assess the added value of BMD as a risk factor in fracture risk assessment (9).
The aim of this study is to assess the value of BMD measurement in addition to the standard fracture risk factors used in the FRAX® risk model, using a robustly collected prospective cohort.

Methods
This paper has been written in accordance with the TRIPOD checklist.

Patient Involvement
Patients were not involved in the development of this research question and were not involved in the design of this study.

Study Design and Data Source
A prospective cohort study was conducted using patients from the Aalborg University Hospital Record for Osteoporosis Risk Assessment (AURORA) dataset; patients were followed up using the National Patient Registry of Denmark.
The AURORA dataset consists of patients attending the Osteoporosis Clinic at Aalborg University Hospital after a referral from their primary care physician. A referral was offered to patients with at least one risk factor for osteoporosis (low BMI, previous fracture, parental hip fracture, smoking status, alcohol consumption, glucocorticoid use, rheumatoid arthritis, and secondary osteoporosis) or if they were aged 80 years and above. Further detail of the data collection has been described elsewhere (10). The Danish National Patient Registry, which collects inpatient and outpatient data from all Danish hospitals, was linked to the AURORA dataset through unique patient identifiers.
Ethics approval was given through the Region of North Jutland, from the Danish Data Protection Agency ("paraplyanmeldelse 2008-58-0028").

Cohort selection
Data collection for AURORA began on 1st January 2010 and continued for 3 years (up to 31st December 2012). Patients were included if they were aged 40-90 years; had a BMD T-score at the hip; and had not been taking any osteoporosis-preventing or bone-sparing drugs for more than one year prior to baseline.

Primary Outcome
The primary outcome measure was an incident osteoporotic fracture during follow up (01/01/2012 to 01/01/2014), defined as a diagnosis of a fracture at the hip, spine, forearm, humerus, or pelvis. Fractures at these sites resulting from traffic, work, and sports related accidents were excluded from the study. Relevant fractures were identified in the Danish National Patient Registry using International Statistical Classification of Diseases, 10th Revision codes (ICD-10 codes); the code list for each fracture was developed using recognised database methodology (11).

Fracture risk factors
Fracture risk factors used in the FRAX® risk prediction model were extracted at baseline. They were: age; gender; height (m); weight (kg); previous fracture; parental history of hip fracture; current smoking status; current alcohol consumption; glucocorticoid use (currently exposed for 3+ months); rheumatoid arthritis; and secondary osteoporosis (includes type I diabetes; osteogenesis imperfecta in adults; untreated, long standing hyperthyroidism;

Bone Mineral Density
DXA scans were performed by trained technicians using Hologic Discovery A (Bedford, MA, USA). A daily QC programme was in place and in vivo CV using repositioning of patients was <1%. Total hip BMD was used as the region of interest. Bone mineral density was added to the fracture risk prediction model in two ways: firstly, as a continuously measured T-score value, and secondly, as a binary risk factor, dichotomised at the T-score threshold for osteoporosis of -2.5 (at or below the threshold versus above it; manufacturers' normal range using normal material from T Kelly et al (12)), based on World Health Organisation (WHO) classifications (13). Calculated T-scores were gender specific.

Statistical Analysis
A complete case analysis was performed on the data; 3.2% of data was missing. The AURORA dataset was split into two using recognised methodology (14): a random number was assigned to each patient and, based on a cut-off, two-thirds of patients were used to derive the risk models and the remaining third was used to validate them.
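The random split described above can be sketched as follows. This is a minimal illustration only: the seed, the use of Python's `random` module, and the function name `split_derivation_validation` are assumptions, and the study's own statistical code may differ. Because each patient is assigned independently, the resulting proportions are approximately, not exactly, two-thirds and one-third.

```python
import random


def split_derivation_validation(patient_ids, derivation_fraction=2 / 3, seed=2010):
    """Assign each patient a uniform random number and split on a cut-off."""
    rng = random.Random(seed)  # fixed seed (hypothetical) keeps the split reproducible
    derivation, validation = [], []
    for pid in patient_ids:
        # Random numbers below the cut-off go to derivation, the rest to validation.
        if rng.random() < derivation_fraction:
            derivation.append(pid)
        else:
            validation.append(pid)
    return derivation, validation


# Illustrative patient identifiers for a sample of 6,117 patients.
derivation, validation = split_derivation_validation(range(6117))
print(len(derivation), len(validation))
```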

Model derivation
Three Cox proportional hazards models were developed for the primary outcome, using a complete case analysis on the derivation dataset: one with the standard risk factors only (Model 1), one adding binary BMD, and one adding continuous BMD (Model 3). Graphical methods (log-log plots) were used to assess the proportional hazards assumption, and risk factors violating this assumption were added to the model as time-varying covariates.
Established methodology was used to build the 3 risk prediction models (15,16); the Kaplan-Meier method was used to obtain 4-year fracture risk estimates for patients. Further detail on the conversion of the Cox proportional hazards models to risk prediction models has been provided in Supplementary Table 1.
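As a hedged illustration of how Cox model coefficients are commonly turned into an absolute risk, the sketch below uses the standard form 1 - S0(t)^exp(linear predictor), with covariates centred on their means. This is an assumption about the general approach, not the equation given in Supplementary Table 1. The hazard ratios for age and previous fracture are those reported for Model 1; the baseline 4-year survival of 0.95 and the reference value of 0 for previous fracture are hypothetical values chosen for the example, and the time-varying effect of previous fracture is ignored.

```python
import math


def predicted_risk(baseline_survival, betas, covariates, covariate_means):
    """Absolute risk by the horizon: 1 - S0(t) ** exp(linear predictor)."""
    linear_predictor = sum(
        betas[name] * (covariates[name] - covariate_means[name]) for name in betas
    )
    return 1.0 - baseline_survival ** math.exp(linear_predictor)


# Betas are log hazard ratios (HR 1.02 per year of age, HR 4.88 for previous fracture).
betas = {"age": math.log(1.02), "previous_fracture": math.log(4.88)}
means = {"age": 62.9, "previous_fracture": 0.0}  # reference values (partly hypothetical)
s0_4yr = 0.95  # hypothetical baseline 4-year survival

risk_reference = predicted_risk(s0_4yr, betas, {"age": 62.9, "previous_fracture": 0}, means)
risk_previous = predicted_risk(s0_4yr, betas, {"age": 62.9, "previous_fracture": 1}, means)
print(f"{risk_reference:.3f} {risk_previous:.3f}")
```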

Validation of Models
Four-year fracture risk was calculated from each model and the predictive performance of each risk prediction model was assessed by measures describing calibration, discrimination, and reclassification. These metrics were assessed using the validation cohort.
Calibration measures how well the predicted risk agrees with the observed risk in the data. It was assessed by plotting the mean predicted and observed risk of fracture for each decile of predicted risk. The observed risk of fracture was derived from the 4-year Kaplan-Meier estimate. Good calibration indicates the predicted risk is close to the observed risk of the outcome.
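The calibration check described above can be sketched in a few lines. The code below is a simplified illustration, not the study's implementation: it groups patients into tenths of predicted risk and, within each group, compares the mean predicted risk with the observed risk taken as one minus the Kaplan-Meier survival estimate at the horizon. The function names and the handling of group boundaries are assumptions.

```python
def kaplan_meier_risk(times, events, horizon):
    """1 - Kaplan-Meier survival at `horizon` (events: 1 = fracture, 0 = censored)."""
    survival = 1.0
    at_risk = len(times)
    order = sorted(range(len(times)), key=lambda i: times[i])
    i = 0
    while i < len(order) and times[order[i]] <= horizon:
        t = times[order[i]]
        fractures = leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(order) and times[order[i]] == t:
            fractures += events[order[i]]
            leaving += 1
            i += 1
        survival *= 1.0 - fractures / at_risk
        at_risk -= leaving
    return 1.0 - survival


def calibration_by_decile(predicted, times, events, horizon=4.0, groups=10):
    """Return (mean predicted risk, observed KM risk) for each risk group."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    rows = []
    for g in range(groups):
        idx = order[g * len(order) // groups:(g + 1) * len(order) // groups]
        if not idx:
            continue  # fewer patients than groups
        mean_predicted = sum(predicted[i] for i in idx) / len(idx)
        observed = kaplan_meier_risk(
            [times[i] for i in idx], [events[i] for i in idx], horizon
        )
        rows.append((mean_predicted, observed))
    return rows
```

Plotting the first element of each pair against the second gives the calibration plot; points near the diagonal indicate good calibration.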
Discrimination measures how well the risk prediction model differentiates between patients who did and did not experience the event during the study. This was quantified by the area under the receiver operating characteristic (ROC) curve (AUC), given by Harrell's C-Index, with higher values indicating better discrimination.
Reclassification tables (17) measure movement between risk categories when a new risk factor is added. The threshold for treatment at 4 years was set at a fracture risk of 8.5%, to be comparable to the treatment threshold of 20% at 10 years. Reclassification was summarised by the total percentage of patients reclassified (correctly and incorrectly) and by the Net Reclassification Index (NRI) (18,19). The NRI gives the net proportion of changes in the right direction; a higher NRI indicates a model that reclassifies better.
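The discrimination measure described above can be illustrated with a direct, pairwise computation of Harrell's C-Index. This is a minimal sketch for small datasets (it compares every pair of patients) and is not the software routine used in the study; the function name and the convention of counting tied risk scores as half-concordant are assumptions.

```python
def harrell_c_index(times, events, risk_scores):
    """Proportion of comparable pairs in which the earlier event has the higher risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i has an observed event
            # strictly before patient j's event or censoring time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: follow-up years, fracture indicator, predicted risk.
c_index = harrell_c_index([1, 2, 3, 4], [1, 1, 0, 1], [0.9, 0.7, 0.4, 0.2])
print(c_index)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is why a change from 0.76 to 0.77 is described as marginal.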

Characteristics of the data
The AURORA dataset contained data on 7,912 patients; 1,795 patients were excluded: 440 were not aged between 40 and 90 years at baseline; 156 did not have a recorded T-score value for the total hip at baseline; and 1,199 had been taking anti-osteoporotic drug therapy for more than one year prior to baseline.
The study sample consisted of 6,117 patients, predominantly female (79.6%), with a mean age of 62.9 (SD: 10.9) years. Two-thirds of this sample (n=4,093) was used for the derivation dataset and one-third (n=2,024) was used for the validation dataset. Table 1 presents the baseline characteristics of the study by derivation and validation dataset, and shows little difference between the datasets.

[Table 1, partial row recovered: Hip DXA T-score: -1.1, 1.1, -1.2, 1.1. Footnotes: * out of patients with a fracture; ** proportion out of respective number of females; *** proportion out of respective number of females with menopause.]

Model development
The unadjusted analysis showed a statistically significant association between BMD (continuous and binary) and osteoporotic fracture (p<0.001). Significant associations with fracture were also found with age (p<0.001), previous fracture (p<0.001), BMI (p=0.03), and gender (p=0.05). Further, a time-varying effect was found in patients with a previous fracture; the hazard of a subsequent fracture was highest in the first year of follow up and decreased with each year of follow up (p<0.001).
The adjusted analysis is presented in Table 2. Model 1 showed that, of the standard risk factors, age and previous fracture were significantly associated with fracture; the hazard of fracture increased by 2% per year increase in age (HR=1.02; 95% CI: 1.01 to 1.04) and increased almost 5-fold in patients with a previous fracture at time 0 years (HR=4.88; 95% CI: 3.37 to 7.08).

Model Validation
The 4-year predicted risk of fracture was calculated for all patients in the validation dataset; this was compared to the observed fracture outcome within the 4 year follow up.

Calibration and Discrimination
Calibration plots suggested some improvement when adding BMD measurement; particularly when including continuous BMD T-score measurement (Model 3; Supplementary Figure 1).
The largest change in discrimination was found when adding continuous BMD measurement to standard risk factors; Harrell's C-Index increased by 1.17% (Table 3). However, binary BMD measurement, as a measure for osteoporotic patients, was found to reduce Harrell's C-Index by 0.65%.

Reclassification
Reclassification tables suggested that adding continuous BMD measurement improved classification of patients into their correct risk categories. This was not found when adding binary BMD. Of the 1,960 patients in the validation dataset, 27% (n=529) were reclassified into a different risk category when continuous BMD was included in fracture risk prediction. Two percent (9/529) were reclassified correctly into a higher risk group and 55% (292/529) were reclassified correctly into a lower risk group, indicating that 22% (292/1,342) of patients at high risk in Model 1, which did not account for BMD measurement, were no longer at high risk. The net reclassification improvement when adding continuous BMD to standard risk factors was 0.03, which resulted from increased specificity (non-event NRI: 4%) and decreased sensitivity (event NRI: -1%) relative to Model 1 (Table 5).
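The decomposition of the NRI into event and non-event components can be sketched as below; the reported components (-1% and 4%) sum to the reported overall value of 0.03. The function name and the counts in the example are hypothetical and are not taken from Table 5; they only show how the two components are formed from a reclassification table with one risk threshold.

```python
def net_reclassification_index(events_up, events_down, n_events,
                               nonevents_up, nonevents_down, n_nonevents):
    """Categorical NRI split into event and non-event components.

    Moving up a risk category is correct for patients with a fracture;
    moving down is correct for patients without one.
    """
    event_nri = (events_up - events_down) / n_events
    nonevent_nri = (nonevents_down - nonevents_up) / n_nonevents
    return event_nri, nonevent_nri, event_nri + nonevent_nri


# Hypothetical counts, for illustration only.
event_nri, nonevent_nri, total_nri = net_reclassification_index(
    events_up=5, events_down=10, n_events=100,
    nonevents_up=20, nonevents_down=60, n_nonevents=1000,
)
print(f"event NRI {event_nri:.2f}, non-event NRI {nonevent_nri:.2f}, NRI {total_nri:.2f}")
```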

Summary of Findings
Bone mineral density showed a significant association with fracture risk, with a 40% decrease in hazard for each SD rise in BMD. However, this resulted in only small improvements in calibration, discrimination, and reclassification. Despite the limited improvement of 1% in discrimination when adding continuous BMD, reclassification tables showed 57% of reclassified patients moving into their correct risk group through improved specificity.
Importantly, no improvement was found when adding BMD in a binary format. Our findings are consistent with the current literature (7,21). Specifically, a study conducted in the Netherlands with 4 year follow up, investigating the added value of BMD for hip fracture risk, found modest improvement in predictability (21). Further, a more recent study also indicated limited added value of BMD to fracture risk prediction (7).

Strengths and Limitations
Answering Evidence gap
To our knowledge, this is the first study to investigate the added value of BMD, in both a binary and a continuous format, to standard fracture risk factors. It helps inform the NICE research recommendation to assess the added value of BMD to routine fracture risk assessment in primary care (22). It further highlights that the binary format of BMD, which is more commonly used for treatment decision making, resulted in a loss of predictability in fracture risk prediction, based on comparable measures for discrimination and reclassification.

Robustness of Data
The prospective cohort was well populated with key standard risk factors recorded: BMI, smoking status and alcohol consumption, and personal and parental fracture history. Other than 3.2% of missing data for BMI, complete data were collected for all risk factors in 6,117 patients (including BMD T-score recorded at the total hip). Further, the cohort was linked to robust national electronic health records. The Danish National Patient Registry allowed fracture outcomes to be identified and also provided data on the mechanism of the fracture; this helped to phenotype osteoporotic fractures more accurately.

Generalisability
The generalisability is affected in a few ways. Firstly, the findings are based on a Danish cohort. Secondly, AURORA data was collected from patients who presented to their doctor with at least one fracture risk factor and were referred to the osteoporosis clinic; this led to a biased study sample with a higher risk of a fracture and increased age. This could overestimate fracture risk amongst patients in a primary care setting.

Methodology
Due to the increased age of the sample, death becomes a competing risk. However, information on death was not collected and could not be retrieved. This limited the analysis, as competing risks could not be accounted for, which may again lead to an overestimation of fracture risk (23). However, as this is an independent study primarily assessing the added value of BMD through deriving and validating fracture risk prediction models, this bias would be present in both analyses when comparing the derived risk models with and without BMD measurement.
The FRAX risk algorithm has not yet been published; therefore FRAX estimates could not be directly calculated for the cohort. Instead, the FRAX risk model was recalibrated on the dataset with and without BMD added. Further, fracture outcomes in this study included pelvic fractures, which are increasingly recognised as low trauma fragility fractures (24), and BMD was taken at the total hip instead of at the femoral neck, as this is the gold standard in Denmark (25).
Internal validation was performed to validate the derived risk prediction models. This may lead to over-optimistic results for the performance of the risk models (14). To account for this limitation, a commonly practised method which randomly assigns patients to the derivation and validation datasets was used; further, the data were split using a 1:2 ratio of validation to derivation patients, similar to that used in other studies (26-28).
The study had a 4 year follow up which is shorter than other recognised risk models. To account for this, we adapted the 20% clinical risk threshold for 10 year fracture estimates to 8.5% for 4 year fracture estimates, assuming that risk is constant over time (29, 30).
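The threshold conversion can be illustrated with a worked calculation. The sketch below assumes that "risk is constant over time" means a constant hazard (exponential survival); this is one reading of the text rather than a confirmed description of the authors' calculation, and the function name `rescale_risk` is illustrative. Under this assumption the 20% 10 year threshold corresponds to roughly 8.5% at 4 years, whereas simple proportional scaling (20% × 4/10) would give 8%.

```python
def rescale_risk(risk: float, from_years: float, to_years: float) -> float:
    """Convert a cumulative risk to another time horizon under a constant hazard.

    With a constant hazard, survival over t years is S(1) ** t, so the survival
    at the new horizon is the original survival raised to to_years / from_years.
    """
    survival = 1.0 - risk
    return 1.0 - survival ** (to_years / from_years)


# 20% fracture risk at 10 years expressed as a 4 year risk.
threshold_4y = rescale_risk(0.20, from_years=10, to_years=4)
print(f"{threshold_4y:.3f}")
```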
Traditional methods for assessing the added value of new risk factors to existing risk prediction models have been criticised as insensitive to change and lacking in interpretability (31-34). This was seen here in the 1% change in Harrell's C-Index and the overlapping confidence intervals between models, which limit the interpretability of the results. Reclassification analysis was thus also used to provide more clinically interpretable results.

Clinical implications
The most notable clinical implication is the more routine use of BMD measurement for fracture risk assessment. Further, evidence suggests continuous BMD adds better predictability compared to the binary format.

Future Research
Further research is recommended to evaluate the added value of BMD to fracture risk prediction; in particular in addition to QFracture risk factors and using primary care routinely collected data. However, a brief interrogation into the Clinical Practice Research Datalink, a routinely collected UK primary care database, showed poor availability of BMD measurement in patient records, and thus, strong limitations to potential analyses. Less than 1% of patients had BMD recorded from a sample of 60,658 patients aged 40-90; not on any osteoporotic treatment; and with complete data for age, gender, BMI, smoking status, and alcohol consumption. Thus, prior to UK analysis, BMD recording in primary care databases needs to improve.
Methodologically, as well as assessing the added value of BMD to standard risk factors, we should also explore the option to replace existing fracture risk factors with the BMD measurement; this has rarely been explored in the literature but should be considered in future analyses. We also recommend research to investigate the added value of BMD in a potentially more natural, 3 group format of BMD (osteopenic, normal, osteoporotic). In addition, further research is recommended to develop current methodology used to assess the added value of BMD to provide more clinically relevant results, such as cost implications; and to allow for better comparability between new risk factors with respect to their added value, thus improving decision making.

Conclusion
Continuous BMD marginally improves fracture risk assessment. Importantly, this was only found when using continuous BMD measurement for osteoporosis. It seems that prediction models for fragility fracture risk may be improved only marginally, using present risk factor assessment and evaluations. It is suggested that future focus should be on additional risk factors and on the development of more clinically relevant methodology to assess the added value of a new risk factor.

Once all 5 models were finalised, their beta coefficients were used to create 5 risk prediction models and calculate risk of fracture for each patient, using the following general equation:

$$P(t) = 1 - S_0(t)^{\exp\left(\sum_{i=1}^{p} \beta_i (x_i - \bar{x}_i)\right)}$$

where $S_0(t)$ is the baseline survival rate at follow up time $t$ (for this example, a follow up time of 10 years will be used); $\beta_i$ are the regression coefficients for each included risk factor in the model ($i = 1, \ldots, p$); $x_i$ is the observed data value for each risk factor; $\bar{x}_i$ is the corresponding mean for each risk factor; and $p$ is the total number of risk factors included in the model. Table A1 shows the formula for each risk prediction model explicitly.
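The general risk equation described in the preceding paragraph can be sketched in a few lines. This is a minimal illustration of the standard way a Cox model is converted to an absolute risk, not the authors' code; the function name `predicted_risk` and all numerical values (baseline survival, coefficients, means, and patient values) are hypothetical.

```python
import math
from typing import Sequence


def predicted_risk(
    baseline_survival: float,
    betas: Sequence[float],
    values: Sequence[float],
    means: Sequence[float],
) -> float:
    """Absolute risk at a fixed follow up time from a Cox model.

    baseline_survival is the baseline survival at that time for a patient
    whose risk factors all equal the cohort means.
    """
    # Linear predictor centred on the mean of each risk factor.
    linear_predictor = sum(
        beta * (value - mean) for beta, value, mean in zip(betas, values, means)
    )
    return 1.0 - baseline_survival ** math.exp(linear_predictor)


# Hypothetical model with two risk factors: age (years) and previous fracture (0/1).
betas = [0.02, 1.5]
means = [60.0, 0.3]
average_patient = predicted_risk(0.95, betas, [60.0, 0.3], means)
older_with_fracture = predicted_risk(0.95, betas, [75.0, 1.0], means)
print(f"{average_patient:.3f} {older_with_fracture:.3f}")
```

A patient whose risk factors all equal the cohort means has a linear predictor of zero, so the predicted risk reduces to one minus the baseline survival.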

Competing Interests
We have read and understood BMJ policy on declaration of interests and declare that we have no competing interests.
All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare: no support from any organisation for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.

Transparency declaration
The lead author (Dr Dhiman) affirms that this manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.

Ethical Approval
Ethics approval was given through the Region of North Jutland's from the Danish Data Protection Agency ("paraplyanmeldelse 2008-58-0028").

Funding
This study was funded through the National Institute for Health Research (NIHR), School for Primary Care Research (SPCR).

NIHR acknowledgement
This paper presents independent research funded by the National Institute for Health Research School for Primary Care Research (NIHR SPCR).

Disclaimer
The views expressed are those of the author(s) and not necessarily those of the NIHR, the NHS or the Department of Health.

Patient Involvement
Patients were not involved in the development of this research question.

Data Sharing Statement
Data sharing: The technical appendix and statistical code are available from the corresponding author at paula.dhiman@nottingham.ac.uk. Due to restrictions by the Danish Data Protection Agency, data can only be shared on an aggregated level and by special permission.

Contributorship
Contributors: PD wrote the statistical analysis plan, cleaned and analysed the data, and drafted and revised the paper. PV and SA provided the AURORA dataset for analysis and linked patients to the National Patient Registry of Denmark; they also reviewed and revised the draft paper. NQ and TM provided clinical expertise, and reviewed and revised the draft paper.

Conclusions: Continuous bone mineral density marginally improves fracture risk assessment. Importantly, this was only found when using continuous BMD measurement for osteoporosis. It is suggested that future focus should be on evaluation of this risk factor using routinely collected data, and on the development of more clinically relevant methodology to assess the added value of a new risk factor.

Strengths and Limitations:
• Addresses a research question recommended by The National Institute for Health and Care Excellence to investigate the added value of bone mineral density to fracture risk prediction.
• Investigates bone mineral density in both the commonly used, binary, and continuous format.
• Presents changes in calibration, discrimination, and reclassification to describe the added value of bone mineral density.
• Uses robustly collected data from Northern Denmark, with 3.2% missing data.
• As data is from a North Danish population, with at least one fracture risk factor, this limits generalisability of the results.

Given this burden, and treatment options for osteoporosis, identifying patients at risk of an osteoporotic fracture is high priority amongst health policymakers to reduce the risk of future fracture (3). Risk prediction tools have been developed to aid in the identification of patients at risk. For example, the Fracture Risk Assessment Tool (FRAX®) and QFracture® are commonly used to assess fracture risk in patients based on pre-defined risk factors.
Bone mineral density (BMD), a measurement used to aid diagnosis of osteoporosis, has also been identified as a fracture risk factor (4-7). Unlike some other fracture risk factors, treatment options (e.g. bisphosphonate medication) are available that reduce the fracture risk markedly when treatment is initiated based on low BMD.
English national guidelines from the National Institute for Health and Care Excellence (NICE) for fracture risk assessment recommend treatment of osteoporosis to prevent fractures but have not included BMD as a mandatory risk factor for fracture risk prediction tools to incorporate (8). This is partly due to the lack of robust evidence and the limited generalisability of current research on the added value of BMD to existing fracture risk factors, which has focused particularly on postmenopausal women (5-7).
NICE also recognises this gap in the evidence and has recommended research to assess the added value of BMD as a risk factor in fracture risk assessment (9).
The aim of this study is to assess the value of BMD measurement in addition to the standard fracture risk factors used in the FRAX® risk model using a robustly collected prospective cohort.

Methods
This paper has been written in accordance with the TRIPOD checklist.

Patient Involvement
Patients were not involved in the development of this research question and were not involved in the design of this study.

Study Design and Data Source
A prospective cohort study was conducted using patients from the Aalborg University Hospital Record for Osteoporosis Risk Assessment (AURORA) dataset; patients were followed up using the National Patient Registry of Denmark.
The AURORA dataset consists of patients attending the Osteoporosis Clinic at Aalborg University Hospital after a referral from their primary care physician. A referral was offered to patients with at least one risk factor for osteoporosis (low BMI, previous fracture, parental hip fracture, smoking status, alcohol consumption, glucocorticoid use, rheumatoid arthritis, and secondary osteoporosis) or if they were aged 80 years and above. Further detail of the data collection has been described elsewhere (10). The Danish National Patient Registry, which collects inpatient and outpatient data from all Danish hospitals, was linked to the AURORA dataset through unique patient identifiers.
Ethics approval was given through the Region of North Jutland's from the Danish Data Protection Agency ("paraplyanmeldelse 2008-58-0028").

Cohort selection
Data collection for AURORA began on 1st January 2010 and continued for 3 years (up to 31st December 2012). Patients were included if they were aged 40-90 years; had a BMD T-score at the hip; and had not been taking any osteoporosis-preventing or bone sparing drugs for more than one year prior to baseline.

Primary Outcome
The primary outcome measure was an incident osteoporotic fracture during follow up (01/01/2012 to 01/01/2014), defined as a diagnosis of a fracture at the hip, spine, forearm, humerus, or pelvis. Fractures at these sites resulting from traffic, work, and sports related accidents were excluded from the study. Relevant fractures were identified in the Danish National Patient Registry using International Statistical Classification of Diseases, 10th Revision (ICD-10) codes; the code list for each fracture was developed using recognised database methodology (11).

Fracture risk factors
Fracture risk factors, used in the FRAX® risk prediction model, were extracted at baseline. They were: age; gender; height (m); weight (kg); previous fracture; parental history of hip fracture, current smoking status; current alcohol consumption; glucocorticoid use (currently exposed for 3+ months); rheumatoid arthritis; and secondary osteoporosis (includes type I diabetes; osteogenesis imperfecta in adults; untreated, long standing hyperthyroidism;

Bone Mineral Density
DXA scans were performed by trained technicians using a Hologic Discovery A scanner (Bedford, MA, USA). A daily quality control (QC) programme was in place, and the in vivo coefficient of variation (CV) with repositioning of patients was <1%. Total hip BMD was used as the region of interest. Bone mineral density was added to the fracture risk prediction model in two ways: firstly, as a continuously measured T-score value, and secondly, as a binary risk factor, dichotomised at the T-score threshold for osteoporosis of -2.5 (manufacturer's normal range using normal material from T Kelly et al (12)), based on World Health Organisation (WHO) classifications (13). Calculated T-scores were gender specific.

Statistical Analysis
A complete case analysis was performed on the data; 3.2% of data was missing. The AURORA dataset was split into two using recognised methodology (14); where a random number was assigned to patients and based on a cut off, two-thirds was used to derive the risk models, and the remaining third was used to validate them.
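The random split described above can be sketched as follows. This is an illustration of the general approach (a random number per patient compared against a cut off), not the authors' code; the function name `split_cohort`, the seed, and the use of integer patient identifiers are assumptions.

```python
import random
from typing import Iterable, List, Tuple


def split_cohort(
    patient_ids: Iterable[int], cutoff: float = 2 / 3, seed: int = 2010
) -> Tuple[List[int], List[int]]:
    """Assign each patient a uniform random number and split on a cut off."""
    rng = random.Random(seed)  # fixed (hypothetical) seed so the split is reproducible
    derivation: List[int] = []
    validation: List[int] = []
    for patient_id in patient_ids:
        # Numbers below the cut off go to derivation, the rest to validation.
        if rng.random() < cutoff:
            derivation.append(patient_id)
        else:
            validation.append(patient_id)
    return derivation, validation


derivation, validation = split_cohort(range(6117))
print(len(derivation), len(validation))
```

Because each patient is assigned independently, the derivation set holds approximately, rather than exactly, two-thirds of the cohort.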

Model derivation
Three Cox proportional hazards models were developed for the primary outcome, using a complete case analysis on the derivation dataset: Model 1 included the standard risk factors; Model 2 added binary BMD; and Model 3 added continuous BMD T-score. Graphical methods (log-log plots) were used to assess the proportional hazards assumption, and risk factors violating this assumption were added to the model as time varying covariates.
Recognised methodology was used to build the 3 risk prediction models (15,16); the Kaplan-Meier method was used to obtain 4-year fracture risk estimates for patients. Further detail on the conversion of the Cox proportional hazards models to risk prediction models has been provided in Supplementary Table 1.

Validation of Models
Four-year fracture risk was calculated from each model and the predictive performance of each risk prediction model was assessed by measures describing calibration, discrimination, and reclassification. These metrics were assessed using the validation cohort.
Calibration measures how well the predicted risk agrees with the observed risk in the data. Calibration plots show the mean predicted and observed risk of fracture for each decile of predicted risk. The observed risk of fracture was derived from the 4 year Kaplan-Meier estimate. Good calibration indicates the predicted risk is close to the observed risk of the outcome.
Discrimination measures how well the risk prediction model differentiates between patients who did and did not experience the event during the study. This was quantified by Harrell's C-Index.
Reclassification tables (17) measure movement between risk categories when adding a new risk factor. The threshold for treatment at 4 years was set at a fracture risk level of 8.5%, to be comparable to the treatment threshold of 20% at 10 years. Reclassification was presented as the total percentage of patients reclassified (incorrectly and correctly), and also as the Net Reclassification Index (NRI) (18,19). The NRI gives the net proportion of changes in the right direction; a higher NRI indicates a better reclassifying model.
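To make the discrimination measure concrete, the sketch below computes Harrell's C-Index for right-censored data in plain Python. It is a minimal illustration, not the authors' code: the function name `harrell_c` and the toy data are hypothetical, a pair is treated as comparable only when the patient with the shorter follow up time had a fracture, and tied predicted risks count as half concordant.

```python
from typing import Sequence


def harrell_c(
    times: Sequence[float], events: Sequence[int], risks: Sequence[float]
) -> float:
    """Proportion of comparable patient pairs ranked correctly by predicted risk."""
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored patient cannot be the earlier failure in a pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # earlier fracture had the higher predicted risk
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: follow up time in years, fracture indicator, predicted 4 year risk.
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.4, 0.2]
c_index = harrell_c(times, events, risks)
print(c_index)
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which is why a change from 0.76 to 0.77 is described as marginal.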

Characteristics of the data
The AURORA dataset held data on 7,912 patients. Of these, 1,795 patients were excluded: 440 were not aged 40-90 years at baseline; 156 did not have a recorded T-score value for the total hip at baseline; and 1,199 had been taking anti-osteoporotic drug therapy for more than one year prior to baseline.
The study sample consisted of 6,117 patients, who were predominantly female (79.6%) and had a mean age of 62.9 (SD: 10.9) years. Two-thirds of this sample (n=4,093) was used for the derivation dataset and one-third (n=2,094) was used for the validation dataset.

Summary of Findings
Bone mineral density showed a significant association with fracture risk, with a 40% decrease in hazard for each SD rise in BMD. However, this resulted in only small improvements in calibration, discrimination, and reclassification. The C-Index estimate was slightly higher with continuous BMD, but this increase is not conclusive given the width of the confidence intervals. Despite the limited improvement of 1% in discrimination when adding continuous BMD, reclassification tables showed 57% of reclassified patients moving into their correct risk group through improved specificity. Importantly, no improvement was found when adding BMD in a binary format.
Our findings are consistent with the current literature (7,21,22). Specifically, a study conducted in the Netherlands with 4 year follow up, investigating the added value of BMD for hip fracture risk, found modest improvement in predictability (21). Further, two more recent studies also indicated limited added value of BMD to fracture risk prediction (7,22).

Answering Evidence gap
To our knowledge, this is the first study to investigate the added value of BMD, in both a binary and a continuous format, to standard fracture risk factors. Further, it is based on a larger sample size than other studies investigating BMD in addition to FRAX (7,21,22). It helps inform the NICE research recommendation to assess the added value of BMD to routine fracture risk assessment in primary care (23). It further highlights that the binary format of BMD, which is more commonly used for treatment decision making, resulted in a loss of predictability in fracture risk prediction, based on comparable measures for discrimination and reclassification.

Generalisability
The generalisability is affected in a few ways. Firstly, the findings are based on a Danish cohort. Secondly, AURORA data was collected from patients who presented to their doctor with at least one fracture risk factor and were referred to the osteoporosis clinic; this led to a biased study sample with a higher risk of a fracture and increased age. This could overestimate fracture risk amongst patients in a primary care setting. Due to the increased age of the sample, death becomes a competing risk. However, information on death was not collected and could not be retrieved. This limited the analysis of the data as competing risks could not be accounted for which may again lead to an overestimation of fracture risk (24). However, as an independent study primarily assessing the added value of BMD through deriving and validating the fracture risk prediction models, this bias would be present in both analyses to compare derived risk models with and without BMD measurement.

Methodology
The FRAX risk algorithm has not yet been published; therefore FRAX estimates could not be directly calculated for the cohort. Instead, the FRAX risk model was recalibrated on the dataset with and without BMD added. Further, fracture outcomes in this study included pelvic fractures, which are increasingly recognised as low trauma fragility fractures (25), and BMD was taken at the total hip instead of at the femoral neck, as this is the gold standard in Denmark (26).
Internal validation was performed to validate the derived risk prediction models. This may lead to over-optimistic results for the performance of the risk models (14). To account for this limitation, a commonly practised method which randomly assigns patients to the derivation and validation datasets was used; further, the data were split using a 1:2 ratio of validation to derivation patients, similar to that used in other studies (27-29).
The study had a 4 year follow up which is shorter than other recognised risk models. To account for this, we adapted the 20% clinical risk threshold for 10 year fracture estimates to 8.5% for 4 year fracture estimates, assuming that risk is constant over time (30, 31).
Traditional methods for assessing the added value of new risk factors to existing risk prediction models have been criticised as insensitive to change and lacking in interpretability (32-35). This was seen here in the 1% change in Harrell's C-Index and the overlapping confidence intervals between models, which limit the interpretability of the results. Reclassification analysis was thus also used to provide more clinically interpretable results.

Future Research
Further research is recommended to evaluate the added value of BMD to fracture risk prediction; in particular in addition to QFracture risk factors and using primary care routinely collected data. However, a brief interrogation into the Clinical Practice Research Datalink, a routinely collected UK primary care database, showed poor availability of BMD measurement in patient records, and thus, strong limitations to potential analyses. Less than 1% of patients had BMD recorded from a sample of 60,658 patients aged 40-90; not on any osteoporotic treatment; and with complete data for age, gender, BMI, smoking status, and alcohol consumption. Thus, prior to UK analysis, BMD recording in primary care databases needs to improve. Methodologically, as well as assessing the added value of BMD to standard risk factors, we should also explore the option to replace existing fracture risk factors with the BMD measurement; this has rarely been explored in the literature but should be considered in future analyses. We also recommend research to investigate the added value of BMD in a potentially more natural, 3 group format of BMD (osteopenic, normal, osteoporotic).
In addition, further research is recommended to develop current methodology used to assess the added value of BMD to provide more clinically relevant results, such as cost implications; and to allow for better comparability between new risk factors with respect to their added value, thus improving decision making.

Copyright
The Corresponding Author has the right to grant on behalf of all authors and does grant on behalf of all authors, a worldwide license to the Publishers and its licensees in perpetuity, in all forms, formats and media (whether known now or created in the future), to i) publish, reproduce, distribute, display and store the Contribution, ii) translate the Contribution into other languages, create adaptations, reprints, include within collections and create summaries, extracts and/or, abstracts of the Contribution, iii) create any other derivative work(s) based on the Contribution, iv) to exploit all subsidiary rights in the Contribution, v) the inclusion of electronic links from the Contribution to third party material where-ever it may be located; and, vi) license any third party to do any or all of the above.

Exclusive License Statement
The Corresponding Author has the right to grant on behalf of all authors and does grant on behalf of all authors, an exclusive licence (or non exclusive for government employees) on a worldwide basis to the BMJ Publishing Group Ltd to permit this article (if accepted) to be published in BMJ editions and any other BMJPGL products and sublicenses such use and exploit all subsidiary rights, as set out in our licence.

Article Summary
Strengths and Limitations: • Addresses a research question recommended by The National Institute for Health and Care Excellence to investigate the added value of bone mineral density to fracture risk prediction. • Investigates bone mineral density in both the commonly used, binary, and continuous format. • Presents changes in calibration, discrimination, and reclassification to describe the added value of bone mineral density. • Uses robustly collected data from Northern Denmark, with 3.2% missing data.
• As data is from a North Danish population, with at least one fracture risk factor, this limits generalisability of the results. Given this burden, and treatment options for osteoporosis, identifying patients at risk of an osteoporotic fracture is high priority amongst health policymakers to reduce the risk of future fracture (3). Risk prediction tools have been developed to aid in the identification of patients at risk. For example, the Fracture Risk Assessment Tool (FRAX®) and QFracture® are commonly used to assess fracture risk in patients based on pre-defined risk factors.
Bone mineral density (BMD), a measurement used to aid diagnosis of osteoporosis, has also been identified as a fracture risk factor (4-7). Unlike for some other fracture risk factors, treatment options (e.g. bisphosphonate medication) are available that reduce fracture risk markedly when treatment is initiated based on low BMD.
English national guidelines from The National Institute for Health and Care Excellence (NICE) for fracture risk assessment recommend treatment of osteoporosis to prevent fractures, but do not require fracture risk prediction tools to incorporate BMD as a mandatory risk factor (8). This is partly due to the lack of robust evidence and the limited generalisability of current research evaluating the added value of BMD to existing fracture risk factors, which has particularly focused on postmenopausal women (5-7).
NICE also recognises this gap in the evidence and has recommended research to assess the added value of BMD as a risk factor in fracture risk assessment (9).
The aim of this study is to assess the value of BMD measurement in addition to the standard fracture risk factors used in the FRAX® risk model using a robustly collected prospective cohort.

Methods
This paper has been written in accordance with the TRIPOD checklist.

Patient and Public Involvement
Patients and the public were not involved in the development of this research question and were not involved in the design of this study.

Study Design and Data Source
A prospective cohort study was conducted using patients from the Aalborg University Hospital Record for Osteoporosis Risk Assessment (AURORA) dataset; patients were followed up using the National Patient Registry of Denmark.
The AURORA dataset consists of patients attending the Osteoporosis Clinic at Aalborg University Hospital after a referral from their primary care physician. A referral was offered to patients with at least one risk factor for osteoporosis (low BMI, previous fracture, parental hip fracture, smoking status, alcohol consumption, glucocorticoid use, rheumatoid arthritis, and secondary osteoporosis) or if they were aged 80 years and above. Further detail of the data collection has been described elsewhere (10). The Danish National Patient Registry, which collects inpatient and outpatient data from all Danish hospitals, was linked to the AURORA dataset through unique patient identifiers. Ethics approval was given through the Region of North Jutland's umbrella notification to the Danish Data Protection Agency ("paraplyanmeldelse 2008-58-0028").

Cohort selection
Data collection for AURORA began on 1st January 2010 and continued for 3 years (up to 31st December 2012). Patients were included if they were aged 40-90 years; had a BMD T-score at the hip; and had not been taking any osteoporosis-preventing or bone-sparing drugs for more than one year prior to baseline.

Primary Outcome
The primary outcome measure was an incident osteoporotic fracture during follow up (01/01/2012 to 01/01/2014), defined as a diagnosis of a fracture at the hip, spine, forearm, humerus, or pelvis. Fractures at these sites resulting from traffic, work, and sports related accidents were excluded from the study. Relevant fractures were identified in the Danish National Patient Registry using International Statistical Classification of Diseases, 10th Revision (ICD-10) codes; the code list for each fracture was developed using recognised database methodology (11).

Fracture risk factors
Fracture risk factors used in the FRAX® risk prediction model were extracted at baseline. They were: age; gender; height (m); weight (kg); previous fracture; parental history of hip fracture; current smoking status; current alcohol consumption; glucocorticoid use (currently exposed for 3+ months); rheumatoid arthritis; and secondary osteoporosis (includes type I diabetes; osteogenesis imperfecta in adults; untreated, long standing hyperthyroidism;

Bone Mineral Density
DXA scans were performed by trained technicians using a Hologic Discovery A scanner (Bedford, MA, USA). A daily quality control (QC) programme was in place, and the in vivo coefficient of variation (CV), assessed by repositioning patients, was <1%. Total hip BMD was used as the region of interest. Bone mineral density was added to the fracture risk prediction model in two ways: firstly, as a continuously measured T-score; and secondly, as a binary risk factor, dichotomised at the T-score threshold for osteoporosis of -2.5 (at or below the threshold versus above it), based on World Health Organisation (WHO) classifications (13). T-scores were calculated from the manufacturer's normal range, using normal material from T Kelly et al (12), and were gender specific.

Statistical Analysis
A complete case analysis was performed; 3.2% of the data were missing. The AURORA dataset was split into two using recognised methodology (14): a random number was assigned to each patient and, based on a cut-off, two-thirds of patients were used to derive the risk models and the remaining third to validate them.
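The random split described above can be sketched as follows. This is a minimal illustration rather than the study's statistical code: the function name `split_cohort`, the seed, and the use of integer patient identifiers are all assumptions made for the example.

```python
import random


def split_cohort(patient_ids, derivation_fraction=2 / 3, seed=2010):
    """Assign each patient a uniform random number and split on a cut-off.

    Hypothetical helper: the seed and the cut-off value are illustrative.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    derivation, validation = [], []
    for pid in patient_ids:
        u = rng.random()  # random number in [0, 1) for this patient
        if u < derivation_fraction:
            derivation.append(pid)  # about two-thirds of patients
        else:
            validation.append(pid)  # about one-third of patients
    return derivation, validation


# Illustrative cohort of 6,117 patient identifiers
derivation, validation = split_cohort(range(6117))
```

Because each patient is allocated independently, the derivation set is only approximately two-thirds of the cohort; an exact split would require shuffling the identifiers and slicing instead.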

Model derivation
Three Cox proportional hazards models were developed for the primary outcome, using a complete case analysis on the derivation dataset: Model 1 included the standard FRAX® risk factors; Model 2 added binary BMD to the standard risk factors; and Model 3 added continuous BMD T-score to the standard risk factors. Graphical methods (log-log plots) were used to assess the proportional hazards assumption, and risk factors violating this assumption were added to the model as time varying covariates.
Recognised methodology was used to build the three risk prediction models (15,16); the Kaplan-Meier method was used to obtain 4-year fracture risk estimates for patients. Further detail on the conversion of the Cox proportional hazards models to risk prediction models has been provided in Supplementary Table 1.
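As an illustration of how fitted Cox coefficients are commonly converted into an absolute risk, the sketch below uses the standard form risk = 1 - S0(t)^exp(linear predictor), with each risk factor centred on its mean. The coefficients are the logs of the Model 1 hazard ratios for age and previous fracture reported in this paper; the baseline survival and the covariate means are hypothetical values chosen only to make the example run, and time varying covariates are ignored.

```python
import math


def predicted_risk(baseline_survival, betas, values, means):
    """Absolute risk at the follow-up time to which baseline_survival refers.

    The linear predictor is centred on the cohort means, so a patient with
    mean values for every risk factor has risk 1 - baseline_survival.
    """
    linear_predictor = sum(b * (x - m) for b, x, m in zip(betas, values, means))
    return 1.0 - baseline_survival ** math.exp(linear_predictor)


# Log hazard ratios for age (per year) and previous fracture (Model 1)
betas = [math.log(1.024), math.log(4.881)]
means = [65.0, 0.30]        # hypothetical cohort means
baseline_survival = 0.95    # hypothetical 4-year baseline survival

risk_average = predicted_risk(baseline_survival, betas, means, means)
risk_older_with_fracture = predicted_risk(baseline_survival, betas, [80.0, 1.0], means)
```

Centring on the means is a design choice that makes the baseline survival interpretable as the survival of an average patient, which is why the means must be stored alongside the coefficients.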

Validation of Models
Four-year fracture risk was calculated from each model and the predictive performance of each risk prediction model was assessed by measures describing calibration, discrimination, and reclassification. These metrics were assessed using the validation cohort.
Calibration measures how well the predicted risk agrees with the observed risk in the data. It was assessed by plotting the mean predicted risk against the observed risk of fracture for each decile of predicted risk. The observed risk of fracture was derived from the 4-year Kaplan-Meier estimate. Good calibration indicates that the predicted risk is close to the observed risk of the outcome.
Discrimination measures how well the risk prediction model differentiates between patients who did and did not experience the event during the study. This was quantified by the area under the receiver operating characteristic curve, using Harrell's C-Index.
Reclassification tables (17) measure movement between risk categories when adding a new risk factor. The threshold for treatment at 4 years was set at a fracture risk level of 8.5%, to be comparable to the treatment threshold of 20% at 10 years. Reclassification was presented as the total percentage of patients reclassified (incorrectly and correctly), and also as the Net Reclassification Index (NRI) (18,19). The NRI gives the net proportion of changes in the right direction, and a higher NRI indicates a better reclassifying model.
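The decile-based comparison of predicted and observed risk can be sketched in plain Python as below. The function names are hypothetical, and the Kaplan-Meier estimator is a simple textbook version written for the example; it is not the study's code.

```python
def kaplan_meier_risk(times, events, horizon):
    """Observed risk by `horizon`: 1 minus the Kaplan-Meier survival estimate."""
    data = sorted(zip(times, events))
    survival, at_risk, i = 1.0, len(data), 0
    while i < len(data) and data[i][0] <= horizon:
        t, fractures, leaving = data[i][0], 0, 0
        # Gather everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            fractures += data[i][1]
            leaving += 1
            i += 1
        if fractures:
            survival *= 1.0 - fractures / at_risk
        at_risk -= leaving  # events and censored patients leave the risk set
    return 1.0 - survival


def calibration_by_decile(predicted, times, events, horizon=4.0, groups=10):
    """Return (mean predicted risk, observed risk) for each risk group."""
    order = sorted(range(len(predicted)), key=lambda k: predicted[k])
    rows = []
    for g in range(groups):
        idx = order[g * len(order) // groups:(g + 1) * len(order) // groups]
        mean_predicted = sum(predicted[k] for k in idx) / len(idx)
        observed = kaplan_meier_risk(
            [times[k] for k in idx], [events[k] for k in idx], horizon
        )
        rows.append((mean_predicted, observed))
    return rows
```

Plotting the first element of each row against the second gives the calibration plot; points close to the diagonal indicate good calibration. The sketch assumes at least as many patients as groups.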
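A two-category reclassification summary at the 8.5% threshold can be sketched as follows. This is a simplified illustration that treats the outcome as binary and ignores censoring, which is an assumption of the example; the function name and the returned dictionary keys are hypothetical.

```python
def reclassification_summary(old_risk, new_risk, event, threshold=0.085):
    """Count movements across the treatment threshold and compute the NRI.

    A move is 'correct' when a patient with a fracture moves up, or a
    patient without a fracture moves down.
    """
    up_event = down_event = up_nonevent = down_nonevent = 0
    for old, new, had_event in zip(old_risk, new_risk, event):
        old_high, new_high = old >= threshold, new >= threshold
        if new_high and not old_high:      # moved into the high-risk group
            if had_event:
                up_event += 1
            else:
                up_nonevent += 1
        elif old_high and not new_high:    # moved into the low-risk group
            if had_event:
                down_event += 1
            else:
                down_nonevent += 1
    n_event = sum(1 for e in event if e)
    n_nonevent = len(event) - n_event
    nri = (up_event - down_event) / n_event + (down_nonevent - up_nonevent) / n_nonevent
    return {
        "reclassified": up_event + down_event + up_nonevent + down_nonevent,
        "correct": up_event + down_nonevent,
        "nri": nri,
    }


# Toy data: risks from a model without BMD (old) and with BMD (new)
summary = reclassification_summary(
    old_risk=[0.05, 0.10, 0.05, 0.10],
    new_risk=[0.10, 0.05, 0.05, 0.10],
    event=[1, 0, 0, 1],
)
```

In this toy example two patients change category and both move in the correct direction, so the NRI takes its value for a fully correct reclassification of those patients.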

Characteristics of the data
The AURORA dataset contained data on 7,912 patients. Of these, 1,795 patients were excluded: 440 were not aged 40-90 years at baseline; 156 did not have a recorded T-score for the total hip at baseline; and 1,199 had been taking anti-osteoporotic drug therapy for more than one year prior to baseline. This left 6,117 patients.
The adjusted analysis is presented in Table 2. Model 1 showed that, of the standard risk factors, age and previous fracture were significantly associated with fracture: the hazard of fracture increased by 2% per one-year increase in age (HR=1.024; 95% CI: 1.013 to 1.036) and was almost 5-fold higher in patients with a previous fracture at time 0 years (HR=4.881; 95% CI: 3.336 to 7.078).

Model Validation
The 4-year predicted risk of fracture was calculated for all patients in the validation dataset; this was compared to the observed fracture outcome within the 4 year follow up.

Calibration and Discrimination
Calibration plots suggested some improvement when adding BMD measurement; particularly when including continuous BMD T-score measurement (Model 3; Supplementary Figure 1).
The largest change in discrimination was found when adding continuous BMD measurement to the standard risk factors; Harrell's C-Index increased by 1.17% (Table 3). However, binary BMD measurement, as a measure for osteoporotic patients, was found to reduce Harrell's C-Index by 0.65%.
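For readers unfamiliar with the discrimination measure, a minimal sketch of Harrell's C-Index for right-censored data is given below. It is a direct pairwise implementation written for illustration, with ties in predicted risk counted as one half; the function name is hypothetical and the toy data are not from the study.

```python
def harrell_c_index(times, events, risk):
    """Proportion of comparable patient pairs ranked correctly by predicted risk.

    A pair (i, j) is comparable when patient i has an observed fracture
    strictly before patient j's follow-up time.
    """
    concordant = tied = comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1  # earlier fracture had the higher predicted risk
                elif risk[i] == risk[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / comparable


# Toy data: follow-up time in years, fracture indicator, predicted risk
c_perfect = harrell_c_index([1, 2, 3, 4], [1, 1, 0, 1], [0.9, 0.7, 0.5, 0.1])
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which is why differences between models on this scale tend to be small.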

Summary of Findings
Bone mineral density showed a significant association with fracture risk, with a 40% decrease in hazard for each SD rise in BMD. However, this resulted in only small improvements in calibration, discrimination, and reclassification. The C-Index estimate was slightly higher with continuous BMD, but this increase is not conclusive given the width of the confidence intervals. Despite the limited improvement of about 1% in discrimination when adding continuous BMD, reclassification tables showed 57% of reclassified patients moving into their correct risk group through improved specificity. Importantly, no improvement was found when adding BMD in a binary format.
Our findings are consistent with the current literature (7,21,22). Specifically, a study conducted in the Netherlands with 4-year follow up, investigating the added value of BMD for hip fracture risk, found a modest improvement in predictability (21). Further, two more recent studies also indicated limited added value of BMD to fracture risk prediction (7,22).

Strengths and Limitations
Answering Evidence Gap
To our knowledge, this is the first study to investigate the added value of BMD, in both a binary and a continuous format, to standard fracture risk factors. Further, it is based on a larger sample size than other studies investigating BMD in addition to FRAX (7,21,22). It helps inform the NICE research recommendation to assess the added value of BMD to routine fracture risk assessment in primary care (23). It further highlights that the binary format of BMD, which is more commonly used for treatment decision making, resulted in a loss of predictability in fracture risk prediction, based on comparable measures for discrimination and reclassification.

Robustness of Data
The prospective cohort was well populated with key standard risk factors recorded: BMI, smoking status and alcohol consumption, and personal and parental fracture history. Other than 3.2% of missing data for BMI, complete data were collected for all risk factors (including BMD T-score recorded at the total hip) in the 6,117 patients. Further, the cohort was linked to a robust national electronic health record. The Danish National Patient Registry allowed outcome fractures to be identified and also provided data on the mechanism of the fracture; this helped to phenotype osteoporotic fractures more accurately.

Generalisability
Generalisability is affected in a few ways. Firstly, the findings are based on a Danish cohort. Secondly, AURORA data were collected from patients who presented to their doctor with at least one fracture risk factor and were referred to the osteoporosis clinic; this led to a biased study sample with a higher risk of fracture and increased age. This could overestimate fracture risk amongst patients in a primary care setting.
Due to the increased age of the sample, death becomes a competing risk. However, information on death was not collected and could not be retrieved. This limited the analysis of the data, as competing risks could not be accounted for, which may again lead to an overestimation of fracture risk (24). However, as this independent study primarily assesses the added value of BMD by deriving and validating fracture risk prediction models, this bias would be present in both sets of models being compared (with and without BMD measurement).

Methodology
The FRAX risk algorithm has not yet been published; therefore, FRAX estimates could not be directly calculated for the cohort. Instead, the FRAX risk model was recalibrated on the dataset with and without BMD added. Further, fracture outcomes in this study included pelvic fractures, which are increasingly recognised as low trauma fragility fractures (25), and BMD was taken at the total hip instead of at the femoral neck, as the total hip is the gold standard in Denmark (26).
Internal validation was performed to validate the derived risk prediction models. This may lead to over-optimistic results for the performance of the risk models (14). To account for this limitation, a commonly practised method which randomly assigns patients to the derivation and validation datasets was used; further, a similar 1:2 ratio was also used to split the data (27-29).
The study had a 4 year follow up which is shorter than other recognised risk models. To account for this, we adapted the 20% clinical risk threshold for 10 year fracture estimates to 8.5% for 4 year fracture estimates, assuming that risk is constant over time (30, 31).
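The conversion of the 10-year threshold to a 4-year threshold can be checked with a short calculation. The sketch below reads "risk is constant over time" as a constant hazard, which is an assumption of this example, so that survival over 4 years equals 10-year survival raised to the power 4/10. The function name is hypothetical.

```python
def convert_risk_threshold(risk, from_years, to_years):
    """Rescale a cumulative risk to a different horizon under a constant hazard.

    Under a constant hazard, survival over t years is S(1)**t, so
    S(to_years) = S(from_years) ** (to_years / from_years).
    """
    survival = 1.0 - risk
    return 1.0 - survival ** (to_years / from_years)


# 20% at 10 years rescaled to a 4-year horizon
four_year_threshold = convert_risk_threshold(0.20, from_years=10, to_years=4)
print(f"{four_year_threshold:.1%}")
```

Under this reading the result rounds to the 8.5% threshold used in the study, whereas a simple linear rescaling (20% x 4/10) would give 8%.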
Traditional methods for assessing the added value of a risk factor to existing risk prediction models have been criticised as being insensitive to change and lacking interpretability (32-35). This was seen here in the 1% change in Harrell's C-Index and the overlapping confidence intervals between models, which limit the interpretability of the results. Reclassification analysis was therefore also used to provide more clinically interpretable results.

Clinical implications
The most notable clinical implication is the more routine use of BMD measurement for fracture risk assessment. Further, the evidence suggests that continuous BMD adds more predictability than the binary format.

Future Research
Further research is recommended to evaluate the added value of BMD to fracture risk prediction, in particular in addition to QFracture risk factors and using routinely collected primary care data. However, a brief interrogation of the Clinical Practice Research Datalink, a routinely collected UK primary care database, showed poor availability of BMD measurement in patient records and, thus, strong limitations to potential analyses. Less than 1% of patients had BMD recorded in a sample of 60,658 patients aged 40-90 who were not on any osteoporotic treatment and had complete data for age, gender, BMI, smoking status, and alcohol consumption. Thus, prior to UK analysis, BMD recording in primary care databases needs to improve.
Methodologically, as well as assessing the added value of BMD to standard risk factors, we should also explore the option of replacing existing fracture risk factors with the BMD measurement; this has rarely been explored in the literature but should be considered in future analyses. We also recommend research to investigate the added value of BMD in a potentially more natural three-group format (normal, osteopenic, osteoporotic).
In addition, further research is recommended to develop current methodology used to assess the added value of BMD to provide more clinically relevant results, such as cost implications; and to allow for better comparability between new risk factors with respect to their added value, thus improving decision making.

Conclusion
Continuous BMD marginally improves fracture risk assessment. Importantly, this was only found when using continuous BMD measurement for osteoporosis. It seems that prediction models for fragility fracture risk may be improved only marginally, using present risk factor assessment and evaluations. It is suggested that future focus should be on additional risk factors and on the development of more clinically relevant methodology to assess the added value of a new risk factor.

Once all 5 models were finalised, their beta coefficients were used to create 5 risk prediction models and calculate risk of fracture for each patient, using the following general equation:

$$\text{Risk}(t) = 1 - S_0(t)^{\exp\left(\sum_{i=1}^{p} \beta_i \left(x_i - \bar{x}_i\right)\right)}$$

where $S_0(t)$ is the baseline survival rate at follow up time $t$ (for this example, a follow up time of 10 years will be used); $\beta_i$ are the regression coefficients for each included risk factor in the model ($i = 1, \dots, p$); $x_i$ is the observed data value for each risk factor; $\bar{x}_i$ is the corresponding mean for each risk factor; and $p$ is the total number of risk factors included in the model. Table A1 shows the formula for each risk prediction model explicitly.