Cohort profile
The Irish dual-energy X-ray absorptiometry (DXA) Health Informatics Prediction (HIP) for Osteoporosis Project
  1. Erjiang E1,
  2. Tingyan Wang1,2,
  3. Lan Yang1,3,
  4. Mary Dempsey3,
  5. Attracta Brennan4,
  6. Ming Yu1,
  7. Wing P. Chan5,
  8. Bryan Whelan6,7,
  9. Carmel Silke6,7,
  10. Miriam O'Sullivan6,7,
  11. Bridie Rooney8,
  12. Aoife McPartland7,
  13. Gráinne O’Malley6,8,
  14. John J. Carey6,9
  1. Department of Industrial Engineering, Tsinghua University, Beijing, China
  2. Nuffield Department of Medicine, University of Oxford, Oxford, UK
  3. School of Engineering, National University of Ireland Galway, Galway, Ireland
  4. School of Computer Science, National University of Ireland Galway, Galway, Ireland
  5. Department of Radiology, Wan Fang Hospital, Taipei Medical University, Taipei, Taiwan
  6. School of Medicine, National University of Ireland Galway, Galway, Ireland
  7. Department of Rheumatology, Our Lady’s University Hospital, Manorhamilton, Ireland
  8. Department of Geriatric Medicine, Sligo University Hospital, Sligo, Ireland
  9. Department of Rheumatology, Galway University Hospitals, Galway, Ireland

Correspondence to Dr John J. Carey; john.j.carey{at}


Purpose The purpose of the Irish dual-energy X-ray absorptiometry (DXA) Health Informatics Prediction (HIP) for Osteoporosis Project is to create a large retrospective cohort of adults in Ireland to examine the validity of DXA diagnostic classification, risk assessment tools and management strategies for osteoporosis and osteoporotic fractures for our population.

Participants The cohort includes 36 590 men and women aged 4–104 years who had a DXA scan between January 2000 and November 2018 at one of 3 centres in the West of Ireland.

Findings to date 36 590 patients had at least 1 DXA scan, 6868 (18.77%) had 2 scans and 3823 (10.45%) had 3 or more scans. There are 364 unique medical disorders, 186 unique medications and 46 DXA variables identified and available for analysis. The cohort includes 10 349 (28.3%) individuals who underwent a screening DXA scan without a clear fracture risk factor (other than age), and 9947 (27.2%) with prevalent fractures at 1 of 44 skeletal sites.

Future plans The Irish DXA HIP Project plans to assess current diagnostic classification and risk prediction algorithms for osteoporosis and fractures, identify the risk predictors for osteoporosis and develop novel, accurate and personalised risk prediction tools, by using the large multicentre longitudinal follow-up cohort. Furthermore, the dataset may be used to assess, and possibly support, multimorbidity management due to the large number of variables collected in this project.

  • rheumatology
  • health informatics
  • risk management
  • bone diseases

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Strengths and limitations of this study

  • We have established a large anonymised database of >36 000 individuals with recorded demographics, DXA metrics, clinical and therapeutic variables, around one-fourth of whom have a prevalent fracture and one-fourth of whom have no identifiable major risk factor for osteoporosis.

  • These data can be used to provide substantive exploration of the validity of current tools to identify people with low bone mineral density or at risk of fracture.

  • Important limitations include the lack of data on under-represented populations, on laboratory and other variables, on validated prospective clinical variables and medications, and from other regions of the country; we plan to address these in future studies.


Today, non-communicable diseases (NCDs) represent the most common disorders worldwide, including musculoskeletal conditions like arthritis and osteoporosis.1 Osteoporosis is a global health crisis currently affecting millions of men, women and children, with many more at risk.2 Fragility fractures represent the clinical manifestation of osteoporosis. Predictions suggest that the annual incidence of fragility fractures will rise by 23.3% from 2017 to 2030 in six European countries, whose combined populations represent >60% of Europe’s population.2 Fractures result in substantial morbidity, increased mortality and associated social and healthcare costs.3 More than 1 million quality-adjusted life-years are lost due to osteoporosis-related fractures in Europe each year.2 In 2010, approximately 43 000 European deaths were fracture related, while expenditure related to osteoporosis exceeded €37 billion.4 5 Although inpatient hospital costs for osteoporotic fractures are similar to, or exceed, those of other NCDs such as cancer and cardiovascular disease,6 7 they receive considerably less attention. This may be one reason why the osteoporosis care provided to patients presenting with fracture is often narrowly focused and inadequate,8 even though studies show that preventative treatment can reduce fracture risk by 33%–50%.9 The accurate and timely identification of high-risk or high-cost patients is critical to facilitate effective care and reduce the burden of this disease.

The majority of osteoporotic fractures are related to genetics, patients’ physical attributes, lifestyle, other illnesses and medications.10–13 Many fractures are preventable by identifying those at risk before they fracture, and then using proven, safe, cheap and effective interventions.2 Bone mineral density (BMD) is the single best predictor of osteoporotic fracture in a postmenopausal woman without fracture.12–19 BMD is typically measured non-invasively by dual-energy X-ray absorptiometry (DXA) machines. Many patients undergo DXA scanning to measure their BMD in order to estimate risk, classify them as osteoporotic (or not) and monitor the effects of interventions.20 21 The combination of BMD with other factors (eg, smoking, blood pressure, blood sugar) enhances this assessment, and an array of risk tools have been developed, all with corresponding strengths and limitations.22 However, even if a ‘best tool’ is available, a ‘one-size-fits-all’ approach with the ‘best’ algorithm has limitations in clinical practice, and a more personalised approach is preferable.23

WHO and its member states have committed to a Sustainable Development Goal, which aims to address NCDs and substantially reduce their burden by 2030.24 Electronic health information (EHI) has great potential to improve the quality, efficiency and effectiveness of healthcare diagnosis, monitoring and provision while substantially reducing costs.25 26 Large amounts of data are generated annually in healthcare for the provision of patient care, and for regulatory compliance, yet much of these data are underused, lying idle in large digital repositories.25 Big data is defined as ‘high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making and process automation’.27 Advances in big data analytics and novel applications like artificial intelligence (AI) can advance this goal by improving disease detection and treatment at a population level, while also offering a personalised approach for individuals.28 Presently, in osteoporosis care worldwide, the primary sources of EHI are DXA machines, hospital health systems, electronic medical records (EMRs) and hospital administrative systems.

Although Irish people are recognised as one of the highest risk groups for fracture worldwide,2 limited robust data exist on the burden of osteoporotic illness, the validity of fracture risk and appropriate classification and management criteria for the Irish population. As a step towards addressing this limitation, the authors undertook a systematic review of studies for Ireland examining ‘DXA’, ‘fracture risk assessment’ and ‘fracture risk prediction’. Two hundred and three papers were identified, 39 of which evaluated risk prediction or DXA diagnostic criteria in their results. Most studies were small: four had >1000 subjects and only one had >10 000 subjects. None of the studies systematically evaluated the various DXA diagnostic criteria, seven papers compared other evaluation methods with DXA (ie, ultrasound, nail spectroscopy, CT, fracture risk tools), and seven papers evaluated various fracture risk factors or risk tools. One paper ‘validated’ the Fracture Risk Assessment Tool (FRAX) using a limited dataset from public hospital records on hip fractures, but it did not include data on the accuracy of hip fracture diagnosis, other facilities, other fractures, DXA or other clinical information.29 One study (n=90) did evaluate the ability of FRAX to correctly classify Irish people for treatment.30

Cohort description

Patient and public involvement

This study was completed without participant involvement. A waiver of consent was granted by the Ethics Committee.

Source data

DXA scan data were collected at three hospital sites from four DXA machines. Each of these sites was located in a publicly funded academic teaching hospital: one in the Department of Geriatric Medicine at Sligo University Hospital, one in the Department of Rheumatology, Manorhamilton University Hospital and two in the Department of Rheumatology, Merlin Park, Galway University Hospital. One machine in Merlin Park hosted data imported in 2005 from older scanners, which had fewer software features. The source data from all four DXA machines (GE Lunar Prodigy Advance, Madison, Wisconsin, USA; software V.17.0) were extracted, validated, cleaned and merged into a single large anonymised dataset.

Initial project plans included merging DXA data with EMRs used by the rheumatology and fracture liaison services, the Hospital Inpatient Enquiry System and the Electronic Discharge Summary systems, all of which were available electronically through the respective hospitals’ information systems. However, the Irish interpretation of the new General Data Protection Regulation (GDPR) legislation precluded such a measure without explicit patient consent. As obtaining this consent would need significant extra staffing and funding to be feasible, only DXA scanner data are presented in this paper.

All patients were referred for a DXA scan by a doctor (general practitioners, orthopaedic fracture clinics, osteoporosis services and other medical specialties), and the majority had an appropriate indication for a DXA scan.31 All scans meet current national legislative requirements and follow best international practice as set forth by The International Society for Clinical Densitometry.32 The DXA clinical team members are experienced, trained and International Society for Clinical Densitometry-certified in performing and interpreting bone densitometry; each clinic has similar standard operating procedures. Furthermore, each clinic is also the primary hub for osteoporosis and fracture liaison services in its respective hospital. Patients who have been referred with a ‘history of fracture’ have this verified by the DXA technologist at the time of scanning, by reviewing the respective hospitals’ electronic radiology systems. These systems are inaccessible to external facilities. Referrals for patients without an appropriate indication for a scan are returned to the referring physician, either with a request for additional information or with the referral declined. The use of a protocol to consider DXA screening for younger ‘healthier’ patients (based on older criteria) would act as an early detection and treatment intervention to reduce fracture incidence in later years in the Irish population.


All subjects who had at least one DXA scan on one of the four DXA scanners between 22 January 2000 and 1 November 2018 were included in the dataset. Scans of DXA phantoms and scans where demographics were not entered on the DXA machine were excluded. DXA scans with extreme values which appeared erroneous or could not be validated were also excluded. These included scans where the height was <1 m or >2 m, the weight was <20 kg or more than the machine limit (110–158 kg), the body mass index was >70 kg/m² and the BMD was <0.0 or >5.0 g/cm². The data cleaning procedure was carried out prior to final GDPR legislation implementation, when it was still possible to identify and trace a scan using coded data. All data have since been completely anonymised such that they cannot be traced back to, or used to identify, any individual or linked with any other data. This is in compliance with current GDPR legislation and the project’s ethical approval requirements.
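The plausibility limits listed above can be expressed as a simple filter. The following Python sketch is illustrative only and is not the cleaning code used in the project: the `Scan` record, its field names and the `is_plausible` function are hypothetical, each limit is assumed to be independently sufficient for exclusion, and the weight limit is passed in per machine because the text gives a range (110–158 kg) rather than a single value.

```python
from dataclasses import dataclass


@dataclass
class Scan:
    """Hypothetical minimal scan record; field names are assumptions."""
    height_m: float
    weight_kg: float
    bmd_g_cm2: float


def body_mass_index(scan: Scan) -> float:
    """Body mass index in kg/m^2."""
    return scan.weight_kg / scan.height_m ** 2


def is_plausible(scan: Scan, machine_weight_limit_kg: float) -> bool:
    """Return False when any value falls outside the limits quoted in the text."""
    # Height outside 1-2 m is treated as erroneous.
    if scan.height_m < 1.0 or scan.height_m > 2.0:
        return False
    # Weight below 20 kg or above the limit of the machine used.
    if scan.weight_kg < 20.0 or scan.weight_kg > machine_weight_limit_kg:
        return False
    # Body mass index above 70 kg/m^2.
    if body_mass_index(scan) > 70.0:
        return False
    # BMD below 0.0 or above 5.0 g/cm^2.
    if scan.bmd_g_cm2 < 0.0 or scan.bmd_g_cm2 > 5.0:
        return False
    return True
```

In practice such a filter would be applied to every extracted scan before anonymisation, with scans that fail it either corrected against source records or dropped.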

Control group

The control group was selected from the cohort. The control group is defined as those referred for a screening DXA scan and without having any other risk factor listed (other than age), including fracture. This control group, although not a random selection of healthy community volunteers, will serve in further comparative studies as the ‘healthy control group’. The primary objectives are correct risk classification and diagnosis, and fracture risk assessment. Subsequent data analyses will include a comparison of those with fractures with those without, together with a comparison of different fracture types in the five major categories, as listed by the International Osteoporosis Foundation (ie, hip, spine/vertebral, humerus, forearm and other).
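The control group definition above amounts to a rule on three pieces of information: the referral reason, the listed risk factors and any recorded fractures. A minimal Python sketch of that rule follows. The function name, the argument names and the `"screening"` and `"age"` labels are all hypothetical, since the text does not describe how these fields are coded on the DXA machines.

```python
from typing import Sequence


def is_control(
    referral_reason: str,
    risk_factors: Sequence[str],
    fracture_sites: Sequence[str],
) -> bool:
    """Flag a subject as a control under the definition given in the text.

    A control was referred for a screening scan, has no recorded fracture
    and has no listed risk factor other than age. All labels are assumed.
    """
    # Age alone does not disqualify a subject from the control group.
    non_age_factors = [f for f in risk_factors if f.strip().lower() != "age"]
    return (
        referral_reason.strip().lower() == "screening"
        and not non_age_factors
        and not fracture_sites
    )
```

For example, a subject referred for screening whose only listed factor is age would be flagged as a control, while the same subject with a recorded forearm fracture would not.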


When a patient is referred for a DXA scan, important information such as clinical diagnoses and medications is collected and entered into the DXA machine. All clinical information for this project, including fracture sites and medications, has been independently reviewed and verified by a senior clinical investigator (JC) and listed as either a ‘risk factor’, ‘protective factor’, ‘no effect’ or ‘osteoporosis or bone treatment’. Fractures are also grouped into one of five categories (ie, hip, spine/vertebral, humerus, forearm and other). Patient demographics include age, gender, race, height and weight. DXA variables include bone mineral content, bone area, BMD, T-score (National Health and Nutrition Examination Survey (NHANES) III/General Electric (GE)), Z-score (NHANES III/GE, no weight adjustment), fat mass, lean mass, vertebral height, hip geometry, FRAX 10-year risk of major osteoporotic fracture and hip fracture.

Findings to date

Data for a total of 37 053 subjects (19 years of data) were extracted from four DXA machines located at three hospital sites in Ireland. Following cleaning and validation of the data prior to data analysis, 36 590 unique subjects remained, each of whom had at least one DXA scan, ranging in age from 4 to 104 years. A total of 25 899 (70.78%) subjects had one scan only, 6868 (18.77%) subjects had two scans and 3823 (10.45%) subjects had three or more scans. The majority of the scans were performed in Merlin Park University Hospital (53%), 31% were performed in Manorhamilton University Hospital and 16% in Sligo University Hospital. A summary of data completeness and availability is presented in table 1.
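The counts reported above are internally consistent, which can be checked with a few lines of arithmetic. The Python sketch below is offered only as an illustration; the variable names are ours, and the figures are those quoted in the text.

```python
# Subject counts by number of scans, as quoted in the text.
scan_counts = {
    "one scan": 25_899,
    "two scans": 6_868,
    "three or more scans": 3_823,
}

extracted_subjects = 37_053  # before cleaning and validation

# Subjects remaining after cleaning should equal the sum of the three groups.
remaining_subjects = sum(scan_counts.values())

# Subjects removed during cleaning and validation.
excluded_subjects = extracted_subjects - remaining_subjects

# Share of the cleaned cohort in each group, as a percentage to 2 decimals.
percentages = {
    group: round(100 * count / remaining_subjects, 2)
    for group, count in scan_counts.items()
}

for group, pct in percentages.items():
    print(f"{group}: {scan_counts[group]} subjects ({pct}%)")
print(f"remaining: {remaining_subjects}, excluded: {excluded_subjects}")
```

The three groups sum to the 36 590 subjects in the cleaned cohort, so 463 of the extracted subjects were removed during cleaning and validation.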

Table 1

Summary of data completeness and availability of demographic, clinical, biometric and DXA scan variables

A total of 10 349 subjects who had been referred for a DXA scan and had no obvious fracture risk factor (other than age in some cases) were selected to serve as the control group. A total of 9947 patients had at least 1 prior fracture, at 1 of 44 skeletal sites. Available for data analysis are 364 unique medical disorders, 186 unique medications and 46 DXA variables. There is considerable heterogeneity regarding comorbidities and medications. This likely reflects local practice and the referral populations being served.

Summary details for the female and male subjects are presented in tables 2 and 3, respectively. An unpaired Student’s t-test was performed for comparison between groups for continuous variables, while a χ² test was performed for categorical variables. A p value <0.05 was deemed statistically significant. In terms of race and gender, almost 100% of the subjects are Caucasian (as expected), with 86% being female. The mean age of the female subjects is 60 years; 70% of the female subjects are postmenopausal. Although 27% of the female subjects had a prior fracture, only 23% are taking calcium and/or vitamin D and <11% are on osteoporosis therapy. In general, male subjects are older (62.4 years vs 60.5 years), taller (172.3 cm vs 160.4 cm), heavier (80.8 kg vs 68.4 kg) and more likely to be corticosteroid users (24.5% vs 9.5%, p<0.001), although a similar proportion are smokers (9.9% vs 9.3%, p=0.196). While the male subjects have a higher mean femoral neck BMD (0.901 g/cm² vs 0.842 g/cm²) and a lower 10-year fracture risk (7.34% vs 11.03%), all p<0.001, they have a similar prevalence of fracture (27.3% vs 27.2%) and are less likely to be taking osteoporosis medication (7.3% vs 10.5%, p<0.01).
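For readers unfamiliar with the χ² test used above for categorical variables, the following Python sketch shows the calculation for a 2×2 table (for example, smoker or non-smoker by sex). It is an illustration under stated assumptions rather than the analysis code used here: the analyses in this paper were run in Minitab and R, the function name is ours, no continuity correction is applied (the text does not say whether one was used), and the counts in the usage note are hypothetical and are not taken from the cohort.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test for a 2x2 table [[a, b], [c, d]].

    Returns the test statistic and the p value for 1 degree of freedom,
    without continuity correction.
    """
    n = a + b + c + d
    row_col_product = (a + b) * (c + d) * (a + c) * (b + d)
    if row_col_product == 0:
        raise ValueError("every row and column must have a non-zero total")
    statistic = n * (a * d - b * c) ** 2 / row_col_product
    # With 1 degree of freedom the statistic is a squared standard normal,
    # so the upper-tail probability is erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value
```

As a usage example with hypothetical counts, `chi_square_2x2(20, 80, 40, 60)` compares a group in which 20 of 100 have a characteristic with a group in which 40 of 100 have it; the difference would be deemed statistically significant at the p<0.05 threshold used in this paper.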

Table 2

Baseline summary of variables of female subjects from three centres

Table 3

Baseline variables of male subjects from three centres

All analyses were performed using Minitab V.16.0 (Coventry, UK), R V.3.5.1 and MS SQL Server Management Studio 18.

Future plans

Osteoporosis data for the Irish population are restricted in terms of national applicability and generalisability, with the majority of studies being small and heterogeneous. While a number of larger studies have been carried out, they have been published using limited information from national public hospital administrative datasets.29 33 34 All studies which have been carried out to date include very limited information on individuals, do not provide information on patients who were admitted to other facilities, were managed as outpatients or never presented to a hospital, and are not linked to DXA data, medication use or future events. Furthermore, some studies have questioned the accuracy of the FRAX tool for correctly identifying Irish people for fracture risk.30 35 Multiple risk models have been proposed using data derived from a variety of sources for predicting low BMD, fractures and comorbidities.11 36–49 The Health Informatics Prediction (HIP) Project ‘big data’ approach incorporates DXA data from >36 000 patients, almost 10 000 of whom have a prevalent fracture, and a relatively large convenience control cohort, spanning an almost 20-year period. The approach also includes the algorithmic risk score, as well as the variables used to calculate such scores. In addition, the project has access to an array of variables, including medications, comorbidities and known risk factors, which were not previously evaluated for Irish people. These data offer, for the first time, an opportunity to study fragility fracture relationships on a larger scale in a more representative Irish population. These data also offer an opportunity to develop a model which could be replicated globally. In addition, because of the large number of variables, other medications and diseases, this cohort could be used in the future to study other illnesses of interest including cardiovascular disease, diabetes mellitus and cancer.

In future papers, we plan to compare and contrast demographics, DXA biometrics and other clinical risk factors between those with low BMD (or who would be classified as ‘osteoporosis’ by various DXA criteria), those with prevalent fractures, those without prevalent risk factors, those who subsequently developed fractures and those who did not. We plan to assess the importance of available risk factors for our population, and how much they contribute. The performance of the various fracture risk tools, model assessment strategies and more novel AI methodologies for predicting those with low BMD and those with fractures using traditional variables will also be evaluated. We will also be able to compare the results with those of other populations, and to assess the importance of variables and models for specific fracture sites. In addition, we will compare the various models both as an aggregate ‘one-size-fits-all’ approach for populations and as a more personalised approach for individuals with specific traits or characteristics.

Multidimensional analyses will be performed on subjects using online analytical processing to examine the demographic characteristics and dynamic changes of a specific disease and the marginal effects of risk factors from multiple perspectives, including age, gender, region and the effect of changes in BMD, height, weight and other biometrics over time. Data will be summarised using appropriate descriptive statistics, then compared using classification criteria, after which they will be evaluated by multiple fracture prediction algorithms using statistical, cluster and correlation analyses, coupled with classification and regression models. Finally, econometric and machine learning methods will be applied to the data to predict the risk of osteoporotic fractures and/or changes in BMD in patients. For example, we have one paper in press on detecting low BMD using machine learning techniques.50 It is anticipated that the HIP framework (arising from the Irish DXA HIP for Osteoporosis Project) could also be applied to other geographic regions, and to the prediction of other NCDs, such as cardiovascular disease, diabetes and cancer. These hypotheses will be explored at a later date.

Strengths and weaknesses

This study has several strengths including a large data sample size from multiple centres where high-quality performance is a tradition and is embedded in the practice. Furthermore, there is a long follow-up period for many patients and a significant number of demographic, biometric and clinical variables. As a substantial portion of the cohort appear to be at ‘low risk’ for fracture at baseline, they can serve as a large convenience control group. Meanwhile, as >25% of the cohort also has a fracture, this will enable more robust analyses of fracture risk algorithms. Given that the prevalence of osteoporosis and low BMD depends on the criteria and calculation methods used,51 this will require further exploration and explanation in an Irish context. Some patients in the dataset have had multiple scans and provide an opportunity to explore the marginal effect of risk factors such as changes in BMD with or without intervention and how such changes modify the outcome. Osteoporosis and the care of patients with fragility fractures is a global problem in which current practice and the comprehensive utilisation of healthcare big data remain unsatisfactory,15 52 53 as the majority of patients are neither diagnosed, investigated nor treated for their skeletal fragility following fracture.52 54 55 Irish studies suggest that some patients with osteoporosis are similarly not treated, while others may be treated even in the absence of a DXA scan, fracture or high fracture risk.56–58 At first glance, it appears that many patients with fractures are not on treatment. There is an opportunity in this project to explore this further.

All research has its limitations. The limitations of this study include the fact that >99% of the cohort are Caucasian and that people from other parts of the country, those who have not had a DXA scan and patients who were scanned on other technologies are not included. The current dataset has only 117 patients reporting other ethnicities, comprising 68 Asian, 18 black, 15 Hispanic and 16 ‘other’; however, if our results are replicated in other populations, it would suggest that they are robust and of greater interest and relevance. Furthermore, although we have information on prescribed medications, we do not have a complete list, doses or duration of therapy. We also acknowledge that omissions can arise at the data processing stage. Although the data involved in our study have been carefully examined and verified, omissions can still occur due to incomplete access to patients’ full records, such as scan images and records in other departments. It is possible that some healthy patients were misdiagnosed with fractures. It is also probable that some patients were recorded as having a fracture when the fracture was only suspected. In addition, the data rely in many instances on patients’ self-reporting of other risk factors and medication use, and/or the reports of their referring clinician, none of which can be accurately verified. Since the introduction of GDPR legislation, we are no longer able to link DXA data to other datasets or back to the original sources, as all data have been anonymised. Ideally, we would obtain consent from patients and track this information more carefully going forward, but doing so prospectively will require research funding or the allocation of staff resources. However, this paper and other planned publications outlining the scope, validity and robustness of these measures and tools will generate results and potential further hypotheses, which in turn will support future collaborations for prospective cohort studies incorporating DXA technology for predicting fractures, cardiovascular disease and other morbidities.



  • Collaborators All data for this project have been completely anonymised and are no longer traceable to source or identifiers, coded or otherwise. All data are stored and handled confidentially. Currently, only a small group of the team has access to these data (EE, LY, JC), while the clinical team have access to the clinical systems but cannot link these to the study data. This Irish DXA HIP (Health Informatics Prediction) for Osteoporosis Project represents a collaboration between the disciplines of clinical medicine and research, computer engineering, industrial engineering and big data science.

  • Contributors All authors were involved in the development of the programme and the plans for analyses, have reviewed the results and participated in writing and reviewing the manuscript. The clinical members (JC, WPC, BW, CS, MO, BR, AM and GO'M) were involved in data collection, the engineering and computer science members (EE, TW, LY, MD, AB and MY) in data cleaning, merging and analysis, and five investigators (EE, LY, AB, MD and JC) were involved at every stage to ensure the validity of data collection, processing and analyses.

  • Funding All investigators dedicated their professional expertise and input to the project free of charge.

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Ethics approval Prior to data collection, formal approval was sought and obtained for each clinical site from the local Hospitals’ Research Ethics Committee to process information from a retrospective cohort with the aim of exploring DXA validity for Ireland. There was considerable consideration around the use of, and access to, clinical systems, data validity, processing, storage and analytics, particularly in light of new legislation concerning data protection in the European Union, namely the General Data Protection Regulation (available at:

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement All data relevant to the study are included in the article or uploaded as supplementary information. Open access for published data. Original data set on file with authors.