Abstract
Objectives The Coronary Revascularisation Outcome Questionnaire (CROQ) is a patient-reported outcome measure (PROM) for coronary artery bypass graft surgery (CABG) and percutaneous coronary intervention (PCI). We tested the psychometric properties of a modified version (CROQv2) when administered in a National Health Service (NHS)/Department of Health (DH) funded pilot of PROMs for coronary revascularisation.
Design Psychometric validation study.
Setting 11 English hospitals in the UK taking part in the NHS/DH funded pilot of PROMs for coronary revascularisation.
Participants Comprehensive analyses of acceptability, reliability, validity and responsiveness were conducted independently for each of the prerevascularisation (n=2685 and n=3711) and postrevascularisation (n=869 and n=837) versions of the CROQ-CABG and CROQ-PCI, respectively.
Results All versions met prespecified stringent criteria for (1) acceptability of items (missing data) and scales (missing data, floor and ceiling effects, skewness); (2) tests of scaling assumptions; (3) reliability: internal consistency (Cronbach's α, item-total correlations); (4) construct validity based on within-scale analyses (internal consistency, intercorrelations between scales, factor analysis and hypothesis testing); (5) construct validity based on comparisons with external measures (convergent and discriminant validity and hypothesis testing) and (6) responsiveness. Results were also confirmed when tests were repeated on subsamples of CABG (n=639) and PCI (n=615) patients who reported receiving help completing prerevascularisation questionnaires.
Conclusions The availability of a psychometrically robust procedure-specific tool that could be used as part of a large-scale coronary revascularisation PROMs programme to capture the patients' perspective of coronary revascularisation will enable outcomes important to patients to be routinely collected alongside clinical outcomes. The CROQ is suitable for administration by postal survey; the prerevascularisation versions can also be administered in the clinical setting, as in the Coronary Revascularisation PROMs Pilot.
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Strengths and limitations of this study
The Coronary Revascularisation Outcome Questionnaire includes a much broader range of outcomes important to patients than other cardiac-specific questionnaires.
The availability of a tool suitable for use in large-scale patient-reported outcome measures (PROMs) programmes, alongside the collection of clinical data, could enable the routine collection of outcomes that matter to coronary revascularisation patients, rather than focus on narrow aspects of disease or functioning.
Psychometric validation is an iterative process; it is from repeated use in large samples that we can gain confidence that PROMs are measuring what they intend to measure in a reliable way and are able to detect important change.
Large-scale psychometric validation of a procedure-specific PROM for coronary revascularisation in 11 hospitals in the UK.
Unable to measure test–retest reliability.
Introduction
Patient-reported outcome measures (PROMs) measure health status and health-related quality of life (HRQoL) from the patients' perspective. There is growing interest in capturing PROMs data for patients with coronary heart disease and other cardiac conditions and there are numerous disease-specific tools available.1 However, most have been developed to evaluate HRQoL in medically rather than surgically treated patients, and many have not been rigorously validated.2 The Coronary Revascularisation Outcome Questionnaire (CROQ)3 ,4 is a PROM to evaluate health status and HRQoL in patients undergoing coronary artery bypass graft surgery (CABG) and percutaneous coronary intervention (PCI). It was developed and validated in 2000, as a self-administered survey tool to evaluate outcomes in research and clinical audit, at prerevascularisation and 3-months postrevascularisation. It is currently the only disease-specific tool developed specifically to measure health outcomes before and after coronary revascularisation with some demonstrated evidence of reliability, validity and responsiveness, but has not been used widely.
PROMs are most commonly used to measure the impact of healthcare in research and audit but in recent years have been used to compare the performance of healthcare providers.5 Since 2009, the National Health Service (NHS) in England has used PROMs to assess outcomes in four elective surgical procedures (hip and knee replacement, varicose vein surgery and groin hernia surgery) on a routine basis for the purpose of service evaluation.5–7 NHS England and the Department of Health (DH) also use the data to monitor progress towards strategic objectives, such as those specified in the NHS Outcomes Framework.8 The NHS Coronary Revascularisation PROMs Pilot was launched in November 2011 in order to evaluate the feasibility of extending the NHS PROMs programme to patients before and after coronary revascularisation.9 The pilot was established to examine the feasibility of collecting PROMs data from patients selected for elective first-time coronary revascularisation across 11 hospitals in England. Consistent with the PROMs being collected routinely in other procedures5–7 the DH chose to use the generic EQ-5D-3L10 alongside a procedure-specific instrument. The CROQ3 ,4 was chosen as the procedure-specific instrument by the DH despite it not having been used widely, as it had demonstrated preliminary evidence of psychometric robustness, included a broader range of outcomes than other cardiovascular-specific measures, and was specific to coronary revascularisation.2
It is essential that PROMs satisfy certain development, psychometric and scaling standards if they are to provide reliable and valid information for decision making. Classical Test Theory (CTT),11–15 the traditional psychometric approach, is the dominant paradigm in the development of PROMs.16 Within CTT, well-established methods and criteria are applied to indicate that the concept that is represented by the PROM is clear and well understood; that the content is relevant to the patient group concerned; that the psychometric properties (acceptability, reliability, validity, responsiveness) are adequate; and that the scaling structure is justified.17–19 The psychometric properties of a PROM are sample and context dependent within the CTT paradigm.18 ,19 Psychometric validation is an iterative process;17 it is from repeated use in large samples that we can gain confidence that PROMs are measuring what they intend to measure in a reliable way and are able to detect important change.18 ,19
Psychometric properties are influenced by how people respond, and this can be influenced by many factors including use in different patient groups, a change in the mode of administration or setting, a change in the assessment point, and minor changes to phrasing, question order or response format.18–23 Completion of PROMs by persons other than the patient can introduce bias; for example, family and professionals tend to underestimate patients' quality of life status across diverse cultures and health conditions.24 Any change made to the original version of an instrument requires revalidation when the instrument is used in a new context.18 For use in the NHS Coronary Revascularisation PROMs Pilot, changes were made to the original version of the CROQ (necessitating a new version, CROQv2), including some changes to the text, a change to the postrevascularisation assessment point (from 3 to 6 months postrevascularisation), method of administration, setting of administration and sampling frame (see table 1 and online supplementary appendices 1–4). The NHS Coronary Revascularisation PROMs Pilot also included a much larger group of patients, potentially more diverse in demographic profile and case mix, than the sample used to validate CROQv1. These changes necessitated a re-evaluation of the psychometric properties of the CROQ in the new context. We describe the psychometric validation of CROQv2 in the context of the NHS Coronary Revascularisation PROMs Pilot.
Methods
Samples
The samples described in this paper are those used for the psychometric validation only and are a subset of all the patient data gathered in the main NHS Coronary Revascularisation PROMs Pilot.
Prerevascularisation (Q1) samples
All NHS PROMs Pilot patients waiting for coronary revascularisation who completed a prerevascularisation (Q1) questionnaire were included except those categorised as ineligible (n=63) or duplicates (n=31). A total of 6396 (2685 CABG and 3711 PCI) patients were included.
Postrevascularisation (Q2) samples
Patients were only sent a postrevascularisation (Q2) questionnaire if they had already completed a Q1 questionnaire. For consistency with the rest of the NHS PROMs programme,5 ,6 the DH intended that the postrevascularisation Q2 questionnaire would be sent to patients at 6 months post procedure in the NHS Coronary Revascularisation PROMs Pilot. However, in practice the Q2 was not administered at a fixed interval of 6 months after the revascularisation date, and there was wide variation in the interval between the revascularisation procedure and Q2 completion. For the Q2 psychometric analysis, only patients who completed a Q2 questionnaire within 5–7 months of a linked Hospital Episode Statistics (HES) revascularisation episode were included. The Q2 psychometric samples therefore included 869 CABG and 837 PCI patients.
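The 5–7 month inclusion rule for the Q2 psychometric samples can be sketched in code. This is an illustration only: the paper does not state how months were computed, so the average month length, the function names and the example records below are all assumptions.

```python
from datetime import date

# Assumption: a "month" is the average calendar month length.
DAYS_PER_MONTH = 365.25 / 12


def months_between(procedure_date: date, completion_date: date) -> float:
    """Elapsed time between the procedure and Q2 completion, in average months."""
    return (completion_date - procedure_date).days / DAYS_PER_MONTH


def in_q2_window(procedure_date: date, completion_date: date,
                 lower: float = 5.0, upper: float = 7.0) -> bool:
    """True if Q2 was completed within the 5-7 month window after the procedure."""
    return lower <= months_between(procedure_date, completion_date) <= upper


# Hypothetical records: (patient id, HES procedure date, Q2 completion date)
records = [
    ("A", date(2012, 1, 10), date(2012, 7, 12)),   # about 6 months: kept
    ("B", date(2012, 1, 10), date(2012, 4, 2)),    # under 3 months: excluded
    ("C", date(2012, 1, 10), date(2012, 11, 20)),  # over 10 months: excluded
]

included = [pid for pid, proc, done in records if in_q2_window(proc, done)]
print(included)
```

Making the window bounds parameters keeps the rule easy to tighten if a stricter interval is wanted in future applications.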
Responsiveness samples
Slightly more stringent inclusion criteria were applied to the responsiveness psychometric samples to evaluate whether the CROQ was sensitive to change (between Q1 and Q2) in the context of a specific single elective procedure. Patients who had an emergency or a repeat elective coronary revascularisation procedure were excluded. The responsiveness samples included 865 CABG and 811 PCI patients.
Psychometric evaluation
There is a prerevascularisation and a postrevascularisation version for each of CABG and PCI. Each of the four versions is scored to produce four core scales: symptoms, physical functioning, psychosocial functioning and cognitive functioning. In addition, the postrevascularisation versions include additional items that are scored to produce the satisfaction and adverse effects scales. All scales are scored on a 0–100 scale with higher scores reflecting better functioning.3
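As an illustration of scoring on a 0–100 scale, the sketch below applies a simple linear transformation to the mean item response. It is not the published CROQ scoring algorithm (which is described in reference 3); the item range, the assumption that items are already coded so that higher means better, and the rule of returning no score when any item is missing are all hypothetical.

```python
def scale_score(item_responses, item_min, item_max):
    """Linearly rescale the mean item response to 0-100 (higher = better).

    Assumptions for this sketch: every item shares the same response range
    and is coded so that higher values mean better functioning. Returns None
    if any item is missing (a hypothetical missing-data rule).
    """
    if any(r is None for r in item_responses):
        return None
    mean_raw = sum(item_responses) / len(item_responses)
    return 100.0 * (mean_raw - item_min) / (item_max - item_min)


# Hypothetical 5-point items coded 1 (worst) to 5 (best)
print(scale_score([1, 1, 1], 1, 5))     # lowest possible score
print(scale_score([3, 3, 3], 1, 5))     # midpoint
print(scale_score([5, 5, 5], 1, 5))     # highest possible score
print(scale_score([5, None, 4], 1, 5))  # incomplete: no score
```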
We evaluated the acceptability, reliability, validity and responsiveness of CROQv2 prerevascularisation and postrevascularisation versions independently in CABG and PCI samples using the same widely adopted criteria, based on CTT, used to validate CROQv1.3 Table 2 provides an overview of the tests and criteria applied. Analyses included an evaluation of (1) the acceptability of items (missing data) and scales (missing data, floor and ceiling effects, skewness); (2) tests of scaling assumptions; (3) reliability: internal consistency (Cronbach's α, item-total correlations); (4) construct validity based on within-scale analyses (internal consistency, intercorrelations between scales, factor analysis and hypothesis testing); (5) construct validity based on comparisons with external measures (convergent and discriminant validity and hypothesis testing) and (6) responsiveness.
In addition, the same prerevascularisation psychometric tests were applied to the subgroups of patients (n=639 CABG and n=615 PCI) who reported that they received help completing the Q1 questionnaire in the clinical setting, to see whether the psychometric properties of the CROQ were compromised. The provision of help with questionnaire completion might have enabled a wider group of patients to be included in this sample than in the self-completed only sample. The type of help received and from whom was not recorded.
Results
Table 3 shows the respondent characteristics for the main psychometric samples. There was a higher proportion of male patients (86.0% vs 83.6%) and white patients (79.6% vs 59.9%) in the CABG Q2 sample than in the CABG Q1 sample. There was a higher proportion of male patients (77.9% vs 74.9%) and white patients (85.9% vs 61.1%) in the PCI Q2 sample than in the PCI Q1 sample. PCI patients completing Q2 were also older on average than those completing Q1 (mean 66.0 (SD 9.7) vs 64.7 (SD 10.7) years). A higher proportion of patients in the CABG (23.8%) and PCI (16.6%) Q1 samples reported that they had help completing their questionnaires than patients in the CABG (11.4%) and PCI (8.6%) Q2 samples. This probably reflects the fact that hospital staff were on hand to help patients complete questionnaires in the clinical setting at prerevascularisation.
Acceptability
There was a low level of missing data across all items in each version (prerevascularisation and postrevascularisation); all scales at prerevascularisation and postrevascularisation had <3% missing data (table 4). Analysis of missing item data in the more elderly subsamples did not suggest that the questionnaire was overly burdensome for elderly patients. Scale scores were calculated with a possible score range of 0–100, as described for CROQv1.3 There were no floor effects (a high proportion scoring at the bottom of the scales) in the prerevascularisation or postrevascularisation samples. There were ceiling effects (a high proportion scoring at the top of the scales) in the postrevascularisation samples, as expected following effective interventions, and small ceiling effects for cognitive functioning at prerevascularisation.
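The scale-level acceptability checks described above (missing data, floor and ceiling effects) amount to simple proportions. The sketch below shows one way to compute them; the function name and example scores are hypothetical, and the thresholds used to judge the results (listed in table 2) are deliberately not encoded here.

```python
def acceptability_summary(scores, minimum=0.0, maximum=100.0):
    """Return (% at floor, % at ceiling, % missing) for one scale.

    Floor and ceiling percentages are computed among patients with a
    computable score; missing is computed among all patients.
    """
    valid = [s for s in scores if s is not None]
    floor_pct = 100.0 * sum(1 for s in valid if s == minimum) / len(valid)
    ceiling_pct = 100.0 * sum(1 for s in valid if s == maximum) / len(valid)
    missing_pct = 100.0 * (len(scores) - len(valid)) / len(scores)
    return floor_pct, ceiling_pct, missing_pct


# Hypothetical postrevascularisation scale scores (None = score not computable)
example_scores = [0, 50, 100, 100, None]
floor_pct, ceiling_pct, missing_pct = acceptability_summary(example_scores)
print(floor_pct, ceiling_pct, missing_pct)
```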
Reliability
Cronbach's α coefficients for all scales at prerevascularisation and postrevascularisation far exceeded the criterion of >0.70,13 indicating excellent internal consistency (table 4). For all scales in all samples, the value of α if item was deleted did not substantially increase, indicating that all of the items within each scale were contributing to the underlying constructs.25 Scales in all versions demonstrated evidence of homogeneity. All item-total correlations exceeded the criterion of >0.30,13 the range of item-total correlations was small to moderate, and the average item-total correlations were moderate to high.
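For readers unfamiliar with these statistics, the sketch below computes Cronbach's α using the standard formula, α = k/(k−1) × (1 − Σ item variances / variance of the total score), together with item-total correlations corrected for overlap (each item correlated with the sum of the remaining items). Whether the paper corrected for overlap is not stated, so that choice is an assumption, and the response data are invented for illustration.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete data; rows = patients, columns = items."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def corrected_item_total(rows):
    """Correlation of each item with the sum of the other items in the scale."""
    k = len(rows[0])
    correlations = []
    for i in range(k):
        item = [row[i] for row in rows]
        rest = [sum(row) - row[i] for row in rows]
        correlations.append(pearson(item, rest))
    return correlations


# Hypothetical responses from five patients to a three-item scale
responses = [[4, 5, 4], [2, 2, 3], [5, 5, 5], [1, 2, 1], [3, 3, 4]]
print(round(cronbach_alpha(responses), 2))
print([round(r, 2) for r in corrected_item_total(responses)])
```

The results would then be compared with the criteria quoted above (α >0.70 and item-total correlations >0.30).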
Tests of scaling assumptions
The results of these tests provided strong confirmatory evidence that the CROQ items are correctly grouped in scales in the prerevascularisation and postrevascularisation versions (table 4).26
Construct validity (within scale analyses)
Construct validity was demonstrated by evidence of high internal consistency (high values of Cronbach's α and moderately high item-total correlations, table 4). Principal axis factor analysis and the pattern of intercorrelations between the CROQ scales confirmed the scaling structure of each version (data not presented). Analyses of CROQ scale scores showed the expected pattern for groups hypothesised to differ. For example, CABG and PCI patients who reported overall improvement in their heart condition at Q2 scored significantly higher (p<0.05) on all four CROQ scales than those who reported their heart condition as being the same or worse at Q2 (data not presented).
Construct validity (analysis against external criteria)
Convergent and discriminant validity
CROQ scales at prerevascularisation and postrevascularisation were more highly correlated with the EQ-5D-3L10 dimensions measuring conceptually similar constructs than with those measuring different constructs (table 5). The correlation coefficients were low to moderate as expected between a disease-specific and generic tool, and slightly higher at postrevascularisation. As hypothesised there were very low correlations (all <0.30) between each version of the CROQ and age and sex, demonstrating that scores are not biased by these demographic factors.
Hypothesis testing/known groups (analyses against external criteria)
The EQ-5D-3L User Guide advises that levels 2 (some problems) and 3 (extreme problems) are combined into a single category (problems) as extreme problems are often low in frequency. While extreme problems were reported by some patients for some EQ-5D dimensions, the numbers were small for this level for other dimensions so the levels were collapsed for all dimensions. As hypothesised, mean CROQ scale scores were significantly higher (p<0.001) for those reporting ‘no problems’ than ‘problems’ on the five EQ-5D-3L dimensions, at prerevascularisation and postrevascularisation in the CABG and PCI samples (web tables S1 and S2). In addition, on average, CABG and PCI patients reporting a comorbidity of depression scored significantly lower (p<0.05) on all CROQ scales, at prerevascularisation and postrevascularisation.
Supplementary web table. Construct validity (comparison with external criteria) - hypothesis testing: Comparison of mean (SD) CROQ-CABG and CROQ-PCI Q1 scores for patients with problems versus no problems on the EQ-5D dimensions at Q1
Supplementary web table. Construct validity (comparison with external criteria) - hypothesis testing: Comparison of mean (SD) CROQ-CABG and CROQ-PCI Q1 scores for patients with problems versus no problems on the EQ-5D dimensions at Q2
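The known-groups comparison described in this section, in which EQ-5D-3L levels 2 and 3 are collapsed into a single 'problems' category before comparing mean CROQ scores, can be sketched as follows. The patient records and field names are hypothetical, and the significance test applied in the paper is not reproduced.

```python
from statistics import mean


def collapse_level(level):
    """Map EQ-5D-3L level 1 to 'no problems' and levels 2-3 to 'problems'."""
    if level == 1:
        return "no problems"
    if level in (2, 3):
        return "problems"
    raise ValueError(f"unexpected EQ-5D-3L level: {level!r}")


def mean_score_by_group(patients, dimension, scale):
    """Mean scale score for each collapsed EQ-5D group."""
    groups = {"no problems": [], "problems": []}
    for patient in patients:
        groups[collapse_level(patient[dimension])].append(patient[scale])
    return {name: mean(values) for name, values in groups.items() if values}


# Hypothetical records: EQ-5D mobility level and a CROQ physical functioning score
patients = [
    {"mobility": 1, "physical_functioning": 90},
    {"mobility": 1, "physical_functioning": 80},
    {"mobility": 2, "physical_functioning": 50},
    {"mobility": 3, "physical_functioning": 40},
]
print(mean_score_by_group(patients, "mobility", "physical_functioning"))
```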
Responsiveness
Table 6 shows the effect sizes for change between prerevascularisation and postrevascularisation for the four core scales in the CABG and PCI responsiveness samples. All scales demonstrated significant change between prerevascularisation and postrevascularisation (p<0.001). For the CROQ-CABG, there was a large27 effect size for symptoms and psychosocial functioning, a moderate effect size for physical functioning, and a small effect size for cognitive functioning. For the CROQ-PCI, there was a large effect size for symptoms, moderate effect sizes for physical functioning and psychosocial functioning and a very small effect size for cognitive functioning. In the CABG and PCI samples, the generic EQ-5D-3L visual analogue scale (VAS) score had a smaller effect size (was less responsive) than three of the four disease-specific CROQ scales.
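The paper does not spell out how effect sizes were calculated. One common definition in responsiveness studies is the standardised effect size: mean change from baseline divided by the SD of baseline scores. The sketch below uses that definition, together with the conventional 0.2/0.5/0.8 benchmarks usually attributed to Cohen, purely as assumptions; the scores are invented.

```python
from statistics import mean, stdev


def standardised_effect_size(pre, post):
    """Mean change (post minus pre) divided by the SD of baseline scores.

    Assumption: this is one common responsiveness index; the paper may
    have used a different formula.
    """
    changes = [after - before for before, after in zip(pre, post)]
    return mean(changes) / stdev(pre)


def effect_size_label(effect_size):
    """Label an effect size using conventional benchmarks (assumed here)."""
    size = abs(effect_size)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "small"
    return "very small"


# Hypothetical paired Q1 and Q2 scale scores for three patients
q1_scores = [40, 50, 60]
q2_scores = [50, 60, 70]
es = standardised_effect_size(q1_scores, q2_scores)
print(es, effect_size_label(es))
```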
Subsamples of patients reporting having received help with the Q1 questionnaire
A significantly higher proportion of patients receiving help completing the Q1 questionnaire compared with those who did not receive help were female (CABG: 21.3% vs 14.5%; PCI: 30.4% vs 23.9%, p=0.001) and considered themselves to have a disability (CABG: 41.8% vs 26.4%; PCI: 58.4% vs 33.3%, p<0.001). A significantly lower proportion of CABG patients receiving help were white (49.8% vs 63.1%, p<0.001). CABG patients (68.6 (SD 9.9) years vs 65.6 (SD 9.5) years) and PCI patients (68.4 (SD 10.9) years vs 63.85 (SD 10.5) years) who received help were also significantly older (p<0.001) and scored significantly lower on all four core scales of the CROQ and the EQ-5D-3L VAS score (p<0.001) than those who did not receive help. All tests of acceptability, scaling assumptions, reliability and validity met the same psychometric criteria when they were repeated for just the subsamples of patients who reported that they received help completing the prerevascularisation versions in the clinical setting (n=639 CABG and n=615 PCI).
Discussion
Traditional psychometric properties of PROMs are context and sample dependent.18 ,19 Traditional psychometric analyses showed that the prerevascularisation and postrevascularisation versions of the CROQ-CABG and CROQ-PCI demonstrated sufficient evidence of acceptability, reliability (internal consistency), validity and responsiveness when used in the context of the NHS Coronary Revascularisation PROMs Pilot, for the sample of patients whose postprocedure assessment point was fixed to between 5 and 7 months of a HES confirmed revascularisation procedure. Analyses also confirmed that the prerevascularisation versions of CROQv2 are robust for self-completion and for completion with patient-reported help when administered in a clinical setting.
The initial psychometric validation of CROQv1 showed it to be acceptable, reliable, valid and responsive when administered via postal survey, in the context of a research project, in selected samples of patients at prerevascularisation and 3 months postrevascularisation.3 ,4 The analysis described in this paper confirms that the psychometric properties of CROQv2 are also robust when it is administered to a larger and more diverse group of patients undergoing elective coronary revascularisation procedures, outside the context of a research project. The CROQ was developed as a self-administered postal survey, but our subgroup analysis demonstrated that the psychometric properties were not compromised when patients received help in the clinical setting, despite the fact that research has shown that family and professionals tend to underestimate patients' quality of life status across diverse cultures and health conditions.24 Our analysis also confirmed that the psychometric properties are retained when the postrevascularisation version is administered at 5–7 months (rather than 3 months) postrevascularisation by postal survey. As such, it is appropriate to administer the CROQ at prerevascularisation (by postal survey or in the clinical setting) and at 3 or 6 months postrevascularisation by postal survey. This will allow for a greater degree of flexibility in future study designs and may reduce administration costs.
The CROQ is the only validated disease-specific PROM developed specifically to measure outcomes before and after coronary revascularisation (CABG and PCI). While other cardiac-specific PROMs have been developed and are relevant for use with coronary heart disease patients, such as the Seattle Angina Questionnaire,28 the MacNew Heart Disease Health-Related Quality of Life Questionnaire,29 and the Quality of Life Index-Cardiac Version, QLI-CV,30 these questionnaires do not capture all outcomes of importance to patients before and after coronary revascularisation.2 While some of these questionnaires have been widely used with coronary heart disease patients, including those undergoing CABG and PCI, for example, the Seattle Angina Questionnaire, when selecting an instrument it is important to ensure that the most relevant and applicable PROM is used for the research question under study, that all the questions are applicable to the specific patient group and that items of importance to patients are included.
Study limitations
This study has some important limitations. First, a large number of patients had to be excluded from the main NHS PROMs Pilot postrevascularisation samples because the interval after which patients were sent their postoperative questionnaires (Q2) for completion at home varied widely. As PROMs are sample and context dependent, to perform meaningful psychometric analysis it was essential to compare patients at a similar point in time after revascularisation. The possibly slightly lenient criterion of including patients who completed their postoperative Q2 questionnaire between 5 and 7 months of a HES confirmed coronary revascularisation date was applied to all the postrevascularisation psychometric analyses. In future applications, if these essential exclusion criteria are not applied then the psychometric properties of the CROQ may be compromised and the data may be invalid.
Second, it was not possible to evaluate the stability of the CROQv2 through test–retest reliability17 as the appropriate data was not collected during the NHS PROMs Revascularisation Pilot. This should be assessed in a small random sample of CABG and PCI patients, if the decision to use CROQv2 more widely in this context is made.
Third, at the time the CROQv1 was originally developed, the dominant psychometric paradigm was CTT and the CROQv1 was developed using these traditional methods (as described here). It was therefore important to assess the psychometric properties of CROQv2 using the same methods as the original validation. However, future work could evaluate the CROQv2 using so-called modern psychometric methods such as Item Response Theory31 or Rasch Measurement Theory.32 This would enable CROQv2 scores to be placed on a truly interval scale, to be invariant (ie, independent of sample and context) and potentially to be applicable in clinical practice at the individual patient level.33 Currently, CROQv2 should not be used as a tool to assess a patient's need for surgery as, like other PROMs, it has not been validated for this purpose and it is possible that its predictive validity is not strong enough.
Conclusion
The CROQ is reliable, valid and responsive when used in the context of a large-scale PROMs programme. While there are several validated cardiac specific PROMs, the CROQ remains the only validated procedure-specific questionnaire for coronary revascularisation. It was developed with patients and includes a much broader range of outcomes important to patients than other cardiac-specific questionnaires. The availability of a tool suitable for use in large-scale PROMs programmes, alongside the collection of clinical data, could enable the routine collection of outcomes that matter to coronary revascularisation patients, rather than focus on narrow aspects of disease or functioning. The CROQ is not yet appropriate for use in clinical practice at the individual patient level as it was developed using psychometric tests for group level measurement and more rigorous measurement standards need to be met for this application.33
Acknowledgments
We thank all patients who participated in the NHS Coronary Revascularisation PROMs Pilot and the staff who so generously gave their time voluntarily to help make the pilot work. The NHS Trusts that participated in the pilot were: Barts Health NHS Trust, Basildon and Thurrock University Hospitals NHS Foundation Trust (Essex Cardiothoracic Centre), Brompton & Harefield NHS Foundation Trust, Blackpool, Fylde and Wyre Hospitals NHS Foundation Trust, Liverpool Heart & Chest Hospital NHS Foundation Trust, Nottingham University Hospitals NHS Trust, Oxford University Hospitals NHS Foundation Trust, Papworth Hospital NHS Foundation Trust, Sheffield Teaching Hospitals NHS Foundation Trust, St George's NHS Trust, and University Southampton Hospitals NHS Foundation Trust. We thank Dr Andrew Wragg, Mr Peter Bradley and Alison Pottle for their contribution to the working group.
References
Footnotes
Contributors SS, RM, SG and MJ developed the sampling strategy for the psychometric testing. SG and RM cleaned the data sets and developed the samples for analysis. SS conducted all the psychometric analysis and wrote the first draft of the manuscript. SS, RM, SG, MJ contributed to the writing of the article and approved the final version of the manuscript.
Funding This work was commissioned by the Department of Health (now NHS England).
Competing interests SS developed and validated the CROQ. SS is employed full time by BMJ Publishing Group as a researcher, but is not involved in any publication decisions on manuscripts for any of its journals. A grant was paid by the Department of Health to Liverpool Heart and Chest Hospital to cover the costs of the analytics. RM, SG and SS were compensated for their contributions to the analysis of the NHS Coronary revascularisation PROMs Pilot.
Ethical approval Ethical approval was not required as the data was collected as part of a service evaluation for the NHS. Patients completed a consent form at the time they completed the Q1 questionnaire.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional data are available.