Shortening patient-reported outcome measures through optimal test assembly: application to the Social Appearance Anxiety Scale in the Scleroderma Patient-centered Intervention Network Cohort
  1. Daphna Harel1,2,
  2. Sarah D Mills3,
  3. Linda Kwakkenbos4,
  4. Marie-Eve Carrier5,
  5. Karen Nielsen6,
  6. Alexandra Portales7,
  7. Susan J Bartlett8,
  8. Vanessa L Malcarne9,10,
  9. Brett D Thombs11
  10. and the SPIN Investigators
    1. 1 Applied Statistics, Social Science, Humanities, New York University, New York City, New York, USA
    2. 2 PRIISM Applied Statistics Center, New York University, New York City, New York, USA
    3. 3 Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina, USA
    4. 4 Behavioural Science Institute, Clinical Psychology, Radboud University, Nijmegen, The Netherlands
    5. 5 Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada
    6. 6 Scleroderma Society of Ontario, Hamilton, Ontario, Canada
    7. 7 Asociación Española de Esclerodermia, Madrid, Spain
    8. 8 Department of Medicine, McGill University, Montreal, Quebec, Canada
    9. 9 Department of Psychology, San Diego State University, San Diego, California, USA
    10. 10 San Diego Joint Doctoral Program in Clinical Psychology, San Diego State University/University of California, San Diego, California, USA
    11. 11 Department of Psychiatry, McGill University, Montreal, Quebec, Canada
    1. Correspondence to Dr Brett D Thombs; brett.thombs{at}mcgill.ca

    Abstract

    Objectives The Social Appearance Anxiety Scale (SAAS) is a 16-item measure that assesses social anxiety in situations where appearance is evaluated. The objective was to use optimal test assembly (OTA) methods to develop and validate a short-form SAAS based on objective and reproducible criteria.

    Design This study was a cross-sectional analysis of baseline data from adults enrolled in the Scleroderma Patient-centered Intervention Network (SPIN) Cohort.

    Setting Adults in the SPIN Cohort in the present study were enrolled at 28 centres in Canada, the USA and the UK.

    Participants The SAAS was administered to 926 adults with scleroderma.

    Primary and secondary measures The SAAS, Brief Fear of Negative Evaluation II (BFNE II), Brief Satisfaction with Appearance Scale (Brief-SWAP), Patient Health Questionnaire-8 (PHQ-8) and Social Interaction Anxiety Scale-6 (SIAS-6) were collected, as well as demographic characteristics.

    Results OTA methods identified a maximally informative shortened version for each possible form length between 1 and 15 items. The final shortened version was selected based on prespecified criteria for reliability, concurrent validity and statistically equivalent convergent validity with the BFNE II scale. A five-item short version was selected (SAAS-5). The SAAS-5 had a Cronbach’s α of 0.95 and had high concurrent validity with the full-length form (r=0.97). The correlation of the SAAS-5 with the BFNE II was 0.66, which was statistically equivalent to that of the full-length form. Furthermore, the correlation of the SAAS-5 with the two subscales of the Brief-SWAP, and the SIAS-6, were statistically equivalent to that of the full-length form.

    Conclusions OTA was an efficient method for shortening the full-length SAAS to create the SAAS-5.

    • patient reported outcome measure
    • optimal test assembly
    • short form
    • generalized partial credit model
    • systemic sclerosis

    This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

    Strengths and limitations of this study

    • This study used optimal test assembly methods and equivalence testing to shorten the Social Appearance Anxiety Scale (SAAS) in patients with scleroderma.

    • This method is data driven and reproducible, unlike many alternative methods for shortening questionnaires.

    • The generalisability of findings is limited to adults with scleroderma and should be confirmed for other patient populations as well as the general population.

    Introduction 

    Patient-reported outcome measures (PROs) that assess patient health, well-being and psychological status based on patient perspectives are increasingly a central component of clinical trials and cohort-based observational studies in health research.1 This can lead, however, to participants being asked to respond to many scales that each contain multiple items, which may be a burden for participants, increase research costs and contribute to poor quality data due to survey fatigue. To ameliorate this problem, researchers sometimes attempt to create shortened versions of PROs with scores that can perform as well or nearly as well as original full-length versions.2–5

    In rare diseases, including systemic sclerosis (SSc), psychological impact can be substantial, and psychological measures are increasingly included in large, multisite studies. SSc is a rare autoimmune disorder characterised by thickening and fibrosis of the skin and internal organs.6 7 Changes in appearance are a hallmark of the disease and can include hypopigmentation and hyperpigmentation, digital ulcers, hand contractures, telangiectasias and altered facial features. Changes in appearance often occur in socially relevant areas (ie, hands and face) of the body and can have significant impacts on psychosocial functioning, in particular in social contexts.8 Adults with SSc report high rates of anxiety, with 64% reporting at least one anxiety disorder in their lifetime, and social anxiety being among the most common.9 Despite reports of appearance-related social discomfort, research in appearance-related social anxiety among adults with SSc is limited.

    The Social Appearance Anxiety Scale (SAAS)10 is a 16-item self-report measure that assesses fear of situations in which one’s appearance will be evaluated. The SAAS was recently validated in a large sample (n=938) of adults with SSc attending clinics in Canada, the USA and the UK.11 Consistent with previous studies,10 12–14 a unidimensional factor structure fit well among the total sample of adults with SSc. Internal consistency reliability as measured by Cronbach’s α was excellent in the total sample (α=0.96) and for limited (α=0.96) and diffuse (α=0.97) subtypes. Evidence of convergent validity was provided via moderate to large correlations between the 16-item SAAS and measures of social discomfort, fear of negative evaluation, social anxiety, symptoms of depression and dissatisfaction with appearance. In other studies, the SAAS has also demonstrated strong measurement properties in samples of university students,10 12 women with eating disorders13 and gay and bisexual men of colour.14 No studies, however, have examined whether all 16 items of the SAAS are necessary to achieve these measurement properties or whether it is possible to shorten this scale. Apparent redundancy between some of the 16 items suggests that there may be an opportunity for shortening (eg, item 7, ‘I am afraid people find me unattractive’, and item 16, ‘I am concerned that people think I am not good looking’).

    Historically, researchers have created shortened versions of PROs through either an expert-based, qualitative assessment of item content or by fitting a factor analysis model and removing items with minimal factor loadings or low item–total correlations.3 More modern techniques, such as item response theory,15 have been used to identify items that are problematic. However, these methods often are administered in a way that the final selection of items in the shortened version is left to the researcher’s discretion, rather than by systematically establishing prespecified cut-offs or using reproducible criteria.

    Optimal test assembly (OTA) is a branch-and-bound, mixed integer programming procedure that relies on estimates obtained from an item response theory model to select an optimal subset of items that best satisfy objective, reproducible and prespecified constraints.16 OTA has been commonly used to create versions of high-stakes educational tests,17 but recently, a study demonstrated its use for the development of shortened versions of PROs in health research by shortening an 18-item hand function scale to six items while maintaining equivalent measurement properties to those of the full-length form.18 Furthermore, OTA was recently used to shorten the Patient Health Questionnaire-9 to a four-item shortened form.19 This procedure was also shown to be replicable, reproducible and produce shortened forms of minimal length as compared with leading alternative methods.20

    The objective of the present study was to apply OTA to develop a shortened version of the SAAS. We: (1) used OTA methods to generate maximally precise candidate short versions of the SAAS of each possible length; (2) selected the shortest possible version that performed similarly to the full-form SAAS in terms of prespecified reliability and validity criteria; and (3) assessed the convergent validity of the final selected shortened form as compared with that of the full-length form.

    Material and methods

    Participants and procedures

    This study was a cross-sectional analysis of baseline data from adults enrolled in the Scleroderma Patient-centered Intervention Network (SPIN) Cohort21 who completed online study questionnaires from May 2014 to August 2016. Adults in the SPIN Cohort in the present study were enrolled at 28 centres in Canada, the USA and the UK. To be eligible for the SPIN Cohort, adults must be classified as having SSc according to 2013 American College of Rheumatology (ACR)/European League Against Rheumatism (EULAR) classification criteria22 and confirmed by a SPIN physician, be at least 18 years of age, have the ability to provide informed consent and be fluent in English, French or Spanish. Eligible adults are invited by the attending physician or a supervised nurse coordinator to participate in the cohort, and written informed consent is obtained. SPIN Cohort adults complete outcome measures via the internet on enrolment and subsequently every 3 months. Adults who completed all items of the SAAS and the Brief Fear of Negative Evaluation II (BFNE II) at baseline in English were included in the present study.

    Measures

    Demographic and medical variables

    Age, gender, marital status, number of years since first non-Raynaud’s symptom, disease subtype (limited or diffuse) and modified Rodnan skin score23 were collected. Limited disease was defined as skin sclerosis confined to the limbs distal to the elbows and knees with or without face involvement. Diffuse disease was defined as skin sclerosis involving the limbs proximal to the elbows and knees with or without chest or trunk involvement.24 Demographic variables were self-reported, and SPIN physicians or nurse coordinators collected medical variables.

    Social Appearance Anxiety Scale

    The SAAS, a 16-item measure, was developed to assess the respondent’s anxiety surrounding situations in which one’s appearance may be evaluated. Response options for each item range from 1 (not at all) to 5 (extremely). The total score is calculated by summing across all items, after reverse coding the first item. Scores range from 16 to 80, with higher scores indicating greater fear. A study of adults with SSc found strong evidence for a one-dimension factor structure both in the total sample and when examined separately among adults with limited and diffuse SSc, internal consistency reliability and convergent validity.11
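The scoring rule described above can be sketched in a few lines. This is only an illustration of the stated rule (sum of 16 items on a 1 to 5 scale, with item 1 reverse coded); the function name `saas_total` is hypothetical and not part of any published scoring software.

```python
def saas_total(responses):
    """Total score for a 16-item SAAS response vector (each item 1..5).

    Item 1 is reverse coded before summing, as described in the text.
    The function name and input format are illustrative assumptions.
    """
    if len(responses) != 16:
        raise ValueError("expected 16 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    # On a 1..5 scale, reverse coding maps x to 6 - x (1<->5, 2<->4, 3->3).
    first_item = 6 - responses[0]
    return first_item + sum(responses[1:])
```

With this rule the lowest possible total is 16 and the highest is 80, which matches the score range given in the text.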

    Brief Fear of Negative Evaluation II

    The BFNE-II is a 12-item measure that assesses the degree to which individuals worry about how they are perceived and evaluated by others.25 Response options for each item range from 1 (not at all characteristic of me) to 5 (extremely characteristic of me). Scores range from 12 to 60 with higher scores indicating greater fear of negative evaluation. A study of adults with SSc found strong evidence for a one-dimension factor structure, internal consistency reliability and convergent validity.26

    Brief Satisfaction with Appearance Scale (Brief-SWAP)

    The Brief-SWAP consists of two three-item subscales that measure dissatisfaction with appearance and social discomfort.27 Response options for each item range from 0 (strongly disagree) to 6 (strongly agree). Scores on each subscale range from 0 to 18, with higher scores indicating greater body image dissatisfaction. A study of adults with SSc found high internal consistency and strong convergent validity with the SAAS.11

    Patient Health Questionnaire-8 (PHQ-8)

    The PHQ-8 consists of eight items that measure depressive symptomology.28 Response options on each item range from 0 (not at all) to 3 (nearly every day), with a total score that ranges from 0 to 24. Higher scores indicate higher levels of depressive symptoms. A study of adults with SSc found high internal consistency and moderate convergent validity with the SAAS.11

    Social Interaction Anxiety Scale-6 (SIAS-6)

    The SIAS-6 assesses anxiety resulting from social interactions.29 Response options on six items range from 0 (not at all characteristic or true of me) to 4 (extremely characteristic or true of me), with total scores ranging from 0 to 24. A study of adults with SSc found excellent internal consistency and strong convergent validity with the SAAS.11

    Statistical analysis

    Item response theory model and OTA

    Unidimensionality of the SAAS in this sample was confirmed previously using the same dataset as in the present study.11 Thus, a generalised partial credit item response theory model (GPCM) was fit to all 16 items of the SAAS.30 The GPCM estimates two types of parameters for each item: threshold parameters, which measure the level of anxiety at which people are more likely to endorse a higher category than the one below it, and discrimination parameters, which measure the strength of the association between that item and the underlying construct (in this case, social appearance anxiety). From these item-level parameters, item information functions are estimated for each of the 16 items, and summed pointwise to obtain the test information function (TIF). The TIF measures the total amount of Fisher’s information in the 16 items and is inversely related to the SE of measurement of the underlying construct. Thus, versions of a PRO with higher levels of test information result in greater precision in the measurements of the underlying construct.15

    A set of 15 candidate shortened versions, one of each possible length between 1 item and 15 items, was generated through the OTA procedure. OTA uses a branch-and-bound approach through mixed integer linear programming to systematically explore the space of all possible shortened versions of a fixed length to optimise an objective function. In this case, the objective function was defined to be the height of the TIF, thus minimising the SE of measurement of the underlying construct. Therefore, for each possible length, the OTA procedure creates an optimal candidate shortened version of the PRO, defined by selecting the items that maximise the TIF across the latent spectrum of the underlying construct, as compared with all other possible shortened versions of the same length. Based on previously established guidelines, the OTA procedure was anchored at five points across the spectrum of the underlying construct (−3, −1, 0, 1, 3), jointly maximising the objective function at these points.16
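The selection problem can be illustrated with a small sketch. Two assumptions are made here and should be read as such: first, "jointly maximising" is interpreted as a maximin objective (maximise the smallest TIF value over the anchor points), which is one common formulation; second, the search is done by exhaustive enumeration rather than by branch-and-bound mixed integer programming, which is feasible only because a 16-item scale has at most 12 870 subsets of any given length. The function name `best_short_form` and the information values in the usage note are hypothetical.

```python
from itertools import combinations

def best_short_form(info, length):
    """Pick the item subset of a given length with the best worst-case TIF.

    info[i][k] is the information of item i at anchor point k.
    Returns (tuple of item indices, minimum TIF over the anchor points).
    Exhaustive search stands in for the mixed integer programming solver.
    """
    n_items = len(info)
    n_anchors = len(info[0])
    best_items, best_value = None, float("-inf")
    for subset in combinations(range(n_items), length):
        # TIF of the subset at each anchor point, then take the worst anchor.
        worst = min(sum(info[i][k] for i in subset) for k in range(n_anchors))
        if worst > best_value:
            best_items, best_value = subset, worst
    return best_items, best_value
```

For example, with hypothetical information values `[[1.0, 0.2], [0.2, 1.0], [0.6, 0.6], [0.1, 0.1]]` at two anchor points, the best one-item form is item index 2, whereas the best two-item form is indices 0 and 1. Optimal forms of different lengths therefore need not be nested, which is why each length is solved separately.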

    Each of the 15 candidate short versions and the full-length form were scored using two procedures to obtain estimates of each participant’s level of anxiety surrounding situations in which one’s appearance will be evaluated. First, the summed scores across all items included in the form were calculated by adding item scores for each item included in the form. Second, factor scores, which estimate a level of a latent construct, were estimated from the GPCM for each participant for each form through an application of Bayes’ theorem. Although summed scores are typically relied on for clinical use, the factor scores were considered to provide a better estimate of the underlying construct. This is because of limitations of summed scores under the GPCM. Summed scores may result in an incorrect ordering of patients along the spectrum of the underlying construct. That is, patients with lower levels of fear may have higher summed scores than patients with higher levels of fear.31 32
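As an illustration of how Bayes' theorem yields a factor score, the sketch below computes an expected a posteriori (EAP) estimate on a grid. Several choices here are assumptions rather than details reported in the text: a standard normal prior, an evenly spaced grid from −4 to 4, the posterior mean (rather than the posterior mode) as the point estimate, and responses coded from 0 (so a SAAS response of 1 to 5 would be entered as 0 to 4). The function names and item parameters are hypothetical.

```python
import math

def gpcm_probs(theta, a, thresholds):
    """GPCM category probabilities for categories 0..len(thresholds)."""
    z = [0.0]
    for b in thresholds:
        z.append(z[-1] + a * (theta - b))
    top = max(z)
    weights = [math.exp(v - top) for v in z]
    total = sum(weights)
    return [w / total for w in weights]

def eap_score(responses, items, grid_min=-4.0, grid_max=4.0, n_points=81):
    """Posterior mean of theta given 0-based item responses.

    items is a list of (discrimination, thresholds) pairs, one per response.
    Posterior is proportional to prior(theta) * likelihood(responses | theta).
    """
    step = (grid_max - grid_min) / (n_points - 1)
    numerator = 0.0
    denominator = 0.0
    for j in range(n_points):
        theta = grid_min + j * step
        weight = math.exp(-0.5 * theta * theta)  # unnormalised N(0, 1) prior
        for x, (a, thresholds) in zip(responses, items):
            weight *= gpcm_probs(theta, a, thresholds)[x]
        numerator += theta * weight
        denominator += weight
    return numerator / denominator
```

Unlike a summed score, this estimate weights each response by the item's discrimination, so two response patterns with the same sum can receive different factor scores when items differ in discrimination.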

    Selection of the final form

    OTA generates optimal candidate short versions of the SAAS but does not provide criteria by which the final form should be selected. When items are eliminated from the full-length form, the amount of test information inherently decreases, and there is no obvious threshold at which a shortened version would be said to contain adequate information. Therefore, the selection of the final form was based on five criteria: reliability, concurrent validity based on summed scores, concurrent validity based on factor scores, convergent validity based on summed scores and convergent validity based on factor scores. Applying these five criteria concurrently ensured that the final selected shortened version maintains desirable measurement properties across these categories.

    First, the reliability of each candidate shortened version and the full-length form was assessed using Cronbach’s α coefficient. The shortened version was required to maintain at least 95% of the value of Cronbach’s α for the full-length form. Second, concurrent validity for both summed and factor scores was assessed by calculating Pearson’s correlation coefficient between the scores on each candidate shortened version and the scores on the full-length form. For both the summed and factor scores, these correlations were required to be at least 0.95, ensuring that the shortened version demonstrated high concurrent validity.
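The reliability and concurrent validity checks can be sketched as follows. The formulas for Cronbach's α and Pearson's r are standard; the function names, the data layout (one row of item scores per participant) and the helper `meets_first_three_criteria` are illustrative assumptions, not the authors' code.

```python
import math
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-participant item-score rows."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def meets_first_three_criteria(alpha_short, alpha_full, r_summed, r_factor):
    """Reliability and concurrent validity thresholds described in the text."""
    return (alpha_short >= 0.95 * alpha_full
            and r_summed >= 0.95
            and r_factor >= 0.95)
```

Note that the reliability criterion is relative: with a full-length α of 0.96, a short form passes if its α is at least 0.95 × 0.96 = 0.912.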

    Lastly, convergent validity was assessed through the correlation between each patient’s score on the SAAS and their score on the BFNE II. The candidate shortened versions were required to demonstrate statistical equivalence within a tolerance of 0.05 with the convergent validity of the full-length SAAS through an application of equivalence testing. Equivalence testing, more commonly used in clinical trials, tests whether the difference between two correlations is within a prespecified range, in this case set at 0.05.33 34 Contrary to traditional hypothesis testing, equivalence testing tests a null hypothesis that the difference between the two correlations is greater than the prespecified range, against an alternative hypothesis of equivalence within the prespecified range. To assess statistical significance, we applied the Benjamini-Hochberg correction procedure for each of the 30 hypothesis tests used (15 candidate shortened versions × two scoring procedures).35
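The multiplicity correction can be sketched as below. This shows only the standard Benjamini-Hochberg adjustment applied to a list of p values; it does not implement the equivalence test for correlations itself, and the function name and any p values passed to it are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order.

    Each p value is scaled by m / rank (rank 1 = smallest), and the results
    are made monotone by taking a running minimum from the largest p value
    downwards.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

In the setting described above, the input would hold the 30 equivalence-test p values, and a candidate form would be declared equivalent when its adjusted p value falls below the chosen significance level.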

    Post hoc convergent validity of the shortened form

    Convergent validity of the selected shortened form was compared with that of the full-length form. Correlations between the summed scores of the selected shortened form and those of four other measures: the two subscales of the Brief-SWAP, the PHQ-8 and the SIAS-6 were calculated. Statistical equivalence was assessed within a tolerance of 0.05 with the convergent validity of the full-length SAAS using Benjamini-Hochberg adjusted p values.

    All analyses were conducted in R Studio V.1.0.136.36 The GPCM was fit using the ltm package.37 The OTA analysis was conducted using the lpSolveAPI package.38

    Patient involvement

    SPIN was conceived by a collaboration of investigators and patients. SPIN’s Patient Advisory Board advises the SPIN Steering Committee on priorities for investigation. Patients were included in the SPIN Publication Committee, which reviewed the proposal for the present study and its methods. Two patients were coauthors of the present report.

    Results

    There were 926 people who completed both the SAAS and BFNE II. The mean age was 55.6 years, 88% were women and 43% had diffuse SSc. The mean±SD score on the SAAS was 28.3±24. SAAS scores in adults with diffuse SSc were significantly higher than in adults with limited SSc (p<0.001). See table 1 for descriptive statistics.

    Table 1

    Patient demographic and disease characteristics (n=926)

    Item response theory model and OTA

    The GPCM was fit to the 16 items of the SAAS. Table 2 shows the item content, along with the discrimination parameters estimated from the GPCM. The three items with the highest amount of discriminative ability and, therefore, the most influential on the TIF, were items 6, 7 and 13. The items with the least amount of discriminative ability and, therefore, the least influential on the TIF, were items 1, 2 and 15. Figure 1 shows the individual item information functions generated from the estimates from the GPCM and the TIF.

    Table 2

    SAAS items and discrimination parameters from the GPCM

    Figure 1

    Item and test information curves of the SAAS. The left hand plot shows the 16 individual item information curves. The right hand plot compares the test information functions of the full SAAS (solid line) and SAAS-5 (dashed line). SAAS, Social Appearance Anxiety Scale.

    The OTA procedure generated 15 candidate short versions that each maximised the total amount of test information among all shortened versions of that length. Online supplementary appendix table 1 shows the items that were selected by the OTA procedure for each of the 15 candidate short versions. Items 6, 7 and 14 were included in all short forms of length at least three items. Although item 13 had a higher discrimination parameter estimate than item 14, it was not included in shortened forms of lengths shorter than 4. This is because the OTA procedure accounts for a more complete assessment of an item than just its discrimination parameter.20 That is, if two items have the same level of discrimination and provide information at the same point on the latent spectrum, then the OTA procedure may not select both items into the shortened form. Items 1, 2 and 15 were the first three items dropped from the candidate shortened versions. These items all had low information across the spectrum of social appearance anxiety.

    Selection of the final shortened version

    Table 3 presents Cronbach’s α values and concurrent validity correlations for the 15 candidate short forms. Even for shortened versions with very few items, the values of Cronbach’s α and the validity correlations remained high. Table 4 presents results of the equivalency tests for the convergent validity correlation with the BFNE II. The two-item shortened version, and all versions with at least four items, demonstrated statistically significant equivalency with the full SAAS in the correlations of both summed and factor scores with the BFNE II. All shortened versions with at least five items satisfied our prespecified criteria in terms of reliability, concurrent validity and convergent validity. Therefore, the five-item shortened version (SAAS-5, see online supplementary appendix table 2) was the shortest candidate version to fulfil our requirements. Versions shorter than the SAAS-5 failed to meet the criteria on concurrent validity for the factor scores from the GPCM.

    Table 3

    Properties of optimal shortened versions

    Table 4

    Equivalency analysis results

    The SAAS-5 includes item 6 (‘I am concerned that people will find me unappealing because of my appearance’), item 7 (‘I am afraid people find me unattractive’), item 12 (‘I am frequently afraid that I won’t meet others’ standards of how I should look’), item 13 (‘I worry people will judge the way I look negatively’) and item 14 (‘I am uncomfortable when I think others are noticing flaws in my appearance’). The SAAS-5 had a Cronbach’s α of 0.95 as compared with the Cronbach’s α of the full-length form of 0.96. Thus, the SAAS-5 maintained high reliability. The correlation of the summed scores from the SAAS-5 with those from the full 16-item SAAS scores was r=0.97 (95% CI 0.97 to 0.97). The correlation of the factor scores between the full-length and shortened versions was r=0.95 (95% CI 0.95 to 0.96). The summed scores on the SAAS-5 maintained moderate-to-high positive correlation with the BFNE II (r=0.66, 95% CI 0.62 to 0.69) compared with 0.66 (95% CI 0.63 to 0.70) for the 16-item SAAS. Similarly, the factor scores on the SAAS-5 maintained moderate-to-high positive correlations with the BFNE II (r=0.68, 95% CI 0.64 to 0.71) compared with 0.68 (95% CI 0.64 to 0.71) for the full SAAS. The mean score on the SAAS-5 in this sample was 8.29 with an SD of 4.55 and possible range of 5 to 25.

    Post hoc convergent validity of the SAAS-5

    The convergent validity of the SAAS-5 was statistically equivalent, within a tolerance of 0.05, to that of the full-length SAAS for the two subscales of the Brief-SWAP, and the SIAS-6, as shown in table 5. The convergent validity correlation was not statistically equivalent for the PHQ-8. However, even for this measure, convergent validity was moderate for both the SAAS-5 and full-length version.

    Table 5

    Convergent validity correlations

    Discussion

    This study investigated how OTA methods can be used to develop shortened versions of PRO measures, using a measure of social appearance anxiety—the SAAS. The 16-item SAAS was shortened to a five-item version through a reproducible process based on prespecified and objective criteria. The SAAS-5 maintained high reliability (α=0.95), high concurrent validity with the full-length form, with an r=0.97 (95% CI 0.97 to 0.97) for summed scores, and an r=0.95 (95% CI 0.95 to 0.96) for factor scores. The SAAS-5 maintained statistically equivalent convergent validity correlations with the BFNE II for both summed and factor scores. Furthermore, the SAAS-5 maintained statistically equivalent convergent validity correlations for the two subscales of the Brief-SWAP and SIAS-6 to that of the full-length form. Although the SAAS-5 did not maintain a statistically equivalent convergent validity correlation with the PHQ-8, this does not suggest poor convergent validity. This may have occurred because items that captured symptoms most relevant to depression in the SAAS are no longer included in the SAAS-5. Scores on the short-form remained moderately correlated in the expected direction with the PHQ-8.

    In addition to its measurement properties, face validity, or the degree to which a test appears to measure what it reports to measure,39 is strong for the SAAS-5. The items of the SAAS-5 assess concern about being unappealing and unattractive, not meeting others’ appearance standards, worry about appearance-related judgement and discomfort when others notice appearance-related flaws. These items all appear to measure aspects of social appearance anxiety, or a fear of situations in which one’s appearance will be evaluated. Thus, findings from the present study suggest that the SAAS-5 is a brief, valid and reliable measure of social appearance anxiety among adults with SSc. The SAAS-5 may be preferred over the 16-item SAAS as it reduces participant burden, which is particularly important among adults with SSc who may have difficulty completing self-report questionnaires due to restricted physical functioning. By reducing the number of items in measures like the SAAS, researchers may be able to increase the number of constructs that are measured in hard-to-access populations, such as people living with rare diseases, including SSc.

    There are several limitations that must be considered in this study. First, the SPIN Cohort is a convenience sample of patients receiving treatment at SPIN recruiting centres and who completed study questionnaires online. In addition, this sample had relatively low skin scores, so the findings may not generalise to patients with more severe disease.

    This study used cross-sectional data, and therefore, the sensitivity to change or intervention status, discriminant validity and test–retest reliability of the SAAS-5 were not investigated. The purpose in the present study was to illustrate the use of OTA in creating a shortened version of a full-length form and to propose a new shortened version of the SAAS. Future studies should investigate these properties in order to assess the discriminant, predictive and evaluative characteristics of the SAAS-5. Furthermore, the assessment of longitudinal changes that are clinically meaningful due to, for example, treatment in a clinical trial of SSc patients, would need further study. Therefore, this may limit the utility of the SAAS-5 as an evaluative measure in patients with SSc.

    The method used in this study does not include content validity or expert assessment of the items selected into the shortened form. Had an expert panel or focus group of patients been convened, they may have selected a different subset of items into the shortened form. An expert panel may have been able to use their knowledge to select items that were appropriate for the detection of clinically meaningful changes of worsening or improvement. However, such a procedure would not rely directly on patient data, may not be replicable and may result in reduced measure validity based on imperfect clinical intuition.20 More resource-intensive methods for developing short forms, such as focus groups and content experts, along with replicable statistical criteria, would be ideal. However, the resources necessary to complete these procedures may represent a substantial barrier to the development of shortened forms. The OTA method provides a replicable and more feasible method that maintains performance standards based on objective criteria.

    The OTA procedure is sensitive to the investigator-defined choice of decision criteria in the selection of the final shortened version. These decision criteria, when applied in future studies, must be carefully considered by researchers. Furthermore, the OTA method treats the 16 items of the SAAS as if they represented a full item bank of possible items. It is possible that, if other items were considered, a different set of items would have been selected into the final shortened version.

    The OTA procedure is data driven, and results of this study should be replicated in this patient population. An analysis based on one sample of SSc patients may not be sufficient for the derivation of a disease-specific measure. The results of this study are only as applicable for patients with SSc as the original full-length SAAS. It should be noted that the original SAAS was developed based on three different samples of volunteers from introductory psychology courses at large public universities. Therefore, even the original SAAS instrument might not provide sufficient coverage in terms of content validity for patients with SSc. Lastly, future work should assess whether the SAAS-5 is the optimal shortened form in other patient populations, as well as the general population, as results of this study are limited in their generalisability beyond patients with SSc.

    Conclusion

    In summary, this study showed how OTA methods can be used to shorten PROs. The method was used to shorten the 16-item SAAS to a five-item version while maintaining comparable reliability and validity in a sample of adults with SSc. This analysis should be replicated in this patient population, as well as in other patient populations, to increase the generalisability of these findings. Moreover, expert opinion or focus group input should be solicited to assess whether the items selected into the shortened form match clinical intuition.

    Footnotes

    • Contributors DH was responsible for the study conception. LK, M-EC, SJB, VLM, BDT and the Scleroderma Patient-centered Intervention Network (SPIN) Investigators contributed to data collection. All authors contributed to data analysis and interpretation. DH and SDM drafted the manuscript. All authors provided a critical revision of the manuscript and approved the final version of the manuscript. DH is the guarantor.

    • Funding SPIN has been funded by research grants from the Canadian Institutes of Health Research (TR3-119192, PJT-148504 and PJT-149073), the Arthritis Society, the Scleroderma Society of Ontario and Scleroderma Canada. SPIN has also received support from the Lady Davis Institute for Medical Research of the Jewish General Hospital, Montreal, Canada, from McGill University, Montreal, Canada, and from Sclérodermie Québec. SDM receives funding from the Cancer Control Education Program at the University of North Carolina, Chapel Hill (National Cancer Institute award T32 CA057726). LK was supported by a CIHR Banting Postdoctoral Fellowship. BDT was supported by a Fonds de recherche du Québec - Santé (FRQS) researcher salary award. Authors had full access to the data and take responsibility for the integrity of the data and the accuracy of the data analysis.

    • Disclaimer No funding body had any role in the design, collection, analysis or interpretation of data; in the writing of the manuscript; or in the decision to submit the manuscript for publication.

    • Competing interests None declared.

    • Ethics approval The SPIN Cohort study was approved by the Research Ethics Committee of the Jewish General Hospital and by the Institutional Review Boards of each participating centre.

    • Provenance and peer review Not commissioned; externally peer reviewed.

    • Data sharing statement Data from the SPIN Cohort can be requested from the corresponding author.

    • Collaborators Murray Baron, McGill University, Montreal, Quebec, Canada; Daniel E Furst, Division of Rheumatology, Geffen School of Medicine at the University of California, Los Angeles, California, USA; Karen Gottesman, Scleroderma Foundation, Los Angeles, California, USA; McGill University, Montreal, Quebec, Canada; Maureen D Mayes, University of Texas McGovern School of Medicine, Houston, Texas, USA; Luc Mouthon, Université Paris Descartes, Paris, France; Warren R Nielson, St. Joseph’s Health Care, London, Ontario, Canada; Robert Riggs, Scleroderma Foundation, Danvers, Massachusetts, USA; Maureen Sauve, Scleroderma Society of Ontario, Hamilton, Ontario; Fredrick Wigley, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA; Shervin Assassi, University of Texas McGovern School of Medicine, Houston, Texas, USA; Isabelle Boutron, Université Paris Descartes, and Assistance Publique Hôpitaux de Paris, Paris, France; Angela Costa Maia, University of Minho, Braga, Portugal; Ghassan El-Baalbaki, Université du Québec à Montréal, Montreal, Quebec, Canada; Carolyn Ells, McGill University, Montreal, Quebec, Canada; Cornelia van den Ende, Sint Maartenskliniek, Nijmegen, The Netherlands; Kim Fligelstone, Scleroderma Society, London, UK; Catherine Fortune, Scleroderma Society of Ontario, Hamilton, Ontario, Canada; Tracy Frech, University of Utah, Salt Lake City, Utah, USA; Dominique Godard, Association des Sclérodermiques de France, Sorel-Moussel, France; Marie Hudson, McGill University, Montreal, Quebec, Canada; Ann Impens, Midwestern University, Downers Grove, Illinois, USA; Yeona Jang, McGill University, Montreal, Quebec, Canada; Sindhu R Johnson, Toronto Scleroderma Program, Mount Sinai Hospital, Toronto Western Hospital, and University of Toronto, Toronto, Ontario, Canada; Ann Tyrell Kennedy, Federation of European Scleroderma Associations, Dublin, Ireland; Annett Körner, McGill University, Montreal, Quebec, Canada; Maggie Larche, McMaster University, Hamilton, Ontario, Canada; Catarina Leite, University of Minho, Braga, Portugal; Carlo Marra, Memorial University, St. John’s, Newfoundland, Canada; Janet Pope, University of Western Ontario, London, Ontario, Canada; Tatiana Sofia Rodriguez Reyna, Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, Mexico City, Mexico; Anne A Schouffoer, Leiden University Medical Center, Leiden, The Netherlands; Russell J Steele, Jewish General Hospital and McGill University, Montreal, Quebec, Canada; Maria E Suarez-Almazor, University of Texas MD Anderson Cancer Center, Houston, Texas, USA; Joep Welling, NVLE Dutch patient organization for systemic autoimmune diseases, Utrecht, The Netherlands, and Federation of European Scleroderma Associations (FESCA aisbl), Brussel, Belgium; Durhane Wong-Rieger, Canadian Organization for Rare Disorders, Toronto, Ontario, Canada; Christian Agard, Centre Hospitalier Universitaire – Hôtel-Dieu de Nantes, Nantes, France; Alexandra Albert, Université Laval, Quebec, Quebec, Canada; Marc André, Centre Hospitalier Universitaire Gabriel-Montpied, Clermont-Ferrand, France; Guylaine Arsenault, Université de Sherbrooke, Sherbrooke, Quebec, Canada; Nouria Benmostefa, Assistance Publique Hôpitaux de Paris – Hôpital Cochin, Paris, France; Ilham Benzidia, Assistance Publique Hôpitaux de Paris – Hôpital St-Louis, Paris, France; Sabine Berthier, Centre Hospitalier Universitaire Dijon Bourgogne, Dijon, France; Lyne Bissonnette, Université de Sherbrooke, Sherbrooke, Quebec, Canada; Gilles Boire, Université de Sherbrooke, Sherbrooke, Quebec, Canada; Alessandra Bruns, Université de Sherbrooke, Sherbrooke, Quebec, Canada; Patricia Carreira, Servicio de Reumatologia del Hospital 12 de Octubre, Madrid, Spain; Marion Casadevall, Assistance Publique Hôpitaux de Paris – Hôpital Cochin, Paris, France; Benjamin Chaigne, Assistance Publique Hôpitaux de Paris – Hôpital Cochin, Paris, France; Lorinda Chung, Stanford University, Stanford, California, USA; Pascal Cohen, Assistance Publique Hôpitaux de Paris – Hôpital Cochin, Paris, France; Pierre Dagenais, Université de Sherbrooke, Sherbrooke, Quebec, Canada; Christopher Denton, Royal Free London Hospital, London, UK; Robyn Domsic, University of Pittsburgh, Pittsburgh, Pennsylvania, USA; Sandrine Dubois, Centre Hospitalier Régional Universitaire de Lille – Hôpital Claude Huriez, Lille, France; James V Dunne, St. Paul’s Hospital and University of British Columbia, Vancouver, British Columbia, Canada; Bertrand Dunogue, Assistance Publique Hôpitaux de Paris – Hôpital Cochin, Paris, France; Alexia Esquinca, Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, Mexico City, Mexico; Regina Fare, Servicio de Reumatologia del Hospital 12 de Octubre, Madrid, Spain; Dominique Farge-Bancel, Assistance Publique Hôpitaux de Paris – Hôpital St-Louis, Paris, France; Paul R Fortin, CHU de Québec – Université Laval, Quebec, Quebec, Canada; Anna Gill, Royal Free London Hospital, London, UK; Jessica Gordon, Hospital for Special Surgery, New York City, New York, USA; Brigitte Granel-Rey, Aix Marseille Université, and Assistance Publique Hôpitaux de Marseille – Hôpital Nord, Marseille, France; Claire Grange, Centre Hospitalier Lyon Sud, Lyon, France; Genevieve Gyger, Jewish General Hospital and McGill University, Montreal, Quebec, Canada; Eric Hachulla, Centre Hospitalier Régional Universitaire de Lille – Hôpital Claude Huriez, Lille, France; Pierre-Yves Hatron, Centre Hospitalier Régional Universitaire de Lille – Hôpital Claude Huriez, Lille, France; Ariane L Herrick, University of Manchester, Salford Royal NHS Foundation Trust, Manchester, UK; Adrian Hij, Assistance Publique Hôpitaux de Paris – Hôpital St-Louis, Paris, France; Monique Hinchcliff, Northwestern University, Chicago, Illinois, USA; Alena Ikic, Université Laval, Quebec, Quebec, Canada; Niall Jones, University of Alberta, Edmonton, Alberta, Canada; Artur Jose de B Fernandes, Université de Sherbrooke, Sherbrooke, Quebec, Canada; Suzanne Kafaja, University of California, Los Angeles, California, USA; Nader Khalidi, McMaster University, Hamilton, Ontario, Canada; Benjamin Korman, Northwestern University, Chicago, Illinois, USA; Marc Lambert, Centre Hospitalier Régional Universitaire de Lille – Hôpital Claude Huriez, Lille, France; David Launay, Centre Hospitalier Régional Universitaire de Lille – Hôpital Claude Huriez, Lille, France; Patrick Liang, Université de Sherbrooke, Sherbrooke, Quebec, Canada; Jonathan London, Assistance Publique Hôpitaux de Paris – Hôpital Cochin, Paris, France; David Luna, Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, Mexico City, Mexico; Hélène Maillard, Centre Hospitalier Régional Universitaire de Lille – Hôpital Claude Huriez, Lille, France; Joanne Manning, Salford Royal NHS Foundation Trust, Salford, UK; Maria Martin, Servicio de Reumatologia del Hospital 12 de Octubre, Madrid, Spain; Thierry Martin, Les Hôpitaux Universitaires de Strasbourg – Nouvel Hôpital Civil, Strasbourg, France; Ariel Masetto, Université de Sherbrooke, Sherbrooke, Quebec, Canada; François Maurier, Hôpitaux Privés de Metz – Hôpital Belle-Isle, Metz, France; Arsene Mekinian, Assistance Publique Hôpitaux de Paris – Hôpital St-Antoine, Paris, France; Sheila Melchor, Servicio de Reumatologia del Hospital 12 de Octubre, Madrid, Spain; Mandana Nikpour, St Vincent’s Hospital and University of Melbourne, Melbourne, Victoria, Australia; Romain Paule, Assistance Publique Hôpitaux de Paris – Hôpital Cochin, Paris, France; Susanna Proudman, Royal Adelaide Hospital and University of Adelaide, Adelaide, South Australia, Australia; Alexis Régent, Assistance Publique Hôpitaux de Paris – Hôpital Cochin, Paris, France; Sébastien Rivière, Assistance Publique Hôpitaux de Paris – Hôpital St-Antoine, Paris, France; David Robinson, University of Manitoba, Winnipeg, Manitoba, Canada; Esther Rodriguez, Servicio de Reumatologia del Hospital 12 de Octubre, Madrid, Spain; Sophie Roux, Université de Sherbrooke, Sherbrooke, Quebec, Canada; Perrine Smets, Centre Hospitalier Universitaire Gabriel-Montpied, Clermont-Ferrand, France; Doug Smith, University of Ottawa, Ottawa, Ontario, Canada; Vincent Sobanski, Centre Hospitalier Régional Universitaire de Lille – Hôpital Claude Huriez, Lille, France; Robert Spiera, Hospital for Special Surgery, New York, New York, USA; Virginia Steen, Georgetown University, Washington, DC, USA; Wendy Stevens, St Vincent’s Hospital and University of Melbourne, Melbourne, Victoria, Australia; Evelyn Sutton, Dalhousie University, Halifax, Nova Scotia, Canada; Benjamin Terrier, Assistance Publique Hôpitaux de Paris – Hôpital Cochin, Paris, France; Carter Thorne, Southlake Regional Health Centre, Newmarket, Ontario, Canada; John Varga, Northwestern University, Chicago, Illinois, USA; Pearce Wilcox, St. Paul’s Hospital and University of British Columbia, Vancouver, British Columbia, Canada; Michelle Wilson, St Vincent’s Hospital and University of Melbourne, Melbourne, Victoria, Australia; Julie Cumin, Jewish General Hospital, Montreal, Quebec, Canada; Rina S Fox, San Diego State University and University of California, San Diego, San Diego, California, USA; Shadi Gholizadeh, San Diego State University and University of California, San Diego, San Diego, California, USA; Lisa R Jewett, Jewish General Hospital and McGill University, Montréal, Québec, Canada; Brooke Levis, Jewish General Hospital and McGill University, Montreal, Quebec, Canada; Mia R Pepin, Jewish General Hospital, Montreal, Quebec, Canada; Kimberly A Turner, Jewish General Hospital, Montreal, Quebec, Canada.

    • Patient consent for publication Not required.