Objectives Several ABILHAND Rasch-built manual ability scales were previously developed for chronic stroke (CS), cerebral palsy (CP), rheumatoid arthritis (RA), systemic sclerosis (SSc) and neuromuscular disorders (NMD). The present study aimed to explore the applicability of a generic manual ability scale unbiased by diagnosis and to study the nature of manual ability across diagnoses.
Design Cross-sectional study.
Setting Outpatient clinic homes (CS, CP, RA), specialised centres (CP), reference centres (CP, NMD) and university hospitals (SSc).
Participants 732 patients from six diagnostic groups: 103 CS adults, 113 CP children, 112 RA adults, 156 SSc adults, 124 NMD children and 124 NMD adults.
Primary and secondary outcome measures Manual ability as measured by the ABILHAND disease-specific questionnaires, diagnosis and nature (ie, uni-manual or bi-manual involvement and proximal or distal joints involvement) of the ABILHAND manual activities.
Results The difficulties of most manual activities were diagnosis dependent. A principal component analysis highlighted that 57% of the variance in the item difficulty between diagnoses was explained by the symmetric or asymmetric nature of the disorders. A generic scale was constructed, from a metric point of view, with 11 items sharing a common difficulty among diagnoses and 41 items displaying a category-specific location (asymmetric: CS, CP; and symmetric: RA, SSc, NMD). This generic scale showed that CP and NMD children had significantly less manual ability than RA patients, who had significantly less manual ability than CS, SSc and NMD adults. However, the generic scale was less discriminative and responsive to small deficits than disease-specific instruments.
Conclusions Our finding that most of the manual item difficulties were disease-dependent emphasises the danger of using generic scales without prior investigation of item invariance across diagnostic groups. Nevertheless, a generic manual ability scale could be developed by adjusting and accounting for activities perceived differently in various disorders.
This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode.
To explore the applicability of a generic ABILHAND manual ability scale unbiased by diagnosis across various clinical populations.
To analyse prior data from cross-sectional studies that developed disease-specific manual ability questionnaires in order to investigate the co-calibration of patient-perceived item difficulty on a common metric.
To better understand the nature of the measured variable, namely, manual ability.
The difficulty of most manual activities was diagnosis-dependent, emphasising the danger of using generic scales without prior investigation of item invariance across diagnostic groups.
The vast majority (85%) of the difficulty variations observed in manual activities across diagnostic groups was explained by (1) the symmetric or asymmetric nature of the disorder (57% of the variance) and (2) the proximal or distal nature of the disorder (28% of the variance).
Although less sensitive than diagnosis-specific scales, a generic manual ability scale could be developed by adjusting and accounting for activities perceived differently in various disorders, which allows quantitative comparisons of manual ability between diagnostic groups.
Strengths and limitations of this study
Our study explores a large set of data (732 patients) spread out evenly over six diagnostic groups (stroke adults, cerebral palsy children, adults with rheumatoid arthritis, adults with systemic sclerosis, children and adults with neuromuscular disorders).
Our study proposes an original methodology (combining differential item functioning tests, principal component analysis and manual activities categorisation) that investigates the factors contributing to the hierarchy of manual item difficulty observed across diagnoses allowing the nature of manual ability to be better understood.
One fundamental goal of rehabilitation is to improve the subjects’ ability to manage the daily activities necessary for autonomous living.1 Such an ability belongs to the domain of latent variables concealed within the person, such as pain or intelligence. It cannot be observed directly, but it can be inferred from the subject's perceived difficulty in performing activities, also called items, using patient self-reported questionnaires. Over the past decade, questionnaires have therefore become widely used as outcome measures in clinical trials2 and rating scale data are becoming integral to patient care, prescribing and policymaking. It is essential that functional rating scales provide scientifically robust and clinically meaningful results to ensure appropriate interpretations and decision-making regarding disease effects, clinical implications, treatment, health policies and resource allocation. Unfortunately, most rating scales generate ordinal data by summating scores assigned to a set of items representing the intended variable, and metric properties of raw ordinal scores are known to have limited validity.3 ,4 In view of this limitation, the Rasch model5 is becoming increasingly popular for health measurements because it enables the direct transformation from ordinal scores to linear measures with a constant unit.
Over the last 20 years, our research group has developed several manual ability rating scales (known under the umbrella term of ABILHAND questionnaires) by applying the Rasch model to various diagnostic groups. ABILHAND scales are self-administered questionnaires that measure ‘manual ability’, which is defined as, ‘the capacity to manage daily activities requiring the use of the upper limbs, whatever the strategies involved’.6 Disease-specific manual ability ‘rulers’ were previously developed for the following patient groups: chronic stroke (CS),6 cerebral palsy (CP),7 rheumatoid arthritis (RA),8 systemic sclerosis (SSc)9 and neuromuscular disorders (NMD).10 Each ABILHAND scale has its own Rasch-derived item difficulty calibration, which defines a disease-specific manual ability measurement continuum. ABILHAND questionnaires present good psychometric qualities, including linearity, unidimensionality, construct validity, and test–retest reliability.
Disease-specific scales, which are highly sensitive and detect small, yet clinically important changes, are frequently used in research because they ensure comprehensive assessment of health aspects directly related to the condition.11 ,12 In contrast, generic scales enable comparisons of various diagnoses and healthcare interventions, which may provide useful data for health policies, cost-effective analyses and resource allocation.11 ,12 They best meet rehabilitation requirements when disability treatment is not dependent on a specific underlying diagnosis.13 For instance, as a single bathroom scale can be used to weigh all patients, a generic manual ability scale would enable quantitative comparisons of the ability to use the upper limbs in daily activities across patients of various diagnoses (and also with healthy subjects).
From a metric point of view, it is possible to co-calibrate various disease-specific ABILHAND questionnaire items on the same scale, provided that the scales are based on an identical unidimensional construct.14 In theory, and similar to the graduations of a metric ruler, items should have the same difficulty for all diagnostic groups, regardless of the disease being measured. Indeed, the main implicit assumption made by the users of generic scales is that the difficulties of daily activities are invariant across diagnoses. However, in practice, the item difficulty hierarchy may vary across groups, demonstrating differential item functioning (DIF).15 The Rasch model can be used to test the invariance of the item difficulty hierarchy and to accommodate DIF.16 When the items of a generic scale are unstable across diagnoses, the measurements generated by them cannot be used to make meaningful comparisons.
The present study explored the applicability of a generic ABILHAND manual ability scale, which is unbiased by diagnoses, across various clinical populations. Setting out this objective, we also intended to improve the current understanding of the nature of manual ability and especially its interaction with diagnosis. We analysed prior data from cross-sectional studies that developed disease-specific manual ability questionnaires in order to investigate the co-calibration of patient perceived item difficulty on a common metric.
Data from 732 subjects, who previously provided informed consent, were analysed. Patients with the following disorders were evaluated: 103 CS adults,6 113 CP children,7 112 RA adults,8 156 SSc adults,9 124 NMD children (NMDc) and 124 NMD adults (NMDa).10 Table 1 provides patient characteristics. The ethics committee of the Université catholique de Louvain, Faculty of Medicine in Brussels, Belgium, authorised and approved the study.
Manual ability measure
Original data included 83 manual activities shared by at least two diagnostic groups (the 83 items are provided in the supplementary table). Original items covered different domains of daily living such as feeding, grooming or dressing and were selected in previous studies based on literature review, and patient and expert interviews. Twelve items were child-specific (eg, ‘throwing a ball’), 19 were adult-specific (eg, ‘hammering a nail’) and 52 were common to both groups (eg, ‘buttoning up trousers’). Adult patients and children's parents provided their perceived difficulty in performing each activity based on a three-level response scale: impossible (0), difficult (1) or easy (2). Each activity had to be completed without technical or human assistance and irrespective of the limb(s) and adaptive strategies used. Missing values were included when a given diagnostic group did not provide responses for a particular item, as the activity may not have been submitted to a group. The nature of the items was assessed by 10 occupational or physical therapists according to the following criteria: unimanual or bimanual involvement required to perform the activity, and involvement of proximal or distal joints.
All responses were analysed with RUMM2020, a Rasch analysis computer program. The Rasch model5 can be used to estimate, on a single manual ability construct, the location of each patient, that is, their manual ability, the location of each item, that is, the difficulty of the manual activities, and the location of each threshold between successive categories of the response scale, that is, the locations along the latent construct at which two successive categories are equally likely to be observed. The model can be used to verify that successive response categories for each item represent increasing levels of ability and that thresholds between successive response categories are located in the anticipated order.17
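The relationship between patient location, item location and thresholds can be made concrete with a minimal sketch of a polytomous Rasch model for the three-level response scale (impossible, difficult, easy). This is an illustration under stated assumptions, not RUMM2020's implementation: the function names and all numeric values are hypothetical, and thresholds are expressed relative to the item difficulty.

```python
import math

def category_probabilities(ability, difficulty, thresholds):
    """Probability of each response category (0, 1, 2, ...) for one person-item pair.

    Category x has weight exp(sum over k <= x of (ability - difficulty - threshold_k));
    category 0 has an empty sum, so its weight is exp(0) = 1.
    """
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + ability - difficulty - tau)
    weights = [math.exp(v) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def thresholds_ordered(thresholds):
    """True when successive thresholds are located in increasing order."""
    return all(a < b for a, b in zip(thresholds, thresholds[1:]))

# Hypothetical item: difficulty 0 logits, thresholds at -1 and +1 logits around it.
# A person located exactly at the first threshold (-1 logit) is equally likely
# to answer 'impossible' (0) or 'difficult' (1).
probs = category_probabilities(ability=-1.0, difficulty=0.0, thresholds=[-1.0, 1.0])
print([round(p, 3) for p in probs])
print(thresholds_ordered([-1.0, 1.0]))
```

In this parameterisation, an item whose thresholds are not in increasing order would be flagged as having disordered thresholds.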
The model also requires that the probability of endorsing any response category to an item depends solely on the subject's ability, the item difficulty and the location of the threshold between adjacent response categories. In the case of manual ability measurement, no attribute of the person—such as diagnosis—besides manual ability is theorised to account for the probability of choosing a given response to a given item. The similarity between the observed and expected responses can be investigated using a χ2 fit statistic computed over five class intervals (CI) of patients with increasing ability.18 Items with a p value lower than 0.05 indicate a threat to the fit requirement.
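The comparison of observed and expected responses over class intervals can be sketched as follows. This is one common formulation of an item χ² statistic (squared difference between observed and expected score totals in each class interval, divided by the model variance), offered as an assumption; the exact computation performed by RUMM2020 is not described in the text. All data are hypothetical, and three class intervals are used instead of five to keep the example short. The p value, which requires a χ² distribution with (number of class intervals minus 1) degrees of freedom, is not computed here.

```python
import math

def category_probabilities(ability, difficulty, thresholds):
    """Polytomous Rasch category probabilities (thresholds relative to difficulty)."""
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + ability - difficulty - tau)
    weights = [math.exp(v) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def expected_and_variance(ability, difficulty, thresholds):
    """Model-expected score and its variance for one person on one item."""
    probs = category_probabilities(ability, difficulty, thresholds)
    expected = sum(x * p for x, p in enumerate(probs))
    variance = sum((x - expected) ** 2 * p for x, p in enumerate(probs))
    return expected, variance

def item_chi_square(class_intervals, difficulty, thresholds):
    """class_intervals: list of lists of (ability, observed_score) pairs."""
    chi_square = 0.0
    for interval in class_intervals:
        observed = sum(score for _, score in interval)
        moments = [expected_and_variance(a, difficulty, thresholds) for a, _ in interval]
        expected = sum(e for e, _ in moments)
        variance = sum(v for _, v in moments)
        chi_square += (observed - expected) ** 2 / variance
    return chi_square

# Hypothetical patients grouped into class intervals of increasing ability.
intervals = [
    [(-1.5, 0), (-1.0, 1), (-0.8, 0)],
    [(0.0, 1), (0.2, 1), (0.5, 2)],
    [(1.5, 2), (2.0, 2), (2.4, 1)],
]
chi2 = item_chi_square(intervals, difficulty=0.0, thresholds=[-1.0, 1.0])
print(round(chi2, 3))
```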
Invariance of the item difficulty hierarchy
Unidimensionality also requires that patients with identical ability, but different diagnoses, have the same probability of succeeding on any particular item. Consequently, the invariance of item difficulties across patient diagnostic groups must be checked using DIF tests.15 To investigate the invariance of item difficulty hierarchy, a two-way analysis of variance (ANOVA) was computed on the standardised residuals of the different CIs;19 ,20 the first factor was the diagnostic group and the second factor was the CI of increasing manual ability. Significant diagnostic main effects represented group differences in item difficulty hierarchy. One solution to the presence of DIF by diagnosis is the removal of items showing difficulty variations. Another solution is to allow for the variations that exist across DIF items by splitting them into disease-specific items, one for each diagnosis, with a difficulty peculiar to the corresponding diagnosis.16 In this case, the different diagnostic groups can be compared on the same continuum even if they have specific items, provided that there are common linking items unbiased by DIF.
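The diagnostic main effect of the two-way ANOVA described above can be illustrated with a simplified sketch. It assumes a balanced design (the same number of standardised residuals in every diagnosis-by-class-interval cell) and returns only the F ratio for the diagnostic group factor; the unbalanced case and the p value are left out, and the procedure actually implemented in RUMM2020 may differ. The function name and the residuals are hypothetical.

```python
from statistics import mean

def dif_main_effect_f(residuals):
    """F ratio for the diagnostic-group main effect in a balanced two-way ANOVA.

    residuals maps (group, class_interval) to a list of standardised residuals
    for one item; every cell must hold the same number of values (at least two).
    """
    groups = sorted({g for g, _ in residuals})
    intervals = sorted({c for _, c in residuals})
    n = len(next(iter(residuals.values())))
    if any(len(v) != n for v in residuals.values()):
        raise ValueError("this sketch assumes the same number of residuals in every cell")
    grand = mean(x for v in residuals.values() for x in v)
    # Sum of squares between diagnostic groups (main effect of diagnosis).
    ss_group = 0.0
    for g in groups:
        values = [x for c in intervals for x in residuals[(g, c)]]
        ss_group += len(values) * (mean(values) - grand) ** 2
    # Sum of squares within cells (error term).
    ss_error = sum((x - mean(v)) ** 2 for v in residuals.values() for x in v)
    df_group = len(groups) - 1
    df_error = len(groups) * len(intervals) * (n - 1)
    return (ss_group / df_group) / (ss_error / df_error)

# Hypothetical residuals for one bimanual item: positive residuals in one group
# and negative residuals in the other suggest DIF by diagnosis.
example = {
    ("asymmetric", 1): [0.9, 1.1], ("asymmetric", 2): [1.2, 0.8],
    ("symmetric", 1): [-1.1, -0.9], ("symmetric", 2): [-0.8, -1.2],
}
f_stat = dif_main_effect_f(example)
print(round(f_stat, 1))
```

A large F ratio for the diagnostic group factor indicates that, at equal manual ability, the groups do not respond to the item in the same way.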
Two different approaches can be used to combine data from different scales answered by different samples. ‘Co-calibration’, also called ‘concurrent equating’, merges all items together as one scale with empty spaces for missing values. The ‘anchoring’ approach anchors items that are common to all diagnoses and then includes diagnosis-specific items in the same frame of reference. The anchoring approach requires that the common linking items be free of DIF,21–23 which was not the case in our dataset. Therefore, the co-calibration approach, also applied in previous rehabilitation studies,24–27 was followed; the analysis process is illustrated in figure 1. The first step in the data analysis was to co-calibrate the ABILHAND data of all diagnostic groups by analysing all responses (n=732) to the 83 items. The second step was to remove items with disordered thresholds and items that misfit a unidimensional variable (ie, presenting a χ² p value <0.05). In the third step, the invariance of the item difficulty hierarchy across diagnostic groups was tested through DIF tests. The fourth step consisted of splitting the items presenting a DIF by diagnosis, providing one specific item for each diagnostic group that answered the item.16 In the fifth step, a principal component analysis (PCA) was performed to identify the potential factors explaining the variations in item difficulty hierarchy observed across the diagnostic groups. The PCA was performed on the differences between the item difficulty specific to each diagnostic group and the average item difficulty for all diagnoses, as these differences reflect disease-specific patterns of item difficulty. In the sixth step, the items presenting a DIF among diagnoses (detected in the third step) were split into two main groups: asymmetric disorders (CS and CP) and symmetric disorders (RA, SSc and NMD).
Finally, the seventh step included successive analyses performed to remove items with disordered thresholds, misfitting items and items presenting a DIF by diagnosis for a reason other than the symmetric/asymmetric nature of the disorders. A generic co-calibrated scale was thus created, and manual ability was compared among diagnostic groups using a Kruskal-Wallis ANOVA on ranks and Dunn's method for pairwise multiple comparisons. Lastly, the metric properties of the generic scale were compared with those of the ABILHAND disease-specific scales.
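The rank-based comparison of manual ability among diagnostic groups can be sketched with a plain Kruskal-Wallis H statistic. The code below is a minimal illustration using average ranks and the usual tie correction; it does not compute the p value, and Dunn's pairwise comparisons are not implemented. The function name and the patient measures (in logits) are hypothetical.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with average ranks for ties and tie correction."""
    pooled = sorted((value, gi) for gi, values in enumerate(groups) for value in values)
    n = len(pooled)
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of the ranks i+1 .. j shared by tied values
        for k in range(i, j):
            ranks[k] = average_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sums = [0.0] * len(groups)
    for (_, gi), rank in zip(pooled, ranks):
        rank_sums[gi] += rank
    h = 12 / (n * (n + 1)) * sum(rs ** 2 / len(g) for rs, g in zip(rank_sums, groups)) - 3 * (n + 1)
    return h / (1 - tie_term / (n ** 3 - n))

# Hypothetical manual ability measures (logits) for three groups of patients.
children = [-0.5, 0.2, 0.9]
ra_adults = [1.1, 1.8, 2.4]
other_adults = [2.9, 3.3, 4.0]
h_stat = kruskal_wallis_h([children, ra_adults, other_adults])
print(round(h_stat, 2))
```

With k groups, H is usually referred to a χ² distribution with k-1 degrees of freedom; a significant result is then followed by pairwise comparisons such as Dunn's method.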
Invariance of the item difficulty hierarchy
Thirty-two of the initial 83 items were deleted because they violated the unidimensionality requirement. Assessment of invariance of the remaining 51 unidimensional items showed that 13 items shared a common location between diagnostic groups. Thirty-eight items presented a DIF and were split into a total of 152 items with diagnosis-specific locations. Differences between the item difficulty specific to each diagnostic group and the mean item difficulty for all diagnoses were computed to identify disease-specific patterns of item difficulty. Positive values indicated that the items were more difficult for a particular diagnosis than average, while negative values indicated that they were easier than average (figure 2). With respect to disease-specific item difficulties, bimanual activities, such as ‘spreading butter on a slice of bread’, presented a greater challenge for patients with asymmetric disorders (CS and CP) than for patients with symmetric diagnoses (RA, SSc, NMDc and NMDa). Conversely, unimanual activities, such as ‘turning off a tap’, were perceived as easier by patients with asymmetric disorders. About 85% of the DIF items were related to the unimanual or bimanual nature of the activities.
In addition, we found that proximal activities, such as ‘ringing a door bell’, were categorised as more difficult for NMD, CP and CS patients compared with RA and SSc patients. In contrast, digital activities, such as ‘counting banknotes’, presented the greatest challenge for SSc subjects who primarily had a distal impairment. Approximately one-third of the DIF items were concerned with the proximal or distal nature of the activities. It should be noted that some items fit both criteria. Moreover, DIF activities were related, to a lesser extent, to other factors such as age (about 30% of the items) or mechanical constraints induced in the upper limb joints (about 10–15% of the items).
PCA on diagnosis-specific-to-average item difficulty differences
PCA results showed that 57% of the variation of item difficulty hierarchy between diagnostic groups was explained by the symmetric or asymmetric nature of the disorders (figure 3). Indeed, CS adults and CP children were located at one extremity of the first PCA component while symmetric disorders were located at the other extremity. The second PCA component explained 28% of the variation of item difficulty hierarchy between diagnostic groups and distinguished patients expressing greater difficulties with proximal activities, such as NMD and CP, from more distal disorders such as SSc.
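The PCA on diagnosis-specific-to-average item difficulty differences can be illustrated with a small standard-library sketch. It assumes a table with items as rows and diagnostic groups as columns, subtracts each item's average difficulty, and extracts only the first principal component of the between-group covariance matrix by power iteration (a simplification; a full eigen-decomposition would give all components). The function names and the difficulty values are hypothetical and do not reproduce the study data.

```python
def center_rows(difficulties):
    """Subtract each item's mean difficulty across groups (rows = items, columns = groups)."""
    return [[x - sum(row) / len(row) for x in row] for row in difficulties]

def covariance(matrix):
    """Covariance matrix between columns (diagnostic groups)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in matrix) / (n_rows - 1)
             for j in range(n_cols)] for i in range(n_cols)]

def first_component(cov, iterations=500):
    """Power iteration: loadings of the first component and its share of total variance."""
    size = len(cov)
    vec = [1.0 / (k + 1) for k in range(size)]  # arbitrary starting vector
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(size)) for i in range(size)]
        norm = sum(v * v for v in nxt) ** 0.5
        vec = [v / norm for v in nxt]
    eigenvalue = sum(vec[i] * cov[i][j] * vec[j] for i in range(size) for j in range(size))
    trace = sum(cov[i][i] for i in range(size))
    return vec, eigenvalue / trace

# Hypothetical item difficulties (logits); columns: CS, CP, RA, SSc.
# The first two items behave like bimanual activities, the last two like unimanual ones.
difficulties = [
    [1.0, 0.8, -0.9, -0.9],
    [0.6, 0.7, -0.5, -0.8],
    [-0.7, -0.9, 0.8, 0.8],
    [-0.5, -0.6, 0.4, 0.7],
]
loadings, share = first_component(covariance(center_rows(difficulties)))
print([round(v, 2) for v in loadings], round(share, 2))
```

In this toy table the first component opposes the two asymmetric groups to the two symmetric ones, which mirrors the kind of separation reported for the first PCA component.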
A ‘generic’ ABILHAND manual ability scale
Based on PCA results, the DIF items were split into two main groups: asymmetric (CS and CP) and symmetric (RA, SSc and NMD) disorders. When the 13 items sharing a common location between diagnostic groups were co-calibrated with the 38 DIF items split into a total of 75 items (one item was responded neither by CS nor CP subjects) with locations specific to either asymmetric or symmetric disorders, 2 items with disordered thresholds, 7 misfitting items and 27 remaining DIF items were removed. The resulting 52-item generic scale included 11 items sharing a common location between diagnostic groups and 41 items with locations specific to asymmetric (27 items) or symmetric (14 items) disorders. The 52 items are listed in table 2 in the order of decreasing difficulty (range: 3.60–3.93 logits).
Hand involvement, whether particular groups responded to each item, and item difficulty with SEs (mean: 0.2 logits; range: 0.09–0.56 logits) are also reported. It should be noted that only one diagnostic group responded to 21 items (40%) of the generic scale, while two or three diagnostic groups responded to as many as 12 items (23%). All diagnostic groups responded to the item ‘fastening a snap (eg, jacket and bag)’. The person separation reliability of the generic scale was 0.93, indicating that 5.19 strata of manual ability can be distinguished in our sample. The average measure of the entire sample was 2.34 logits indicating that the patients’ ability level exceeded the scale average difficulty.
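The link between the person separation reliability and the number of strata can be checked with a short calculation. The formulas used here are the conventional ones for Rasch measurement (separation index G = sqrt(R / (1 - R)) and strata = (4G + 1) / 3); the text does not state them explicitly, so they are given as an assumption, although they reproduce the reported value.

```python
import math

def separation_index(reliability):
    """Person separation index G from a separation reliability R."""
    return math.sqrt(reliability / (1.0 - reliability))

def strata(reliability):
    """Number of statistically distinct ability strata, (4G + 1) / 3."""
    return (4.0 * separation_index(reliability) + 1.0) / 3.0

print(round(separation_index(0.93), 2))  # 3.64
print(round(strata(0.93), 2))            # 5.19
```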
Manual ability across diagnostic groups
Figure 4 shows the distribution of manual ability across the six diagnostic groups. Significant differences in manual ability measures were observed among diagnoses (p<0.001). The CP and NMDc groups had significantly less manual ability than the RA group, who, in turn, had less manual ability than CS, SSc and NMDa patients (p<0.05, Dunn's pairwise comparisons).
Comparison of the generic and disease-specific scales
As reported in table 3, SEs of patient locations were greater for the generic scale than for disease-specific scales, and a smaller range of patient measures was observed in the generic scale. The generic scale is globally less accurate than the disease-specific scales leading to a higher number of extreme persons. In addition, table 3 shows that manual ability measures of generic and disease-specific scales were highly correlated (range: 0.94–0.97).
The present study investigated the applicability of a generic manual ability scale unbiased by diagnosis across six populations. We analysed previous subject responses gathered during calibrations of disease-specific ABILHAND questionnaires, and we examined similarities and differences in manual ability among diagnostic groups. A unidimensional scale was constructed with 11 items sharing a common location between diagnostic groups and 41 items having a location specific to asymmetric (CS and CP) or symmetric (NMD, RA and SSc) disorders. The resulting generic scale revealed that CP and NMD children had significantly less manual ability than RA patients, who, in turn, had significantly less manual ability than CS, SSc and NMD adults.
A generic manual ability scale should best meet the requirements of upper-limb rehabilitation, insofar as a common instrument with a diagnosis-independent calibration can be used across clinical settings. Of course, the use of a generic scale assumes that individuals achieving identical activities have the same manual ability level regardless of their diagnosis. However, this assumption may not hold true in clinical practice. In our study, we found that only 11 of 52 items had difficulties unbiased by diagnosis, indicating that individuals’ underlying diseases may bias the perceived difficulty of manual activities. In a test containing 10 items or more answered by at least 100 subjects per diagnostic group, a DIF of 1 logit, namely, the approximate amplitude of DIF observed for the items split between symmetric and asymmetric disorders (see figure 2), can be detected at a significance level of 0.05 with a power of 95% or more.28 This indicates that the power to detect DIF in our study was more than adequate considering the study set-up (ie, test length, sample size and significance level). Our results differ from those of Simone et al27 who found that the 23 CS-specific ABILHAND item scale ‘can be routinely applied to a variety of motor impairments’. These authors argue that the item hierarchy can be successfully preserved across diagnoses. Using our patient responses, we conducted a comparable analysis on the same 23 items from the CS-specific ABILHAND scale as Simone et al.27 Our findings showed that 21 items (91%) presented a significant DIF, which contrasts with the apparent invariance reported by Simone et al.27 Two factors may contribute to the observed differences in results: sample size and case mix.
Our sample included 732 patients, which is considerably more than the 150 subjects in the Simone et al study.27 In addition, the unbalanced case mix in the Simone et al27 project (83 CS, 17 multiple sclerosis, 13 ataxia, 10 tetraplegics, 3 Parkinson's disease and 24 healthy controls) may have concealed possible disease influences on difficulty ratings.
An explicit construct theory initiated the development of disease-specific ABILHAND scales. For each diagnosis, the scale content was selected to delineate a single unidimensional construct, correlated to the patients’ functional, clinical and demographic characteristics.6–10 The nature of the measured variable, namely, manual ability, can be determined by investigating the factors contributing to the hierarchy of manual item difficulty that is observed across diagnoses. To address this issue, we developed an original methodology that combines DIF tests, PCA and the categorisation of manual activities according to their nature. Although an activity is expressed in the same way for all patients, its perceived difficulty may vary according to one's disease or disorder and the specificity of underlying motor impairments. Several studies have also shown that manual ability limitations are, at least partially, related to underlying upper limb impairments.6 ,29 Hence, it is not surprising that disease characteristics contribute to the difficulties experienced in performing manual activities.
The PCA results suggest that the vast majority (85%) of the difficulty variations observed in manual activities across diagnostic groups was explained by two characteristics: (1) the symmetric or asymmetric nature of the disorder (57% of the item difficulty hierarchy variations observed across disorders) and (2) the proximal or distal nature of the disorder (28% of the item difficulty variations). For example, activities requiring greater bimanual involvement (eg, ‘peeling potatoes with a knife’) tended to be rated as more difficult by patients with asymmetric disorders (CP children and CS adults) than by patients with more symmetric disorders (RA, SSc, NMDc and NMDa). On the other hand, unimanual activities (eg, ‘turning on a television’) or bimanual activities manageable in several unimanual steps (eg, ‘handling a stapler’) were rated as less difficult for patients with asymmetric disorders, probably because these activities can be achieved by exclusively using the unaffected or less affected hand.7 ,30 Activities involving the shoulder (eg, ‘drinking a glass of water’) were generally more difficult for NMD and CP patients. Indeed, the NMD groups included several diseases in which proximal segments were more likely to be affected than distal ones (eg, Duchenne/limb girdle muscular dystrophy, facio-scapulo-humeral dystrophy and spinal muscular atrophy).10 Moreover, and in contrast to other diagnoses, NMD and CP groups included subjects in a wheelchair, which may prevent the achievement of activities such as, ‘ringing a door bell’ or ‘replacing a light bulb’. In contrast, digital activities (eg, ‘winding up a wristwatch’) were particularly difficult for SSc subjects, who have reduced digital dexterity.9 Other characteristics of the diseases than their symmetric/asymmetric or proximal/digital nature may explain, even though to a lesser extent, the variations of item difficulty hierarchy between disorders. 
Activities inducing high mechanical constraints on the upper limb joints (eg, ‘screwing on a nut’) presented the greatest challenge for RA patients owing to wrist and metacarpophalangeal joint involvement.8 Similar to a previous study,31 activities related to dressing (eg, ‘fastening the zipper of a jacket’) and self-care (eg, ‘cutting one's nails’), as well as activities requiring a turning movement (eg, ‘turning on/off a tap’), were more challenging for children than for adults. Parents of unhealthy children may inhibit some activities to prevent risk (eg, ‘cutting one's nails’) or save time (eg, dressing items).32 Activities related to eating (eg, ‘unwrapping a chocolate bar’) were easier for children than for adults. It can be hypothesised that children are more motivated to compensate for their hand impairments by learning adapted strategies (such as breaking down a bimanual activity into several unimanual sequences) for eating activities than for dressing or self-care tasks.29 It is also important to note that several activities presented a DIF for more than one reason.
Nevertheless, using 11 linking items unbiased by diagnosis, we successfully constructed, from a metric point of view, a unidimensional scale common to six diagnostic groups by separating items with difficulties specific to asymmetric and to symmetric disorders. In our study, the SEs of the item estimates on the generic ABILHAND scale ranged from 0.09 to 0.56 logits (average: 0.20 logits) and corresponded to the expected values given the sample size and targeting.33 The strong correlations (R≥0.94) observed between the generic scale and each of the disease-specific ABILHAND scales indicate that they measure the same construct, namely, manual ability. However, disease-specific scales, which often include a greater number of disease-relevant activities, enable more accurate measures (ie, patient estimates have lower SEs) than the generic scale. This is most likely due to the fact that disease-specific scales have been constructed to maximise their person separation reliability and, therefore, also their accuracy. Overall, our findings are consistent with several studies showing that disease-specific instruments are substantially more discriminative and responsive to small deficits than generic instruments.34 ,35 Consequently, this increased sensitivity allows for the detection and quantification of small, yet clinically significant health changes.11 ,12 For example, ABILHAND disease-specific scales should be used to determine pathology impacts on manual ability, to measure clinical changes following specific treatments and to tailor interventions to the specific needs of individuals with a particular diagnosis. All these concerns are important for patients and clinicians in their daily practice.
In contrast, the generic ABILHAND scale allows the manual ability of patients with different diagnoses to be compared and can be used, for example, to identify the relative burden of diagnoses, compare various healthcare programmes and demonstrate evidence of cost-effectiveness of different healthcare interventions.11 ,12 ,36 According to this generic scale, children had, on the whole, less manual ability than adults. This finding is consistent with previous results31 ,37 showing that children have relatively greater difficulty with manipulation activities than adults.
The generic ABILHAND scale includes 52 items: 11 items sharing a common location between diagnostic groups and 41 items having a location specific to asymmetric or symmetric disorders. The 11 common items were used to establish links that connect the 41 items specific to the symmetry of the disorders to place all measures in the same frame of reference (ie, on the same ‘ruler’). From a metric point of view, the common-item linking has enabled the development of a generic scale that can be used to compare subjects with various diagnoses since they are located on one single continuum. However, only one-fifth of the items of the ‘generic’ scale are common to several diagnoses. From a clinical point of view, this means that most manual activities present a difficulty that varies according to the underlying diagnosis and that various pathologies may affect differently the achievement of daily activities. The finding that the difficulties of most manual activities were disease-dependent emphasises the danger of using generic scales without prior investigation of item invariance across diagnostic groups.
Contributors CA performed the statistical analyses, conducted the literature search and drafted the manuscript. CA, LV and MP participated in the data collection. CA, LV, MP and JLT contributed to the study design and the data analysis. All authors participated in the data interpretation, critically revised the draft of the manuscript for important intellectual content and contributed to the writing. All authors have read and approved the final manuscript.
Competing interests None.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement Dataset is available from the corresponding author at firstname.lastname@example.org.