Objective To study the psychometric characteristics of the German version of the Hospital Survey on Patient Safety Culture and to compare its dimensionality to other language versions in order to understand the instrument’s potential for cross-national studies.
Design Cross-sectional multicentre study to establish psychometric properties of the German version of the survey instrument.
Setting 73 units from 37 departments of two German university hospitals.
Participants Clinical personnel (n=995 responses, response rate 39.6%).
Primary and secondary outcome measures Psychometric properties (eg, model fit, internal consistency, construct validity) of the instrument and comparison of dimensionality across different language translations.
Results The instrument demonstrated acceptable to good internal consistency (Cronbach’s alpha 0.64–0.88). Confirmatory factor analysis of the original 12-factor model resulted in marginally satisfactory model fit (root mean square error of approximation (RMSEA)=0.05; standardised root mean residual (SRMR)=0.05; comparative fit index (CFI)=0.90; goodness of fit index (GFI)=0.88; Tucker-Lewis Index (TLI)=0.88). Exploratory factor analysis resulted in an alternative eight-factor model with good model fit (RMSEA=0.05; SRMR=0.05; CFI=0.95; GFI=0.91; TLI=0.94) and good internal consistency (Cronbach’s alpha 0.73–0.87) and construct validity. Analysis of the dimensionality compared with models from 10 other language versions revealed eight dimensions with relatively stable composition and appearance across different versions and four dimensions requiring further improvement.
Conclusions The German version of Hospital Survey on Patient Safety Culture demonstrated satisfactory psychometric properties for use in German hospitals. However, our comparison of instrument dimensionality across different language versions indicates limitations concerning cross-national studies. Results of this study can be considered in interpreting findings across national contexts, in further refinement of the instrument for cross-national studies and in better understanding the various facets and dimensions of patient safety culture.
- quality in health care
- international health services
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Strengths and limitations of this study
Our study supports the development of a more uniform factor structure for the Hospital Survey on Patient Safety Culture across language versions in order to facilitate its use in cross-national research.
By evaluating commonalities and variations in different language versions of the Hospital Survey on Patient Safety Culture, we identify relatively stable factors, as well as those in need of improvement.
This is the first study to validate the German version of the Hospital Survey on Patient Safety Culture for clinical personnel.
The considerable diversity in study methodology and reporting of studies with different language versions of the Hospital Survey on Patient Safety Culture presents an obstacle for cross-national use of the instrument that has yet to be overcome.
All healthcare organisations face specific sets of risks and challenges regarding patient safety. These challenges change dynamically over time, reflecting developments within the organisation as well as in its operating environment such as changes in demographics and epidemiology or in patient behaviour. To effectively manage these challenges, it is recommended for healthcare organisations to develop a culture of safety that prioritises safety and organisational learning among other organisational goals.1 Safety culture is generally considered to be a relatively stable construct, rooted in organisational culture.2
A number of instruments for measuring safety culture in healthcare organisations have been developed. These instruments enable researchers and decision makers to evaluate and compare results on different levels of the healthcare system.3 Comparing results across units and hospitals and establishing benchmarks can drive continuous patient safety improvement. One of the most widely used instruments for evaluating healthcare providers’ perception of safety culture in hospital setting is the Hospital Survey on Patient Safety Culture (HSPSC).4 The instrument has been translated into many languages and used in different countries around the world.5–16
There are two gaps that this study aims to address. First, so far, no German version of HSPSC has been validated for healthcare personnel in Germany. Second, despite some attempts at comparing safety culture at the international level,17 18 the comparability of the different language versions of the instrument has not been studied systematically. While satisfactory psychometric properties were reported for the original North-American version4 with 12 dimensions of patient safety culture, alternative factor structures have been reported for other language versions, with the number of dimensions ranging from 8 to 12.5–7 9–12 14–16 Because an instrument’s dimensionality determines the interpretation of results, similarities and differences in dimensionality across different language versions should be considered for cross-national studies of patient safety culture.
Therefore, the aim of this study is twofold: (1) validation of the German version of HSPSC (HSPSC-D) by evaluation of its psychometric properties and (2) evaluation of the instrument’s potential for cross-national studies, by comparative analysis of the instrument’s dimensionality as reported for different language versions.
This study was based on data from the cross-sectional, multicentre study ‘Working conditions, safety culture and patient safety in hospitals: what predicts the safety of the medication process (WorkSafeMed),’ conducted between 2014 and 2017. In this article, we focus on HSPSC-D data to evaluate its psychometric properties. The WorkSafeMed study with all its components has been approved by the responsible ethics committees of the medical faculties of the project partners in Bonn (#350/14) and Tubingen (#547/2014BO1). Each partner complied with confidentiality requirements according to German law.
Safety culture data were collected in two German university hospitals from April to July 2015. We included staff from inpatient units with ≥500 patients a year. Intensive care and psychiatric units were excluded. Across the two hospitals, a total of 73 units from 37 departments participated in the study. The HSPSC-D questionnaire was distributed to 2512 healthcare professionals. All participants received an initial invitation to participate in the study, followed by two reminders. Study material included all required information regarding the study and data handling. Participation in the study was anonymous, and participants’ consent was implied by returning completed questionnaires. Non-responder analysis was not performed.
In order to develop a version of the HSPSC for German healthcare professionals (HSPSC-D), we used two previous German language versions as a starting point. A first translation of the HSPSC for hospital staff in the German speaking part of Switzerland7 had been culturally and linguistically adapted for use in Swiss hospitals. Hammer et al.19 used the Swiss version as a starting point for developing a management version of HSPSC to study perceptions of safety culture among medical directors in German hospitals. In our study, the instrument was adapted to be used with healthcare personnel in German hospitals.
The resulting HSPSC-D questionnaire follows the structure of the original North-American version4 and includes 44 items, 42 of which compose 12 dimensions (10 safety culture dimensions and 2 outcome dimensions). These 42 items use a five-point Likert scale to measure agreement ranging from ‘strongly disagree’ (1) to ‘strongly agree’ (5) or frequency ranging from ‘never’ (1) to ‘always’ (5). The remaining two single item measures are ‘Number of events reported’ (measured on six frequency groups from ‘No event reports’ to ‘21 event reports or more’) and ‘Patient safety grade’ (measured on five-point scale from ‘Failing’ to ‘Excellent’).
Data processing and preliminary analysis
After excluding responses with more than 30% missing values in HSPSC-D items, we conducted multiple imputations based on the expectation maximisation (EM) algorithm using the statistical software NORM V.2.0320 21 to replace remaining missing values. Negatively worded items were reverse coded before further analysis.
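The exclusion and reverse-coding steps described above can be sketched as follows. This is a minimal illustration, not the procedure run in NORM or SAS: the function names, the use of `None` for a missing answer and the example item positions are assumptions, and the EM-based multiple imputation itself is not reproduced here.

```python
from typing import List, Optional, Sequence, Set

MAX_MISSING_SHARE = 0.30      # responses with more than 30% missing items are excluded
SCALE_MIN, SCALE_MAX = 1, 5   # five-point Likert scale


def missing_share(response: Sequence[Optional[int]]) -> float:
    """Share of unanswered items (None) in one response."""
    return sum(v is None for v in response) / len(response)


def keep_response(response: Sequence[Optional[int]]) -> bool:
    """Keep a response unless more than 30% of its items are missing."""
    return missing_share(response) <= MAX_MISSING_SHARE


def reverse_code(value: Optional[int]) -> Optional[int]:
    """Map 1<->5 and 2<->4 for negatively worded items; missing stays missing."""
    return None if value is None else SCALE_MIN + SCALE_MAX - value


def preprocess(responses: Sequence[Sequence[Optional[int]]],
               negative_items: Set[int]) -> List[List[Optional[int]]]:
    """Drop responses with too many missing items, then reverse code.

    `negative_items` holds the (hypothetical) column positions of the
    negatively worded items.
    """
    cleaned = []
    for response in responses:
        if not keep_response(response):
            continue
        cleaned.append([reverse_code(v) if i in negative_items else v
                        for i, v in enumerate(response)])
    return cleaned
```

In this sketch, any missing values that remain after exclusion are passed through unchanged, so an imputation step would still have to follow before factor analysis.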
Several indices were taken into account to ensure that our study sample, as well as every subset used in further analysis, was appropriate for factor analysis. The Kaiser-Meyer-Olkin (KMO) index indicates whether the sample of items is adequate for factor analysis, while the Measure of Sampling Adequacy (MSA) indicates whether an individual item is adequate for factor analysis. For both indices, a value >0.7 is desired, and a value >0.9 is considered perfect.22 A significant p-value (<0.05) of Bartlett’s test of sphericity indicates that it is possible to extract more than one factor.22 The analyses were performed using SAS V.9.4.
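For readers unfamiliar with these indices, the sketch below shows how KMO and the item-level MSA are commonly computed from an item correlation matrix: both compare squared correlations with squared partial correlations, the latter obtained from the inverse of the correlation matrix. It is an illustration under assumptions, not the SAS routine used in the study. The function names are hypothetical, and the input is assumed to be a non-singular correlation matrix given as nested lists.

```python
from math import sqrt
from typing import List

Matrix = List[List[float]]


def invert(matrix: Matrix) -> Matrix:
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(corr: Matrix) -> Matrix:
    """Partial correlation of each item pair, controlling for all other items."""
    inv = invert(corr)
    n = len(corr)
    return [[1.0 if i == j else -inv[i][j] / sqrt(inv[i][i] * inv[j][j])
             for j in range(n)] for i in range(n)]


def kmo(corr: Matrix) -> float:
    """Overall KMO: squared correlations relative to squared partial correlations."""
    partial = partial_correlations(corr)
    n = len(corr)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    r2 = sum(corr[i][j] ** 2 for i, j in pairs)
    p2 = sum(partial[i][j] ** 2 for i, j in pairs)
    return r2 / (r2 + p2)


def msa(corr: Matrix, item: int) -> float:
    """Item-level MSA: the same ratio restricted to one item's row."""
    partial = partial_correlations(corr)
    others = [j for j in range(len(corr)) if j != item]
    r2 = sum(corr[item][j] ** 2 for j in others)
    p2 = sum(partial[item][j] ** 2 for j in others)
    return r2 / (r2 + p2)
```

Values near 1 arise when partial correlations are small relative to the raw correlations, that is, when items share common variance that a factor model can capture.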
We calculated composite scores for each dimension suggested by Sorra and Nieva4 by calculating the average of corresponding items. We also calculated percentages of positive responses for each dimension by dividing the number of positive responses on corresponding items by the number of non-missing answers in the dimension. Descriptive statistics for each item and dimension were evaluated, including range, mean and SD.
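The two dimension-level summaries can be sketched as below. The threshold for a "positive" response is an assumption: the text does not define it, and the sketch treats a score of 4 or 5 after reverse coding as positive. Function names and the example data are hypothetical.

```python
from typing import Optional, Sequence

POSITIVE_MIN = 4  # assumption: 4 or 5 on the five-point scale counts as positive


def composite_score(items: Sequence[Optional[int]]) -> Optional[float]:
    """Average of one respondent's non-missing items in a dimension."""
    answered = [v for v in items if v is not None]
    return sum(answered) / len(answered) if answered else None


def percent_positive(responses: Sequence[Sequence[Optional[int]]]) -> Optional[float]:
    """Positive answers divided by all non-missing answers in the dimension.

    `responses` holds one row per respondent, containing only the items
    that belong to the dimension.
    """
    answered = [v for row in responses for v in row if v is not None]
    if not answered:
        return None
    positive = sum(v >= POSITIVE_MIN for v in answered)
    return 100.0 * positive / len(answered)


# Hypothetical three-item dimension answered by two respondents.
dimension = [[4, 5, 2], [3, None, 4]]
scores = [composite_score(row) for row in dimension]
share = percent_positive(dimension)
```

The composite score is a respondent-level mean on the 1 to 5 scale, whereas the percentage of positive responses pools all answers in the dimension, so the two summaries need not rank dimensions identically.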
Exploratory factor analysis
We used exploratory factor analysis (EFA) to evaluate the factor structure emerging from the study data. In general, EFA and confirmatory factor analysis (CFA) should be performed using different subsets.23 Thus, we performed the split-half cross validation, by randomly splitting our sample in two: ‘Exploring’ (for EFA) and ‘Testing’ subsets (for subsequent CFA). EFA using maximum likelihood was conducted using the ‘Exploring’ subset. We used Varimax orthogonal pre-rotation, and Promax oblique rotation to aid with interpretation of factor model.23 We used scree plot and Kaiser Criterion (Eigenvalues >1) for factor extraction. Factor loadings ≥0.4 were considered significant, and factor cross loading <0.4 was considered acceptable.22 23 Applying these criteria, we gradually eliminated problematic items until EFA resulted in a satisfactory factor structure.
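The split-half design and the item retention rule can be sketched as follows. The factor extraction and rotation themselves are not reproduced. The sketch assumes two subsets of equal size, a fixed random seed and a rotated loading vector per item as input; all of these, along with the function names, are illustrative assumptions.

```python
import random
from typing import List, Sequence, Tuple


def split_half(case_ids: Sequence[int], seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly split cases into an 'Exploring' and a 'Testing' subset."""
    shuffled = list(case_ids)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def item_is_retained(loadings: Sequence[float],
                     primary_min: float = 0.4,
                     cross_max: float = 0.4) -> bool:
    """Apply the retention rule to one item's loadings across all factors.

    The largest absolute loading must be >= 0.4 and the second largest
    (the cross loading) must be < 0.4.
    """
    ranked = sorted((abs(x) for x in loadings), reverse=True)
    primary = ranked[0]
    cross = ranked[1] if len(ranked) > 1 else 0.0
    return primary >= primary_min and cross < cross_max


exploring, testing = split_half(range(974))
```

In an iterative analysis, the rule would be reapplied after each refit, because removing an item changes the loadings of the remaining items.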
Confirmatory factor analysis
We evaluated the model fit of the factor structure resulting from the EFA by conducting CFA using the ‘Testing’ subset. By conducting a series of CFA using the complete dataset, we evaluated the model fit of the original 12-factor model,4 as well as other factor models reported by studies of different language versions of HSPSC. From the official website of the Agency for Healthcare Research and Quality (AHRQ),24 we retrieved a list of studies including psychometric evaluation of the instrument and identified those reporting a different factor structure.
Internal consistency was evaluated by calculating Cronbach’s alpha as an indicator of correlation between each item and the factor. In their exploratory study, Sorra and Nieva4 considered Cronbach’s alpha ≥0.6 as acceptable. We used Cronbach’s alpha ≥0.7, as it is typically used in later studies using the HSPSC5 6 9 11 14 15 17 19 and is well supported by the literature.22 23 Cronbach’s alphas were calculated for all factor models considered in the CFA, including the factor model that emerged from EFA.
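As an illustration of the internal consistency measure, the sketch below computes Cronbach's alpha with the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of the total score), where k is the number of items in the dimension. The function name and the small data set are hypothetical, and complete data (no missing values) are assumed.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for one dimension.

    `rows` holds one row per respondent and one column per item;
    sample variances (n - 1 denominator) are used throughout.
    """
    k = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of five respondents to a three-item dimension.
example = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3], [1, 2, 2]]
alpha = cronbach_alpha(example)
meets_threshold = alpha >= 0.7  # threshold applied in this study
```

Because alpha increases with the number of items as well as with their intercorrelation, dimensions with only a few items can fall below the threshold even when the items are reasonably related.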
By calculating the average of corresponding non-missing items, we calculated mean values for each dimension for the original 12-factor model and for the new model that emerged from EFA. Pearson’s correlations were evaluated between dimensions in each model. We expected low to moderate correlations between dimensions. However, correlations >0.85 would indicate possible multicollinearity.4 22 We also evaluated the correlations between dimensions of both models with two single item outcome variables – ‘Patient safety grade’ and ‘Number of events reported.’
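The correlation screening can be sketched as follows: Pearson's r is computed for each pair of dimension scores, and pairs exceeding 0.85 are flagged. The dimension labels and scores in the example are hypothetical, and significance testing is omitted.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's product-moment correlation of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def flag_multicollinearity(scores: Dict[str, Sequence[float]],
                           threshold: float = 0.85) -> List[Tuple[str, str]]:
    """Return dimension pairs whose correlation exceeds the threshold."""
    return [(a, b) for a, b in combinations(scores, 2)
            if pearson_r(scores[a], scores[b]) > threshold]


# Hypothetical mean scores of four respondents on three dimensions.
example_scores = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [1.1, 2.0, 3.2, 3.9],
    "C": [2.0, 1.0, 2.0, 1.0],
}
flagged = flag_multicollinearity(example_scores)
```

A flagged pair suggests that two dimensions may be measuring largely the same construct, which is one reason dimensions are merged in alternative factor models.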
Evaluation of common dimensionality
In order to evaluate the potential of the instrument for cross-national studies, we evaluated its dimensionality as reported for different language versions. We evaluated appearance and composition of each of the 12 dimensions proposed by Sorra and Nieva4 and of the 42 corresponding items in all factor models identified from AHRQ web page.24
Study sample and descriptive statistics
Out of 2512 distributed questionnaires, 995 were completed, resulting in a response rate of 39.6%. Sample characteristics are presented in table 1.
Out of our sample of n=995, 766 responses (76.98%) had no missing values on HSPSC items. Twenty-one responses (2.1%) contained more than 30% missing values on HSPSC items and were thus not included in the analysis. Remaining missing values were imputed using multiple imputations based on the EM algorithm. As a result, n=974 cases were available for further analysis. Descriptive statistics of HSPSC-D items and dimensions after imputing remaining missing answers and reverse coding of the negatively worded items are presented in table 2.
KMO for the complete sample was 0.93, and MSA for individual items ranged from 0.87 to 0.96. For ‘Exploring’ and ‘Testing’ subsets, KMO was 0.91 and 0.92, respectively, and MSA of individual items in both subsets ranged from 0.84 to 0.96. Bartlett’s test was highly significant (p<0.001) for the dataset, as well as for both subsets. Preliminary analyses indicated that our sample and the subsets were adequate for factor analysis.
Exploratory factor analysis
We conducted EFA using the ‘Exploring’ subset. We considered factor loadings ≥0.4 as significant, as this cut-off value was typically used in similar studies4–6 10–12 14–16 and was supported by the literature.22 23 Fourteen items not meeting the criteria (factor loading ≥0.4, cross loading <0.4) were excluded from the model, resulting in an eight-factor model with 28 items. The dimension ‘Organisational learning – continuous improvement’ was completely removed. The dimensions ‘Staffing’ and ‘Overall perceptions of safety’ were merged together, as were the dimensions ‘Feedback and communication about error’ with ‘Communication openness’, and ‘Teamwork across hospital units’ with ‘Handoffs and transitions’. The resulting eight-factor model is presented in table 3.
Confirmatory factor analysis
CFA using the ‘Testing’ subset demonstrated a satisfactory model fit of the factor structure that emerged from EFA (see table 4). The model satisfied desired thresholds of most analysed indices (root mean square error of approximation (RMSEA)=0.05; standardised root mean residual (SRMR)=0.05; goodness of fit index (GFI)=0.90; comparative fit index (CFI)=0.93; Tucker-Lewis Index (TLI)/non-normed fit index (NNFI)=0.91).
From the official website of AHRQ,24 we retrieved the list of 23 articles reporting psychometric analyses at the international level. From these articles, we extracted 10 factor models that differed from the original North-American version. These factor models were from the following countries: England (UK),9 Scotland (UK),5 France,15 Switzerland (French14 and German7), the Netherlands,10 Sweden,11 Slovenia,6 Turkey12 and Palestine.16 The 11th factor model considered in the analysis was the original 12-factor model.4
Subsequent series of CFA revealed satisfactory fit of the models from England (UK)9 (RMSEA=0.05; SRMR=0.05; GFI=0.92; CFI=0.93; TLI/NNFI=0.91) and Palestine16 (RMSEA=0.05; SRMR=0.05; GFI=0.90; CFI=0.91; TLI/NNFI=0.90) to our data. The original 12-factor model resulted in marginally satisfactory model fit (RMSEA=0.05; SRMR=0.05; GFI=0.88; CFI=0.90; TLI/NNFI=0.88). The models from Scotland (UK), France, Switzerland, the Netherlands and Slovenia resulted in suboptimal values of CFA indices (table 4). Models from Sweden and Turkey demonstrated unsatisfactory model fit in CFA.
The original 12-factor model demonstrated good Cronbach’s alpha for all dimensions except ‘Organisational learning – continuous improvement’ (0.68) and ‘Communication openness’ (0.64). Cronbach’s alpha for dimensions of the eight-factor model were between 0.73 and 0.87. Two dimensions, ‘Teamwork within units’ and ‘Communication openness,’ demonstrated consistently low alphas in other factor models analysed. Three dimensions, ‘Non-punitive response to error,’ ‘Staffing’ and ‘Handoffs and transitions,’ had lower than 0.7 values only in one or two of analysed models. Cronbach’s alpha for the remaining seven dimensions in all analysed models was ≥0.7, if present in the model (table 5).
Correlations between dimensions of the original 12-factor model were between 0.10 and 0.61 (p<0.01). All 12 dimensions were positively correlated with the outcome variable ‘Patient safety grade’ (correlations between 0.26 and 0.70, p<0.01). Dimensions of the eight-factor model from EFA were also positively inter-correlated (0.18–0.54, p<0.01) and positively correlated with the outcome variable ‘Patient safety grade’ (0.29–0.58, p<0.01). All dimensions in both factor models resulted in no or weak correlation (<0.2) with the outcome variable ‘Number of events reported.’ All correlations are presented in the online supplementary appendix 1.
Evaluation of common dimensionality
We analysed the appearance and role of each individual item and dimension from the original 12-factor model in factor model from EFA and in 10 models reported by studies from different language versions. Table 3 presents 42 items of the original 12-factor model and their appearance in all 12 analysed models. The uncoloured cells represent no change, where the item retains its original role in the factor model. Changes are represented by coloured boxes, which indicate elimination of the questionnaire item (N) or moving it to a different dimension (labelled from 1 to 12).
Fourteen items were eliminated from analysis in EFA. Of these 14 items, 11 demonstrated significant inconsistency, since in at least half of 10 analysed factor models, they were also eliminated, moved or merged with another dimension. All of the remaining 28 items of our eight-factor model demonstrated relative stability by retaining a similar role in at least 50% of the 10 analysed factor models; 23 items maintained their role in 80% or more of the models.
Eight dimensions, including ‘Teamwork within units,’ ‘Non-punitive response to error,’ ‘Supervisor expectations and actions promoting patient safety,’ ‘Frequency of events reported,’ ‘Staffing,’ ‘Feedback and communication about error,’ ‘Management support for patient safety’ and ‘Teamwork across hospital units’ demonstrated relative stability over the different language models, appearing in 80% or more of the 10 analysed models. The dimension ‘Communication openness’ was merged with the dimension ‘Feedback and communication about error’ in seven models.5–7 11 12 14 16 Similarly, the dimension ‘Hospital handoffs and transitions’ was merged with the dimension ‘Teamwork across hospital units’ in four models,6 7 14 15 and the dimension ‘Overall perceptions of safety’ with the dimension ‘Staffing’ in five models.5–7 9 11 The items from the dimension ‘Organisational learning – continuous improvement’ were shown to be highly inconsistent across various models. In five models, the items from this dimension were either removed from the model9 or merged with other dimensions7 10 11 15 (eg, with ‘Feedback and communication about error’).
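The stability criterion can be illustrated with the counts reported above for the four less stable dimensions. The sketch makes a simplifying assumption: a dimension is treated as retained in every one of the 10 models in which it was not reported as merged or removed. The function names are hypothetical.

```python
from typing import Dict

N_MODELS = 10            # factor models from other language versions
STABLE_THRESHOLD = 0.80  # a dimension appearing in >= 80% of models is "relatively stable"

# Number of models in which the dimension was merged or removed (from the text).
changed_in_models: Dict[str, int] = {
    "Communication openness": 7,
    "Hospital handoffs and transitions": 4,
    "Overall perceptions of safety": 5,
    "Organisational learning - continuous improvement": 5,
}


def share_retained(n_changed: int, n_models: int = N_MODELS) -> float:
    """Share of models in which the dimension kept its original form."""
    return (n_models - n_changed) / n_models


def is_stable(share: float, threshold: float = STABLE_THRESHOLD) -> bool:
    """Apply the 80% criterion to a retention share."""
    return share >= threshold


shares = {name: share_retained(n) for name, n in changed_in_models.items()}
unstable = [name for name, share in shares.items() if not is_stable(share)]
```

Under this assumption, none of the four dimensions is retained in more than 60% of the models, so all four fall below the 80% criterion.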
The aim of this study was to evaluate the psychometric properties of the HSPSC-D and compare its dimensionality with factor structures derived from different language versions of the HSPSC. Our split-half validation resulted in an alternative eight-factor model with good psychometric properties. Most parts of the instrument demonstrate relative stability over different language versions and appear suitable for cross-national studies. However, items of four safety culture dimensions require further improvement to support a common structure for comparison across language versions.
In our study, HSPSC-D demonstrated marginally satisfactory psychometric properties, allowing for its use in German hospitals. HSPSC-D demonstrated a somewhat unsatisfactory model fit in CFA with the original 12-factor model. EFA resulted in an alternative eight-factor model, with good model fit. Nevertheless, the instrument demonstrated satisfactory to good internal consistency in both models. Studies with other language versions of the HSPSC have repeatedly reported similar results—good model fit of different factor structure and mostly good internal consistency.5–7 9 11 12 14 15 These findings indicate that the HSPSC is a useful instrument for measuring and comparing patient safety culture within a healthcare system for which the particular HSPSC version has previously been validated.
Our analysis of instrument dimensionality across language versions revealed that while some dimensions maintain relative stability of appearance and composition across language versions, others vary significantly. When analysing 12 different factor models, including the original North American 12-factor model and the 8-factor model resulting from our EFA, we found that items from eight dimensions maintain relative stability in appearance and composition over different cultural adaptations. These dimensions were ‘Teamwork within units,’ ‘Non-punitive response to error,’ ‘Staffing,’ ‘Supervisor/manager expectations/actions,’ ‘Frequency of event reporting,’ ‘Feedback and communication about error,’ ‘Hospital management support for patient safety’ and ‘Teamwork across hospital units.’ The items from these dimensions seem to maintain their coherence and measure one common factor in different language adaptations and different healthcare systems. In contrast, the remaining four dimensions, namely ‘Organisational learning – continuous improvement,’ ‘Overall perceptions of safety,’ ‘Communication openness’ and ‘Hospital handoffs and transitions,’ appeared in only ≤60% of analysed models, since corresponding items were either removed, or migrated to or merged with other dimensions. Similarly, Hedskoeld et al.7 revealed a nine-factor model but argue against removing items and dimensions from the instrument, stating that they can still be used to understand and improve patient safety. Even though these dimensions and corresponding items may be very important in studies of patient safety culture, they need to be refined in order to support their stability over different cultural adaptations.
Evaluation of psychometric properties of a translated version of the instrument is important, as only the results of validated instruments can be properly interpreted and used for comparison in local contexts. A number of studies reported that the original 12-factor model did not fit the data well, and alternative factor models were suggested.5–7 9–12 14–16 Variation in the factor structure may be partially attributed to the differences between study samples and study populations. These studies differ by setting, sample size, representation of different professional groups and other characteristics, which can influence the performance of the instrument and hence should be considered in analysis. Finally, the specific characteristics of the study population’s culture, as well as of the local healthcare system, influence how the respondents perceive, understand and respond to each individual item in the questionnaire, ultimately altering the factor structure and interpretation of the results.
Concerning the international use of the instrument, several articles highlight the importance of a common factor structure. For example, Occelli et al. 15 underline the need to adapt the tool to each country’s environment while stating that ‘for international comparison purposes, a core set of dimensions consistently assessed as valid should be defined and measured in all countries.’ Perneger et al. 14 further argue that local improvements to a translated version can be ineffective, due to several unresolved issues inherent in the instrument, such as limited internal consistency of some dimensions, different dimensionality found in various language versions and the lack of external validation of study results.
The data analysis and results in the study were limited to two German university hospitals. Also, our findings should not be generalised to all hospital employees, as the study sample mainly consists of nurses and physicians. However, our findings regarding psychometric properties of the instrument, as well as its dimensionality, are in line with those of similar studies from other countries. While exploring the common dimensionality of various language versions, our analysis was limited to research articles retrieved from the official web page of AHRQ.24 Taking into account more studies that report a different factor structure based on a systematic review could improve the analysis. Lastly, the diversity of study methodology and reporting of studies with different language versions of HSPSC may be considered an additional obstacle for cross-national use of the instrument.
Overall, the German version of the HSPSC demonstrated acceptable psychometric properties for surveying clinical personnel in German hospitals. We found that most safety culture dimensions were relatively stable across different language models. However, other dimensions demonstrate high variability and inconsistency. These dimensions need to be refined to support a more uniform factor structure across language versions and thereby facilitate the use of HSPSC at the cross-national level.
We thank the members of the advisory board for their valuable advice at various stages of the project; Prof Johannes Giehl (Competence Center Quality Assurance/Management (KCQ), Medical Service of Statutory Healthcare Assurance in Germany), Prof Ulrich Jaehde (Institute of Pharmacy, University of Bonn), Dr Constanze Lessing (Berlin), Dr Barbara Strohbuecker (Deutscher Pflegerat, Cologne), Prof David Schwappach (Swiss Patient Safety Foundation, Zurich), Prof Petra Thürmann (University Witten/Herdecke, Chair of Clinical Pharmacology; HELIOS University Clinic Wuppertal). Last but not least, we would like to thank all study participants. We acknowledge the support of the hospital management, the efforts of study coordinators in participating departments and units to facilitate data collection and the respondents for their effort and time to fill in the surveys.
* NG and AH are co-first authors.
Gambashidze N, Hammer A, Brösterhaus M, et al. Evaluation of psychometric properties of the German Hospital Survey on Patient Safety Culture and its potential for cross-cultural comparisons: a cross-sectional study. BMJ Open 2017;0:e018366. doi:10.1136/bmjopen-2017-018366
Contributors Data were collected by the WorkSafeMed Consortium. Data analysis was carried out by NG under the supervision of TM and AH. NG and AH wrote the manuscript that was then revised by MB and TM. The final version of the manuscript has been approved by all authors.
Funding The WorkSafeMed study was funded by the Federal Ministry of Education and Research (FKZ 01GY1325A and 01GY1325B). We acknowledge additional financial support by the German Research Foundation and the administrative support by the DLR Project Management Agency. The data analysis and preparation of the publication was conducted with support of the German Academic Exchange Service awarded to NG.
Competing interests None declared.
Ethics approval Ethics committees of the medical faculties in Bonn (#350/14) and Tubingen (#547/2014BO1).
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement Because of data security aspects, data from the WorkSafeMed study will not be made available in the public domain. However, data will be used by students of both project partners for their theses. Data will be stored in accordance with national and regional data security standards.
Collaborators Luntz E, Rieger MA (project lead), Sturm H, Wagner A (Institute of Occupational and Social Medicine and Health Services Research, University Hospital of Tuebingen), Hammer A, Manser T (Institute for Patient Safety, University Hospital Bonn), Martus P (Institute for Clinical Epidemiology and Applied Biometry, University Hospital of Tuebingen), Holderied M (University Hospital Tuebingen).